Writing tagged: embeddings | Daniel John Morris

Conversational AI: From RAG Prototypes to Domain-Specific Supervision
December 12, 2024
The Hidden Cost of Embedding
OpenAI's embedding API charges per token across ingestion, re-ingestion, and every query. Switching to a local Ollama model eliminated the recurring cost with comparable retrieval quality.

rag embeddings llm ollama

Conversational AI: From RAG Prototypes to Domain-Specific Supervision
October 15, 2024
Chunk-Then-Summarise: The Embedding Pipeline That Worked
Raw PDF chunks make terrible vectors. Summarising each chunk before embedding produced cleaner searches and more relevant retrieval.

rag embeddings llm