rag

Browse 18 articles tagged with “rag”.

Articles tagged “rag”

AI-generated illustration for long context vs rag cx agents -- Soul (2020) style, Terra Cotta palette

Technical Guide·17 min read

1M-Token Context or RAG? How to Pick for Your CX Agent

Gemini's 1M-token window is real but not free. A practical decision framework for choosing between long-context and RAG for customer experience agents, with cost numbers, code, and the hybrid pattern most production teams land on.

Watercolor illustration of a late-night developer desk with two monitors. One shows a chat window where the bot says 'I don't know this one. Let me get Pat in support.' The other shows a dashboard with two columns labeled Raw Deflection and Resolved Deflection, the second column visibly smaller.

Knowledge & Memory·13 min read

How to Build a Tier-1 Chat Agent That Resolves (Not Just Deflects)

Bots claim 40% deflection; re-contact data says half is fake. Build the architecture that cuts tickets: auth-gated KB, calibrated confidence, escalation with context.

Person sorting through stacks of documents, crossing out wrong ones, with a magnifying glass on the desk

Learning AI·22 min read

Agentic RAG: from dumb retrieval to self-correcting agents

Your RAG pipeline retrieves wrong documents and nobody catches it. Build a self-correcting agent that grades results, rewrites queries, and knows when to stop.

Person drawing a web of connected nodes on a glass wall with colorful sticky notes around the edges

Learning AI·22 min read

Graph memory for AI agents: when vector search isn't enough

Build graph memory for AI agents in TypeScript and Python. Extract entities, track relationships over time, and compare Mem0, Zep, and Letta in production.

Person exploring geometric shapes representing vector space

Learning AI·20 min read

Embeddings Turn Text Into Meaning. Here's the Math and the Code

What embeddings are, how similarity search works under the hood, and how to build a semantic search engine, from cosine similarity math to production vector databases.

Person examining a branching diagram of document retrieval paths

Knowledge & Memory·12 min read

The RAG You Built Last Year Is Already Outdated

RAG has branched into 5 distinct architectures: Self-RAG, Corrective RAG, Adaptive RAG, GraphRAG, and Agentic RAG. Here's when to use each and how to choose.

Person examining documents through a magnifying glass

Knowledge & Memory·7 min read

Your RAG Returns Wrong Answers. Upgrading the Model Won't Help

Most RAG quality problems are retrieval problems, not model problems. Bad chunking, wrong embeddings, and missing re-ranking cause more hallucinations than model capability gaps.

Warm watercolor illustration of interconnected data streams flowing through a library-like space

Tools & MCP·13 min read

From Keyword Search to Shopping Memory

Build the intelligence layer for an AI shopping assistant: semantic product search with Commerce MCP, customer memory that persists across visits, and MCP tool registration for multi-channel deployment.

Illustration of an AI agent navigating branching knowledge paths across interconnected document nodes

Learning AI·18 min read

Your RAG Pipeline Is Answering the Wrong Question

Naive RAG scores 42% on multi-hop questions. Agentic RAG hits 94.5%. The difference: letting the agent decide what to retrieve, when, and whether the results are good enough. Build both in TypeScript and Python.

Abstract neural pathways splitting into two branches representing episodic and semantic memory systems

Knowledge & Memory·18 min read

Your Agent Remembers Everything Except What Matters

ICLR 2026 MemAgents research reveals when AI agents need episodic memory (what happened) vs semantic memory (what's true). Covers MAGMA, Mem0, AdaMem papers, comparison of Mem0 vs Letta vs Zep, and architecture patterns with TypeScript examples.

Illustration of an engineer assembling context layers for an AI agent, with memory, tools, and knowledge sources flowing into a central pipeline

Learning AI·21 min read

Context Engineering Is What Your Agent Actually Needs

Prompt engineering hits a wall with production AI agents. Context engineering fixes it. Build a full context pipeline with memory, RAG, history compression, and tool resolution.

Voice agent architecture diagram showing memory persistence across sessions in warm terra cotta and sage tones

Knowledge & Memory·12 min read

Your Voice Agent Forgets Everything. Here's How to Fix That

How to add persistent memory, tools, and knowledge to Pipecat and LiveKit voice agents using the Chanl Python SDK — one SDK instead of assembling five services.

Watercolor illustration of interconnected memory nodes forming a knowledge network in sage green and olive tones

Knowledge & Memory·25 min read

AI Agent Memory: From Session Context to Long-Term Knowledge

Build memory systems for AI agents from scratch in TypeScript. Covers memory types (session, episodic, semantic, procedural), architectures (buffer, summary, vector retrieval), the intersection with RAG, and privacy-aware design.

Developer comparing two approaches on a whiteboard

Knowledge & Memory·20 min read

Fine-tuning vs RAG: why most teams pick wrong and how to decide

When to fine-tune, when to use RAG, and when you need both — with hands-on LoRA fine-tuning and RAG implementation on the same task to show the difference.

Illustration of a person organizing knowledge on a corkboard with connected notes

Learning AI·18 min read

RAG from Scratch: Build a Retrieval-Augmented Generation Pipeline

Build a working RAG pipeline from scratch in TypeScript and Python. Covers embeddings, chunking, vector search, and generation with real, runnable code.

Woman researching on laptop with book and glasses at a modern desk

Knowledge & Memory·14 min read

The Knowledge Base Bottleneck: Why RAG Alone Isn't Enough for Production Agents

RAG works beautifully in demos. In production, stale data, chunking failures, and unscored retrieval quietly sink your AI agents. Here's what actually fixes it.

Golden light filtering through a tree-lined path forming a natural tunnel

Knowledge & Memory·16 min read

AI Agent Memory: Build Your Own or Buy Off the Shelf?

Comparing Mem0, Zep, Letta, and custom memory for AI agents. We break down architecture trade-offs, compliance risks, and when each approach makes sense.

Voice AI agent making errors during customer conversation

Voice & Conversation·14 min read

Voice AI Hallucinations: The Hidden Cost of Unvalidated Agents

Discover how voice AI hallucinations can cost businesses thousands daily and learn proven strategies to detect and prevent them before they reach customers.
