Articles tagged “learning-ai”
43 articles

Build the MCP + A2A agent protocol stack from scratch
Wire an MCP server to an A2A agent that delegates tasks and calls tools. TypeScript and Python examples, Streamable HTTP transport, Agent Cards, and auth.

Agentic RAG: from dumb retrieval to self-correcting agents
Your RAG pipeline retrieves wrong documents and nobody catches it. Build a self-correcting agent that grades results, rewrites queries, and knows when to stop.

Claude Code subagents and the orchestrator pattern
How to structure Claude Code subagents, write dispatch prompts, and coordinate parallel work across services, SDKs, and frontends in a monorepo.

Graph memory for AI agents: when vector search isn't enough
Build graph memory for AI agents in TypeScript and Python. Extract entities, track relationships over time, and compare Mem0, Zep, and Letta in production.

Voice AI pipeline: STT, LLM, TTS and the 300ms budget
Build a real-time voice pipeline with Pipecat. How STT, LLM, and TTS stream concurrently under a 300ms latency budget, with turn detection and interruptions.

Build an AI Agent Observability Pipeline from Scratch
Build a production observability pipeline for AI agents using TypeScript and the Chanl SDK. Covers metrics, traces, quality scoring, drift detection, and alerting.

Your AI Agent's Context Window Is Already Half Full
System prompts, tool schemas, MCP descriptions, memory injection, conversation history. They all eat tokens before the user says a word. Learn where your context budget goes and how to manage it.

Production Agent Evals: Catch Score Drift, Ship Confidently
Your evals pass in staging but miss production failures. Build three eval pipelines with the Chanl SDK: automated scorecards, scenario regression, and drift detection that catches quality degradation before customers do.

How to enforce the orchestrator pattern in Claude Code
The main Claude Code thread plans and reviews. Subagents implement. Three enforcement layers make this mandatory: CLAUDE.md, skills, and hooks. Includes a starter kit you can copy.

12 Ways Your LLM Judge Is Lying to You
Research identifies 12 systematic biases in LLM-as-a-judge systems. Learn to detect and mitigate each one before it corrupts your eval pipeline.

Your Agent Is Getting Smarter. It's Not Getting More Reliable.
Reliability improves at half the rate of accuracy. Three 85%+ tools combine to just 74%. Here's the math, the research, and the testing protocols that close the gap.

Embeddings Turn Text Into Meaning. Here's the Math and the Code
What embeddings are, how similarity search works under the hood, and how to build a semantic search engine, from cosine similarity math to production vector databases.

Function Calling: Build a Multi-Tool AI Agent from Scratch
Build a multi-tool AI agent from scratch using function calling across OpenAI, Anthropic, and Google. Runnable TypeScript and Python code, validation with Zod and Pydantic, and production hardening patterns.

Your RAG Pipeline Is Answering the Wrong Question
Naive RAG scores 42% on multi-hop questions. Agentic RAG hits 94.5%. The difference: letting the agent decide what to retrieve, when, and whether the results are good enough. Build both in TypeScript and Python.

Context Engineering Is What Your Agent Actually Needs
Prompt engineering hits a wall with production AI agents. Context engineering fixes it. Build a full context pipeline with memory, RAG, history compression, and tool resolution.

A 7B Domain Model Beat Everything We Tried
Domain-specific language models are beating trillion-parameter generalists on vertical tasks. Here's when a 7B model is the right call, how the training pipeline works, and what production teams are shipping today.

Fine-Tune a 7B Model for $1,500 (Not $50,000)
Full fine-tuning costs $50K in H100s. QLoRA on an RTX 4090 costs $1,500. Learn how LoRA and QLoRA let you train only 0.1-1% of parameters with nearly identical results, with working code for fine-tuning models that understand your agent's tool schemas.

A 1B Model Just Matched the 70B. Here's How.
How to distill frontier LLMs into small, cheap models that retain 98% accuracy on agent tasks. The teacher-student pattern, NVIDIA's data flywheel, and the Plan-and-Execute architecture that cuts agent costs by 90%.

Why Your AI Bill Is 30x Too High
Small language models match GPT-3.5 at 2% of the size and 95% less cost. Benchmarks, code, and a migration story from $13K/month to $400.

Part 1: Claude's 7 Extension Points (The Mental Model)
CLAUDE.md, Skills, Hooks, MCP Servers, Connectors, Claude Apps, Plugins: Claude's extension ecosystem is powerful but confusing. Here is the mental model that makes sense of all 7.

Part 2: CLAUDE.md, Hooks, and Skills (Three Layers)
CLAUDE.md sets conventions. Hooks enforce them. Skills teach workflows. Understanding these three layers, and their reliability spectrum, is the key to a Claude Code setup that actually works.

Part 3: MCP Servers vs. Connectors vs. Apps
All Claude Apps are Connectors. All Connectors are MCP Servers. Understanding this hierarchy, and when to build versus use managed integrations, saves weeks of unnecessary engineering.

Part 4: The 7 Extension Points in a Production Codebase
More than 50 skills, multiple MCP servers, scoped rules, security hooks: this is how Claude's 7 extension points compose in a real NestJS monorepo with 17 projects. What works, what conflicts, and what we would do differently.

Claude 4.6 broke our production agent in two hours — here's what's worth the migration
A practical developer guide to Claude 4.6 — adaptive thinking, 1M context, compaction API, tool search, and structured outputs. Real code examples in TypeScript and Python for building production AI agents.

Your agent has 30 tools and no idea when to use them
MCP tools give agents external capabilities. Skills give agents behavioral expertise. Learn the architecture of both, build them in TypeScript, and understand when to use each — and when you need both.

Agentic AI in Production: From Prototype to Reliable Service
Take agentic AI to production without it breaking at 2 AM. Covers orchestration patterns (ReAct, planning loops), error handling, circuit breakers, graceful degradation, observability, and scaling, with TypeScript implementations you can reuse.

AI Agent Memory: From Session Context to Long-Term Knowledge
Build memory systems for AI agents from scratch in TypeScript. Covers memory types (session, episodic, semantic, procedural), architectures (buffer, summary, vector retrieval), the intersection with RAG, and privacy-aware design.

AI Agent Observability: What to Monitor When Your Agent Goes Live
Build a production observability pipeline for AI agents. Covers latency, token usage, tool success rates, conversation quality, drift detection, structured logging, alerting strategies, and the critical difference between LLM and agent observability.

AI Agent Testing: How to Evaluate Agents Before They Talk to Customers
A practical guide to testing AI agents before production — scenario-based testing with AI personas, scorecard evaluation, regression suites, edge case generation, and CI/CD integration.

Tools for AI Agents: MCP, OpenAPI, and Tool Management That Actually Scales
How production AI agents discover, execute, and manage tools: from the MCP protocol to automatic OpenAPI import, security sandboxing, and multi-tenant tool infrastructure.

Build your own AI agent memory system — what breaks when real users show up?
Build a complete memory system for customer-facing AI agents — session context, persistent recall, semantic search. Then learn what breaks when real customers start returning.

Build your own AI agent tool system: what breaks when you add tool number 20?
Build a complete tool system for customer-facing AI agents from scratch: registry, execution, authentication, and monitoring. Then learn what breaks when real customers start calling.

MCP Deep Dive: Advanced Patterns for Agent Tool Integration
Production MCP patterns for teams who've built their first server and need to scale it — OAuth 2.1 with PKCE, Streamable HTTP transport, gateways, sampling, dynamic tool registration, and multi-tenant security.

Multimodal AI Agents: Voice, Vision, and Text in Production
How to architect multimodal AI agents that process voice, vision, and text simultaneously — from STT→LLM→TTS pipelines to vision integration, latency budgets, and production fusion strategies.

Voice Agent Platform Architecture: The Stack Behind Sub-300ms Responses
Deep dive into voice agent architecture — the STT→LLM→TTS pipeline, latency budgets, interruption handling, WebRTC vs WebSocket transport, and what orchestration platforms leave on the table.

Fine-tuning vs RAG: why most teams pick wrong and how to decide
When to fine-tune, when to use RAG, and when you need both — with hands-on LoRA fine-tuning and RAG implementation on the same task to show the difference.

Multi-Agent AI Systems: Build an Agent Orchestrator Without a Framework
Build a multi-agent system from scratch — delegation, planning loops, and inter-agent communication — before reaching for LangGraph or CrewAI.

Streaming AI Responses: SSE, WebSockets, and the Architecture Behind ChatGPT's Typing Effect
Build three streaming implementations from scratch — SSE, WebSocket, and HTTP/2 — and learn why token-by-token rendering is harder than it looks.

How to evaluate AI agents: build an evaluation framework from scratch
Build a working AI agent evaluation framework in TypeScript and Python. Covers LLM-as-judge, rubric scoring, regression testing, and CI integration.

MCP Explained: Build Your First MCP Server in TypeScript and Python
Build a working MCP server from scratch in TypeScript and Python. A hands-on tutorial covering tools, resources, transports, and testing.

Prompt Engineering from First Principles: 12 Techniques Every AI Developer Needs
Master 12 essential prompt engineering techniques with real examples in TypeScript. From zero-shot to ReAct, build better AI agents from first principles.

RAG from Scratch: Build a Retrieval-Augmented Generation Pipeline
Build a working RAG pipeline from scratch in TypeScript and Python. Covers embeddings, chunking, vector search, and generation with real, runnable code.

How Multimodal Voice AI Works: From Audio-Only to Vision-Aware Agents
How multimodal voice AI combines speech, vision, and text into a single agent — architecture patterns, latency tradeoffs, and TypeScript code you can run.