The Chanl Blog
Insights on building, connecting, and monitoring AI agents for customer experience — from the teams shipping them.
All Articles

Memory bugs don't crash. They just give wrong answers.
Memory bugs don't crash your agent. They just give subtly wrong answers using stale context. Here are 5 test patterns to catch them before customers do.

The 17x error trap in multi-agent systems
Multi-agent systems can amplify errors 17x rather than reduce them. We compare CrewAI, LangGraph, and AutoGen failure modes with concrete fixes and a decision tree.

The no-code ceiling: when agent builders hit production
Visual agent builders get you to 80% fast. The last 20% (telephony, monitoring, testing, and memory) requires infrastructure they were never intended to provide.

Online vs. Offline Evals: Close the Production Gap
89% of teams have observability but only 37% run online evals. Here's why that gap is where production failures hide, and how to close it with a practical online eval pipeline.

Pipecat vs LiveKit: the trade-offs that lock you in
An opinionated comparison of Pipecat and LiveKit for production voice agents, covering architecture, deployment, cost, and the trade-offs that lock you in.

LLM-as-a-Judge: Build a Production Eval Pipeline
Build a production LLM-as-a-judge eval pipeline step by step. Covers judge selection, rubric design, CI integration, and sampling strategies that scale.

MCP Servers in Production: Observability from Day One
Instrument your MCP servers with OpenTelemetry for production-grade observability. Covers tracing tool calls, detecting loops, cost attribution, and alerting.

Build the MCP + A2A agent protocol stack from scratch
Wire an MCP server to an A2A agent that delegates tasks and calls tools. TypeScript and Python examples, Streamable HTTP transport, Agent Cards, and auth.

Agentic RAG: from dumb retrieval to self-correcting agents
Your RAG pipeline retrieves wrong documents and nobody catches it. Build a self-correcting agent that grades results, rewrites queries, and knows when to stop.

We open-sourced our AI agent testing engine
chanl-eval is an open-source engine for stress-testing AI agents with simulated conversations, adaptive personas, and per-criteria scorecards. MIT licensed.

Claude Code subagents and the orchestrator pattern
How to structure Claude Code subagents, write dispatch prompts, and coordinate parallel work across services, SDKs, and frontends in a monorepo.

Graph memory for AI agents: when vector search isn't enough
Build graph memory for AI agents in TypeScript and Python. Extract entities, track relationships over time, and compare Mem0, Zep, and Letta in production.