The Chanl Blog
Insights on building, connecting, and monitoring AI agents for customer experience — from the teams shipping them.
Latest Articles
Testing & Evaluation
Is AI Better Than Your Humans? Score Both on One Rubric
Most teams can't say whether AI beats humans because they score them differently. One rubric, run on both, sliced by segment, gives you an honest answer.
Testing & Evaluation
Every Failed Call Is a Test Case You Haven't Written Yet
The gap between staging and production for AI agents is measured in surprise. Here's how to close the loop from live failure to regression gate.
All Articles

How to Measure Cost Per Successful Outcome for AI Agents
Most teams measure AI agent quality by pass rate. The metric that actually predicts ROI is cost per successful outcome: what each resolution costs paired against whether it actually resolved. Here's how to build it.

MCP Webhooks: Build Event-Driven Agents That React in Real Time
MCP's request-response model breaks when AI agents need to react to external events. Build event-driven agents today with stateless HTTP and webhooks.

How to Run the Agent Development Lifecycle (ADLC) in Production
Shipping an AI agent is easy. Keeping it reliable after launch is hard. The ADLC walks you through Intent, Build, Evaluate, Deploy, Observe, then back around.

How MCP Tool Descriptions Break Your Agent
New research shows 97% of MCP tool descriptions have quality issues that hurt agent accuracy. Here's what the smells look like, why they matter, and how to fix them.

AWS Just Gave Your Agent 15,000 Cloud Tools
The AWS MCP Server is now GA. One tool call reaches any of 15,000+ AWS APIs, sandboxed Python execution lets agents run multi-step operations, and Agent Skills replace heavyweight SOPs with on-demand guidance. Here's what changed and how to wire it.

Your Agent Re-reads Its Own Manual on Every Call
Datadog's 2026 State of AI Engineering report found that 69% of input tokens go to system prompts, yet only 28% of LLM calls use prompt caching. Here's how to diagnose the problem and fix it without rewriting your agent.

MCP Apps: Build UIs That Render Inside AI Chat
MCP Apps let your tools return interactive HTML dashboards, forms, and visualizations that render inline in Claude, ChatGPT, and VS Code. Here's how to build them for CX agents.

Trajectory Eval: Catch Agent Bugs Output Scoring Misses
Final-output scoring misses 20-40% of agent regressions. Trajectory evaluation scores every step an agent takes (tool calls, reasoning decisions, order of operations) and catches the bugs that output-only evals can't see.

Shadow Mode: Deploy AI Agent Updates Without Risk
Shadow mode runs your new agent version in parallel with production, comparing behavior before customers ever see it. Here's how to build the full deployment pipeline from shadow to canary to 100%.

Your CX Agent Crashes Mid-Task. Here's the Fix.
When your CX agent crashes mid-refund or mid-booking, the customer is stuck. Durable execution guarantees long-running agent tasks survive failures. Here's how to build it.

AG-UI: The Protocol That Connects Agents to UIs
AG-UI is the open event-based protocol that streams AI agent state to any frontend in real time. Here's how it works, what events it defines, and how to wire it up in TypeScript.

Your Agent Is Already a State Machine. Make It Explicit.
Every production AI agent is secretly a state machine. Making it explicit gives you checkpointing, testable paths, and observable state transitions, all without rewriting your agent logic.