The Chanl Blog
Insights on building, connecting, and monitoring AI agents for customer experience — from the teams shipping them.
Latest Articles
Testing & Evaluation
Is AI Better Than Your Humans? Score Both on One Rubric
Most teams can't say whether AI beats humans because they score them differently. One rubric, run on both, sliced by segment, gives you an honest answer.
Testing & Evaluation
Every Failed Call Is a Test Case You Haven't Written Yet
The gap between staging and production for AI agents is measured in surprise. Here's how to close the loop from live failure to regression gate.
All Articles

The Modern Data Stack Wasn't Built for Agents
Snowflake, dbt, and Fivetran were built for humans asking batch questions. Agents need streaming signals, per-entity memory in under 100ms, and write-back.

Correlation Killed Your Retention Model. Causal AI Fixes It.
Your churn model says support calls cause retention. They don't. Build a causal pipeline with DoWhy, EconML, and propensity matching in Python.

Stop Storing Transcripts. Start Modeling Signals.
A JSON blob of transcripts works at 1k calls and collapses at 50k. Design a Signal schema with entity/event split, confidence, provenance, and versioning.

Every Conversation Is an Experiment You Didn't Run
Your agent already ran the A/B test you're scoping. Here's how to read the results in your logs with propensity matching, synthetic control, and diff-in-diff.

Stop Building Dashboards. Start Shipping Signal.
Dashboards tell VPs what happened last quarter. Signal tells them which account to call today, and why. How CX is exiting the post-dashboard era in 2026.

Your Conversations Are Already CRM Data. Here's How to Use Them.
Every customer call carries churn risk, expansion intent, and compliance signal. Most teams toss it. Here's how to turn conversations into live CRM data.

How Much Testing Is Enough for Your AI Agent?
Code coverage doesn't apply to AI agents. Here's a framework for thinking about evaluation coverage: how many scenarios you need, what distribution to target, and how to know when you've tested enough.

MCP SSE Is Deprecated. Here's How to Migrate
SSE transport is being deprecated across major MCP platforms in 2026. Here's a practical migration guide from HTTP+SSE to Streamable HTTP, with TypeScript examples and a phased rollout strategy.

Your LLM-as-judge may be highly biased
LLM-as-Judge has 12 documented biases. Here are 6 evaluation methods production teams actually use instead, with code examples and patterns.

7 FastMCP mistakes that break your agent in production
FastMCP servers that work locally often fail at scale. Seven common mistakes, from missing annotations to monolithic tool sets, and how to fix each one.