The Chanl Blog

Insights on building, connecting, and monitoring AI agents for customer experience — from the teams shipping them.

All Articles

235 articles · Page 9 of 20

Split diagram showing MCP connecting an agent to tools on the left and A2A connecting two agents on the right

MCP vs A2A: Tools Protocol, Agents Protocol, and Why You Need Both

MCP connects agents to tools. A2A connects agents to each other. Most developers confuse them. This guide breaks down both protocols with architecture diagrams, real code, and a decision framework for production systems.

Illustration of a quality monitoring dashboard showing score trends and alert thresholds across production AI agent conversations

Learning AI·20 min read

Production Agent Evals: Catch Score Drift, Ship Confidently

Your evals pass in staging but miss production failures. Build three eval pipelines with the Chanl SDK: automated scorecards, scenario regression, and drift detection that catches quality degradation before customers do.

Abstract visualization of a signal gradually losing coherence as it passes through layered processing stages, with early stages showing clean waveforms and later stages showing scattered, fragmented patterns

Testing & Evaluation·14 min read

Agent Drift: Why Your AI Gets Worse the Longer It Runs

AI agents silently degrade over long conversations. Research quantifies three types of drift and shows why point-in-time evals miss them entirely.

Watercolor illustration of a traffic control tower overlooking a busy intersection of code agents, warm amber and teal tones

Learning AI·14 min read read

How to enforce the orchestrator pattern in Claude Code

The main Claude Code thread plans and reviews. Subagents implement. Three enforcement layers make this mandatory: CLAUDE.md, skills, and hooks. Includes a starter kit you can copy.

Modern bank lobby with digital screens and a customer speaking on the phone, soft lighting and glass walls

Industry & Strategy·14 min read

Banks Trust AI With Transactions. Why Not Customer Calls?

How a mid-size bank deploys AI agents for customer service with identity verification, PCI compliance, fraud detection, and regulatory scorecards.

Aerial view of a modern enterprise operations center with rows of monitors displaying conversation analytics dashboards and quality metrics

Industry & Strategy·15 min read

Your Call Center Handles 10,000 Calls a Day. Who's Grading Them?

AI agents handle 40% of your calls. Your QA team samples 2%. The monitoring gap between deployment and quality is where enterprise reputations break.

Warm watercolor illustration of a fashion boutique with digital product recommendations floating above clothing racks

Industry & Strategy·15 min read

The Shopping Assistant That Outsells Your Best Sales Rep

How a $50M fashion retailer turned 15,000 SKUs and customer purchase history into an AI shopping assistant that outsells human sales reps.

Watercolor illustration of a structured data network flowing through an insurance office, with policy documents transforming into organized digital records

Industry & Strategy·15 min read

The Insurance Agent That Never Misquotes a Policy

How regional insurers deploy AI agents that answer policy questions accurately, intake claims end-to-end, and produce the audit trail regulators demand.

Illustration of a balance scale tilted by invisible weights, representing hidden biases in AI evaluation systems

Learning AI·18 min read

12 Ways Your LLM Judge Is Lying to You

Research identifies 12 systematic biases in LLM-as-a-judge systems. Learn to detect and mitigate each one before they corrupt your eval pipeline.

A filing cabinet with most drawers empty and papers scattered on the floor, watercolor illustration in muted blue tones

Knowledge & Memory·12 min read read

Your Agent Completed the Task. It Also Forgot 87% of What It Knew.

Task completion hides a silent failure: agents forget 87% of stored knowledge under complexity. New research reveals why standard evals miss this entirely.

Watercolor illustration of a split dashboard showing human reviewers on one side and automated scoring metrics on the other

Operations·15 min read read

74% of Production Agents Still Rely on Human Evaluation

A survey of 306 practitioners reveals most production agents are far simpler than expected. The eval gap isn't a tooling problem. It's a trust problem.

Watercolor illustration of a digital fortress under siege with abstract red and blue waves representing adversarial AI testing

Testing & Evaluation·15 min read read

NIST Red-Teamed 13 Frontier Models. All of Them Failed.

NIST ran 250K+ attacks against every frontier model. None survived. Here's what the results mean for teams shipping AI agents to production today.

1...8 9 10...20

Learn Agentic AI

Weekly. Patterns for shipping agents that work — MCP, scorecards, regression tests, prompts, model comparisons.

500+ builders subscribed