Chanl

The Chanl Blog

Insights on building, connecting, and monitoring AI agents for customer experience — from the teams shipping them.

All Articles

235 articles · Page 9 of 20

Split diagram showing MCP connecting an agent to tools on the left and A2A connecting two agents on the right
Tools & MCP·16 min read

MCP vs A2A: Tools Protocol, Agents Protocol, and Why You Need Both

MCP connects agents to tools. A2A connects agents to each other. Most developers confuse them. This guide breaks down both protocols with architecture diagrams, real code, and a decision framework for production systems.

Illustration of a quality monitoring dashboard showing score trends and alert thresholds across production AI agent conversations
Learning AI·20 min read

Production Agent Evals: Catch Score Drift, Ship Confidently

Your evals pass in staging but miss production failures. Build three eval pipelines with the Chanl SDK: automated scorecards, scenario regression, and drift detection that catches quality degradation before customers do.

Abstract visualization of a signal gradually losing coherence as it passes through layered processing stages, with early stages showing clean waveforms and later stages showing scattered, fragmented patterns
Testing & Evaluation·14 min read

Agent Drift: Why Your AI Gets Worse the Longer It Runs

AI agents silently degrade over long conversations. Research quantifies three types of drift and shows why point-in-time evals miss them entirely.

Watercolor illustration of a traffic control tower overlooking a busy intersection of code agents, warm amber and teal tones
Learning AI·14 min read

How to enforce the orchestrator pattern in Claude Code

The main Claude Code thread plans and reviews. Subagents implement. Three enforcement layers make this mandatory: CLAUDE.md, skills, and hooks. Includes a starter kit you can copy.

Modern bank lobby with digital screens and a customer speaking on the phone, soft lighting and glass walls
Industry & Strategy·14 min read

Banks Trust AI With Transactions. Why Not Customer Calls?

How a mid-size bank deploys AI agents for customer service with identity verification, PCI compliance, fraud detection, and regulatory scorecards.

Aerial view of a modern enterprise operations center with rows of monitors displaying conversation analytics dashboards and quality metrics
Industry & Strategy·15 min read

Your Call Center Handles 10,000 Calls a Day. Who's Grading Them?

AI agents handle 40% of your calls. Your QA team samples 2%. The monitoring gap between deployment and quality is where enterprise reputations break.

Warm watercolor illustration of a fashion boutique with digital product recommendations floating above clothing racks
Industry & Strategy·15 min read

The Shopping Assistant That Outsells Your Best Sales Rep

How a $50M fashion retailer turned 15,000 SKUs and customer purchase history into an AI shopping assistant that outsells human sales reps.

Watercolor illustration of a structured data network flowing through an insurance office, with policy documents transforming into organized digital records
Industry & Strategy·15 min read

The Insurance Agent That Never Misquotes a Policy

How regional insurers deploy AI agents that answer policy questions accurately, intake claims end-to-end, and produce the audit trail regulators demand.

Illustration of a balance scale tilted by invisible weights, representing hidden biases in AI evaluation systems
Learning AI·18 min read

12 Ways Your LLM Judge Is Lying to You

Research identifies 12 systematic biases in LLM-as-a-judge systems. Learn to detect and mitigate each one before they corrupt your eval pipeline.

A filing cabinet with most drawers empty and papers scattered on the floor, watercolor illustration in muted blue tones
Knowledge & Memory·12 min read

Your Agent Completed the Task. It Also Forgot 87% of What It Knew.

Task completion hides a silent failure: agents forget 87% of stored knowledge under complexity. New research reveals why standard evals miss this entirely.

Watercolor illustration of a split dashboard showing human reviewers on one side and automated scoring metrics on the other
Operations·15 min read

74% of Production Agents Still Rely on Human Evaluation

A survey of 306 practitioners reveals most production agents are far simpler than expected. The eval gap isn't a tooling problem. It's a trust problem.

Watercolor illustration of a digital fortress under siege with abstract red and blue waves representing adversarial AI testing
Testing & Evaluation·15 min read

NIST Red-Teamed 13 Frontier Models. All of Them Failed.

NIST ran 250K+ attacks against every frontier model. None survived. Here's what the results mean for teams shipping AI agents to production today.

