Chanl
observability

Browse 16 articles tagged with “observability”.

Articles tagged “observability”

16 articles

AI-generated illustration for agent unit economics cost per successful outcome -- Her (2013) style, Terra Cotta palette
Testing & Evaluation·18 min read

How to Measure Cost Per Successful Outcome for AI Agents

Most teams measure AI agent quality by pass rate. The metric that actually predicts ROI is cost per successful outcome: what each resolution costs paired against whether it actually resolved. Here's how to build it.

Read More
AI-generated illustration for agent development lifecycle adlc
Operations·14 min read

How to Run the Agent Development Lifecycle (ADLC) in Production

Shipping an AI agent is easy. Keeping it reliable after launch is hard. The ADLC walks you through Intent, Build, Evaluate, Deploy, Observe, then back around.

Read More
AI-generated illustration for system prompt token sink agent optimization -- Soul (2020) style, Terra Cotta palette
Operations·13 min read

Your Agent Re-reads Its Own Manual on Every Call

Datadog's 2026 State of AI Engineering report found that 69% of input tokens go to system prompts, yet only 28% of LLM calls use prompt caching. Here's how to diagnose the problem and fix it without rewriting your agent.

Read More
A dashboard showing rich telemetry data on one side and a blank trend chart on the other, representing observability without measurement
Testing & Evaluation·11 min read

Your Agent Has Observability. It Doesn't Have Measurement.

89% of AI teams added observability. 52% added evals. But only 31% can say whether their agent is getting better or worse. Here's the difference between watching your agent and actually measuring it.

Read More
Layered audio waveform splitting into three colored tracks with one outlier spike trailing into fog, teal-copper engineering palette
Voice & Conversation·12 min read

Your voice agent's P95 is lying. The real problem is P99.9

Per-stage P95 hides the tail customers feel. How variance compounds across STT, LLM, and TTS, and how to SLO the joint distribution.

Read More
AI-Generated Illustration for Handoff Is the New Prompt -- Soul (2020) Style, Terra Cotta Palette
Agent Architecture·11 min read

Multi-Agent Systems Don't Fail at Reasoning. They Fail at Handoff.

Multi-agent systems don't fail at reasoning. They fail at handoff. Command objects, memory transfer, and the 8-10 handoff cliff, plus the telemetry that catches drift.

Read More
Iceberg at Sea With Small Visible Tip Above Dark Water and Enormous Submerged Mass Glowing Amber — Visual Metaphor for Reasoning Tokens Hidden Below the Surface of Agent Responses
Operations·14 min read

Reasoning Tokens Are Showing Up on the Bill

GPT-5 and Claude thinking tokens bill as output and stay invisible. A 200-token reply can hide 8,000 billable ones. How to measure, cap, and budget.

Read More
Architecture diagram of an agentic data layer with event log, signal extraction, entity store, and improvement loop
Agent Architecture·14 min read

The Modern Data Stack Wasn't Built for Agents

Snowflake, dbt, and Fivetran were built for humans asking batch questions. Agents need streaming signals, per-entity memory in under 100ms, and write-back.

Read More
Watercolor illustration of two figures walking through a warm corridor of looping paths, Her style in warm plum tones
Testing & Evaluation·9 min read

Every Failed Call Is a Test Case You Haven't Written Yet

The gap between staging and production for AI agents is measured in surprise. Here's how to close the loop from live failure to regression gate.

Read More
Control room with green monitoring screens, one cracked display unnoticed in the center, Minority Report style
Testing & Evaluation·14 min read

Is monitoring your AI agent actually enough?

Research shows 83% of agent teams track capability metrics but only 30% evaluate real outcomes. Here's how to close the gap with multi-turn scenario testing.

Read More
Dashboard showing split-screen comparison of offline test results versus live production scorecard trends for an AI agent
Testing & Evaluation·18 min read

Online vs. Offline Evals: Close the Production Gap

89% of teams have observability but only 37% run online evals. Here's why that gap is where production failures hide, and how to close it with a practical online eval pipeline.

Read More
Illustration of distributed trace spans connecting an AI agent to MCP tool servers with observability signals flowing through
Technical Guide·20 min read

MCP Servers in Production: Observability from Day One

Instrument your MCP servers with OpenTelemetry for production-grade observability. Covers tracing tool calls, detecting loops, cost attribution, and alerting.

Read More
Engineering team reviewing real-time AI agent monitoring dashboards with metrics and conversation traces
Learning AI·22 min read

Build an AI Agent Observability Pipeline from Scratch

Build a production observability pipeline for AI agents using TypeScript and the Chanl SDK. Covers metrics, traces, quality scoring, drift detection, and alerting.

Read More
Watercolor illustration of distributed trace spans flowing through an AI agent pipeline with OpenTelemetry instrumentation
Operations·18 min read

What to Trace When Your AI Agent Hits Production

OpenTelemetry GenAI conventions are the production standard for agent tracing. What to instrument, what to skip, and what breaks — from a 2 AM debugging war story.

Read More
Watercolor illustration of an engineer monitoring a dashboard of AI agents in production with reliability metrics
Agent Architecture·24 min read

Agentic AI in Production: From Prototype to Reliable Service

Take agentic AI to production without it breaking at 2 AM. Covers orchestration patterns (ReAct, planning loops), error handling, circuit breakers, graceful degradation, observability, and scaling, with TypeScript implementations you can reuse.

Read More
Watercolor illustration of an engineering team monitoring AI agent dashboards with data flowing across screens
Operations·28 min read

AI Agent Observability: What to Monitor When Your Agent Goes Live

Build a production observability pipeline for AI agents. Covers latency, token usage, tool success rates, conversation quality, drift detection, structured logging, alerting strategies, and the critical difference between LLM and agent observability.

Read More
