
agent-testing

Browse 2 articles tagged with “agent-testing”.


Branching Network Showing the Tool-Call Path an AI Agent Takes Across a Conversation

Testing & Evaluation·12 min read

How to Build a Trajectory Eval for Your AI Agent

Outcome evals check the final answer. Trajectory evals check the path: tools called, data touched, steps taken. Here's how to build one for a CX agent.

A flowchart showing an agent's step-by-step decision path with one step flagged as diverging from the expected trajectory

Testing & Evaluation·13 min read

Trajectory Eval: Catch Agent Bugs Output Scoring Misses

Final-output scoring misses 20-40% of agent regressions. Trajectory evaluation scores every step an agent takes (tool calls, reasoning decisions, order of operations) and catches the bugs that output-only evals can't see.

The Signal Briefing

One email per week. How leading CS, revenue, and AI teams are turning conversations into decisions. Benchmarks, playbooks, and what works in production.

500+ CS and revenue leaders subscribed