ci-cd

Browse 3 articles tagged with “ci-cd”.

Testing & Evaluation · 15 min read

How to Build a Regression Test Suite for AI Agents

Your CI/CD pipeline catches code regressions. But who catches it when a prompt change breaks your agent's compliance behavior? Here's how to build behavioral regression testing for non-deterministic AI agents.

Testing & Evaluation · 9 min read

Every Failed Call Is a Test Case You Haven't Written Yet

The gap between staging and production for AI agents is measured in surprise. Here's how to close the loop from live failure to regression gate.

Technical Guide · 22 min read

LLM-as-a-Judge: Build a Production Eval Pipeline

Build a production LLM-as-a-judge eval pipeline step by step. Covers judge selection, rubric design, CI integration, and sampling strategies that scale.
