agent-infrastructure

Browse 14 articles tagged with “agent-infrastructure”.

Stateless MCP Servers Behind a Load Balancer With Task Handles Flowing Between Agent and Server

Tools & MCP·16 min read

How to Migrate Your MCP Server to Stateless Mode

The MCP 2026 release candidate makes stateless the recommended default. Your MCP server can now scale behind any load balancer without sticky routing. Here's how to migrate and use the new Tasks extension for async CX work.

A glowing terminal prompt floats above an AWS cloud diagram, agent tool calls fanning out to S3, DynamoDB, Bedrock, and Lambda nodes

Tools & MCP·12 min read

AWS just gave your agent 15,000 cloud tools

The AWS MCP Server is now GA. One tool call reaches any of 15,000+ AWS APIs, sandboxed Python execution lets agents run multi-step operations, and Agent Skills replace heavyweight SOPs with on-demand guidance. Here's what changed and how to wire it.

Iceberg at Sea With Small Visible Tip Above Dark Water and Enormous Submerged Mass Glowing Amber — Visual Metaphor for Reasoning Tokens Hidden Below the Surface of Agent Responses

Operations·14 min read

Reasoning Tokens Are Showing Up on the Bill

GPT-5 and Claude thinking tokens bill as output and stay invisible. A 200-token reply can hide 8,000 billable ones. How to measure, cap, and budget.

Illustration of distributed trace spans connecting an AI agent to MCP tool servers with observability signals flowing through

Technical Guide·20 min read

MCP Servers in Production: Observability from Day One

Instrument your MCP servers with OpenTelemetry for production-grade observability. Covers tracing tool calls, detecting loops, cost attribution, and alerting.

Watercolor illustration of a split dashboard showing human reviewers on one side and automated scoring metrics on the other

Operations·15 min read

74% of Production Agents Still Rely on Human Evaluation

A survey of 306 practitioners reveals most production agents are far simpler than expected. The eval gap isn't a tooling problem. It's a trust problem.

Watercolor illustration of distributed trace spans flowing through an AI agent pipeline with OpenTelemetry instrumentation

Operations·18 min read

What to Trace When Your AI Agent Hits Production

OpenTelemetry GenAI conventions are the production standard for agent tracing. What to instrument, what to skip, and what breaks — from a 2 AM debugging war story.

Watercolor illustration of descending cost bars alongside token streams flowing through an optimization pipeline

Operations·16 min read

Your AI Agent Costs $13K/Month. Here's the Fix.

A production customer-service agent burned $13,247 in one month. Prompt caching, model routing, batch processing, and plan-and-execute architecture cut it to $1,100. Real pricing math for every technique.

Browser window with structured tool definitions flowing between a website and an AI agent

Tools & MCP·13 min read

Why Browser Agents Waste 89% of Their Tokens

Browser agents burn 1,500-2,000 tokens per screenshot. Chrome 146's navigator.modelContext API lets websites expose structured tools instead, cutting token usage by 89% and raising task accuracy to 98%. Here's how WebMCP works.

Watercolor illustration of an engineering team monitoring AI agent dashboards with data flowing across screens

Operations·28 min read

AI Agent Observability: What to Monitor When Your Agent Goes Live

Build a production observability pipeline for AI agents. Covers latency, token usage, tool success rates, conversation quality, drift detection, structured logging, alerting strategies, and the critical difference between LLM and agent observability.

Illustration of a team evaluating AI agent quality through structured testing scenarios

Testing & Evaluation·24 min read

AI Agent Testing: How to Evaluate Agents Before They Talk to Customers

A practical guide to testing AI agents before production — scenario-based testing with AI personas, scorecard evaluation, regression suites, edge case generation, and CI/CD integration.

Watercolor illustration of developers collaborating around a whiteboard with tool integration diagrams

Tools & MCP·26 min read

AI Agent Tools: MCP, OpenAPI, and Tool Management That Actually Scales

How production AI agents discover, execute, and manage tools — from MCP protocol to OpenAPI auto-importing, security sandboxing, and multi-tenant tool infrastructure.

Developer working through advanced MCP protocol integration patterns on a screen

Tools & MCP·25 min read

MCP Deep Dive: Advanced Patterns for Agent Tool Integration

Production MCP patterns for teams who've built their first server and need to scale it — OAuth 2.1 with PKCE, Streamable HTTP transport, gateways, sampling, dynamic tool registration, and multi-tenant security.

Watercolor illustration of converging streams representing voice, vision, and text flowing into an AI agent system

Agent Architecture·28 min read

Multimodal AI Agents: Voice, Vision, and Text in Production

How to architect multimodal AI agents that process voice, vision, and text simultaneously — from STT→LLM→TTS pipelines to vision integration, latency budgets, and production fusion strategies.

man in blue dress shirt sitting on black office rolling chair - Photo by David Schultz on Unsplash

Agent Architecture·22 min read

How Multimodal Voice AI Works: From Audio-Only to Vision-Aware Agents

How multimodal voice AI combines speech, vision, and text into a single agent — architecture patterns, latency tradeoffs, and TypeScript code you can run.
