Articles tagged “typescript”
43 articles

How to Measure Cost Per Successful Outcome for AI Agents
Most teams measure AI agent quality by pass rate. The metric that actually predicts ROI is cost per successful outcome: what each resolution costs, weighed against whether the issue was actually resolved. Here's how to build it.

MCP Webhooks: Build Event-Driven Agents That React in Real Time
MCP's request-response model breaks when AI agents need to react to external events. Build event-driven agents today with stateless HTTP and webhooks.

Correlation Killed Your Retention Model. Causal AI Fixes It.
Your churn model says support calls cause retention. They don't. Build a causal pipeline with DoWhy, EconML, and propensity matching in Python.

MCP Is Now Open Infrastructure: Build for What's Next
MCP was donated to the Linux Foundation and the AAIF just held its first summit. What does the protocol becoming open infrastructure mean for what you build on top of it?

Memory bugs don't crash. They just give wrong answers.
Memory bugs don't crash your agent. They just give subtly wrong answers using stale context. Here are 5 test patterns to catch them before customers do.

Online vs. Offline Evals: Close the Production Gap
89% of teams have observability but only 37% run online evals. Here's why that gap is where production failures hide, and how to close it with a practical online eval pipeline.

LLM-as-a-Judge: Build a Production Eval Pipeline
Build a production LLM-as-a-judge eval pipeline step by step. Covers judge selection, rubric design, CI integration, and sampling strategies that scale.

MCP Servers in Production: Observability from Day One
Instrument your MCP servers with OpenTelemetry for production-grade observability. Covers tracing tool calls, detecting loops, cost attribution, and alerting.

Build the MCP + A2A agent protocol stack from scratch
Wire an MCP server to an A2A agent that delegates tasks and calls tools. TypeScript and Python examples, Streamable HTTP transport, Agent Cards, and auth.

Agentic RAG: from dumb retrieval to self-correcting agents
Your RAG pipeline retrieves wrong documents and nobody catches it. Build a self-correcting agent that grades results, rewrites queries, and knows when to stop.

Graph memory for AI agents: when vector search isn't enough
Build graph memory for AI agents in TypeScript and Python. Extract entities, track relationships over time, and compare Mem0, Zep, and Letta in production.

AI Agent Frameworks Compared: Which Ones Ship?
An honest comparison of 9 AI agent frameworks (LangGraph, CrewAI, Vercel AI SDK, Mastra, OpenAI Agents SDK, Google ADK, Microsoft Agent Framework, Pydantic AI, AutoGen) based on what developers actually ship to production in 2026.

Build an AI Agent Observability Pipeline from Scratch
Build a production observability pipeline for AI agents using TypeScript and the Chanl SDK. Covers metrics, traces, quality scoring, drift detection, and alerting.

Production Agent Evals: Catch Score Drift, Ship Confidently
Your evals pass in staging but miss production failures. Build three eval pipelines with the Chanl SDK: automated scorecards, scenario regression, and drift detection that catches quality degradation before customers do.

Embeddings Turn Text Into Meaning. Here's the Math and the Code
What embeddings are, how similarity search works under the hood, and how to build a semantic search engine, from cosine similarity math to production vector databases.

Function Calling: Build a Multi-Tool AI Agent from Scratch
Build a multi-tool AI agent from scratch using function calling across OpenAI, Anthropic, and Google. Runnable TypeScript and Python code, validation with Zod and Pydantic, and production hardening patterns.

Your RAG Pipeline Is Answering the Wrong Question
Naive RAG scores 42% on multi-hop questions. Agentic RAG hits 94.5%. The difference: letting the agent decide what to retrieve, when, and whether the results are good enough. Build both in TypeScript and Python.

Your Agent Remembers Everything Except What Matters
ICLR 2026 MemAgents research reveals when AI agents need episodic memory (what happened) vs semantic memory (what's true). Covers MAGMA, Mem0, AdaMem papers, comparison of Mem0 vs Letta vs Zep, and architecture patterns with TypeScript examples.

What to Trace When Your AI Agent Hits Production
OpenTelemetry GenAI conventions are the production standard for agent tracing. What to instrument, what to skip, and what breaks — from a 2 AM debugging war story.

Context Engineering Is What Your Agent Actually Needs
Prompt engineering hits a wall with production AI agents. Context engineering fixes it. Build a full context pipeline with memory, RAG, history compression, and tool resolution.

Fine-Tune a 7B Model for $1,500 (Not $50,000)
Full fine-tuning costs $50K in H100s. QLoRA on an RTX 4090 costs $1,500. Learn how LoRA and QLoRA let you train only 0.1-1% of parameters with nearly identical results, with working code for fine-tuning models that understand your agent's tool schemas.

A 1B Model Just Matched the 70B. Here's How.
How to distill frontier LLMs into small, cheap models that retain 98% accuracy on agent tasks. The teacher-student pattern, NVIDIA's data flywheel, and the Plan-and-Execute architecture that cuts agent costs by 90%.

Why Browser Agents Waste 89% of Their Tokens
Browser agents burn 1,500-2,000 tokens per screenshot. Chrome 146's navigator.modelContext API lets websites expose structured tools instead, cutting token usage by 89% and raising task accuracy to 98%. Here's how WebMCP works.

Claude 4.6 broke our production agent in two hours — here's what's worth the migration
A practical developer guide to Claude 4.6 — adaptive thinking, 1M context, compaction API, tool search, and structured outputs. Real code examples in TypeScript and Python for building production AI agents.

Your agent has 30 tools and no idea when to use them
MCP tools give agents external capabilities. Skills give agents behavioral expertise. Learn the architecture of both, build them in TypeScript, and understand when to use each — and when you need both.

Agentic AI in Production: From Prototype to Reliable Service
Ship agentic AI that doesn't break at 2 AM. Covers orchestration patterns (ReAct, planning loops), error handling, circuit breakers, graceful degradation, observability, and scaling — with TypeScript implementations you can steal.

AI Agent Memory: From Session Context to Long-Term Knowledge
Build AI agent memory systems from scratch in TypeScript. Covers memory types (session, episodic, semantic, procedural), architectures (buffer, summary, vector retrieval), RAG intersection, and privacy-first design.

AI Agent Observability: What to Monitor When Your Agent Goes Live
Build a production observability pipeline for AI agents. Covers latency, token usage, tool success rates, conversation quality, drift detection, structured logging, alerting strategies, and the critical difference between LLM and agent observability.

AI Agent Testing: How to Evaluate Agents Before They Talk to Customers
A practical guide to testing AI agents before production — scenario-based testing with AI personas, scorecard evaluation, regression suites, edge case generation, and CI/CD integration.

AI Agent Tools: MCP, OpenAPI, and Tool Management That Actually Scales
How production AI agents discover, execute, and manage tools — from MCP protocol to OpenAPI auto-importing, security sandboxing, and multi-tenant tool infrastructure.

Build your own AI agent memory system — what breaks when real users show up?
Build a complete memory system for customer-facing AI agents — session context, persistent recall, semantic search. Then learn what breaks when real customers start returning.

Build your own AI agent tool system — what breaks when you add the 20th tool?
Build a complete tool system for customer-facing AI agents from scratch — registry, execution, auth, monitoring. Then learn what breaks when real customers start calling.

MCP Deep Dive: Advanced Patterns for Agent Tool Integration
Production MCP patterns for teams who've built their first server and need to scale it — OAuth 2.1 with PKCE, Streamable HTTP transport, gateways, sampling, dynamic tool registration, and multi-tenant security.

Multimodal AI Agents: Voice, Vision, and Text in Production
How to architect multimodal AI agents that process voice, vision, and text simultaneously — from STT→LLM→TTS pipelines to vision integration, latency budgets, and production fusion strategies.

Voice Agent Platform Architecture: The Stack Behind Sub-300ms Responses
Deep dive into voice agent architecture — the STT→LLM→TTS pipeline, latency budgets, interruption handling, WebRTC vs WebSocket transport, and what orchestration platforms leave on the table.

Fine-tuning vs RAG: why most teams pick wrong and how to decide
When to fine-tune, when to use RAG, and when you need both — with hands-on LoRA fine-tuning and RAG implementation on the same task to show the difference.

Multi-Agent AI Systems: Build an Agent Orchestrator Without a Framework
Build a multi-agent system from scratch — delegation, planning loops, and inter-agent communication — before reaching for LangGraph or CrewAI.

Streaming AI Responses: SSE, WebSockets, and the Architecture Behind ChatGPT's Typing Effect
Build three streaming implementations from scratch — SSE, WebSocket, and HTTP/2 — and learn why token-by-token rendering is harder than it looks.

How to Evaluate AI Agents: Build an Eval Framework from Scratch
Build a working AI agent eval framework in TypeScript and Python. Covers LLM-as-judge, rubric scoring, regression testing, and CI integration.

MCP Explained: Build Your First MCP Server in TypeScript and Python
Build a working MCP server from scratch in TypeScript and Python. Hands-on tutorial covering tools, resources, transports, and testing.

Prompt Engineering from First Principles: 12 Techniques Every AI Developer Needs
Master 12 essential prompt engineering techniques with real TypeScript examples. From zero-shot to ReAct, build better AI agents from first principles.

RAG from Scratch: Build a Retrieval-Augmented Generation Pipeline
Build a working RAG pipeline from scratch in TypeScript and Python. Covers embeddings, chunking, vector search, and generation with real, runnable code.

How Multimodal Voice AI Works: From Audio-Only to Vision-Aware Agents
How multimodal voice AI combines speech, vision, and text into a single agent — architecture patterns, latency tradeoffs, and TypeScript code you can run.