typescript

Browse 43 articles tagged with “typescript”.

AI-generated illustration of agent unit economics and cost per successful outcome

Testing & Evaluation·18 min read

How to Measure Cost Per Successful Outcome for AI Agents

Most teams measure AI agent quality by pass rate. The metric that actually predicts ROI is cost per successful outcome: what each resolution costs paired against whether it actually resolved. Here's how to build it.

AI-generated illustration of MCP webhooks and event-driven agents

Tools & MCP·19 min read

MCP Webhooks: Build Event-Driven Agents That React in Real Time

MCP's request-response model breaks when AI agents need to react to external events. Build event-driven agents today with stateless HTTP and webhooks.

Illustration of a person drawing a causal graph on a whiteboard while teammates watch

Learning AI·22 min read

Correlation Killed Your Retention Model. Causal AI Fixes It.

Your churn model says support calls cause retention. They don't. Build a causal pipeline with DoWhy, EconML, and propensity matching in Python.

Diagram showing MCP as a foundational protocol layer with agent configuration, memory, testing, and observability stacked above it

Tools & MCP·16 min read

MCP Is Now Open Infrastructure: Build for What's Next

MCP was donated to the Linux Foundation and the AAIF just held its first summit. What does the protocol becoming open infrastructure mean for what you build on top of it?

Person examining a translucent board with connected note cards, verifying links between them

Testing & Evaluation·16 min read

Memory bugs don't crash. They just give wrong answers.

Memory bugs don't crash your agent. They just give subtly wrong answers using stale context. Here are 5 test patterns to catch them before customers do.

Dashboard showing split-screen comparison of offline test results versus live production scorecard trends for an AI agent

Testing & Evaluation·18 min read

Online vs. Offline Evals: Close the Production Gap

89% of teams have observability but only 37% run online evals. Here's why that gap is where production failures hide, and how to close it with a practical online eval pipeline.

Illustration of an AI judge holding a checklist while reviewing a conversation transcript on a monitor

Technical Guide·22 min read

LLM-as-a-Judge: Build a Production Eval Pipeline

Build a production LLM-as-a-judge eval pipeline step by step. Covers judge selection, rubric design, CI integration, and sampling strategies that scale.

Illustration of distributed trace spans connecting an AI agent to MCP tool servers with observability signals flowing through

Technical Guide·20 min read

MCP Servers in Production: Observability from Day One

Instrument your MCP servers with OpenTelemetry for production-grade observability. Covers tracing tool calls, detecting loops, cost attribution, and alerting.

Person connecting protocol cables between two glowing devices with diagrams on a whiteboard

Learning AI·22 min read

Build the MCP + A2A agent protocol stack from scratch

Wire an MCP server to an A2A agent that delegates tasks and calls tools. TypeScript and Python examples, Streamable HTTP transport, Agent Cards, and auth.

Person sorting through stacks of documents, crossing out wrong ones, with a magnifying glass on the desk

Learning AI·22 min read

Agentic RAG: from dumb retrieval to self-correcting agents

Your RAG pipeline retrieves wrong documents and nobody catches it. Build a self-correcting agent that grades results, rewrites queries, and knows when to stop.

Person drawing a web of connected nodes on a glass wall with colorful sticky notes around the edges

Learning AI·22 min read

Graph memory for AI agents: when vector search isn't enough

Build graph memory for AI agents in TypeScript and Python. Extract entities, track relationships over time, and compare Mem0, Zep, and Letta in production.

Developer comparing AI agent framework options on a split-screen monitor

Agent Architecture·18 min read

AI Agent Frameworks Compared: Which Ones Ship?

An honest comparison of 9 AI agent frameworks (LangGraph, CrewAI, Vercel AI SDK, Mastra, OpenAI Agents SDK, Google ADK, Microsoft Agent Framework, Pydantic AI, AutoGen) based on what developers actually ship to production in 2026.

Engineering team reviewing real-time AI agent monitoring dashboards with metrics and conversation traces

Learning AI·22 min read

Build an AI Agent Observability Pipeline from Scratch

Build a production observability pipeline for AI agents using TypeScript and the Chanl SDK. Covers metrics, traces, quality scoring, drift detection, and alerting.

Illustration of a quality monitoring dashboard showing score trends and alert thresholds across production AI agent conversations

Learning AI·20 min read

Production Agent Evals: Catch Score Drift, Ship Confidently

Your evals pass in staging but miss production failures. Build three eval pipelines with the Chanl SDK: automated scorecards, scenario regression, and drift detection that catches quality degradation before customers do.

Person exploring geometric shapes representing vector space

Learning AI·20 min read

Embeddings Turn Text Into Meaning. Here's the Math and the Code

What embeddings are, how similarity search works under the hood, and how to build a semantic search engine, from cosine similarity math to production vector databases.

Person building with tool components at a desk

Learning AI·20 min read

Function Calling: Build a Multi-Tool AI Agent from Scratch

Build a multi-tool AI agent from scratch using function calling across OpenAI, Anthropic, and Google. Runnable TypeScript and Python code, validation with Zod and Pydantic, and production hardening patterns.

Illustration of an AI agent navigating branching knowledge paths across interconnected document nodes

Learning AI·18 min read

Your RAG Pipeline Is Answering the Wrong Question

Naive RAG scores 42% on multi-hop questions. Agentic RAG hits 94.5%. The difference: letting the agent decide what to retrieve, when, and whether the results are good enough. Build both in TypeScript and Python.

Abstract neural pathways splitting into two branches representing episodic and semantic memory systems

Knowledge & Memory·18 min read

Your Agent Remembers Everything Except What Matters

ICLR 2026 MemAgents research reveals when AI agents need episodic memory (what happened) vs semantic memory (what's true). Covers MAGMA, Mem0, AdaMem papers, comparison of Mem0 vs Letta vs Zep, and architecture patterns with TypeScript examples.

Watercolor illustration of distributed trace spans flowing through an AI agent pipeline with OpenTelemetry instrumentation

Operations·18 min read

What to Trace When Your AI Agent Hits Production

OpenTelemetry GenAI conventions are the production standard for agent tracing. What to instrument, what to skip, and what breaks — from a 2 AM debugging war story.

Illustration of an engineer assembling context layers for an AI agent, with memory, tools, and knowledge sources flowing into a central pipeline

Learning AI·21 min read

Context Engineering Is What Your Agent Actually Needs

Prompt engineering hits a wall with production AI agents. Context engineering fixes it. Build a full context pipeline with memory, RAG, history compression, and tool resolution.

Illustration of a neural network with low-rank adapter matrices injected between layers, showing only a small percentage of parameters highlighted for training

Learning AI·19 min read

Fine-Tune a 7B Model for $1,500 (Not $50,000)

Full fine-tuning costs $50K in H100s. QLoRA on an RTX 4090 costs $1,500. Learn how LoRA and QLoRA let you train only 0.1-1% of parameters with nearly identical results, with working code for fine-tuning models that understand your agent's tool schemas.

Neural network distillation visualization showing a large teacher model transferring knowledge to a compact student model

Learning AI·16 min read

A 1B Model Just Matched the 70B. Here's How.

How to distill frontier LLMs into small, cheap models that retain 98% accuracy on agent tasks. The teacher-student pattern, NVIDIA's data flywheel, and the Plan-and-Execute architecture that cuts agent costs by 90%.

Browser window with structured tool definitions flowing between a website and an AI agent

Tools & MCP·13 min read

Why Browser Agents Waste 89% of Their Tokens

Browser agents burn 1,500-2,000 tokens per screenshot. Chrome 146's navigator.modelContext API lets websites expose structured tools instead, cutting token usage by 89% and raising task accuracy to 98%. Here's how WebMCP works.

Claude AI agent development tools with code on a developer workspace

Agent Architecture·20 min read

Claude 4.6 broke our production agent in two hours — here's what's worth the migration

A practical developer guide to Claude 4.6 — adaptive thinking, 1M context, compaction API, tool search, and structured outputs. Real code examples in TypeScript and Python for building production AI agents.

Watercolor illustration of two interlocking systems — tools and behavioral instructions — powering an AI agent

Tools & MCP·14 min read

Your agent has 30 tools and no idea when to use them

MCP tools give agents external capabilities. Skills give agents behavioral expertise. Learn the architecture of both, build them in TypeScript, and understand when to use each — and when you need both.

Watercolor illustration of an engineer monitoring a production AI agent dashboard with reliability metrics

Agent Architecture·24 min read

Agentic AI in Production: From Prototype to Reliable Service

Ship agentic AI that doesn't break at 2 AM. Covers orchestration patterns (ReAct, planning loops), error handling, circuit breakers, graceful degradation, observability, and scaling — with TypeScript implementations you can steal.

Watercolor illustration of interconnected memory nodes forming a knowledge network in sage and olive tones

Knowledge & Memory·25 min read

AI Agent Memory: From Session Context to Long-Term Knowledge

Build AI agent memory systems from scratch in TypeScript. Covers memory types (session, episodic, semantic, procedural), architectures (buffer, summary, vector retrieval), RAG intersection, and privacy-first design.

Watercolor illustration of an engineering team monitoring AI agent dashboards with data flowing across screens

Operations·28 min read

AI Agent Observability: What to Monitor When Your Agent Goes Live

Build a production observability pipeline for AI agents. Covers latency, token usage, tool success rates, conversation quality, drift detection, structured logging, alerting strategies, and the critical difference between LLM and agent observability.

Illustration of a team evaluating AI agent quality through structured testing scenarios

Testing & Evaluation·24 min read

AI Agent Testing: How to Evaluate Agents Before They Talk to Customers

A practical guide to testing AI agents before production — scenario-based testing with AI personas, scorecard evaluation, regression suites, edge case generation, and CI/CD integration.

Watercolor illustration of developers collaborating around a whiteboard with tool integration diagrams

Tools & MCP·26 min read

AI Agent Tools: MCP, OpenAPI, and Tool Management That Actually Scales

How production AI agents discover, execute, and manage tools — from MCP protocol to OpenAPI auto-importing, security sandboxing, and multi-tenant tool infrastructure.

AI agent memory architecture with semantic search vectors

Learning AI·20 min read

Build your own AI agent memory system — what breaks when real users show up?

Build a complete memory system for customer-facing AI agents — session context, persistent recall, semantic search. Then learn what breaks when real customers start returning.

Developer building AI agent tools at a whiteboard

Learning AI·20 min read

Build your own AI agent tool system — what breaks when you add the 20th tool?

Build a complete tool system for customer-facing AI agents from scratch — registry, execution, auth, monitoring. Then learn what breaks when real customers start calling.

Developer working through advanced MCP protocol integration patterns on a screen

Tools & MCP·25 min read

MCP Deep Dive: Advanced Patterns for Agent Tool Integration

Production MCP patterns for teams who've built their first server and need to scale it — OAuth 2.1 with PKCE, Streamable HTTP transport, gateways, sampling, dynamic tool registration, and multi-tenant security.

Watercolor illustration of converging streams representing voice, vision, and text flowing into an AI agent system

Agent Architecture·28 min read

Multimodal AI Agents: Voice, Vision, and Text in Production

How to architect multimodal AI agents that process voice, vision, and text simultaneously — from STT→LLM→TTS pipelines to vision integration, latency budgets, and production fusion strategies.

Watercolor illustration of voice AI waveforms flowing through a technical architecture diagram with golden amber tones

Agent Architecture·19 min read

Voice Agent Platform Architecture: The Stack Behind Sub-300ms Responses

Deep dive into voice agent architecture — the STT→LLM→TTS pipeline, latency budgets, interruption handling, WebRTC vs WebSocket transport, and what orchestration platforms leave on the table.

Developer comparing two approaches on a whiteboard

Knowledge & Memory·20 min read

Fine-tuning vs RAG: why most teams pick wrong and how to decide

When to fine-tune, when to use RAG, and when you need both — with hands-on LoRA fine-tuning and RAG implementation on the same task to show the difference.

Team of developers collaborating on multi-agent AI architecture

Learning AI·20 min read

Multi-Agent AI Systems: Build an Agent Orchestrator Without a Framework

Build a multi-agent system from scratch — delegation, planning loops, and inter-agent communication — before reaching for LangGraph or CrewAI.

Engineer debugging a real-time streaming architecture on a monitor

Learning AI·20 min read

Streaming AI Responses: SSE, WebSockets, and the Architecture Behind ChatGPT's Typing Effect

Build three streaming implementations from scratch — SSE, WebSocket, and HTTP/2 — and learn why token-by-token rendering is harder than it looks.

Illustration of two people reviewing an improvement chart together at a standing desk

Learning AI·20 min read

How to Evaluate AI Agents: Build an Eval Framework from Scratch

Build a working AI agent eval framework in TypeScript and Python. Covers LLM-as-judge, rubric scoring, regression testing, and CI integration.

Illustration of a diverse team collaborating around a whiteboard with code diagrams

Learning AI·20 min read

MCP Explained: Build Your First MCP Server in TypeScript and Python

Build a working MCP server from scratch in TypeScript and Python. Hands-on tutorial covering tools, resources, transports, and testing.

Illustration of a person writing thoughtfully at a desk with sticky notes and a warm lamp

Learning AI·25 min read

Prompt Engineering from First Principles: 12 Techniques Every AI Developer Needs

Master 12 essential prompt engineering techniques with real TypeScript examples. From zero-shot to ReAct, build better AI agents from first principles.

Illustration of a person organizing knowledge on a corkboard with connected notes

Learning AI·18 min read

RAG from Scratch: Build a Retrieval-Augmented Generation Pipeline

Build a working RAG pipeline from scratch in TypeScript and Python. Covers embeddings, chunking, vector search, and generation with real, runnable code.

man in blue dress shirt sitting on black office rolling chair - Photo by David Schultz on Unsplash

Agent Architecture·22 min read

How Multimodal Voice AI Works: From Audio-Only to Vision-Aware Agents

How multimodal voice AI combines speech, vision, and text into a single agent — architecture patterns, latency tradeoffs, and TypeScript code you can run.
