You book a flight with an AI travel agent. Confirmation comes through: seat 14C, departure 9:40am. The agent closes the loop and you go to sleep.
At 11pm, the airline reassigns your gate. At 2am, the departure time shifts to 6:15am.
Your agent has no idea. It's not listening. It only knows what it knew when you last asked.
This is the fundamental limitation of request-response AI agents: they're great at answering questions, but they can't react to the world changing around them. Every tool call your agent makes is a question it initiated. Events that happen without anyone asking (ticket created, order shipped, payment failed, queue depth spiked) are invisible unless the agent is explicitly polling for them.
That gap matters more than it sounds. Customers waiting on refund status. Agents that should escalate when a wait time crosses a threshold. Systems that need to notify a customer the moment their order ships. The polling workaround handles some of these, but it's expensive, slow, and fragile at the edges.
The answer is event-driven agents. And with MCP's webhook support landing in the June 2026 specification cycle, the protocol now has a native mechanism for it. Here's how to build this pattern today, before the spec ships, and what changes when it does.
Why Request-Response MCP Breaks for Reactive Workflows
Request-response MCP is clean and predictable. The agent needs data, calls a tool, gets a response, continues. Most CX workflows fit this model: a customer asks a question, the agent looks up the answer, the customer gets a response.
But some workflows require the agent to be reactive, not just informative. When a high-priority ticket is created, the agent should route it immediately. When a payment fails, the agent should proactively contact the customer before they realize something is wrong. When a shipment is delayed, the agent should update the customer's expectations before they ask.
In a pure request-response model, these workflows require polling: the agent repeatedly calls a "check for new events" tool and does nothing if nothing changed. Here's what that looks like in practice:
// The pattern to avoid
// (mcp and routeTicket stand in for your MCP client and routing logic)
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollForNewTickets() {
  let lastChecked = new Date().toISOString();
  while (true) {
    // Capture the timestamp before the call so events arriving mid-request are not skipped
    const checkedAt = new Date().toISOString();
    const tickets = await mcp.callTool('get_new_tickets', { since: lastChecked });
    for (const ticket of tickets.items) {
      await routeTicket(ticket);
    }
    lastChecked = checkedAt;
    await sleep(30_000); // check every 30 seconds
  }
}

At first glance this seems fine. But run the numbers. If you're polling every 30 seconds, you're making 2,880 tool calls per day (each one consuming tokens for the request and response) to detect events that might happen a few dozen times. If 40 events arrive in a day, your agent is paying for 2,840 empty polls to catch them.
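To check that arithmetic for your own interval, here is a minimal sketch. The helper names are hypothetical, and the 40-events-per-day figure is just the example from the text, not a measurement.

```typescript
// Hypothetical helpers for estimating polling overhead.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86,400

function pollsPerDay(intervalSeconds: number): number {
  return Math.floor(SECONDS_PER_DAY / intervalSeconds);
}

// Best case: every real event is caught by its own poll,
// so everything else is a poll that found nothing.
function emptyPolls(intervalSeconds: number, eventsPerDay: number): number {
  return Math.max(0, pollsPerDay(intervalSeconds) - eventsPerDay);
}

console.log(pollsPerDay(30));    // 2880
console.log(emptyPolls(30, 40)); // 2840
console.log(pollsPerDay(5));     // 17280
```

Dropping the interval to 5 seconds multiplies the call count by six while the number of real events stays the same.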
Worse, a 30-second interval is roughly where polling costs start to show up on your LLM bill, and for urgent routing workflows it is still too slow. You're stuck: poll faster and pay more, or poll slower and react later.
Event-driven design flips this: instead of the agent asking "did anything happen?", the system notifies the agent when something actually happens. Zero polls for zero events. Immediate notification for real events.
Building the Webhook Bridge Today
The June 2026 MCP spec will add a native subscribe method and server-to-client push mechanism. Until then, you can build the same behavior with two components: a stateless MCP HTTP server for agent-to-server tool calls, and a webhook receiver that translates external events into stored notifications the agent can act on.
The key insight is that your MCP server becomes a buffer. External systems post to your webhook endpoint, which stores events. The agent polls the MCP server for pending events, but this poll is cheap and fast because it's just a queue read, not a live API call to an external service. When the June spec lands, the queue read gets replaced by a server push.
Setting Up the MCP Server
Start with a stateless HTTP MCP server. The stateless constraint is important: it makes your server deployable to serverless environments without managing connection state, and it's required for the upcoming MCP spec's horizontal scaling support.
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';
const server = new McpServer({
name: 'cx-event-server',
version: '1.0.0',
});
// In-memory event queue (use Redis in production)
const pendingEvents: PendingEvent[] = [];
server.tool(
'check_pending_events',
'Check for new external events that need agent action. Returns events in FIFO order. Call this at the start of each conversation turn to process any queued events before handling user input.',
{},
async () => {
const events = pendingEvents.splice(0, 10); // dequeue up to 10
return {
content: [
{
type: 'text',
text: JSON.stringify({ events, remaining: pendingEvents.length }),
},
],
};
},
);
server.tool(
'ack_event',
'Acknowledge that you have processed a specific event. Required after handling each event from check_pending_events.',
{ eventId: z.string().describe('The event ID to acknowledge') },
async ({ eventId }) => {
await eventLog.markProcessed(eventId);
return { content: [{ type: 'text', text: JSON.stringify({ acknowledged: true }) }] };
},
);

The check_pending_events tool is the agent's entry point. By adding it to your agent's system prompt as something to call at the start of each turn, you get near-real-time event awareness without the polling overhead of querying external APIs directly.
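If your orchestration layer drives the tools itself rather than leaving it to the model, the check-handle-acknowledge cycle might look like the sketch below. ToolCaller, QueuedEvent, and drainPendingEvents are hypothetical names; the sketch assumes your MCP client can return a tool's text content as a string.

```typescript
// Sketch only: ToolCaller stands in for whatever MCP client wrapper you use.
interface QueuedEvent {
  id: string;
  type: string;
  payload: Record<string, unknown>;
}

interface PendingBatch {
  events: QueuedEvent[];
  remaining: number;
}

// Assumed to resolve with the text content of the tool result
type ToolCaller = (name: string, args: Record<string, unknown>) => Promise<string>;

async function drainPendingEvents(
  callTool: ToolCaller,
  handleEvent: (event: QueuedEvent) => Promise<void>,
  maxBatches = 5, // cap the work done before handling user input
): Promise<number> {
  let handled = 0;
  for (let i = 0; i < maxBatches; i++) {
    const batch = JSON.parse(await callTool('check_pending_events', {})) as PendingBatch;
    for (const event of batch.events) {
      await handleEvent(event);
      // Acknowledge only after the handler has finished without throwing
      await callTool('ack_event', { eventId: event.id });
      handled++;
    }
    if (batch.remaining === 0) break;
  }
  return handled;
}
```

Acknowledging after the handler returns means an event whose handler throws is never marked processed, which keeps the failure visible in the event log.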
The Webhook Receiver
The webhook receiver is a simple HTTP endpoint that validates incoming events, deduplicates them, and enqueues them for the agent.
import { createHmac, timingSafeEqual } from 'crypto';

interface PendingEvent {
  id: string;
  type: string;
  source: string;
  payload: Record<string, unknown>;
  receivedAt: string;
}

// extractEventId, classifyEvent, and eventLog are application-specific helpers not shown here
export async function handleWebhook(
  req: Request,
  source: 'zendesk' | 'stripe' | 'shopify',
): Promise<Response> {
  // Read raw body ONCE; signature verification needs the exact bytes
  const rawBody = await req.text();

  // Validate signature before touching the payload
  const isValid = validateWebhookSignature(req.headers, rawBody, source);
  if (!isValid) {
    return new Response('Unauthorized', { status: 401 });
  }

  const payload = JSON.parse(rawBody);
  const eventId = extractEventId(payload, source);

  // Idempotency: skip events we've already recorded
  const alreadySeen = await eventLog.exists(eventId);
  if (alreadySeen) {
    return new Response('OK', { status: 200 }); // acknowledge but don't reprocess
  }

  // Enqueue for agent processing
  pendingEvents.push({
    id: eventId,
    type: classifyEvent(payload, source),
    source,
    payload,
    receivedAt: new Date().toISOString(),
  });
  await eventLog.record(eventId);
  return new Response('OK', { status: 200 });
}

function validateWebhookSignature(
  headers: Headers,
  rawBody: string,
  source: string,
): boolean {
  const secret = process.env[`WEBHOOK_SECRET_${source.toUpperCase()}`];
  if (!secret) return false;

  const signature = Buffer.from(headers.get('x-webhook-signature') ?? '');
  const digest = createHmac('sha256', secret).update(rawBody).digest('hex');
  const expected = Buffer.from(`sha256=${digest}`);

  // timingSafeEqual throws when lengths differ, so reject mismatched lengths first
  if (signature.length !== expected.length) return false;

  // Timing-safe comparison to prevent timing attacks
  return timingSafeEqual(signature, expected);
}

Always validate webhook signatures before processing. The three common sources (Zendesk, Stripe, Shopify) each use a different header and signing scheme, so treat the generic x-webhook-signature header above as a placeholder. The pattern is the same, though: an HMAC-SHA256 keyed with a shared secret and computed from the raw request body (in some schemes combined with a timestamp). Read the body once as a string, verify against that exact byte sequence, then JSON.parse. If you call req.json() first, the stream is consumed and the raw bytes the sender signed are gone.
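To see why the exact bytes matter, here is a small self-contained sketch. The secret, the payload, and the sign/verify helper names are made up for illustration; only the Node crypto calls are real.

```typescript
import { createHmac, timingSafeEqual } from 'crypto';

// Illustrative helpers, not any provider's actual signing scheme
function sign(secret: string, body: string): string {
  return createHmac('sha256', secret).update(body).digest('hex');
}

function verify(secret: string, body: string, signatureHex: string): boolean {
  const expected = Buffer.from(sign(secret, body), 'hex');
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const secret = 'test-secret';
const rawBody = '{"id": "evt_1",  "amount": 100}'; // the sender's own spacing
const signature = sign(secret, rawBody);

// Parsing and re-serializing changes the whitespace, and therefore the bytes
const reserialized = JSON.stringify(JSON.parse(rawBody));

console.log(verify(secret, rawBody, signature));      // true
console.log(verify(secret, reserialized, signature)); // false
```

The payload is semantically identical in both cases, but only the untouched string matches what the sender signed.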
Handling the Three Most Common Event Types
Now that the infrastructure is in place, let's look at how the agent actually handles specific events. Each event type requires a different response pattern.
Order and Shipment Events
Shipment events are the highest-volume webhook type for most CX agents. When an order ships, the customer expects proactive notification. When a shipment is delayed, the agent should update the customer before they contact support.
async function handleShipmentEvent(event: PendingEvent) {
const { orderId, newStatus, estimatedDelivery, delay } = event.payload as ShipmentPayload;
switch (newStatus) {
case 'shipped': {
// Proactive notification: don't wait for the customer to ask
await chanl.memory.upsert({
customerId: event.payload.customerId as string,
key: `order_${orderId}_status`,
value: { status: 'shipped', trackingNumber: event.payload.trackingNumber },
});
// Queue outbound notification if customer has opted in
await notificationQueue.add({
type: 'order_shipped',
customerId: event.payload.customerId as string,
orderId,
trackingNumber: event.payload.trackingNumber as string,
});
break;
}
case 'delayed': {
// Update memory so any inbound contact gets the right context immediately
await chanl.memory.upsert({
customerId: event.payload.customerId as string,
key: `order_${orderId}_status`,
value: {
status: 'delayed',
originalDelivery: event.payload.originalDelivery,
newEstimate: estimatedDelivery,
delayReason: delay?.reason ?? 'carrier delay',
},
});
break;
}
}
}

Storing the event outcome in agent memory before the customer contacts you is the key move. When the customer does call ("where's my order?"), the agent already has the current status in context. No tool call needed to look it up. The event-driven update made the agent ready before the conversation started.
For more on pairing event data with persistent memory, see How to Build AI Agent Memory: Session, Context, and Long-Term Knowledge and the agent memory feature.
Payment Events
Payment events are high-urgency. A failed payment that goes unaddressed for 24 hours correlates strongly with churn. An agent that can proactively reach out within minutes of a payment failure can stop the churn cycle before it starts.
async function handlePaymentEvent(event: PendingEvent) {
const payment = event.payload as StripePaymentEvent;
if (payment.type === 'payment_intent.payment_failed') {
const failureCode = payment.data.object.last_payment_error?.code;
const customerId = payment.data.object.metadata?.customerId;
if (!customerId) return; // without a customer ID there is nothing to update or route
// Classify failure to route correctly
const isCardIssue = ['card_declined', 'insufficient_funds', 'expired_card'].includes(
failureCode ?? '',
);
const isBankIssue = ['bank_account_restricted', 'debit_not_authorized'].includes(
failureCode ?? '',
);
await chanl.memory.upsert({
customerId,
key: 'payment_status',
value: {
failed: true,
failureCode,
failedAt: new Date().toISOString(),
suggestedAction: isCardIssue
? 'update_payment_method'
: isBankIssue
? 'contact_bank'
: 'retry_later',
},
});
// Route to appropriate outreach flow
await outreachQueue.add({
priority: 'high',
flow: 'payment_failure_recovery',
customerId,
failureCode,
retryAfter: addMinutes(new Date(), 15).toISOString(),
});
}
}

The classification step before storing to memory is important. When the agent calls this customer, it should already know whether to say "your card was declined, would you like to update your payment method?" versus "your bank has flagged this transaction and you may need to contact them." The event handler does the classification once, upfront, so the agent doesn't have to reason about raw failure codes mid-conversation.
Ticket Escalation Events
When a ticket crosses an escalation threshold (wait time too long, sentiment too negative, topic too complex), the agent needs to respond immediately rather than waiting for the customer to ask what's happening.
async function handleTicketEscalation(event: PendingEvent) {
const { ticketId, reason, customerId, conversationId } = event.payload as TicketEscalationPayload;
// Pull full context so the receiving agent starts informed
const ticket = await helpdesk.getTicket(ticketId);
// Store escalation context for human agent handoff
await chanl.memory.upsert({
customerId,
key: `escalation_${ticketId}`,
value: {
reason,
originalTicketId: ticketId,
summary: ticket.summary,
sentiment: ticket.sentimentScore,
waitTimeMinutes: ticket.waitTimeMinutes,
previousAttempts: ticket.resolveAttempts,
},
});
// Notify the human agent queue with full context
await humanQueue.add({
ticketId,
conversationId,
priority: reason === 'high_sentiment_negative' ? 'urgent' : 'normal',
contextUrl: `${process.env.PLATFORM_URL}/tickets/${ticketId}`,
});
}

The pattern here is consistent: the event handler enriches the memory context before any human or agent interaction happens. The receiving agent, whether AI or human, gets all relevant context immediately rather than spending the first two minutes of the conversation reconstructing what happened.
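One way to connect the queue to these three handlers is a small dispatcher keyed on event type. This is a sketch: createRouter is a hypothetical helper, and the event type strings are assumptions about what your classifyEvent returns.

```typescript
// Same shape as the PendingEvent interface used by the webhook receiver
interface PendingEvent {
  id: string;
  type: string;
  source: string;
  payload: Record<string, unknown>;
  receivedAt: string;
}

type EventHandler = (event: PendingEvent) => Promise<void>;

// Hypothetical dispatcher: resolves true when a handler ran, false for unknown types
function createRouter(handlers: Map<string, EventHandler>) {
  return async function routeEvent(event: PendingEvent): Promise<boolean> {
    const handler = handlers.get(event.type);
    if (!handler) {
      // Surface unknown types instead of dropping them silently
      console.warn(`No handler registered for event type "${event.type}" (${event.id})`);
      return false;
    }
    await handler(event);
    return true;
  };
}

// Wiring, assuming the handlers from this section are in scope:
// const routeEvent = createRouter(new Map<string, EventHandler>([
//   ['order_shipped', handleShipmentEvent],
//   ['payment_failed', handlePaymentEvent],
//   ['ticket_escalated', handleTicketEscalation],
// ]));
```

Returning false for unregistered types gives you something to count, which matters once you start tracking drop rate.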
What's Coming in the June 2026 Spec
The current webhook bridge pattern works well, but it requires you to build and maintain a webhook receiver, event queue, and check_pending_events tool. The June 2026 MCP specification consolidates this into three native capabilities:
Native subscription: An agent calls mcp.subscribe({ resource: 'tickets', filter: { status: 'created' } }) and the MCP server handles the subscription lifecycle without a separate webhook receiver.
Server-push notifications: When a subscribed resource changes, the MCP server sends a notification to the agent client directly, using the Streamable HTTP transport's notification channel. No polling required, not even the lightweight check_pending_events poll.
Ordering guarantees: The spec defines monotonic event IDs and ordering guarantees across transports. You get at-least-once delivery with clear replay semantics built into the protocol.
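At-least-once delivery means a handler can see the same event twice, so the consumer still needs a way to skip replays. Here is a minimal sketch of a cursor, assuming event IDs are monotonically increasing integers. The actual ID format is not pinned down here, and EventCursor and processOnce are hypothetical names.

```typescript
// Assumption: IDs are integers that only ever increase
interface SequencedEvent {
  id: number;
  type: string;
}

class EventCursor {
  private lastCommitted: number;

  constructor(startAfter = 0) {
    this.lastCommitted = startAfter;
  }

  // True when the event was already processed (a redelivery or replay)
  isDuplicate(event: SequencedEvent): boolean {
    return event.id <= this.lastCommitted;
  }

  // Advance the cursor only after the handler has succeeded
  commit(event: SequencedEvent): void {
    if (event.id > this.lastCommitted) this.lastCommitted = event.id;
  }

  // Persist this value to resume from the right place after a restart
  get position(): number {
    return this.lastCommitted;
  }
}

function processOnce(
  cursor: EventCursor,
  event: SequencedEvent,
  handler: (event: SequencedEvent) => void,
): boolean {
  if (cursor.isDuplicate(event)) return false;
  handler(event); // if this throws, the cursor stays put and the event can be retried
  cursor.commit(event);
  return true;
}
```

Committing after the handler runs keeps the at-least-once property intact: a failed event is retried rather than skipped.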
When the spec ships, your migration is mechanical: replace check_pending_events calls with a subscribe initialization, and remove the webhook receiver in favor of the MCP server's native subscription mechanism. The event-handling logic inside your tools stays the same.
// Available June 2026. Current SDK doesn't expose this yet.
const sub = await mcpClient.subscribe({
resource: 'cx_events',
filter: { types: ['ticket_created', 'payment_failed', 'order_delayed'] },
onEvent: async (event) => {
await routeEvent(event);
},
});
// Clean up when done
await sub.unsubscribe();

If you build the webhook bridge today using the stateless HTTP pattern described above, migrating to the native API later is a drop-in replacement for the queue layer. The tool descriptions, event handlers, and memory-update logic stay unchanged.
For context on where the MCP transport stack has been, see SSE to Streamable HTTP: Migrating Your MCP Server. The stateless HTTP transport that underpins the webhook bridge is the same one that enables horizontal scaling for production MCP deployments.
Monitoring Event-Driven Agents
Event-driven agents introduce failure modes that request-response agents don't have. You need to track four additional metrics:
Webhook delivery latency: How long between the external event occurring and your webhook receiver getting the POST. Stripe and Zendesk both publish delivery SLAs. If your p95 delivery latency is 8 seconds but your expected latency is under 2 seconds, something upstream is queuing.
Processing latency: How long between receiving the webhook and the agent completing the triggered action. This is what the customer experiences as response time.
Event drop rate: Webhooks that your receiver accepted (200 OK) but never processed, usually due to exceptions in the event handler or queue failures. This is the silent failure mode: the external system thinks the event was delivered, but your agent never acted on it.
Agent action success rate: For each event type, the percentage of triggered agent actions that completed successfully. A payment failure recovery flow that fails to update memory half the time is worse than not running at all.
async function reportEventMetrics(source: string) {
  // Reports on the trailing 24 hours
  const metrics = await chanl.calls.getMetrics({
    filter: { source, type: 'event_driven' },
    dateRange: { start: daysAgo(1), end: now() },
  });

  // Guard against division by zero when a source had no traffic in the window
  const pct = (part: number, total: number) =>
    total === 0 ? 'n/a' : ((part / total) * 100).toFixed(2) + '%';

  return {
    webhooksReceived: metrics.webhookCount,
    webhooksProcessed: metrics.processedCount,
    dropRate: pct(metrics.webhookCount - metrics.processedCount, metrics.webhookCount),
    p50DeliveryLatency: metrics.deliveryLatency.p50,
    p95DeliveryLatency: metrics.deliveryLatency.p95,
    p99ProcessingLatency: metrics.processingLatency.p99,
    actionSuccessRate: pct(metrics.successfulActions, metrics.totalActions),
  };
}

Drop rate is the one to alert on immediately. A non-zero drop rate means events are being silently missed. Set an alert for drop rate above 0.1%: that's one missed event per thousand, which at scale represents real customer impact.
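The alert condition itself is a few lines. This sketch uses hypothetical helper names and takes the 0.1% threshold from the text as its default; the counters correspond to the received and processed webhook counts above.

```typescript
// Hypothetical helpers for a drop-rate alert
function dropRatePercent(received: number, processed: number): number {
  if (received === 0) return 0; // no traffic, nothing dropped
  return ((received - processed) / received) * 100;
}

function shouldAlertOnDrops(
  received: number,
  processed: number,
  thresholdPercent = 0.1,
): boolean {
  return dropRatePercent(received, processed) > thresholdPercent;
}

console.log(shouldAlertOnDrops(1000, 998));   // true  (0.2% dropped)
console.log(shouldAlertOnDrops(10000, 9999)); // false (0.01% dropped)
```

With low volume a single dropped event can exceed the threshold on its own, so you may also want a floor on the number of webhooks received before the alert fires.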
For production-grade monitoring alongside event-driven metrics, agent monitoring tracks event drop rates, delivery latency, and action success rates in the same dashboard as your conversation quality metrics. See MCP and OpenTelemetry: Observability for Production Agent Stacks for the broader instrumentation picture.
The Agent That's Always Ready
The gate change at 2am. The payment that failed at 11pm. The ticket that crossed into frustration territory while your agent was idle. These events don't wait for someone to ask about them.
Event-driven agents close that gap. The webhook bridge pattern described here works today, in production, without waiting for the June spec. When the spec ships, the webhook receiver and queue layer become unnecessary, but the event-handling logic, memory updates, and monitoring you build now stay intact.
The shift in mental model is the real work. Request-response agents answer questions. Event-driven agents are watching, waiting for the world to change, and ready to act before anyone asks.



