The call ended cleanly. The agent confirmed the appointment, read back the address, and said goodbye in a warm, natural tone. The customer gave it five stars.
Three hours later, they're calling back, frustrated. The confirmation email never arrived. The appointment isn't in the scheduling system. The CRM shows nothing happened.
Your agent didn't fail during the conversation. It failed after it.
This is the async gap -- the space between "conversation complete" and "everything downstream actually ran." It's where a surprising number of production CX agents quietly break, and where even teams with solid conversation quality scores discover that their customers still feel let down.
If you've been debugging failures that only surface hours after a call, or chasing missing CRM entries that the agent swears it created, you've met this problem. Here's what's actually happening, and how to build out of it.
What "Between Conversations" Actually Means
The work that happens between conversations is usually more consequential than the conversation itself. Think about what a completed booking interaction should trigger:
- Send a confirmation email with a calendar link, address, and contact info
- Write a CRM record with the outcome, key fields, and sentiment
- Schedule a reminder 24 hours before the appointment
- Route a callback queue item if something needs follow-up
- Update an internal ticketing system with the resolution
Each of these is a distinct piece of work. Some can fail independently. All of them need to complete for the customer to feel actually helped, not just acknowledged.
The problem is that most AI agent frameworks are built around the request-response model. A conversation comes in, the agent generates a response, the turn ends. Post-conversation work is typically bolted on at the end of the handler -- a few await calls, hoping nothing times out and the server stays up.
That hope isn't an architecture. And the cost of getting it wrong is invisible: you won't see the failure in your conversation quality metrics, but your customers will feel it in callbacks, frustration, and churn.
Why Synchronous Post-Call Code Breaks
The naive approach runs all downstream work synchronously at hang-up:
async function handleConversationEnd(session: ConversationSession) {
  // Everything runs in sequence, so one failure blocks everything after it
  await sendConfirmationEmail(session.userId, session.bookingDetails);
  await updateCRM(session.userId, session.outcome);
  await scheduleReminder(session.bookingDetails.datetime);
  await updateTicketingSystem(session.ticketId, session.resolution);
}
When sendConfirmationEmail throws -- Resend rate limit, template rendering error, recipient inbox full -- the three calls below it never run. The CRM update dies silently. The reminder never schedules. And you won't know until a customer calls back.
The common patch is Promise.allSettled:
async function handleConversationEnd(session: ConversationSession) {
  // Start all tasks together; allSettled waits for every one, even if some reject
  const results = await Promise.allSettled([
    sendConfirmationEmail(session.userId, session.bookingDetails),
    updateCRM(session.userId, session.outcome),
    scheduleReminder(session.bookingDetails.datetime),
  ]);
  const failures = results.filter(r => r.status === 'rejected');
  if (failures.length > 0) {
    console.error('Some post-conversation tasks failed', failures);
  }
}
Better -- at least the CRM update doesn't die because the email failed. But you still have no retry logic. No state persistence. If your server restarts mid-execution (deploys happen, pods get evicted), all these promises die with it. And the console.error will scroll off your logs before anyone sees it.
What you actually need is to separate the act of creating work from the act of doing work.
Building a Task Queue for CX Agents
A task queue holds this separation in place. The agent writes task records during or immediately after the conversation. A separate worker process picks them up and executes them. The agent never waits for execution to complete.
This gives you three things synchronous code can't provide:
- Durability -- tasks survive server restarts, because they live in a database
- Retry logic -- failed tasks can be retried without re-running the whole conversation
- Observability -- you can see exactly which tasks are pending, running, failed, or expired
Here's a minimal task schema:
interface AgentTask {
  id: string;
  conversationId: string;
  type: 'send_email' | 'update_crm' | 'schedule_reminder' | 'update_ticket';
  payload: Record<string, unknown>;
  status: 'pending' | 'running' | 'completed' | 'failed' | 'expired';
  createdAt: Date;
  expiresAt: Date;
  attempts: number;
  maxAttempts: number;
  nextRetryAt?: Date;
  lastError?: string;
  completedAt?: Date;
}
When the conversation ends, the agent writes records rather than executing directly:
// Assumption: addMinutes, addHours, and subHours are the date-fns helpers of the same names
import { addHours, addMinutes, subHours } from 'date-fns';
async function handleConversationEnd(session: ConversationSession) {
  const now = new Date();
  const tasks: Omit<AgentTask, 'id'>[] = [
    {
      conversationId: session.id,
      type: 'send_email',
      payload: { userId: session.userId, bookingDetails: session.bookingDetails },
      status: 'pending',
      createdAt: now,
      expiresAt: addMinutes(now, 30), // 30-minute window
      attempts: 0,
      maxAttempts: 3,
    },
    {
      conversationId: session.id,
      type: 'update_crm',
      payload: { userId: session.userId, outcome: session.outcome },
      status: 'pending',
      createdAt: now,
      expiresAt: addHours(now, 4), // 4-hour window
      attempts: 0,
      maxAttempts: 5,
    },
    {
      conversationId: session.id,
      type: 'schedule_reminder',
      payload: {
        userId: session.userId,
        appointmentTime: session.bookingDetails.datetime,
        reminderOffset: '-24h',
      },
      status: 'pending',
      createdAt: now,
      // Expires 2 hours before the appointment (pointless after)
      expiresAt: subHours(session.bookingDetails.datetime, 2),
      attempts: 0,
      maxAttempts: 3,
    },
  ];
  await db.tasks.createMany({ data: tasks });
  // Conversation handler returns immediately -- no waiting
}
The worker runs on a polling loop or gets triggered by your queue service:
async function processPendingTasks() {
  const tasks = await db.tasks.findMany({
    where: {
      status: 'pending',
      expiresAt: { gt: new Date() }, // Skip expired tasks
      OR: [
        { nextRetryAt: null },
        { nextRetryAt: { lte: new Date() } }, // Backoff elapsed
      ],
    },
    orderBy: { createdAt: 'asc' },
    take: 20,
  });
  await Promise.allSettled(tasks.map(executeTask));
}
async function executeTask(task: AgentTask) {
  // Mark the task as running before doing the work. This read-then-update is
  // not atomic: two workers polling at once can both reach this point.
  await db.tasks.update({
    where: { id: task.id },
    data: { status: 'running', attempts: { increment: 1 } },
  });
  try {
    await dispatchByType(task);
    await db.tasks.update({
      where: { id: task.id },
      data: { status: 'completed', completedAt: new Date() },
    });
  } catch (err) {
    // `task` was read before the increment, so attempts + 1 is the current count
    const exhausted = task.attempts + 1 >= task.maxAttempts;
    await db.tasks.update({
      where: { id: task.id },
      data: {
        status: exhausted ? 'failed' : 'pending',
        lastError: String(err),
        nextRetryAt: exhausted ? null : calcNextRetry(task.attempts + 1),
      },
    });
    if (exhausted) {
      await moveToDeadLetter(task);
    }
  }
}
This is the core pattern. The conversation handler writes. The worker reads. Neither one waits for the other.
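One gap worth noting in the worker above: the poll only selects `pending` rows, so a task whose worker crashed after marking it `running` stays stuck, and two workers polling at the same moment can both pick up the same row. A common way to handle both is a lease: a worker claims a task only if it is `pending` or its previous lease has lapsed. The sketch below models that rule against an in-memory table. The `lockedBy` and `lockedUntil` fields and the `tryClaim` helper are assumptions for illustration, not part of the schema above, and in a real database the check and the write would need to be a single conditional update.

```typescript
type TaskStatus = 'pending' | 'running' | 'completed' | 'failed' | 'expired';

interface ClaimableTask {
  id: string;
  status: TaskStatus;
  lockedBy?: string;   // assumed column: worker currently holding the lease
  lockedUntil?: Date;  // assumed column: moment the lease lapses
}

// Returns true only if `workerId` now owns the task.
function tryClaim(
  table: Record<string, ClaimableTask>,
  taskId: string,
  workerId: string,
  now: Date,
  leaseMs: number,
): boolean {
  const row = table[taskId];
  if (!row) return false;

  const isPending = row.status === 'pending';
  const leaseLapsed =
    row.status === 'running' &&
    row.lockedUntil !== undefined &&
    row.lockedUntil.getTime() <= now.getTime();
  if (!isPending && !leaseLapsed) return false;

  row.status = 'running';
  row.lockedBy = workerId;
  row.lockedUntil = new Date(now.getTime() + leaseMs);
  return true;
}

const table: Record<string, ClaimableTask> = {
  t1: { id: 't1', status: 'pending' },
};
const start = new Date('2025-01-01T00:00:00Z');
const twoMinutesLater = new Date(start.getTime() + 120_000);

const claimedByA = tryClaim(table, 't1', 'worker-a', start, 60_000);   // worker-a takes the lease
const claimedByB = tryClaim(table, 't1', 'worker-b', start, 60_000);   // refused: lease still held
// worker-a never finished, so its lease has lapsed and worker-b may take over
const reclaimedByB = tryClaim(table, 't1', 'worker-b', twoMinutesLater, 60_000);
```

A lapsed lease does not prove the first worker did nothing: it may have sent the email and crashed before recording success. Reclaiming a task is only safe when the handler can run twice without harm.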
Idempotency: the Hidden Requirement
Two confirmation emails land in your customer's inbox. Two CRM entries get created for the same call. Two reminder messages fire on Thursday morning. That's what happens when a task worker crashes mid-execution and picks the task up again on restart.
With database polling, the same task can execute more than once. Two workers polling at the same moment can both read a task as pending before either marks it running, and any recovery step that re-queues a task left in running by a crashed worker will repeat work that may already have happened. For external calls, that means two CRM entries, two emails to the customer, two reminder schedules.
The fix is idempotency: running a task twice should produce the same result as running it once.
For external API calls, use an idempotency key:
async function sendEmail(task: AgentTask) {
  // Deterministic key: same task always generates the same key
  const idempotencyKey = `${task.conversationId}:${task.type}:${task.id}`;
  await resend.emails.send(
    {
      to: task.payload.userId as string,
      subject: 'Your appointment is confirmed',
      react: ConfirmationEmail(task.payload as BookingPayload),
    },
    // Assumption: recent Resend Node SDK versions take the key as a request
    // option. A `headers` field inside the email payload sets headers on the
    // email itself, not on the HTTP request.
    { idempotencyKey },
  );
}
async function updateCRM(task: AgentTask) {
  // Upsert instead of insert -- safe to run multiple times
  await crm.contacts.upsert({
    where: { externalId: task.payload.userId as string },
    create: { ...buildCRMRecord(task.payload) },
    update: { ...buildCRMRecord(task.payload) },
  });
}
Many APIs (Resend, Stripe, Twilio) accept an idempotency key header natively. When you retry a request with the same key, the provider returns the cached result rather than executing again. Check each provider's documentation for the exact header or option name and for how long keys are retained, since a retry that arrives after the retention window is treated as a new request. For your own database writes, use upsert semantics rather than plain insert.
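Some downstream systems offer neither an idempotency key nor an upsert. For those, the worker can keep its own record of which effects it has already applied. The sketch below is a minimal version of that idea: `EffectLedger`, `runOnce`, and `idempotencyKeyFor` are hypothetical names, the `send_sms` task type is made up for the example, and the in-memory array stands in for what would need to be a database table with a unique constraint on the key.

```typescript
// Hypothetical ledger of side effects that have already been applied.
class EffectLedger {
  private applied: string[] = [];

  // Runs `effect` only if `key` has not been recorded yet.
  // Returns true if the effect ran, false if it was skipped as a duplicate.
  runOnce(key: string, effect: () => void): boolean {
    if (this.applied.indexOf(key) !== -1) return false;
    effect(); // if this throws, the key is not recorded, so a retry can run it
    this.applied.push(key);
    return true;
  }
}

// Same key shape as the sendEmail example: stable for a given task.
function idempotencyKeyFor(task: { conversationId: string; type: string; id: string }): string {
  return `${task.conversationId}:${task.type}:${task.id}`;
}

const ledger = new EffectLedger();
const smsTask = { conversationId: 'conv-1', type: 'send_sms', id: 'task-9' };
let messagesSent = 0;

const ranFirstTime = ledger.runOnce(idempotencyKeyFor(smsTask), () => { messagesSent += 1; });
// The same task picked up again after a crash or a double claim:
const ranSecondTime = ledger.runOnce(idempotencyKeyFor(smsTask), () => { messagesSent += 1; });
```

This narrows the duplicate window rather than closing it. A crash between the effect and the ledger write still allows one repeat, which is why a provider-side key is the stronger option wherever it exists.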
Retry Semantics That Actually Work
Not all failures deserve a retry. A network timeout is transient -- try again. An invalid user ID is terminal -- retrying forever wastes resources and hides the real problem. Your dispatch function should signal which type it is:
class RetryableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'RetryableError';
  }
}
class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'TerminalError';
  }
}
async function sendEmail(task: AgentTask) {
  const result = await resend.emails.send({ ... });
  if (result.statusCode === 429) {
    // Rate limited -- back off and retry
    throw new RetryableError('Resend rate limited');
  }
  if (result.statusCode === 422) {
    // Bad recipient address -- don't retry
    throw new TerminalError(`Invalid email for ${task.payload.userId}`);
  }
}
Exponential backoff with jitter prevents the thundering-herd problem when many tasks fail simultaneously:
function calcNextRetry(attempts: number): Date {
  const baseMs = 2000;
  const cap = 60_000; // Max 60s between retries
  // `attempts` is the number of attempts made so far (1 after the first failure),
  // so the exponent starts at 0 and the first retry waits about baseMs
  const exponential = Math.min(baseMs * Math.pow(2, Math.max(0, attempts - 1)), cap);
  const jitter = Math.random() * 1000; // Up to 1s of random spread
  return new Date(Date.now() + exponential + jitter);
}
// After attempt 1: ~2s, attempt 2: ~4s, attempt 3: ~8s, attempt 4: ~16s, capped at 60s
Dead-letter queue depth is your most important operational signal. Tasks that exhaust retries shouldn't just sit in status: 'failed' -- they should move to a separate dead-letter table with dedicated alerting. If that queue fills, something is systematically broken (your email provider is down, the CRM is rejecting writes), and you need to know now rather than when a customer calls back.
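The worker's catch block shown earlier treats every error the same way: it retries until attempts run out, regardless of whether the handler threw a RetryableError or a TerminalError. A sketch of how the decision could branch on the error class follows. `classifyFailure` and its return shape are assumptions for illustration, and the check uses the `name` field that the two error classes set, so it does not depend on how the classes are compiled.

```typescript
class RetryableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'RetryableError';
  }
}

class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'TerminalError';
  }
}

interface FailureDecision {
  status: 'pending' | 'failed';
  reason: 'retry' | 'terminal' | 'exhausted';
}

// `attemptsMade` includes the attempt that just failed.
function classifyFailure(err: unknown, attemptsMade: number, maxAttempts: number): FailureDecision {
  const isTerminal = err instanceof Error && err.name === 'TerminalError';
  if (isTerminal) {
    // Retrying cannot help, so fail immediately and keep the retry budget honest
    return { status: 'failed', reason: 'terminal' };
  }
  if (attemptsMade >= maxAttempts) {
    return { status: 'failed', reason: 'exhausted' };
  }
  // RetryableError and unrecognized errors both get another attempt
  return { status: 'pending', reason: 'retry' };
}

const badAddress = classifyFailure(new TerminalError('Invalid email'), 1, 3);
const rateLimited = classifyFailure(new RetryableError('Rate limited'), 1, 3);
const outOfAttempts = classifyFailure(new RetryableError('Rate limited'), 3, 3);
const unknownFailure = classifyFailure(new Error('socket hang up'), 1, 3);
```

Treating unrecognized errors as retryable is a judgment call. It favors recovery from transient faults, and the attempt limit still bounds the cost when the guess is wrong. Recording the `reason` alongside the dead-letter entry makes it possible to alert differently on bad data than on a provider outage.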
When Tasks Should Expire
Expiry is the feature teams skip most often, and the one they regret most.
Say a customer books an appointment for Thursday at 2pm. The confirmation email task fails and sits in the retry queue for six hours. When it finally executes, you send them a "Your appointment is confirmed!" email at 11pm -- after they've already received a confused callback from your human team. Stale success is worse than visible failure.
Each task type should have a business-realistic expiry window:
| Task Type | Suggested Window | Rationale |
|---|---|---|
| Confirmation email | 30 min | Customer expects it while the call is fresh |
| CRM update | 4 hours | Business SLA for data freshness |
| Appointment reminder | Until T-2h before appointment | Pointless if the appointment already passed |
| Follow-up survey | 48 hours | Response rates drop sharply after that |
| Internal ticket update | 24 hours | Ops team needs timely records |
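The windows in the table can live in one place instead of being repeated at every call site. The sketch below is one way to encode them. `computeExpiresAt` and the `send_survey` type are hypothetical (the task schema earlier has no survey type), and the durations are the suggested values from the table, not measured ones.

```typescript
type ExpiringTaskType =
  | 'send_email'
  | 'update_crm'
  | 'schedule_reminder'
  | 'send_survey' // hypothetical type for the follow-up survey row
  | 'update_ticket';

const MINUTE_MS = 60_000;
const HOUR_MS = 60 * MINUTE_MS;

// Fixed windows, measured from task creation.
const FIXED_WINDOW_MS: Record<Exclude<ExpiringTaskType, 'schedule_reminder'>, number> = {
  send_email: 30 * MINUTE_MS,
  update_crm: 4 * HOUR_MS,
  send_survey: 48 * HOUR_MS,
  update_ticket: 24 * HOUR_MS,
};

function computeExpiresAt(type: ExpiringTaskType, createdAt: Date, appointmentTime?: Date): Date {
  if (type === 'schedule_reminder') {
    // The reminder window is anchored to the appointment, not to creation time
    if (!appointmentTime) {
      throw new Error('schedule_reminder needs an appointment time');
    }
    return new Date(appointmentTime.getTime() - 2 * HOUR_MS);
  }
  return new Date(createdAt.getTime() + FIXED_WINDOW_MS[type]);
}

const createdAt = new Date('2025-01-01T10:00:00Z');
const appointment = new Date('2025-01-02T14:00:00Z');

const emailExpiry = computeExpiresAt('send_email', createdAt);
const crmExpiry = computeExpiresAt('update_crm', createdAt);
const reminderExpiry = computeExpiresAt('schedule_reminder', createdAt, appointment);
```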
When a task expires, don't discard it silently. Move it to dead-letter with reason: 'expired'. A surge in expired confirmation emails means your email provider had an outage during peak call hours -- that's a real operational signal you want to see.
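The polling query shown earlier skips expired tasks but never marks them, so they would sit in `pending` indefinitely. A periodic sweep can close that gap. The sketch below runs against an in-memory array; `sweepExpired` and the `DeadLetterEntry` shape are assumptions for illustration, and a real version would perform the status change and the dead-letter write in one transaction.

```typescript
interface SweepableTask {
  id: string;
  type: string;
  status: 'pending' | 'running' | 'completed' | 'failed' | 'expired';
  expiresAt: Date;
}

interface DeadLetterEntry {
  taskId: string;
  type: string;
  reason: 'expired';
  recordedAt: Date;
}

// Marks overdue pending tasks as expired and returns one dead-letter entry per task.
function sweepExpired(tasks: SweepableTask[], now: Date): DeadLetterEntry[] {
  const entries: DeadLetterEntry[] = [];
  for (const task of tasks) {
    const overdue = task.expiresAt.getTime() <= now.getTime();
    if (task.status === 'pending' && overdue) {
      task.status = 'expired';
      entries.push({ taskId: task.id, type: task.type, reason: 'expired', recordedAt: now });
    }
  }
  return entries;
}

const sweepTime = new Date('2025-01-01T12:00:00Z');
const queue: SweepableTask[] = [
  { id: 'a', type: 'send_email', status: 'pending', expiresAt: new Date('2025-01-01T11:30:00Z') },
  { id: 'b', type: 'update_crm', status: 'pending', expiresAt: new Date('2025-01-01T15:00:00Z') },
  { id: 'c', type: 'send_email', status: 'completed', expiresAt: new Date('2025-01-01T11:00:00Z') },
];

const expiredEntries = sweepExpired(queue, sweepTime);
```

Only task `a` is swept here: `b` is still inside its window and `c` already completed. Counting the returned entries by `type` gives the surge signal described above.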
Monitoring Task Health Alongside Conversations
A task queue you can't observe is just a different kind of black box. The metrics that matter are straightforward:
- Task creation rate by type -- if it drops, your conversation handler may have stopped writing tasks
- Success rate by type -- different task types have different failure baselines; CRM updates will fail differently from email sends
- P95 execution latency -- how long from creation to completion across all task types
- Retry rate -- climbing retry rate is an early signal of tool degradation, often visible before error logs catch up
- Dead-letter queue depth -- this should hover near zero; if it climbs, something is systematically broken
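Most of the metrics above can be derived from the task rows themselves. The sketch below computes four of them over a batch of rows; creation rate is left out because it needs a time window. `summarizeTaskHealth` and the `TaskRow` shape are assumptions for illustration, the P95 uses the nearest-rank method, and dead-letter depth is approximated by counting rows in `failed` or `expired` status rather than reading a separate dead-letter table.

```typescript
interface TaskRow {
  type: string;
  status: 'pending' | 'running' | 'completed' | 'failed' | 'expired';
  attempts: number;
  createdAt: Date;
  completedAt?: Date;
}

interface TaskHealth {
  successRateByType: Record<string, number>;
  p95LatencyMs: number | null;
  retryRate: number;
  deadLetterDepth: number;
}

function summarizeTaskHealth(rows: TaskRow[]): TaskHealth {
  const finished: Record<string, { ok: number; total: number }> = {};
  const latencies: number[] = [];
  let retried = 0;
  let deadLetterDepth = 0;

  for (const row of rows) {
    if (row.attempts > 1) retried += 1;
    if (row.status === 'failed' || row.status === 'expired') deadLetterDepth += 1;
    if (row.status === 'pending' || row.status === 'running') continue; // not finished yet

    const bucket = finished[row.type] || (finished[row.type] = { ok: 0, total: 0 });
    bucket.total += 1;
    if (row.status === 'completed') {
      bucket.ok += 1;
      if (row.completedAt) {
        latencies.push(row.completedAt.getTime() - row.createdAt.getTime());
      }
    }
  }

  const successRateByType: Record<string, number> = {};
  for (const taskType of Object.keys(finished)) {
    successRateByType[taskType] = finished[taskType].ok / finished[taskType].total;
  }

  // Nearest-rank percentile over completed tasks
  latencies.sort((a, b) => a - b);
  const p95LatencyMs =
    latencies.length === 0 ? null : latencies[Math.ceil(0.95 * latencies.length) - 1];

  const retryRate = rows.length === 0 ? 0 : retried / rows.length;
  return { successRateByType, p95LatencyMs, retryRate, deadLetterDepth };
}

const t0 = new Date('2025-01-01T00:00:00Z');
const health = summarizeTaskHealth([
  { type: 'send_email', status: 'completed', attempts: 1, createdAt: t0, completedAt: new Date(t0.getTime() + 1000) },
  { type: 'send_email', status: 'failed', attempts: 3, createdAt: t0 },
  { type: 'update_crm', status: 'completed', attempts: 2, createdAt: t0, completedAt: new Date(t0.getTime() + 5000) },
  { type: 'update_crm', status: 'pending', attempts: 0, createdAt: t0 },
]);
```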
The important insight is that task health and conversation quality are connected metrics, not separate ones. If your confirmation email failure rate climbs, your callback rate follows 30 minutes later. Watching both together lets you diagnose the upstream cause before it becomes a customer complaint.
Chanl's monitoring dashboard surfaces this correlation directly: task execution metrics alongside conversation scorecards in the same view. You can also run scenario tests that validate the full post-conversation workflow -- not just "did the agent say the right thing" but "did every downstream task actually execute and complete." This matters especially for teams using memory-backed agents, where a failed CRM update means the agent's memory layer serves stale context on the customer's next call.
For teams hitting reliability failures more broadly, the circuit breaker pattern in Circuit Breakers for AI Agents: Stop the 3 AM Meltdown complements the task queue pattern well: circuit breakers prevent your task worker from hammering a degraded external service during an outage, and the task queue gives you the retry budget to recover cleanly once it recovers.
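To make the pairing concrete, here is a minimal circuit breaker sketch that a task worker could consult before dispatching. It is an illustration under stated assumptions, not the implementation from the referenced article: `CircuitBreaker`, its method names, and the thresholds are all hypothetical, and time is passed in as a number so the behavior is easy to follow.

```typescript
class CircuitBreaker {
  private failures = 0;
  private openedAtMs: number | null = null;
  private threshold: number;
  private cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // True while calls should be skipped because the service looks unhealthy.
  isOpen(nowMs: number): boolean {
    if (this.openedAtMs === null) return false;
    if (nowMs - this.openedAtMs < this.cooldownMs) return true;
    // Cooldown elapsed: allow a probe call, but stay one failure away from reopening
    this.openedAtMs = null;
    this.failures = this.threshold - 1;
    return false;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAtMs = null;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAtMs = nowMs;
  }
}

// Open after 3 consecutive failures, stay open for 30 seconds.
const crmBreaker = new CircuitBreaker(3, 30_000);
crmBreaker.recordFailure(0);
crmBreaker.recordFailure(1_000);
const openBeforeThreshold = crmBreaker.isOpen(1_500);
crmBreaker.recordFailure(2_000); // third failure opens the circuit
const openAfterThreshold = crmBreaker.isOpen(2_500);
const openAfterCooldown = crmBreaker.isOpen(32_000); // cooldown over, probe allowed
crmBreaker.recordFailure(32_500); // the probe failed
const reopenedByProbe = crmBreaker.isOpen(33_000);
```

When the breaker is open, the worker can leave the task `pending` with a later `nextRetryAt` and skip the attempt increment, so an outage does not burn through the retry budget and push healthy tasks into dead-letter.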
The Idempotency Key + Task Queue Pattern Together
Here's how a complete booking flow looks when you combine these patterns:
// Called when agent confirms booking during conversation
async function onBookingConfirmed(
  session: ConversationSession,
  booking: BookingDetails,
) {
  const now = new Date();
  await db.tasks.createMany({
    data: [
      {
        conversationId: session.id,
        type: 'send_email',
        payload: { template: 'booking_confirmation', to: session.userId, booking },
        status: 'pending',
        createdAt: now,
        expiresAt: addMinutes(now, 30),
        attempts: 0,
        maxAttempts: 3,
      },
      {
        conversationId: session.id,
        type: 'update_crm',
        payload: { userId: session.userId, outcome: 'booked', booking },
        status: 'pending',
        createdAt: now,
        expiresAt: addHours(now, 4),
        attempts: 0,
        maxAttempts: 5,
      },
      {
        conversationId: session.id,
        type: 'schedule_reminder',
        payload: { userId: session.userId, appointmentTime: booking.datetime },
        status: 'pending',
        createdAt: now,
        expiresAt: subHours(booking.datetime, 2),
        attempts: 0,
        maxAttempts: 3,
      },
    ],
  });
  // Agent gets "tasks scheduled" back immediately -- no waiting
  return { tasksScheduled: 3 };
}
If you'd rather use a managed queue service instead of database polling, BullMQ (Redis-backed) and Inngest (durable functions, first-class retry semantics) map cleanly to this schema. The pattern is identical; the infrastructure layer changes.
For teams already on MCP, the Tasks primitive introduced in the 2026 MCP specification brings native lifecycle management -- retry semantics, expiry policies, and task status callbacks -- at the protocol level. The database task queue above is functionally equivalent, and the concepts translate directly if you migrate to MCP Tasks later. We covered how the A2A and MCP protocol stack fits together in A2A and MCP: Building the Agent Protocol Stack from Scratch.
What Good Looks Like
A production CX agent that handles async tasks well has a few visible properties that distinguish it from one that doesn't:
Conversations end in under 200ms even when complex downstream processing is queued. The agent writes task records -- never task results.
The task health dashboard is boring. Success rates hover near 100%. Dead-letter depth is zero. When something spikes, the on-call engineer knows within minutes and has task type, error message, affected conversation IDs, and retry history to diagnose from.
Failed tasks show up in incident reports, not customer callbacks. When a CRM integration goes down, you see it in dead-letter depth before the first customer calls wondering why the agent didn't follow up.
Customers don't know tasks exist. From their perspective, the confirmation email just arrived. The CRM entry was there when the account manager pulled it up. The reminder fired at exactly the right time. The work happened; they just didn't see the mechanics.
That's the goal: invisible reliability. Your agent handles the conversation. The task queue handles everything else. Neither one waits for the other.



