
How to Build a Cart Recovery Agent (and When to Send Nothing)

Most cart-recovery flows are three discount emails. A real recovery agent decides why the customer left, picks the right channel, caps the discount to protect margin, and sometimes sends nothing at all. Here is how to build it.

Lucas Dalamarta, Engineering Lead
April 30, 2026
10 min read

A customer adds a wool coat to their cart for the third time this month. Your recovery flow fires the same three emails it fires for every other abandoner. Email one: "You left something behind." Email two: same coat, 10% off. Email three: same coat, 20% off.

They never bought it the first two times. They will not buy it now. But you just gave them a 20% discount they did not need, on a sale they were going to make anyway when their next paycheck cleared, on the day they always buy from you.

The discount-blast era is over. The next era is decisioning, and the most important decision a cart-recovery agent makes is whether to send anything at all.

The Discount-Blast Era Is Over

Cart abandonment is a real problem. Baymard's analysis across more than 50 studies puts the average rate at 70.22%, with mobile at 80.2% and tablet at 80.74%. Baymard estimates $260 billion in recoverable sales lost annually in the US and EU alone. So far so good. The math says recovery is worth solving.

But the standard solution is a three-email sequence with progressively larger discounts. Klaviyo's own benchmark report shows abandoned-cart flows convert at 3.33% on average, and even top-decile programs cap out around 7.69%. That means roughly 92 to 97 of every 100 messages do nothing useful, and a meaningful fraction of the wins would have happened anyway.

The interesting opportunity is in the bottom 92. Why are most of those carts unrecoverable? Because most of those carts were never going to convert. Baymard's customer survey found that 43% of US shoppers had abandoned a cart in the last three months because "I was just browsing or not ready to buy." That is close to half of your abandoners, and the standard playbook treats them identically to a customer who hit a payment failure.

A reason-aware agent treats them differently.

Why "No Message" Is the Feature

The single most underrated branch in any recovery system is the one where the agent decides not to message. Not because messaging failed. Because messaging would have been wrong.

There are five reasons the no-message branch matters:

  1. Margin protection. A comparison shopper who always converts anyway does not need a 20% coupon.
  2. Frequency hygiene. A customer who abandoned three times this week is not going to be persuaded by message four.
  3. Suppression-list discipline. A customer who unsubscribed last quarter must not be re-engaged just because Shopify fired a webhook.
  4. Compliance. SMS without prior express written consent is a TCPA violation at $500 to $1,500 per message.
  5. Brand restraint. Some carts are quiet by design. The customer is thinking. Pinging them rushes the decision and often kills it.

The skip is a first-class action with a logged reason, not a missing branch. Every skip should answer: which reason, which segment, which model decided. That is the data that lets you tune the agent.
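One way to make the skip a first-class action is to give it its own record type. The sketch below is illustrative only: the `SkipRecord` shape, the `SkipReason` names (taken loosely from the five reasons above), and the in-memory log are assumptions, not part of any particular platform.

```typescript
// Hypothetical record shape for a logged skip; field names are illustrative.
type SkipReason = 'margin' | 'frequency' | 'suppression' | 'compliance' | 'brand_restraint';

interface SkipRecord {
  cartId: string;
  reason: SkipReason; // which reason
  segment: string;    // which segment, e.g. 'comparison_shopper'
  decidedBy: string;  // which model or rule made the call
  at: string;         // ISO timestamp
}

// Append a skip to the log. In production this would be a database write.
function recordSkip(
  log: SkipRecord[],
  entry: Omit<SkipRecord, 'at'>,
  now: Date = new Date(),
): SkipRecord {
  const record: SkipRecord = { ...entry, at: now.toISOString() };
  log.push(record);
  return record;
}

// Count skips per reason so they can be tuned like any other action.
function skipsByReason(log: SkipRecord[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const r of log) counts[r.reason] = (counts[r.reason] ?? 0) + 1;
  return counts;
}
```

Because every record carries reason, segment, and decider, a question like "how many margin skips did router-v1 make last week?" becomes a filter over one table.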

Now let's build the agent that gets to that decision.

Step 1: Catch the Abandon Event

The trigger is the easy part. Shopify emits a checkouts/update webhook whenever a checkout is created or modified. Abandonment itself shows up as silence after the last update, not as something you need to wait for, so treat each update as a reset: schedule a recovery decision at three offsets (15 minutes, 1 hour, 24 hours) from the latest activity, and dedupe by cart id so a still-active cart does not pile up jobs.

webhook.ts·typescript
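import { Queue } from 'bullmq';

// Assumed shapes: a Redis connection config and the subset of the Shopify
// checkout payload that this handler actually reads.
const redis = { host: process.env.REDIS_HOST ?? 'localhost', port: 6379 };
type ShopifyCheckout = { token: string; email?: string; customer?: { id: number } };

const recoveryQueue = new Queue('cart-recovery', { connection: redis });

export async function handleCheckoutUpdate(payload: ShopifyCheckout) {
  const cartId = payload.token;
  const userId = payload.customer?.id;
  if (!userId || !payload.email) return; // anonymous, nothing we can do

  // Cancel any pending jobs for this cart. Only the latest update matters.
  const existing = await recoveryQueue.getJobs(['delayed']);
  for (const j of existing) {
    if (j.data.cartId === cartId) await j.remove();
  }

  // Three decision points. Each one is a chance to message, or to skip.
  const offsetsMin = [15, 60, 1440];
  for (const [i, delayMin] of offsetsMin.entries()) {
    await recoveryQueue.add(
      `recover-${cartId}-${delayMin}`,
      { cartId, userId, attempt: i + 1, delayMin }, // attempt is 1, 2, or 3
      // '-' separator: ':' is BullMQ's own Redis key separator, so keep it out of custom ids.
      { delay: delayMin * 60 * 1000, jobId: `${cartId}-${delayMin}` },
    );
  }
}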

The explicit getJobs(['delayed']) + j.remove() loop is what actually deduplicates: when the cart updates, we drop any pending recovery jobs for that cart before scheduling the new ones. The jobId is a second guard. BullMQ silently ignores adds with a duplicate jobId, so even if two webhook deliveries race, you only get one job per offset.
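The jobId guard only works if the ids are deterministic. A small sketch of how the schedule could be computed separately from the queue, so it can be unit-tested without Redis. The `planRecoveryJobs` helper, the `PlannedJob` shape, and the hyphen separator are assumptions for illustration.

```typescript
// Hypothetical pure planner: same cart id in, same job ids out.
interface PlannedJob {
  name: string;
  jobId: string;   // deterministic, so a duplicate delivery maps to the same id
  delayMs: number;
  attempt: number; // 1-based decision point
}

const OFFSETS_MIN = [15, 60, 1440]; // 15 minutes, 1 hour, 24 hours

function planRecoveryJobs(cartId: string, offsetsMin: number[] = OFFSETS_MIN): PlannedJob[] {
  return offsetsMin.map((delayMin, i) => ({
    name: `recover-${cartId}-${delayMin}`,
    jobId: `${cartId}-${delayMin}`,
    delayMs: delayMin * 60 * 1000,
    attempt: i + 1,
  }));
}
```

The webhook handler would then loop over the plan and call the queue's add for each entry. Two racing deliveries for the same cart produce identical plans, which is what lets the duplicate-id check collapse them.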

That handles the trigger. Now the worker has to decide what to actually do.

Step 2: Infer the Reason

This is where most teams stop and fall back to "send the discount email." Do not stop here. Before deciding the channel, decide why the cart was abandoned, because the reason determines everything that follows.

Feed the LLM the cart contents, the most recent browse session events, any payment-attempt logs, and a summary of the customer's history. Constrain the output with a Zod schema so the model cannot invent categories your router does not understand.

infer-reason.ts·typescript
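import { z } from 'zod';
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai'; // provider for the openai(...) model call below

// Assumed shape of the context the worker assembles before inference.
interface CartContext {
  cart: unknown;
  browse: unknown[];
  paymentAttempts: unknown[];
  history: string;
}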
 
const ReasonSchema = z.object({
  reason: z.enum([
    'price_concern',
    'shipping_cost',
    'sizing_doubt',
    'payment_fail',
    'comparison_shopping',
    'distracted',
    'churned',
  ]),
  confidence: z.number().min(0).max(1),
  evidence: z.string().max(280),
});
 
export async function inferReason(ctx: CartContext) {
  const { object } = await generateObject({
    model: openai('gpt-4o-mini'),
    schema: ReasonSchema,
    prompt: `Given this cart and customer context, classify the most likely
reason for abandonment. Use exactly one enum value. Cite evidence from the
provided data. Do not speculate.
 
Cart: ${JSON.stringify(ctx.cart)}
Browse: ${JSON.stringify(ctx.browse.slice(-20))}
Payments: ${JSON.stringify(ctx.paymentAttempts)}
History: ${ctx.history}`,
  });
  return object;
}

The schema is the contract with the rest of the pipeline. Without it the model will improvise reasons like "user-experience friction" that your router does not handle, and the cart silently falls through to the default email blast. The Zod call refuses non-matching output before it can poison downstream logic.
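When the schema rejects the model's output, the call fails rather than returning a value, so the worker needs an explicit answer for that case. The sketch below shows one option, assuming the inference call throws on a mismatch: convert the failure into a logged skip instead of letting it fall through. The `classifyOrSkip` wrapper and the `Outcome` type are hypothetical names, and skipping (rather than retrying or defaulting to email) is a design assumption.

```typescript
// Minimal stand-in for the validated inference result.
type Inference = { reason: string; confidence: number; evidence: string };

type Outcome =
  | { kind: 'classified'; inference: Inference }
  | { kind: 'skip'; note: string };

// Wrap any classifier so a validation failure becomes an explicit decision.
async function classifyOrSkip(classify: () => Promise<Inference>): Promise<Outcome> {
  try {
    return { kind: 'classified', inference: await classify() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { kind: 'skip', note: `inference_failed: ${message}` };
  }
}
```

In the worker this would wrap the Step 2 call, for example `classifyOrSkip(() => inferReason(ctx))`, and the `note` lands in the same skip log as every other no-message decision.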

The model now answers a real question: not "did they abandon?" but "why?". Now we can route on that answer.

Step 3: Route to a Channel (or Skip)

The router maps reason to channel. Critically, two of the seven reasons map to skip. Those are not bugs. Those are the most valuable branches in the system, because they are the ones that protect margin and brand.

router.ts·typescript
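import { z } from 'zod';
// ReasonSchema is the schema defined in Step 2; export it from infer-reason.ts.
import { ReasonSchema } from './infer-reason';

type Reason = z.infer<typeof ReasonSchema>['reason'];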
type Channel = 'in_app_modal' | 'chat' | 'sms' | 'email' | 'skip';
 
const ROUTING: Record<Reason, { channel: Channel; note: string }> = {
  payment_fail:        { channel: 'in_app_modal', note: 'fix card on next visit' },
  sizing_doubt:        { channel: 'chat',         note: 'offer fit assistant' },
  price_concern:       { channel: 'sms',          note: 'one-time code if margin > 15%' },
  shipping_cost:       { channel: 'email',        note: 'ship-threshold reminder' },
  distracted:          { channel: 'email',        note: 'gentle nudge only' },
  comparison_shopping: { channel: 'skip',         note: 'will convert without help' },
  churned:             { channel: 'skip',         note: 'on suppression list' },
};
 
export function route(reason: Reason, confidence: number) {
  if (confidence < 0.6) return { channel: 'email' as const, note: 'low-conf default' };
  return ROUTING[reason];
}

Two design choices matter here. First, low-confidence inferences fall back to email, the lowest-cost and lowest-risk channel. The agent never gambles an SMS-class action on a guess. Second, every skip carries a note. When you query "why did we skip 4,200 carts last week?" the answer is in the same row, not lost.
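The routing table records the price_concern condition ("one-time code if margin > 15%") only as a note. A hedged sketch of how that note could become an enforced gate that turns a thin-margin SMS into a skip. The `applyMarginGate` helper and the `productMargin` input are assumptions; the 15% threshold comes from the note itself.

```typescript
type Channel = 'in_app_modal' | 'chat' | 'sms' | 'email' | 'skip';

interface Decision {
  channel: Channel;
  note: string;
}

const MIN_MARGIN_FOR_CODE = 0.15; // threshold taken from the routing note

// Downgrade a price_concern SMS to a skip when the product margin is too thin.
function applyMarginGate(decision: Decision, reason: string, productMargin: number): Decision {
  if (reason !== 'price_concern' || decision.channel !== 'sms') return decision;
  if (productMargin > MIN_MARGIN_FOR_CODE) return decision;
  return { channel: 'skip', note: 'margin too thin for a one-time code' };
}
```

Running the gate after `route()` keeps the table declarative while making sure the margin condition is checked in code, with its own logged note when it fires.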

We have a channel. Now we have to make sure the message we send does not torch margin.

Step 4: Cap the Discount

Discounts are dangerous. A model that has read every recovery email on the internet knows that "20% off" is a normal recovery move, and it will happily propose 20% off a low-margin SKU to a high-LTV customer who would have bought at full price. That is a margin disaster wearing a friendly face.

The cap belongs in code, not in the prompt. The prompt can be argued with. The cap cannot.

cap.ts·typescript
export function capDiscount(input: {
  proposed: number;       // 0..1
  productMargin: number;  // 0..1
  customerLtv: number;    // dollars
  attempt: 1 | 2 | 3;
}) {
  const marginCeiling = Math.max(0, input.productMargin - 0.05); // always keep at least 5 points of margin
  const ltvCeiling = input.customerLtv > 500 ? 0.10 : 0.20;
  const attemptCeiling = [0.10, 0.15, 0.20][input.attempt - 1];
  const cap = Math.min(marginCeiling, ltvCeiling, attemptCeiling);
  const final = Math.min(input.proposed, cap);
  return { final, capped: input.proposed > cap, cap };
}

Three guardrails. The product margin sets the hard ceiling: no discount may leave less than five points of margin. The customer's LTV says high-value customers do not need big discounts. The attempt number says first-touch should be cheap; only escalate if the customer is still on the fence. Whichever ceiling is lowest wins. The function reports capped: true so analytics can show how often the model is asking for more than it should get.
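A worked example makes the "lowest ceiling wins" rule concrete. The function below is reproduced from cap.ts so the snippet runs on its own; the three input scenarios are made-up numbers for illustration.

```typescript
function capDiscount(input: {
  proposed: number;       // 0..1
  productMargin: number;  // 0..1
  customerLtv: number;    // dollars
  attempt: 1 | 2 | 3;
}) {
  const marginCeiling = Math.max(0, input.productMargin - 0.05);
  const ltvCeiling = input.customerLtv > 500 ? 0.10 : 0.20;
  const attemptCeiling = [0.10, 0.15, 0.20][input.attempt - 1];
  const cap = Math.min(marginCeiling, ltvCeiling, attemptCeiling);
  const final = Math.min(input.proposed, cap);
  return { final, capped: input.proposed > cap, cap };
}

// High-LTV customer, second attempt: the LTV ceiling (0.10) is the lowest.
const highLtv = capDiscount({ proposed: 0.20, productMargin: 0.22, customerLtv: 800, attempt: 2 });
// → { final: 0.1, capped: true, cap: 0.1 }

// Modest proposal on a healthy margin: nothing to cap.
const modest = capDiscount({ proposed: 0.05, productMargin: 0.40, customerLtv: 120, attempt: 1 });
// → { final: 0.05, capped: false, cap: 0.1 }

// Margin below five points: the margin ceiling is 0, so no discount at all.
const thin = capDiscount({ proposed: 0.10, productMargin: 0.04, customerLtv: 120, attempt: 3 });
// → { final: 0, capped: true, cap: 0 }
```

The third case is the one to watch in analytics: a SKU with less than five points of margin can never carry a discount, regardless of what the model proposes.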

Now we have a reason, a channel, and a safe discount. The last piece is what happens when the customer actually replies.

Step 5: Pick Up the Conversation in Chat

The recovery message says "we noticed you were eyeing the wool coat. Questions?" The customer replies "yes, what's your return policy on outerwear?" If the chat agent that picks up the thread does not already know about the coat, the recovery message becomes a bait-and-switch. The customer has to re-explain.

That re-explanation is where most multi-channel recovery breaks. The recovery system and the chat system live in different code paths, talk to different LLMs, and reset on every message. Avoid that by passing the cart and reason as session metadata so the chat thread starts pre-loaded.
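As a rough sketch of what "pre-loaded" can mean, the recovery worker can turn the cart and inferred reason into a short preamble that the chat agent receives before the first customer message. The `RecoveryContext` shape and the `buildChatPreamble` helper are hypothetical; how the text is attached to a session depends on your chat stack.

```typescript
// Hypothetical context handed from the recovery worker to the chat agent.
interface RecoveryContext {
  cartId: string;
  reason: string;
  items: { title: string; price: number }[];
  priorAbandons: number;
}

// Render the context as plain text that can seed a system prompt.
function buildChatPreamble(ctx: RecoveryContext): string {
  const items = ctx.items.map((i) => `${i.title} ($${i.price.toFixed(2)})`).join(', ');
  return [
    `Cart ${ctx.cartId} contains: ${items}.`,
    `Inferred abandonment reason: ${ctx.reason}.`,
    `Prior abandons of this cart: ${ctx.priorAbandons}.`,
    'Do not ask the customer to repeat what is in their cart.',
  ].join('\n');
}
```

With this in place, the reply "what's your return policy on outerwear?" arrives in a thread that already knows the coat is the item in question.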

[Figure: Cart-abandon recovery decision tree with no-message branch. Flow: checkouts/update webhook, then BullMQ delayed job, then pull cart + browse + history + memory, then reason inference (Zod schema), then "Confidence > 0.6?". No: fallback to a gentle email. Yes: branch on reason. payment_fail: in-app modal on next visit. sizing_doubt: chat with sizing tool. price_concern: "Margin > 15%?", Yes: SMS one-time code, No: SKIP (margin). shipping_cost: email with ship threshold. distracted: email, gentle nudge. comparison_shopping: SKIP (log reason). churned: SKIP (suppression). Replies continue in chat pickup with cart context.]

Notice the diagram has three skip branches. That is not a coincidence. A well-designed recovery agent will spend a lot of its time choosing not to act, and that is the source of most of its margin lift over the discount-blast playbook.

Before you ship any of this, the compliance cage.

The Compliance Cage

A multichannel recovery agent crosses three regulatory regimes, and any of them can put a fast-growing brand on the wrong end of a class action. Treat the rules as preconditions, not afterthoughts.

Regime | What it requires | What kills you
CAN-SPAM (US email) | Truthful headers, accurate subject, physical address in every message, working opt-out honored within 10 business days, opt-out functional for 30 days | Up to $53,088 per non-compliant message per the FTC
TCPA (US SMS) | Prior express written consent for marketing texts; opt-out rule effective April 11, 2025; one-to-one consent rule effective January 26, 2026 | Private right of action, $500 to $1,500 per message, no proof of harm required
GDPR (EU email/SMS) | Either explicit consent or legitimate interest with a documented three-part assessment, soft opt-in only for similar products, one-click opt-out in every message | Up to 4% of global revenue or 20M EUR, whichever is higher

The implementation answers two questions per send. First, do we have a consent record that covers this channel for this recipient? Second, is the recipient on a suppression list (unsubscribe, hard bounce, complaint)? If either fails, the agent skips the message and logs the reason. The skip is the compliant move.
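The two questions translate into a small gate that runs immediately before every send. This is a minimal sketch: the `Recipient` shape, the per-channel consent map, and the single `suppressed` flag are simplifying assumptions, and real consent records carry more detail (timestamp, source, wording shown).

```typescript
type SendChannel = 'email' | 'sms' | 'chat' | 'in_app_modal';

// Hypothetical recipient record; real systems store richer consent evidence.
interface Recipient {
  id: string;
  consent: Partial<Record<SendChannel, boolean>>; // true only when a record covers the channel
  suppressed: boolean;                            // unsubscribe, hard bounce, or complaint
}

type GateResult =
  | { allowed: true }
  | { allowed: false; skipReason: 'no_consent' | 'suppressed' };

// Both checks must pass; a failure is returned as a loggable skip reason.
function complianceGate(recipient: Recipient, channel: SendChannel): GateResult {
  if (recipient.suppressed) return { allowed: false, skipReason: 'suppressed' };
  if (recipient.consent[channel] !== true) return { allowed: false, skipReason: 'no_consent' };
  return { allowed: true };
}
```

Missing consent is treated the same as refused consent: the check is `!== true`, so an absent record can never authorize a send.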

How Chanl Shortens This

Most of the moving parts above are infrastructure you would otherwise build from scratch: persistent memory of prior abandonments, capped tools, scenario tests for the no-message branch, and scorecards that grade the agent on more than just revenue. With those primitives in place, the recovery worker collapses to a few SDK calls.

recovery-worker.ts·typescript
import { Chanl } from '@chanl/sdk';
const chanl = new Chanl({ apiKey: process.env.CHANL_API_KEY! });
 
// 1. Memory: has this customer abandoned this product before?
const { data: prior } = await chanl.memory.search({
  entityType: 'customer',
  entityId: customer.id,
  query: `abandoned cart with ${cart.lineItems[0].title}`,
});
const priorCarts = prior.memories;
 
// 2. Tool: capped discount tool registered once at provisioning time.
//    The cap lives on the tool schema, not the prompt. The model cannot exceed it.
await chanl.tools.create({
  name: 'send_chat_recovery',
  description: 'Send recovery message with bounded discount.',
  type: 'http',
  inputSchema: {
    type: 'object',
    properties: { discount: { type: 'number', maximum: 0.15 } },
    required: ['discount'],
  },
  configuration: {
    http: { method: 'POST', url: 'https://internal/recovery/send' },
  },
});
 
// 3. Inferred reason and routing decision happen here.
//    inferReason() and route() are the helpers from Steps 2 + 3 above.
const reason = await inferReason({ cart, browse, history: priorCarts });
const decision = route(reason.reason, reason.confidence);
if (decision.channel === 'skip') {
  await chanl.memory.create({
    entityType: 'customer',
    entityId: customer.id,
    content: `Recovery skipped: ${decision.note}`,
  });
  return;
}
 
// 4. Chat pickup: when the customer replies, the session knows the cart.
const { data: session } = await chanl.chat.createSession(recoveryAgentId, {
  metadata: { cartId: cart.id, recoveryReason: reason.reason, priorCarts: priorCarts.length },
});
 
// 5. Scorecard: grade every recovery on the axes that matter.
await chanl.scorecards.evaluate(session.sessionId, {
  scorecardId: 'cart-recovery-v1', // axes: reason_correct, channel_appropriate, discount_capped, converted
});

Five things to notice. First, memory.search lets the agent know it is the customer's third attempt without you maintaining a parallel customer-history database. Second, the discount cap rides on the tool definition, so even if the prompt drifts, the schema enforces the ceiling. Third, the chat session metadata seeds the system prompt with cart context, so the customer never re-explains. Fourth, scorecard evaluation gives you the diagnostic axes (reason accuracy, channel fit, cap adherence, conversion) that you cannot get from revenue alone. Fifth, scenario tests, run via chanl.scenarios.run({ persona: 'comparison-shopping-customer' }) in CI, assert the agent skips the comparison shopper segment before any change ships.

The product opportunity worth flagging: today this lives in your worker. A first-class chanl.workflows.run({ steps, delays }) API would let the whole abandonment cascade live inside Chanl with full observability across the cascade. That is on the roadmap.

What to Measure

Recovered revenue is the headline number, but it is not the diagnostic. The agent earns its keep on five axes:

  1. Reason-inference accuracy. Sample 200 carts a week, label by hand, compare. Drop below 75%? Retrain or refine the prompt.
  2. Channel appropriateness. The percentage of sends where the channel matched a manual reviewer's judgment.
  3. Discount-cap hit rate. How often the LLM proposed more than the cap. Rising is a prompt-drift signal.
  4. Skip rate by reason. Comparison-shopping skips should rise as the model gets better. Distracted skips should fall as the agent learns who is actually convertible.
  5. Suppression hygiene. Zero sends to opted-out recipients. Always. Any other number is a compliance incident.

Build for those five and the recovered-revenue number takes care of itself.
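Three of the five axes fall straight out of the decision log. The sketch below assumes a hypothetical `DecisionRow` shape with one row per recovery decision; reason accuracy and channel appropriateness are left out because they need hand labels.

```typescript
// Hypothetical decision-log row; one per recovery decision, sends and skips alike.
interface DecisionRow {
  reason: string;
  channel: string;            // 'skip' when no message was sent
  capped: boolean;            // true when the proposed discount exceeded the cap
  recipientOptedOut: boolean;
}

function summarize(rows: DecisionRow[]) {
  const sends = rows.filter((r) => r.channel !== 'skip');

  // Tally totals and skips per inferred reason.
  const totals: Record<string, { all: number; skipped: number }> = {};
  for (const r of rows) {
    const t = totals[r.reason] ?? { all: 0, skipped: 0 };
    t.all += 1;
    if (r.channel === 'skip') t.skipped += 1;
    totals[r.reason] = t;
  }
  const skipRateByReason: Record<string, number> = {};
  for (const reason of Object.keys(totals)) {
    skipRateByReason[reason] = totals[reason].skipped / totals[reason].all;
  }

  return {
    capHitRate: sends.length ? sends.filter((r) => r.capped).length / sends.length : 0,
    skipRateByReason,
    sendsToOptedOut: sends.filter((r) => r.recipientOptedOut).length, // anything above 0 is an incident
  };
}
```

Wiring `sendsToOptedOut` to an alert rather than a dashboard matches the "zero, always" standard: it is a tripwire, not a trend line.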

The cart-recovery problem is not "send more emails." It is "decide better." The agent that decides better will sometimes decide to do nothing, and that is the move that buys back margin you did not know you were leaking. Build the no-message branch first. Everything else is just configuration.


Related reading: Memory for the prior-cart lookup, Tools for the capped-discount pattern, Scenarios for the comparison-shopper test, and Scorecards for the recovery axes.
