How to Build Idempotent Tool Calls for AI Agents

Naive retry logic charges customers twice, sends duplicate emails, and fires double webhooks. Here's how to build idempotent tool calls for AI agents with idempotency keys, deduplication, and safe retries.

Dean Grover, Co-founder
May 24, 2026
[Image: a control panel with a retry button that returns the same green checkmark on every press, illustrating idempotent operations]

Your agent is booking a customer's flight. It calls the booking tool, the network blips, the call times out. The agent retries. The customer now has two tickets, one very bad day, and a support thread that's about to get loud.

This is the idempotency problem. It's not exotic, and it's not something you catch in testing because your test environment has reliable networks. It only surfaces in production, at the worst possible moment, to your most frustrated customers.

The fix isn't "handle errors better." It's building tool calls that are safe to call more than once.

Why Agents Are Especially Vulnerable to Duplicate Calls

Most HTTP APIs have some retry risk. Agents have more than most, for three reasons.

Automatic retry logic. A well-designed agent retries transient failures. A network timeout on a booking call gets retried. If the original call succeeded before the timeout, you now have two bookings.

LLM re-planning. If the model loses track of what it already did, because the context was truncated, because tool results weren't injected correctly, or because a multi-step plan got interrupted, it may re-issue a tool call it already made. The tool has no way to know this is a duplicate without explicit deduplication.

Parallel tool execution. Agents that call tools in parallel can race. Two concurrent calls to the same tool with the same arguments create the same side effect twice.

Add these together and you get an agent that charges customers twice, sends duplicate confirmation emails, creates duplicate support tickets, or triggers two webhooks for the same event. The kind of bug your customers find before you do.

The Idempotency Key Pattern

An idempotency key is a unique identifier for one logical operation, shared by every attempt at that operation. You generate it before the first call and send the same key with each retry. The server stores the result the first time it sees that key, then returns the cached result every subsequent time, without re-executing the operation.

idempotency-key.ts·typescript
import { createHash } from "crypto";
 
function generateIdempotencyKey(
  conversationId: string,
  toolCallId: string,
  toolName: string,
  args: Record<string, unknown>
): string {
  const payload = JSON.stringify({
    conversationId,
    toolCallId,
    toolName,
    args,
  });
 
  return createHash("sha256").update(payload).digest("hex").slice(0, 32);
}
 
async function callTool(
  toolName: string,
  args: Record<string, unknown>,
  context: { conversationId: string; toolCallId: string }
): Promise<ToolResult> {
  const idempotencyKey = generateIdempotencyKey(
    context.conversationId,
    context.toolCallId,
    toolName,
    args
  );
 
  return await tools.call(toolName, args, {
    headers: { "Idempotency-Key": idempotencyKey },
  });
}

Two notes on key design worth getting right from the start:

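Include toolCallId in the hash. This prevents a legitimate second call (the agent genuinely deciding to issue a second refund after the first one settled) from being blocked by the key from the first call. Each tool call the LLM generates gets a unique ID from the framework. Use that. The trade-off: a call the model re-issues after losing context gets a new toolCallId and therefore a new key, so this key deduplicates executor-level retries, not re-planned calls. Catching those takes a key derived from the business operation itself (for example, order ID plus action).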

Don't include timestamps or random values in the key. If you do, a retry generates a different key, which defeats the entire purpose.
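One more source of accidental key drift is worth a sketch: JSON.stringify follows property insertion order, so the same arguments rebuilt in a different order hash to a different key. The variant below sorts object keys before hashing. The names stableStringify and stableIdempotencyKey are illustrative, and the sketch assumes args holds plain JSON values (no Dates, Maps, or class instances).

```typescript
import { createHash } from "node:crypto";

// Recursively serialize a value with object keys sorted, so two argument
// objects with the same contents always produce the same string.
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== "object") {
    // JSON.stringify returns undefined for undefined and functions.
    return JSON.stringify(value) ?? "null";
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  const record = value as Record<string, unknown>;
  const entries = Object.keys(record)
    .filter((k) => record[k] !== undefined)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(record[k])}`);
  return `{${entries.join(",")}}`;
}

function stableIdempotencyKey(
  conversationId: string,
  toolCallId: string,
  toolName: string,
  args: Record<string, unknown>
): string {
  const payload = stableStringify({ conversationId, toolCallId, toolName, args });
  return createHash("sha256").update(payload).digest("hex").slice(0, 32);
}

// Same call, arguments assembled in a different order on the retry.
const firstKey = stableIdempotencyKey("conv-1", "call-1", "issue_refund", {
  orderId: "A1",
  amount: 500,
});
const retryKey = stableIdempotencyKey("conv-1", "call-1", "issue_refund", {
  amount: 500,
  orderId: "A1",
});
console.log(firstKey === retryKey); // true
```

If your executor always retries with the exact same args object, the plain JSON.stringify version behaves the same; the sorted form only matters when arguments are re-parsed or rebuilt between attempts.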

Which Tool Calls Are Safe to Retry Naively?

Not every tool call needs idempotency handling. Categorize your tools before you build the retry layer:

Naturally idempotent, no action needed:

  • Get customer profile
  • Check order status
  • Search knowledge base
  • Look up product details
  • Fetch conversation history

These are read operations. Calling them twice returns the same data. No side effects, no deduplication needed.

Needs idempotency keys:

  • Process payment or issue refund
  • Book appointment or reservation
  • Send email or SMS notification
  • Create support ticket
  • Trigger external webhook
  • Subscribe or cancel subscription
  • Update a customer record

These are write operations with real-world side effects. Every one of these should accept and respect an idempotency key.

Needs idempotency key AND status check:

  • Long-running operations (document generation, large exports)
  • Third-party workflow triggers (Salesforce record create, Zendesk ticket open)
  • Any call where the response is "started" rather than "completed"

For this last category, you also need a separate status endpoint: GET /operations/{operationId}/status. The retry path checks whether the original operation completed before deciding whether to retry.

tool-categories.ts·typescript
const toolCategories = {
  idempotent: [
    "get_customer",
    "check_order_status",
    "search_knowledge_base",
  ],
  requiresKey: [
    "process_payment",
    "issue_refund",
    "send_email",
    "create_ticket",
    "book_appointment",
  ],
  requiresKeyAndStatus: [
    "generate_report",
    "trigger_workflow",
    "bulk_export",
  ],
} as const;
 
type RetryStrategyType =
  | "simple"
  | "idempotent-key"
  | "idempotent-with-status-check"
  | "no-retry";
 
function getRetryStrategy(toolName: string): {
  type: RetryStrategyType;
  maxAttempts: number;
  backoffMs: number;
} {
  if ((toolCategories.idempotent as readonly string[]).includes(toolName)) {
    return { type: "simple", maxAttempts: 3, backoffMs: 1000 };
  }
  if ((toolCategories.requiresKey as readonly string[]).includes(toolName)) {
    return { type: "idempotent-key", maxAttempts: 3, backoffMs: 2000 };
  }
  if ((toolCategories.requiresKeyAndStatus as readonly string[]).includes(toolName)) {
    return { type: "idempotent-with-status-check", maxAttempts: 5, backoffMs: 5000 };
  }
  return { type: "no-retry", maxAttempts: 1, backoffMs: 0 };
}
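For the third category, the "check status before retrying" flow can be sketched on its own. This is a minimal illustration, not a definitive client: OperationClient, OperationStatus, and startOrResume are hypothetical names, and getStatus is assumed to wrap the GET /operations/{operationId}/status endpoint mentioned above.

```typescript
type OperationStatus<T> =
  | { state: "not_found" }
  | { state: "running" }
  | { state: "completed"; result: T };

// Hypothetical client interface. `start` kicks off the long-running
// operation; `getStatus` is assumed to call the status endpoint.
interface OperationClient<T> {
  start(operationId: string): Promise<void>;
  getStatus(operationId: string): Promise<OperationStatus<T>>;
}

async function startOrResume<T>(
  client: OperationClient<T>,
  operationId: string,
  maxAttempts = 3
): Promise<OperationStatus<T>> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Before (re)starting, ask the server what it already knows.
    const status = await client.getStatus(operationId);
    if (status.state !== "not_found") return status;

    try {
      await client.start(operationId);
    } catch {
      // Ambiguous failure: the operation may or may not have started.
      // Loop back to the status check instead of re-sending blindly.
    }
  }
  // Report whatever the server knows after the final attempt.
  return client.getStatus(operationId);
}
```

The caller gets back "running" or "completed" as soon as the server acknowledges the operation, and only re-sends the start request when the server has no record of it.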

How to Implement Deduplication on the Server Side

The idempotency key pattern only works if the server actually enforces it. For tools you own, add a deduplication middleware layer:

idempotency-middleware.ts·typescript
import { Redis } from "ioredis";
 
class IdempotencyMiddleware {
  constructor(
    private redis: Redis,
    private ttlSeconds: number
  ) {}
 
  async handle(
    key: string,
    operation: () => Promise<unknown>
  ): Promise<unknown> {
    const existing = await this.redis.get(`idempotency:${key}`);
    if (existing) {
      return JSON.parse(existing);
    }
 
    const lock = await this.redis.set(
      `idempotency-lock:${key}`,
      "processing",
      "EX",
      30,
      "NX"
    );
 
    if (!lock) {
      await new Promise((resolve) => setTimeout(resolve, 500));
      const result = await this.redis.get(`idempotency:${key}`);
      if (result) return JSON.parse(result);
      throw new Error("Concurrent request with same idempotency key");
    }
 
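    try {
      // Re-check after acquiring the lock: another request may have finished
      // and released it between our first cache read and the SET NX above.
      const cached = await this.redis.get(`idempotency:${key}`);
      if (cached) return JSON.parse(cached);
 
      const result = await operation();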
 
      await this.redis.setex(
        `idempotency:${key}`,
        this.ttlSeconds,
        JSON.stringify(result)
      );
 
      return result;
    } finally {
      await this.redis.del(`idempotency-lock:${key}`);
    }
  }
}
 
app.post("/tools/process_payment", async (req, res) => {
  const idempotencyKey = req.headers["idempotency-key"] as string;
  const { customerId, amount, currency } = req.body;
 
  if (!idempotencyKey) {
    return res.status(400).json({ error: "Idempotency-Key header required" });
  }
 
  const result = await idempotencyMiddleware.handle(
    idempotencyKey,
    () => paymentProvider.charge({ customerId, amount, currency })
  );
 
  res.json(result);
});

The lock closes a check-then-act race: two simultaneous retries both miss the cache, both execute, and you get two charges. The 30-second lock TTL has to exceed the operation's worst-case execution time; if the operation runs longer, the lock expires and a concurrent retry can execute it again.
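The same race can be illustrated without Redis. The sketch below is a single-process stand-in, assuming one Node.js process and no expiry; InMemoryIdempotencyStore is a hypothetical name. It is useful in unit tests, and does not replace a shared lock when several server instances handle requests.

```typescript
// Single-process sketch: one Map holds both in-flight and completed results.
class InMemoryIdempotencyStore {
  private results = new Map<string, Promise<unknown>>();

  handle<T>(key: string, operation: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing) return existing as Promise<T>;

    // The promise is stored synchronously, before any await, so a second
    // caller with the same key cannot slip in between the check and the write.
    const pending = operation().catch((error) => {
      // Failed operations are forgotten so a later retry can run again.
      this.results.delete(key);
      throw error;
    });
    this.results.set(key, pending);
    return pending;
  }
}
```

Two concurrent calls with the same key share one promise, so the operation body runs once and both callers receive the same result.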

For third-party tools you don't own, check their built-in support before building your own deduplication:

  • Stripe: Idempotency-Key header, 24-hour window, per-endpoint
  • Square: idempotency_key in the request body
  • Twilio: no built-in idempotency; build your own deduplication layer before calling their API

The Retry Wrapper That Picks the Right Strategy

Your agent's tool executor needs a retry wrapper that applies the right strategy per tool category:

retry-wrapper.ts·typescript
function isRetryable(error: unknown): boolean {
  if (!(error instanceof Error)) return false;
  const status = (error as { status?: number }).status;
  return (
    error.message.includes("ECONNRESET") ||
    error.message.includes("ETIMEDOUT") ||
    (status !== undefined && status >= 500)
  );
}
 
async function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
 
class IdempotentRetryWrapper {
  // Assumes execute() and checkOperationStatus() are defined elsewhere on this class.
  async call(
    toolName: string,
    args: Record<string, unknown>,
    context: AgentContext
  ): Promise<ToolResult> {
    const strategy = getRetryStrategy(toolName);
 
    if (strategy.type === "no-retry") {
      return await this.execute(toolName, args, context);
    }
 
    const idempotencyKey =
      strategy.type !== "simple"
        ? generateIdempotencyKey(
            context.conversationId,
            context.toolCallId,
            toolName,
            args
          )
        : undefined;
 
    let lastError: Error | undefined;
 
    for (let attempt = 1; attempt <= strategy.maxAttempts; attempt++) {
      try {
        if (
          strategy.type === "idempotent-with-status-check" &&
          attempt > 1
        ) {
          const status = await this.checkOperationStatus(
            context.toolCallId
          );
          if (status?.completed) return status.result;
        }
 
        return await this.execute(toolName, args, context, idempotencyKey);
      } catch (error) {
        lastError = error as Error;
 
        if (!isRetryable(error)) throw error;
        if (attempt === strategy.maxAttempts) throw error;
 
        const backoff = strategy.backoffMs * Math.pow(2, attempt - 1);
        await sleep(backoff);
      }
    }
 
    throw lastError!;
  }
}

The exponential backoff on retryable errors is important. Naive retry loops against a struggling API generate a thundering herd that makes the outage worse. Back off.
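One common refinement, not part of the wrapper above, is to add jitter so that many clients failing at the same moment don't all retry on the same schedule. A minimal sketch of the "full jitter" variant follows; backoffWithJitter and its default cap of 30 seconds are assumptions for illustration.

```typescript
// "Full jitter": pick a random delay between 0 and the exponential ceiling.
// `random` is injectable so the function can be tested deterministically.
function backoffWithJitter(
  baseMs: number,
  attempt: number, // 1-based attempt counter
  maxMs = 30_000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** (attempt - 1));
  return Math.floor(random() * ceiling);
}

// Third attempt with a 2000 ms base: ceiling is 8000 ms, midpoint is 4000 ms.
console.log(backoffWithJitter(2000, 3, 30_000, () => 0.5)); // 4000
```

Swapping this in for the fixed `strategy.backoffMs * Math.pow(2, attempt - 1)` delay keeps the same growth curve while spreading retries across the window.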

A Decision Tree for Every Tool Call

[Diagram: decision flow starting at "Agent calls tool". "Tool in registry?" leads to "Throw, no retry" on No and to "Tool category" on Yes. The category branches into "Naturally idempotent" (simple retry, no key), "Requires key" (generate idempotency key), and "Requires key + status" (generate idempotency key). The keyed paths continue through "Call tool with key" and "Success?", with outcomes "Return result", "Backoff, retry with same key" (network/5xx errors, gated by "Max attempts?", "Status check needed?", and "Original completed?"), and "Throw final error" (4xx errors or attempts exhausted).]
Tool call retry decision: how idempotency key and status check interact

How to Test Idempotency Before Production

This is the test most teams skip because it requires simulating partial failures. Don't skip it.

For every tool in the "requires key" category, you need three test scenarios:

Timeout after execution: The tool executes before the timeout fires, but the response is lost. The retry should return the cached result, not create a new operation.

Network error before execution: The request never reached the server. The retry should execute the operation for the first time and succeed.

Concurrent retry race: Two requests with the same idempotency key arrive simultaneously. Only one should execute; the other should get the cached result without a duplicate side effect.

A scenario-based testing layer can simulate transient failures on any tool call during a test run, letting you verify idempotency behavior without building a fault injection harness yourself:

idempotency-test.ts·typescript
const doubleChargeTest = {
  name: "Payment tool is safe to retry after timeout",
  setup: {
    faults: [
      {
        toolName: "process_payment",
        onAttempt: 1,
        fault: "timeout_after_execution",
      },
    ],
  },
  script: [{ role: "user", content: "I'd like to pay for my order now" }],
  assertions: [
    { type: "tool_called", toolName: "process_payment", times: 2 },
    { type: "payment_created", times: 1 },
    { type: "idempotency_key_consistent" },
  ],
};

The payment_created assertion with times: 1 is the one that matters. The tool was called twice; the customer was charged once.
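If you don't have a scenario layer available, the "timeout after execution" case can also be approximated by hand with an in-memory fake. The sketch below is an illustration under that assumption: FakePaymentServer, withTimeoutAfterExecution, and chargeWithRetry are hypothetical helpers, and everything is synchronous to keep the test short.

```typescript
// In-memory stand-in for a payment tool that honors idempotency keys.
class FakePaymentServer {
  charges: { key: string; amount: number }[] = [];
  private responses = new Map<string, { chargeId: string }>();

  charge(key: string, amount: number): { chargeId: string } {
    const cached = this.responses.get(key);
    if (cached) return cached; // replay: no new side effect
    this.charges.push({ key, amount });
    const response = { chargeId: `ch_${this.charges.length}` };
    this.responses.set(key, response);
    return response;
  }
}

// Fault injector: the first call reaches the server and executes, but the
// response is "lost" and the caller sees a timeout instead.
function withTimeoutAfterExecution(server: FakePaymentServer) {
  let calls = 0;
  return (key: string, amount: number): { chargeId: string } => {
    calls += 1;
    const response = server.charge(key, amount);
    if (calls === 1) throw new Error("ETIMEDOUT");
    return response;
  };
}

function chargeWithRetry(
  call: (key: string, amount: number) => { chargeId: string },
  key: string,
  amount: number,
  maxAttempts = 3
): { chargeId: string } {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return call(key, amount); // same key on every attempt
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

const fakeServer = new FakePaymentServer();
const chargeResult = chargeWithRetry(
  withTimeoutAfterExecution(fakeServer),
  "key-123",
  4200
);
console.log(fakeServer.charges.length, chargeResult.chargeId); // 1 ch_1
```

The check mirrors the scenario assertions: two calls reached the server, one charge was recorded, and the retry received the original charge ID.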

How to Monitor for Retry Storms in Production

With idempotency in place, you want to track when it's being exercised. A cache hit once a month means a retry worked correctly. A hundred cache hits in an hour means something upstream broke.

Instrument your deduplication layer to emit a structured log line every time it serves a cached response:

idempotency-logging.ts·typescript
type DeduplicationEvent = {
  toolName: string;
  idempotencyKey: string;
  agentId: string;
  conversationId: string;
  originalTimestamp: string;
};
 
function logDeduplication(event: DeduplicationEvent): void {
  console.log(
    JSON.stringify({
      event: "tool_call_deduplicated",
      toolName: event.toolName,
      agentId: event.agentId,
      conversationId: event.conversationId,
      retryDelayMs:
        Date.now() - new Date(event.originalTimestamp).getTime(),
    })
  );
}

Ship that line through whatever observability pipe you already use (Datadog, Honeycomb, your warehouse) and set an alert when tool_call_deduplicated exceeds your baseline rate for any single tool. A spike is a leading indicator of network instability, a slow downstream service, or a re-planning bug in the agent's context handling. All worth catching before they compound.

Surface the metric alongside your other agent health signals so retry storms don't hide behind clean-looking success rates.
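If your observability stack doesn't offer rate alerts out of the box, the threshold check itself is small. Here is a minimal in-process sketch; DeduplicationRateMonitor, its window, and its threshold are hypothetical, and a real deployment would more likely do this in the metrics backend.

```typescript
// Counts deduplication events per tool inside a sliding time window.
class DeduplicationRateMonitor {
  private events = new Map<string, number[]>(); // toolName -> timestamps (ms)
  private windowMs: number;
  private threshold: number;

  constructor(windowMs: number, threshold: number) {
    this.windowMs = windowMs;
    this.threshold = threshold;
  }

  // Records one cache hit and returns true when the tool's hit count
  // within the window exceeds the threshold.
  record(toolName: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.events.get(toolName) ?? []).filter((t) => t > cutoff);
    recent.push(nowMs);
    this.events.set(toolName, recent);
    return recent.length > this.threshold;
  }
}

// Alert when a single tool sees more than 100 deduplicated calls per hour.
const dedupMonitor = new DeduplicationRateMonitor(60 * 60 * 1000, 100);
if (dedupMonitor.record("process_payment")) {
  console.warn("Retry storm suspected for process_payment");
}
```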

Where Idempotency Fits in the Reliability Stack

Idempotency solves one specific failure mode: a call that succeeded but whose response was lost. It doesn't solve the case where a downstream service is completely unavailable. That's what circuit breakers are for. And it doesn't solve high-stakes actions that need a human's judgment before they fire. That's the interrupt pattern.

Each pattern solves a different failure mode. If you're adding circuit breakers, add idempotency keys at the same time. They're both about making your agent reliable under failure, and they work better together than either does alone.

If you're managing a tool registry at scale, document the idempotency category for each tool so agents can look up the retry strategy automatically rather than encoding it case by case. A tool management layer lets you annotate tools with retry behavior alongside their schemas, so every agent that calls a tool gets the right handling without repeating the logic.
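As a rough sketch of what such an annotation could look like, the registry below stores a retry policy next to each tool definition. The ToolDefinition shape, the RetryPolicy variants, and the statusPath field are assumptions for illustration, not any particular framework's schema.

```typescript
type RetryPolicy =
  | { kind: "simple"; maxAttempts: number }
  | { kind: "idempotent-key"; maxAttempts: number }
  | { kind: "idempotent-with-status-check"; maxAttempts: number; statusPath: string }
  | { kind: "no-retry" };

interface ToolDefinition {
  name: string;
  description: string;
  retry: RetryPolicy;
}

const toolRegistry = new Map<string, ToolDefinition>();

function registerTool(tool: ToolDefinition): void {
  toolRegistry.set(tool.name, tool);
}

// Unknown tools fall back to "no-retry", the safest default for a call
// whose side effects have not been classified.
function retryPolicyFor(toolName: string): RetryPolicy {
  return toolRegistry.get(toolName)?.retry ?? { kind: "no-retry" };
}

registerTool({
  name: "issue_refund",
  description: "Refund a settled payment",
  retry: { kind: "idempotent-key", maxAttempts: 3 },
});

console.log(retryPolicyFor("issue_refund").kind); // idempotent-key
console.log(retryPolicyFor("unregistered_tool").kind); // no-retry
```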

The goal isn't to make your agent cautious about retrying. It's to make retrying safe, so when a network blip hits at 11 PM on a Friday, your customers don't discover the bug before you do.
