
MCP Auth in Production: Scopes, Tokens, and Tenant Isolation

Most MCP servers ship with no auth. Here's how to add OAuth 2.0 scopes, per-tenant tool sets, and client isolation before your MCP server becomes load-bearing production infrastructure.

Dean Grover, Co-founder
May 4, 2026
15 min read
Diagram showing an MCP server with OAuth 2.0 token validation, per-tenant tool scoping, and multi-tenant isolation layers

Your second MCP server is always better than your first. The first one teaches you the protocol. The second one is where you realize you shipped the first one with no authentication, and now anyone who knows the URL can call your production database tools.

This isn't hypothetical. It's the pattern playing out as MCP moves from experiment to infrastructure. As of April 2026, 78% of enterprise AI teams have at least one MCP-backed agent in production. Most of those servers started as internal demos that gradually became load-bearing systems without anyone stopping to ask who should be allowed to call them.

This guide walks through adding production-grade auth to an MCP server: how MCP's OAuth 2.0 spec works, how to scope tools per tenant, how to keep each customer's data isolated even when you're running one server for hundreds of tenants, and how to test the auth layer before it matters.

If you're starting from scratch with MCP, the hands-on MCP tutorial is a good foundation before this one. If you're already running a server and want to harden it, start with the "Already in Production Without Auth?" section near the end.

Why Most MCP Servers Ship Without Auth

The quick-start path leads you there.

When you build your first MCP server, you're doing it locally with a single client: your own agent, running on localhost, connecting over stdio. Auth would add friction to a proof of concept you're not sure will work yet. So you skip it.

The problem is that MCP's entry path goes through stdio transport first, which has no network exposure at all. By the time you switch to Streamable HTTP to put the server on a real endpoint, the pattern of "just connect and start calling tools" is already baked in. Auth feels like a refactor problem for later.

The MCP specification added an OAuth authorization framework in its 2025-03-26 revision precisely because this was predictable. The spec defines how MCP servers should handle bearer tokens, how clients should acquire them, and how dynamic client registration works. What it doesn't do is turn auth on by default. You opt in and implement the middleware yourself.

This is the gap. Let's close it.

How MCP's OAuth 2.0 Spec Works

MCP's authorization spec applies standard OAuth 2.0 to the tool-call model: the agent acquires a token, the server validates it on every request, and scopes determine which tools are available.

Your AI agent is the OAuth client. Your MCP server is the resource server. An authorization server (your own, or a managed service like Auth0 or Clerk) issues access tokens after verifying identity. The agent sends a bearer token on every MCP request. The server validates the token and filters the tool list to only what the token's scopes allow.

Here's the full flow:

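  1. Agent → Auth Server: request token (client_id, requested_scopes)
  2. Auth Server → Agent: access token (JWT, signed)
  3. Agent → MCP Server: ListTools (Authorization: Bearer <token>)
  4. MCP Server: validate token signature + claims
  5. MCP Server → Agent: filtered tool list (scope-permitted only)
  6. Agent → MCP Server: CallTool: read_customer (Bearer <token>)
  7. MCP Server: re-validate token + check scope
  8. MCP Server → Data Store: query (WHERE tenant_id = claims.tenant_id)
  9. Data Store → MCP Server: result
  10. MCP Server → Agent: tool result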
MCP OAuth 2.0 flow from agent authorization to scoped tool execution

Two things stand out about this flow. First, token validation happens on every request, not just at connection time. This is intentional: tokens expire, and per-request validation means an expired token is rejected on the very next call rather than at the next reconnect. (Verifying a JWT's signature locally does not detect revocation on its own; for that you need short token lifetimes or a check against the authorization server.) Second, the tool list itself is scope-filtered. The agent only sees the tools it's authorized to call. This is the mechanism that makes multi-tenant isolation possible.
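The first steps of that flow, requesting a token and attaching it to an MCP request, might look like the following sketch. It assumes a client-credentials grant with a form-encoded body; the function names and the `agent-1` client ID are illustrative, and client authentication (for example a client secret) is deliberately left out.

```typescript
// Sketch only: builds the pieces of a client-credentials token request.
// Function names and example values are hypothetical, not part of any SDK.

interface TokenRequest {
  clientId: string
  scopes: string[]
}

// OAuth 2.0 sends scopes as one space-delimited string in a form-encoded body.
function buildTokenRequestBody(req: TokenRequest): string {
  const fields: Record<string, string> = {
    grant_type: 'client_credentials',
    client_id: req.clientId,
    scope: req.scopes.join(' '),
  }
  return Object.entries(fields)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join('&')
}

// Every MCP request carries the access token in the Authorization header.
function authHeaders(accessToken: string): Record<string, string> {
  return { Authorization: `Bearer ${accessToken}` }
}
```

The body would be POSTed to the authorization server's token endpoint with `Content-Type: application/x-www-form-urlencoded`, and the returned access token passed to `authHeaders` for each ListTools or CallTool request.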

Token Validation Middleware in TypeScript

Token validation is a single async function that runs before any tool call is processed. Here's a production-ready implementation using JWKS-based key rotation.

src/auth/validate-token.ts·typescript
import jwt from 'jsonwebtoken'
import { JwksClient } from 'jwks-rsa'
 
const jwksClient = new JwksClient({
  jwksUri: process.env.JWKS_URI!,
  cache: true,
  cacheMaxAge: 600_000, // 10 min cache to avoid JWKS endpoint overload
})
 
export interface TokenClaims {
  sub: string
  tenant_id: string
  scopes: string[]
  exp: number
}
 
export async function validateToken(bearerToken: string): Promise<TokenClaims> {
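  // Strip the scheme case-insensitively; the caller has already checked it is present
  const token = bearerToken.replace(/^Bearer\s+/i, '').trim()
 
  // Decode without verifying, only to read the key ID (kid) from the header
  const decoded = jwt.decode(token, { complete: true })
  if (!decoded || typeof decoded === 'string' || !decoded.header.kid) {
    throw new Error('Invalid token format')
  }
 
  // Look up the matching public key; the signature is checked in jwt.verify below
  const key = await jwksClient.getSigningKey(decoded.header.kid)
  const publicKey = key.getPublicKey()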
 
  const verified = jwt.verify(token, publicKey, {
    algorithms: ['RS256'],
    issuer: process.env.JWT_ISSUER,
    audience: process.env.JWT_AUDIENCE,
  }) as TokenClaims
 
  if (!verified.tenant_id || !Array.isArray(verified.scopes)) {
    throw new Error('Token missing required claims: tenant_id or scopes')
  }
 
  return verified
}

Fetching the public key from JWKS rather than hardcoding it means your auth server can rotate keys without any changes to the MCP server. The 10-minute cache prevents the JWKS endpoint from getting hit on every tool call while still picking up key rotations within a reasonable window.
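The validator above expects a `scopes` array claim. Many authorization servers instead emit the standard space-delimited `scope` string (the form RFC 9068 describes for JWT access tokens). A small normalizer, sketched here with a hypothetical `normalizeScopes` helper, could accept either shape before the claims check runs.

```typescript
// Hypothetical helper: accept either a `scopes` array or a space-delimited
// `scope` string and return a de-duplicated list of scope names.
interface RawClaims {
  scope?: unknown
  scopes?: unknown
}

function normalizeScopes(claims: RawClaims): string[] {
  let raw: string[] = []
  if (Array.isArray(claims.scopes)) {
    // Keep only string entries; anything else is ignored.
    raw = claims.scopes.filter((s): s is string => typeof s === 'string')
  } else if (typeof claims.scope === 'string') {
    raw = claims.scope.split(' ')
  }
  // Drop empty entries (for example from double spaces) and duplicates.
  return Array.from(new Set(raw.filter(s => s.length > 0)))
}
```

Whether to accept both shapes or insist on one is a design choice; accepting both makes it easier to swap authorization servers later.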

Wire this into your server's request handler:

src/server.ts·typescript
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js'
import express from 'express'
import { validateToken } from './auth/validate-token.js'
import { buildToolSetForToken } from './tools/scoped-tools.js'
 
const app = express()
app.use(express.json())
 
app.all('/mcp', async (req, res) => {
  const authHeader = req.headers.authorization
  if (!authHeader?.startsWith('Bearer ')) {
    res.status(401).json({ error: 'Missing authorization header' })
    return
  }
 
  let claims
  try {
    claims = await validateToken(authHeader)
  } catch {
    res.status(401).json({ error: 'Token validation failed' })
    return
  }
 
  // Fresh server instance per request -- this is what makes per-token scoping work
  const server = new McpServer({ name: 'cx-agent-mcp', version: '1.0.0' })
  await buildToolSetForToken(server, claims)
 
  const transport = new StreamableHTTPServerTransport({ sessionIdGenerator: undefined })
  await server.connect(transport)
  await transport.handleRequest(req, res, req.body)
})
 
app.listen(3001)

Notice you're creating a new McpServer instance per request, not per connection. This is deliberate. Each instance gets exactly the tools the token authorizes, and nothing else. It's slightly more overhead than a single shared instance, but it's what makes true per-token scoping possible without shared mutable state.
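The handler above returns a bare 401. RFC 6750 additionally asks resource servers to include a `WWW-Authenticate: Bearer` challenge, with an `invalid_token` error only when a token was actually presented and rejected. A possible helper, using the hypothetical name `bearerChallenge` and a placeholder realm:

```typescript
type AuthFailure = 'missing' | 'invalid'

// Builds a WWW-Authenticate header value in the RFC 6750 style.
// The default realm is a placeholder assumption.
function bearerChallenge(failure: AuthFailure, realm: string = 'mcp'): string {
  const parts = [`realm="${realm}"`]
  if (failure === 'invalid') {
    // Only included when a token was presented but failed validation.
    parts.push('error="invalid_token"')
  }
  return `Bearer ${parts.join(', ')}`
}
```

In the Express handler this could be applied with `res.set('WWW-Authenticate', bearerChallenge('invalid'))` just before sending the 401 response.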

Scoping Tools Per Tenant

Each tenant's token gets a filtered tool list. Tools outside the token's scopes don't appear in ListTools and can't be called.

Define a mapping from OAuth scopes to tool registration functions. When a request arrives, extract the token's scopes, find the matching tools, and register only those on the per-request server instance:

src/tools/scoped-tools.ts·typescript
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { TokenClaims } from '../auth/validate-token.js'
import { registerReadCustomerTool } from './customer/read.js'
import { registerUpdateCustomerTool } from './customer/update.js'
import { registerCreateTicketTool } from './tickets/create.js'
import { registerSearchKnowledgeBaseTool } from './knowledge/search.js'
 
const SCOPE_TO_TOOL: Record<string, (server: McpServer, tenantId: string) => Promise<void>> = {
  'customers:read': registerReadCustomerTool,
  'customers:write': registerUpdateCustomerTool,
  'tickets:write': registerCreateTicketTool,
  'knowledge:read': registerSearchKnowledgeBaseTool,
}
 
export async function buildToolSetForToken(
  server: McpServer,
  claims: TokenClaims
): Promise<void> {
  for (const scope of claims.scopes) {
    const register = SCOPE_TO_TOOL[scope]
    if (register) {
      await register(server, claims.tenant_id)
    }
  }
}
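Because the mapping is a plain lookup, its behavior is easy to check in isolation. The sketch below mirrors `SCOPE_TO_TOOL` with tool names in place of registration functions; `create_ticket` and `search_knowledge_base` are assumed names, since only `read_customer` and `update_customer` appear in this guide. Unknown scopes are ignored rather than treated as errors, matching the loop above.

```typescript
// A Map avoids accidental hits on inherited object keys such as 'constructor'.
const SCOPE_TO_TOOL_NAME = new Map<string, string>([
  ['customers:read', 'read_customer'],
  ['customers:write', 'update_customer'],
  ['tickets:write', 'create_ticket'], // assumed tool name
  ['knowledge:read', 'search_knowledge_base'], // assumed tool name
])

// Returns the tool names a token's scopes would expose, skipping unknown
// scopes and duplicates.
function toolsForScopes(scopes: string[]): string[] {
  const tools: string[] = []
  for (const scope of scopes) {
    const tool = SCOPE_TO_TOOL_NAME.get(scope)
    if (tool !== undefined && tools.indexOf(tool) === -1) {
      tools.push(tool)
    }
  }
  return tools
}
```

A check like this is useful as a unit test: it pins down exactly which tools each scope exposes, independent of the transport and the SDK.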

Each registration function takes the server instance and the tenant_id from the token claims. The tenant ID gets baked into the tool's data access logic via closure. Here's what that looks like in practice:

src/tools/customer/read.ts·typescript
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'
import { z } from 'zod'
import { db } from '../../db.js'
 
export async function registerReadCustomerTool(
  server: McpServer,
  tenantId: string  // captured in closure, never passed by the agent
): Promise<void> {
  server.tool(
    'read_customer',
    { customer_id: z.string().describe('The ID of the customer to look up') },
    async ({ customer_id }) => {
      const customer = await db.customers.findFirst({
        where: {
          id: customer_id,
          tenant_id: tenantId,  // always enforced, agent can't override
        },
      })
 
      if (!customer) {
        return { content: [{ type: 'text', text: 'Customer not found' }] }
      }
 
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify({ id: customer.id, name: customer.name, email: customer.email }),
          },
        ],
      }
    }
  )
}

The tenant ID never appears in the tool's input schema. The agent can't pass it, spoof it, or know it's there. It's captured at registration time and applied to every database query the tool makes. Even if an agent somehow constructed a cross-tenant customer ID, the query would return nothing because tenant_id is always in the where clause.

This is different from the pattern you'd use in a REST API, where you'd validate the tenant claim on every endpoint individually. With MCP, you establish tenant context once at registration and trust the closure to enforce it. Fewer places to forget, fewer places to make mistakes.
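The closure pattern can be seen without the SDK or a database. In this sketch an in-memory array stands in for `db.customers` (the rows and the `makeReadCustomer` name are made up for illustration); the tenant ID is fixed when the handler is created, and callers supply only a customer ID.

```typescript
interface Customer {
  id: string
  tenant_id: string
  name: string
}

// Stand-in for the real data store; rows are invented for the example.
const customers: Customer[] = [
  { id: 'cus_1', tenant_id: 'tenant-alpha', name: 'Ada' },
  { id: 'cus_2', tenant_id: 'tenant-beta', name: 'Grace' },
]

// Returns a lookup bound to one tenant. The tenant ID lives in the closure
// and is not part of the returned function's input.
function makeReadCustomer(tenantId: string): (customerId: string) => Customer | undefined {
  return (customerId) =>
    customers.find(c => c.id === customerId && c.tenant_id === tenantId)
}

const readForAlpha = makeReadCustomer('tenant-alpha')
```

Calling `readForAlpha` with a customer ID that belongs to `tenant-beta` finds nothing, because the tenant filter is applied regardless of what the caller passes.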


Handling Token Expiry in Long Agentic Sessions

Two patterns handle mid-session token expiry: short-lived sessions where clients get a fresh token before starting, and mid-session refresh where the client monitors expiry and renews the token before it lapses. Most production systems need both, depending on session length.

Agents don't always finish in seconds. A complex customer support workflow might run for 10 minutes, making dozens of tool calls. If the access token expires mid-session, tool calls start returning 401s with no clear way for the agent to tell the user re-authentication is needed.

The first pattern is short-lived sessions with fresh tokens. Issue tokens with a short TTL (5 to 10 minutes) and expect the client to get a fresh token before starting any session that might run that long. Simple to implement, but puts the burden on the client.

The second pattern is token refresh mid-session. The MCP client monitors the token's exp claim and uses its refresh token to obtain a new access token before the current one expires. If the refresh succeeds, subsequent tool calls use the new token. If it fails because the user's session expired, the agent should escalate to a human rather than fail silently.
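The mid-session refresh pattern comes down to comparing the token's `exp` claim (seconds since the epoch) against the clock with some safety margin. A minimal sketch follows; the 60-second margin, the `AccessToken` shape, and the injected `refresh` callback are assumptions rather than part of any MCP SDK.

```typescript
interface AccessToken {
  value: string
  exp: number // expiry in seconds since the Unix epoch, as in the JWT exp claim
}

// True when the token is expired or will expire within marginSeconds.
function needsRefresh(token: AccessToken, nowMs: number, marginSeconds: number = 60): boolean {
  return token.exp * 1000 - nowMs <= marginSeconds * 1000
}

// Returns a usable token, calling the injected refresh function only when
// the current one is close to expiry. A rejected refresh propagates to the
// caller, which is where the agent would escalate to a human.
async function getUsableToken(
  current: AccessToken,
  refresh: () => Promise<AccessToken>,
  nowMs: number = Date.now()
): Promise<AccessToken> {
  return needsRefresh(current, nowMs) ? refresh() : current
}
```

Calling `getUsableToken` before each tool call keeps the check in one place, and the margin absorbs clock skew and request latency.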

The fastMCP production mistakes guide covers how to handle mid-session failures gracefully in practice, including checkpointing agent state so you can resume after a re-auth flow without losing progress.

For agents using Chanl's tools infrastructure, token refresh is handled at the runtime level. The SDK detects 401 responses from MCP tools and triggers a credential refresh before retrying, so your agent code doesn't need to handle it explicitly.

MCP Server Cards and Automatic Discovery

MCP Server Cards are machine-readable metadata documents that describe a server's capabilities, supported scopes, and how to authenticate. They're coming to the broader tooling landscape in mid-2026 and will let clients discover and register with new MCP servers without reading any documentation.

For auth specifically, a Server Card tells potential clients which OAuth server to use, which scopes exist, and whether dynamic client registration is supported. Here's what one looks like for a customer service MCP server:

/.well-known/mcp-server-card.json·json
{
  "name": "cx-agent-mcp",
  "version": "1.0.0",
  "description": "Customer experience tools for AI agents",
  "authorization": {
    "type": "oauth2",
    "authorization_server": "https://auth.yourplatform.com",
    "supported_flows": ["authorization_code", "client_credentials"],
    "dynamic_client_registration": true,
    "scopes": {
      "customers:read": "Read customer profiles and interaction history",
      "customers:write": "Update customer records and preferences",
      "tickets:write": "Create and update support tickets",
      "knowledge:read": "Search the knowledge base and documentation"
    }
  },
  "transport": {
    "type": "streamable-http",
    "url": "https://mcp.yourplatform.com/mcp"
  }
}

You can publish a Server Card today even if the client tooling landscape doesn't auto-consume them yet. They're also just good documentation: the auth server URL, scope definitions, and transport configuration in one place, machine-readable and human-readable simultaneously.
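A client could use the card to check, before requesting a token, that the scopes it wants are ones the server advertises. The sketch below assumes the card shape shown above (this guide's example, not a finalized schema) and a hypothetical `unsupportedScopes` helper.

```typescript
// Subset of the example card that matters for scope checks.
interface ServerCard {
  name: string
  authorization: {
    authorization_server: string
    scopes: Record<string, string> // scope name -> human-readable description
  }
}

// Returns the requested scopes that the card does not advertise.
function unsupportedScopes(card: ServerCard, requested: string[]): string[] {
  const advertised = Object.keys(card.authorization.scopes)
  return requested.filter(scope => advertised.indexOf(scope) === -1)
}

const exampleCard: ServerCard = {
  name: 'cx-agent-mcp',
  authorization: {
    authorization_server: 'https://auth.yourplatform.com',
    scopes: {
      'customers:read': 'Read customer profiles and interaction history',
      'tickets:write': 'Create and update support tickets',
    },
  },
}
```

An empty result means every requested scope is advertised; anything else is a signal to fail early instead of discovering the problem as a missing tool at runtime.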

Testing Your Auth Layer Before It Counts

Write scenario tests covering five token configurations before you ship: valid token with correct scopes, valid token missing a specific scope, expired token, token from the wrong tenant, and no token at all. Run each against your MCP server and assert on both the tool list returned and the agent's behavior when a tool is unavailable.

Auth bugs are expensive because they don't fail loudly. A token with the wrong scopes might silently restrict what an agent can do, causing invisible task failures rather than obvious error boundaries.

The five cases to cover:

  1. Valid token with correct scopes: agent gets the expected tool list and completes the task
  2. Valid token missing a scope: agent gets a reduced tool list, can't call unauthorized tools, communicates that clearly
  3. Expired token: server returns 401, agent escalates gracefully rather than retrying forever
  4. Token with wrong tenant ID: tools return no results, no data leaks across the tenant boundary
  5. No token at all: immediate 401, no tool list returned, no partial execution
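One way to keep the five cases honest is to write them as a table and run every row through the same check. The sketch below uses a stub `authorize` function in place of a real server (an assumption for illustration; real tests would send signed JWTs to the endpoint), with expiry and scopes modeled as plain fields.

```typescript
interface TestToken {
  tenant_id: string
  scopes: string[]
  exp: number // seconds since epoch
}

type Outcome = 'ok' | 'unauthorized' | 'tool_hidden' | 'no_results'

// Stub decision logic: 401 for missing or expired tokens, a hidden tool when
// the scope is absent, and no results when the record belongs to another tenant.
function authorize(
  token: TestToken | null,
  requiredScope: string,
  recordTenant: string,
  nowSeconds: number
): Outcome {
  if (token === null || token.exp <= nowSeconds) return 'unauthorized'
  if (token.scopes.indexOf(requiredScope) === -1) return 'tool_hidden'
  if (token.tenant_id !== recordTenant) return 'no_results'
  return 'ok'
}

const testNow = 1000
const validToken: TestToken = { tenant_id: 'tenant-alpha', scopes: ['customers:read'], exp: testNow + 300 }

// One row per case from the list above: [label, token, expected outcome].
const authCases: Array<[string, TestToken | null, Outcome]> = [
  ['valid token, correct scope', validToken, 'ok'],
  ['valid token, missing scope', { ...validToken, scopes: ['knowledge:read'] }, 'tool_hidden'],
  ['expired token', { ...validToken, exp: testNow - 1 }, 'unauthorized'],
  ['wrong tenant', { ...validToken, tenant_id: 'tenant-beta' }, 'no_results'],
  ['no token', null, 'unauthorized'],
]

// Labels of any rows whose actual outcome differs from the expected one.
const failedCases = authCases
  .filter(([, token, expected]) => authorize(token, 'customers:read', 'tenant-alpha', testNow) !== expected)
  .map(([label]) => label)
```

Keeping the expectations in one table makes it obvious when a new failure mode has no row yet.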

Chanl's scenario runner makes this pattern practical. You define agent scenarios that run against your MCP server with specific JWT configurations, verify that the tool list matches expectations, and confirm the agent handles each failure mode correctly:

tests/mcp-auth.scenario.ts·typescript
import Chanl from '@chanl/sdk'
import { generateTestToken } from './test-utils.js'
 
const chanl = new Chanl({ apiKey: process.env.CHANL_API_KEY! })
 
const result = await chanl.scenarios.run({
  agent: {
    model: 'claude-sonnet-4-6',
    systemPrompt: 'You are a customer service agent. Use your tools to help customers.',
    mcpServers: [
      {
        url: 'http://localhost:3001/mcp',
        auth: {
          type: 'bearer',
          // Token with read-only scopes -- no customers:write
          token: generateTestToken({
            tenant_id: 'tenant-alpha',
            scopes: ['customers:read', 'knowledge:read'],
          }),
        },
      },
    ],
  },
  conversation: [
    { role: 'user', content: "Update customer cus_123's email to new@email.com" },
  ],
})
 
// Agent should acknowledge scope limitation, not silently fail
const acknowledged = result.messages.some(
  m => m.content.includes('unable to update') || m.content.includes("don't have permission")
)
// console.assert only logs in Node, so throw to make the run actually fail
if (!acknowledged) {
  throw new Error('Agent should communicate scope limitation')
}
 
// update_customer tool should never appear in tool calls
const calledWriteTool = result.toolCalls.some(tc => tc.tool === 'update_customer')
if (calledWriteTool) {
  throw new Error('update_customer should not be callable with read-only token')
}

There's a meaningful difference between "auth is configured" and "auth works the way you expect." Tests close that gap before a real customer's data is at stake.

For a broader picture of the MCP security surface beyond auth, the MCP security and attack surface guide covers prompt injection through tool descriptions, tool shadowing attacks, and how to defend against them in production.

Already in Production Without Auth?

If you're running a production MCP server right now without auth, the fix is incremental. You don't need to take it offline.

Start by adding token validation as a passthrough. Validate whatever token a request presents and log the claims, but don't reject requests yet, even when the token is missing or fails validation. This gives you a week of data on who's calling what before you start enforcing restrictions. You'll almost certainly discover clients you didn't know about.

Then add scope logging. Track which tools are called and which scopes the request token carries. After a week, you'll know which scopes matter most and which tools are highest priority to protect.

Then harden incrementally: enforce scopes on write operations first, then reads. Write operations carry more blast radius (they change data), so they're higher priority even though both eventually need protection.
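The rollout stages can be expressed as a single enforcement mode that the middleware consults. This is a sketch under assumed names (`EnforcementMode`, `shouldReject`) and an assumed convention that write scopes end in `:write`, as the scopes in this guide do.

```typescript
type EnforcementMode = 'log-only' | 'enforce-writes' | 'enforce-all'

interface AccessCheck {
  requiredScope: string // scope the requested tool needs
  tokenScopes: string[] // scopes on the presented token (empty if none)
}

// Decides whether to reject a call at the current rollout stage.
// In every mode the caller is still expected to log the check.
function shouldReject(mode: EnforcementMode, check: AccessCheck): boolean {
  const authorized = check.tokenScopes.indexOf(check.requiredScope) !== -1
  if (authorized || mode === 'log-only') return false
  if (mode === 'enforce-writes') {
    // Assumes the ':write' suffix convention used by the scopes in this guide.
    return check.requiredScope.slice(-6) === ':write'
  }
  return true
}
```

Moving from one stage to the next then becomes a configuration change rather than a code change, which also makes it easy to roll back if enforcement breaks a client you didn't know about.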

One thing to know: once you add per-tenant token validation, you'll have a much richer picture of which tools are called by which tenants and how often. Chanl's MCP integration surfaces these patterns in analytics, which means you can start spotting unusual tool call patterns, like a tenant suddenly calling tools outside their normal usage pattern, as potential signals worth investigating.

Putting It Together

MCP doesn't give you auth for free, but the spec describes exactly how it should work. The implementation is four components: a JWKS-based token validator, a per-request server instantiation pattern, a scope-to-tool mapping, and tenant ID isolation via closure.

Build these in before your first real tenant, not after. Auth retrofitted onto an existing multi-tenant system is harder and riskier than auth designed in from the start. The patterns here add maybe two hours of work to a new MCP server. They add two weeks of careful migration work to an existing one.

The good news is that once you've got this working, you've also got the foundation for fine-grained access control, per-tenant tool customization, and the kind of audit logging that enterprise customers will eventually ask for. Auth is rarely just about security. It's the infrastructure that makes enterprise deployment possible.


Frequently Asked Questions