On a Wednesday morning last October, a security engineer at a mid-sized SaaS company got a call from their payment processor. An automated system had flagged unusual API activity: 11,400 calls to a refund endpoint in a 47-minute window. The calls had gone through. The refunds had processed. Nobody on the engineering team knew it was happening until the vendor called.
Tracing the root cause took most of the day. An AI agent had entered a retry loop after a network timeout. Each retry called the refund endpoint again. There was no rate limiting. There was no audit trail at the application level, just raw API logs on the vendor's side, hours after the damage was done. The total refund amount was recoverable, but the investigation cost was not.
The team had been running MCP to connect their agents to tools. MCP worked exactly as designed. The problem was everything around it.
What MCP Actually Gives You
MCP standardizes how agents call tools and access resources. If you haven't read why MCP exists, the short version is this: before MCP, every tool integration required bespoke adapter code. MCP gives you a common transport and schema so agents can discover and call tools without you writing a custom integration for each one.
What MCP specifies is real and valuable: the transport layer (Streamable HTTP, which superseded the original HTTP with Server-Sent Events transport, or stdio for local use), the message schema (tools, resources, and prompts as typed capabilities), the discovery mechanism (servers advertise what they can do), and OAuth 2.1 as the standard for client-to-server access over HTTP (introduced in the March 2025 spec revision and refined in June 2025).
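To make the message schema concrete: MCP frames its messages as JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. The sketch below only builds the request envelopes. The tool name `crm.getAccount` and its `account_id` argument are illustrative assumptions, not taken from any real server.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # each request in a session gets its own id

def rpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope, the framing MCP uses."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# Discovery: ask the server which tools it exposes.
list_tools = rpc_request("tools/list")

# Invocation: call one tool by name with its arguments.
# "crm.getAccount" and "account_id" are hypothetical.
call_tool = rpc_request(
    "tools/call",
    {"name": "crm.getAccount", "arguments": {"account_id": "acct_123"}},
)

print(json.dumps(call_tool, indent=2))
```

Nothing in these envelopes says who the caller is allowed to be or how often it may call, which is the gap the rest of this article is about.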
What MCP does not specify is equally important to understand:
- Which agents are allowed to call which tools (authorization)
- A log of every tool call with its parameters and outcomes (audit trail)
- How many calls per minute any agent can make (rate limiting)
- Centralized visibility into tool usage across all agents (observability)
- How to propagate a human user's identity through an agent's tool call (user-scoped access)
The spec is deliberately narrow. It solves the interoperability problem. The governance layer is out of scope by design. That's not a criticism. It's the right architectural choice for a protocol standard. But it means you have to build the governance layer yourself, or use a gateway that does it for you.
The Five Production Gaps
When teams deploy MCP to production without a gateway layer, five gaps tend to surface, usually in this order:
No tool-level authorization. The agent can call any tool on any server it has access to. In production, you typically want the billing agent to read CRM records but not issue refunds. You want the onboarding agent to schedule meetings but not update payment methods. Server-level access control is too coarse. You need per-tool, per-agent policy.
No audit trail. Tool calls happen and are forgotten. If an agent makes a destructive call (deletes a record, processes a payment, sends an email) you may not find out until a customer complains or a vendor flags it. You can't do incident investigation without a log of what happened, when, with what parameters.
No rate limiting. A retry loop, a hallucinated tool call, a prompt injection. Any of these can turn into thousands of unintended calls in minutes if there's no ceiling. The spec doesn't define rate limiting. Your MCP server might not either.
No user identity propagation. Many MCP servers use a shared service account for agent calls. That works until you need to know which end user's action triggered which tool call, or until a multi-tenant system needs tool calls scoped to a specific user's permissions rather than a system-wide service account.
No centralized observability. When you have five agents calling three MCP servers each, understanding which tools are actually used, which are never called, and which return errors requires correlating logs from multiple places. A gateway sees every call, so it can answer those questions from one place.
The Gateway Pattern
An MCP gateway sits between your agents and your MCP servers, acting as the control plane for every tool invocation. Agents call the gateway. The gateway authenticates the call, evaluates policy, logs the event, applies rate limits, and forwards the request to the appropriate MCP server with the right credentials injected.
The agents don't hold credentials for individual MCP servers. They authenticate to the gateway. The gateway holds the server credentials and injects them per call based on which agent is calling and what it's allowed to do.
This inversion, agents authenticating to a single gateway rather than directly to each server, is what makes centralized policy and audit possible.
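The flow described above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not how any particular gateway product works: the `Gateway` class, its field names, and the `forward` and `rate_ok` callables are all hypothetical, and a real gateway would speak MCP on both sides rather than take plain function calls.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

class Denied(Exception):
    """Raised when the gateway refuses a call before it reaches a server."""

@dataclass
class Gateway:
    agent_tokens: dict        # gateway token -> agent id
    allowed_tools: dict       # agent id -> set of tool names it may call
    server_credentials: dict  # tool name -> credential, never handed to agents
    forward: Callable[[str, dict, str], Any]  # sends the call to the MCP server
    rate_ok: Callable[[str, str], bool]       # (agent, tool) -> within limits?
    audit_log: list = field(default_factory=list)

    def handle(self, token: str, tool: str, params: dict) -> Any:
        agent = self.agent_tokens.get(token)                    # 1. authenticate
        if agent is None:
            self._deny("unknown", tool, params, "unauthenticated")
        if tool not in self.allowed_tools.get(agent, set()):    # 2. evaluate policy
            self._deny(agent, tool, params, "policy")
        if not self.rate_ok(agent, tool):                       # 3. apply rate limits
            self._deny(agent, tool, params, "rate_limit")
        self.audit_log.append(                                  # 4. log the event
            {"agent": agent, "tool": tool, "params": params, "decision": "allowed"}
        )
        # 5. forward with the server credential injected by the gateway
        return self.forward(tool, params, self.server_credentials[tool])

    def _deny(self, agent: str, tool: str, params: dict, reason: str) -> None:
        self.audit_log.append(
            {"agent": agent, "tool": tool, "params": params, "decision": f"blocked:{reason}"}
        )
        raise Denied(reason)

# Usage with a stand-in for the real MCP server.
calls_seen = []

def fake_forward(tool: str, params: dict, credential: str) -> dict:
    calls_seen.append((tool, credential))
    return {"ok": True}

gw = Gateway(
    agent_tokens={"tok-billing": "billing-agent"},
    allowed_tools={"billing-agent": {"crm.getAccount"}},
    server_credentials={"crm.getAccount": "crm-secret", "payments.issueRefund": "pay-secret"},
    forward=fake_forward,
    rate_ok=lambda agent, tool: True,  # placeholder: no limits in this sketch
)

result = gw.handle("tok-billing", "crm.getAccount", {"id": "acct_123"})
try:
    gw.handle("tok-billing", "payments.issueRefund", {"amount": 10})
    blocked_reason = None
except Denied as exc:
    blocked_reason = str(exc)
```

The agent only ever presents `tok-billing`. The CRM credential lives in the gateway, and the blocked refund call is logged without ever reaching a server.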
The gateway doesn't need to be custom software you build. Several platforms offer hosted MCP gateways: MintMCP, Composio, and others. The key is understanding what the gateway needs to do so you can evaluate whether what you're using actually does it.
Identity: Per-User OAuth and Service Accounts
Production agents need two identity modes, and which one you use depends on the type of tool call.
Service accounts work for background operations: agents running scheduled tasks, agents processing queue items, agents that act on behalf of the system rather than a specific user. A service account has fixed permissions defined by its role. Tool calls made under a service account are attributed to the agent, not to any end user.
Per-user OAuth works when a human user triggers an agent interaction and the tool call should carry that user's identity. A customer calls your support line. The agent needs to look up their account. The tool call should be authorized as that customer, not as a system service account, because the customer's account may have specific permissions, and because the audit trail should show that the customer's own data was accessed during their session.
MCP's OAuth 2.1 implementation handles this with a standard authorization code flow: the user authorizes the agent, the gateway exchanges the code for a short-lived access token, and every downstream tool call carries that token instead of a system credential. The agent itself never sees or stores the user's secrets. If the user's permissions change mid-session, the gateway picks up the change on the next token refresh and the next tool call reflects the new scope.
The practical effect: a customer who calls your support line and gets routed to an agent has every CRM read or ticket write logged as their action, scoped to their permissions, with a token the gateway can revoke the moment the session ends.
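One way to picture the two identity modes is as a single decision the gateway makes per call. The sketch below assumes the OAuth exchange has already happened elsewhere and only shows which credential gets attached downstream. `UserToken`, `Identity`, and `resolve_identity` are hypothetical names for illustration, not part of MCP or any OAuth library.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class UserToken:
    subject: str        # end user the token was issued for
    access_token: str   # short-lived token obtained via the OAuth flow
    expires_at: float   # Unix timestamp

@dataclass(frozen=True)
class Identity:
    mode: str           # "user" or "service"
    principal: str      # who the audit trail attributes the call to
    credential: str     # what the gateway attaches to the downstream call

def resolve_identity(
    agent_id: str,
    service_credential: str,
    user_token: Optional[UserToken],
    user_triggered: bool,
    now: Optional[float] = None,
) -> Identity:
    """Pick the identity for one tool call.

    User-triggered calls must carry a valid user token; they never fall
    back silently to the service account.
    """
    now = time.time() if now is None else now
    if user_triggered:
        if user_token is None or user_token.expires_at <= now:
            raise PermissionError("user-triggered call needs a valid user token")
        return Identity("user", user_token.subject, user_token.access_token)
    return Identity("service", agent_id, service_credential)

# Usage: the same agent, once acting for a customer and once in the background.
token = UserToken("cust_42", "user-access-token", expires_at=200.0)
as_user = resolve_identity("support-agent", "svc-secret", token, True, now=100.0)
as_service = resolve_identity("support-agent", "svc-secret", None, False, now=100.0)
```

Refusing to fall back is a design choice in this sketch: a silent fallback to the service account would recreate the shared-credential problem the per-user flow exists to avoid.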
Tool-Level Policy
Server-level access control isn't enough. When multiple agents share the same MCP server (and that's common, because you don't want to maintain separate CRM servers for every agent type) you need to restrict access at the tool level.
RBAC at the tool level lets you express policies like:
- Billing agent: crm.getAccount (read), payments.getHistory (read), payments.issueRefund (deny)
- Onboarding agent: crm.getAccount (read), crm.updateAccount (write), calendar.schedule (write)
- Support agent: crm.getAccount (read), tickets.create (write), tickets.escalate (write), payments.* (deny)
When the billing agent tries to call payments.issueRefund, the gateway blocks it and logs the attempt. The call never reaches the MCP server. The policy itself lives as configuration, not as code in the agent, which is the whole point. You can change what the billing agent is allowed to touch without redeploying the agent.
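As a rough sketch of policy as configuration, the three policies above can be written as plain data plus a small evaluation function. The `POLICY` layout and the `is_allowed` helper are assumptions made for illustration. The sketch also collapses read and write into a single "allow" list, treats anything unlisted as denied, and lets deny rules win over allow rules.

```python
from fnmatch import fnmatchcase

# Policy as data: this could just as well be loaded from a YAML or JSON file.
POLICY = {
    "billing-agent": {
        "allow": ["crm.getAccount", "payments.getHistory"],
        "deny": ["payments.issueRefund"],
    },
    "onboarding-agent": {
        "allow": ["crm.getAccount", "crm.updateAccount", "calendar.schedule"],
        "deny": [],
    },
    "support-agent": {
        "allow": ["crm.getAccount", "tickets.create", "tickets.escalate"],
        "deny": ["payments.*"],  # wildcard: every tool in the payments namespace
    },
}

def is_allowed(agent: str, tool: str, policy: dict = POLICY) -> bool:
    """Default deny; an explicit deny always beats an allow."""
    rules = policy.get(agent)
    if rules is None:
        return False  # unknown agents get nothing
    if any(fnmatchcase(tool, pattern) for pattern in rules["deny"]):
        return False
    return any(fnmatchcase(tool, pattern) for pattern in rules["allow"])

print(is_allowed("billing-agent", "payments.issueRefund"))
print(is_allowed("onboarding-agent", "calendar.schedule"))
```

Changing what an agent may touch is then an edit to `POLICY`, with no change to the agent itself.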
Policy should also include rate limits per agent per tool. A support agent calling crm.getAccount once per customer interaction is expected. Calling it 200 times per minute is a signal that something is wrong, and the gateway is the only place positioned to notice and stop it before a vendor calls you about an anomaly.
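A per-agent, per-tool ceiling can be as simple as a sliding-window counter keyed on the (agent, tool) pair. The sketch below is one possible approach, not a description of any specific gateway. The class name and the limit of three calls per minute are arbitrary choices for the example.

```python
import time
from collections import defaultdict, deque
from typing import Optional

class SlidingWindowLimiter:
    """Allow at most max_calls per (agent, tool) within a rolling window."""

    def __init__(self, max_calls: int, window_seconds: float) -> None:
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = defaultdict(deque)  # (agent, tool) -> timestamps of allowed calls

    def allow(self, agent: str, tool: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        timestamps = self._calls[(agent, tool)]
        # Drop calls that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_calls:
            return False  # over the ceiling: block, and ideally alert
        timestamps.append(now)
        return True

# Usage: three calls per minute for one agent on one tool.
limiter = SlidingWindowLimiter(max_calls=3, window_seconds=60)
decisions = [limiter.allow("support-agent", "crm.getAccount", now=t) for t in (0, 1, 2, 3)]
after_window = limiter.allow("support-agent", "crm.getAccount", now=61)
other_tool = limiter.allow("support-agent", "tickets.create", now=3)
```

In the opening incident, a limiter like this on the refund tool would have cut the loop off after a handful of calls instead of 11,400.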
Audit Trails for Ops Teams
A good audit trail does more than satisfy compliance requirements. It's one of the most useful operational signals you have for understanding how agents actually behave in production.
Every entry in the audit log should include:
- Agent ID and session ID
- Tool name and MCP server
- Full parameters passed to the tool (this is critical; a payment call with amount: 10 is very different from one with amount: 10000)
- Timestamp and latency
- Response status (success, error, timeout)
- User identity if a per-user OAuth flow was used
- Whether the call was allowed or blocked by policy
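The fields above map naturally onto one structured record per tool call. Here is a minimal sketch, assuming JSON lines as the storage format. The `AuditEntry` name, the field names, and the sample values are all illustrative.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass(frozen=True)
class AuditEntry:
    agent_id: str
    session_id: str
    tool: str
    server: str
    params: dict              # full parameters, not a summary
    timestamp: str            # ISO 8601, UTC
    latency_ms: float
    status: str               # "success", "error", or "timeout"
    decision: str             # "allowed" or "blocked"
    user_id: Optional[str] = None  # set only when a per-user OAuth flow was used

    def to_json_line(self) -> str:
        """Serialize to one JSON object per line for append-only storage."""
        return json.dumps(asdict(self), sort_keys=True)

# Sample values below are made up for the example.
entry = AuditEntry(
    agent_id="billing-agent",
    session_id="sess_001",
    tool="payments.getHistory",
    server="payments-mcp",
    params={"account_id": "acct_123", "amount": 10000},
    timestamp=datetime(2025, 1, 7, 10, 30, tzinfo=timezone.utc).isoformat(),
    latency_ms=182.5,
    status="success",
    decision="allowed",
    user_id="cust_42",
)
line = entry.to_json_line()
print(line)
```

If parameters can contain secrets or personal data, a real deployment would need a redaction step before writing, which this sketch leaves out.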
With that data, you can answer questions that would otherwise require hours of log archaeology:
"Which agent called the payment API last Tuesday between 10 and 11 AM?" Answered in seconds.
"How many times did the support agent call crm.updateAccount this month?" A dashboard away.
"Did any agent try to call a tool they're not authorized for?" A query on the policy-blocked events.
A good monitoring layer surfaces audit events alongside conversation transcripts so you can correlate a tool call with the specific customer interaction that triggered it. When a customer reports that the agent gave them the wrong information, you can replay the tool calls and see exactly what data the agent was working from.
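The three example questions in this section reduce to simple filters once the log is structured. The sketch below runs them over an in-memory list. The sample entries and helper names are made up for illustration, and a production system would run the equivalent queries against a log store.

```python
from datetime import datetime

# Made-up sample entries, trimmed to the fields these queries need.
AUDIT_LOG = [
    {"agent_id": "billing-agent", "tool": "payments.getHistory",
     "timestamp": "2025-01-07T10:15:00+00:00", "decision": "allowed"},
    {"agent_id": "support-agent", "tool": "crm.updateAccount",
     "timestamp": "2025-01-07T11:30:00+00:00", "decision": "allowed"},
    {"agent_id": "support-agent", "tool": "payments.issueRefund",
     "timestamp": "2025-01-08T09:00:00+00:00", "decision": "blocked"},
]

def agents_calling(log: list, tool_prefix: str, start: datetime, end: datetime) -> list:
    """Which agents called tools under a prefix within [start, end)?"""
    return sorted({
        e["agent_id"] for e in log
        if e["tool"].startswith(tool_prefix)
        and start <= datetime.fromisoformat(e["timestamp"]) < end
    })

def count_calls(log: list, agent_id: str, tool: str) -> int:
    """How many times did one agent call one tool?"""
    return sum(1 for e in log if e["agent_id"] == agent_id and e["tool"] == tool)

def blocked_attempts(log: list) -> list:
    """Every call that policy stopped before it reached a server."""
    return [e for e in log if e["decision"] == "blocked"]

window_start = datetime.fromisoformat("2025-01-07T10:00:00+00:00")
window_end = datetime.fromisoformat("2025-01-07T11:00:00+00:00")

payment_callers = agents_calling(AUDIT_LOG, "payments.", window_start, window_end)
update_count = count_calls(AUDIT_LOG, "support-agent", "crm.updateAccount")
blocked_tools = [e["tool"] for e in blocked_attempts(AUDIT_LOG)]
```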
The Connection to MCP Security
An MCP gateway addresses operational governance: who's calling what, with what authority, how often. It's distinct from MCP security, which addresses attack vectors: tool poisoning, prompt injection through tool descriptions, rug-pull attacks on auto-updated servers.
You need both layers. They protect against different things.
The gateway catches operational failures: runaway retry loops, misconfigured agents calling tools they shouldn't, shared service accounts used where user-scoped identity is required. The security layer catches adversarial attacks: malicious tool descriptions, compromised servers, injected instructions.
If you're thinking through FastMCP server deployment, add "no gateway layer" to the mistake list alongside the seven already covered there. It's a production architecture gap, not a development-time mistake, which is why it tends to show up later and cost more when it does.
Connecting It to Chanl's MCP Feature
Chanl's MCP runtime provides the gateway layer as part of the platform rather than as a separate infrastructure component to maintain. Authentication, audit logging, tool-level policy, and rate limiting are configured through the dashboard and applied to every agent that connects through the platform.
For teams already using VAPI, Retell, or a custom setup, the Chanl MCP gateway integrates as a middleware layer between your orchestration platform and your tool servers. Your agent configuration doesn't change; the gateway simply sits between the two.
The practical benefit: when an agent calls a tool in production, the call is logged with full context, policy is applied, and the event is available in analytics alongside the conversation that triggered it. You don't have to piece together what happened from separate logs at separate vendors.
Don't Wait for the Incident
The story at the start of this article had a recoverable outcome. The vendor flagged the issue, the refunds were eventually reversed, the system was fixed. The investigation cost was absorbed.
Most teams that experience a runaway tool call event don't get that lucky on the first one. And the teams that skip the gateway layer because it feels like over-engineering for their current scale are usually the teams that discover they needed it exactly when they can least afford the incident.
Raw MCP is excellent for prototyping. It's the right starting point for any tool integration. But the gateway layer (audit, auth, RBAC, rate limiting) isn't a feature you add later when things are stable. It's the infrastructure that makes production stable in the first place.
Build the gateway in when you build the tool integration. The marginal effort is low. The asymmetry with incident cost is not.