What is the AWS MCP Server?

The AWS MCP Server is a managed MCP server that gives AI agents secure, auditable access to AWS services through the Model Context Protocol. Instead of wrapping individual AWS SDK calls, agents use three standard MCP tools to call any AWS API, retrieve documentation, and run sandboxed scripts against AWS resources.

How does the call_aws tool work?

The call_aws tool takes a service name, operation name, and parameters, then executes the corresponding AWS API call using the agent runtime's IAM credentials. A single tool definition covers all 15,000+ AWS API operations, so agents never need per-service tool wrappers. The server handles authentication, request signing, and response formatting.

What are Agent Skills in the AWS MCP Server?

Agent Skills are curated, versioned guidance documents that agents can discover and load on demand through the MCP protocol. They replace static Agent SOPs (standard operating procedures) that had to live in the system prompt. Because skills load only when needed, they keep the context window small while still giving agents tested procedures for complex multi-step operations.

What is sandboxed script execution in the AWS MCP Server?

Sandboxed execution lets agents run Python code against AWS services without access to the host filesystem or shell tools. The agent writes a script, the MCP server runs it in an isolated environment with its IAM credentials, and returns the output. This is useful for multi-step operations like pulling data from DynamoDB, transforming it, and writing results to S3, all in one agent turn.

What are the three AWS Agent Toolkit plugins?

AWS Core targets application developers and includes infrastructure-as-code, Lambda, and application services. AWS Data Analytics targets data analysts and business intelligence engineers. AWS Agents targets AI engineers building agents with Amazon Bedrock AgentCore. Each plugin bundles a set of relevant Agent Skills pre-loaded for its audience.

How do you test an agent that uses the AWS MCP Server?

The biggest testing challenge is that AWS API calls carry real infrastructure side effects. The recommended pattern is to run tests against isolated AWS accounts with read-only IAM policies, verify tool selection accuracy on representative prompts before routing to write-capable credentials, and use scenario testing against frozen prompt sets so description changes or skill updates don't silently shift routing behavior.

Does the AWS MCP Server cost extra?

No. The AWS MCP Server itself is available at no additional charge. You pay only for the AWS resources your agent actually uses, the same as if a human had made those API calls.

How does prompt caching interact with Agent Skills?

Because Agent Skills load dynamically into the context rather than living in the static system prompt, they don't pollute the cached prefix. The cacheable content stays stable, and skills are appended after the cache boundary only when the agent actually needs them. This is a meaningful architectural improvement over SOPs, which required the full procedure library to be present on every call.

AWS Just Gave Your Agent 15,000 Cloud Tools

Your agent needs to pull a customer's order history from DynamoDB, check whether their subscription is active in your billing table, and then create a support ticket in another service. Three AWS services. In the old world, you wrote three separate tool wrappers, gave each one its own IAM policy, kept them up to date as the SDK changed, and hoped the agent's function-calling accuracy held up across all three.

On May 6, 2026, AWS made that problem optional. The AWS MCP Server is now generally available, and it changes the shape of the tool problem for any agent that talks to AWS.

What the AWS MCP Server Actually Ships

The AWS MCP Server is a managed MCP server that exposes three tools to any agent connected via the Model Context Protocol. Those three tools together cover the entire AWS surface.

The first tool is call_aws. It takes a service name, operation name, and parameters, then executes any of the 15,000+ AWS API operations using the runtime's IAM credentials. Not some curated subset. All of them. The same IAM principal that runs your Lambda functions can now delegate its permissions to an agent without you writing a single wrapper.
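
To make "one tool definition, many services" concrete, here is a minimal sketch of the argument shape. The three fields (service, operation, parameters) are the ones used in the call_aws examples in this article; the type names and the buildCallAws helper are hypothetical and not part of any AWS SDK.

```typescript
// Argument shape as used by the call_aws examples in this article.
// The interface and helper names below are illustrative only.
interface CallAwsArguments {
  service: string;
  operation: string;
  parameters: Record<string, unknown>;
}

interface ToolCall {
  name: "call_aws";
  arguments: CallAwsArguments;
}

// Build a call_aws tool call, rejecting obviously malformed input early.
function buildCallAws(
  service: string,
  operation: string,
  parameters: Record<string, unknown> = {},
): ToolCall {
  if (service.trim() === "" || operation.trim() === "") {
    throw new Error("service and operation are required");
  }
  return { name: "call_aws", arguments: { service, operation, parameters } };
}

// Two unrelated services, one tool: only the arguments change.
const getOrder = buildCallAws("dynamodb", "GetItem", {
  TableName: "Orders",
  Key: { order_id: { S: "o-123" } },
});
const listBuckets = buildCallAws("s3", "ListBuckets");

console.log(getOrder.arguments.service, listBuckets.arguments.operation);
```

The model still has to pick the right service and operation, but the tool schema it sees stays the same size no matter how many services the agent touches.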

The second and third tools are search_documentation and read_documentation. At query time, the agent can ask what operations a service supports, retrieve the current parameter schema, and read best-practice guidance, all without that documentation occupying space in the static system prompt. The agent fetches what it needs when it needs it.

That last point matters more than it sounds. Documentation embedded in a system prompt is expensive to maintain, invalidates the prompt cache every time it changes, and can be out of date by the time the agent relies on it mid-conversation. Retrieving documentation on demand keeps the context window compact and the documentation current.

aws-mcp-client.ts·typescript

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "uvx",
  args: ["awslabs.aws-mcp-server@latest"],
  env: {
    // env values must be strings, so fall back to the default profile when AWS_PROFILE is unset
    AWS_PROFILE: process.env.AWS_PROFILE ?? "default",
    AWS_REGION: process.env.AWS_REGION ?? "us-east-1",
    FASTMCP_LOG_LEVEL: "ERROR",
  },
});

const client = new Client({ name: "my-agent", version: "1.0.0" });
await client.connect(transport);

// List what tools the server exposes
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));
// => ["call_aws", "search_documentation", "read_documentation"]

The server uses your existing AWS credentials. No new secrets to rotate, no separate auth layer. Permissions are bounded by whatever IAM role the agent runtime already has. If you want the deeper background on how function calling works before diving into MCP-based tool calls, the function calling guide covers the mechanics from first principles.

Agent Skills Replace the System Prompt SOP Library

Agent Skills are versioned guidance documents that your agent discovers and loads on demand through the MCP protocol, keeping them out of the static system prompt entirely. Instead of loading a 10,000-token procedure library into every request, your agent fetches the one skill it needs for the current task and ignores the rest.

Before this release, teams that needed agents to follow operational procedures had two options. They could put the procedures in the system prompt, which meant loading the entire library into every request whether it was relevant or not. Or they could fine-tune the model, which meant procedures were baked into weights and had no audit trail.

Agent Skills are a third option: procedures stay out of every request, yet they remain versioned and reviewable. The MCP server exposes a list of available Skills, and the agent pulls the one it needs when it encounters a matching task.
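
The "pull the one it needs" step can be pictured with a small sketch. In practice the model decides which Skill matches the task; the keyword scoring below is only a stand-in for that decision. The SkillManifest shape and the second URI are hypothetical.

```typescript
// Hypothetical shape of a listed skill; real manifests may carry other fields.
interface SkillManifest {
  uri: string;
  name: string;
  description: string;
}

// Count how many meaningful task words appear in the skill's name or description.
function scoreSkill(task: string, skill: SkillManifest): number {
  const haystack = `${skill.name} ${skill.description}`.toLowerCase();
  const words = task
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word.length > 3);
  return words.filter((word) => haystack.includes(word)).length;
}

// Return the best-matching skill, or undefined when nothing matches at all.
function pickSkill(task: string, skills: SkillManifest[]): SkillManifest | undefined {
  let best: SkillManifest | undefined;
  let bestScore = 0;
  for (const skill of skills) {
    const score = scoreSkill(task, skill);
    if (score > bestScore) {
      best = skill;
      bestScore = score;
    }
  }
  return best;
}

const availableSkills: SkillManifest[] = [
  {
    uri: "awsskill://lambda/deployment-checklist",
    name: "lambda-deployment",
    description: "Checklist for deploying Lambda functions",
  },
  {
    uri: "awsskill://s3/bucket-hardening", // hypothetical second skill
    name: "s3-bucket-hardening",
    description: "Steps for locking down S3 buckets",
  },
];

const chosen = pickSkill("Deploy the new Lambda function to staging", availableSkills);
console.log(chosen?.uri);
```

Only the chosen Skill's text enters the context. The other entries cost nothing beyond their one-line listing.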

use-agent-skill.ts·typescript

// The agent discovers available skills
const { resources } = await client.listResources();
// resources includes skill manifests like "aws-lambda-deployment-v2"
 
// When the agent identifies a relevant task, it loads the skill
const { contents } = await client.readResource({
  uri: "awsskill://lambda/deployment-checklist",
});
// contents[0].text contains the tested deployment procedure
 
// The agent now has step-by-step guidance without it living in the system prompt

The practical effect is a much cleaner context window. Instead of a 20,000-token system prompt that includes every possible operational procedure, you have a short system prompt plus the specific skill the agent loaded for this task. That's a meaningful cost and latency improvement, and it means updating a procedure doesn't require re-caching the entire system prompt.
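
The caching point can be checked with a toy comparison. This is a sketch under assumptions: the request shape is generic rather than any specific model API, the base prompt is made up, and appending the loaded skill to the user turn is just one plausible way a runtime might inject it.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface ModelRequest {
  system: string; // the stable part a prompt cache can reuse across calls
  messages: Message[];
}

// Hypothetical short system prompt.
const BASE_PROMPT = "You are a support agent. Use call_aws for AWS lookups.";

// SOP style: the whole procedure library is baked into the system prompt.
function buildSopRequest(library: string[], userTurn: string): ModelRequest {
  return {
    system: `${BASE_PROMPT}\n\n${library.join("\n\n")}`,
    messages: [{ role: "user", content: userTurn }],
  };
}

// Skill style: the system prompt never changes; a loaded skill is appended
// to the conversation after it.
function buildSkillRequest(userTurn: string, loadedSkill?: string): ModelRequest {
  const content =
    loadedSkill === undefined ? userTurn : `${userTurn}\n\nLoaded skill:\n${loadedSkill}`;
  return { system: BASE_PROMPT, messages: [{ role: "user", content }] };
}

const procedureV1 = "Refund procedure v1";
const procedureV2 = "Refund procedure v2";

// Does editing one procedure change the cacheable system prompt?
const sopPrefixChanged =
  buildSopRequest([procedureV1], "Refund order 42").system !==
  buildSopRequest([procedureV2], "Refund order 42").system;
const skillPrefixChanged =
  buildSkillRequest("Refund order 42", procedureV1).system !==
  buildSkillRequest("Refund order 42", procedureV2).system;

console.log({ sopPrefixChanged, skillPrefixChanged });
// sopPrefixChanged is true, skillPrefixChanged is false
```

With the SOP layout, every procedure edit rewrites the prefix and forces a cache miss. With the skill layout, the prefix is byte-for-byte identical across edits.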

The Skills library that shipped with GA covers infrastructure-as-code, storage, analytics, serverless, containers, and AI services. AWS has announced that additional Skills for databases, networking, and IAM are coming.

Figure: How Agent Skills load at runtime vs. static system prompt SOPs

Sandboxed Script Execution for Multi-Step Operations

The third new capability is sandboxed Python execution. When an agent needs to run a multi-step operation (say, query DynamoDB, filter the results in Python, and write the filtered output to S3), it can write a script and ask the MCP server to run it.

The script runs in an isolated environment with the agent's IAM credentials and no access to the host filesystem or shell tools. The agent gets back the script's stdout and any errors. This is not general compute. It's scoped execution for the kind of data transformation that lives in the gap between "call an API" and "run an arbitrary Lambda."

sandboxed-etl.py·python

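# The agent generates this script and sends it to the MCP server for execution
import json

import boto3

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
s3 = boto3.client("s3", region_name="us-east-1")

table = dynamodb.Table("CustomerOrders")
scan_kwargs = {
    "FilterExpression": "order_status = :s AND created_at > :d",
    "ExpressionAttributeValues": {":s": "pending", ":d": "2026-05-01"},
}

# A single scan call reads at most 1 MB of data, so keep following
# LastEvaluatedKey until the whole table has been covered.
items = []
while True:
    response = table.scan(**scan_kwargs)
    items.extend(response["Items"])
    if "LastEvaluatedKey" not in response:
        break
    scan_kwargs["ExclusiveStartKey"] = response["LastEvaluatedKey"]

# DynamoDB numbers arrive as Decimal, which json.dumps cannot serialize,
# so the total is converted to a string.
pending_orders = [
    {"id": item["order_id"], "customer": item["customer_id"], "total": str(item["total"])}
    for item in items
]

s3.put_object(
    Bucket="ops-reports",
    Key="pending-orders/2026-05-10.json",
    Body=json.dumps(pending_orders),
    ContentType="application/json",
)

print(f"Exported {len(pending_orders)} pending orders")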

The agent didn't write a Lambda function. It didn't need IAM permissions to create or invoke one. It wrote a script, the MCP server ran it, and the data landed in S3. For agents that orchestrate operational tasks in real-time CX workflows, this cuts the number of prebuilt tools you have to maintain to almost zero.

Which Toolkit Plugin Should You Install?

Install the one plugin that matches your agent's job. AWS shipped the Agent Toolkit alongside the MCP Server with three opinionated plugins, and the Skills bundled inside each are only discoverable when that plugin is loaded. Picking the wrong one bloats the available tool surface without giving the agent the procedures it actually needs.

Plugin	Audience	Skills bundled
AWS Core	Application developers, DevOps	Infrastructure-as-code, Lambda deployment, API Gateway, application service integration
AWS Data Analytics	Data analysts, BI engineers	Athena queries, Glue jobs, Redshift, QuickSight
AWS Agents	AI engineers on Bedrock AgentCore	Bedrock model invocation, knowledge base management, agent deployment, evaluation

For a CX support agent built on Bedrock, the AWS Agents plugin is the obvious starting point. If your agent operates infrastructure on behalf of a developer, pick AWS Core. AWS Data Analytics is the narrowest of the three, but worth picking if your agent's job really is to write Athena queries on demand.

How Your CX Agent Actually Uses This

For a CX support agent, the AWS MCP Server replaces a tangle of purpose-built wrappers with one tool call and an IAM role. The agent resolves the customer, queries whatever AWS services hold the relevant data, and continues the conversation, all without you maintaining per-service glue code.

Here's the full tool chain for an order inquiry agent:

cx-order-inquiry-agent.ts·typescript

import { Client as McpClient } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
 
// 1. Spin up the AWS MCP Server with a read-only profile
const transport = new StdioClientTransport({
  command: "uvx",
  args: ["awslabs.aws-mcp-server@latest"],
  env: { AWS_PROFILE: "cx-readonly", AWS_REGION: "us-east-1" },
});
const mcp = new McpClient({ name: "cx-agent", version: "1.0.0" });
await mcp.connect(transport);
 
// 2. The agent resolves the customer and looks up the order
async function lookupOrder(customerId: string, orderId: string) {
  // Agent calls call_aws to query DynamoDB -- no wrapper needed
  const orderResult = await mcp.callTool({
    name: "call_aws",
    arguments: {
      service: "dynamodb",
      operation: "GetItem",
      parameters: {
        TableName: "Orders",
        Key: { order_id: { S: orderId }, customer_id: { S: customerId } },
      },
    },
  });
 
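  // Tool results are a list of content blocks; parse the first text block
  const blocks = orderResult.content as Array<{ type: string; text?: string }>;
  const textBlock = blocks.find((block) => block.type === "text");
  if (!textBlock?.text) {
    throw new Error("call_aws returned no text content");
  }
  return JSON.parse(textBlock.text);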
}

The agent doesn't need a lookupOrder function in its tool list. It doesn't need a curated DynamoDB wrapper. It calls call_aws with the parameters, gets the data, and continues the conversation.

The read-only IAM profile (cx-readonly) is doing real work here. The agent can query any table in the account but can't write. When you wire this into scenario testing, you can verify tool selection accuracy across the full range of order states (shipped, delayed, cancelled) without ever touching production data.
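
As a rough illustration of what sits behind a profile like cx-readonly, here is a sketch of a read-only policy document. The Version/Statement layout and the DynamoDB action names are standard IAM; the account ID and region in the ARN are placeholders, and the isActionAllowed helper is a deliberately simplified local check, not how IAM evaluates policies.

```typescript
// Read-only DynamoDB access across all tables in one account.
// 123456789012 is a placeholder account ID.
const cxReadOnlyPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:Scan"],
      Resource: "arn:aws:dynamodb:us-east-1:123456789012:table/*",
    },
  ],
};

// Simplified check: is the action listed in an Allow statement?
// Real IAM evaluation also handles wildcards, explicit Deny, and conditions.
function isActionAllowed(action: string): boolean {
  return cxReadOnlyPolicy.Statement.some(
    (statement) => statement.Effect === "Allow" && statement.Action.includes(action),
  );
}

console.log(isActionAllowed("dynamodb:GetItem")); // true
console.log(isActionAllowed("dynamodb:PutItem")); // false
```

Whatever the agent asks call_aws to do, the write fails at IAM rather than at your prompt. The boundary holds even when the model picks the wrong operation.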

Testing Agents That Call Real Cloud Infrastructure

The safest pattern is to run a test MCP server that returns fixture data for every call_aws operation, point your agent at it during CI, and only switch to real AWS credentials for integration runs. Because all AWS calls flow through the single call_aws tool, you intercept at the MCP layer rather than patching SDK internals.

Most tests either mock AWS calls (missing real schema drift) or run against live AWS (risking side effects and cost). The AWS MCP Server changes the shape of that tradeoff because a single interception point covers the whole service surface.

aws-mcp-test-double.ts·typescript

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";
 
// A test MCP server that returns fixture data instead of calling real AWS
const testServer = new McpServer({ name: "aws-test-double", version: "1.0.0" });
 
testServer.tool(
  "call_aws",
  {
    service: z.string(),
    operation: z.string(),
    parameters: z.record(z.unknown()),
  },
  async ({ service, operation, parameters }) => {
    // Route to fixture data based on service + operation
    const fixture = testFixtures[`${service}.${operation}`];
    if (!fixture) {
      return { content: [{ type: "text", text: JSON.stringify({ error: "no fixture" }) }] };
    }
    return { content: [{ type: "text", text: JSON.stringify(fixture(parameters)) }] };
  },
);
 
const testFixtures: Record<string, (params: Record<string, unknown>) => unknown> = {
  "dynamodb.GetItem": (params) => ({
    Item: {
      order_id: { S: (params.Key as { order_id: { S: string } }).order_id.S },
      status: { S: "shipped" },
      tracking_number: { S: "1Z999AA10123456784" },
    },
  }),
};

Point your agent at the test server during CI. Point it at the real server in production. The agent code and the scenario definitions don't change. You're testing real tool selection behavior against real conversation flows, just with controlled data underneath.
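
One way to make that switch explicit is a small function that returns the server launch parameters per environment. This is a sketch: the Environment names and the path to a compiled test double are hypothetical, and it assumes the test double has been wired to a stdio transport so it can be launched as a subprocess.

```typescript
interface ServerParams {
  command: string;
  args: string[];
  env: Record<string, string>;
}

type Environment = "ci" | "production";

// Pick which MCP server the agent launches. Agent code stays identical;
// only these launch parameters differ between environments.
function serverParamsFor(environment: Environment): ServerParams {
  if (environment === "ci") {
    // Hypothetical path to a compiled build of the fixture server.
    // No AWS credentials are passed, so CI cannot reach real infrastructure.
    return { command: "node", args: ["./dist/aws-mcp-test-double.js"], env: {} };
  }
  return {
    command: "uvx",
    args: ["awslabs.aws-mcp-server@latest"],
    env: { AWS_PROFILE: "cx-readonly", AWS_REGION: "us-east-1" },
  };
}

console.log(serverParamsFor("ci").command, serverParamsFor("production").command);
```

The returned object has the same fields that StdioClientTransport takes in the earlier examples, so it can be passed straight to the transport constructor.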

You define the tool set once in your tools registry, connect it to either MCP server, and the scenario runner measures whether the agent selected the right tool for the right prompt, not just whether the call succeeded. Pair it with production monitoring and you'll catch the moment a new Agent Skill changes routing behavior before the error rate does.
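
Measuring tool selection rather than call success can be as simple as comparing the tool the agent picked against the tool each frozen prompt expects. The sketch below assumes a hypothetical Scenario and Observation shape and made-up prompts; a real scenario runner would record the observations from actual agent runs.

```typescript
interface Scenario {
  prompt: string;
  expectedTool: string;
}

interface Observation {
  prompt: string;
  selectedTool: string | null; // null when the agent called no tool
}

// Fraction of scenarios where the agent selected the expected tool.
function toolSelectionAccuracy(scenarios: Scenario[], observations: Observation[]): number {
  if (scenarios.length === 0) {
    return 0;
  }
  const selectedByPrompt = new Map<string, string | null>();
  for (const observation of observations) {
    selectedByPrompt.set(observation.prompt, observation.selectedTool);
  }
  const correct = scenarios.filter(
    (scenario) => selectedByPrompt.get(scenario.prompt) === scenario.expectedTool,
  ).length;
  return correct / scenarios.length;
}

// A frozen prompt set: it only changes through deliberate review.
const frozenScenarios: Scenario[] = [
  { prompt: "Where is order 1001?", expectedTool: "call_aws" },
  { prompt: "Is order 1002 delayed?", expectedTool: "call_aws" },
  { prompt: "Which operations does SQS support?", expectedTool: "search_documentation" },
  { prompt: "Show the GetItem parameter schema", expectedTool: "read_documentation" },
];

const observedRun: Observation[] = [
  { prompt: "Where is order 1001?", selectedTool: "call_aws" },
  { prompt: "Is order 1002 delayed?", selectedTool: "call_aws" },
  { prompt: "Which operations does SQS support?", selectedTool: "call_aws" },
  { prompt: "Show the GetItem parameter schema", selectedTool: "read_documentation" },
];

console.log(toolSelectionAccuracy(frozenScenarios, observedRun)); // 0.75
```

Tracking this number per PR is what turns a silent routing shift into a failing check.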

What to Do This Week

Three things, in order of impact.

First, audit your current tool list. If you have more than five AWS-related tool wrappers, they're candidates for replacement with call_aws. You'll reduce tool count, simplify the function-calling schema, and get more accurate routing because the model sees fewer similar-looking tools.

Second, switch to Agent Skills for any operational procedures currently living in your system prompt. If your system prompt contains more than a paragraph of "when the customer says X, do Y" logic, that's a Skill. Pull it out, version it, and let the agent load it on demand. Your prompt cache hit rate goes up; your context window goes down.

Third, set up a test double for your AWS MCP Server in CI. The pattern above is short enough to wire into an existing test suite. Run it on every PR so description changes and skill updates don't silently shift routing behavior in production. We covered why that drift is so dangerous in the MCP tool description drift article. Chanl's MCP integration lets you register both the real and test servers and switch between them per environment.

The 15,000 APIs haven't changed. What changed is how much scaffolding you need to give your agent access to them, which is to say, almost none. Build the agent. Connect it to AWS. Monitor what it actually does. That's the loop this release makes tractable.

Test your AWS-integrated agent before it goes live

Chanl scenarios run representative customer flows against your agent's full tool set, including AWS MCP calls, so you catch routing regressions before they reach production.

Dean Grover

Co-founder

Building the platform for AI agents at Chanl — tools, testing, and observability for customer experience.
