
How to Build AI Tool-Use Agents with MCP and A2A in 2026

MCP crossed 97M monthly SDK downloads and A2A joined the Linux Foundation. Here is how to combine these protocols to build agents that discover tools, invoke APIs, and collaborate.

Nate Laquis

Founder & CEO

The Tool-Use Revolution in AI

The biggest shift in AI across 2025 and 2026 was not bigger models. It was tool use. An LLM that can only generate text is a chatbot. An LLM that can call APIs, query databases, read files, and execute code is an agent. The difference is the gap between a conversation partner and a capable assistant that gets work done.

Two protocols made standardized tool use possible. MCP (Model Context Protocol) defines how agents discover and invoke tools. A2A (Agent2Agent Protocol) defines how agents delegate tasks to other agents. Together, they enable a new architecture: networks of specialized agents that use tools, share results, and collaborate on complex tasks.

This is not theoretical. Anthropic's Claude can use MCP tools natively. AWS Bedrock agents support MCP tool discovery. Google's ADK implements both MCP and A2A. The infrastructure is production-ready. The question is no longer "can we build agentic AI?" but "how do we build it well?"

[Image: Global network visualization showing AI agents communicating through MCP and A2A protocols]

MCP: How Agents Discover and Use Tools

MCP is the USB-C of AI tools. Before MCP, every AI application built custom integrations with every tool. MCP provides a standard interface: any MCP-compatible tool works with any MCP-compatible agent.

Architecture

An MCP server exposes tools, resources, and prompts. An MCP client (your agent) connects to one or more MCP servers and discovers available capabilities. When the agent decides to use a tool, it sends a tool call request to the MCP server and receives the result. The agent then uses the result to continue its reasoning.

Building an MCP Server

Use the official MCP SDK (@modelcontextprotocol/sdk for TypeScript, mcp for Python). Define your tools with names, descriptions, and input schemas (JSON Schema). Implement the handler function for each tool. Start the server over stdio (for local tools) or HTTP with SSE (for remote tools). Example tools: query_database, search_documents, send_email, create_ticket, get_weather.
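
As a minimal sketch, here is what that looks like in Python with the official SDK's FastMCP helper. The tool, its name, and the hardcoded return value are illustrative placeholders, not a real integration:

```python
# Minimal MCP server sketch using the official Python SDK (pip install mcp).
# The tool and its hardcoded result are placeholders for a real lookup.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("order-tools")

@mcp.tool()
def get_order_status(order_id: str) -> str:
    """Look up the current fulfillment status of an order by its ID."""
    # In production this would query your database or order API.
    return f"Order {order_id}: shipped, estimated delivery in 2 days"

if __name__ == "__main__":
    # stdio transport for local tools; the SDK also supports HTTP transports.
    mcp.run(transport="stdio")
```

FastMCP derives the tool's name, description, and input schema from the function signature and docstring, so the agent sees a fully described tool without hand-written JSON Schema.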

Tool Design Best Practices

  • Write clear, specific tool descriptions. The LLM decides which tool to use based on the description, so vague descriptions lead to wrong tool selection.
  • Keep input schemas simple.
  • Prefer specific tools (search_customers, search_products) over generic tools (search_anything).
  • Return structured data, not raw dumps.
  • Include error messages that help the agent self-correct.
  • Limit tool output size to prevent context window overflow.
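
To make the description guidance concrete, here is a hedged example of a single tool definition in the JSON Schema style that tool-use APIs expect (field names follow Anthropic's Messages API; the tool itself is hypothetical):

```python
# A specific, well-described tool definition (Anthropic Messages API format).
# Contrast with a vague "search_anything" tool: the description says exactly
# when to use it, and the schema keeps inputs simple.
search_customers_tool = {
    "name": "search_customers",
    "description": (
        "Search CRM customer records by name or email. Returns up to 10 "
        "matches with id, name, email, and account tier. Use for customer "
        "lookups only; use search_products for product queries."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Customer name or email fragment to match",
            },
            "limit": {
                "type": "integer",
                "description": "Maximum results to return (default 10)",
            },
        },
        "required": ["query"],
    },
}
```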

Real-World MCP Servers

  • Database MCP server: exposes read_query and write_query tools for PostgreSQL.
  • GitHub MCP server: exposes create_issue, list_prs, and merge_pr tools.
  • Slack MCP server: exposes send_message, search_messages, and create_channel tools.
  • CRM MCP server: exposes search_contacts, update_deal, and log_activity tools.

Each server encapsulates one integration and can be shared across multiple agents.

A2A: Agent-to-Agent Delegation

A2A enables agents to delegate tasks to specialized sub-agents. Instead of one monolithic agent that does everything, you build a team of focused agents that collaborate.

When A2A Matters

A single agent with 50 tools becomes confused. Tool selection accuracy drops when the agent has too many options. A2A solves this by letting an orchestrator agent delegate to specialists: a research agent with search tools, a coding agent with code execution tools, a communication agent with email and Slack tools. Each specialist has 5 to 10 tools and uses them expertly.

A2A Agent Cards

Every A2A agent publishes an "agent card" that describes its capabilities, supported input formats, and expected output formats. The orchestrator reads agent cards to decide which agent to delegate to. This is analogous to MCP's tool descriptions, but for agents instead of tools.
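
For illustration, here is what an agent card for the research specialist described later might contain. Field names follow the A2A agent card schema; the URL and skill are hypothetical, and A2A agents conventionally serve this document as JSON at /.well-known/agent.json:

```python
# Illustrative A2A agent card for a research specialist (hypothetical URL
# and skill). Served as JSON so orchestrators can discover capabilities.
research_agent_card = {
    "name": "research-agent",
    "description": "Answers account-research questions using CRM and web search tools.",
    "url": "https://agents.example.com/research",
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["application/json"],
    "skills": [
        {
            "id": "account-research",
            "name": "Account research",
            "description": "Look up account details, recent interactions, and company news.",
        }
    ],
}
```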

Task Lifecycle

The orchestrator sends a task to a sub-agent. The sub-agent acknowledges the task and begins processing. During processing, the sub-agent can send progress updates (streaming partial results). When complete, the sub-agent sends the final result. The orchestrator incorporates the result into its own reasoning and continues. If the sub-agent fails, the orchestrator can retry, delegate to a different agent, or escalate to the user.
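
A sketch of the orchestrator side of that lifecycle, assuming sub-agents reachable over plain HTTP. The /tasks endpoint, payload shape, and agent URLs are hypothetical stand-ins, not the exact A2A wire format:

```python
# Orchestrator-side delegation sketch with retry and fallback. The /tasks
# endpoint, payload shape, and agent URLs are hypothetical placeholders.
import httpx

SPECIALISTS = {
    "research": "https://agents.example.com/research",
    "writer": "https://agents.example.com/writer",
}

def delegate(task: str, agent: str, fallback: str | None = None) -> dict:
    """Send a task to a sub-agent; retry once, then fall back or escalate."""
    for attempt in range(2):
        try:
            resp = httpx.post(
                f"{SPECIALISTS[agent]}/tasks",
                json={"task": task},
                timeout=60.0,  # cap sub-agent delegations at 60 seconds
            )
            resp.raise_for_status()
            return resp.json()  # structured result for the orchestrator
        except httpx.HTTPError:
            continue  # transient failure: retry once
    if fallback is not None:
        return delegate(task, fallback)  # delegate to a different agent
    raise RuntimeError(f"Delegation to {agent} failed; escalate to the user")
```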

Implementation

Use the A2A SDK or implement the protocol over HTTP. Define your agent's capabilities in an agent card (JSON). Implement the task handler that processes incoming tasks. Return structured results that the orchestrator can parse and use. For agentic workflow patterns, see our dedicated guide.

Building a Multi-Agent System

Here is how to architect a practical multi-agent system using MCP and A2A together.

Example: AI Sales Assistant

  • Orchestrator agent: receives user requests, plans the workflow, delegates tasks, and synthesizes results.
  • Research agent (A2A sub-agent): uses MCP tools for CRM queries, web search, and company database lookups.
  • Writer agent (A2A sub-agent): uses MCP tools for document generation, template filling, and email drafting.
  • Analytics agent (A2A sub-agent): uses MCP tools for pipeline queries, revenue calculations, and trend analysis.

Workflow Example

User asks: "Prepare a follow-up for the Acme Corp meeting." The orchestrator then:

  1. Delegates to the research agent: "Get Acme Corp's account details and recent interactions."
  2. The research agent uses the CRM MCP tool to query account data and returns structured results.
  3. The orchestrator delegates to the writer agent: "Draft a follow-up email referencing these meeting notes and action items."
  4. The writer agent generates the email and returns a draft.
  5. The orchestrator presents the draft to the user for review via AG-UI.

Total time: 15 to 30 seconds for work that would take a human 20 minutes.

[Image: Developer building a multi-agent AI system with MCP tool connections and A2A delegation architecture]

Orchestration Frameworks

  • LangGraph (Python): graph-based workflow orchestration with state machines. Best for complex, branching workflows.
  • CrewAI (Python): role-based agent teams with built-in delegation patterns. Easier to set up for straightforward multi-agent scenarios.
  • Anthropic Agent SDK: direct Claude integration with MCP and tool use. Best for Claude-powered agents.
  • Mastra (TypeScript): full-stack agent framework with MCP support. Best for TypeScript teams.

Production Deployment Patterns

Demo agents work in minutes. Production agents require careful architecture for reliability, cost control, and security.

Reliability

LLM API calls fail. MCP tool calls fail. A2A sub-agents time out. Build retry logic with exponential backoff at every layer. Implement circuit breakers that stop retrying after N failures and fall back to a degraded experience. Use timeouts on every external call: 30 seconds for LLM calls, 10 seconds for tool calls, 60 seconds for sub-agent delegations. Log every step for debugging when things go wrong.
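
A minimal sketch of the backoff-plus-circuit-breaker pattern around any external call. The thresholds mirror the numbers above and are illustrative; production code would track breaker state per dependency:

```python
# Retry with exponential backoff plus a deliberately simplified circuit
# breaker. Thresholds are illustrative; tune per call type.
import time

FAILURE_LIMIT = 5
_failures = 0  # naive global state; use per-dependency state in production

def call_with_retry(fn, retries: int = 3, base_delay: float = 1.0):
    global _failures
    if _failures >= FAILURE_LIMIT:
        raise RuntimeError("Circuit open: fall back to a degraded experience")
    for attempt in range(retries):
        try:
            result = fn()
            _failures = 0  # success closes the circuit
            return result
        except Exception:
            _failures += 1
            if attempt == retries - 1:
                raise  # out of retries: let the caller degrade gracefully
            time.sleep(base_delay * 2**attempt)  # 1s, 2s, 4s, ...
```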

Cost Control

Agent loops (agent calls tool, gets result, decides to call another tool, repeat) can consume thousands of tokens. Set a maximum iteration limit (10 to 20 tool calls per request). Set a maximum token budget per request ($0.50 to $2.00 for most use cases). Track and display costs per agent run. Use smaller models (Claude Haiku, GPT-4o mini) for simple tool routing decisions and larger models (Claude Opus, GPT-4) only for complex reasoning steps. Model routing can reduce costs by 60 to 80%.
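
One way to enforce these limits is to make the agent loop itself budget-aware. A sketch with illustrative numbers; llm_step and execute_tool are hypothetical stand-ins for your own LLM and tool wrappers:

```python
# Budget-aware agent loop sketch: hard caps on iterations and spend.
# llm_step and execute_tool are hypothetical caller-supplied wrappers;
# the rate and limits are illustrative, not real pricing.
MAX_ITERATIONS = 15
MAX_COST_USD = 1.00
COST_PER_1K_TOKENS = 0.003  # blended illustrative rate

def run_agent(request: str, llm_step, execute_tool) -> str:
    spent = 0.0
    messages = [{"role": "user", "content": request}]
    for _ in range(MAX_ITERATIONS):
        reply, tokens_used = llm_step(messages)
        spent += tokens_used / 1000 * COST_PER_1K_TOKENS
        if spent > MAX_COST_USD:
            return "Token budget exceeded: returning best partial answer."
        if reply.get("tool_call") is None:
            return reply["text"]  # final answer, no more tools needed
        result = execute_tool(reply["tool_call"])
        messages.append({"role": "user", "content": str(result)})
    return "Iteration limit reached: returning best partial answer."
```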

Security

Every MCP tool call should validate permissions. An agent requesting "delete_all_users" should be blocked by the MCP server's authorization layer, not trusted because the LLM requested it. Implement allowlists for which tools each agent can use. Sandbox code execution tools in containers. Rate-limit expensive operations. Audit log every tool call and A2A delegation for compliance.
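
In code, that authorization layer lives in the server, not in the prompt. A sketch with hypothetical agent IDs and tool names:

```python
# Server-side authorization sketch: enforce a per-agent allowlist and audit
# every call, regardless of what the LLM requested. Names are hypothetical.
ALLOWED_TOOLS = {
    "research-agent": {"search_customers", "get_order_status"},
    "writer-agent": {"draft_email"},
}

AUDIT_LOG = []  # in production, write to durable, append-only storage

def authorize_tool_call(agent_id: str, tool_name: str) -> None:
    """Raise before execution if this agent may not call this tool."""
    permitted = tool_name in ALLOWED_TOOLS.get(agent_id, set())
    AUDIT_LOG.append({"agent": agent_id, "tool": tool_name, "ok": permitted})
    if not permitted:
        raise PermissionError(f"{agent_id} may not call {tool_name}")
```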

Evaluation

Test agent behavior with evaluation suites. Define test cases: given this user request, the agent should call these tools in roughly this order and produce this type of result. Run evaluations automatically in CI/CD. Track accuracy, latency, cost, and tool call patterns over time. Agent behavior drifts as LLM versions change, so continuous evaluation is essential.
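
Evaluation cases can be plain data plus assertions. A hedged sketch in the pytest style; run_agent_traced is a hypothetical harness that returns the answer along with the list of tools the agent called:

```python
# Evaluation sketch: assert expected tool-call patterns for known requests.
# run_agent_traced is a hypothetical harness returning (answer, tools_called).
EVAL_CASES = [
    {
        "request": "What's the status of order 4512?",
        "expected_tools": ["get_order_status"],
        "expected_keywords": ["4512"],
    },
]

def test_tool_selection():
    for case in EVAL_CASES:
        answer, tools_called = run_agent_traced(case["request"])
        assert tools_called == case["expected_tools"]
        assert all(kw in answer for kw in case["expected_keywords"])
```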

Tech Stack for Agent Development

Here is the recommended stack for building tool-use agents in production.

Agent Runtime

  • Python teams: LangGraph or CrewAI for orchestration. Anthropic SDK or OpenAI SDK for LLM calls. MCP Python SDK for tool servers.
  • TypeScript teams: Mastra or Vercel AI SDK for orchestration. Anthropic SDK or OpenAI SDK for LLM calls. MCP TypeScript SDK for tool servers.

MCP Infrastructure

Run MCP servers as standalone services (Docker containers or serverless functions). Use HTTP+SSE transport for remote tools (database queries, API calls). Use stdio transport for local tools (file operations, code execution). Deploy MCP servers independently so they can be shared across multiple agents and updated without redeploying the agent.
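
With the Python SDK, moving a server from local stdio to a remote HTTP transport is a one-line change. A sketch reusing the earlier server; transport names follow recent versions of the official SDK:

```python
# The earlier FastMCP server, exposed over HTTP so multiple agents can
# share it. Transport names per recent versions of the official Python SDK.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("order-tools")

# ... @mcp.tool() definitions as in the earlier sketch ...

if __name__ == "__main__":
    # "sse" serves the server over HTTP+SSE for remote agents;
    # newer SDK versions also offer "streamable-http".
    mcp.run(transport="sse")
```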

Observability

Use LangSmith, Braintrust, or Helicone for LLM observability. Track every LLM call, tool call, and agent step. Visualize agent execution traces to debug failures. Monitor latency and cost per agent run. Set alerts for anomalous behavior (agent making 50+ tool calls, agent spending $10+ per request).
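
As one example of instrumenting agent steps, LangSmith's Python client provides a decorator that records any function call as a trace step. A minimal sketch, assuming the langsmith package is installed and a LANGSMITH_API_KEY is set in the environment:

```python
# Tracing sketch with LangSmith's @traceable decorator: each call to this
# wrapper is recorded as a step in the agent's execution trace.
from langsmith import traceable

@traceable(name="mcp_tool_call")
def call_tool_traced(tool_name: str, args: dict) -> dict:
    # ... invoke the MCP tool here and return its structured result ...
    return {"tool": tool_name, "args": args, "status": "ok"}
```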

Infrastructure

Deploy agent backends on AWS Lambda, Cloud Run, or a containerized service. Use Redis for caching tool results and agent state, and PostgreSQL for storing conversation history and agent configurations. Expose a WebSocket or SSE endpoint for streaming agent progress to the frontend (AG-UI).
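
A sketch of the tool-result caching layer with Redis; the key scheme and five-minute TTL are illustrative, and only tools whose results can safely go stale should be cached:

```python
# Tool-result caching sketch with Redis (pip install redis). Key scheme and
# TTL are illustrative; skip caching for tools with side effects.
import hashlib
import json

import redis

r = redis.Redis()

def cached_tool_call(tool_name: str, args: dict, execute) -> dict:
    """Serve a cached result if present; otherwise execute and cache it."""
    raw = f"{tool_name}:{json.dumps(args, sort_keys=True)}".encode()
    key = "tool:" + hashlib.sha256(raw).hexdigest()
    hit = r.get(key)
    if hit is not None:
        return json.loads(hit)
    result = execute(tool_name, args)  # execute is your tool dispatcher
    r.setex(key, 300, json.dumps(result))  # expire after 5 minutes
    return result
```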

Getting Started: Your First Tool-Use Agent

Here is a concrete path from zero to a working tool-use agent.

Week 1: Single Agent with MCP Tools

Build one MCP server that exposes 3 to 5 tools relevant to your product (e.g., search_customers, get_order_status, update_record). Connect it to Claude or GPT-4 using the respective SDK with tool use enabled. Test with real user queries. Measure tool selection accuracy, response quality, and latency. This alone delivers significant value and teaches you the fundamentals.
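
The core of week 1 is the tool-use loop itself. A minimal sketch with the Anthropic SDK; the model name is illustrative, and execute_tool stands in for your own dispatcher that forwards calls to the MCP server:

```python
# Minimal tool-use loop sketch (Anthropic SDK): send tool definitions,
# execute whatever the model requests, feed results back until it answers.
# The model name and execute_tool dispatcher are illustrative.
import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_agent(user_message: str, tools: list, execute_tool) -> str:
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return response.content[0].text  # final text answer
        # Record the assistant turn, then return every requested tool result.
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(execute_tool(block.name, block.input)),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```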

Week 2 to 3: Multi-Tool Agent with Guardrails

Add 2 to 3 more MCP servers (e.g., email integration, analytics queries). Implement authorization checks on tool calls. Add retry logic and error handling. Build a simple frontend that streams agent responses (AG-UI or a basic chat interface). Deploy to staging and test with internal users.

Week 4+: Multi-Agent System

Split the single agent into an orchestrator + specialist sub-agents using A2A. Add evaluation suites and CI/CD testing. Implement cost controls and monitoring. Deploy to production with feature flags for gradual rollout. Iterate based on user feedback and evaluation metrics.

[Image: Server infrastructure powering production AI agent deployment with MCP tool servers and monitoring]

Realistic Costs

MCP server development: $5,000 to $15,000 per integration. Agent orchestration layer: $15,000 to $40,000. Frontend integration: $10,000 to $25,000. Total for a production multi-agent system: $30,000 to $80,000. Ongoing LLM API costs: $500 to $5,000 per month depending on usage volume.

Ready to build AI agents for your product? Book a free strategy call to plan your agent architecture and MCP integration strategy.


Tags: AI tool-use agents · MCP protocol guide · A2A protocol agents · agentic AI architecture · multi-agent AI system
