MCP Architecture Explained: Clients, Servers, Transports, and Tools
Most introductions to Model Context Protocol (MCP) jump straight to "here is how to build a server." That works for hello-world demos, but it leaves you without a real mental model of what is happening under the hood — which makes debugging painful and design decisions guesswork.
This article slows down and walks through the complete MCP architecture: the layers, the actors, the message lifecycle, and the wire-level details. By the end you will be able to read an MCP trace and know exactly what is happening at every step.
The four-layer mental model #
MCP is best understood as four stacked layers:
| Layer | What it does | Example |
|---|---|---|
| Host | The user-facing application | Claude Desktop, Cursor, an internal AI agent |
| Client | The MCP runtime embedded in the host | The MCP library inside Claude Desktop |
| Transport | The wire — moves JSON-RPC messages | stdio, Streamable HTTP |
| Server | The capability provider | A weather MCP server, a Postgres MCP server |
The host is the only layer the human sees. Everything below it is plumbing.
Host vs Client — a common point of confusion #
These two terms get used interchangeably, but the spec treats them as distinct:
- A host is the application (Claude Desktop is one host).
- A client is one connection the host opens to one server.
A single host typically opens many clients — one per configured MCP server. So Claude Desktop with three MCP servers configured has three active clients running inside it, each talking to a different server.
┌──────────────────────── Host (Claude Desktop) ────────────────────────┐
│                                                                       │
│   ┌─ Client A ─┐     ┌─ Client B ─┐     ┌─ Client C ─┐                │
│   │  stdio ⇄   │     │  HTTP ⇄    │     │  stdio ⇄   │                │
│   └────────────┘     └────────────┘     └────────────┘                │
│         │                  │                  │                       │
└─────────┼──────────────────┼──────────────────┼───────────────────────┘
          │                  │                  │
     ┌────▼─────┐      ┌─────▼─────┐      ┌─────▼─────┐
     │ Postgres │      │  GitHub   │      │   Slack   │
     │  Server  │      │  Server   │      │  Server   │
     └──────────┘      └───────────┘      └───────────┘
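The one-client-per-server relationship can be sketched as a small data model. This is only an illustration: `Host`, `ServerConfig`, and `ClientConnection` are hypothetical names, not types from any MCP SDK, and a real client would also own a live transport.

```typescript
// Hypothetical model of a host that owns one client per configured server.
// These types are illustrative only; they are not part of any MCP SDK.
type TransportKind = 'stdio' | 'streamable-http';

interface ServerConfig {
  name: string;
  transport: TransportKind;
}

interface ClientConnection {
  serverName: string;
  transport: TransportKind;
}

class Host {
  // One entry per configured server, keyed by server name.
  private clients = new Map<string, ClientConnection>();

  constructor(configs: ServerConfig[]) {
    for (const cfg of configs) {
      this.clients.set(cfg.name, { serverName: cfg.name, transport: cfg.transport });
    }
  }

  clientCount(): number {
    return this.clients.size;
  }

  clientFor(serverName: string): ClientConnection | undefined {
    return this.clients.get(serverName);
  }
}

// Three configured servers produce three clients inside a single host.
const host = new Host([
  { name: 'postgres', transport: 'stdio' },
  { name: 'github', transport: 'streamable-http' },
  { name: 'slack', transport: 'stdio' },
]);
```

Keying the map by server name mirrors the diagram: the host routes each tool call to exactly one client, and that client talks to exactly one server.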
The three transports #
MCP messages are JSON-RPC 2.0, a tiny, widely supported protocol. What changes between transports is how those messages are physically moved.
1. stdio #
The server runs as a subprocess of the host. Messages flow over the subprocess's stdin/stdout streams, one JSON object per line.
Use it when:
- The server runs on the same machine as the host
- You want zero network configuration
- The user already has Node, Python, or whatever runtime installed
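The "one JSON object per line" framing used by stdio can be sketched in a few lines. `LineFramer` and `frame` are hypothetical helpers written for this article, not SDK classes; the sketch assumes chunks arrive as strings and may split a message at any point.

```typescript
// Minimal sketch of newline-delimited JSON framing, as used on stdio.
// LineFramer is a hypothetical helper, not an SDK class.
class LineFramer {
  private buffer = '';

  // Feed a raw chunk; returns every complete JSON message it finishes.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newline = this.buffer.indexOf('\n');
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) messages.push(JSON.parse(line));
      newline = this.buffer.indexOf('\n');
    }
    return messages;
  }
}

// Serialize one message as a single line.
// JSON.stringify without an indent argument never emits raw newlines.
function frame(message: unknown): string {
  return JSON.stringify(message) + '\n';
}

const framer = new LineFramer();
const wire = frame({ jsonrpc: '2.0', id: 2, method: 'tools/list' });
const first = framer.push(wire.slice(0, 10)); // partial line: nothing complete yet
const second = framer.push(wire.slice(10));   // rest of the line: one message
```

Buffering until a newline arrives matters because a pipe can deliver half a message in one read and the other half in the next.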
2. HTTP + SSE (legacy) #
The server is a standalone HTTP service. Client → server messages use plain HTTP POST. Server → client messages, responses included, arrive as Server-Sent Events (SSE) on a separate persistent connection.
Use it when:
- The server is remote (different machine, different network)
- Multiple hosts need to connect to the same server
3. Streamable HTTP (the modern remote transport) #
A refinement introduced in the 2025-03-26 revision of the spec. It folds request/response and streaming into a single HTTP endpoint: the client POSTs each message, and the server answers with either a plain JSON response or an SSE stream. It is simpler to deploy than the dual-channel HTTP+SSE pattern, and it is the recommended transport for new remote servers.
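To make the single-endpoint idea concrete, here is a rough sketch of the request a client might build and of how an SSE-style reply could be unpacked. The URL is a placeholder, and `buildPost` and `parseSseData` are hypothetical helpers, not SDK functions; the SSE parsing handles only `data:` lines and ignores the rest of the SSE format.

```typescript
// Sketch of the request shape a client might build for Streamable HTTP.
// The endpoint URL is a placeholder; both helpers are hypothetical.
interface HttpRequestSketch {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function buildPost(url: string, message: unknown): HttpRequestSketch {
  return {
    url,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // The client accepts either a single JSON reply or an SSE stream.
      Accept: 'application/json, text/event-stream',
    },
    body: JSON.stringify(message),
  };
}

// Extract the JSON payloads from an SSE body (only `data:` lines are handled).
function parseSseData(body: string): unknown[] {
  return body
    .split(/\r?\n\r?\n/) // events are separated by a blank line
    .map((event) =>
      event
        .split(/\r?\n/)
        .filter((line) => line.startsWith('data:'))
        .map((line) => line.slice(5).trimStart()) // drop "data:" and leading whitespace
        .join('\n'),
    )
    .filter((data) => data.length > 0)
    .map((data) => JSON.parse(data));
}

const req = buildPost('https://example.com/mcp', { jsonrpc: '2.0', id: 2, method: 'tools/list' });
const events = parseSseData(
  'event: message\ndata: {"jsonrpc":"2.0","id":2,"result":{"tools":[]}}\n\n',
);
```

The same POST can come back as one JSON document or as a stream of events, which is why the client advertises both content types up front.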
Anatomy of an MCP session #
Every MCP session goes through the same five phases. Walking through them with real messages is the fastest way to internalize how the protocol works.
Phase 1: Initialize #
The client says "hello, here is what I support." The server responds with what it supports.
Client → Server:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": { "sampling": {}, "roots": { "listChanged": true } },
    "clientInfo": { "name": "my-host", "version": "1.0.0" }
  }
}
Server → Client:
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": {},
      "prompts": {}
    },
    "serverInfo": { "name": "weather-server", "version": "1.0.0" }
  }
}
Note the capability negotiation. Both sides declare what they can do; the rest of the session is constrained by the intersection.
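One way a client could honor that constraint is to check the server's declared capabilities before sending a method. The sketch below is an assumption about how such a guard might look, not SDK behavior; in particular, mapping a method to a capability by its prefix (`tools/call` → `tools`) is a simplification chosen for illustration.

```typescript
// Sketch: gate method calls on what the server declared during initialize.
// The prefix-to-capability mapping below is an illustrative assumption.
type Capabilities = Record<string, object | undefined>;

// Capabilities copied from the example initialize response.
const serverCapabilities: Capabilities = {
  tools: { listChanged: true },
  resources: {},
  prompts: {},
};

// "tools/list" -> "tools", "resources/read" -> "resources", and so on.
function requiredCapability(method: string): string {
  const [prefix = ''] = method.split('/');
  return prefix;
}

// True only when the server declared the capability the method belongs to.
function serverSupports(method: string, caps: Capabilities): boolean {
  return caps[requiredCapability(method)] !== undefined;
}
```

With the example response, `serverSupports('tools/call', serverCapabilities)` is true, while `serverSupports('logging/setLevel', serverCapabilities)` is false because the server never declared `logging`.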
Phase 2: Initialized notification #
The client follows up with a fire-and-forget notification that the handshake is done. No id field — this is a one-way notification, not a request.
{ "jsonrpc": "2.0", "method": "notifications/initialized" }
Phase 3: Discovery #
Now the client asks the server what it offers:
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }
Response:
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [{
      "name": "get_weather",
      "description": "Get current weather for any city",
      "inputSchema": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }]
  }
}
The host hands these tools to the LLM as available actions. Similar resources/list and prompts/list calls fetch the other two primitives.
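Because `inputSchema` is ordinary JSON Schema, a host can check the LLM's proposed arguments before forwarding them. The sketch below is deliberately tiny and hypothetical: it handles only `required` and the primitive types that line up with `typeof` (string, number, boolean), and a real host would more likely use a full JSON Schema validator.

```typescript
// Sketch: check tool arguments against the inputSchema returned by tools/list.
// Handles only `required` and primitive `type` checks that match typeof.
interface ToolInputSchema {
  type: 'object';
  properties?: Record<string, { type?: string }>;
  required?: string[];
}

function validateArguments(
  schema: ToolInputSchema,
  args: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  // Every required key must be present.
  for (const key of schema.required ?? []) {
    if (!(key in args)) errors.push(`missing required argument: ${key}`);
  }
  // Every supplied key with a declared type must match it.
  for (const [key, value] of Object.entries(args)) {
    const expected = schema.properties?.[key]?.type;
    if (expected !== undefined && typeof value !== expected) {
      errors.push(`argument ${key} should be ${expected}, got ${typeof value}`);
    }
  }
  return errors;
}

// Schema copied from the get_weather tool in the tools/list response.
const getWeatherSchema: ToolInputSchema = {
  type: 'object',
  properties: { city: { type: 'string' } },
  required: ['city'],
};

const ok = validateArguments(getWeatherSchema, { city: 'Chennai' });
const missing = validateArguments(getWeatherSchema, {});
const wrongType = validateArguments(getWeatherSchema, { city: 42 });
```

Returning a list of error strings rather than throwing lets the host feed the problems back to the LLM so it can retry with corrected arguments.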
Phase 4: Invocation #
The LLM decides to call get_weather. The host forwards the call:
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "Chennai" }
  }
}
The server executes and returns:
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [{ "type": "text", "text": "It is 31°C in Chennai, partly cloudy." }],
    "isError": false
  }
}
The host feeds the text back into the LLM's context. The conversation continues.
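How a host turns that result into LLM context is up to the host. A minimal sketch, assuming the host only cares about text blocks and wants tool failures flagged, might look like this; `toModelText` and the "Tool error:" prefix are hypothetical choices, not part of the protocol.

```typescript
// Sketch: flatten a tools/call result into text for the LLM's context.
// toModelText and the "Tool error:" prefix are illustrative assumptions.
interface ContentBlock {
  type: string;
  text?: string;
}

interface ToolCallResult {
  content: ContentBlock[];
  isError?: boolean;
}

function toModelText(result: ToolCallResult): string {
  const text = result.content
    // Keep only text blocks; other block types are skipped in this sketch.
    .filter((block) => block.type === 'text' && typeof block.text === 'string')
    .map((block) => block.text)
    .join('\n');
  // A failed tool run still arrives as a normal result, flagged by isError.
  return result.isError ? `Tool error: ${text}` : text;
}

const success = toModelText({
  content: [{ type: 'text', text: 'It is 31°C in Chennai, partly cloudy.' }],
  isError: false,
});
const failure = toModelText({
  content: [{ type: 'text', text: 'Upstream weather API timed out.' }],
  isError: true,
});
```

Passing the error text through, instead of hiding it, gives the LLM a chance to explain the failure or try a different approach.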
Phase 5: Shutdown #
The client closes the transport. For stdio that means closing stdin/stdout. For HTTP that means closing the SSE connection or letting it time out. There is no explicit "goodbye" message in the protocol.
Requests, responses, and notifications #
JSON-RPC 2.0 has three message kinds; MCP uses all three:
| Kind | Has id? | Expects reply? | MCP example |
|---|---|---|---|
| Request | Yes | Yes | tools/list, tools/call |
| Response | Same id as request | n/a | The result or error |
| Notification | No | No | notifications/initialized, notifications/progress |
Notifications let either side push events without a request-response round trip. Servers use them for progress updates, resource changes, and logging; clients use them for notifications/initialized and for cancellation.
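The three kinds can be told apart from the message shape alone, and responses can be matched to requests by `id`. The following is a minimal sketch of that logic; `classify`, `onOutgoing`, and `onIncoming` are hypothetical names, and a real client would store promise resolvers rather than method names.

```typescript
// Sketch: classify JSON-RPC messages and match responses to requests by id.
type JsonRpcId = string | number;

interface JsonRpcMessage {
  jsonrpc: '2.0';
  id?: JsonRpcId;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string; data?: unknown };
}

type Kind = 'request' | 'response' | 'notification';

function classify(msg: JsonRpcMessage): Kind {
  if (msg.method !== undefined) {
    // A method with an id expects a reply; without an id it is one-way.
    return msg.id !== undefined ? 'request' : 'notification';
  }
  return 'response';
}

// Requests awaiting a reply: id -> method name.
const pending = new Map<JsonRpcId, string>();

function onOutgoing(msg: JsonRpcMessage): void {
  if (classify(msg) === 'request' && msg.id !== undefined) {
    pending.set(msg.id, msg.method ?? '');
  }
}

// Returns the method this response answers, or undefined if nothing matches.
function onIncoming(msg: JsonRpcMessage): string | undefined {
  if (classify(msg) !== 'response' || msg.id === undefined) return undefined;
  const method = pending.get(msg.id);
  pending.delete(msg.id);
  return method;
}

onOutgoing({ jsonrpc: '2.0', id: 2, method: 'tools/list' });
const answered = onIncoming({ jsonrpc: '2.0', id: 2, result: { tools: [] } });
```

Because matching is by `id` rather than by order, responses may arrive out of sequence without confusing the client.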
A minimal echo server in code #
Here is the smallest possible MCP server that exposes one tool. It captures everything the architecture demands in under 30 lines:
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

// Server identity, reported to the client in the initialize response.
const server = new McpServer({
  name: 'echo-server',
  version: '1.0.0',
});

// Register one tool: name, description, input schema (zod), and handler.
server.tool(
  'echo',
  'Echoes back whatever message you send',
  { message: z.string().describe('Anything you want echoed back') },
  async ({ message }) => ({
    content: [{ type: 'text', text: `You said: ${message}` }],
  })
);

// Serve over stdio. Top-level await requires running this file as an ES module.
await server.connect(new StdioServerTransport());
The SDK handles the initialize handshake, capability negotiation, tools/list response, and tools/call dispatch automatically. You provide the business logic; the SDK provides the wire protocol.
Tools, resources, prompts — the three capabilities #
A server can expose any combination of three primitives:
- Tools: actions the LLM can invoke (tools/list, tools/call)
- Resources: data the LLM can read (resources/list, resources/read, resources/subscribe)
- Prompts: pre-baked templates the user can invoke (prompts/list, prompts/get)
A server that only offers tools is the most common. A server that also offers resources (e.g., a file system or database server) lets the host attach context to a conversation. A server that offers prompts gives users one-click access to standardized workflows.
Beyond the basics #
A few advanced architectural features that are useful to know exist:
- Sampling: a server can ask the client's LLM to generate text on its behalf (reverse direction). Powerful for agentic patterns.
- Roots: the client tells the server which filesystem paths it should consider "in scope."
- Cancellation: long-running calls can be cancelled via notifications/cancelled.
- Progress: long-running calls can emit notifications/progress so the host can show a spinner.
You rarely need these on day one, but they are why MCP feels mature compared to ad-hoc tool integrations.
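For a feel of what cancellation and progress look like on the wire, here is a sketch that builds both notifications. The helper names are hypothetical; the field names (`requestId`, `reason`, `progressToken`, `progress`, `total`) follow the MCP spec as I understand it, so check them against the spec revision you target.

```typescript
// Sketch: build the two notifications used for cancellation and progress.
// Helper names are hypothetical; field names are taken from the MCP spec.
type RequestId = string | number;
type ProgressToken = string | number;

interface Notification<P> {
  jsonrpc: '2.0';
  method: string;
  params: P;
}

interface CancelledParams {
  requestId: RequestId;
  reason?: string;
}

interface ProgressParams {
  progressToken: ProgressToken;
  progress: number;
  total?: number;
}

// Ask the receiver to stop working on an in-flight request.
function cancelledNotification(requestId: RequestId, reason?: string): Notification<CancelledParams> {
  const params: CancelledParams = { requestId };
  if (reason !== undefined) params.reason = reason;
  return { jsonrpc: '2.0', method: 'notifications/cancelled', params };
}

// Report progress on a request that opted in with a progress token.
function progressNotification(token: ProgressToken, progress: number, total?: number): Notification<ProgressParams> {
  const params: ProgressParams = { progressToken: token, progress };
  if (total !== undefined) params.total = total;
  return { jsonrpc: '2.0', method: 'notifications/progress', params };
}

// The caller opts in to progress by attaching a token to the request's _meta.
const callWithProgress = {
  jsonrpc: '2.0',
  id: 3,
  method: 'tools/call',
  params: {
    name: 'get_weather',
    arguments: { city: 'Chennai' },
    _meta: { progressToken: 'weather-3' },
  },
};

const halfway = progressNotification('weather-3', 50, 100);
const abort = cancelledNotification(3, 'user closed the chat');
```

Both messages are notifications, so neither carries an `id` of its own: the progress token and the `requestId` field are what tie them back to the original call.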
Conclusion #
MCP is conceptually small: four layers, three transports, three primitives, and a JSON-RPC 2.0 message format. The complexity comes from the interactions between them — capability negotiation, progress streaming, multi-server hosts, sampling — but the foundation is approachable.
If you understand the initialize → list → call → shutdown cycle, you understand MCP. Everything else is detail you can pick up as you need it.
Try it yourself #
After connecting your weather MCP server to Claude Desktop, here is what a discovery-and-invoke conversation looks like:
get_weather
It is currently 31°C in Chennai, partly cloudy with light winds from the southeast. Humidity is around 68% and the feels-like temperature is closer to 34°C.
What just happened underneath: the client sent initialize and the server responded with its capabilities; the client sent notifications/initialized; the client called tools/list and received the get_weather schema; the LLM chose get_weather, so the client sent tools/call with { "city": "Chennai" }; and the server fetched the upstream weather API and returned a text result. That is three request/response pairs plus one notification: seven JSON-RPC messages in total.


