Most introductions to Model Context Protocol (MCP) jump straight to "here is how to build a server." That works for hello-world demos, but it leaves you without a real mental model of what is happening under the hood — which makes debugging painful and design decisions guesswork.
This article slows down and walks through the complete MCP architecture: the layers, the actors, the message lifecycle, and the wire-level details. By the end you will be able to read an MCP trace and know exactly what is happening at every step.
The four-layer mental model
MCP is best understood as four stacked layers:
| Layer | What it does | Example |
|---|---|---|
| Host | The user-facing application | Claude Desktop, Cursor, an internal AI agent |
| Client | The MCP runtime embedded in the host | The MCP library inside Claude Desktop |
| Transport | The wire — moves JSON-RPC messages | stdio, Streamable HTTP |
| Server | The capability provider | A weather MCP server, a Postgres MCP server |
The host is the only layer the human sees. Everything below it is plumbing.
Host vs Client — a common point of confusion
These two terms get used interchangeably, but the spec treats them as distinct:
- A host is the application (Claude Desktop is one host).
- A client is one connection the host opens to one server.
A single host typically opens many clients — one per configured MCP server. So Claude Desktop with three MCP servers configured has three active clients running inside it, each talking to a different server.
```
┌──────────────────── Host (Claude Desktop) ─────────────────────┐
│                                                                │
│  ┌─ Client A ──┐     ┌─ Client B ──┐     ┌─ Client C ──┐       │
│  │   stdio ⇄   │     │   HTTP ⇄    │     │   stdio ⇄   │       │
│  └──────┬──────┘     └──────┬──────┘     └──────┬──────┘       │
│         │                   │                   │              │
└─────────┼───────────────────┼───────────────────┼──────────────┘
          │                   │                   │
    ┌─────▼─────┐       ┌─────▼─────┐       ┌─────▼─────┐
    │ Postgres  │       │  GitHub   │       │   Slack   │
    │  Server   │       │  Server   │       │  Server   │
    └───────────┘       └───────────┘       └───────────┘
```
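In Claude Desktop this fan-out is driven by configuration: each entry in `claude_desktop_config.json` gets its own client. A sketch with three stdio servers (the package names here are illustrative, not real packages; remote HTTP servers are configured differently depending on the host):

```json
{
  "mcpServers": {
    "postgres": { "command": "npx", "args": ["-y", "@example/postgres-mcp"] },
    "github":   { "command": "npx", "args": ["-y", "@example/github-mcp"] },
    "slack":    { "command": "npx", "args": ["-y", "@example/slack-mcp"] }
  }
}
```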
The three transports
MCP messages are JSON-RPC 2.0 — a tiny, widely supported protocol. What changes between transports is how those messages are physically moved.
1. stdio
The server runs as a subprocess of the host. Messages flow over the subprocess's stdin/stdout streams, one JSON object per line.
Use it when:
- The server runs on the same machine as the host
- You want zero network configuration
- The user already has Node, Python, or whatever runtime installed
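The framing itself is trivial: one JSON object per line, in both directions. A minimal sketch of the encode/decode step (the helper names are mine — the SDK's transports do this for you):

```typescript
// Minimal sketch of stdio framing: one JSON-RPC message per line.
type JsonRpcMessage = {
  jsonrpc: '2.0';
  id?: number;
  method?: string;
  params?: unknown;
  result?: unknown;
};

// Serialize a message for the wire: the JSON object plus a trailing newline.
function encodeMessage(msg: JsonRpcMessage): string {
  return JSON.stringify(msg) + '\n';
}

// Split a buffered chunk of the stream back into individual messages.
function decodeLines(buffer: string): JsonRpcMessage[] {
  return buffer
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as JsonRpcMessage);
}
```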
2. HTTP + SSE (legacy)
The server is a standalone HTTP service. Client → server requests use plain HTTP POST. Server → client notifications use Server-Sent Events (SSE) on a separate persistent connection.
Use it when:
- The server is remote (different machine, different network)
- Multiple hosts need to connect to the same server
3. Streamable HTTP (the modern remote transport)
A refinement introduced in a 2025 spec update. Combines request/response and streaming into a single endpoint using chunked HTTP or SSE responses. Simpler to deploy than the dual-channel HTTP+SSE pattern, and the recommended transport for new remote servers.
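On the wire, a Streamable HTTP client POSTs each JSON-RPC message to one server-defined endpoint (`/mcp` here is illustrative) and advertises that it can accept either a plain JSON response or an SSE stream:

```http
POST /mcp HTTP/1.1
Host: weather.example.com
Content-Type: application/json
Accept: application/json, text/event-stream

{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }
```

The server answers with `Content-Type: application/json` for a single response, or `text/event-stream` when it wants to stream multiple related messages back on the same request.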
Anatomy of an MCP session
Every MCP session goes through the same five phases. Walking through them with real messages is the fastest way to internalize how the protocol works.
Phase 1: Initialize
The client says "hello, here is what I support." The server responds with what it supports.
Client → Server:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": { "sampling": {}, "roots": { "listChanged": true } },
    "clientInfo": { "name": "my-host", "version": "1.0.0" }
  }
}
```
Server → Client:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": {},
      "prompts": {}
    },
    "serverInfo": { "name": "weather-server", "version": "1.0.0" }
  }
}
```
Note the capability negotiation. Both sides declare what they can do; the rest of the session is constrained by the intersection.
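In practice that means guarding every optional feature on what the other side declared. A sketch of the check a host might make (shapes simplified from the spec's `ServerCapabilities`):

```typescript
// Simplified shapes; the spec's ServerCapabilities object has more fields.
interface ServerCapabilities {
  tools?: { listChanged?: boolean };
  resources?: { subscribe?: boolean };
  prompts?: Record<string, unknown>;
}

// A host should only register a notifications/tools/list_changed handler
// if the server declared tools.listChanged during initialize.
function canListenForToolChanges(caps: ServerCapabilities): boolean {
  return caps.tools?.listChanged === true;
}
```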
Phase 2: Initialized notification
The client follows up with a fire-and-forget notification that the handshake is done. No id field — this is a one-way notification, not a request.
```json
{ "jsonrpc": "2.0", "method": "notifications/initialized" }
```
Phase 3: Discovery
Now the client asks the server what it offers:
```json
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }
```
Response:
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [{
      "name": "get_weather",
      "description": "Get current weather for any city",
      "inputSchema": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }]
  }
}
```
The host hands these tools to the LLM as available actions. Similar resources/list and prompts/list calls fetch the other two primitives.
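"Hands these tools to the LLM" usually means a direct mapping from each `tools/list` entry into whatever tool-definition shape the host's LLM API expects. A hypothetical mapping (the target field names vary by LLM API):

```typescript
// The shape a tools/list entry arrives in.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: object; // JSON Schema, passed through as-is
}

// Hypothetical target shape — each LLM API names these fields differently.
function toLlmTool(t: McpTool): { name: string; description: string; parameters: object } {
  return { name: t.name, description: t.description ?? '', parameters: t.inputSchema };
}
```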
Phase 4: Invocation
The LLM decides to call get_weather. The host forwards the call:
```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "Chennai" }
  }
}
```
The server executes and returns:
```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [{ "type": "text", "text": "It is 31°C in Chennai, partly cloudy." }],
    "isError": false
  }
}
```
The host feeds the text back into the LLM's context. The conversation continues.
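Note the `isError` flag: tool failures are reported inside the result rather than as JSON-RPC protocol errors, so the model can see the failure text and decide how to recover. A sketch of how a host might flatten a result before feeding it back (the helper is illustrative):

```typescript
// Shape of a tools/call result, simplified to text content blocks.
interface ToolResult {
  content: { type: string; text?: string }[];
  isError?: boolean;
}

// Flatten the content blocks into one string for the model's context,
// prefixing failures so the model knows the call did not succeed.
function renderForModel(r: ToolResult): string {
  const text = r.content.map((c) => c.text ?? '').join('\n');
  return r.isError ? `Tool failed: ${text}` : text;
}
```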
Phase 5: Shutdown
The client closes the transport. For stdio that means closing stdin/stdout. For HTTP that means closing the SSE connection or letting it time out. There is no explicit "goodbye" message in the protocol.
Requests, responses, and notifications
JSON-RPC 2.0 has three message kinds; MCP uses all three:
| Kind | Has id? | Expects reply? | MCP example |
|---|---|---|---|
| Request | Yes | Yes | `tools/list`, `tools/call` |
| Response | Same id as request | — | The result or error |
| Notification | No | No | `notifications/initialized`, `notifications/progress` |
Notifications are how servers push events to clients without a request-response roundtrip — used for progress updates, resource changes, and logging.
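For example, a server reporting progress on a long-running tool call emits something like this (the token echoes whatever the client supplied as `_meta.progressToken` in the original request):

```json
{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": { "progressToken": "abc-123", "progress": 40, "total": 100 }
}
```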
A minimal echo server in code
Here is the smallest possible MCP server that exposes one tool. It captures everything the architecture demands in under 30 lines:
```typescript
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const server = new McpServer({
  name: 'echo-server',
  version: '1.0.0',
});

server.tool(
  'echo',
  'Echoes back whatever message you send',
  { message: z.string().describe('Anything you want echoed back') },
  async ({ message }) => ({
    content: [{ type: 'text', text: `You said: ${message}` }],
  })
);

await server.connect(new StdioServerTransport());
```
The SDK handles the initialize handshake, capability negotiation, tools/list response, and tools/call dispatch automatically. You provide the business logic; the SDK provides the wire protocol.
Tools, resources, prompts — the three capabilities
A server can expose any combination of three primitives:
- Tools — actions the LLM can invoke (`tools/list`, `tools/call`)
- Resources — data the LLM can read (`resources/list`, `resources/read`, `resources/subscribe`)
- Prompts — pre-baked templates the user can invoke (`prompts/list`, `prompts/get`)
A server that only offers tools is the most common. A server that also offers resources (e.g., a file system or database server) lets the host attach context to a conversation. A server that offers prompts gives users one-click access to standardized workflows.
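For a feel of the resources side: `resources/read` takes a `uri` parameter, and the result wraps the data in a `contents` array whose entries carry a `uri`, a `mimeType`, and either `text` or a base64 `blob`. The URI and payload here are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "result": {
    "contents": [{
      "uri": "file:///logs/app.log",
      "mimeType": "text/plain",
      "text": "2025-07-01 12:00:01 INFO server started"
    }]
  }
}
```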
Beyond the basics
A few advanced architectural features that are useful to know exist:
- Sampling — a server can ask the client's LLM to generate text on its behalf (reverse direction). Powerful for agentic patterns.
- Roots — the client tells the server which filesystem paths it should consider "in scope."
- Cancellation — long-running calls can be cancelled via `notifications/cancelled`.
- Progress — long-running calls can emit `notifications/progress` so the host can show a spinner.
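As one example of the reverse direction, a sampling request travels server → client and looks roughly like this (simplified from the spec's `sampling/createMessage` shape):

```json
{
  "jsonrpc": "2.0",
  "id": 9,
  "method": "sampling/createMessage",
  "params": {
    "messages": [{ "role": "user", "content": { "type": "text", "text": "Summarize these rows." } }],
    "maxTokens": 200
  }
}
```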
You rarely need these on day one, but they are why MCP feels mature compared to ad-hoc tool integrations.
Conclusion
MCP is conceptually small: four layers, three transports, three primitives, and a JSON-RPC 2.0 message format. The complexity comes from the interactions between them — capability negotiation, progress streaming, multi-server hosts, sampling — but the foundation is approachable.
If you understand the initialize → list → call → shutdown cycle, you understand MCP. Everything else is detail you can pick up as you need it.
Try it yourself
After connecting your weather MCP server to Claude Desktop, here is what a discovery-and-invoke conversation looks like:
The user asks about the weather in Chennai; the assistant invokes `get_weather` and replies:

> It is currently 31°C in Chennai, partly cloudy with light winds from the southeast. Humidity is around 68% and the feels-like temperature is closer to 34°C.

What just happened underneath: the client sent `initialize`, the server responded with capabilities, the client sent the `initialized` notification, the client called `tools/list` and received the `get_weather` schema, the LLM emitted `tools/call` with `{ "city": "Chennai" }`, and the server fetched the upstream weather API and returned a text result. Seven JSON-RPC messages total.