MCP Architecture Explained: Clients, Servers, Transports, and Tools

Most introductions to Model Context Protocol (MCP) jump straight to "here is how to build a server." That works for hello-world demos, but it leaves you without a real mental model of what is happening under the hood — which makes debugging painful and design decisions guesswork.

This article slows down and walks through the complete MCP architecture: the layers, the actors, the message lifecycle, and the wire-level details. By the end you will be able to read an MCP trace and know exactly what is happening at every step.

The four-layer mental model

MCP is best understood as four stacked layers:

  • Host: the user-facing application (Claude Desktop, Cursor, an internal AI agent)
  • Client: the MCP runtime embedded in the host (the MCP library inside Claude Desktop)
  • Transport: the wire that moves JSON-RPC messages (stdio, Streamable HTTP)
  • Server: the capability provider (a weather MCP server, a Postgres MCP server)

The host is the only layer the human sees. Everything below it is plumbing.

Host vs Client — a common point of confusion

These two terms get used interchangeably, but the spec treats them as distinct:

  • A host is the application (Claude Desktop is one host).
  • A client is one connection the host opens to one server.

A single host typically opens many clients — one per configured MCP server. So Claude Desktop with three MCP servers configured has three active clients running inside it, each talking to a different server.

┌──────────────────────── Host (Claude Desktop) ────────────────────────┐
│                                                                       │
│   ┌─ Client A ─┐    ┌─ Client B ─┐    ┌─ Client C ─┐                  │
│   │ stdio ⇄    │    │ HTTP ⇄     │    │ stdio ⇄    │                  │
│   └────────────┘    └────────────┘    └────────────┘                  │
│         │                  │                  │                       │
└─────────┼──────────────────┼──────────────────┼───────────────────────┘
          │                  │                  │
     ┌────▼─────┐      ┌─────▼─────┐      ┌─────▼─────┐
     │ Postgres │      │  GitHub   │      │  Slack    │
     │  Server  │      │  Server   │      │  Server   │
     └──────────┘      └───────────┘      └───────────┘
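The diagram above corresponds to a host configuration like the following, in Claude Desktop's claude_desktop_config.json format. This is a sketch: the package names and the connection string are illustrative, and entries like these launch stdio servers as subprocesses (a remote HTTP server such as Client B's would be registered differently):

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    },
    "slack": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-slack"]
    }
  }
}
```

Each entry the host reads becomes one client connection at startup.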

The three transports

MCP messages are JSON-RPC 2.0 — a tiny, widely supported protocol. What changes between transports is how those messages are physically moved.

1. stdio

The server runs as a subprocess of the host. Messages flow over the subprocess's stdin/stdout streams, one JSON object per line.

Use it when:

  • The server runs on the same machine as the host
  • You want zero network configuration
  • The user already has Node, Python, or whatever runtime installed
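The framing itself is simple enough to sketch by hand. The following is a minimal illustration of newline-delimited JSON framing, not the SDK's actual implementation (the SDK handles all of this for you):

```typescript
// Sketch of stdio framing: one JSON-RPC message per line.
interface JsonRpcMessage {
  jsonrpc: "2.0";
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
}

// Serialize a message for writing to the server subprocess's stdin.
function frame(msg: JsonRpcMessage): string {
  return JSON.stringify(msg) + "\n";
}

// Split buffered stdout into complete messages plus any trailing partial line,
// which must be kept and prepended to the next chunk read from the pipe.
function deframe(buffer: string): { messages: JsonRpcMessage[]; rest: string } {
  const lines = buffer.split("\n");
  const rest = lines.pop() ?? ""; // last element is incomplete (or empty)
  const messages = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as JsonRpcMessage);
  return { messages, rest };
}
```

The key property is that a read from the pipe can end mid-message, so the receiver always buffers the trailing partial line.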

2. HTTP + SSE (legacy)

The server is a standalone HTTP service. Client → server requests use plain HTTP POST. Server → client notifications use Server-Sent Events (SSE) on a separate persistent connection.

Use it when:

  • The server is remote (different machine, different network)
  • Multiple hosts need to connect to the same server

3. Streamable HTTP (the modern remote transport)

A refinement introduced in a 2025 spec update. Combines request/response and streaming into a single endpoint using chunked HTTP or SSE responses. Simpler to deploy than the dual-channel HTTP+SSE pattern, and the recommended transport for new remote servers.

Anatomy of an MCP session

Every MCP session goes through the same five phases. Walking through them with real messages is the fastest way to internalize how the protocol works.

Phase 1: Initialize

The client says "hello, here is what I support." The server responds with what it supports.

Client → Server:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": { "sampling": {}, "roots": { "listChanged": true } },
    "clientInfo": { "name": "my-host", "version": "1.0.0" }
  }
}

Server → Client:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2025-06-18",
    "capabilities": {
      "tools": { "listChanged": true },
      "resources": {},
      "prompts": {}
    },
    "serverInfo": { "name": "weather-server", "version": "1.0.0" }
  }
}

Note the capability negotiation. Both sides declare what they can do; the rest of the session is constrained by the intersection.
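In practice that means a client should consult the negotiated capabilities before issuing requests. A hedged sketch of the kind of gating involved (the SDK tracks this for you; the capability shapes below mirror the initialize response above):

```typescript
// Sketch: gate client requests on what the server declared during initialize.
interface ServerCapabilities {
  tools?: { listChanged?: boolean };
  resources?: { subscribe?: boolean };
  prompts?: object;
}

// Only send tools/list if the server declared tools support at all.
function canListTools(caps: ServerCapabilities): boolean {
  return caps.tools !== undefined;
}

// resources/subscribe additionally requires the subscribe sub-capability.
function canSubscribeResources(caps: ServerCapabilities): boolean {
  return caps.resources?.subscribe === true;
}
```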

Phase 2: Initialized notification

The client follows up with a fire-and-forget notification that the handshake is done. No id field — this is a one-way notification, not a request.

{ "jsonrpc": "2.0", "method": "notifications/initialized" }

Phase 3: Discovery

Now the client asks the server what it offers:

{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }

Response:

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "tools": [{
      "name": "get_weather",
      "description": "Get current weather for any city",
      "inputSchema": {
        "type": "object",
        "properties": { "city": { "type": "string" } },
        "required": ["city"]
      }
    }]
  }
}

The host hands these tools to the LLM as available actions. Similar resources/list and prompts/list calls fetch the other two primitives.
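Concretely, "hands these tools to the LLM" means translating each MCP tool descriptor into whatever tool-definition format the host's LLM API expects. A sketch of such a mapping — the target shape here mimics common function-calling APIs and is illustrative, not any particular vendor's schema:

```typescript
// Sketch: map an MCP tool descriptor to a generic function-calling definition.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: object;
}

// The target shape is illustrative; real LLM APIs differ in detail.
function toLlmTool(tool: McpTool) {
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description ?? "",
      parameters: tool.inputSchema, // JSON Schema passes through unchanged
    },
  };
}
```

The inputSchema is already JSON Schema, which is why this mapping is nearly a pass-through.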

Phase 4: Invocation

The LLM decides to call get_weather. The host forwards the call:

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "Chennai" }
  }
}

The server executes and returns:

{
  "jsonrpc": "2.0",
  "id": 3,
  "result": {
    "content": [{ "type": "text", "text": "It is 31°C in Chennai, partly cloudy." }],
    "isError": false
  }
}

The host feeds the text back into the LLM's context. The conversation continues.
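On the server side, handling tools/call boils down to a name lookup and dispatch. A simplified sketch — the SDK implements this for you, and the hardcoded weather text here is illustrative, not a real API call:

```typescript
// Sketch: server-side dispatch for tools/call. Tool failures are reported
// in-band via isError rather than as JSON-RPC protocol errors.
type ToolHandler = (args: Record<string, unknown>) => string;

const tools = new Map<string, ToolHandler>();
// Illustrative handler; a real server would call an upstream weather API.
tools.set("get_weather", (args) => `It is 31°C in ${String(args.city)}, partly cloudy.`);

function handleToolsCall(params: { name: string; arguments: Record<string, unknown> }) {
  const handler = tools.get(params.name);
  if (!handler) {
    return { content: [{ type: "text", text: `Unknown tool: ${params.name}` }], isError: true };
  }
  return { content: [{ type: "text", text: handler(params.arguments) }], isError: false };
}
```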

Phase 5: Shutdown

The client closes the transport. For stdio that means closing stdin/stdout. For HTTP that means closing the SSE connection or letting it time out. There is no explicit "goodbye" message in the protocol.

Requests, responses, and notifications

JSON-RPC 2.0 has three message kinds; MCP uses all three:

  • Request: carries an id and expects a reply (tools/list, tools/call)
  • Response: carries the same id as the request it answers, holding the result or error
  • Notification: carries no id and expects no reply (notifications/initialized, notifications/progress)

Notifications are how servers push events to clients without a request-response roundtrip — used for progress updates, resource changes, and logging.
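The distinction is purely structural, so either side can classify an incoming message by which fields are present. A minimal sketch:

```typescript
// Sketch: classify a JSON-RPC 2.0 message by shape.
// Requests have method and id; notifications have method but no id;
// responses have no method, just an id with a result or error.
type Kind = "request" | "notification" | "response";

function classify(msg: { id?: number | string; method?: string }): Kind {
  if (msg.method !== undefined) {
    return msg.id !== undefined ? "request" : "notification";
  }
  return "response"; // no method: carries result or error for an earlier id
}
```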

A minimal echo server in code

Here is the smallest possible MCP server that exposes one tool. It captures everything the architecture demands in under 30 lines:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';

const server = new McpServer({
  name: 'echo-server',
  version: '1.0.0',
});

server.tool(
  'echo',
  'Echoes back whatever message you send',
  { message: z.string().describe('Anything you want echoed back') },
  async ({ message }) => ({
    content: [{ type: 'text', text: `You said: ${message}` }],
  })
);

await server.connect(new StdioServerTransport());

The SDK handles the initialize handshake, capability negotiation, tools/list response, and tools/call dispatch automatically. You provide the business logic; the SDK provides the wire protocol.

Tools, resources, prompts — the three capabilities

A server can expose any combination of three primitives:

  • Tools — actions the LLM can invoke (tools/list, tools/call)
  • Resources — data the LLM can read (resources/list, resources/read, resources/subscribe)
  • Prompts — pre-baked templates the user can invoke (prompts/list, prompts/get)

A server that only offers tools is the most common. A server that also offers resources (e.g., a file system or database server) lets the host attach context to a conversation. A server that offers prompts gives users one-click access to standardized workflows.

Beyond the basics

A few advanced architectural features are worth knowing about:

  • Sampling — a server can ask the client's LLM to generate text on its behalf (reverse direction). Powerful for agentic patterns.
  • Roots — the client tells the server which filesystem paths it should consider "in scope."
  • Cancellation — long-running calls can be cancelled via notifications/cancelled.
  • Progress — long-running calls can emit notifications/progress so the host can show a spinner.

You rarely need these on day one, but they are why MCP feels mature compared to ad-hoc tool integrations.
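For instance, a progress update is just another notification. Per the spec its params carry a progressToken (echoing a token the requester supplied), the current progress value, and an optional total; the values below are illustrative:

```json
{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": { "progressToken": "abc-123", "progress": 40, "total": 100 }
}
```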

Conclusion

MCP is conceptually small: four layers, three transports, three primitives, and a JSON-RPC 2.0 message format. The complexity comes from the interactions between them — capability negotiation, progress streaming, multi-server hosts, sampling — but the foundation is approachable.

If you understand the initialize → list → call → shutdown cycle, you understand MCP. Everything else is detail you can pick up as you need it.

Try it yourself

After connecting your weather MCP server to Claude Desktop, here is what a discovery-and-invoke conversation looks like:

You: What is the weather in Chennai right now?

Claude (used get_weather): It is currently 31°C in Chennai, partly cloudy with light winds from the southeast. Humidity is around 68% and the feels-like temperature is closer to 34°C.

What just happened underneath: the client sent initialize, the server responded with capabilities, the client sent the initialized notification, called tools/list and received the get_weather schema, the LLM emitted tools/call with { "city": "Chennai" }, and the server fetched the upstream weather API and returned a text result. Seven JSON-RPC messages total: four from the client, three responses from the server.
