If you already have a REST API — your own product API, a third-party service you use, or an internal microservice — you are 80% of the way to having an MCP server. The pattern for wrapping a REST API as an MCP server is generic, well-trodden, and worth knowing well, because it is the single most common way new MCP servers come into existence.
This article shows the universal pattern using a real, free, no-auth API (JSONPlaceholder) so you can run every example as-is. Then we walk through the design choices that make the difference between a technically correct wrapper and one an LLM can actually use well.
The mental model
A REST API exposes endpoints. An MCP server exposes tools. The wrapping job is to map one to the other thoughtfully.
Naive mapping (works, but suboptimal):
| REST endpoint | MCP tool |
|---|---|
| GET /posts | list_posts |
| GET /posts/{id} | get_post |
| POST /posts | create_post |
| PUT /posts/{id} | update_post |
| DELETE /posts/{id} | delete_post |
This works, but it gives the LLM the exact surface of the REST API — which is shaped for code, not for natural language. A better wrapper consolidates, hides, and adds LLM-friendly shortcuts.
We will build both and compare.
Setup
mkdir jsonplaceholder-mcp
cd jsonplaceholder-mcp
npm init -y
npm install @modelcontextprotocol/sdk zod
Add "type": "module" to package.json.
The thin wrapper (one tool per endpoint)
Create server.js:
#!/usr/bin/env node
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
const API = 'https://jsonplaceholder.typicode.com';
async function api(path, init = {}) {
const res = await fetch(`${API}${path}`, {
...init,
headers: { 'Content-Type': 'application/json', ...(init.headers || {}) },
});
if (!res.ok) throw new Error(`API ${res.status}: ${await res.text()}`);
return res.status === 204 ? null : res.json();
}
const server = new McpServer({ name: 'jsonplaceholder', version: '1.0.0' });
server.tool('list_posts', 'List all posts (returns 100 entries).', {}, async () => {
const posts = await api('/posts');
return { content: [{ type: 'text', text: JSON.stringify(posts, null, 2) }] };
});
server.tool('get_post', 'Fetch a single post by ID.',
{ id: z.number().int().positive() },
async ({ id }) => {
const post = await api(`/posts/${id}`);
return { content: [{ type: 'text', text: JSON.stringify(post, null, 2) }] };
}
);
server.tool('create_post', 'Create a new post.',
{ title: z.string(), body: z.string(), userId: z.number().int() },
async (args) => {
const post = await api('/posts', { method: 'POST', body: JSON.stringify(args) });
return { content: [{ type: 'text', text: `Created post #${post.id}` }] };
}
);
await server.connect(new StdioServerTransport());
That is a perfectly valid MCP server. Wire it into Claude Desktop, ask "show me post 5," and it works.
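To wire it in during development, point an entry in claude_desktop_config.json at the script (the path is whatever you used locally):
{
  "mcpServers": {
    "jsonplaceholder": {
      "command": "node",
      "args": ["/absolute/path/to/jsonplaceholder-mcp/server.js"]
    }
  }
}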
But now look at the output of list_posts: 100 JSON objects, four fields apiece. The LLM has to wade through thousands of tokens of noise just to answer "what was the title of the most recent post?"
The thoughtful wrapper (LLM-friendly tools)
A better wrapper does four things:
- Trims the response to what the LLM actually needs.
- Adds parameters that real users want (limit, search, by-user).
- Returns text the LLM can read directly, not raw JSON the LLM has to re-parse.
- Composes related endpoints when it makes sense (e.g., one tool that returns a post and its comments).
Here is the same API as a thoughtful wrapper:
server.tool('search_posts',
'Search posts by keyword (matches title and body). Returns up to 10 results with summary info.',
{
query: z.string().describe('Keyword to match'),
limit: z.number().int().min(1).max(50).default(10),
},
async ({ query, limit }) => {
const all = await api('/posts');
const q = query.toLowerCase();
const hits = all
.filter(p => p.title.toLowerCase().includes(q) || p.body.toLowerCase().includes(q))
.slice(0, limit);
if (!hits.length) {
return { content: [{ type: 'text', text: `No posts found matching "${query}".` }] };
}
const lines = hits.map(p => `#${p.id} "${p.title}" (by user ${p.userId})`);
return { content: [{ type: 'text', text: lines.join('\n') }] };
}
);
server.tool('get_post_with_comments',
'Fetch a post and all its comments in one call.',
{ id: z.number().int().positive() },
async ({ id }) => {
const [post, comments] = await Promise.all([
api(`/posts/${id}`),
api(`/posts/${id}/comments`),
]);
const cText = comments.map(c => ` • ${c.email}: ${c.body}`).join('\n');
return {
content: [{
type: 'text',
text: `Post #${post.id}: ${post.title}\n\n${post.body}\n\n--- Comments (${comments.length}) ---\n${cText}`,
}],
};
}
);
server.tool('posts_by_user',
"List all posts by a specific user.",
{ userId: z.number().int().positive() },
async ({ userId }) => {
const posts = await api(`/users/${userId}/posts`);
const lines = posts.map(p => `#${p.id} ${p.title}`);
return { content: [{ type: 'text', text: lines.join('\n') || 'No posts.' }] };
}
);
Notice the differences:
- search_posts is something the LLM actually wants. The raw API has no search; we synthesized it client-side. The LLM never has to scan 100 records itself.
- get_post_with_comments collapses two endpoints into one tool. Saves a round-trip and reads naturally as a single "give me the post" intent.
- Output is preformatted text, not raw JSON. The LLM does not have to parse it before answering the user.
- Limits and defaults make response sizes predictable.
Authentication patterns
JSONPlaceholder needs no auth, but most real APIs do. Three common patterns:
Static API key (simplest)
const API_KEY = process.env.API_KEY;
if (!API_KEY) throw new Error('Set API_KEY environment variable.');
async function api(path, init = {}) {
  const res = await fetch(`${API}${path}`, {
    ...init,
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${API_KEY}`, ...(init.headers || {}) },
  });
  if (!res.ok) throw new Error(`API ${res.status}: ${await res.text()}`);
  return res.status === 204 ? null : res.json();
}
The user puts their key in their claude_desktop_config.json server entry as an env property:
{
"mcpServers": {
"my-api": {
"command": "npx",
"args": ["-y", "my-api-mcp"],
"env": { "API_KEY": "sk_live_..." }
}
}
}
Per-user OAuth (more complex)
For APIs that need per-user authorization (think: "my GitHub issues" not "the GitHub public API"), MCP supports OAuth flows via the Streamable HTTP transport. Full coverage is its own article, but the gist is: the server registers OAuth metadata, the host opens a browser, the user authorizes, and the host stores the resulting token for subsequent calls.
Personal access tokens (PATs)
For developer-tools APIs (GitHub, GitLab, Linear), the simplest path is to ask the user for a PAT once, treat it like an API key, and document the scopes required. Less elegant than OAuth, but works everywhere.
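A small nicety that pays off: validate the token at startup so a bad or expired PAT fails fast with a clear message instead of surfacing later as a confusing tool error. A minimal sketch, assuming a hypothetical /me endpoint and an API_TOKEN variable name (substitute whatever "who am I" endpoint and env name fit your API):
const PAT = process.env.API_TOKEN;
if (!PAT) throw new Error('Set API_TOKEN to a personal access token with the documented scopes.');

// Fail fast on an invalid or under-scoped token. '/me' is a stand-in for
// the API's "current user" endpoint.
const check = await fetch(`${API}/me`, {
  headers: { Authorization: `Bearer ${PAT}` },
});
if (!check.ok) {
  throw new Error(`Token check failed (${check.status}). Is the PAT valid and scoped correctly?`);
}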
Pagination
APIs that paginate large result sets need careful handling. Two patterns:
Pattern A: Expose the cursor to the LLM.
server.tool('list_users',
'List users with pagination. Pass cursor from previous response to continue.',
{ cursor: z.string().optional(), limit: z.number().int().min(1).max(100).default(20) },
async ({ cursor, limit }) => {
const url = `/users?limit=${limit}${cursor ? `&cursor=${encodeURIComponent(cursor)}` : ''}`;
const data = await api(url);
return {
content: [{
type: 'text',
text: `${data.items.map(u => u.name).join('\n')}\n\nNext cursor: ${data.nextCursor || '(none)'}`,
}],
};
}
);
Pattern B: Auto-paginate up to a sensible cap.
server.tool('list_all_users',
'List up to the first 500 users.',
{},
async () => {
const all = [];
let cursor;
while (all.length < 500) {
const data = await api(`/users?limit=100${cursor ? `&cursor=${encodeURIComponent(cursor)}` : ''}`);
all.push(...data.items);
if (!data.nextCursor) break;
cursor = data.nextCursor;
}
return { content: [{ type: 'text', text: all.map(u => u.name).join('\n') }] };
}
);
Pattern B is more LLM-friendly (no cursor juggling), but you must enforce a cap to prevent runaway calls.
Error normalization
A real REST API can fail in dozens of ways — 4xx, 5xx, network timeouts, rate limits. Translate them into responses the LLM can act on:
async function safeApi(path, init = {}) {
try {
const res = await fetch(`${API}${path}`, init);
if (res.status === 429) return { error: 'Rate limit hit. Please retry in a minute.' };
if (res.status === 404) return { error: 'Not found.' };
if (!res.ok) return { error: `API error ${res.status}: ${await res.text()}` };
return { data: await res.json() };
} catch (err) {
return { error: `Network error: ${err.message}` };
}
}
server.tool('get_post', 'Fetch a post by ID.',
{ id: z.number().int().positive() },
async ({ id }) => {
const r = await safeApi(`/posts/${id}`);
if (r.error) return { content: [{ type: 'text', text: r.error }], isError: true };
return { content: [{ type: 'text', text: JSON.stringify(r.data, null, 2) }] };
}
);
The LLM reads the friendly error and can decide whether to retry, ask the user, or apologize gracefully.
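A further refinement worth considering: retry plausibly transient failures (rate limits, 5xx, network blips) inside the wrapper with a short backoff, so the LLM only sees errors it can meaningfully act on. A minimal sketch layered on the safeApi helper above:
// Retries transient failures a few times with a growing delay, then gives up
// and returns the last error untouched so the tool can surface it.
async function apiWithRetry(path, init = {}, attempts = 3) {
  let result;
  for (let i = 0; i < attempts; i++) {
    result = await safeApi(path, init);
    const transient = result.error &&
      /Rate limit|API error 5\d\d|Network error/.test(result.error);
    if (!transient) return result;
    await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1))); // 1s, 2s, 3s...
  }
  return result;
}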
Design checklist
Before you publish a REST-API-backed MCP server, ask yourself:
| Check | Why |
|---|---|
| Did I rename CRUD endpoints into intent-based tool names? | LLMs reason in intent ("find", "show", "send"), not in HTTP verbs |
| Did I add search/filter tools the raw API lacks? | LLMs cannot fetch 1000 records and scan them efficiently |
| Are my responses preformatted text, not raw JSON? | Token efficiency and ease of reading |
| Did I cap response sizes and add limits? | Prevents context blowouts |
| Do error responses use isError: true with a human message? | Lets the LLM react gracefully |
| Did I write good tool descriptions? | Bad descriptions are the #1 reason LLMs ignore a tool |
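On that last point, the difference is easy to see side by side. Here is a weak description next to a stronger one for the search tool above (the exact wording is illustrative, not a template):
// Weak: gives the model nothing to reason with about when or why to call it.
const weakDescription = 'Search posts.';

// Stronger: says what it matches, what comes back, and where its limits are,
// so the model can tell when this tool is (and is not) the right choice.
const strongDescription =
  'Search posts by keyword (matches title and body). Returns up to 10 results ' +
  'with post ID, title, and author. For the full text of one post, use get_post_with_comments.';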
Conclusion
Wrapping a REST API as an MCP server is one of the highest-leverage patterns in the entire MCP ecosystem. Every internal service in your company, every SaaS your team uses, every API you have ever written documentation for — all of them can be exposed to any AI assistant with a few hundred lines of glue.
The trick is to resist the urge to expose the API one-to-one. Build the wrapper for the LLM that will use it, not for the developer who wrote the REST docs. Consolidate related calls, hide low-value endpoints, preformat responses, and add the conveniences the LLM actually wants.
Do that, and you will have an MCP server that feels native — not a transliteration of a REST API into JSON-RPC.
Try it yourself
With the thoughtful version of the JSONPlaceholder wrapper connected, the LLM can answer real questions instead of dumping raw JSON:
search_posts:

Found 4 matching posts:
• #13 “voluptatum eveniet et nesciunt”
• #42 “commodi ullam sint et excepturi”
• #67 “minima ut consequuntur”
• #88 “totam consequatur expedita”
Want me to fetch the full body of any of these?
The naive list_posts wrapper would have returned all 100 posts and the LLM would have had to scan them itself — chewing through tokens. The thoughtful search_posts tool did the filtering server-side and returned a clean summary.