Reflections on building with Model Context Protocol (MCP)

It's great! With some rough edges

Dec 14, 2024

Introduction

The Model Context Protocol (MCP) is a new open standard from Anthropic for connecting LLMs with external tools and data. MCP lets AI apps like Claude Desktop connect to local and remote servers that expose tools and resources such as databases, search engines, and other APIs. An illustrative screenshot from the MCP website is below:

[Screenshot: MCP example]

In the screenshot, Claude Desktop (the MCP client) uses a tool call to the weather MCP server, which in turn fetches data from a weather API.
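To make that flow a little more concrete, here is a minimal sketch of the message a client might send for such a tool call. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method; the tool name `get_forecast` and its `city` argument are made up for illustration and are not taken from the screenshot.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking a server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical weather tool and arguments, for illustration only.
request = make_tool_call(1, "get_forecast", {"city": "Berlin"})

# This is the text that would travel over the transport.
wire_message = json.dumps(request)
print(wire_message)
```

The server would reply with a result carrying the same `id`, after doing whatever it needs to do behind the scenes (here, calling the weather API).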

Why not just OpenAPI?

If you’re not familiar with OpenAPI, it’s a standard for describing HTTP APIs. My first question was: we already have a specification for describing tools and data sources, so why do we need another one like MCP? Indeed, ChatGPT can already connect to OpenAPI servers.

I asked as much in this Hacker News comment. But I sort of see the point now. OpenAPI is great for defining REST APIs, but it’s not great for standardizing the kind of interactions that LLMs need. The MCP specification has some handy provisions for that, for example letting servers notify clients that a resource (like a database) was updated.
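As a rough sketch of what that provision looks like on the wire: a resource update arrives as a JSON-RPC notification (a message with a method but no `id`, so no reply is expected) using the `notifications/resources/updated` method. The URI and the "mark it stale" handling below are my own illustrative assumptions about what a client might do with it.

```python
def is_notification(message):
    """A JSON-RPC notification has a method but no id."""
    return "method" in message and "id" not in message


def handle_message(message, stale_uris):
    """Record which resources the client should re-read; return True if handled."""
    if is_notification(message) and message["method"] == "notifications/resources/updated":
        stale_uris.add(message["params"]["uri"])
        return True
    return False


# Hypothetical resource URI, for illustration only.
update = {
    "jsonrpc": "2.0",
    "method": "notifications/resources/updated",
    "params": {"uri": "db://example/users"},
}

stale = set()
handled = handle_message(update, stale)
```

This kind of server-initiated message is the sort of thing a plain request/response API description has no natural place for.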

How much of the spec does Claude Desktop implement?

Even though Claude Desktop is meant to be the poster child for MCP, it doesn’t implement the full spec. For example, as best I can tell, it does not support the SSE transport (stdio only), so servers must run locally. Prompts and resources must be selected manually. Dynamic resources (where the server can generate a bunch of resources on the fly) don’t seem to have any UI affordance. Server-generated notifications don’t seem to appear. And perhaps most peskily, Claude Desktop has strict timeouts on tool calls, and since there is no way to schedule work in the background, the client can’t do long-running tasks.

All of this may change with time, but for now it means a better way to test your MCP server is the MCP Inspector, an interactive tool for testing and debugging servers. It has some limitations of its own, though. For example, if your tool has an input schema with arrays or other nested structure, it won’t generate the correct form UI. A screenshot of the inspector from the MCP website is below:

[Screenshot: MCP Inspector]
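For a sense of the kind of schema that trips up the generated form, here is a sketch of a tool definition with an array-valued input. Tools describe their inputs with JSON Schema under `inputSchema`; the tool name, its fields, and the hand-rolled check are hypothetical and only meant to show the shape.

```python
# Hypothetical tool definition; only the overall shape is the point.
search_tool = {
    "name": "search_documents",
    "description": "Run several search queries at once.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "queries": {"type": "array", "items": {"type": "string"}, "minItems": 1},
            "limit": {"type": "integer"},
        },
        "required": ["queries"],
    },
}


def check_arguments(arguments):
    """Hand-rolled check for this one schema (not a general JSON Schema validator)."""
    queries = arguments.get("queries")
    if not isinstance(queries, list) or not queries:
        return False
    if not all(isinstance(query, str) for query in queries):
        return False
    if "limit" in arguments:
        limit = arguments["limit"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(limit, int) and not isinstance(limit, bool)
    return True
```

A simple form can render `limit` as a number box easily enough; `queries` needs an add/remove-row control, which is where things get harder.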

How is the DX?

MCP provides two SDKs, one in Python and one in TypeScript. The Python one is decent. I appreciate their use of uv in the starter template. But at the time of writing, the SDK with stdio transport has a critical bug. Remember that client-side timeout I mentioned? A client disconnect also causes the server process to crash unrecoverably (issue 88).

Not great when you have to rewrite your server in TypeScript the night before a demo. The TypeScript SDK is lovely, though. It’s got a nice API and the type definitions are very helpful. I might be biased because I am more familiar with the ecosystem.
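Going back to the disconnect crash for a moment, this is roughly the behavior I would hope for from a stdio server: treat a closed pipe as a normal end of session rather than a fatal error. To be clear, this is not the SDK's code and not a fix for the issue; it is a standalone sketch that assumes newline-delimited JSON messages, with a hypothetical `handle` callback standing in for the real request handling.

```python
import io
import json


def serve(stream_in, stream_out, handle):
    """Answer newline-delimited JSON messages until the input closes.

    Returns the number of messages answered instead of raising when the
    other side goes away.
    """
    answered = 0
    for line in stream_in:  # the loop simply ends at EOF (client closed its end)
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed input rather than crash
        try:
            stream_out.write(json.dumps(handle(message)) + "\n")
            stream_out.flush()
        except (BrokenPipeError, ValueError):
            break  # output side is gone or closed; stop quietly
        answered += 1
    return answered


# In-memory streams stand in for stdin/stdout in this demo.
demo_in = io.StringIO('{"id": 1}\nnot json\n{"id": 2}\n')
demo_out = io.StringIO()
count = serve(demo_in, demo_out, lambda message: {"echo": message["id"]})
```

In a real server the streams would be `sys.stdin` and `sys.stdout`, and the process could then exit cleanly or wait for a new client.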

How is the documentation?

The documentation is decent. I’m not a fan of separating the specification from the main website. The sidebar on the main site is also missing several pages, which can only be found through search. The examples repository was definitely a big help in filling the gaps, especially for some of the less well-explained concepts such as SSE transport.

Is MCP a win for users?

Certainly it opens up a lot of possibilities. Imagine: your AI app, like Claude Desktop or Zed IDE, can now connect to anything. But I am surprised the spec does not have an opinionated way to actually configure the servers in the client application. With Claude Desktop you have to hand-edit a JSON config file, or use some third-party tool/registry to do that for you. I would have expected a more integrated experience. Similarly, with Zed you have to make a wrapper that just executes your MCP server. It just seems like a lot of lift to close that last mile to the end user.
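For reference, this is roughly what that JSON config amounts to for Claude Desktop, sketched as a small helper that adds one server entry. The `mcpServers` key with a `command` and `args` per server is the shape as I understand it at the time of writing; the server name, command, and path are placeholders, and the helper itself is hypothetical.

```python
import json


def add_server(config, name, command, args):
    """Add (or overwrite) one server entry in a Claude Desktop style config dict."""
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    return config


# Placeholder name, command, and path, for illustration only.
config = add_server({}, "weather", "node", ["/path/to/weather/build/index.js"])

# This text is what would end up in the config file.
config_text = json.dumps(config, indent=2)
print(config_text)
```

It is not much JSON, but it is still a file an end user has to find and edit by hand, which is the last-mile problem.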

Where I’d like to see MCP go

I’d like to have the MCP folks maintain an official registry. I’d also want to see registries added to the spec, so that client applications can pull from registries instead of having us configure them.

I’d like to have a way for clients to schedule async work with the MCP server and be notified when it completes. The completion notification could then make the AI agent start a new loop, or adjust the current one, to handle the results.
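To illustrate the shape I have in mind (none of this exists in the spec as far as I know), here is a toy in-memory sketch: starting a job returns an ID immediately, and finishing it pushes a notification to whoever is listening. `JobBoard`, its methods, and the notification payload are all hypothetical.

```python
import itertools


class JobBoard:
    """Toy registry of background jobs that notifies a callback on completion."""

    def __init__(self, notify):
        self._notify = notify
        self._ids = itertools.count(1)
        self._jobs = {}

    def start(self, description):
        """Register a job and return its ID right away, without blocking."""
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {"description": description, "status": "running", "result": None}
        return job_id

    def status(self, job_id):
        return self._jobs[job_id]["status"]

    def complete(self, job_id, result):
        """Mark the job done and tell the client about it."""
        job = self._jobs[job_id]
        job["status"] = "done"
        job["result"] = result
        self._notify({"jobId": job_id, "status": "done", "result": result})


received = []  # stands in for the client's notification handler
board = JobBoard(received.append)
job_id = board.start("reindex documents")
status_before = board.status(job_id)
board.complete(job_id, {"rows": 1200})
```

The tool call itself would return the job ID well within the client's timeout, and the agent would pick the work back up when the completion notification arrives.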

I’d like clients to automatically decide which prompts and resources are relevant.

I’d also love a way to sessionize client-server interactions, potentially with multiple clients being part of the same session to accomplish multiplayer work. I’m not sure how that would work, but it would be lovely to see ChatGPT, Claude, or niche AI clients working together to solve a task using all the tools across my computer.