Building an MCP Server in 2025: What Actually Worked

A year ago, MCP was an experiment. Today it's infrastructure—10,000+ public servers, 97 million monthly SDK downloads, adoption by ChatGPT, Gemini, and Copilot, a home in the Linux Foundation. That's not early adoption anymore. That's standardization.

I've been building a Letta MCP server through this entire arc, and the biggest lesson came early: tool sprawl will kill you before complexity does.

Every capability wants to be its own tool. Create agent, update agent, delete agent, list agents, clone agent, export agent—that's six tools before you've done anything interesting. Multiply across memory, sources, files, jobs, and MCP management itself, and you're staring at 60+ tools competing for context window space. Models hate this. Too many options means cognitive overhead on every call, and similar names blur together in ways that cause real failures. Is it update_agent or modify_agent? I think most MCP server authors hit this wall around month two.

The fix wasn't fewer capabilities; it was fewer tools with more operations inside them. letta_agent_advanced now handles 22 operations: CRUD, messaging, import/export, streaming, async jobs. One tool, one schema, clear internal routing via an operation parameter. letta_memory_unified does the same for memory, covering 15 operations across core blocks and archival storage. Seven tools now cover 87 operations total. Measured against one tool per operation (87 tools), that's a 92% reduction in tool count with zero capability loss.
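To make the routing idea concrete, here is a minimal sketch of a consolidated tool that dispatches on an operation parameter. Only the tool name and the operation parameter come from the server described above; the handler names, the four operations shown, and the in-memory store are assumptions for illustration, not the actual Letta implementation.

```python
from itertools import count
from typing import Any, Callable, Dict

# Hypothetical in-memory store standing in for the real Letta backend.
_agents: Dict[str, Dict[str, Any]] = {}
_next_id = count(1)


def _create(params: Dict[str, Any]) -> Dict[str, Any]:
    agent_id = f"agent-{next(_next_id)}"
    _agents[agent_id] = {"id": agent_id, "name": params["name"]}
    return _agents[agent_id]


def _get(params: Dict[str, Any]) -> Dict[str, Any]:
    return _agents[params["agent_id"]]


def _list(params: Dict[str, Any]) -> list:
    return list(_agents.values())


def _delete(params: Dict[str, Any]) -> Dict[str, Any]:
    removed = _agents.pop(params["agent_id"])
    return {"deleted": removed["id"]}


# One routing table per consolidated tool: operation name -> handler.
# The operation names here are illustrative, not the server's real list.
OPERATIONS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "create": _create,
    "get": _get,
    "list": _list,
    "delete": _delete,
}


def letta_agent_advanced(operation: str, **params: Any) -> Dict[str, Any]:
    """Single tool entry point; the operation parameter selects the handler."""
    handler = OPERATIONS.get(operation)
    if handler is None:
        # Echo the valid operations so the caller (or model) can self-correct.
        return {
            "error": f"unknown operation {operation!r}",
            "valid_operations": sorted(OPERATIONS),
        }
    # Missing parameters surface as KeyError in this sketch.
    return {"operation": operation, "result": handler(params)}
```

Calling letta_agent_advanced("create", name="demo") and then letta_agent_advanced("list") goes through the same entry point, so the model sees one tool name and one schema instead of one per operation. An unknown operation such as "modify" returns an error that lists the valid operations rather than failing silently.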

The ecosystem noticed this pattern too. The third MCP spec revision, released in June, introduced structured outputs and interactive prompts, both aimed at making tool interfaces richer without multiplying tool count. Fewer tools with better schemas beat many tools with simple ones. That consensus emerged independently across the community.

Here's what the core of the server looks like today. letta_agent_advanced owns the agent lifecycle from creation to archival. letta_memory_unified handles everything an agent remembers. letta_tool_manager governs capabilities and attachments. letta_source_manager deals with external knowledge. letta_mcp_ops is meta: it manages MCP servers themselves. Beyond Letta, 26 external servers connect to this ecosystem, including BookStack, Matrix, Huly, Ghost, Graphiti, PhotoPrism, and Komodo. Many follow the same consolidation pattern.

Consolidation solved the tool count problem but revealed others. Tools discovered dynamically aren't immediately callable: there's a loop boundary where registration happens but the agent can't use the tool until the next execution cycle. Some tools get discovered but remain uncallable entirely, and the gap between "registered" and "working" isn't visible until invocation fails. Consolidated tools also have large schemas, and models occasionally pick the wrong operation or supply parameters that belong to a different one. The fourth MCP spec revision, released in November, introduced async tasks and agentic sampling partly to address timing issues. The ecosystem is learning these lessons collectively.
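The loop boundary is easier to see in a toy model. The sketch below is an assumption about how such a boundary behaves, not the code of any real agent runtime: ToolRegistry, begin_cycle, and status are hypothetical names. Tools registered during a cycle sit in a pending set and only become callable when the next cycle begins.

```python
from typing import Any, Callable, Dict


class ToolRegistry:
    """Toy agent loop: tools registered mid-cycle become callable next cycle."""

    def __init__(self) -> None:
        self._callable: Dict[str, Callable[..., Any]] = {}
        self._pending: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        # Registration succeeds immediately, but the tool is not yet usable.
        self._pending[name] = fn

    def begin_cycle(self) -> None:
        # The loop boundary: pending tools are promoted only here.
        self._callable.update(self._pending)
        self._pending.clear()

    def status(self, name: str) -> str:
        if name in self._callable:
            return "callable"
        if name in self._pending:
            return "registered"
        return "unknown"

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._callable:
            # Without a status check, this failure is the first visible
            # sign of the gap between "registered" and "working".
            raise LookupError(
                f"tool {name!r} is {self.status(name)}, not callable in this cycle"
            )
        return self._callable[name](**kwargs)
```

In this model, a tool registered after begin_cycle() reports "registered" and raises LookupError on call until begin_cycle() runs again. Exposing something like status() is one way to make the gap visible before an invocation fails.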

I think consolidation was phase one. Phase two is smarter routing—intent-based dispatch where the model expresses "update this agent's memory" and the MCP layer interprets that into the right tool and operation automatically. Context-aware surfacing that promotes relevant tools based on conversation state. Composition layers that let you express "create a configured agent" as one logical operation instead of four calls across three tools. And eventually, federation—26 servers connected locally, 10,000+ in the broader ecosystem, and the question of whether they should know about each other. Can BookStack trigger Huly? Can Matrix invoke Ghost? Meta-tools orchestrating multi-system workflows feel inevitable.
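A composition layer of the kind imagined above might look like the following sketch. Everything except the three tool names is hypothetical: the call_tool signature, the operation names (create, create_block, attach_block, attach), and the argument shapes are assumptions chosen to show four calls across three tools collapsing into one function.

```python
from typing import Any, Callable, Dict, List, Tuple

# Assumed client signature: (tool name, operation, arguments) -> result dict.
ToolCall = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def create_configured_agent(
    call_tool: ToolCall, name: str, persona: str, tool_names: List[str]
) -> Dict[str, Any]:
    """One logical operation that wraps four calls across three tools."""
    agent = call_tool("letta_agent_advanced", "create", {"name": name})
    block = call_tool(
        "letta_memory_unified",
        "create_block",
        {"label": "persona", "value": persona},
    )
    call_tool(
        "letta_memory_unified",
        "attach_block",
        {"agent_id": agent["id"], "block_id": block["id"]},
    )
    call_tool(
        "letta_tool_manager",
        "attach",
        {"agent_id": agent["id"], "tools": tool_names},
    )
    return {"agent_id": agent["id"], "block_id": block["id"], "tools": tool_names}


def make_fake_client() -> Tuple[ToolCall, List[Tuple[str, str]]]:
    """Stand-in client for trying the sketch: records calls, returns stub ids."""
    log: List[Tuple[str, str]] = []

    def call_tool(tool: str, operation: str, args: Dict[str, Any]) -> Dict[str, Any]:
        log.append((tool, operation))
        return {"id": f"{operation}-{len(log)}"}

    return call_tool, log
```

Passing the client in as a parameter keeps the composition testable without a live server: with the fake client, the log shows the four underlying calls in order, while the caller only ever expressed one intent. A real layer would also need to decide what to do when a middle step fails, which this sketch leaves out.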

The Official MCP Registry that launched this fall hints at that direction. Discovery is the first step. Orchestration comes next.