Open-Source Tool Shrinks Claude MCP Tokens 14-Fold
Key insights
- An open-source tool strips human-readable documentation from MCP tool schemas, cutting Claude Code's tool-definition token load by about 93% with no reported loss of tool-routing accuracy.
- The reduction from 38K to 2.6K tokens spans 142 tools across 8 MCP servers and comes from removing descriptions that the model reportedly does not need for dispatch.
- Claude Code usage limits scale with context consumption, making token-heavy MCP schemas a direct cost multiplier as tool counts grow.
Why this matters
Claude Code's pricing model ties usage limits to total context consumed, so as developers connect more MCP servers, tool-definition overhead becomes a structural cost that grows with each server added. This technique exposes a meaningful gap between what MCP schemas contain for human readers and what models actually need for routing, a distinction the MCP specification does not currently address. Teams running production Claude Code workflows across large MCP toolsets could cut tool-definition overhead by an order of magnitude without modifying their prompts or tooling.
Summary
A developer released an open-source tool that strips documentation from MCP tool schemas before they reach Claude Code's context window, cutting token load 14x from 38K to 2.6K across 8 servers and 142 tools.
MCP definitions carry two layers: structural JSON needed for routing, and natural-language descriptions written for humans. The tool removes the second while leaving dispatch logic intact.
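The split between the two layers can be illustrated with a minimal Python sketch. This is not the tool's actual implementation, which the report does not describe in detail. The function names `strip_descriptions` and `compress_tool` and the sample `search_issues` tool are hypothetical, and the sketch assumes each tool definition is a dict with the `name`, `description`, and `inputSchema` fields used by MCP tool listings.

```python
import json


def strip_descriptions(node):
    """Return a copy of a JSON-Schema-like node without 'description' text."""
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "description" and isinstance(value, str):
            continue  # human-readable layer: dropped
        if key == "properties" and isinstance(value, dict):
            # Keys here are parameter names (one may itself be called
            # "description"), so keep every key and recurse into the values.
            out[key] = {name: strip_descriptions(sub) for name, sub in value.items()}
        else:
            out[key] = strip_descriptions(value)
    return out


def compress_tool(tool):
    """Keep only the structural layer (assumption: name plus input schema)."""
    return {
        "name": tool["name"],
        "inputSchema": strip_descriptions(tool.get("inputSchema", {})),
    }


# Hypothetical tool definition, for illustration only.
tool = {
    "name": "search_issues",
    "description": "Search issues in a repository. Use this when the user asks about bugs.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search query."},
            "description": {"type": "string", "description": "Filter on the issue body."},
        },
        "required": ["query"],
    },
}

compact = compress_tool(tool)
# Character counts are only a rough proxy for tokens.
print(len(json.dumps(tool)), "->", len(json.dumps(compact)))
```

Parameter names, types, and `required` lists survive, while every prose description is dropped. Whether a given client accepts tool definitions with no top-level description is an assumption that would need checking before relying on a wrapper like this.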
In short, for Claude Code and the wider MCP ecosystem, schema compression appears to preserve routing while dropping documentation overhead.
- 38K to 2.6K tokens, a 14x reduction with no reported routing degradation across 142 tools
- Claude Code usage limits are tied to context consumed, so schema bloat directly accelerates rate-limit hits as server counts grow
Schema compression may become infrastructure-level practice as MCP server counts keep climbing.
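As a quick check on the headline numbers, the short calculation below works from the rounded figures quoted in this summary (38K, 2.6K, 142 tools). The per-tool averages are derived here for illustration and were not reported in the source.

```python
# Rounded figures as quoted in this summary.
before_tokens = 38_000
after_tokens = 2_600
tool_count = 142

reduction_factor = before_tokens / after_tokens
percent_saved = (1 - after_tokens / before_tokens) * 100
per_tool_before = before_tokens / tool_count
per_tool_after = after_tokens / tool_count

print(f"reduction factor: {reduction_factor:.1f}x")   # 14.6x
print(f"tokens saved: {percent_saved:.1f}%")          # 93.2%
print(f"avg per tool before: {per_tool_before:.0f}")  # 268
print(f"avg per tool after: {per_tool_after:.0f}")    # 18
```

On these rounded inputs the ratio comes out near 14.6, consistent with the reported 14x and 93% figures. The average definition shrinks from roughly 268 tokens to about 18 per tool.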
Potential risks and opportunities
Risks
- MCP server authors whose tool descriptions contain implicit disambiguation cues may see silent routing failures after compression, with no current tooling to detect which descriptions are load-bearing before stripping
- If Anthropic changes how Claude Code processes tool definitions or tightens schema validation, third-party compression wrappers could break production workflows without surfacing clear errors
- Developers deploying compressed schemas in shared team environments risk inconsistency, particularly if certain tools require full descriptions for correct behavior in edge cases not covered by initial testing
Opportunities
- MCP server registry maintainers could push for dual-track schemas in the spec, with a verbose developer-facing version and a compact model-facing version, turning this optimization into a first-class feature
- Token optimization tooling vendors could package schema compression alongside prompt caching and context management, targeting enterprise Claude Code users running large MCP configurations
- Developers building MCP servers for commercial distribution gain a new quality signal in schema compression ratio, rewarding clean API design over documentation padding and creating a measurable performance differentiator
What we don't know yet
- Whether the tool degrades routing accuracy for edge-case tool selection when descriptions contain implicit disambiguation cues the model relies on
- Whether Anthropic plans to implement schema compression natively in Claude Code, removing the need for third-party wrappers
- Whether the 14x figure holds across MCP servers with shorter baseline descriptions, or reflects unusually verbose documentation in the tested set
Originally reported by reddit.com
Original headline: r/ClaudeAI: Open-Source Tool Compresses Claude Code Tool-Definition Tokens From 38K to 2.6K — 14× Reduction Across 8 MCP Servers