quandri.io via Reddit

Quandri drops MCP for CLI-first agent architecture

agents enterprise ai mcp tooling

Key insights

  • With four MCP servers connected, tool definitions alone consume 10.5% of the context window, leaving less room for actual task content.
  • Quandri proposes a tiered architecture reserving MCP only for services that lack CLI support or require team-wide authentication controls.
  • Multiple teams on Hacker News reported measurable performance improvements after switching from MCP to CLI-based tooling pipelines.
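
To make the 10.5% figure concrete, the arithmetic can be sketched as below. Only the 10.5% share comes from the Quandri post; the 200,000-token window is an assumption chosen for illustration, and the function names are hypothetical.

```python
def tool_definition_tokens(context_window: int, overhead_share: float) -> int:
    """Tokens consumed by tool definitions before any task content is added."""
    return round(context_window * overhead_share)


def remaining_tokens(context_window: int, overhead_share: float) -> int:
    """Tokens left over for prompts, documents, and model output."""
    return context_window - tool_definition_tokens(context_window, overhead_share)


# Assumption: a hypothetical 200,000-token context window (not from the post).
WINDOW = 200_000
# From the source: four connected MCP servers consume 10.5% of the window.
SHARE = 0.105

print(tool_definition_tokens(WINDOW, SHARE))  # 21000
print(remaining_tokens(WINDOW, SHARE))        # 179000
```

Under that assumed window size, roughly 21,000 tokens are spent on every request before the agent reads any task content, which is why the cost scales with request volume.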

Why this matters

Context window consumption from tool definitions is a hard cost that compounds as agent workflows scale, making MCP's overhead directly measurable in both performance and inference spend. The Quandri post crystallizes a growing practitioner sentiment that standardization protocols extracted from Big Tech environments do not translate cleanly to resource-constrained production deployments. Teams building production agents now have a named framework for evaluating when MCP's auth and standardization benefits actually justify its context and latency costs.

Summary

Quandri's engineering team measured MCP in production and found a concrete cost: with four MCP servers connected, tool definitions alone consume 10.5% of the context window before any actual work begins. The post, which drew 378 points on Hacker News, also notes that running MCP servers as separate processes adds measurable latency, and that existing CLI tooling covers most of the same ground with better debuggability and composability.

Quandri argues for replacing blanket MCP adoption with a three-tier model:

  • CLI-first for common, well-supported tools
  • On-demand skill loading for multi-step workflows
  • MCP reserved only for services that lack CLI support or need team-wide auth controls

Multiple teams in the HN comment thread reported measurable performance gains after partially exiting MCP.
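
The three-tier model can be read as a simple routing rule. The sketch below is one possible interpretation, not Quandri's implementation: the `Tool` fields, the `choose_tier` function, the order of the checks, and the example tool names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CLI = "cli-first"
    SKILL = "on-demand skill"
    MCP = "mcp"


@dataclass(frozen=True)
class Tool:
    # Hypothetical description of a service an agent needs to reach.
    name: str
    has_cli: bool
    needs_team_auth: bool = False
    multi_step: bool = False


def choose_tier(tool: Tool) -> Tier:
    """Pick a tier for a tool (assumed precedence: MCP check first)."""
    # MCP is reserved for services with no CLI or with team-wide auth needs.
    if not tool.has_cli or tool.needs_team_auth:
        return Tier.MCP
    # Multi-step workflows are loaded as skills only when needed.
    if tool.multi_step:
        return Tier.SKILL
    # Everything else goes through the existing CLI.
    return Tier.CLI


# Hypothetical examples; the names are placeholders, not from the post.
print(choose_tier(Tool("version-control", has_cli=True)).value)
print(choose_tier(Tool("release-workflow", has_cli=True, multi_step=True)).value)
print(choose_tier(Tool("internal-crm", has_cli=False)).value)
```

Checking the MCP conditions first is a design choice in this sketch: it means a tool that needs team-wide auth controls stays on MCP even when a CLI exists, which matches the exception the post carves out.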

Potential risks and opportunities

Risks

  • Teams that standardized on MCP across large agent fleets face re-architecture costs and deployment risk if they shift to CLI-first patterns mid-production
  • MCP server vendors and the broader Model Context Protocol ecosystem could see adoption slowdown if the Quandri framing spreads, reducing tooling investment and community maintenance
  • Enterprises that bought MCP-native orchestration platforms risk being locked into a high-overhead architecture as context window efficiency becomes a primary cost-optimization target in 2026

Opportunities

  • CLI wrapper and shell-composability tooling vendors (Warp, AWS Fig, and custom DevOps toolchain teams) gain relevance as the recommended first-tier layer in agent architectures
  • Agent observability platforms that track latency and context usage per tool call (LangSmith, Braintrust, Honeycomb) can market directly to teams currently auditing their MCP overhead
  • Teams building MCP servers could reposition as hybrid CLI-plus-MCP adapters, targeting the segment that wants standardization without paying the full context and latency cost

What we don't know yet

  • Whether Anthropic or other MCP advocates will publish updated benchmarks addressing the context-consumption problem at the four-to-ten server scale
  • How the tiered CLI-first architecture handles versioning and reproducibility across heterogeneous team environments, which is a core use case MCP was designed to solve
  • Whether teams reporting performance gains after exiting MCP controlled for model version changes or other concurrent infrastructure improvements during the same period