github.com via Hacker News June 28th 2026

Wayfinder Router Scores LLM Prompts in Microseconds, No Model Call Needed

inference open source agents developer-tools cost-optimization

TL;DR

Wayfinder assigns prompts a 0.0–1.0 complexity score from structural text features, requiring no model call.
Three routing modes — binary threshold, tiered, and fitted classifier — support any OpenAI-compatible endpoint.
The project's own benchmarks find lexical scoring does not generalize on RouterBench; structural scoring is the recommended default.

The standard knock on LLM routing systems is that deciding which model to use requires calling a model. Wayfinder Router takes a different approach: it scores incoming prompts deterministically, in microseconds, using structural features of the text itself — word count, headings, lists, code blocks, and lexical difficulty cues — to produce a 0.0–1.0 complexity number. Below the threshold, the query stays local. Above it, the call goes to a cloud frontier model.

The tool sits as an OpenAI-compatible gateway between a client and its models, forwarding calls to any `/v1` endpoint and tagging responses with routing metadata headers. Swapping it in requires only a `base_url` change. The project lists support for a wide range of providers — OpenAI, Claude (Anthropic), Gemini, Mistral, Ollama, Groq, DeepSeek, vLLM, LM Studio, and llama.cpp — plus integrations with LangChain, LlamaIndex, and a special adapter for Claude Code's Anthropic Messages API format.

Three routing modes are available: a simple binary threshold, tiered bands that route to multiple models at different score ranges, and a fitted classifier trained on labeled user traffic. The calibration system uses L2-regularized Newton/IRLS optimization and can be bootstrapped from A/B onboarding with user judgments — meaning teams can tune the routing to their actual prompt distribution rather than relying on generic benchmarks.

The honest caveat comes from the project's own benchmarks: the optional lexical scoring mode reportedly shows roughly a 20% gain on unseen hard prompts, but shows no generalization on RouterBench. The documentation explicitly acknowledges that lexical features don't reliably generalize and recommends users calibrate on their own traffic. What the source doesn't give you is a real-world accuracy number for the structural-only baseline across different domains or use cases.

For teams already running local inference alongside cloud APIs, a drop-in router that adds no per-request model cost is a meaningful option. The version 2026.6.9 release framing — described as "the gateway as a control plane" — suggests the project is positioning for broader infrastructure use rather than staying a simple CLI utility.

Originally reported by github.com

Read the original article →

Original headline: Show HN: Wayfinder Router — Open-Source Deterministic LLM Routing Proxy Routes Prompts Between Local and Cloud Models Without Calling a Model