reddit.com via Reddit

r/LocalLLaMA: Gemma 4 31B FP8 Matches Claude Sonnet 4.6 Medium in Production Agent Harness — Neo4j Cypher, Entity Extraction, and Agentic Tool Calling

google anthropic open source inference local-models open-source-benchmarks

Summary

A developer running a production agent harness shares a screenshot showing Gemma 4 31B at FP8 precision reaching parity with Claude Sonnet 4.6 medium on three real workloads: Cypher graph traversal for Neo4j; entity extraction that combines web queries, graph queries, and vector search; and agentic tool calling. The comparison marks a meaningful parity threshold: a locally runnable open-weight model matching a commercial frontier model on the coding and orchestration tasks most relevant to production builders. Commenters note that FP8 quantization preserves capability better than lower-bit formats and credit the Gemma team's recent QAT variant releases with enabling the result.