r/LocalLLaMA: Qwen3.6-35B on MacBook M4 Pro Scores 37.8% on Terminal-Bench 2.0, Rivalling Claude Code + Sonnet 4.5 on Local Hardware
Summary
An r/LocalLLaMA developer ran Qwen3.6-35B-A3B (Q6_K_XL quantization) locally via llama.cpp on a MacBook with an M4 Pro and 48GB of memory, scoring a 3-run average of 37.8% on Terminal-Bench 2.0. That is comparable to the published benchmark scores of Claude Code + Sonnet 4.5, a frontier commercial agentic stack. The result is one of the first published data points showing a local open-weight model, running on consumer-grade hardware with no cloud costs, competing with a paid frontier service on a real agentic coding benchmark.
Originally reported by reddit.com