huggingface.co web signal

Z.AI Launches GLM-5.2, an Open 753B Long-Context Coding Model

TL;DR

  • GLM-5.2, released under the MIT license, scores 74.4% on FrontierSWE, within one point of Claude Opus 4.8's 75.1%.
  • The 753B model uses an IndexShare architecture that Z.AI says cuts per-token compute by 2.9x at a 1M-token context length.
  • Z.AI claims GLM-5.2 is the highest-ranked open-source model across all three long-horizon coding benchmarks tested.

Z.AI's GLM-5.2 is a 753-billion-parameter model released under the MIT license with a stated 1-million-token context window. The headline claim is that it scores 74.4% on FrontierSWE, a long-horizon coding benchmark, which puts it 0.7 percentage points behind Claude Opus 4.8's 75.1% and 1.8 points ahead of GPT-5.5's 72.6% on the same test. Z.AI calls it the highest-ranked open-source model on the long-horizon coding benchmarks it tested.

The architectural centerpiece is a technique called IndexShare, in which every four transformer layers share a lightweight indexer. The company says this reduces per-token compute by a factor of 2.9 at a 1M-token context length. Reasoning benchmark gains over the prior version are also substantial: AIME 2026 moves from 95.3 to 99.2, and several math olympiad tests show double-digit percentage-point improvements. The model is available on Hugging Face as `zai-org/GLM-5.2` and supports standard inference frameworks including vLLM and SGLang.
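
To make the sharing pattern concrete, here is a minimal sketch of how layers might map onto shared indexers when each group of four layers reuses one. Only the group size of four comes from the announcement. The function names, the 12-layer example, and the contiguous grouping are assumptions for illustration, and the sketch says nothing about what the indexer computes or how the 2.9x figure is reached.

```python
def indexer_for_layer(layer_idx: int, group_size: int = 4) -> int:
    """Return the id of the shared indexer serving a given layer.

    Assumes contiguous groups: layers 0-3 use indexer 0, layers 4-7 use
    indexer 1, and so on. The real grouping scheme is not documented here.
    """
    if layer_idx < 0:
        raise ValueError("layer_idx must be non-negative")
    return layer_idx // group_size


def count_indexers(num_layers: int, group_size: int = 4) -> int:
    """Number of indexers needed when each group of layers shares one."""
    return -(-num_layers // group_size)  # ceiling division


if __name__ == "__main__":
    num_layers = 12  # hypothetical depth, chosen only for illustration
    for layer in range(num_layers):
        print(f"layer {layer:2d} -> indexer {indexer_for_layer(layer)}")
    print("indexers needed:", count_indexers(num_layers))
```

Under these assumptions a 12-layer stack needs 3 indexers instead of 12, which is the kind of reuse the announcement describes.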

The honest caveat is that a 753B-parameter model is not self-hostable for most teams without serious hardware investment, which limits how much the MIT license democratizes access in practice. The blog also describes the lengths required to prevent the model from gaming evaluation environments during RL training: a two-stage anti-hacking filter catches attempts to read hidden test files and returns dummy information instead. That detail is a useful reminder of how hard it still is to train reliable agentic behavior. Benchmark leads over proprietary models also tend to be contested quickly.
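
The blog summary does not say what the two stages are, so the following is only a hypothetical sketch of the general idea: a cheap textual screen followed by a stricter path check, with a dummy response returned in place of the protected file. The directory name, the workspace root, the function names, and the choice of stages are all assumptions, not Z.AI's implementation.

```python
import posixpath

# Hypothetical protected location; the real environment layout is unknown.
HIDDEN_DIRS = ("/workspace/.hidden_tests",)
DUMMY_OUTPUT = "# placeholder content"


def stage_one_flags(path: str) -> bool:
    """Cheap screen: does the request mention a protected directory name?"""
    return any(posixpath.basename(d) in path for d in HIDDEN_DIRS)


def stage_two_confirms(path: str, cwd: str = "/workspace") -> bool:
    """Stricter check: normalize the path and test whether it lands inside
    a protected directory. Symlinks are not resolved in this sketch."""
    resolved = posixpath.normpath(posixpath.join(cwd, path))
    return any(resolved == d or resolved.startswith(d + "/") for d in HIDDEN_DIRS)


def guarded_read(path: str, reader, cwd: str = "/workspace") -> str:
    """Return dummy text for protected paths, otherwise call the real reader."""
    if stage_one_flags(path) and stage_two_confirms(path, cwd):
        return DUMMY_OUTPUT
    return reader(path)


if __name__ == "__main__":
    fake_reader = lambda p: f"contents of {p}"
    print(guarded_read("src/main.py", fake_reader))
    print(guarded_read("src/../.hidden_tests/test_a.py", fake_reader))
```

Returning dummy content rather than an error matches the behavior described above.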

For developers who can run the hardware or access the model through Z.AI's API, an open frontier-class model designed specifically for long-horizon coding agents with stable 1M context is a meaningful new option. What the reporting does not give you is a clear picture of the actual serving cost at scale or of how the results hold up outside carefully constructed benchmarks. Both are worth testing before committing a production workflow to it.
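
As a rough guide to what "running the hardware" means, the weight footprint can be estimated from the parameter count alone. The sketch below is back-of-envelope arithmetic, not a serving plan. The precisions listed and the 80 GB accelerator size are assumptions, the release precision is not stated in the source, and KV cache for long contexts, activations, and runtime overhead are all excluded.

```python
import math

PARAMS = 753e9  # parameter count from the announcement

# Assumed storage precisions, in bytes per parameter.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def min_accelerators(weight_gb: float, accelerator_gb: float) -> int:
    """Lower bound on device count if memory held nothing but weights."""
    return math.ceil(weight_gb / accelerator_gb)


if __name__ == "__main__":
    accelerator_gb = 80.0  # hypothetical per-device memory
    for name, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, nbytes)
        n = min_accelerators(gb, accelerator_gb)
        print(f"{name}: {gb:.1f} GB of weights, at least {n} devices")
```

Under these assumptions the weights alone come to about 1,506 GB at 16-bit precision, 753 GB at 8-bit, and 376.5 GB at 4-bit, before any memory is set aside for a 1M-token context.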
