GLM-5.2 Tops GPT-5.5 on Coding Benchmarks at One-Sixth the Cost
TL;DR
- Z.ai released GLM-5.2 under the MIT license on June 16, 2026, scoring 62.1 on SWE-bench Pro versus GPT-5.5's 58.6.
- The model extends context from roughly 200,000 to 1,000,000 tokens, targeting long-running coding agent sessions.
- GLM Coding Plan subscriptions run from roughly $3 to $6 per month at the Lite tier, with Claude Code, Cursor, Cline, and Kilo Code integrations included.
Z.ai made its GLM-5.2 available to GLM Coding Plan subscribers on June 13, 2026, three days before open-sourcing the full model weights under the MIT license on June 16. The sequence put hosted plan users ahead of the broader research community, with the open release permitting commercial use.
The benchmark numbers are the story. According to VentureBeat, GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks for roughly one-sixth the cost. On SWE-bench Pro the model scored 62.1 against GPT-5.5's 58.6, and on FrontierSWE Dominance it reached 74.4% against GPT-5.5's 72.6%. At 753 billion parameters it is a heavyweight, but the architecture reportedly cuts per-token compute by 2.9x at 1M-token context through a mechanism called IndexShare, which reuses one lightweight indexer across every four sparse-attention layers.
The practical headline for agent workloads is the context jump. GLM-5.2 moves from roughly 200,000 tokens to 1,000,000 tokens when called with the glm-5.2[1m] identifier, which matters for coding agents running long, multi-file sessions where earlier context gets dropped. The model also introduces High and Max thinking effort levels, with Z.ai recommending Max for complex task stability.
Pricing for the hosted plan tiers from roughly $3 to $6 per month at the Lite level up to around $30 to $160 per month for the Max tier. Standard API access is priced at $1.40 per million input tokens and $4.40 per million output tokens, with a cached input rate of $0.26 per million tokens. The plan covers integrations with Claude Code, Cursor, Cline, Kilo Code, and Trae.
The honest caveat is that benchmark wins on SWE-bench and FrontierSWE do not automatically transfer to production codebases. The 753B parameter size also means self-hosting the open weights is not realistic for most teams without serious GPU infrastructure, which limits the MIT license advantage to well-resourced organizations. What the reporting does not give you is a clear picture of GLM-5-Turbo's role alongside the flagship or how the plan's rate limits behave under sustained agent session load. Teams running high-volume agentic coding pipelines are the obvious group to watch as early adopters report real-world results.
Shared on Bluesky by 2 AI experts
-
GLM-5.2 is now open-weight. Tech Blog: z.ai/blog/glm-5.2 Weights: huggingface.co/zai-org/GLM-... API: docs.z.ai/guides/llm/g... Coding Plan: z.ai/subscribe Chat: chat.z.ai
View on Bluesky →
Originally reported by z.ai
Read the original article →Original headline: GLM Coding Plan — AI Coding Powered by GLM-5.2 & GLM-5-Turbo for Agents & IDEs