huggingface.co web signal July 5th 2026

Meituan open-sources 1.6T-parameter LongCat-2.0 MoE model

TL;DR

Meituan released LongCat-2.0, a Mixture of Experts model with 1.6 trillion total parameters and roughly 48 billion activated per token.
The model ships under an MIT license on Hugging Face, with 16x H20 GPUs recommended for deployment and an FP8 quantized variant available.
Meituan reports LongCat-2.0 beats a Claude Opus reference on IFEval (90.0 vs 86.0) and IMO-AnswerBench (81.8 vs 75.3) while trailing on SWE-bench Pro (59.5 vs 69.2).

Meituan, the Chinese food-delivery and local-services super-app, has quietly become a serious foundation-model shop, and its latest release on Hugging Face is the clearest signal yet. LongCat-2.0 is a Mixture of Experts model with 1.6 trillion total parameters and roughly 48 billion activated per token, released under an MIT license, which is about as permissive as trillion-parameter model weights get.

The architecture is the part worth reading twice. Alongside the MoE routing, the card describes a 135-billion-parameter N-gram Embedding component that Meituan says expands capacity in a dimension orthogonal to expert count, and a custom attention scheme called LongCat Sparse Attention, with streaming-aware, cross-layer, and hierarchical indexing tricks aimed at 1M-context workloads. The team also says training ran end-to-end on AI ASIC superpods rather than Nvidia hardware, over more than 35 trillion tokens and what the card describes as millions of accelerator-days, with no rollbacks.
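For readers newer to MoE, the reason only a fraction of the parameters fire per token is the router. The following is a generic top-k routing sketch, not Meituan's implementation: the expert count, the value of k, the scalar "experts", and every function name here are hypothetical, and the card's actual routing scheme may differ.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return [(i, probs[i] / mass) for i in top]

def moe_layer(x, experts, router_logits, k):
    """Combine the outputs of only the selected experts for one token."""
    chosen = route_top_k(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in chosen), chosen

# Toy setup: 8 scalar "experts", 2 active per token (both numbers hypothetical).
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, scale=s: scale * x for s in range(1, NUM_EXPERTS + 1)]
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

output, chosen = moe_layer(1.0, experts, logits, TOP_K)
print(f"experts used: {[i for i, _ in chosen]} of {NUM_EXPERTS}")
```

In the toy setup, 2 of 8 experts run per token. The headline figures quoted for LongCat-2.0 (roughly 48 billion of 1.6 trillion parameters) work out to about 3% of the model active per token.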

On benchmarks, Meituan puts LongCat-2.0 head-to-head with a Claude Opus reference model in the card's own table. The self-reported numbers show LongCat ahead on IFEval (90.0 to 86.0), IMO-AnswerBench (81.8 to 75.3) and a real-world search benchmark called RWSearch, and behind on the harder agentic coding evaluations like SWE-bench Pro (59.5 to 69.2) and Terminal-Bench 2.1 (70.8 to 78.9). Take the specifics as reported, not settled, until independent runs land.
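The gaps are easier to compare side by side. A small sketch that uses only the self-reported scores quoted above; the dictionary layout and function name are illustrative, and RWSearch is left out because no numbers are given for it here.

```python
# Self-reported scores quoted in the article: (LongCat-2.0, Claude Opus reference).
# RWSearch is omitted because the article gives no numbers for it.
SCORES = {
    "IFEval": (90.0, 86.0),
    "IMO-AnswerBench": (81.8, 75.3),
    "SWE-bench Pro": (59.5, 69.2),
    "Terminal-Bench 2.1": (70.8, 78.9),
}

def deltas(scores):
    """Return LongCat's lead (positive) or deficit (negative) per benchmark."""
    return {name: round(longcat - opus, 1) for name, (longcat, opus) in scores.items()}

for name, gap in deltas(SCORES).items():
    side = "ahead" if gap > 0 else "behind"
    print(f"{name:<20} {gap:+5.1f}  ({side})")
```

On these numbers the deficits on the two agentic coding benchmarks are larger in absolute terms than the leads on IFEval and IMO-AnswerBench, though point gaps on different benchmarks are not directly comparable.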

The honest caveat is that the model card is a capability pitch, not an evaluation report. It does not disclose the training compute budget, the safety and refusal tuning, or how the model behaves on Chinese-language and Meituan-native workloads that presumably motivated a lot of the data mix. And the recommended 16x H20 GPU deployment means most people reading this cannot actually self-host the thing, MIT license or not.
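A back-of-the-envelope estimate shows why self-hosting is out of reach for most. The sketch below counts raw weight storage only, under assumptions that are not from the card: every parameter is stored at a single width, BF16 is used purely as a comparison point, and weights are sharded evenly across the 16 recommended GPUs. KV cache, activations, and runtime overhead are ignored, so the result is a lower bound rather than a description of the recommended configuration.

```python
TOTAL_PARAMS = 1.6e12   # total parameters, as quoted in the article
NUM_GPUS = 16           # recommended deployment size, as quoted in the article

# Bytes per parameter for two common storage formats (uniform width is an assumption).
BYTES_PER_PARAM = {"BF16": 2, "FP8": 1}

def weight_gb(params: float, bytes_per_param: int) -> float:
    """Raw weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for fmt, width in BYTES_PER_PARAM.items():
    total = weight_gb(TOTAL_PARAMS, width)
    per_gpu = total / NUM_GPUS  # assumes perfectly even sharding
    print(f"{fmt}: {total:,.0f} GB total, {per_gpu:,.0f} GB per GPU across {NUM_GPUS} GPUs")
```

Even at one byte per parameter, the weights alone come to about 1.6 TB, which is why the FP8 variant matters for anyone attempting to serve the model.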

Still, the direction is what to watch. A Chinese super-app releasing a trillion-scale MoE under MIT, trained on domestic ASICs, keeps compressing the gap between closed-API frontier labs and permissively licensed weights that anyone with a serious inference stack can serve.

Shared on Bluesky by 2 AI experts

AK: https://t.co/d94szmiwOl
Sung Kim @sungkim.bsky.social: They released the weight - MIT licensed. huggingface.co/meituan-long...

Originally reported by huggingface.co


Original headline: meituan-longcat/LongCat-2.0 · Hugging Face