Meituan open-sources 1.6T-parameter LongCat-2.0 MoE model
TL;DR
- Meituan released LongCat-2.0, a Mixture of Experts model with 1.6 trillion total parameters and roughly 48 billion activated per token.
- The model ships under an MIT license on Hugging Face, with 16x H20 GPUs recommended for deployment and an FP8 quantized variant available.
- Meituan reports LongCat-2.0 beats a Claude Opus reference on IFEval (90.0 vs 86.0) and IMO-AnswerBench (81.8 vs 75.3) while trailing on SWE-bench Pro (59.5 vs 69.2).
Meituan, the Chinese food-delivery and local-services super-app, has quietly become a serious foundation-model shop, and its latest release on Hugging Face is the clearest signal yet. LongCat-2.0 is a Mixture of Experts model with 1.6 trillion total parameters and roughly 48 billion activated per token, released under an MIT license, which is about as permissive as trillion-parameter model weights get.
The architecture is the part worth reading twice. Alongside the MoE routing, the card describes a 135 billion parameter N-gram Embedding component that Meituan says expands capacity in a dimension orthogonal to expert count. It also describes a custom attention scheme, LongCat Sparse Attention, which uses streaming-aware, cross-layer, and hierarchical indexing aimed at 1M-context workloads. The team says the training run was done end-to-end on AI ASIC superpods rather than Nvidia hardware, covering 35+ trillion tokens and what the card describes as millions of accelerator-days with no rollbacks.
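To make the "total versus activated parameters" distinction concrete, here is a minimal sketch of generic top-k expert routing. It is not LongCat-2.0's router: the card's actual gating scheme, expert count, and k are not given in this article, so the function names, the 8-expert setup, and k=2 below are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    Generic textbook top-k gating, assumed here for illustration only.
    """
    probs = softmax(router_logits)
    # Sort expert indices by probability, highest first, and keep k of them.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in chosen)
    return [(i, probs[i] / kept) for i in chosen]

# Hypothetical toy setup: 8 experts, 2 active per token.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
selected = route_top_k(logits, k=2)

# Only the selected experts run for this token, so most expert weights sit idle.
active_fraction = len(selected) / len(logits)
print(selected, active_fraction)
```

In this toy case a quarter of the experts run per token. The same mechanism, at a much lower ratio, is how a model can hold 1.6 trillion parameters while activating roughly 48 billion for any one token.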
On benchmarks, Meituan puts LongCat-2.0 head-to-head with a Claude Opus reference model in the card's own table. The self-reported numbers show LongCat ahead on IFEval (90.0 to 86.0), IMO-AnswerBench (81.8 to 75.3) and a real-world search benchmark called RWSearch, and behind on the harder agentic coding evaluations like SWE-bench Pro (59.5 to 69.2) and Terminal-Bench 2.1 (70.8 to 78.9). Take the specifics as reported, not settled, until independent runs land.
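The margins are easier to compare side by side. The sketch below simply subtracts the reference score from the LongCat score for each benchmark, using only the self-reported figures quoted above. RWSearch is left out because no numbers for it are given here, and the dictionary and function names are illustrative.

```python
# Scores as self-reported in the model card table:
# (LongCat-2.0, Claude Opus reference).
reported = {
    "IFEval": (90.0, 86.0),
    "IMO-AnswerBench": (81.8, 75.3),
    "SWE-bench Pro": (59.5, 69.2),
    "Terminal-Bench 2.1": (70.8, 78.9),
}

def margins(scores):
    """Return benchmark -> LongCat minus reference, rounded to one decimal."""
    return {name: round(ours - ref, 1) for name, (ours, ref) in scores.items()}

for name, delta in margins(reported).items():
    leader = "LongCat-2.0" if delta > 0 else "Claude Opus reference"
    print(f"{name:<20} {delta:+.1f}  ({leader} ahead)")
```

On these figures, the deficits on the two agentic coding benchmarks are larger in absolute terms than the leads on IFEval and IMO-AnswerBench.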
The honest caveat is that the model card is a capability pitch, not an evaluation report. Beyond the accelerator-days figure, it does not disclose a training compute budget, nor does it describe the safety and refusal tuning or how the model behaves on the Chinese-language and Meituan-native workloads that presumably motivated much of the data mix. And the recommended 16x H20 GPU deployment means most people reading this cannot actually self-host the thing, MIT license or not.
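A back-of-envelope calculation shows why the hardware bar is so high. The sketch below counts only raw weight storage and assumes the weights are split evenly across the 16 recommended GPUs. It ignores KV cache, activations, and runtime overhead, and it assumes every parameter is stored at the stated precision, which the article does not confirm. The BF16 row is included purely for comparison.

```python
def weight_footprint_gb(total_params, bytes_per_param):
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9

TOTAL_PARAMS = 1.6e12   # total parameters, as reported
NUM_GPUS = 16           # recommended deployment size, as reported

# FP8 uses 1 byte per parameter; BF16 uses 2 bytes per parameter.
for label, nbytes in (("FP8", 1), ("BF16", 2)):
    total_gb = weight_footprint_gb(TOTAL_PARAMS, nbytes)
    per_gpu_gb = total_gb / NUM_GPUS
    print(f"{label}: {total_gb:,.0f} GB total, about {per_gpu_gb:,.0f} GB per GPU")
```

Even at one byte per parameter, the weights alone come to about 1.6 TB, or roughly 100 GB per device across 16 GPUs, before any memory is set aside for serving long contexts.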
Still, the direction is what to watch. A Chinese super-app releasing a trillion-scale MoE under MIT, trained on domestic ASICs, keeps narrowing the gap between closed, API-only frontier labs and permissively licensed weights that anyone with a serious inference stack can serve.
Originally reported by huggingface.co
Original headline: meituan-longcat/LongCat-2.0 · Hugging Face