DeepSeek V4 Pricing Drives 99% API Cuts in China AI
Key insights
- Xiaomi's Sliding Window Attention (SWA) optimization reduces KV Cache data transfer to one-seventh of prior levels, making the 99% cut structurally viable rather than a loss leader.
- Decrypt quantifies the US-China frontier inference gap at 34x for comparable performance, up from the 15-30x range cited in earlier estimates.
- Xiaomi MiMo head Fuli Luo confirmed the inference engine runs near full capacity and breaks even at new prices, directly rebutting a subsidy narrative.
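The ratios in these insights can be restated as plain arithmetic. The sketch below is illustrative only: the baseline price is a hypothetical placeholder in arbitrary units, not a figure from the reporting. Only the 99% cut, the one-seventh transfer ratio, and the 34x multiple come from the text above.

```python
from fractions import Fraction

def price_after_cut(old_price, cut_percent):
    """Price that remains after cutting old_price by cut_percent percent."""
    return old_price * (1 - Fraction(cut_percent, 100))

def reduction_percent(remaining_fraction):
    """Percentage reduction implied by keeping only remaining_fraction of the original."""
    return float((1 - remaining_fraction) * 100)

# Hypothetical baseline in arbitrary units (an assumption, not a reported price).
HYPOTHETICAL_OLD_PRICE = Fraction(100)

# A 99% cut leaves 1% of the old price, so the old price is 100x the new one.
new_price = price_after_cut(HYPOTHETICAL_OLD_PRICE, 99)
price_multiple = HYPOTHETICAL_OLD_PRICE / new_price

# Moving 1/7 of the prior KV Cache data is a reduction of 6/7.
kv_transfer_reduction = reduction_percent(Fraction(1, 7))

# A 34x gap means the cheaper side costs 1/34 of the more expensive side.
gap_reduction = reduction_percent(Fraction(1, 34))

print(f"new price: {float(new_price):.2f} units ({float(price_multiple):.0f}x cheaper)")
print(f"KV Cache transfer reduction: {kv_transfer_reduction:.1f}%")
print(f"34x gap as a percentage difference: {gap_reduction:.1f}%")
```

Note that the price cut (99%) is much larger than the reduction in KV Cache transfer (roughly 86%), so the SWA optimization alone would not account for the full cut under this simple reading.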
Why this matters
Summary
Potential risks and opportunities
Risks
- MiniMax's M3 hybrid model (US$7.24 to US$69.28/month) faces an adoption cliff if developers stay on pure usage-based alternatives as DeepSeek V4 pricing continues downward.
- Chinese cloud providers face margin pressure that the article notes but does not quantify, as DeepSeek V4 pricing compresses inference margins across the domestic market.
- Xiaomi's 99% price cut strategy may prove unsustainable if its 1.7 trillion tokens per week do not convert to durable paying customers beyond the OpenRouter leaderboard.
Opportunities
- OpenRouter and similar API aggregator platforms benefit directly from Chinese model competition, as MiMo-V2.5 reaching sixth place shows aggressive pricing drives developer experimentation and platform traffic.
- Developers and AI startups building on Chinese APIs gain access to dramatically lower inference costs, enabling product experiments previously uneconomical at higher token prices.
- MiniMax's hybrid billing structure could serve as a monetization template for mid-tier AI providers globally who need predictable revenue without competing at the lowest per-token price point.
What we don't know yet
- DeepSeek V4's specific per-token pricing is not disclosed in the article, making the exact cost floor driving the competitive cascade unquantifiable.
- It remains unclear whether Xiaomi's 99% price cut for MiMo-V2.5 is sustainable at 1.7-trillion-token weekly volumes or is being subsidized to drive OpenRouter rankings.
- No specific cloud providers are named as affected, leaving unclear which players face the most acute margin compression from DeepSeek V4 pricing.
What others are reporting
- Decrypt: Technical breakdown of why the cuts are economically sustainable (KV cache hierarchy and DeepSeek's dual attention), plus an on-record quote from Fuli Luo and the 34x gap quantification.
  "Our production inference engine is running at near full capacity, and we can still essentially break even." (Fuli Luo, Xiaomi MiMo head)
- Caixin Global: Financial framing. Documents Xiaomi's 43.1% Q1 profit decline alongside the cut, and reports a 30% paying-subscriber share (majority overseas) and a 111% surge in OpenRouter daily tokens.
  "The aggressive discounts threaten to reignite a price war in China's hyper-competitive AI sector."
- 36Kr: Provides the SWA one-seventh KV Cache reduction detail and a full domestic competitor pricing table, and notes that Xiaomi leadership previously criticized price wars while accepting free token giveaways.
  "The package quota has skyrocketed by 5 to 8 times, and the lowest tier also has 500 million Tokens."
Originally reported by scmp.com
Original headline: DeepSeek V4 Forces China's AI Rivals to Slash Prices — Xiaomi Cuts MiMo-V2.5 API Costs 99%, MiniMax Launches Hybrid Billing Model