scmp.com via Reddit

DeepSeek V4 Pricing Drives 99% API Cuts in China AI

4 sources tracking this story

Key insights

  • DeepSeek V4-Pro made its 75% price cut permanent, with cached input at $0.0036 per million tokens, below the per-character cost of SMS.
  • Chinese frontier models now carry a 15-30x cost advantage over Anthropic, OpenAI, and Meta at current published API rates.
  • Xiaomi cut prices even as Q1 net profit fell 43.1% and revenue dropped 10.9%, showing price competition is now unavoidable regardless of margins.

Why this matters

DeepSeek V4 triggered a pricing cascade that leaves Chinese AI inference priced 15-30x below US equivalents, with DeepSeek V4-Pro cached input settling at $0.0036 per million tokens. Xiaomi cut prices by up to 99% even though its Q1 net profit had already fallen 43.1%, which suggests competitive repricing is now a structural obligation independent of financial health. MiniMax's hybrid billing launch signals that at least one major player views race-to-zero token pricing as untenable and is layering subscription revenue on top of usage fees instead. Together, these moves establish a new inference price floor that API-dependent developers globally will use as a benchmark against US-hosted models.
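
To make the scale of these figures concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the $0.0036 cached-input rate and the 15-30x gap quoted above. The one-billion-token workload is hypothetical, and applying the 15-30x multiple to this single rate is an assumption made for illustration, since the reported gap describes published API rates in general rather than cached input specifically.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def cost_usd(tokens: int, usd_per_million: Decimal) -> Decimal:
    """Cost in USD of `tokens` tokens at a flat per-million-token rate."""
    return Decimal(tokens) / TOKENS_PER_MILLION * usd_per_million


# Figure quoted in the summary: DeepSeek V4-Pro cached input, USD per million tokens.
CACHED_INPUT_RATE = Decimal("0.0036")

# Hypothetical workload (not from the article): one billion cached input tokens.
workload_tokens = 1_000_000_000
baseline = cost_usd(workload_tokens, CACHED_INPUT_RATE)

# Assumption: apply the reported 15-30x gap directly to this one rate.
low_multiple, high_multiple = Decimal(15), Decimal(30)
comparison_range = (baseline * low_multiple, baseline * high_multiple)

print(f"baseline: ${baseline:.2f}")
print(f"at 15-30x: ${comparison_range[0]:.2f} to ${comparison_range[1]:.2f}")
```

Under those assumptions, a billion cached input tokens costs $3.60 at the quoted rate, against $54 to $108 at a 15-30x multiple. Real bills would also depend on uncached input and output rates, which this summary does not give.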

Summary

DeepSeek's V4 has set off a pricing cascade across China's AI market. Xiaomi cut API prices for MiMo-V2.5 by up to 99 percent, and the model shot to sixth place on OpenRouter. The scale of adoption is striking: MiMo-V2.5 processed 1.7 trillion tokens in one week, with growth exceeding 999 percent. MiniMax went in the opposite direction, launching M3 with hybrid billing that pairs token-based fees with subscriptions from US$7.24 to US$69.28 per month. In short, Xiaomi and MiniMax are taking opposite bets: one on volume, the other on pricing diversification.

  • MiMo-V2.5 reached sixth place on OpenRouter after a cut of up to 99%, processing 1.7 trillion tokens in one week.
  • MiniMax M3 tiers subscriptions from US$7.24 to US$69.28 monthly rather than racing to zero on per-token rates.
  • Cloud providers face friction too, as competitive pressure extends beyond developer APIs.

China's AI market is splitting between volume plays and hybrid monetization experiments in response to DeepSeek V4.

Potential risks and opportunities

Risks

  • MiniMax's M3 hybrid model (US$7.24 to US$69.28/month) faces an adoption cliff if developers stay on pure usage-based alternatives as DeepSeek V4 pricing continues downward.
  • Chinese cloud providers reportedly face friction, though the coverage does not quantify it, as DeepSeek V4 pricing compresses inference margins across the domestic market.
  • Xiaomi's 99% price cut strategy may prove unsustainable if its 1.7 trillion tokens per week do not convert to durable paying customers beyond the OpenRouter leaderboard.

Opportunities

  • OpenRouter and similar API aggregator platforms benefit directly from Chinese model competition, as MiMo-V2.5 reaching sixth place shows aggressive pricing drives developer experimentation and platform traffic.
  • Developers and AI startups building on Chinese APIs gain access to dramatically lower inference costs, enabling product experiments previously uneconomical at higher token prices.
  • MiniMax's hybrid billing structure could serve as a monetization template for mid-tier AI providers globally who need predictable revenue without competing at the lowest per-token price point.

What we don't know yet

  • DeepSeek V4's specific per-token pricing is not disclosed in the article, making the exact cost floor driving the competitive cascade unquantifiable.
  • Whether Xiaomi's 99% price cut for MiMo-V2.5 is sustainable at 1.7-trillion-token weekly volumes, or is subsidized to drive OpenRouter rankings.
  • No specific cloud providers are named as affected, leaving unclear which players face the most acute margin compression from DeepSeek V4 pricing.

What others are reporting

Coverage cluster as of 2h after publish

  1. Decrypt

    Quantifies the US-China pricing gap at 15-30x and attributes cuts to technical advances in hierarchical KV cache and selective attention, not just competitive pressure.

    "Operating at these newly reduced API prices, our production inference engine is running at near full capacity" — Fuli Luo, Xiaomi MiMo team head
  2. Caixin Global

    Ties cuts to Xiaomi's Q1 earnings collapse (-43.1% profit) and discloses that 30% of API users pay, with overseas subscribers comprising the majority of that paying cohort.

    The aggressive discounts threaten to reignite a price war in China's hyper-competitive AI sector.
  3. Frames the cuts as a strategic gamble: Xiaomi raised R&D spending while revenue fell 10.9%, showing aggressive investment even as commercial returns shrink.

    The new price has a maximum reduction of up to 99%, and there is no longer a distinction based on the context window length.