Three things happened in parallel this week, and together they describe the same shape. Hyperscalers wrote double-digit-billion checks back into the labs they already host, structured as compute-credit loops rather than straight equity. Open-weight releases landed at quality levels that, six months ago, would have been frontier-only. And a public-software bloodbath showed the market has stopped pricing AI as a tailwind for incumbents and started pricing it as a substitute. The moat is no longer the model — it is the contracted gigawatts behind it and the cost of a token in front of it.
The Big Story
Google plans to invest up to $40B in Anthropic at $350B valuation with 5GW compute deal · Apr 24 · [TechCrunch]
→ $10B in cash plus $30B milestone-contingent, and 5GW of Google Cloud capacity over five years. It mirrors Amazon's $25B top-up and matching 5GW of Trainium four days earlier, paired with Anthropic's $100B+ AWS spending pledge over the coming decade. Each hyperscaler is funding the same lab to lock inference revenue onto its own cloud; the lab uses the paper to buy two non-overlapping training fleets. At $350-380B, Anthropic is priced as a long-dated electricity hedge with a brand attached, not a model company.
Also This Week
Software stocks plunge as ServiceNow and IBM earnings fuel AI disruption fears · Apr 23 · [CNBC]
→ ServiceNow down 18%, IBM down 10%, S&P Software & Services off 4% — while semis hit a record. The market finally separated the picks-and-shovels trade from the seat-license trade, pricing in that agentic workflows eat the recurring-revenue layer first.
Hackers breach Anthropic's restricted Claude Mythos via vendor credentials · Apr 22 · [Euronews]
→ A Discord-linked group reached Mythos — held back under Project Glasswing because it can autonomously discover zero-days — by guessing endpoint URL patterns and pivoting through a third-party portal. The "do not release" tier now has the same supply-chain attack surface as a SaaS app.
Pichai: 75% of new code at Google is AI-generated and engineer-approved · Apr 22 · [Techmeme]
→ Up from 50% last fall and ~30% a year ago. If even half-true, the per-engineer throughput assumption under every SaaS comp is wrong by a multiple — exactly the trade ServiceNow just printed.
OpenAI releases GPT-5.5 'Spud' with 2x pricing and agentic capabilities · Apr 23 · [TechCrunch]
→ 82.7% on Terminal-Bench 2.0, 73.1% on Expert-SWE, 1M context — and API pricing doubled to $5/$30 per million input/output tokens, the first time in two years a frontier release has lifted price-per-token rather than cut it. The cost curve isn't broken; it's bifurcating by tier, and Polymarket called the date six weeks early off API monitoring alone.
DeepSeek releases V4 open-source flagship: 1.6T-param MoE with 1M context under Apache 2.0 · Apr 24 · [Hugging Face]
→ V4-Pro (1.6T) and V4-Flash (284B), both 1M context, both Apache 2.0, with quality reportedly close to Claude Opus 4.6 non-thinking and best-in-class agentic coding among open weights. With Tencent and Alibaba now in talks to fund DeepSeek above $20B, the open-weight strategy has its own checkbook for the first time.
Alibaba releases Qwen3.6-27B: dense 27B beats 397B MoE on coding benchmarks · Apr 22 · [qwen.ai]
→ Runs at ~18GB quantized and beats the previous 397B MoE on SWE-bench Verified (77.2 vs 76.2) and Terminal-Bench (59.3 vs 52.5), native 262K context. The second open-weight release this week where post-training quality outweighs parameter count — and this one fits on a prosumer GPU.
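A back-of-envelope sketch of the "fits on a prosumer GPU" claim: only the ~18GB total comes from the release, while the quantization width and overhead terms below are illustrative assumptions.

```python
# Rough VRAM estimate for a dense 27B model. Only the ~18 GB total is from the
# release; the quantization width and overhead terms are assumptions.
params = 27e9
bits_per_weight = 4.5                                # assumed ~4-5 bit quant
weights_gb = params * bits_per_weight / 8 / 1e9      # ~15.2 GB of weights
kv_and_overhead_gb = 3.0                             # assumed KV cache + runtime overhead
print(f"~{weights_gb + kv_and_overhead_gb:.0f} GB")  # ~18 GB, under a 24 GB prosumer card
```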
Google releases Gemma 4 in four sizes under Apache 2.0 · Apr 22 · [blog.google]
→ E2B, E4B, a 26B MoE, and a 31B dense model, shipped alongside Cloud Next. Gemma's role looks defensive now: keep developers on Google infra with weights they can fine-tune, while Gemini stays proprietary.
From the Lab
A Decomposition Perspective to Long-context Reasoning for LLMs · [arxiv 2604.07981]
→ Decomposes long-context reasoning into atomic sub-skills (retrieval over distractors, multi-hop chaining, contradiction handling) and synthesizes targeted training data per skill instead of monolithic long-context corpora. Every release this week quoted 1M-context numbers; the gap between "supports" and "reasons over" 1M tokens is what this measures.
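Loosely sketched, the recipe is: pick one atomic sub-skill and synthesize an example that isolates it, rather than sampling monolithic long documents. The function names, templates, and toy questions below are illustrative stand-ins, not the authors' pipeline.

```python
import random

# Illustrative per-skill data synthesis for long-context training. Everything
# here (names, templates, the toy probe questions) is invented for the sketch;
# the point is one targeted example per atomic sub-skill rather than one
# monolithic long-context corpus.

def retrieval_over_distractors(fact: str, distractors: list[str], depth: int) -> dict:
    """Bury one relevant fact among `depth` distractor paragraphs."""
    context = random.sample(distractors, k=min(depth, len(distractors)))
    context.insert(random.randrange(len(context) + 1), fact)
    return {
        "skill": "retrieval_over_distractors",
        "context": "\n\n".join(context),
        "question": "Which sentence states the launch date?",  # toy probe question
        "answer": fact,
    }

def multi_hop_chain(chain: list[str], distractors: list[str]) -> dict:
    """Scatter the links of a reasoning chain so the answer needs every hop."""
    docs = chain + distractors
    random.shuffle(docs)
    return {
        "skill": "multi_hop_chaining",
        "context": "\n\n".join(docs),
        "question": "Follow the chained references to the final value.",
        "answer": chain[-1],
    }
```

The same pattern would extend to contradiction handling: plant two conflicting statements, ask which one the surrounding context supports, and widen the distance between them as the curriculum hardens.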
LoRAFusion: Efficient LoRA Fine-Tuning for LLMs · [arxiv 2510.00206]
→ EuroSys '26: up to 1.96× speedup over Megatron-LM by fusing kernels to eliminate redundant activation reads and by co-training multiple independent LoRA adapters on the same GPUs. With Qwen3.6 and Gemma 4 making single-GPU fine-tuning realistic again, the bottleneck moves from "can I afford to train" to "how many adapters fit per box."
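To make "how many adapters fit per box" concrete, here is a minimal PyTorch sketch of the multi-adapter layout: one frozen base matmul shared across jobs, with each job's low-rank delta applied only to its own rows of the batch. This is the generic multi-LoRA batching pattern, not LoRAFusion's fused kernels; the class, shapes, and rank below are assumptions.

```python
import torch

# Minimal sketch of co-training several independent LoRA adapters on one GPU:
# the frozen base projection runs once for the whole batch, and each adapter's
# low-rank delta is applied only to that job's rows. (Illustrative layout only;
# LoRAFusion's reported gains come from fused kernels on top of this idea.)

class MultiLoRALinear(torch.nn.Module):
    def __init__(self, d_in: int, d_out: int, n_adapters: int, rank: int = 16):
        super().__init__()
        self.base = torch.nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)          # shared, frozen base weight
        self.A = torch.nn.Parameter(torch.randn(n_adapters, d_in, rank) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(n_adapters, rank, d_out))

    def forward(self, x: torch.Tensor, job_ids: torch.Tensor) -> torch.Tensor:
        # x: [batch, seq, d_in]; job_ids: [batch] mapping each row to its adapter
        y = self.base(x)                                # one base matmul for all jobs
        A = self.A[job_ids]                             # [batch, d_in, rank]
        B = self.B[job_ids]                             # [batch, rank, d_out]
        return y + (x @ A) @ B                          # per-job low-rank delta

layer = MultiLoRALinear(4096, 4096, n_adapters=4)
x = torch.randn(8, 128, 4096)
out = layer(x, job_ids=torch.randint(0, 4, (8,)))       # 8 sequences from 4 jobs, one GPU
```

The point of the layout is that the expensive base projection amortizes across jobs while gradients only touch the small A/B matrices, so the per-box limit becomes adapter and optimizer memory rather than full model copies.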
Worth Reading
- [Google splits TPU into TPU 8t and TPU 8i at Cloud Next] — Separate training and inference silicon is the architectural admission of the year: serving MoEs with chain-of-thought is a different workload from training them.
- [Meta will record employees' keystrokes and screenshots to train AI agents] — The Model Capability Initiative concedes the next agent-training bottleneck is real white-collar trajectory data — and that frontier labs will pay for it in surveillance.
- [SpaceX secures $60B option on Cursor, preempting its $2B round] — A $10B breakup fee on a $60B option is a new financial primitive: Musk bought a one-year call on the leading coding-agent surface and re-rated Cursor's planned $2B round at a ~$50B valuation before it closed.
Capital concentrates, weights diffuse, and the SaaS tape finally noticed.
— Alexis