Machine Learning News: An OpenAI model autonomously disproved Erdős's unit distance conjecture · May 26, 2026

May 26th 2026 · By Alexis

OpenAI's autonomous proof, Cursor's post-training moat, and Google's Flash-first keynote rewrote this week.

This week ML went two ways at once: research that wasn't supposed to happen yet, and shipping that wasn't supposed to happen this fast. A general-purpose reasoning model from OpenAI autonomously disproved Paul Erdős's 1946 unit distance conjecture with no math-specific scaffolding, while Google compressed a year of frontier launches into 22 days. Underneath the headlines, Cursor and Together AI made the quieter case that post-training and inference compression — not parameter counts — are where the next twelve months get decided.

Watch & Listen First

MLST: Michael I. Jordan on why he never followed the field into "AGI" (May 21) — the most influential living computer scientist explains why ML's roots are in statistics and operations research, not AI, and why data markets are Stackelberg games, not optimization problems.
No Priors Ep. 163 with Elad Gil and Sarah Guo (May 21) — 38-minute conversation on inference economics, post-training moats, and where this cycle's defensibility actually lives.

Key Takeaways

Frontier reasoning crossed a research threshold. General-purpose models can now produce publishable proofs of open math problems without domain-specific scaffolding.
The moat moved into post-training. Cursor spent 85% of Composer 2.5's compute budget on RL and continued pretraining over an open checkpoint, then matched Opus 4.7 on SWE-Bench Multilingual at roughly one-tenth the cost.
2-bit KV caches are production-ready. Together AI's OSCAR drops cache memory ~8x and lifts throughput up to 7.83x with no client changes.
Flash leads the keynote now. Google opening I/O with 3.5 Flash (not Pro) signals that latency and unit economics dominate the agent-era roadmap.
Open source isn't slowing. Hugging Face's spring report shows Chinese labs at ~41% of downloads and robotics datasets up 23x year-on-year — supply, not demand, is shifting.

The Big Story

An OpenAI model autonomously disproved Erdős's unit distance conjecture · May 20, 2026 · OpenAI
→ The model produced an infinite family of point configurations whose unit-distance count beats the long-assumed square-grid bound by an explicit polynomial factor (Will Sawin at Princeton refined the count to roughly n^1.014), built on Golod–Shafarevich theory and infinite class field towers, none of which was prompted. The load-bearing claim for ML practitioners isn't "AI did math"; it's that long-horizon proof search emerged from a general-purpose post-training stack, not a math-specialized scaffold like AlphaProof. External mathematicians verified the proof and wrote a companion paper explaining the construction.
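
For readers who want the statement in symbols, here is a rough summary. The first three lines are the classical picture as it appears in the standard combinatorics literature, not something taken from the OpenAI write-up; the last line simply restates the n^1.014 figure quoted above. Here u(n) denotes the maximum number of pairs at distance exactly 1 among n points in the plane, and c is an unspecified positive constant.

```latex
% u(n): maximum number of unit-distance pairs among n points in the plane
\begin{align*}
  u(n) &\ge n^{1 + c/\log\log n} && \text{(Erd\H{o}s 1946, scaled square grid)} \\
  u(n) &= O\!\left(n^{4/3}\right) && \text{(Spencer--Szemer\'edi--Trotter 1984)} \\
  \text{Conjecture:}\quad u(n) &\le n^{1 + o(1)} \\
  \text{Reported disproof:}\quad u(n) &\ge n^{1.014} && \text{for infinitely many } n
\end{align*}
```

The grid exponent 1 + c/log log n tends to 1 as n grows, so any fixed exponent above 1 eventually beats it; that is the sense in which the new construction wins by a polynomial factor.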

Also This Week

Cursor's Composer 2.5 matches Opus 4.7 on SWE-Bench Multilingual at one-tenth the cost · May 18 · Cursor
→ Built on Moonshot's open Kimi K2.5 checkpoint with 85% of total compute spent on Cursor's RL and continued-pretraining pipeline — the strongest production signal yet that coding agents are won in post-training, not pretraining.

Google ships Gemini 3.5 Flash, Gemini Omni, and Antigravity 2.0 in one I/O · May 19 · Google
→ 3.5 Flash at $1.50/$9.00 per million tokens beats Gemini 3.1 Pro on Terminal-Bench 2.1 (76.2% vs 70.3%) and runs ~4x faster — the price/latency frontier is now the keynote story, not raw capability.
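
To make the pricing concrete, here is a small sketch of per-request cost at the quoted rates. It assumes $1.50 is the input price and $9.00 the output price per million tokens (the usual listing order, but not stated above); the function name and the token counts are illustrative only.

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_m=1.50, output_price_per_m=9.00):
    """Cost in USD for one request, given per-million-token prices.

    Assumption: 1.50 is the input rate and 9.00 is the output rate.
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical agent step: 200k tokens of context in, 50k tokens out.
cost = request_cost_usd(200_000, 50_000)
print(f"${cost:.2f}")  # prints $0.75
```

Under that assumption, output tokens are 20% of the traffic in this example but 60% of the bill, which is why long agent traces are priced mostly by what the model writes.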

Together AI open-sources OSCAR, a 2-bit attention-aware KV cache · May 25 · MarkTechPost
→ Rotating activations with attention-aware covariance matrices (not generic Hadamard transforms) hits 3x decode speedup at 100K context and 7.83x throughput at large batches — drop-in for SGLang with full paged-cache compatibility.
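
A minimal sketch of the general rotate-then-quantize pattern, for intuition only. It is not OSCAR: it uses a plain normalized Hadamard transform as a stand-in rotation (exactly the generic choice the item says OSCAR moves away from), and the per-vector min/max scaling and all function names are assumptions made for illustration.

```python
import math


def hadamard_rotate(x):
    """Normalized fast Walsh-Hadamard transform (orthogonal, self-inverse)."""
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    y = list(x)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in y]


def quantize_2bit(x):
    """Map each value to one of 4 evenly spaced levels between min and max."""
    lo, hi = min(x), max(x)
    step = (hi - lo) / 3 or 1.0  # 4 levels span 3 intervals; guard constant input
    codes = [min(3, max(0, round((v - lo) / step))) for v in x]
    return codes, lo, step


def dequantize_2bit(codes, lo, step):
    return [lo + c * step for c in codes]


def compress_roundtrip(x):
    """Rotate, store 2-bit codes, then reconstruct by undoing both steps."""
    rotated = hadamard_rotate(x)
    codes, lo, step = quantize_2bit(rotated)
    restored = hadamard_rotate(dequantize_2bit(codes, lo, step))
    return codes, step, restored


# Toy key vector with one outlier channel (values are made up).
key = [0.1, -0.2, 0.05, 0.15, -0.1, 0.2, -0.05, 8.0]
codes, step, restored = compress_roundtrip(key)
print(codes)
print([round(v, 3) for v in restored])
```

Each stored code needs 2 bits against 16 for an FP16 value, so the raw payload shrinks 8x before counting the per-vector offset and step. Because the rotation is orthogonal, the reconstruction error in the original space equals the quantization error in the rotated space; the choice of rotation only changes how evenly that error is spread.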

Gemini Omni Flash brings native multi-input video generation to consumer surfaces · May 19 · Google Blog
→ Text + image + audio + video → high-resolution video output with SynthID watermarking, shipping inside YouTube Shorts the same week — the multimodal stack stopped being an API demo this quarter.

From the Lab

OScaR: The Occam's Razor for Extreme KV Cache Quantization in LLMs and Beyond · arXiv 2605.19660
→ The academic cousin to Together AI's OSCAR — different team, same problem space. Uses omni-scaled canalized rotation to reach near-lossless INT2 quantization across X-LLMs. If you're serving long-context models on constrained VRAM, this is the cleanest write-up of the new accuracy-efficiency Pareto front. Code at ZunhaiSu/OScaR-KV-Quant.

Rethinking LLM Ensembling from the Perspective of Mixture Models · arXiv 2605.00419
→ Reinterprets ensembles as mixture models, allowing stochastic per-token selection of a single component — 1.78–2.68x faster than conventional token-vote ensembles with comparable quality. Practical for cheap "router over open checkpoints" inference stacks.
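
A toy sketch of why the mixture view can save compute, based only on the description above rather than on the paper's algorithm. Drawing a component by its weight and then sampling from that component alone yields the same token distribution as sampling from the weighted average of all components, but only one model is queried per token. The two "models", their distributions, and the weights are invented for illustration.

```python
import random


def model_a(context):
    # Hypothetical next-token distribution from component A.
    return {"yes": 0.7, "no": 0.2, "maybe": 0.1}


def model_b(context):
    # Hypothetical next-token distribution from component B.
    return {"yes": 0.1, "no": 0.6, "maybe": 0.3}


MODELS = [model_a, model_b]
WEIGHTS = [0.25, 0.75]  # mixture weights, assumed to sum to 1


def mixture_distribution(dists, weights):
    """Conventional ensemble: query every model, average the distributions."""
    return {tok: sum(w * d[tok] for w, d in zip(weights, dists))
            for tok in dists[0]}


def sample_from(dist, rng):
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


def sample_single_component(models, weights, context, rng):
    """Mixture view: pick one component by weight, query only that model."""
    k = rng.choices(range(len(models)), weights=weights, k=1)[0]
    return k, sample_from(models[k](context), rng)


rng = random.Random(0)
target = mixture_distribution([m("ctx") for m in MODELS], WEIGHTS)
counts = {tok: 0 for tok in target}
for _ in range(20_000):
    _, tok = sample_single_component(MODELS, WEIGHTS, "ctx", rng)
    counts[tok] += 1
for tok in target:
    print(tok, round(target[tok], 3), round(counts[tok] / 20_000, 3))
```

The printed empirical frequencies should track the averaged distribution closely, even though each draw called only one of the two models. A real serving stack would also have to keep each model's context state in sync, which this sketch ignores.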

Worth Reading

State of Open Source on Hugging Face: Spring 2026 — China at ~41% of downloads, independent developers up from 17% to 39%, robotics datasets up 23x; the supply side of open ML is no longer Western-led.
Gil Kalai: "Amazing — Erdős' Unit Distance Problem was Disproved! It was achieved by AI!" — a working combinatorialist's reaction to the OpenAI result; the most grounded take you'll find on what the proof actually does and doesn't say about ML.
Cursor's Composer 2.5 hits third on Artificial Analysis's Coding Agent Index — independent benchmark commentary on the 10–60x cost gap vs. higher-effort Opus 4.7 and GPT-5.5 variants.

The week the moat moved twice: away from frontier scale, and toward proofs no one expected an LLM to write yet.
