the-decoder.com via Reddit

DeepMind AlphaProof Nexus solves 9 Erdős problems

google demis hassabis research ai-research formal-math deepmind

Key insights

  • DeepMind's AlphaProof Nexus solved 9 open Erdős problems and 44 OEIS conjectures, with two problems unsolved for 56 years.
  • Each proof cost a few hundred dollars at inference, dramatically undercutting the human effort previously required.
  • All formal Lean proofs are public on GitHub, making the results machine-verifiable and independently checkable.

Why this matters

The cost threshold of a few hundred dollars per research-level proof means any well-funded lab or startup can now run autonomous math discovery as a background process. Formal verification via Lean ties AI output to machine-checkable proofs, which removes the trust gap that has blocked AI math claims from entering peer-reviewed mathematics. DeepMind's lead over OpenAI (9 Erdős problems vs. 1) in a single week signals that agentic proof systems are entering a competitive scaling phase where proof throughput becomes a measurable benchmark metric.

Summary

Google DeepMind's AlphaProof Nexus cracked 9 of 353 open Erdős problems and proved 44 OEIS conjectures, some unsolved for over 56 years, at just a few hundred dollars per proof. The system pairs Gemini with Lean, a formal proof verifier, running as an autonomous agentic loop. All proofs are publicly posted on GitHub. A 15-year-old algebraic geometry question was resolved in the same run. Essentially: (Google DeepMind, Lean) turns decades of unsolved math into a compute budget line item. - Nine Erdős solutions vs. OpenAI's one last week, from a pool of 353 open problems. - Cost per proof: a few hundred dollars at inference prices. - DeepMind explicitly frames this as research-level math, not AGI. Formal verification makes AI-generated proofs machine-checkable, which redefines what peer review means for mathematics.

Potential risks and opportunities

Risks

  • Academic journals and prize committees lack established protocols for crediting AI-solved theorems, creating attribution disputes that could surface within 12-18 months as more proofs are published.
  • Labs racing to match DeepMind's throughput may publish insufficiently verified AI proofs before Lean formalization is complete, contaminating mathematical literature in ways that are slow to correct.
  • The low per-proof cost could enable state-level actors to fund adversarial cryptographic or number-theoretic research at scale, with minimal human oversight of which problems are being targeted.

Opportunities

  • Lean ecosystem contributors and tooling projects (Leanprover community, Formal Abstracts) gain commercial leverage as the infrastructure layer for AI-verified mathematics scales up.
  • Universities and research institutes running open problem catalogs (OEIS Foundation, Erdős Prize fund administrators) could partner with DeepMind or replicate the pipeline for discipline-specific unsolved problems.
  • Inference providers serving long-context agentic math workloads (Together AI, Fireworks AI) are positioned to capture a new research compute category as labs attempt to replicate and extend AlphaProof Nexus results.

What we don't know yet

  • Whether DeepMind plans to continue running AlphaProof Nexus against the remaining 344 open Erdős problems and on what timeline.
  • The degree of human scaffolding required to formalize each problem in Lean before autonomous solving begins, which affects the true cost and scalability claim.
  • Whether the 44 OEIS conjectures solved carry comparable mathematical significance to the Erdős problems, or represent lower-hanging computational targets.