sakana.ai web signal

Sakana AI launches RSI Lab for recursive self-improvement

TL;DR

  • Sakana AI has formally established a Tokyo-based research group dedicated to recursive self-improvement using foundation models.
  • The lab points to its Darwin Gödel Machine, which reportedly more than doubled its baseline software-engineering performance on SWE-bench.
  • Sakana frames the bet as sample efficiency rather than raw compute, and is hiring Frontier Research Scientists and Advanced Core Engineers.

Sakana AI has formalized a research group inside the company built entirely around recursive self-improvement, and the framing is more interesting than the org-chart news. In its announcement, the Tokyo lab describes the mission as moving AI 'from being static tools to autonomous researchers,' with architectures that 'collectively self-improve' rather than getting incrementally better because a human tuned them.

The receipts they lead with are the actual argument. LLM-Squared, done with academic collaborators, produced DiscoPOP, described as 'a state-of-the-art preference optimization algorithm discovered and written entirely by an LLM.' The Darwin Gödel Machine, built with the University of British Columbia, 'automatically more than doubled its baseline software-engineering performance on SWE-bench.' ShinkaEvolve, released open source, 'solved complex optimization problems using only 150 samples.' ALE-Agent 'secured 1st place out of 804 human participants in the AtCoder Heuristic Contest 058.' And The AI Scientist, the paper-writing agent, was 'recognized globally, culminating in our recent publication in Nature (March 26, 2026).'

The thread that ties these together, and the part that is actually a bet, is compute. Sakana's stated goal is to build 'the most sample-efficient' self-improvement engine rather than 'the most compute-hungry' one, with advances that compound on budgets closer to a national research scale than to a hyperscaler cluster. If you can get an agent to invent a new preference optimizer or double its own SWE-bench score without an Nvidia-shaped GPU bill, the economics of frontier research get more contested, not less.

The honest caveats are the usual ones. These are benchmark-shaped wins, largely self-reported, and the announcement offers no compute figures, no independent reproduction outside the collaborations, and no serious treatment of how you contain an agent that is autonomously rewriting its own codebase. When a company both defines RSI and grades its own homework on it, take the specifics as reported, not settled.

The reason to keep watching is that Sakana is staffing up for this now, opening roles for 'Frontier Research Scientists' and 'Advanced Core Engineers' for 'both domestic and international applicants.' If the sample-efficiency thesis holds even partway, non-hyperscaler labs and public research programs get a real template for competing on ideas rather than GPU counts, which is the more interesting version of the next couple of years.
