Ethan Mollick: Agents Are Replacing Chatbots for Real Work
TL;DR
- Opus 4.7 worked on its own for 14 hours to build software that would take 2-17 weeks of human engineering, at a token cost of $251.
- A quarter of OpenAI workers already have at least four agents running at one time every week, per the figure Mollick cites.
- In a Claude Code study Mollick highlights, domain experience predicted success in the tool more reliably than professional background.
The framing to hang onto from Ethan Mollick's latest post is small but load-bearing. He argues that we are moving from a world where non-experts use chatbots to fill in gaps to one in which experts use agents to get work done. The interface hasn't just changed. The unit of collaboration has.
The numbers he cites are the ones to sit with. Opus 4.7, working on its own for 14 hours, reportedly built a software package that would take a human engineer 2 to 17 weeks, at a token cost of $251. A model he refers to as Fable worked autonomously for 9 hours on complex software projects that he says would have taken a team well over a week. And a quarter of OpenAI workers, per his figure, have at least four agents running at one time every week. Take the specifics as reported, not settled. The direction, though, is consistent with the programmer-hour measurements he attributes to METR and the UK government's AI Security Institute.
Why this matters for anyone doing knowledge work: the leverage now goes to the person who can define the job and check the output, not the person who can prompt cleverly. Mollick points at a Claude Code study finding that the more domain experience someone had, the more successful they were in using the tool. That is the sort of result that quietly rewires who is valuable inside a team.
The honest caveat is that most of the striking specifics come from Mollick's write-up rather than independent replication. What the post does not give you is a failure rate on those multi-hour runs, the shape of the tasks agents drop on the floor, or how transferable the OpenAI-worker figure is outside a lab full of unusually motivated users. It also does not really answer what a mid-career worker without deep domain expertise is supposed to do while agent-supervision becomes the skill that pays.
The forward-looking part is his closing line, and it is the one I keep coming back to. Institutions that move at the speed of people, or worse, committees, are trying to track a capability curve that is very much not human in nature, and he argues the gap only widens for as long as the exponential holds. If you are an operator, the move is to start practicing the agent-supervision skill now, on small stuff, while the stakes are low.
Originally reported by oneusefulthing.org
Original headline: Ethan Mollick: 'The Twilight of the Chatbots' — Autonomous Agents Now Complete 14–16+ Hour Tasks Per Prompt, Workers Are Becoming Managers of AI Fleets Rather Than Prompt-Users