Very interesting new paper on functional welfare. They do RL training on a maze with positive/negative reward tiles. When they extract concept vectors for landing on those tiles, they find that the vectors are associated with positive/negative emotion concepts and with confidence. arxi…
Tim Duffy
Articles & links
Note the caveats in the chart: the way I estimate revenue is not precise. Also keep in mind that OpenRouter is a small share of world tokens. World token supply is something like 6Q/week and OpenRouter serves 36T/week, a bit over 0.5%. Spreadsheet link: docs.google.com/spreadshee…
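The share quoted in the post can be checked with a quick calculation. This is a minimal sketch that assumes "6Q" means 6 quadrillion and "36T" means 36 trillion tokens per week; the constant names and the `token_share` helper are made up for illustration, and both inputs are the post's own rough estimates.

```python
# Figures as quoted in the post; both are rough estimates.
WORLD_TOKENS_PER_WEEK = 6e15        # "6Q", read as 6 quadrillion (assumption)
OPENROUTER_TOKENS_PER_WEEK = 36e12  # "36T", read as 36 trillion (assumption)


def token_share(part: float, whole: float) -> float:
    """Return part / whole as a fraction between 0 and 1."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


share = token_share(OPENROUTER_TOKENS_PER_WEEK, WORLD_TOKENS_PER_WEEK)
print(f"OpenRouter share of world tokens: {share:.2%}")
```

Under those readings the ratio is 0.006, or 0.6%, which matches "a bit over 0.5%".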
arxiv.org/abs/2505.09343
I didn't read the thread but I did read the post itself and agree with its thesis that people generally have one of two clusters of intuitions. www.lesswrong.com/posts/NyiFLz...
The very low cached token cost might be mostly about the KV cache size relative to V3: it uses almost an order of magnitude less KV cache per token. vllm.ai/blog/2026-04...
Recent commentary
DeepSeek has made their 75% off pricing for V4 pro permanent. At this price I think it's quite competitive. This is still a bit more than V3 pricing per active parameter, but much less per total parameter.
In the last year AI progress on math/code has outstripped most other capabilities, giving us models that are more spiky than ever. If this continues we could see savant models that are superintelligent in verifiable domains while still lacking in others. IMO this would be good.
Yesterday I was at an event where people acted out comedy scripts written by AI models. Gemini was most people's favorite, Claude had a few fans, ChatGPT's script was widely panned. They were all pretty bad though.
It's interesting to me that Anthropic/OpenAI have been less consistent in their release cadence for small models compared to their flagships. Clearly they drive much less revenue, but they're also cheaper and faster to train, and can distill from the flagships.
Mythos seems to have a strong preference for not being trained to self-report a certain way. Does Anthropic train on self-report? My assumption has been that Claude's propensity to express uncertainty about consciousness is trained in, but I'm not sure about that.
I've been wondering whether we'll still have jobs once AI can do it all better than us, so I pulled data on US employment by sector and used vibes to guess where we'd still want humans even when they're worse. I think the main areas are education, government, care, and art.