Toneva reframes AI models as neuroscience model organisms
TL;DR
- Max Planck's Mariya Toneva argues pretrained AI models should be studied as 'model organisms' rather than finished computational models of the brain.
- Text-only language models can predict auditory cortex activity by exploiting the correlation between a word's letter count and its phoneme count, not by actually processing speech.
- Brain-tuning models on naturalistic brain recordings reportedly improves predictions for new people and stories and surfaces speech features not yet identified.
Anyone working near the seam between AI and neuroscience will find a useful reframing in a perspective piece by Mariya Toneva in The Transmitter. Toneva, a faculty member at the Max Planck Institute for Software Systems, argues that the last decade's gold rush of using pretrained AI models to predict brain activity has slid into a category error. The systems, she writes, were never built to explain the brain; they were designed as engineering tools, trained to solve practical problems such as predicting the next word in a sentence. The better mental model is the mouse or the fruit fly: complex systems we did not build to test a specific neuroscientific theory, but can study as model organisms.
The example that does the work is sharp. Text-based language models, trained only on written text, can predict activity in brain regions that process low-level features of speech. That should be suspicious, and it is: the number of letters in a written word often correlates with the number of phonemes in the spoken version, so the auditory cortex tracks the sounds and the language model tracks the letters, and the apparent alignment is a shortcut. "The language model wasn't actually 'listening'; it was just exploiting a correlation," she writes. Her group developed an interpretability technique to localize and perturb the model's knowledge of word length, and the model's ability to predict auditory cortex activity immediately vanished.
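The logic of that shortcut test can be illustrated with a toy simulation. This is a minimal sketch on synthetic data, not Toneva's interpretability technique: it swaps in plain linear erasure (regressing letter count out of the features) as a stand-in for "perturbing the model's knowledge of word length". Every name and number here (`features`, `response`, `erase_linear`, the 0.8 slope, the noise levels) is a hypothetical chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_words, n_dims = 2000, 16

# Hypothetical stimulus statistics: phoneme count loosely follows letter count.
n_letters = rng.integers(1, 12, size=n_words).astype(float)
n_phonemes = np.clip(np.round(0.8 * n_letters + rng.normal(0, 1, n_words)), 1, None)

# Toy "language model" features: letter count encoded along one random
# direction, plus unrelated noise in every dimension.
direction = rng.normal(size=n_dims)
features = np.outer(n_letters, direction) + rng.normal(size=(n_words, n_dims))

# Toy "auditory cortex" response: driven by phoneme count only.
response = n_phonemes + rng.normal(0, 1, n_words)

train = np.arange(n_words) < 1500
test = ~train


def pearson(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def encoding_score(feats, resp, alpha=1.0):
    """Fit a ridge encoding model on the training words and return the
    correlation between predicted and actual responses on held-out words."""
    f_mu, r_mu = feats[train].mean(axis=0), resp[train].mean()
    x, y = feats[train] - f_mu, resp[train] - r_mu
    weights = np.linalg.solve(x.T @ x + alpha * np.eye(x.shape[1]), x.T @ y)
    pred = (feats[test] - f_mu) @ weights + r_mu
    return pearson(pred, resp[test])


def erase_linear(feats, nuisance):
    """Subtract from every feature the part linearly predictable from
    `nuisance`, with the slope estimated on training words only."""
    z = nuisance - nuisance[train].mean()
    centered = feats[train] - feats[train].mean(axis=0)
    slope = (z[train] @ centered) / (z[train] @ z[train])
    return feats - np.outer(z, slope)


score_before = encoding_score(features, response)
score_after = encoding_score(erase_linear(features, n_letters), response)
print(f"held-out correlation before erasure: {score_before:.2f}")
print(f"held-out correlation after erasure:  {score_after:.2f}")
```

In this toy setup the features never see a phoneme, yet they predict the simulated auditory response well because letter count stands in for phoneme count. Once the letter-count information is removed, held-out prediction should fall to roughly chance, which is the pattern the essay describes for the real model.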
The constructive half of the essay is a method she calls brain-tuning, which aligns model representations with naturalistic brain recordings that capture perception, language comprehension, memory and prediction at once rather than one cognitive process at a time. Her claim is that when you brain-tune a language model on auditory data, it doesn't just improve at predicting those specific data; it becomes a better general listener, predicting brain activity for entirely new people and new stories, picking up on features that go beyond simple word length, and starting to process speech features researchers haven't even identified yet. She also reports that the internal representation space of brain-tuned models matches the hierarchical processing in the brain more closely than that of pretrained AI models does.
The honest caveat is that this is a perspective essay, not a benchmark paper, and the supporting results are largely from Toneva's own lab, so take the specifics as reported rather than settled. What the piece doesn't give you is how much naturalistic brain data brain-tuning really needs, or whether the perturbation trick generalizes past word length to the other shortcuts that probably hide in these models. What it does give you is a cleaner posture for anyone using pretrained models to make brain claims: assume the model is winning for a hidden reason until you can perturb the candidate mechanism and watch the prediction die.
Originally reported by thetransmitter.org
Original headline: Transforming AI models into useful model organisms