How much does a language model forget when finetuned on new tasks? We show that both model size and optimization matter, and that forgetting can be nearly eliminated with self-generated replay! arxiv.org/abs/2605.26097 w/Martin Marek, Dongkyu Cho, Shikai Qiu, Rumi Chunara, and Pavel Izma…
Who's Who of AI
Andrew Gordon Wilson
Machine Learning Professor
https://cims.nyu.edu/~andrewgw
What they're sharing
[2605.26097] Forgetting in Language Models: Capacity, Optimization, and Self-Generated Replay arxiv.org
Recent commentary
Anyone want to submit a workshop proposal all about deep learning? I think this area really might take off.
Perhaps I'm an outlier, but generally the value I derive from art is not from its backstory. I love a Bach fugue not because he was suffering, content, had many children, or whatever else, but because it's an extraordinary composition. I'd feel the same about AI generated art.