Who's Who of AI

Naomi Saphra

NLP and interpretability researcher

820 trust · researcher · @nsaphra.bsky.social · 10,234 followers

Why they matter

NLP and interpretability researcher with public activity across AI research, agents and robotics, and NLP and language.

AI signals: 4
Sources: 4
Discussions: 18
Latest signal: 11d ago

Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP professor. nsaphra.net

What they're sharing

Articles & links

↻ Naomi Saphra reposted

Tiago Pimentel @tpimentel.bsky.social

Our new paper reformulates tokenisation as a linear program (LP), which we solve to get SOTA tokenisers 😁 As a bonus, this LP tells us how close to optimal any tokeniser is! Check it out 👇 w/ J. Tempus, @philipwitti.bsky.social, @craigschmidt.com, D. Komm Paper: arxiv.org/abs/…

[2605.22821] Tokenisation via Convex Relaxations arxiv.org

Our ICML HiLD workshop paper shows that the reason why bigger models learn more complex tasks is because they are able to saturate the gradients for easier tasks; different tasks are competing for the same parameters and gradient mass.

Why Larger Models Learn More: Effects of Capacity, Interference, and Rare-Task Retention arxiv.org

AI Weekly's analysis:

The paper argues power-law scaling already implies a larger model will learn parts of the data distribution a smaller model cannot, even with infinite training data.
Pretraining experiments on OLMo models from 4M to 4B parameters found only the larger models learned the infrequent and complex tasks.
The proposed mechanism is reduced gradient interference: weaker common-task updates leave rare-task features intact in larger models.

View on Bluesky · ♥ 3 ↻ 0 ↩ 0 · 2 from the directory shared this · 30d ago

↻ Naomi Saphra reposted

David Picard @davidpicard.eurosky.social

Babe, stop everything! New favorite paper of the year is out! kyutai.org/fid-lottery/ arxiv.org/abs/2606.20536

The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation arxiv.org

AI Weekly's analysis:

Retraining a model with a different seed moves its FID score 3.2x more than resampling from a fixed trained network.
FID coefficient of variation stays within a 1-2% band even as compute or model size increases.
The authors recommend treating any FID gap below roughly 1.3% CoV as inconclusive and requiring multi-seed error bars.
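
As a minimal sketch of the multi-seed check summarized above, the following computes the coefficient of variation (CoV) of FID across training seeds and flags a gap between two models as inconclusive when it falls under a threshold. The FID values, the function names, and the reading of the 1.3% figure as a relative gap between seed-averaged scores are all illustrative assumptions, not code or numbers from the paper.

```python
from statistics import mean, stdev


def coefficient_of_variation(scores):
    """Sample standard deviation divided by the mean, as a fraction."""
    return stdev(scores) / mean(scores)


def relative_gap(scores_a, scores_b):
    """Gap between seed-averaged scores, relative to their midpoint."""
    mean_a, mean_b = mean(scores_a), mean(scores_b)
    return abs(mean_a - mean_b) / ((mean_a + mean_b) / 2)


def is_inconclusive(scores_a, scores_b, threshold=0.013):
    """True when the relative gap falls below the threshold (assumed 1.3%)."""
    return relative_gap(scores_a, scores_b) < threshold


# Hypothetical FID scores, one per training seed (not values from the paper).
model_a_fid = [10.0, 10.2, 9.9, 10.1]
model_b_fid = [10.05, 10.15, 10.0, 10.2]

print(f"CoV A: {coefficient_of_variation(model_a_fid):.2%}")
print(f"CoV B: {coefficient_of_variation(model_b_fid):.2%}")
print("gap inconclusive:", is_inconclusive(model_a_fid, model_b_fid))
```

Reporting the per-model CoV alongside the gap makes it visible when a claimed improvement is no larger than seed-to-seed noise.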

↻ Naomi Saphra reposted

Isabelle Lee @wordscompute.bsky.social

Benchmarks can be superficial, but model explanations and evaluations are fundamentally intertwined. What if we used interpretability as principled, scientific evaluation? If it met scientific standards? arxiv.org/abs/2605.05508 coming to EvalEval at ACL as oral 🧵 1/6

Rigorous Interpretation Is a Form of Evaluation arxiv.org

AI Weekly's analysis:

The paper argues interpretability methods that are falsifiable, reproducible, and predictive can serve as model evaluation, not just diagnostics.
Of four methods assessed in Table 1, attention mechanisms fail all three criteria; sparse autoencoders fail reproducibility.
An SAE refusal-detection feature trained on chat data failed to generalize when the target model received webtext input instead.

Our new paper sets the stage for the biggest practical use case of model interpretability: stress testing and dataset development. All you need is interpretable linear features and simple geometry.

Adversarial Concept Search: Predicting Compositional Errors From Feature Geometry arxiv.org

AI Weekly's analysis:

A Compositional Interference metric derived from feature geometry predicts LLM failures without evaluating specific inputs.
On multihop question answering, correlation between the CI metric and model accuracy reached r = -0.855.
The method predicts cross-lingual transfer failures across 10+ languages using only English fact representations.

View on Bluesky · ♥ 23 ↻ 5 ↩ 1 · 2 from the directory shared this · 43d ago

I'm so happy when other people write papers on nondeterministic factors in training. embrace the chaos

Emergent Capabilities Arise Randomly from Learning Sparse Attention Patterns arxiv.org

AI Weekly's analysis:

Emergent capabilities arise stochastically: the same model can gain or fail to gain a capability depending on its random initialization.
Researchers used Pythia models from 14M to 410M parameters to show attention pattern learning is the key bottleneck to capability emergence.
More attention heads improved learning efficiency; MLP-Mixer outperformed standard transformers on tasks with complex attention patterns.

View on Bluesky · ♥ 43 ↻ 3 ↩ 0 · 2 from the directory shared this · 33d ago

↻ Naomi Saphra reposted

@n.grnfld.me

Back from Seoul. My first paper, "An Isotropic Approach to Efficient UQ with Gradient Norms", got a poster and an oral at ProbML, and came away with the Best Paper Award. Still a bit stunned. arxiv.org/abs/2603.29466

An Isotropic Approach to Efficient Uncertainty Quantification with Gradient Norms arxiv.org

AI Weekly's analysis:

A new preprint derives LLM predictive uncertainty from a single forward-backward pass through an unmodified pretrained model, avoiding ensembles or training-data access.
Two approximations fall out cleanly: epistemic uncertainty as the squared gradient norm, aleatoric as the Bernoulli variance of the point prediction.
The combined score gets the highest mean AUROC on TruthfulQA but drops to near chance on TriviaQA factual recall, per the authors.
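
To make the two approximations concrete, here is a toy sketch on a logistic model rather than an LLM. It assumes the gradient in question is that of the log-probability of the model's own predicted label with respect to the parameters; that choice, the function names, and the inputs are assumptions for illustration, and the paper's exact definitions and the way it combines the two scores may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def uncertainty_scores(weights, features):
    """Return (p, aleatoric, epistemic) for a toy logistic model.

    aleatoric: Bernoulli variance p * (1 - p) of the point prediction.
    epistemic: squared norm of the gradient of log p(predicted label)
               with respect to the weights (assumed definition).
    """
    logit = sum(w * x for w, x in zip(weights, features))
    p = sigmoid(logit)
    aleatoric = p * (1.0 - p)
    predicted = 1.0 if p >= 0.5 else 0.0
    # For a logistic model, d/dw log p(predicted) = (predicted - p) * x.
    gradient = [(predicted - p) * x for x in features]
    epistemic = sum(g * g for g in gradient)
    return p, aleatoric, epistemic


# Hypothetical weights and inputs: an undecided case and a confident case.
print(uncertainty_scores([0.0, 0.0], [1.0, 2.0]))
print(uncertainty_scores([5.0, 0.0], [1.0, 0.0]))
```

Both scores come from one forward pass (for `p`) and one gradient evaluation, which is the cost profile the summary describes.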

Unfortunately, the canonical reference is extremely dense

The generalization error of random features regression: Precise asymptotics and double descent curve arxiv.org

AI Weekly's analysis:

Song Mei and Andrea Montanari analyze ridge regression on N random features, equivalent to a two-layer neural network with random first-layer weights.
They compute precise asymptotics of test error in the limit where N, n, and d go to infinity with N/d and n/d held fixed.
Their setup is described as the first analytically tractable model capturing all features of the double descent curve without ad hoc misspecification.
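
The setup in the summary above, ridge regression on N random features viewed as a two-layer network with a frozen random first layer, can be sketched in plain Python. The Gaussian first-layer weights, the ReLU activation, the 1/sqrt(d) scaling, the unscaled ridge penalty, and the synthetic data are simplifying assumptions for illustration and are not the exact distributions or normalization analyzed in the paper.

```python
import math
import random


def random_features(x, first_layer):
    """ReLU of random projections: the frozen first layer of the network."""
    scale = math.sqrt(len(x))
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) / scale)
            for row in first_layer]


def solve_linear_system(matrix, rhs):
    """Solve matrix @ solution = rhs by Gaussian elimination with pivoting."""
    size = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, size))
        solution[r] = (aug[r][size] - tail) / aug[r][r]
    return solution


def fit_ridge(feature_rows, targets, ridge):
    """Second-layer weights from the normal equations (F^T F + ridge*I) a = F^T y."""
    width = len(feature_rows[0])
    gram = [[sum(row[i] * row[j] for row in feature_rows)
             + (ridge if i == j else 0.0) for j in range(width)]
            for i in range(width)]
    moment = [sum(row[i] * t for row, t in zip(feature_rows, targets))
              for i in range(width)]
    return solve_linear_system(gram, moment)


# Hypothetical sizes: d input dimensions, N random features, n samples.
rng = random.Random(0)
input_dim, num_features, num_samples = 3, 8, 20
first_layer = [[rng.gauss(0, 1) for _ in range(input_dim)]
               for _ in range(num_features)]
inputs = [[rng.gauss(0, 1) for _ in range(input_dim)]
          for _ in range(num_samples)]
targets = [x[0] - 0.5 * x[1] + 0.1 * rng.gauss(0, 1) for x in inputs]

feature_rows = [random_features(x, first_layer) for x in inputs]
second_layer = fit_ridge(feature_rows, targets, ridge=0.1)
predictions = [sum(a * f for a, f in zip(second_layer, row))
               for row in feature_rows]
train_mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / num_samples
print("training MSE:", train_mse)
```

Only `second_layer` is learned; sweeping `num_features` relative to `num_samples` and measuring error on held-out data is the kind of experiment in which a double descent curve would be looked for.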

View on Bluesky · ♥ 5 ↻ 1 ↩ 2 · 31d ago

↻ Naomi Saphra reposted

Ethan Mollick @emollick.bsky.social

I wrote this a few months ago right after the first Anthropic/DoW conflict & Citrini & Block: “t But I think that single week is a good illustration of what the near future will feel like… as the stakes go up, it is likely things will feel even more unstable..” www.oneusefulth…

oneusefulthing.org

↻ Naomi Saphra reposted

@pamelafox.bsky.social

I printed a custom t-shirt that's an ode to @simonwillison.net's Pelican benchmark. The super chill highly saturated cruising pelican came from this Gemini 3 launch in Feb: simonwillison.net/2026/Feb/12/... (And the caption was inspired by the reaction on social media to the b…

Gemini 3 Deep Think simonwillison.net

↻ Naomi Saphra reposted

Zeerak Talat زیرک طلعت (they/them) @zeerak.bsky.social

NLP reviews suck! But why is not clear– @aclrollingreview.bsky.social provides *a lot* of guidance for how to review, but in that, first principles get lost. So @adamlopez.bsky.social and I have written down some of our thoughts on first principles.

medium.com

↻ Naomi Saphra reposted

@fernmonkey.bsky.social

This interview with him is quite interesting, and it describes his own accommodations as a disabled academic. I feel genuinely bad for him, because it's clear that he wanted his students safe and supported, and then this happens.

Professor denounces mass AI fraud on an exam at Brown University: ‘Academic integrity is at risk’ english.elpais.com

Their own posts

Recent commentary

We don’t always know what problems are hard for LLMs. So devs evaluate on tasks HUMANS find hard or on broad benchmarks. What if we could instead anticipate which scenarios a model will fail on—all without evaluating specific input examples? 🧵NEW PAPER by @jenniferlumeng.bsky.social

View on Bluesky · ♥ 133 ↻ 34 ↩ 3 · 43d ago

if you think students will be routinely collaborating with LLMs after graduation, #1 priority is to study rhetoric because you need to recognize BS and critique the rigor of an argument that *looks* good. the elite users are going to be like, philosophy, classics, and english majors.

View on Bluesky · ♥ 109 ↻ 13 ↩ 5 · 13d ago

if you are a PhD student in AI, remember it is in your interests to distract your advisor from how much money they could be making in industry. should be a daily priority.

View on Bluesky · ♥ 80 ↻ 2 ↩ 4 · 61d ago

Are students embarrassed by AI cheating? Like, there has always been rare but unstigmatized dishonesty (memorizing a frat’s archive of finals questions) and common but stigmatized (saying you’ve started when you definitely have not started it). AI cheating should be the most stigmatized. Is it?

View on Bluesky · ♥ 23 ↻ 0 ↩ 10 · 20d ago

ok the thing about erdos is he obviously loved collaborating with humans. he could have done a lot on his own, but math was how he chose to connect. I'm not sure he would have been very into chatbots?

View on Bluesky · ♥ 32 ↻ 1 ↩ 4 · 64d ago

ACL needs to adopt the expectation from ML conferences that workshops be exciting. If you come from NLP you know ACL workshops are mostly terminal venues for abandoned/unambitious work, but it's clear that the correct approach is to host cutting-edge WiP.

View on Bluesky · ♥ 17 ↻ 1 ↩ 4 · 29d ago

How do you make LLMs actually good at explaining new math? It's like reading a badly written reference for people who already know the subject. When I ask a question, it never matches my level. If I try to rephrase to test my understanding, it just sycophantically agrees.

View on Bluesky · ♥ 16 ↻ 2 ↩ 3 · 35d ago

my new literary award cannot be won by a commercial frontier LLM because I will require that 10% of each submission is smut

View on Bluesky · ♥ 24 ↻ 0 ↩ 1 · 69d ago

the fact that AI judges prefer sloppy AI writing makes the total death of good human-readable prose almost inevitable in scientific publishing and writing competitions. not sure what we can do about that.

View on Bluesky · ♥ 19 ↻ 1 ↩ 2 · 31d ago

ChatGPT just told me helpfully in an explanation that the ceiling for something was lower than its floor and it instantly changed my belief that LLMs can apply systemic metaphors well without grounding.

View on Bluesky · ♥ 11 ↻ 0 ↩ 4 · 18d ago

Their network

In Naomi Saphra's orbit

[Network graph: Naomi Saphra at center; members they follow on the left (green edges), members who follow them on the right (blue edges), mutual follows at the top (orange edges).]