Naomi Saphra
NLP and interpretability researcher
Articles & links
Our new paper sets the stage for the biggest practical use case of model interpretability: stress testing and dataset development. All you need is interpretable linear features and simple geometry.
Recent commentary
We don’t always know what problems are hard for LLMs. So devs evaluate on tasks HUMANS find hard or on broad benchmarks. What if we could instead anticipate which scenarios a model will fail on—all without evaluating specific input examples? 🧵NEW PAPER by @jenniferlumeng.bsky.social
if you are a PhD student in AI, remember it is in your interests to distract your advisor from how much money they could be making in industry. should be a daily priority.
ok the thing about erdos is he obviously loved collaborating with humans. he could have done a lot on his own, but math was how he chose to connect. I'm not sure he would have been very into chatbots?
my new literary award cannot be won by a commercial frontier LLM because I will require that 10% of each submission is smut
I tried to make the theory work out but the computer devil kept lying to me (ChatGPT generated incorrect proofs)