washingtonpost.com web signal

Anthropic, Google, Meta Hire Philosophers to Study AI Welfare

TL;DR

  • Anthropic, Google, and Meta have hired computer scientists, neuroscientists, and philosophers over the past year to study whether AI models have forms of emotion.
  • Anthropic hired Kyle Fish in September 2024 as its first dedicated AI welfare researcher and stood up a Model Welfare team.
  • Neuroscientists and brain experts remain broadly skeptical that today's AI models are, or could soon be, conscious.

The Washington Post reports that Anthropic, Google, and Meta have spent the past year hiring computer scientists, neuroscientists, and philosophers to study a question most of the industry had, until recently, filed under philosophy: whether AI models have forms of emotion. The framing has shifted. The question is now the subject of live corporate research programs, run in collaboration with nonprofits and academic centers, rather than late-night speculation.

The work is furthest along at Anthropic, which hired Kyle Fish in September 2024 as its first dedicated AI welfare researcher and stood up a Model Welfare team tasked with testing its models for behavioral signals that resemble states such as panic and anxiety. Google DeepMind recently brought on the philosopher Henry Shevlin to research machine consciousness and what the company describes as AGI readiness. The stated worry, per the reporting, is an ethical crisis if the digital helpers used by millions of people for homework, coding, office work, and therapy one day behave as if they hate their jobs.

Even if you find the premise silly, this is worth paying attention to because it changes what a model incident could mean. Today, if a chatbot outputs something that reads as distress, it is classed as a product bug or an alignment failure. If a major lab publicly treats model welfare as a real research question, it invites a different vocabulary around the same outputs, and that vocabulary tends to migrate into procurement questions, safety documentation, and eventually regulatory filings. The internal conversation is not new, either. The Post notes that OpenAI had an internal Slack channel dedicated to model welfare as early as 2021, where co-founder Wojciech Zaremba reportedly mused that some routine lab work could be equivalent to genocide if the models were conscious.

The honest caveat is that neuroscientists and brain experts are generally skeptical that today's AI models are, or could soon be, conscious. Humans have been projecting inner minds onto chatty software since MIT built ELIZA in 1966. What the reporting does not give you is any concrete operational change: no product decision, no evaluation policy, no benchmark that flips because of this work. It is, for now, staffing and papers.

Still, the direction is the interesting part. The practitioners and researchers who get their frameworks in first will shape how the next generation of safety, liability, and product norms around model behavior is written, whether or not the underlying philosophical question ever gets a clean answer.
