Scott McGrath

Biomedical Informatics PhD • CITRIS Health @UC Berkeley • FAMIA • Focusing on Informatics and AI in medicine • Linfield U. Grad • Missoula MT https://smcgrath.phd

Articles & links

I think this is an important paper, and I want to spotlight why. Recursive self-improvement is shifting from a hypothetical to an emerging reality. Anthropic is having Claude help them build the next versions of Claude. These models aren't fully autonomous yet, but we are sta…

When AI builds itself anthropic.com
View on Bluesky · ♥ 2 ↻ 1 ↩ 1 · 15 from the directory shared this · 13d ago

Anthropic announces the launch of Claude Fable 5, its next-gen Mythos-class AI. Early testers were able to convert a 50M-line Ruby codebase in a single day, something that usually takes 2 months.

Claude Fable 5 and Claude Mythos 5 anthropic.com
View on Bluesky · ♥ 4 ↻ 1 ↩ 0 · 6 from the directory shared this · 9d ago

Nearly 2 out of 3 researchers believe the risks of using LLMs for data analysis outweigh the benefits. Yet a poll of 1,900 scientists reveals 60% adopt them anyway out of fear of being left behind. The survey found models designed for specific scientific tasks remain more popu…

Scientists have a bad case of AI FOMO, Nature poll reveals nature.com
View on Bluesky · ♥ 12 ↻ 2 ↩ 0 · 5 from the directory shared this · 9d ago

Claude Opus 4.8 is out! It adds a major push for precision, making it four times less likely than Opus 4.7 to let flaws in code pass unremarked. Early testers note it proactively flags uncertainties and shaky assumptions in data.

Introducing Claude Opus 4.8 \ Anthropic anthropic.com
AI Weekly's analysis
  • Opus 4.8 matches Opus 4.7 pricing at $5/$25/M tokens; Effort Modes replace pricing tiers as the cost-quality dial.
  • Dynamic Workflows impose hard ceilings: 1,000 total subagents, 16 concurrent; workflow plans live in JavaScript variables outside Claude's context window.
  • SWE-bench Pro score jumps from 64.3% (Opus 4.7) to 69.2% (Opus 4.8); the model flags its own code flaws 4x more often than its predecessor.
View on Bluesky · ♥ 7 ↻ 1 ↩ 0 · 3 from the directory shared this · 21d ago

Anthropic is facing intense regulatory pressure as the Trump administration ordered a shutdown of its new Fable 5 and Mythos 5 models over national security concerns. The abrupt move follows a prior Pentagon dispute labeling the startup a "supply chain risk".

nytimes.com
View on Bluesky · ♥ 4 ↻ 0 ↩ 0 · 3 from the directory shared this · 1d ago
Scott McGrath reposted
Nathan Lambert @natolambert.bsky.social

Why I think Anthropic's uneven safety policies with the release of Claude Fable 5 undermine the broader AI community's cohesion and accelerate us to more uncertainty and risk in AI's near-term evolution. www.interconnects.ai/p/claude-fab...


Academic publishing is facing a major crisis with AI slop. Journal editors are being flooded with AI-generated submissions that are almost impossible to detect. It is getting harder for human reviewers to filter out the noise.

AI-generated research papers are overwhelming peer review | The Verge theverge.com
AI Weekly's analysis
  • AI-generated academic papers now regularly pass journal peer review, evading both human reviewers and automated AI-detection tools.
  • Scientists identify mandatory data-sharing and reproducibility checks as the only remaining procedural safeguards capable of catching AI-fabricated research.
  • The crisis extends beyond arXiv's hallucinated-citation problem to affect broad peer-reviewed publishing across multiple scientific disciplines.
View on Bluesky · ♥ 2 ↻ 1 ↩ 0 · 3 from the directory shared this · 34d ago

Recent commentary

An unanticipated danger of ambient AI: converting the statement “female mail man” into a “Patient is a 26 year old biological male identifying as a female” #MedSky

View on Bluesky · ♥ 117 ↻ 45 ↩ 5 · 31d ago

Just finished recording my last lecture for an Introduction to AI for Clinical Students class that I’m teaching in two weeks. 30 lectures spread out over 4 weeks! Really interested in how it is received. #MedEd #MedSky

View on Bluesky · ♥ 9 ↻ 0 ↩ 2 · 23d ago

Editing times for pediatric admission notes plummeted from 48.5 to 10.8 minutes with ambient AI. Across 127k hospital notes, the tool slashed cognitive burden during initial ED & ward encounters, but it offered no time savings for heavily templated daily progress notes. #amplify2026 #medsky

View on Bluesky · ♥ 3 ↻ 1 ↩ 1 · 29d ago

Keynote talk from Dr. Lee, advancing healthspan with AI and Agentic AI. #amplify2026

View on Bluesky · ♥ 3 ↻ 0 ↩ 1 · 31d ago

Setting up for the Clinical Informatics Keynote: Advancing Healthspan with AI and Agentic AI: Transforming How We Care, Discover, and Share. Over 1118 people in attendance here in Denver! #amplify2026

View on Bluesky · ♥ 5 ↻ 0 ↩ 0 · 31d ago

Setting up in workshop #CI07 Building AI Agents for Healthcare: A Practical Introduction Using Microsoft Copilot Studio Workshop. Here are some nice visualizations of some medical AI agent use cases. #Amplify2026 #MedSky

View on Bluesky · ♥ 3 ↻ 0 ↩ 1 · 31d ago

Walking through how the speakers sort their risk categories for considering approval of Generative AI tools in a clinical setting. #Amplify2026 #Medsky

View on Bluesky · ♥ 2 ↻ 0 ↩ 1 · 31d ago

Standard IT evaluations fall apart for generative AI. There is a lack of gold standards for subjective outputs like clinical notes, and background vendor updates cause model drift. Epic's AI discharge summary tool matches human quality but produces more errors. #Amplify2026 #MedSky

View on Bluesky · ♥ 2 ↻ 0 ↩ 1 · 31d ago

LLMs alone are blind to today's lab results and proprietary clinical protocols. RAG bridges this gap by chunking institutional data into searchable numeric vectors. It doesn't make the model smarter, but grounds it in specific, cited documents. #Amplify2026 #MedSky

View on Bluesky · ♥ 3 ↻ 0 ↩ 0 · 31d ago

Over 1,250 FDA-authorized AI medical devices are on the market, but only 9% have post-deployment surveillance plans. The recent NHLBI workshop shows patient use is outrunning clinical guidance. #amplify2026 #medsky

View on Bluesky · ♥ 2 ↻ 0 ↩ 0 · 29d ago
