I think this is an important paper, and I want to spotlight why. Recursive self-improvement is shifting from a hypothetical to an emerging reality. Anthropic is having Claude help them build the next versions of Claude. These models aren't fully autonomous yet, but we are sta…
Scott McGrath
Articles & links
US intelligence agencies are shifting surveillance toward a broad new domestic threat category labeled "anti-tech violent extremism." Over 1,000 pages of unpublished reports from the FBI and DHS indicate a coordinated effort to monitor data center protests and digital advocacy.
Microsoft disabled 73 of its own GitHub repositories following a major data breach. Hackers pushed a malicious commit to Azure and AI agent tools to harvest developer credentials through platforms like Claude Code and Gemini CLI.
I think this will be looked back on as an artifact of the peak of the current AI hype cycle. We are already seeing sticker shock from unsubsidized AI costs, and I think people will be incredulous when they find out there were internal competitions to see who could use t…
Interesting perspective on one student's experience of spending their entire time in college in the AI era. #AcademicSky
Anthropic announces the launch of Claude Fable 5, its next-gen Mythos-class AI. Early testers were able to convert a 50M-line Ruby codebase in a single day, something that usually takes 2 months.
GitHub Copilot is moving from flat-rate request tiers to usage-based billing, sparking widespread sticker shock among power users. Spot tests show complex prompts or large chat histories can burn through a monthly credit allotment in a single day.
Nearly 2 out of 3 researchers believe the risks of using LLMs for data analysis outweigh the benefits. Yet a poll of 1,900 scientists reveals 60% adopt them anyway out of fear of being left behind. The survey found models designed for specific scientific tasks remain more popu…
Claude Opus 4.8 is out! It adds a major push for precision, making it four times less likely than Opus 4.7 to let flaws in code pass unremarked. Early testers note it proactively flags uncertainties and shaky assumptions in data.
- Opus 4.8 matches Opus 4.7 pricing at $5/$25/M tokens; Effort Modes replace pricing tiers as the cost-quality dial.
- Dynamic Workflows impose hard ceilings: 1,000 total subagents, 16 concurrent; workflow plans live in JavaScript variables outside Claude's context window.
- SWE-bench Pro score jumps from 64.3% (Opus 4.7) to 69.2% (Opus 4.8); the model flags its own code flaws 4x more often than its predecessor.
Anthropic is facing intense regulatory pressure as the Trump administration ordered a shutdown of its new Fable 5 and Mythos 5 models over national security concerns. The abrupt move follows a prior Pentagon dispute labeling the startup a "supply chain risk".
Academic publishing is facing a major crisis with AI slop. Journal editors are being flooded with AI-generated submissions that are almost impossible to detect. It is getting harder for human reviewers to filter out the noise.
- AI-generated academic papers now regularly pass journal peer review, evading both human reviewers and automated AI-detection tools.
- Scientists identify mandatory data-sharing and reproducibility checks as the only remaining procedural safeguards capable of catching AI-fabricated research.
- The crisis extends beyond arXiv's hallucinated-citation problem to affect broad peer-reviewed publishing across multiple scientific disciplines.
Recent commentary
An unanticipated danger of ambient AI: converting the statement “female mail man” into “Patient is a 26-year-old biological male identifying as a female” #MedSky
Just finished recording my last lecture for an Introduction to AI for Clinical Students class that I’m teaching in two weeks. 30 lectures spread out over 4 weeks! Really interested in how it is received. #MedEd #MedSky
Editing times for pediatric admission notes plummeted from 48.5 to 10.8 minutes with ambient AI. Across 127k hospital notes, the tool slashed cognitive burden during initial ED & ward encounters, but it offered no time savings for heavily templated daily progress notes. #amplify2026 #medsky
Keynote talk from Dr. Lee, advancing healthspan with AI and Agentic AI. #amplify2026
Setting up for the Clinical Informatics Keynote: Advancing Healthspan with AI and Agentic AI: Transforming How We Care, Discover, and Share. Over 1118 people in attendance here in Denver! #amplify2026
Setting up in workshop #CI07 Building AI Agents for Healthcare: A Practical Introduction Using Microsoft Copilot Studio Workshop. Here are some nice visualizations of some medical AI agent use cases. #Amplify2026 #MedSky
Walking through how the speakers sort their risk categories for considering approval of Generative AI tools in a clinical setting. #Amplify2026 #Medsky
Standard IT evaluations fall apart for generative AI. There is a lack of gold standards for subjective outputs like clinical notes, and background vendor updates cause model drift. Epic's AI discharge summary tool matches human quality but produces more errors. #Amplify2026 #MedSky
LLMs alone are blind to today's lab results and proprietary clinical protocols. RAG bridges this gap by chunking institutional data into searchable numeric vectors. It doesn't make the model smarter, but grounds it in specific, cited documents. #Amplify2026 #MedSky
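A minimal sketch of the retrieval step described above, using only the Python standard library. Everything here is an assumption for illustration: the two documents are made up (not clinical guidance), the file names are hypothetical, and the word-count `embed` function stands in for a real embedding model. It shows the shape of the idea: chunk the institutional text, turn each chunk into a numeric vector, rank chunks against the question, and hand the model the best match with its source attached.

```python
import math
import re
from collections import Counter

# Hypothetical institutional documents (made up for illustration only).
DOCS = {
    "sepsis_protocol.txt": "Start broad spectrum antibiotics within one hour of sepsis recognition.",
    "potassium_policy.txt": "Replace potassium when serum potassium is below 3.5 and recheck in four hours.",
}

def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text):
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Build the index: one entry per chunk, keeping the source for citation.
INDEX = [
    {"source": name, "text": piece, "vector": embed(piece)}
    for name, body in DOCS.items()
    for piece in chunk(body)
]

def retrieve(question, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item["vector"]), reverse=True)
    return ranked[:k]

def build_prompt(question, k=1):
    """Ground the question in retrieved, cited chunks before it reaches the LLM."""
    context = "\n".join(f"[{hit['source']}] {hit['text']}" for hit in retrieve(question, k))
    return f"Answer using only the cited context.\n{context}\nQuestion: {question}"

question = "When should potassium be replaced?"
top = retrieve(question)[0]
prompt = build_prompt(question)
print(prompt)
```

The model itself is untouched. Only the prompt changes, which is why retrieval grounds the answer without making the model any smarter.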
Over 1,250 FDA-authorized AI medical devices are on the market, but only 9% have post-deployment surveillance plans. The recent NHLBI workshop shows patient use is outrunning clinical guidance. #amplify2026 #medsky