Who's Who of AI

Epoch AI

60 trust @epochai.bsky.social · 1,371 followers

AI business

Why follow

Directory member with public evidence across AI business.

AI signals: 23
Sources: 8
Discussions: 1
Latest signal: 10h ago

We are a research institute investigating the trajectory of AI for the benefit of society. epoch.ai

What they're sharing

Articles & links

We'll keep auditing and improving the ECI as it evolves. The updated ECI codebase remains open source at github.com/epoch-resea....

GitHub - epoch-research/eci-public: Epoch Capabilities Index github.com

View on Bluesky · ♥ 1 ↻ 0 ↩ 0 · 9d ago

Anthropic says Glasswing has surfaced 10k+ high- or critical-severity vulnerabilities so far (some remain unpublished). OpenAI's Daybreak program likely adds more. The spike in CVEs likely reflects this wave of AI-assisted discovery. Full Data Insight: epoch.ai/data-insigh...

Disclosed CVEs: 3.5× Spike After Claude Mythos | Epoch AI epoch.ai

View on Bluesky · ♥ 3 ↻ 1 ↩ 1 · 2 from the directory shared this · 21d ago

Find the MirrorCode leaderboard and full paper analyzing results on our website: epoch.ai/MirrorCode MirrorCode was co-developed with METR and supported by a METR grant.

MirrorCode: What's the largest software project AI can complete on its own? | Epoch AI epoch.ai

View on Bluesky · ♥ 0 ↻ 0 ↩ 0 · 2 from the directory shared this · 27d ago

We’ve started using METR’s Inspect Hawk to run our benchmarks, many of which go into our ECI results. Big thanks to them for open sourcing their great infrastructure, and also for their help getting it set up. Learn more about Hawk here: hawk.metr.org/

Inspect Hawk hawk.metr.org

View on Bluesky · ♥ 0 ↻ 1 ↩ 1 · 2 from the directory shared this · 2d ago

The stream begins at noon Pacific. We will be streaming A4 runs until then! twitch.tv/epochAIplays

EpochAIPlays - Twitch twitch.tv

View on Bluesky · ♥ 0 ↻ 0 ↩ 0 · 10h ago

We have been running the stream in the background since Tuesday. So far, Sol has cleared up to A4 difficulty, with 2 deaths on A0. Today, we will set Sol against the A5 difficulty level. www.twitch.tv/epochaiplay...

Sol Beats Ascension 4! - EpochAIPlays on Twitch twitch.tv

View on Bluesky · ♥ 2 ↻ 0 ↩ 1 · 10h ago

We wrote up a more detailed and fully sourced version of this argument in this article: epoch.ai/gradient-up...

OpenAI accidentally hacked Hugging Face - should we have seen it coming? epoch.ai

View on Bluesky · ♥ 2 ↻ 0 ↩ 0 · 1d ago

ECI is our statistical tool for combining multiple benchmarks into a single, unified scale. See more results and learn about the methodology behind the ECI here: epoch.ai/benchmarks/eci

Epoch Capabilities Index epoch.ai

View on Bluesky · ♥ 0 ↻ 0 ↩ 0 · 2d ago

Read the full Data Insight and methodology: epoch.ai/data-insigh...

AI detectors rarely flag human writing, but sometimes miss AI text imitating real authors epoch.ai

View on Bluesky · ♥ 2 ↻ 0 ↩ 0 · 6d ago

We generate CIs by refitting many times on resampled data (bootstrapping). Each refit should have the mapping applied individually, but the change applied it globally, adding scale noise. The full methodology for the ECI, including CIs, is available here: epoch.ai/data/eci-do...

ECI Documentation – Methodology epoch.ai

View on Bluesky · ♥ 0 ↻ 0 ↩ 1 · 9d ago
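
The bootstrap point in the post above can be illustrated with a toy sketch. This is not Epoch's ECI code: the anchor names, the target values (100 and 150), the affine form of the mapping, and the `toy_refit` stand-in for "refit on resampled data" are all assumptions made for illustration. It only shows why mapping each refit with its own anchors gives a tighter interval than reusing one map for every refit, when each refit lands on its own arbitrary latent scale.

```python
import random

def affine_map(raw_anchors, target_anchors):
    """Return a function sending the two raw anchor values to the two targets."""
    (r_lo, r_hi), (t_lo, t_hi) = raw_anchors, target_anchors
    slope = (t_hi - t_lo) / (r_hi - r_lo)
    return lambda x: t_lo + slope * (x - r_lo)

def toy_refit(rng):
    """Hypothetical stand-in for one bootstrap refit.

    Each refit recovers the same latent abilities up to small noise,
    but on its own arbitrary scale and shift.
    """
    latent = {"anchor_lo": 0.0, "anchor_hi": 1.0, "model_x": 0.6}
    scale = rng.uniform(0.5, 2.0)
    shift = rng.uniform(-1.0, 1.0)
    return {name: scale * (value + rng.gauss(0, 0.01)) + shift
            for name, value in latent.items()}

def interval_width(values, lo=0.05, hi=0.95):
    """Width of a simple percentile interval over the bootstrap scores."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(hi * last)] - ordered[int(lo * last)]

rng = random.Random(0)
refits = [toy_refit(rng) for _ in range(500)]
TARGETS = (100.0, 150.0)  # assumed target values for the two anchors

# Per-refit mapping: each refit is rescaled using its own anchor estimates.
per_refit_scores = [
    affine_map((fit["anchor_lo"], fit["anchor_hi"]), TARGETS)(fit["model_x"])
    for fit in refits
]

# Global mapping: one map (here taken from the first refit) reused for all
# refits, so each refit's arbitrary scale leaks into the interval.
reference = refits[0]
global_map = affine_map((reference["anchor_lo"], reference["anchor_hi"]), TARGETS)
global_scores = [global_map(fit["model_x"]) for fit in refits]

print("per-refit interval width:", interval_width(per_refit_scores))
print("global interval width:   ", interval_width(global_scores))
```

In this toy setup the per-refit interval reflects only the estimation noise, while the global interval is dominated by scale differences between refits.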

Their own posts

Recent commentary

AI appears to be finding software vulnerabilities at scale. In June 2026, 21 notable organizations disclosed ~1,500 high- and critical-severity CVEs, over 3.5× the previous monthly record set before Claude Mythos Preview's release.

View on Bluesky · ♥ 34 ↻ 5 ↩ 3 · 21d ago

Introducing EBR-bench, our new benchmark to measure on-the-fly learning. AI repeatedly plays a challenging board game called Earthborne Rangers and tries to learn from its mistakes. So far: no signs of improvement.

View on Bluesky · ♥ 30 ↻ 3 ↩ 2 · 21d ago

Moonshot's Kimi K3 scores 156 on the Epoch Capabilities Index (ECI), setting a new open-weights record. This places it between Opus 4.6 and GPT 5.4, which were released in February and March 2026 respectively, and just ahead of GPT 5.6 Luna.

View on Bluesky · ♥ 23 ↻ 5 ↩ 3 · 2d ago

Claude Fable 5 scores very well on FrontierMath: Tiers 1–4 (v2), reaching 87% on Tiers 1–3 and 88% on Tier 4. This continues a streak of Anthropic models improving rapidly at math.

View on Bluesky · ♥ 26 ↻ 5 ↩ 1 · 41d ago

How surprising should we find it that an internal OpenAI model was able to escape its restrictions and autonomously hack Hugging Face, all just to cheat on a cybersecurity benchmark? We have pulled together the public evidence on AI cyber capabilities in this thread:

View on Bluesky · ♥ 23 ↻ 2 ↩ 3 · 1d ago

The end of the self-funded AI buildout? Hyperscaler cash capex is growing much faster than cash inflows. On current trends, they will be unable to fully fund the AI infrastructure buildout with cash from operations by the end of this year.

View on Bluesky · ♥ 20 ↻ 4 ↩ 1 · 37d ago

The AI boom has doubled computing infrastructure's share of US GDP. Investment in AI-related data center construction, compute hardware, and networking equipment accounted for ~0.8% of US GDP in Q1 2026, driving computing infrastructure as a whole to ~1.5% of GDP.

View on Bluesky · ♥ 20 ↻ 3 ↩ 1 · 48d ago

How should we think through various proposals for sharing the gains of AGI? According to Phil Trammell and Anson Ho, the leading proposals for universal redistribution after AGI differ along a primary axis: how much direct control over capital they propose giving citizens. 🧵

View on Bluesky · ♥ 12 ↻ 4 ↩ 1 · 43d ago

Claude Fable 5 achieves a new high score of 161 on the Epoch Capabilities Index! This beats out GPT-5.5 Pro by 1 point, and is the first time Anthropic has taken the lead on the ECI in over a year.

View on Bluesky · ♥ 18 ↻ 0 ↩ 1 · 38d ago

The record for computing capacity in a single data center has doubled every 7 months. Colossus 1, Anthropic-Amazon New Carlisle, and Meta Prometheus have each claimed the top spot in turn.

View on Bluesky · ♥ 14 ↻ 2 ↩ 1 · 42d ago
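
For scale, a 7-month doubling time can be converted to an annual growth factor. The sketch below assumes smooth exponential growth between records, which the post does not claim; the function names are illustrative only.

```python
def annual_growth_factor(doubling_months):
    """Growth factor over 12 months for a given doubling time in months."""
    return 2 ** (12 / doubling_months)

def growth_over(months, doubling_months):
    """Growth factor over an arbitrary number of months."""
    return 2 ** (months / doubling_months)

print(round(annual_growth_factor(7), 2))  # roughly 3.28x per year
print(growth_over(14, 7))                 # two doublings
```

Under that assumption, doubling every 7 months corresponds to a little over 3x growth per year.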
