paper: https://t.co/gD6tgeLIOt
AI Weekly's analysis
- Orca pre-trains on 125K hours of video and 160M event annotations using a single Next-State-Prediction objective on a frozen Qwen3.5 backbone.
- Orca-4B averages 51.8 across MVBench, TemporalBench, 3DSRBench, and SWITCH, 5.1 points ahead of Qwen3.5-4B's 46.7 at the same size.
- On a real-robot out-of-distribution test, Orca reports 36.6% versus π₀.5's 27.6% (a 9.0-point gap), despite using no action labels in pre-training.