Fast LeWorldModel Cuts Robot Planning Time 48%, Gains Accuracy
TL;DR
- Fast-LeWM cuts CEM planning time from 54.4 to 28.3 seconds (48%) and reduces model calls per cycle from 55 to 11.
- Average task success rises from 85.8% to 90.5% across four simulated environments, reaching 92.0% with an optional self-consistency mechanism.
- The model uses 17.9M parameters, nearly identical to LeWM's 18.0M, achieving its efficiency gains without increasing model size.
The standard approach for planning with JEPA-based visual world models is computationally costly: to evaluate a candidate action sequence, the original LeWorldModel steps through each state one at a time in an autoregressive loop. A new paper on arXiv from Yuntian Gao and Xiangyu Xu proposes a cleaner alternative. Their Fast LeWorldModel (Fast-LeWM) replaces that sequential rollout with action-prefix prediction, encoding action prefixes and predicting future latent states in parallel rather than step by step.
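To make the contrast concrete, here is a minimal toy sketch of the two evaluation styles. It is not the authors' implementation: the scalar linear dynamics, the coefficients `A` and `B`, and the names `step_model`, `prefix_model`, and `CallCounter` are all assumptions made for illustration, and a closed-form expression stands in for what would be a learned network. It shows only the interface difference: sequential rollout calls the model once per future step, while a prefix predictor returns every future latent from a single call.

```python
# Toy contrast between autoregressive rollout and action-prefix prediction.
# The scalar linear dynamics are a hypothetical stand-in, NOT the Fast-LeWM
# architecture; they exist only so both paths can be checked against each other.

A, B = 0.9, 0.5  # hypothetical dynamics coefficients


class CallCounter:
    """Counts how many times a 'model' is invoked."""

    def __init__(self):
        self.calls = 0


def step_model(z, action, counter):
    """One-step predictor: next latent from the current latent and one action."""
    counter.calls += 1
    return A * z + B * action


def rollout_sequential(z0, actions, counter):
    """Autoregressive rollout: each prediction is fed back in as the next input."""
    latents, z = [], z0
    for a in actions:
        z = step_model(z, a, counter)
        latents.append(z)
    return latents


def prefix_model(z0, actions, counter):
    """Prefix predictor: one call maps (z0, every action prefix) to every
    future latent. Each output depends only on z0 and its own prefix,
    never on a previously predicted latent."""
    counter.calls += 1
    latents = []
    for k in range(1, len(actions) + 1):
        prefix = actions[:k]
        z_k = (A ** k) * z0 + sum(
            (A ** (k - 1 - j)) * B * prefix[j] for j in range(k)
        )
        latents.append(z_k)
    return latents


seq_counter, pre_counter = CallCounter(), CallCounter()
actions = [1.0, -0.5, 0.25, 0.0, 2.0]  # one candidate action sequence
seq = rollout_sequential(0.3, actions, seq_counter)
pre = prefix_model(0.3, actions, pre_counter)

print(seq_counter.calls, pre_counter.calls)  # 5 1
```

In this toy both paths return the same latents, because the dynamics are exactly linear. With a learned model the two would generally differ, and the paper's claim is that the prefix formulation accumulates less error.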
The efficiency numbers are concrete. Fast-LeWM reduces dynamics-evaluation time from 31.4 seconds to 8.0 seconds and CEM planning solve time from 54.4 seconds to 28.3 seconds, a 48% reduction. Model calls per planning cycle fall from 55 to 11, a fivefold reduction. The approach does this without enlarging the model: at 17.9 million parameters, Fast-LeWM is marginally smaller than the 18.0 million of the baseline LeWM checkpoint.
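The reported figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the helper name `pct_reduction` is an illustrative choice, not something from the paper.

```python
def pct_reduction(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (before - after) / before


cem = pct_reduction(54.4, 28.3)     # CEM planning solve time, seconds
dyn = pct_reduction(31.4, 8.0)      # dynamics-evaluation time, seconds
params = pct_reduction(18.0, 17.9)  # parameter count, millions
calls_ratio = 55 / 11               # model calls per planning cycle

print(f"CEM solve time:      {cem:.1f}% lower")    # 48.0% lower
print(f"Dynamics evaluation: {dyn:.1f}% lower")    # 74.5% lower
print(f"Parameters:          {params:.1f}% lower")  # 0.6% lower
print(f"Model calls:         {calls_ratio:.0f}x fewer")  # 5x fewer
```

The dynamics-evaluation step shrinks by roughly three quarters, while the full CEM solve shrinks by about half, which suggests that other parts of the planning loop account for a growing share of the remaining time.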
What makes the result less routine is that accuracy improves alongside speed. Across four simulated environments (Two-Room, Reacher, PushT, and OGBench-Cube), average task success climbs from a baseline of 85.8% to 90.5%, a gain of 4.7 percentage points, with an optional self-consistency mechanism pushing that to 92.0%. The authors attribute this partly to lower error accumulation: prefix-based prediction substantially lowers open-loop prediction error and its growth over long horizons, because single-step errors do not compound the way they do in sequential rollout.
The main caveat is that all results come from simulation, and absolute planning times, even at 28.3 seconds for the full CEM solve, remain well above the sub-second latency most real-time robot control applications require. The paper also does not report how the method performs on longer-horizon tasks or on physical hardware, and the publicly available material does not detail the computational overhead of the self-consistency variant.
For teams already building on JEPA architectures, the near-identical parameter count is the practically relevant detail: an algorithmic upgrade that does not require retraining at larger scale is much easier to adopt. The direction, faster and more accurate planning from the same model size, is worth watching if these gains carry into hardware and more complex tasks.
Original headline: Fast LeWorldModel Cuts JEPA Robot Planning Time 48% and Reduces Model Calls 5×