paper web signal June 25th 2026

Ilia Larchenko Tops 62 Teams in ICRA 2026 Garment-Folding Sim

TL;DR

Ilia Larchenko placed 1st of 62 teams in ICRA 2026's LeHome garment-folding simulation round with a 79.63% overall success rate.
The policy network predicts both robot actions and its own value signals (success probability, progress, task-relevant futures) from the same weights, enabling RL training.
The system dropped to 2nd place in the real-world final; the paper does not explain what drove the sim-to-real gap.

Garment folding remains one of the hardest things to teach robots reliably. Cloth is deformable, unpredictable, and punishing for imitation-learned policies that break down whenever the world drifts from the training distribution. So the leading result among 62 teams in the LeHome Challenge at ICRA 2026, described in a solo-authored paper by Ilia Larchenko, is worth a close look for what it actually did differently.

The core design move is to make the policy its own value function. The same network that predicts robot actions also predicts success probability, task progress, and a handful of task-relevant future quantities. Those auxiliary predictions feed directly into the RL training loop, providing advantage estimates and failure signals that turn a brittle imitation base into something more robust. The architecture runs on frozen pretrained components: a SigLIP-So400m/14 image encoder, a Gemma-2B prefix transformer, and a Gemma-300M action expert that generates 30-step action chunks via flow-matching. The RL layer combines AWR (advantage-weighted regression) and RECAP, both adapted for flow-matching vision-language-action (VLA) models. Inference-time hyperparameters are selected via Thompson sampling. Sim-to-real transfer relies on heavy augmentation alongside a DAgger-style loop that loads saved failure and semi-success states for human correction. Distributed training runs through HuggingFace Hub rather than a private cluster.
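
To make the self-value-function idea concrete, here is a minimal sketch of how a policy's own success prediction could supply the advantage in a generic advantage-weighted regression update. It is not the paper's implementation: the function names (awr_weights, weighted_imitation_loss), the temperature and clipping values, and the use of a binary episode outcome as the return are all assumptions for illustration, and the flow-matching and RECAP adaptations are left out.

```python
import math


def awr_weights(returns, predicted_values, temperature=1.0, max_weight=20.0):
    """Generic AWR weights: exp(advantage / temperature), clipped at max_weight.

    `predicted_values` stands in for the policy's own value head
    (e.g. its predicted success probability); this is an assumption.
    """
    log_cap = math.log(max_weight)
    weights = []
    for ret, value in zip(returns, predicted_values):
        advantage = ret - value  # how much better the outcome was than predicted
        scaled = advantage / temperature
        # Clip before exponentiating so large advantages cannot overflow.
        weights.append(max_weight if scaled >= log_cap else math.exp(scaled))
    return weights


def weighted_imitation_loss(per_sample_losses, weights):
    """Mean of per-sample imitation losses, each scaled by its AWR weight."""
    total = sum(w * loss for w, loss in zip(weights, per_sample_losses))
    return total / len(per_sample_losses)


# Hypothetical batch: one successful episode (return 1) and one failure (return 0),
# with the policy's own success predictions used as the baseline.
example_weights = awr_weights([1.0, 0.0], [0.4, 0.6], temperature=0.5)
example_loss = weighted_imitation_loss([2.0, 4.0], example_weights)
print(example_weights, example_loss)
```

In this sketch, samples that turned out better than the policy expected are upweighted in the imitation loss and samples that turned out worse are downweighted, which is the basic mechanism by which a value estimate can steer an imitation-learned policy toward its more successful behavior.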

The honest caveat is visible in the competition outcome itself. Larchenko placed first among 62 teams in the online simulation round with a 79.63% overall success rate, 6.1 points ahead of second place, but finished second in the real-world final. The paper does not detail what drove that gap, which remains an open question for anyone hoping to transfer this approach to physical hardware.

What makes the result encouraging is how little of it is built from scratch. The paper frames the work explicitly as a recombination of existing RL concepts with engineering contributions rather than new theory. A single researcher assembling known building blocks into a competition-leading pipeline, distributed through HuggingFace Hub, suggests the practical ceiling for small teams on hard manipulation problems may be higher than recent big-lab framing implies. The self-value-function pattern is not specific to garment folding either, and could plausibly carry over to other deformable-object tasks such as cable routing or laundry sorting.

Originally reported by paper


Original headline: Solo Researcher Wins ICRA 2026 Garment-Folding Competition 1st of 62 Teams With VLA+RL Hybrid