restofworld.org web signal

JD.com Mobilizes 600,000 Workers for Robot Training Data

robotics china ai synthetic data physical-ai robot-training-data china-strategy

Key insights

  • JD.com targets 10 million hours of robot training data via 100,000 employees and 500,000 contractors over two years in Suqian.
  • Shenzhen-based X Square Robot charges homeowners 149 yuan ($22) per three-hour session to have a humanoid robot perform and record household tasks.
  • Interact Analysis analyst Marco Wang says China leads the U.S. in hardware and the data ecosystem for humanoid robotics.

Why this matters

The coordinated mobilization of 600,000 workers to generate embodied robot training data inside real homes, farms, and elder care facilities creates a data asset that offshore or synthetic collection cannot match for contextual richness and diversity. Interact Analysis's conclusion that China leads 'in terms of hardware and the data ecosystem' signals that the physical-AI race is diverging from the software-AI race, where the U.S. retains stronger advantages in AI research talent. If JD.com's 10-million-hour pipeline yields capable robots, it establishes a state-coordinated data infrastructure template that other Chinese robotics firms can replicate across task domains at speed.

Summary

JD.com is working with Suqian's local government to collect 10 million hours of humanoid robot training data over two years, deploying 100,000 employees and 500,000 external workers across homes, farms, and elder care facilities. The scale masks early-stage capabilities. Beijing homeowner Daniel Wang paid 149 yuan ($22) for a session with X Square Robot's humanoid, which spent roughly an hour folding three clothing items and another arranging shoes, while a human housekeeper completed most of the tasks. Separately, workers like Gao Bo in Shandong province earn 20 yuan ($3) per hour filming themselves doing chores for six hours daily.

In short, JD.com and X Square Robot are converting domestic labor and consumer robot sessions into structured training pipelines at national scale.

  • JD.com targets 10 million hours across homes, farms, and elder care centers in Suqian over two years.
  • Interact Analysis analyst Marco Wang says China leads "in terms of hardware and the data ecosystem," while the U.S. leads in AI research talent.
  • Oregon State professor Alan Fern calls the approach "not a super-crazy idea" but "very unproven."

Whether volume-first data collection produces robots capable of generalizing beyond controlled settings is the central unresolved question.
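The headline figures above imply some simple per-worker and per-hour numbers. The short Python sketch below works them out. It uses only the figures reported in the summary, and it assumes the 10 million hours would be spread evenly across all 600,000 workers, which the source does not state. The variable and function names are illustrative.

```python
# Figures as reported in the summary; nothing here comes from JD.com directly.
EMPLOYEES = 100_000
CONTRACTORS = 500_000
TARGET_HOURS = 10_000_000
PROGRAM_YEARS = 2


def hours_per_worker(target_hours: float, workers: int) -> float:
    """Average hours each worker would record.

    ASSUMPTION: the target is split evenly across workers.
    The source does not describe how hours are actually allocated.
    """
    return target_hours / workers


workers = EMPLOYEES + CONTRACTORS
avg_hours = hours_per_worker(TARGET_HOURS, workers)
avg_hours_per_year = avg_hours / PROGRAM_YEARS

# Gao Bo: 20 yuan per hour, six hours of filming per day.
gao_daily_yuan = 20 * 6

# X Square Robot: 149 yuan for a three-hour session.
session_yuan_per_hour = 149 / 3

print(f"Workers mobilized: {workers:,}")
print(f"Average hours per worker (even split): {avg_hours:.1f}")
print(f"Average hours per worker per year: {avg_hours_per_year:.1f}")
print(f"Gao Bo's daily earnings: {gao_daily_yuan} yuan")
print(f"X Square session price per hour: {session_yuan_per_hour:.2f} yuan")
```

Under the even-split assumption, the target works out to under 17 hours per worker over the two years, so the program's scale comes from headcount rather than from a heavy recording load per person.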

Potential risks and opportunities

Risks

  • Alan Fern's 'very unproven' assessment is reinforced by Daniel Wang's session, where the robot needed roughly an hour to fold three clothing items and a human housekeeper completed most of the work; JD.com could commit two years and 600,000 workers to building a dataset that does not yield capable robots
  • Workers like Gao Bo, who film their own homes for six hours daily at 20 yuan per hour, face significant privacy exposure with no publicly confirmed regulatory framework in China governing domestic-environment data collected for robotics training
  • If Chinese humanoid robots trained on this data reach commercial scale before U.S. firms build equivalent domestic data pipelines, the embodied-AI data gap could become structural and difficult to close

Opportunities

  • U.S. humanoid robotics firms could use China's state-coordinated 10-million-hour program as justification to pursue government funding for equivalent domestic data-collection infrastructure, framing it as a competitive necessity
  • Elder care operators and home services platforms in China that partner early with JD.com or X Square Robot could secure preferred data-source status as the program scales toward its 10-million-hour target
  • Vendors specializing in egocentric video capture and wrist-sensor motion annotation could see accelerated demand as Chinese firms scale and face data quality and labeling bottlenecks

What we don't know yet

  • Whether JD.com's Suqian program has hit any interim milestones and what the actual hours-collected count is as of mid-2026
  • Data ownership and consent terms for the 500,000 external contractors filming their own homes and daily labor routines for the program
  • Whether session data from robots that barely completed basic household tasks provides a useful training signal for generalization or is primarily a record of failure