nvidianews.nvidia.com via Reddit

NVIDIA Releases Cosmos 3, Open Model for Physical AI

6 sources tracking this story
nvidia robotics open source physical-ai open-source robotics

Key insights

  • Cosmos 3 natively outputs joint angles, gripper positions, and trajectory data, making it a direct robot policy training tool beyond synthetic video generation.
  • Axios frames Cosmos 3 as NVIDIA's move to own the physical AI platform layer and capture recurring value across robotics and autonomous vehicles beyond chip sales.
  • The Decoder finds no external benchmarks against Waymo or Tesla, leaving NVIDIA's autonomous driving performance claims unverified by independent third parties.

Why this matters

NVIDIA's Cosmos 3 release extends the company's value capture from chip hardware into foundational AI platform infrastructure, positioning it as the default model and compute provider for the physical AI category. The mixture-of-transformers architecture, combining vision reasoning, world generation, and action prediction in a single model, compresses robot training cycles from months to days by generating synthetic data for rare or dangerous scenarios. The Cosmos Coalition with Agile Robots, Runway, Black Forest Labs, and Skild AI, alongside named industrial commitments from LG Electronics, Samsung, Li Auto, and Doosan Robotics, signals production-level deployment rather than developer experimentation. Critics note that Cosmos 3's reasoning traces, intended to serve as safety documentation, may not accurately reflect internal network operations, a concern that carries weight as the model moves into autonomous driving and industrial automation.

Summary

NVIDIA released Cosmos 3 on May 31, an open foundation model for physical AI. Built on a mixture-of-transformers architecture, it processes text, images, video, ambient sound, and robot actions in one system, with NVIDIA claiming training cycles drop from months to days. The architecture pairs a reasoning module with expert generation components that understand object interactions and spatial-temporal relationships before producing video and action outputs. Essentially: (LG Electronics, Samsung Electronics, Li Auto) are named early adopters spanning robotics, autonomous vehicles, and vision AI. - Cosmos 3 Super (physics accuracy), Nano (fast reasoning), Edge (forthcoming, real-time inference). - Cosmos Coalition includes Agile Robots, Black Forest Labs, Generalist, LTX, Runway, Skild AI. An open model spanning sensing, reasoning, and physical action is a wider bet than prior open-source AI releases.

Potential risks and opportunities

Risks

  • LG Electronics and Samsung Electronics face liability exposure if Cosmos 3-powered robots cause physical harm before safety standards catch up to the model's shortened development cycles
  • Li Auto and other AV adopters risk regulatory scrutiny if Cosmos 3 world generation outputs contain spatial-temporal errors that propagate into autonomous vehicle decisions
  • Open distribution of a model generating robot action sequences creates dual-use risks if adversarial actors fine-tune Cosmos 3 Nano for physical systems outside intended parameters

Opportunities

  • Agile Robots and Skild AI, as Cosmos Coalition members, gain preferential access to model updates and can ship validated robotics products ahead of non-coalition competitors
  • Vision AI companies named as early adopters (Milestone Systems, Fogsphere, Linker Vision) can consolidate on Cosmos 3 as a shared perception backbone, reducing parallel model maintenance costs
  • Black Forest Labs, LTX, and Runway gain NVIDIA ecosystem exposure as coalition partners, positioning their generation platforms at the intersection of media AI and physical AI tooling

What we don't know yet

  • Whether Cosmos 3 Edge has a confirmed release timeline beyond 'forthcoming,' and which edge hardware platforms it targets
  • What commercial licensing terms govern Cosmos 3 model weights, given that 'open-source' covers a wide range of actual usage rights
  • How Cosmos Coalition members Generalist and LTX contribute technically versus serving primarily as adoption validators for the announcement

What others are reporting

Coverage cluster as of 24h after publish

  1. Frames Cosmos 3 as NVIDIA's strategic move beyond chips into the foundational platform layer for physical AI, capturing recurring value across robotics and AVs.

    World models want to help physical agents become more generalizable by understanding how the world works. - Ming-Yu Liu, NVIDIA VP
  2. NVIDIA Blog Read →

    First-party technical deep-dive detailing the mixture-of-transformers reasoning and generation blocks, with real-world validation across VANTAGE-Bench, TAR, Physics-IQ, R-Bench, and PAI-Bench.

  3. GlobeNewswire Read →

    Official wire with full technical specs: model variant breakdown (Super, Nano, Edge), benchmark rankings across five leaderboards, and named cloud deployment partners including Azure.

    The big bang of physical AI is just around the corner thanks to breakthroughs in multimodal reasoning language, vision and world models. - Jensen Huang
  4. The Decoder Read →

    Critical take questioning whether Cosmos 3's reasoning traces accurately reflect internal operations, with noted absence of external benchmarks against Waymo or Tesla.

    Cosmos 3 generates photorealistic video sequences of rare situations like near-misses or unusual object arrangements in a warehouse.
  5. Interesting Engineering Read →

    Adds TSMC semiconductor manufacturing use case and names Stanford, ETH Zurich, and UC San Diego as GR00T humanoid robot adopters, framing the launch as a full-stack industrial play.