NVIDIA Ships First Vera Rubin AI Platform Samples

By Alexis Dufresne. Published June 1, 2026 at 05:06 UTC. Updated June 2, 2026 at 05:17 UTC.

Key insights

The NVL72 requires one-quarter the GPU count of an equivalent Blackwell system for MoE workloads, so the 10x cost-per-token improvement comes from architectural efficiency, not raw FLOPS scaling.
NVIDIA ships the BlueField-4 DPU with integrated SSD for KV cache offload, treating inference memory management as a hardware-solved problem rather than a software optimization.
Dell, HPE, Supermicro, and Lenovo are already committed to Vera Rubin server builds, compressing the window between engineering samples and hyperscaler-ready hardware availability.
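
To illustrate why key-value (KV) cache offload to a DPU-attached SSD is attractive, the sketch below applies the standard KV cache sizing formula to a hypothetical model. Only the 288 GB HBM4 figure comes from the reported specifications; the model shape (80 layers, 8 KV heads, head dimension 128), the 128,000-token context, the batch of 32, and the function name are illustrative assumptions, not anything NVIDIA has published.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Approximate KV cache size for a decoder-only transformer.

    The leading factor of 2 accounts for storing one key tensor and one
    value tensor per layer. bytes_per_elem=2 assumes FP16/BF16 cache entries.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


HBM4_PER_GPU_GB = 288  # per-GPU HBM4 capacity as reported for Rubin GPUs

# Hypothetical model and serving load (assumptions for illustration only).
size_bytes = kv_cache_bytes(
    layers=80, kv_heads=8, head_dim=128, seq_len=128_000, batch=32
)
size_gb = size_bytes / 1e9

print(f"KV cache: {size_gb:.1f} GB")
print(f"Equivalent HBM4 capacity: {size_gb / HBM4_PER_GPU_GB:.2f} GPUs")
```

Under these assumed numbers the cache alone exceeds a single GPU's HBM4 several times over, before counting model weights, which is the kind of pressure a slower but larger storage tier is meant to relieve.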

Why this matters

The move of the Vera Rubin VR200 NVL72 from roadmap to physical engineering samples marks the point at which hyperscalers (AWS, Google Cloud, Microsoft, Oracle, CoreWeave) and AI labs (OpenAI, Anthropic, Meta) can begin silicon validation cycles ahead of volume availability in H2 2026. At 50 PFLOPS FP4 and up to one-tenth the inference cost per token of Blackwell, the platform shifts the economics of large-scale model inference. NVIDIA ships the NVL72 as a fully integrated six-chip system backed by more than 350 supply chain partners across 30 countries, with Dell, HPE, Supermicro, and Lenovo already committed to server builds. Roughly 150 of that supply chain's factories are in Taiwan, which makes geopolitical exposure a named risk to the production ramp as volume delivery approaches.
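
The Taiwan concentration figure can be put in proportion with a quick calculation. This is a rough sketch: it assumes the roughly 150 Taiwan factories and the 350 supply chain partners are comparable units, which the reporting does not confirm, and because the partner count is given as "over 350" the result is best read as an upper bound. The function and constant names are illustrative.

```python
TAIWAN_FACTORIES = 150   # approximate figure from the reporting
TOTAL_PARTNERS = 350     # reported as "over 350", so treat as a lower bound


def concentration_share(part, whole):
    """Return part/whole as a fraction, rejecting a non-positive total."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


share = concentration_share(TAIWAN_FACTORIES, TOTAL_PARTNERS)
print(f"Taiwan share of supply chain sites: {share:.1%}")

# Sensitivity check: a larger (hypothetical) total lowers the share.
print(f"With 400 sites: {concentration_share(TAIWAN_FACTORIES, 400):.1%}")
```

On these figures, somewhat more than two in five supply chain sites sit in one geography, which is why the concentration is flagged as a ramp risk rather than a footnote.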

Summary

NVIDIA has begun delivering engineering samples of its Vera Rubin AI platform to select hardware partners; CFO Colette Kress confirmed the milestone during the company's earnings call. Sample hardware includes the 88-core Vera CPU, Rubin GPUs with 288 GB of HBM4 each, Rubin CPX GPUs with 128 GB of GDDR7, NVLink 6.0 switching fabric, BlueField-4 DPUs with integrated SSDs for key-value cache storage, and Quantum-CX9 1.6 Tb/s Photonics InfiniBand NICs.

In short, NVIDIA, Foxconn, Quanta, Supermicro, and Wistron are in hardware bring-up for NVIDIA's next-generation AI data center platform:

- Some partners receive complete NVL72 VR200 racks; others get silicon-only samples to prepare their software and hardware stacks.
- Production shipments target H2 2026, with the platform positioned to compete against AMD Instinct accelerators in hyperscaler data centers.

Physical silicon in partner hands marks the real start of Vera Rubin integration work.

Potential risks and opportunities

Risks

If production slips to early 2027, hyperscalers that budgeted for Vera Rubin capacity in H2 2026 would need to extend AMD Instinct contracts or delay AI infrastructure expansion plans.
ODM partners receiving silicon-only samples rather than complete NVL72 VR200 racks face complex integration work with no published performance targets to validate against.
Competitive pressure from AMD's Instinct accelerators could intensify if Vera Rubin production delays give AMD additional time to expand qualification footprints at hyperscalers currently evaluating alternatives.

Opportunities

Foxconn, Quanta, Supermicro, and Wistron gain first-mover rack integration expertise for the NVL72 VR200, positioning them ahead of rival ODMs for high-margin production contracts.
Networking silicon vendors supplying Spectrum-6 Photonics Ethernet and Quantum-CX9 1.6 Tb/s InfiniBand components are locked in early as the default Vera Rubin interconnect fabric.
AI inference software vendors building for BlueField-4 DPU and SSD-backed key-value cache architectures are well-positioned as NVIDIA's inference-at-scale platform enters production.

What we don't know yet

Detailed performance benchmarks comparing Vera Rubin to prior NVIDIA generations, including measured FLOPS, memory bandwidth, and cost per token beyond the headline 50 PFLOPS FP4 and 10x cost-per-token figures, were not disclosed in this announcement.
Whether named hyperscaler customers such as Microsoft, Google, Meta, or Amazon are among the recipients of Vera Rubin samples was not confirmed.
Volume ramp timing remains ambiguous: the 'second half of 2026 or early 2027' window spans two distinct planning horizons for hyperscaler capacity commitments.

What others are reporting

Coverage cluster as of 24 hours after publication

SiliconANGLE

Names Dell, HPE, Supermicro, and Lenovo as committed OEM server partners; confirms 350+ supply chain partners across 30 countries and first complete systems targeting fall 2026.

TechPowerUp

Hardware-specialist coverage of the sample delivery milestone with technical architecture analysis and production ramp timeline.
Techzine Global

Broadest ecosystem view: covers OpenShell Secure Runtime, NemoClaw, Cosmos 3 physical AI, Isaac GR00T humanoid integration, and Groq 3 LPU pairing for low-latency inference alongside the hardware.

Vera Rubin was built for this moment — an AI factory engine that delivers intelligence at scale, with the performance, efficiency and security needed to power the next industrial revolution.

CryptoBriefing

Investor-risk framing: flags Taiwan supply-chain concentration (150 of 350 factories) and notes that doubling Blackwell's supply chain scale introduces execution and geopolitical exposure.

Vera Rubin delivers up to 10 times lower cost per token compared to the Blackwell architecture.

Originally reported by tomshardware.com

Original headline: NVIDIA Ships First Vera Rubin VR200 AI Rack Samples — 50 PFLOPS FP4, 10× Inference Cost Reduction Versus Blackwell