AMD Delivers First Multi-Node MLPerf Training 6.0 Submission as Round Adds DeepSeek V3 MoE and GPT-OSS 20B Benchmarks
Summary
AMD submitted its first multi-node MLPerf Training 6.0 results, with Instinct MI355X GPUs delivering over one million tokens per second using the new Primus training framework and MXFP4 precision, a significant milestone for the open CDNA stack. The MLCommons round, published June 16, added two new sparse MoE benchmarks (a DeepSeek V3-scale 671B-parameter model and GPT-OSS 20B) and logged a record 24 submitting organizations across 95 unique systems. NVIDIA Blackwell's sweep of the seven benchmarks carried over from earlier rounds was already reported; AMD's multi-node debut and the expanded benchmark suite indicate that alternatives to NVIDIA's training stack are now being submitted at multi-node scale.
Originally reported by amd.com