ByteDance Open-Sources Bernini-R — Unified Video Generation and Editing Model Combining MLLM Semantic Planner With DiT Renderer, Claims Commercial-Model-Parity on Editing Tasks
Summary
ByteDance has open-sourced Bernini-R, releasing inference code and model weights for a unified video generation and editing framework that pairs a semantic planner based on a multimodal large language model (MLLM) with a renderer based on a diffusion transformer (DiT). ByteDance claims the model ranks in the first tier alongside leading closed-source commercial models on video editing benchmarks. The model weights and a diffusers-format bundle are available on HuggingFace (ByteDance/Bernini-R), and the code is on GitHub.
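For readers who want to try the release, the following is a minimal sketch of fetching the published weights with the `huggingface_hub` library. It assumes the repository id is exactly `ByteDance/Bernini-R` as stated in the summary, and that `huggingface_hub` is installed. The helper name `download_weights` and the default target directory are hypothetical choices for illustration and are not part of ByteDance's release.

```python
from pathlib import Path

# Repository id as given in the summary; treat it as an assumption until
# verified against the actual HuggingFace page.
REPO_ID = "ByteDance/Bernini-R"


def download_weights(local_dir: str = "bernini-r", repo_id: str = REPO_ID) -> Path:
    """Download every file in the model repository and return the local path.

    The import is deferred so this module can be loaded even when
    huggingface_hub is not installed (pip install huggingface_hub).
    """
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the full repository snapshot into local_dir
    # and returns the directory path as a string.
    return Path(snapshot_download(repo_id=repo_id, local_dir=local_dir))
```

Calling `download_weights()` needs network access and enough disk space for the full checkpoint. How to load the diffusers-format bundle depends on the pipeline class the release ships, which the summary does not specify.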
Originally reported by reddit.com