reddit.com via Reddit June 4th 2026

r/LocalLLaMA: mistral.rs Adds Gemma 4 12B With Full Multimodal, Web Search, Sandboxed Code Execution, and Multi-Token Prediction in One-Step Install

open-source inference google local-llm

Summary

The Rust-based inference engine mistral.rs shipped Gemma 4 12B support on June 4, bundled with web search, sandboxed code execution for building agentic apps, full multimodal input (audio, image, video), and multi-token prediction, all described as a one-command install. The developer frames it as enabling production-grade agentic pipelines on top of the open-weight model without cloud dependencies. It arrives days after Google released Gemma 4 12B, and mistral.rs is among the first inference frameworks to ship a bundled web-search and code-execution wrapper for the model.

Originally reported by reddit.com

