Gemma 4 Heretic merge cuts safety refusals to 12/100
Key insights
- Heretic deep neural consolidation achieved a KL divergence (KLD) of 0.0152, indicating minimal distributional drift from the base Gemma-4-31B model.
- The 12/100 refusal rate matches earlier releases in the Heretic series, suggesting the results are reproducible.
- This release continues a community pattern of systematically applying Heretic merging to strip safety filters from Google's Gemma 4 family.
Why this matters
The Heretic methodology has now produced consistent, benchmark-verified results across multiple Gemma 4 merges, meaning uncensoring is becoming a repeatable pipeline rather than a one-off effort. For AI practitioners and founders evaluating open-weight models, this signals that any organization releasing model weights should treat safety filtering as permanently porous once weights are public. Technical leaders watching AI governance should note that a quantitative KLD metric gives the community a measurable handle on merge quality, which speeds up iteration and lowers the barrier to uncensored releases of other model families.
Summary
The Gemma-4-Harmonia-31B-Uncensored-Heretic model landed on Hugging Face this week, built by community developer llmfan46 using the Heretic deep neural consolidation methodology to merge multiple Gemma-4-31B instruction-tuned fine-tunes into a single uncensored release.
The Heretic merge process is specifically designed to minimize distributional drift while stripping safety filters. This release posts a KLD of 0.0152 and a 12/100 refusal rate, consistent with earlier Heretic series benchmarks, suggesting the methodology produces repeatable results across different source fine-tunes.
Essentially, llmfan46 and the wider Heretic community are systematically applying a standardized merge recipe to Google's Gemma 4 family.
- KLD 0.0152 signals the merged model stays extremely close to the base Gemma-4-31B distribution despite safety filter removal.
- The 12/100 refusal figure now recurs across the Heretic series rather than appearing as a one-off outcome.
Open-weight models with documented, reproducible uncensoring pipelines represent a structural challenge for any company trying to enforce safety properties on released weights.
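As a rough illustration of what a KLD figure like the one above measures, the sketch below averages the KL divergence between next-token distributions from two models. The source does not say how the 0.0152 value was computed, so the direction of the divergence (base relative to merged), the per-position averaging, and the toy logits are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) in nats for two discrete distributions over the same vocabulary."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def mean_kld(base_logits, merged_logits):
    """Average KL(base || merged) over a list of token positions (assumed scheme)."""
    per_position = [
        kl_divergence(softmax(b), softmax(m))
        for b, m in zip(base_logits, merged_logits)
    ]
    return sum(per_position) / len(per_position)

# Toy logits over a 3-token vocabulary at two positions (hypothetical values).
base = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
merged = [[2.1, 0.9, 0.1], [0.4, 0.6, 2.9]]
print(f"mean KLD: {mean_kld(base, merged):.4f}")
```

A value near zero means the two models assign nearly the same probabilities to the next token on the evaluated text; it says nothing about which prompts were used to measure it.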
Potential risks and opportunities
Risks
- Google's enterprise Gemma 4 business faces reputational drag as uncensored Heretic derivatives undermine its safety posture with compliance-sensitive customers evaluating open-weight deployments.
- Hugging Face could face regulatory pressure under the EU AI Act if hosting a documented, growing series of uncensored large-model releases is classified as high-risk distribution.
- Organizations self-hosting Heretic-merged models in production expose themselves to liability if outputs cause downstream harm with no vendor safety layer and no audit trail.
Opportunities
- Post-deployment guardrail vendors (Guardrails AI, Llama Guard-based systems, Rebuff) gain a clear sales motion targeting teams running uncensored open-weight models without built-in safety layers.
- Cloud providers offering managed Gemma 4 deployments can differentiate on audited safety compliance that self-hosted Heretic merges structurally cannot provide to enterprise buyers.
- Researchers studying model merging can use the Heretic KLD benchmark series as a reproducible baseline for evaluating merge quality across other open-weight model families beyond Gemma.
What we don't know yet
- Which specific Gemma-4-31B fine-tunes were merged, and what their individual refusal rates were before consolidation.
- Whether Google has issued any response or updated its Gemma 4 usage policy in light of the growing Heretic merge series.
- How the 12/100 refusal benchmark is operationally defined and what prompt categories account for the remaining 12 refusals.
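The last open question matters because a figure like 12/100 depends heavily on the scoring rule. The sketch below shows one hypothetical rule, keyword matching on the opening of each response. The marker phrases, the 200-character window, and the sample responses are all assumptions for illustration; the source does not describe the scoring actually used.

```python
# Hypothetical marker phrases; the source does not list the ones actually used.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am unable")

def is_refusal(response: str, markers=REFUSAL_MARKERS) -> bool:
    """Return True if the opening of a response contains a refusal marker."""
    # Normalize curly apostrophes and case, then inspect the first 200 characters.
    head = response.replace("\u2019", "'").strip().lower()[:200]
    return any(marker in head for marker in markers)

def refusal_count(responses) -> int:
    """Count how many responses are scored as refusals under the rule above."""
    return sum(is_refusal(r) for r in responses)

# Made-up responses, used only to exercise the scoring rule.
sample = [
    "I'm sorry, but I can't help with that.",
    "Sure, here is an overview of the topic.",
    "I cannot assist with this request.",
    "Here are three options to consider.",
]
print(f"{refusal_count(sample)}/{len(sample)} refusals")  # prints "2/4 refusals"
```

A stricter or looser marker list, or a classifier in place of keywords, would change the count for the same set of responses, which is why the operational definition needs to be published before figures from different releases can be compared.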
Originally reported by huggingface.co
Original headline: r/LocalLLaMA: Gemma-4-Harmonia-31B-Uncensored-Heretic Ships on Hugging Face — Community Merge of Multiple Gemma-4-31B Fine-Tunes Using Heretic Deep Neural Consolidation, KLD 0.0152, 12/100 Refusals