reddit.com via Reddit May 27th 2026

Qwen3-35B sub-agent silently fails in production

agents inference open source local-llm agents failure-modes

Key insights

Qwen3-35B-A3B sub-agent failures present as plausible tool-call outputs, making errors invisible to orchestrators expecting explicit failure signals.
JSON outputs from the model misroute to its internal reasoning channel, causing silent data loss in multi-agent pipelines.
Community testing across multiple orchestration frameworks confirmed the failure patterns, suggesting the issue is structural rather than model-specific.

Why this matters

Multi-agent pipelines are rapidly becoming the default production architecture for AI systems, and silent failure propagation is a correctness problem that standard logging and completion-rate metrics won't surface. The four failure modes documented here mean an orchestrator can report 100% task completion while downstream outputs are systematically corrupted across the entire run. Any team deploying open-source models as sub-agents without per-call output validation is currently operating without a meaningful signal that their pipeline is working.

Summary

Running Qwen3-35B-A3B as a sub-agent on a single RTX 4090, a developer has documented four failure modes the orchestrator layer never sees. In solo use, model failures are obvious. In a pipeline, the model wraps those failures in plausible tool-call responses. The orchestrator logs success and continues. Corruption propagates silently. Essentially: (Alibaba Qwen, multi-agent pipeline builders) never designed for how failure semantics change when one model controls another. - JSON outputs misroute to the model's internal reasoning channel instead of the structured response field. - Context bleeds silently across sequential tasks, contaminating later instruction calls. - Hallucinated completions are repackaged as successful results rather than flagged errors. Multiple community replies confirmed the patterns across different orchestration frameworks, pointing to a structural gap beyond any single model.

Potential risks and opportunities

Risks

Production teams using Qwen3-35B-A3B in automated pipelines without per-call output validation may have already accumulated silently corrupted results that passed downstream quality checks undetected
Orchestration framework maintainers (LangGraph, CrewAI, AutoGen) face near-term pressure to ship sub-agent failure detection layers, adding latency and engineering cost to existing deployments
Organizations treating high orchestrator-level task-completion rates as a pipeline health signal may misread system correctness for months before corrupted outputs surface in user-facing products

Opportunities

Inference infrastructure providers (Together AI, Fireworks AI, Replicate) can differentiate by adding structured-output enforcement and sub-agent output validation at the API layer before orchestrators ever see a response
Observability vendors building for multi-agent systems (Langfuse, Arize AI, Weights and Biases) have a clear wedge: per-call semantic validation that catches the failure modes orchestrators currently miss entirely
Open-source contributors who ship a standardized sub-agent failure testing harness now could establish it as the default evaluation layer for multi-agent pipelines ahead of any framework-native solution

What we don't know yet

Whether Alibaba's Qwen team has reproduced the instruction-scope leakage behavior internally and whether it persists across newer Qwen3 variants released in early 2026
Which specific orchestration frameworks (LangGraph, CrewAI, AutoGen) were tested and whether any showed meaningfully better sub-agent failure isolation than others
Whether structured-output enforcement modes at the inference layer (JSON schema constraints, grammar-constrained decoding) mitigate the JSON misrouting failure mode

Originally reported by reddit.com

Read the original article →

Original headline: r/LocalLLaMA: Developer Running Qwen3.6-35B-A3B as Sub-Agent Documents Four Failure Modes That Are Invisible to the Orchestrator