Failure Modes — Qwen3.6 & OpenCode

Compiled 2026-05-27. Sources linked inline.


Qwen3.6 Model-Specific Quant & Routing Issues

IQ3 Quant — Tool Call JSON Failure

Name IQ3 quant tool-call JSON breakage
Description Qwen3.6 35B-A3B at IQ3_XXS quant fails function-call JSON generation entirely. BatiAI's Ollama benchmark shows tool calling failing at IQ3 and passing at IQ4 and Q6. IQ3 is memory-bandwidth bound (~45.9 t/s on M4 Max) and loses the precision needed for structured JSON output in tool calls.
Mitigation Use IQ4_XS or Q6_K for any workload with tool calling. IQ3 is acceptable only for text-only chat. IQ4 and Q6 show equivalent throughput.
Sources batiai/qwen3.6-35b:iq3 (Ollama)
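
One way to check a quant before trusting it with tool calls is to sample a batch of tool-call outputs and measure how many parse as well-formed JSON. The sketch below is a minimal illustration: `ToolCall`, `parseToolCall`, and `toolCallPassRate` are hypothetical names, and the expected shape (a `name` plus `arguments` given as an object or a JSON-encoded string) is an assumption based on the OpenAI-style format mentioned elsewhere in this document.

```typescript
// Hypothetical shape of an OpenAI-style tool call; not taken from any specific library.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Returns the parsed call, or null when the model output is not usable JSON.
function parseToolCall(raw: string): ToolCall | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // truncated or malformed JSON
  }
  if (typeof value !== "object" || value === null) return null;
  const candidate = value as { name?: unknown; arguments?: unknown };
  if (typeof candidate.name !== "string" || candidate.name.length === 0) return null;

  // Arguments may arrive as a JSON-encoded string or as an object; accept both.
  let args: unknown = candidate.arguments;
  if (typeof args === "string") {
    try {
      args = JSON.parse(args);
    } catch {
      return null;
    }
  }
  if (typeof args !== "object" || args === null || Array.isArray(args)) return null;
  return { name: candidate.name, arguments: args as Record<string, unknown> };
}

// Fraction of sampled outputs that parse as valid tool calls (0 for an empty sample).
function toolCallPassRate(outputs: string[]): number {
  if (outputs.length === 0) return 0;
  const ok = outputs.filter((o) => parseToolCall(o) !== null).length;
  return ok / outputs.length;
}
```

A pass rate well below 1.0 on a small sample is a reasonable signal to move up to IQ4_XS or Q6_K before wiring the model into an agent.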

MoE Expert Loop — Q4_K_M & Below Routing Lock

Name Q4_K_M MoE expert routing collapse
Description Qwen3.6's MoE architecture (256 routed experts, top-8 selection) degrades at Q4_K_M and below: the router locks into a subset of specialists (e.g., code-completion specialist for math queries, math specialist for syntax tasks). Expert activation entropy collapses. This is a structural MoE failure — dense Qwen2.5-72B does not exhibit this. Perplexity delta of +0.34 at Q4_K_M looks acceptable on paper but produces hallucinated method names, wrong parameter counts, and broken imports.
Mitigation Default to Q6_K (1.6-point SWE-bench loss vs Q8_0, saves 2.1 GB VRAM). For 24 GB cards, Q4_K_M is acceptable only for RAG ingestion or documentation chat — not active code generation or function calling. Q8_0 wins SWE-bench Lite at 28.7%. BFCL v2 function-calling accuracy: 94.2% (Q8_0) → 89.7% (Q4_K_M).
Sources Qwen3.6 quant benchmarks: Q4 vs Q8 for MoE (CraftRigs); Qwen3.6-27B Setup Guide: 24GB GPU (CraftRigs)
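
"Expert activation entropy" can be made concrete as the Shannon entropy of how often each routed expert is selected. The sketch below assumes per-expert activation counts are available from the inference runtime, which the sources do not confirm; `activationEntropy` is a hypothetical helper, not part of any tool named here.

```typescript
// Shannon entropy, in bits, of a per-expert activation histogram.
// counts[i] is how many times expert i was selected (hypothetical input).
function activationEntropy(counts: number[]): number {
  const total = counts.reduce((sum, c) => sum + c, 0);
  if (total === 0) return 0;
  let entropy = 0;
  for (const c of counts) {
    if (c > 0) {
      const p = c / total;
      entropy -= p * Math.log2(p);
    }
  }
  return entropy;
}
```

With 256 routed experts, perfectly even routing gives log2(256) = 8 bits. A router locked onto the same 8 experts gives log2(8) = 3 bits, so a sharp drop in this number between quants would be consistent with the collapse described above.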

Official Chat Template — Non-Standard XML Parameter Format

Name Qwen3.6 official chat_template.jinja XML vs JSON incompatibility
Description Qwen3.6's shipped chat_template.jinja instructs the model to generate function calls using a proprietary XML-like syntax (<function=...><parameter=...>) instead of OpenAI-compatible JSON. Missing closing tags cause parsing failures in standard inference frameworks (vLLM, HuggingFace transformers, llama-cpp-python, OpenAI-compatible API layers). Error: Failed to parse input at pos XXXX: <function=read> <parameter=filePath> ....
Mitigation Patch chat_template.jinja to use OpenAI-compatible JSON schema ({"name": "function_name", "arguments": "{\"param1\": \"value1\"}"}).
Sources abysslover/qwen36_tool_calling_failure (GitHub)
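
To make the format gap concrete, the sketch below converts the XML-like syntax into the OpenAI-style JSON shape. This is a client-side illustration of the two formats, not the template patch described in the mitigation. `xmlToolCallToJson` is a hypothetical helper, and the tolerance for missing closing tags is an assumption based on the error text quoted above.

```typescript
interface OpenAiToolCall {
  name: string;
  arguments: string; // JSON-encoded object, as in the OpenAI function-calling format
}

// Converts "<function=NAME><parameter=KEY>VALUE..." into an OpenAI-style call.
// Returns null when no <function=...> tag is present.
function xmlToolCallToJson(text: string): OpenAiToolCall | null {
  const fn = /<function=([^>\s]+)>/.exec(text);
  if (fn === null) return null;

  const args: Record<string, string> = {};
  // A parameter value runs until its closing tag, the next parameter,
  // the end of the function, or the end of the text (missing closing tag).
  const param =
    /<parameter=([^>\s]+)>([\s\S]*?)(?=<\/parameter>|<parameter=|<\/function>|$)/g;
  let match: RegExpExecArray | null;
  while ((match = param.exec(text)) !== null) {
    // Values are kept as strings; type coercion would need the tool's schema.
    args[match[1]] = match[2].trim();
  }
  return { name: fn[1], arguments: JSON.stringify(args) };
}
```

For the input `<function=read> <parameter=filePath> /tmp/a.txt`, which has no closing tags, this yields the name `read` and the arguments `{"filePath":"/tmp/a.txt"}`.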

Long-Text Stability — Context Accumulation Amplifies Routing Drift

Name Q4_K_M multi-turn routing drift
Description General chat tolerates a perplexity delta of up to +0.50 before a quality drop is noticed. Multi-turn technical discussion (>3 turns with context accumulation), chain-of-thought reasoning, and structured output cross the threshold where expert loop errors become detectable within the first 10 responses. Context accumulation amplifies routing drift.
Mitigation Q4_K_M acceptable for single-turn or short-context use. For long contexts or multi-turn structured output, use Q6_K or Q8_0.
Sources Qwen3.6 quant benchmarks: Q4 vs Q8 for MoE (CraftRigs)
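
The quant guidance in this document can be folded into a small selection rule. The sketch below is a conservative reading, not a benchmark-derived policy: `recommendQuant`, `SessionProfile`, and the workload labels are hypothetical, and treating any session longer than 3 turns as multi-turn simplifies the "multi-turn technical discussion" criterion.

```typescript
type Quant = "Q4_K_M" | "Q6_K";
type Workload = "rag-ingestion" | "docs-chat" | "code-generation" | "function-calling";

// Hypothetical description of how a session will be used.
interface SessionProfile {
  workload: Workload;
  expectedTurns: number;
  structuredOutput: boolean;
}

// More than 3 turns with accumulated context is where drift becomes noticeable.
const MULTI_TURN_THRESHOLD = 3;

function recommendQuant(profile: SessionProfile): Quant {
  const lightWorkload =
    profile.workload === "rag-ingestion" || profile.workload === "docs-chat";
  const multiTurn = profile.expectedTurns > MULTI_TURN_THRESHOLD;
  // Q4_K_M only for light, short, unstructured use; Q6_K is the default otherwise.
  if (lightWorkload && !multiTurn && !profile.structuredOutput) return "Q4_K_M";
  return "Q6_K";
}
```

The rule returns Q6_K whenever any risk factor is present. Q8_0 remains an option where VRAM allows, but is left out here to keep the sketch to the two quants a 24 GB card has to choose between.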

OpenCode Plugin / Hook-Specific Failures

session.start — Resume / --continue Does Not Fire Plugin Context

Name session.start hook failure on resume
Description session.start hook fires reliably for new sessions (startup trigger) but fails on resume (--continue/--session) with "No context found for instance" error. Plugin.triggerSessionStart is called during route navigation before the plugin context is fully initialized. Pending hook context is consumed lazily on the next model turn, so resume-triggered context can become stale if a session is resumed but not prompted soon after.
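Mitigation Treat session.start with the resume trigger as a bootstrap timing edge case: it can fail before the plugin context is initialized, and pending context goes stale if the resumed session sits idle. Prompt the session soon after resuming so the pending context is consumed while it is still current. PR #15224 documents the issue and a partial fix.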
Sources OpenCode PR #15224 — feat(plugin): add session.start hook; OpenCode Issue #5409 — SessionStart hook for session lifecycle events
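
A plugin can guard against stale resume context by timestamping whatever it queues and discarding it if too much time passes before the next model turn. The sketch below is a plugin-side illustration only: `PendingContextSlot` and its age limit are hypothetical and do not reflect how OpenCode stores pending hook context internally.

```typescript
// Holds at most one piece of queued context and drops it once it is too old.
class PendingContextSlot {
  private pending: { text: string; createdAt: number } | null = null;
  private readonly maxAgeMs: number;

  constructor(maxAgeMs: number) {
    this.maxAgeMs = maxAgeMs;
  }

  // Queue context, e.g. when a resume-triggered hook fires.
  set(text: string, now: number): void {
    this.pending = { text, createdAt: now };
  }

  // Returns the queued context once if it is still fresh; otherwise null.
  // The slot is cleared either way, so stale context is never delivered later.
  consume(now: number): string | null {
    const queued = this.pending;
    this.pending = null;
    if (queued === null) return null;
    return now - queued.createdAt <= this.maxAgeMs ? queued.text : null;
  }
}
```

Passing `now` explicitly (for example `Date.now()`) keeps the class easy to test and makes the staleness window an explicit choice by the plugin author.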

PreToolUse — Ask Response Permanently Disables Bypass Permission

Name PreToolUse permission bypass lock
Description When PreToolUse returns permissionDecision: "ask", it permanently disables bypass permission mode until session restart. This is a state machine vulnerability — the permission bypass mode cannot recover from an ask response without a full session reset.
Mitigation If using permission bypass mode, avoid PreToolUse hooks that return ask. Verify hook behavior after any policy change.
Sources Claude Code #37420 (referenced in AGENTS.md)
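
One way to apply this mitigation is to route every hook decision through a guard that never lets "ask" escape while bypass mode is active. In the sketch below, `guardDecision` is a hypothetical helper, and how a hook learns that bypass mode is active (the `bypassActive` flag) is an assumption the source does not cover.

```typescript
type PermissionDecision = "allow" | "deny" | "ask";

// Replaces "ask" with a fixed fallback while bypass mode is active,
// so the hook cannot trigger the lock described above.
function guardDecision(
  decision: PermissionDecision,
  bypassActive: boolean,
  fallback: "allow" | "deny" = "deny",
): PermissionDecision {
  if (bypassActive && decision === "ask") return fallback;
  return decision;
}
```

Defaulting the fallback to "deny" is the cautious choice: a hook that wanted confirmation presumably had a reason, so blocking the call is safer than silently allowing it.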

session.created — Event Does Not Fire Reliably for Plugins

Name session.created event reliability for plugins
Description session.created event fails to fire reliably for plugins due to MCP compatibility errors. This affects plugins that depend on session lifecycle events for initialization.
Mitigation Use session.start hook as the primary initialization mechanism instead of relying on session.created events.
Sources OpenCode #14808 (referenced in AGENTS.md, ~/.config/opencode/plugins/engram.ts)
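
If a plugin keeps a session.created listener alongside the session.start hook, initialization should be idempotent so that it runs once per session no matter which path fires. A minimal sketch, with `createOnceInitializer` as a hypothetical helper that is not part of the OpenCode plugin API:

```typescript
// Wraps an init function so it runs at most once per session ID.
// The returned function reports whether this call performed the initialization.
function createOnceInitializer(
  init: (sessionId: string) => void,
): (sessionId: string) => boolean {
  const initialized = new Set<string>();
  return (sessionId: string): boolean => {
    if (initialized.has(sessionId)) return false;
    initialized.add(sessionId);
    init(sessionId);
    return true;
  };
}
```

Both the session.start handler and any session.created listener can then call the same wrapped function without double-initializing.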

chat.message — Synthetic Text Injection Required for System Message Position

Name Jinja system message position enforcement
Description vLLM propagates Qwen's strict Jinja template requiring role=system at index 0. Auxiliary context injection (e.g., from session-start hooks) breaks this if it places context after the system message. Fix: inject session-start as a synthetic text part via output.parts.unshift() on the first chat.message turn, not via experimental.chat.system.transform. Text parts have no position constraint.
Mitigation Do not use experimental.chat.system.transform for session-start hooks with Qwen-family models. Use synthetic text parts via output.parts.unshift() on the first chat.message turn.
Sources vLLM #41114; AGENTS.md (system reminder pattern)
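
The injection pattern can be sketched as a small factory that prepends a text part on the first chat.message turn of each session. The types are simplified stand-ins: `TextPart`, `ChatMessageOutput`, the `synthetic` flag, and `createSessionStartInjector` are assumptions for illustration, and real OpenCode parts carry more fields than shown here.

```typescript
// Simplified, hypothetical stand-ins for the hook's output types.
interface TextPart {
  type: "text";
  text: string;
  synthetic?: boolean;
}

interface ChatMessageOutput {
  parts: TextPart[];
}

// Returns a handler that injects session-start context once per session.
// The handler reports whether it injected anything on this call.
function createSessionStartInjector(
  buildContext: (sessionId: string) => string,
): (sessionId: string, output: ChatMessageOutput) => boolean {
  const injected = new Set<string>();
  return (sessionId: string, output: ChatMessageOutput): boolean => {
    if (injected.has(sessionId)) return false; // only the first turn
    injected.add(sessionId);
    const text = buildContext(sessionId);
    if (text.length === 0) return false;
    // Prepend as a text part; the system message at index 0 is left untouched.
    output.parts.unshift({ type: "text", text, synthetic: true });
    return true;
  };
}
```

Because the context travels as a text part of the user turn, the template's role=system check at index 0 never sees it.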

Generated 2026-05-27 from web search findings.