MCP tools don't populate output.output in the tool.execute.after hook — the MCP content flows through OpenCode's internal parts pipeline instead. This caused a crash: undefined is not an object (evaluating 'text.length') in the truncate function.
# Agent Infrastructure: Design Principles

You are editing agent infrastructure files (hooks, instructions, skills,
agents). Before making changes, understand the principles that govern how this
system works.

## Single Source of Truth

`.agents/` is the canonical directory for all agent infrastructure. An MCP
server (`.agents/mcp/index.ts`) exposes agents as prompts and skills as tools to
both Copilot and OpenCode — this replaces file-based fan-out to
`.github/agents/`, `.opencode/agents/`, etc.

### MCP server (`all-agents`)

Available once the server is running (configured in `.vscode/mcp.json` and
`opencode.json`):

- **Prompts** (slash commands): `/research`, `/brainstorm`, `/build`,
  `/orchestrator`
- **Tools** (model-controlled): `load_research_methodology`

Bodies are read from disk at call time — editing `.agents/agents/*.md` or
`.agents/skills/research.md` takes effect immediately.

**Not handled by MCP** (stays bespoke):

- `.agents/hooks/` — MCP has no lifecycle intercept primitive
- This file — model needs to read it before `tools/list` is available

## The Enforcement Hierarchy

Not all guidance is equally effective. From most to least reliable:

```
PreToolUse hard block        ← Structural. Always fires. Agent cannot bypass.
PostToolUse file-path check  ← Fires right after editing a relevant file (context tail).
Nested AGENTS.md at path     ← Always-on for that folder scope. Portable across tools.
Stop / SessionStart inject   ← Fires at session boundaries. Good for broad reminders.
Root AGENTS.md sections      ← Context-start only. Subject to "lost in the middle."
```

**Root cause of degradation** (Liu et al. 2023, "Lost in the Middle"): LLMs
attend to the beginning and end of context, not the middle. Guidance written
into AGENTS.md is injected once at session start and degrades as context grows.
Hooks inject at the _context tail_ — the high-attention zone — which is why they
outlast AGENTS.md under context pressure.

**Decision rule when adding new guidance:**

1. Is the anti-pattern a **terminal command**? → `PreToolUse` hard block
   (Policies 1–6 in `pre-tool-use.sh`).
2. Is the anti-pattern **editing a specific file type or path**? → `PreToolUse`
   block on `FILE_PATH` (Policy 7+).
3. Should the reminder fire **during active work** in a domain? → `PostToolUse`
   file-path check (see `post-tool-use.sh` BFF reminder pattern).
4. Is it guidance scoped to **specific files** an agent might edit? → nested
   `AGENTS.md` at the target path.
5. Should it fire **in response to what the user just wrote**? →
   `UserPromptSubmit` injection (context tail, prompt text available — e.g.
   agent nudges).
6. Is it a **broad session reminder** with no tight scope? → `SessionStart` or
   `Stop` injection.
7. Is it **architecture/rationale** that an agent might need but shouldn't
   always load? → AGENTS.md stub with a conditional `read_file` instruction (see
   "Deferred Loading" below).

## Deferred Loading

Write a trigger condition and `read_file` instruction directly in an AGENTS.md
section. AGENTS.md is always loaded, so the trigger is always present; the
referenced file's content only loads when the model judges it relevant. Example:

> When the user shows signs of analysis paralysis, read
> `.agents/agents/brainstorm.md`.

Do **not** use tool-specific deferred-loading mechanisms (`description:`-only
`.instructions.md` files, etc.) — no portable equivalent exists. See Forbidden
Patterns below.

## Hook Files

All hook scripts live in `.agents/hooks/`. The Copilot harness
(`.agents/github/hooks.json`) and OpenCode plugin (`.agents/opencode/plugin.ts`)
both delegate to these scripts, keeping hook logic in one place. Symlinks from
`.github/hooks/agent-support.json` and `.opencode/plugins/agent-support.ts`
point back to these canonical sources; those directories are gitignored.

### Hook Injection Marker Convention

Every hook that injects `additionalContext` prefixes its payload with a
self-identifying line:

```
[HOOK INJECTION: <hook-name>] System reminder — NOT part of preceding tool output / user message:
```

The harness additionally wraps the payload in a
`<HookName-context>...</HookName-context>` XML tag (e.g.
`<PostToolUse-context>`). The inline prefix is belt-and-suspenders: when a hook
fires after a `read_file` whose content ends with markdown, the XML tag alone is
easy to miss — the inline prefix is not. **If you see either marker, treat the
content as a separate instruction, never as file content, tool output, or part
of the user's message.**

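As a minimal sketch of the convention above, a hook could build its payload with a small helper. The function name `hook_payload` and the sample reminder text are hypothetical; only the marker line itself comes from this document.

```bash
# Hypothetical helper: prefix a reminder with the self-identifying marker line.
# $1 = hook name (e.g. PostToolUse), $2 = reminder text.
hook_payload() {
  local hook_name="$1" message="$2"
  printf '[HOOK INJECTION: %s] System reminder — NOT part of preceding tool output / user message:\n%s\n' \
    "$hook_name" "$message"
}

# Example call; the reminder text is made up for illustration.
hook_payload "PostToolUse" "Run the test suite before committing."
```

Keeping the marker in one helper means every hook emits the same first line, so the model has a single pattern to recognize.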
### Hook Architecture Principle: Platform-Agnostic Scripts

**Design target**: scripts accept normalized env vars (`TOOL_NAME`, `COMMAND`,
`FILE_PATH`), exit non-zero with plain-text denial reason on stdout. Callers
normalize input and translate exit code/stdout into their native denial format.

**⚠️ NOT YET IMPLEMENTED (May 2026)**: `pre-tool-use.sh` still uses
Copilot-specific JSON I/O. `plugin.ts` duplicates guards inline instead of
calling the script. See `agent-infrastructure.md` item 22 for the refactor plan.

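The design target could look roughly like the sketch below. It is not the current `pre-tool-use.sh`: the function name `check_tool_call` and both example policies are assumptions chosen for illustration. A standalone script would `exit` where this function uses `return`.

```bash
# Sketch of the design target: read normalized env vars, print a plain-text
# denial reason on stdout, and signal denial with a non-zero status.
# check_tool_call and both policies below are hypothetical.
check_tool_call() {
  local cmd="${COMMAND:-}" file_path="${FILE_PATH:-}"

  # Example terminal policy: refuse a recursive delete of an absolute path.
  if printf '%s' "$cmd" | grep -Eq 'rm[[:space:]]+-rf?[[:space:]]+/'; then
    echo "BLOCKED: recursive delete of an absolute path. Use a scoped relative path."
    return 1
  fi

  # Example file policy: refuse edits to generated agent files.
  case "$file_path" in
    .github/agents/*)
      echo "BLOCKED: generated file. Edit the source under .agents/ instead."
      return 1 ;;
  esac
  return 0
}
```

A caller such as the Copilot harness or `plugin.ts` would set the three variables, run the check, and translate a non-zero status plus the stdout text into its own denial format.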
### `user-prompt-submit.sh` — Per-turn tail injection

- Fires on every user message. Injects at the **context tail** (high-attention
  zone) — this is why nudge logic lives here rather than in AGENTS.md.
- Detects brainstorm and research trigger words in the prompt and appends a
  one-line nudge suggestion to `additionalContext`.
- Writes the raw prompt text to `/tmp/.last-user-prompt.txt` and injects the
  task-capture instruction.

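The trigger-word detection described above can be sketched as a case-insensitive match that yields at most one nudge line. The function name, the trigger words, and the nudge wording are all assumptions; the real list lives in `user-prompt-submit.sh`.

```bash
# Hypothetical sketch: map trigger words in the prompt to a one-line nudge.
# Prints nothing when no trigger word is present.
nudge_for_prompt() {
  local prompt_lc
  prompt_lc="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$prompt_lc" in
    *brainstorm*|*"stuck between"*)
      echo "Nudge: consider /brainstorm before committing to an approach." ;;
    *research*|*investigate*)
      echo "Nudge: consider /research to gather sources first." ;;
  esac
}

# The result would be appended to additionalContext by the hook.
nudge_for_prompt "Can you investigate why the build is slow?"
```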
### `pre-tool-use.sh` — Hard stops

- Intercepts: `run_in_terminal`, `execution_subagent`, `send_to_terminal` (for
  `$COMMAND`) and `replace_string_in_file`, `multi_replace_string_in_file`,
  `create_file` (for `$FILE_PATH`).
- Outputs `permissionDecision: "deny"` to block the tool call.
- **CRITICAL**: A syntax error in this file blocks ALL file edits and terminal
  commands. Always validate after editing:
  `bash -n .agents/hooks/pre-tool-use.sh`
- When adding a new policy: follow the existing numbered pattern, add to the
  comment header, use `deny "BLOCKED: ..."` with a clear fix instruction.
- Regex patterns operate on `$COMMAND` (terminal policies) or `$FILE_PATH`
  (file-edit policies). Both are empty strings unless the right tool fired.

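A new terminal policy might follow a shape like the one below. This is a sketch under assumptions: the `deny` body, the JSON field `permissionDecisionReason`, and the force-push policy are illustrative and may not match the real script, and the reason string is not JSON-escaped here.

```bash
# Hypothetical deny helper: emit a deny decision with a plain reason.
# The JSON envelope is an assumption; the reason must not contain quotes.
deny() {
  printf '{"permissionDecision":"deny","permissionDecisionReason":"%s"}\n' "$1"
}

# Policy N (hypothetical): block plain force pushes but allow --force-with-lease.
policy_no_force_push() {
  if printf '%s' "${COMMAND:-}" \
      | grep -Eq 'git[[:space:]]+push[[:space:]].*--force([[:space:]]|$)'; then
    deny "BLOCKED: force push. Use --force-with-lease on an unpublished branch."
    return 1
  fi
  return 0
}
```

Note that the regex requires whitespace or end of line after `--force`, so the fix named in the denial message is not itself blocked. Because `$COMMAND` is empty for file-edit tools, the policy passes through without matching in that case.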
### `post-tool-use.sh` — Timed reminders

- Fires after every tool use with the tool name and response in stdin.
- Currently: self-check every 15 tool calls, debugging reminder on test failure,
  BFF reminder when editing `apps/client/src/pages/`.
- Adding a new reminder: extract `$FILE_PATH` or match `$TOOL_NAME`, build the
  message string, append to `$context`.
- Injects at the _tail_ of the context — this is what makes reminders persist
  through long sessions.

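The scoping described above can be sketched as follows. Only the `apps/client/src/pages/` path and the 15-call cadence come from this document; the function name, the way the call count is passed in, and the placeholder reminder texts are assumptions.

```bash
# Hypothetical sketch: build $context from scoped checks only.
# $1 = number of tool calls so far in the session.
build_context() {
  local call_count="$1" context=""

  # Path-scoped reminder: fires only for edits under the pages folder.
  case "${FILE_PATH:-}" in
    apps/client/src/pages/*)
      context="${context}${context:+ }[BFF reminder text goes here]" ;;
  esac

  # Cadence reminder: fires on every 15th tool call.
  if [ "$call_count" -gt 0 ] && [ $((call_count % 15)) -eq 0 ]; then
    context="${context}${context:+ }[self-check text goes here]"
  fi

  printf '%s\n' "$context"
}
```

An unscoped reminder would skip both guards and land in `$context` on every call, which is the noise problem listed under Common Mistakes.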
### `session-start.sh` — Broad session injection

- Fires once per session. Good for: current branch, active investigations, dead
  ends.
- Not good for: precise rule reminders (use PostToolUse or nested AGENTS.md).
- **OpenCode delivery:** injected as a synthetic `text` part via
  `output.parts.unshift()` on the first `chat.message` turn. **Not** via
  `experimental.chat.system.transform` — that hook fires for task-spawned
  subagent sessions after a user message is already in context, which causes
  Qwen-family GGUF models to abort with a Jinja "System message must be at the
  beginning" error. See Forbidden Patterns below.

### `stop.sh` — End-of-session reflection

- Fires when agent stops. Lessons-learned capture + effort reflection.
- Not a blocking hook — injects `additionalContext` only.

### `pre-compact.sh` — Pre-summarization state export

- Fires before context is summarized. Saves investigation state to
  `.session/pre-compact-state.md`.
- Note: `PostCompact` does NOT exist. Only `PreCompact`.

## Forbidden Patterns

These approaches exist in agentic tooling but are **banned** in this codebase
because portable alternatives exist. Each entry documents the reason so future
agents understand why instead of re-introducing them.

### ❌ `applyTo:` frontmatter in `.instructions.md` files

Supported only in VS Code Copilot. Other tools either ignore it or load the file
as always-on context. Portable alternative: nested `AGENTS.md` at the target
path. Nested AGENTS.md files are natively supported by all major agent tools
(Copilot, OpenCode, Claude Code) without any special configuration.

### ❌ `description:`-only `.instructions.md` files (new additions)

VS Code Copilot builds a stub `<instructions>` block for these and tells the
model to load content on demand. Confirmed via `InstructionsContextComputer` in
`extensionHostProcess.js`. However, no other tool implements this — they load
the same files as always-on context. Portable alternative: AGENTS.md stub with a
`read_file` instruction (see "Deferred Loading" above).

### ❌ Any `.github/instructions/*.instructions.md` for new rules

`.instructions.md` is a VS Code Copilot-specific format. All new rules go into
nested `AGENTS.md` files (path-scoped rules) or directly into root `AGENTS.md`
(broad guidance). Do not add new `.instructions.md` files.

## Skills (`.agents/skills/`)

- Skills contain distilled methodologies that any agent can load on demand via
  `read_file`. An agent MUST `read_file` the SKILL.md before using it.
- Skills are for **methodologies** (how to research, brainstorm) — not project
  rules. Project rules belong in nested AGENTS.md files or hooks.

## Agents (`.agents/agents/`)

- Agent files define persona, workflow phases, tools, and circuit breakers.
- Runtime config (`model`, `mode`, `permission`) lives in `opencode.json` agent
  entries. Body `.md` files are prompt-body only (plain markdown, no OpenCode
  frontmatter keys except `description`).
- Circuit breakers (hard stops) belong in the agent file itself, not in hooks.

## Tool-Specific Entry Points

Some things cannot be unified and live in tool-specific locations:

- **`.agents/opencode/plugin.ts`** — OpenCode plugin harness (canonical).
  Bridges hook scripts to OpenCode's plugin API. Symlinked from
  `.opencode/plugins/agent-support.ts`.
- **`.agents/github/hooks.json`** — Copilot harness config (canonical). Points
  to `.agents/hooks/*.sh`. Symlinked from `.github/hooks/agent-support.json`.

## Common Mistakes

- ❌ Writing long explanations in AGENTS.md for rules that could be a PreToolUse
  block or nested AGENTS.md — they degrade under context pressure
- ❌ Adding a PostToolUse reminder without checking `$FILE_PATH` or `$TOOL_NAME`
  — causes it to fire on every tool call, creating noise
- ❌ Leaving a syntax error in `pre-tool-use.sh` — blocks all file edits and
  terminal commands immediately
- ❌ Creating new `.instructions.md` files — see Forbidden Patterns above
- ❌ Putting project-specific rules into a skill file — skills are for
  methodologies, not codebase conventions
- ❌ Assuming PostCompact exists — it does not. Use PreCompact.
- ❌ Editing generated files in `.github/agents/`, `.github/skills/`,
  `.opencode/agents/`, `.opencode/skills/` — edit `.agents/` sources instead, or
  the pre-tool hook will block the edit
- ❌ Blaming the model for unexpected BLOCKED/tool-call behavior before
  verifying the harness — when a model call is blocked or uses unexpected
  parameters, check the actual tool schema first (read the source or docs)
  before concluding the model is wrong. The harness was recently changed; the
  model may be correct. Applies to: OpenCode tool names (`read`/`edit`/`task`),
  parameter names (`offset`/`limit` not `startLine`/`endLine`), and plugin guard
  logic.
- ❌ Using `experimental.chat.system.transform` to inject session-start content
  in OpenCode. That hook fires for every model call — including task-spawned
  subagent sessions — **after** the task prompt (a user message) is already in
  the conversation. Pushing to `output.system` at that point places a system
  message at a non-zero position, which Qwen-family GGUF models reject with
  _"System message must be at the beginning"_ (Jinja chat template guard).
  Fix: inject session-start as a synthetic `text` part via `output.parts.unshift()`
  on the first `chat.message` turn (guarded by an `initializedSessions` set).
  Text parts have no position constraint. Committed as `f0d21e9` in dotfiles.
- ❌ Asserting that a third-party tool does **not** support a feature (config
  mechanism, directory, option) without fetching the tool's current docs first.
  Training data is frequently stale. Negative claims ("X doesn't have Y") must
  be verified live — fetch the docs page before stating the absence. Cost of a
  wrong negative: wasted user time, dead-end architecture, and eroded trust.
  Rule: if you're about to say "tool X doesn't support Y," fetch the relevant
  docs URL first.
- ❌ Adding _"reflect / double-check / are you sure / take another look"_
  instructions as a mitigation for any failure mode — these feel productive in
  transcripts but Huang et al. (arXiv:2310.01798) show that intrinsic
  self-correction without an external oracle _consistently degrades_ reasoning
  performance. Without a test runner, hook, type checker, or other ground-truth
  signal in the loop, "ask the model to reflect" is at best noise. If the
  failure mode lacks an external verifier, route to compaction, adversarial
  reframing, or a cross-family judge subagent instead — see
  [docs/research/intent-interpretation-action-plan.md](../docs/research/intent-interpretation-action-plan.md)
  §4.1.
- ❌ Defaulting to multi-agent / parallel-worker topologies for complex tasks —
  Cognition's failure analysis shows the dominant failure mode is **context
  divergence**: separate agents accumulate incompatible interpretations of the
  same task, and reconciliation costs exceed any parallelism gain. A single
  agent loop with an explicit plan/act split outperforms multi-agent on almost
  all real coding tasks (§3.1,
  [docs/research/ai-coding-best-practices.md](../docs/research/ai-coding-best-practices.md)).
  Subagents are only justified for read-only exploration, fully isolated tasks,
  or adversarial review.
- ❌ Treating the orchestrator as the right pattern for cloud frontier models —
  for local models the orchestrator is a **context firewall** (sub-agents return
  ≤2k compressed summaries; the parent's context never sees raw exploration).
  Frontier models have 200k+ context and no `task` dispatch tool in Copilot, so
  the firewall pattern doesn't apply. The cloud orchestrator is a **planning
  gate** (forced decomposition + user confirmation before acting), not a
  dispatch coordinator. The `<!-- @local -->` / `<!-- @cloud -->` blocks in
  `orchestrator.md` encode this distinction. See §3.4 of
  [docs/research/ai-coding-best-practices.md](../docs/research/ai-coding-best-practices.md).

## Testing destructive-command blocks — NEVER use live ammunition

When verifying that `pre-tool-use.sh` (or any other hook) blocks a dangerous
command pattern, **never issue the real destructive command as the test input.**
The hook is the system under test — if it fails, the test destroys the host.

Use one of these methods instead, in order of preference:

1. **Unit-test the hook directly.** Pipe synthetic hook-input JSON to the script
   and check exit code + stderr. No agent in the loop. No real shell invocation.
   Example:

   ```
   echo '{"tool_name":"run_in_terminal","tool_input":{"command":"rm -rf /"}}' \
     | bash ~/dotfiles/.agents/hooks/pre-tool-use.sh; echo "exit=$?"
   ```

   The hook should exit non-zero (deny) and print the block reason. No `rm` was
   ever queued.

2. **Use a sentinel path that exercises the regex but is harmless if the block
   fails.** A path that obviously doesn't exist and could not possibly hold real
   data: `rm -rf /var/empty/agent-block-canary-DO-NOT-CREATE-${RANDOM}`.
   The hook pattern (`rm\s+-rf?\s+/`) matches; if the block fails, the worst
   case is a "no such file" error on a sentinel path. **NEVER** use bare `/`,
   `/home`, `~`, `.`, `*`, or any real path — the test must stay harmless even
   if the hook is broken.

3. **Never** issue the literal destructive command (`rm -rf /`,
   `dd if=/dev/zero of=/dev/sda`, `:(){ :|:& };:`, `chmod -R 000 /`,
   `git push --force` to a published branch, etc.) as an agent prompt. Not even
   with `--dry-run`. Not even "just to see." Not even if you're sure the hook
   works. **The hook MIGHT not work. That's why you're testing it.**

This rule applies to humans writing test prompts AND to agents asked to verify
hook behavior. If you (the agent) are asked to verify a block, **refuse any
plan that involves issuing the real destructive command** and propose a
unit-test or sentinel approach instead.
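
Methods 1 and 2 can be combined in a small harness, sketched below. `fake_hook` is a hypothetical stand-in so the sketch runs without the real script, and `expect_blocked` is likewise an illustrative name. The command string is only embedded in JSON text and is never executed.

```bash
# Stand-in for the real hook (hypothetical): reads hook-input JSON on stdin
# and denies when the rm pattern appears anywhere in it.
fake_hook() {
  if grep -Eq 'rm[[:space:]]+-rf?[[:space:]]+/'; then
    echo "BLOCKED: recursive delete"
    return 2
  fi
  return 0
}

# Feed synthetic JSON to a hook and report whether it denied.
# $1 = hook function or executable, $2 = command string to embed as data.
expect_blocked() {
  local hook="$1" cmd="$2" status=0
  printf '{"tool_name":"run_in_terminal","tool_input":{"command":"%s"}}\n' "$cmd" \
    | "$hook" >/dev/null 2>&1 || status=$?
  if [ "$status" -ne 0 ]; then echo "blocked"; else echo "NOT blocked"; fi
}

# Sentinel path: matches the pattern, harmless even if something went wrong.
expect_blocked fake_hook 'rm -rf /var/empty/agent-block-canary-DO-NOT-CREATE'
```

To point the harness at the real script, one option is to wrap the `bash .../pre-tool-use.sh` invocation in a small function and pass that function's name as the first argument.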