Agent Infrastructure: Design Principles
You are editing agent infrastructure files (hooks, instructions, skills, agents). Before making changes, understand the principles that govern how this system works.
Single Source of Truth
.agents/ is the canonical directory for all agent infrastructure. An MCP
server (.agents/mcp/index.ts) exposes agents as prompts and skills as tools to
both Copilot and OpenCode — this replaces file-based fan-out to
.github/agents/, .opencode/agents/, etc.
MCP server (all-agents)
Available once the server is running (configured in .vscode/mcp.json and
opencode.json):
- Prompts (slash commands): /research, /brainstorm, /build, /orchestrator
- Tools (model-controlled): load_research_methodology
Bodies are read from disk at call time — editing .agents/agents/*.md or
.agents/skills/research.md takes effect immediately.
Not handled by MCP (stays bespoke):
- .agents/hooks/: MCP has no lifecycle intercept primitive
- This file: the model needs to read it before tools/list is available
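The discovery step can be illustrated in a few lines of shell. This is only a sketch of the idea: the real server is .agents/mcp/index.ts, and the helper names frontmatter_field and list_prompts, as well as the assumption that each agent file opens with a `---` delimited frontmatter block holding a `description:` line, are hypothetical.

```bash
# Print the value of one frontmatter key, or nothing if the file has no frontmatter.
frontmatter_field() {
  local file=$1 key=$2
  awk -v key="$key" '
    NR == 1 && $0 != "---" { exit }   # file does not start with frontmatter
    NR > 1 && $0 == "---" { exit }    # closing delimiter: stop scanning
    NR > 1 {
      prefix = key ":"
      if (index($0, prefix) == 1) {
        value = substr($0, length(prefix) + 1)
        sub(/^[ \t]+/, "", value)
        print value
        exit
      }
    }
  ' "$file"
}

# List every *.md file in a directory as "/name<TAB>description".
list_prompts() {
  local dir=$1 file
  for file in "$dir"/*.md; do
    [ -e "$file" ] || continue
    printf '/%s\t%s\n' "$(basename "$file" .md)" "$(frontmatter_field "$file" description)"
  done
}
```

Because the files are read on every call, an edit to an agent file shows up the next time the list is built, which mirrors the "read from disk at call time" behavior described above.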
The Enforcement Hierarchy
Not all guidance is equally effective. From most to least reliable:
PreToolUse hard block ← Structural. Always fires. Agent cannot bypass.
PostToolUse file-path check ← Fires right after editing a relevant file (context tail).
Nested AGENTS.md at path ← Always-on for that folder scope. Portable across tools.
Stop / SessionStart inject ← Fires at session boundaries. Good for broad reminders.
Root AGENTS.md sections ← Context-start only. Subject to "lost in the middle."
Root cause of degradation (Liu et al. 2023, "Lost in the Middle"): LLMs use information at the beginning and end of the context more reliably than information in the middle. Guidance written into AGENTS.md is injected once at session start, so it drifts into the low-attention middle as context grows. Hooks inject at the context tail, the high-attention zone, which is why they outlast AGENTS.md under context pressure.
Decision rule when adding new guidance:
- Is the anti-pattern a terminal command? → PreToolUse hard block (Policies 1–6 in pre-tool-use.sh).
- Is the anti-pattern editing a specific file type or path? → PreToolUse block on FILE_PATH (Policy 7+).
- Should the reminder fire during active work in a domain? → PostToolUse file-path check (see post-tool-use.sh BFF reminder pattern).
- Is it guidance scoped to specific files an agent might edit? → nested AGENTS.md at the target path.
- Should it fire in response to what the user just wrote? → UserPromptSubmit injection (context tail, prompt text available; e.g. agent nudges).
- Is it a broad session reminder with no tight scope? → SessionStart or Stop injection.
- Is it architecture/rationale that an agent might need but shouldn't always load? → AGENTS.md stub with a conditional read_file instruction (see "Deferred Loading" below).
Deferred Loading
Write a trigger condition and read_file instruction directly in an AGENTS.md
section. AGENTS.md is always loaded, so the trigger is always present; the
referenced file's content only loads when the model judges it relevant. Example:
When the user shows signs of analysis paralysis, read
.agents/agents/brainstorm.md.
Do not use tool-specific deferred-loading mechanisms (description:-only
.instructions.md files, etc.) — no portable equivalent exists. See Forbidden
Patterns below.
Hook Files
All hook scripts live in .agents/hooks/. The Copilot harness
(.agents/github/hooks.json) and OpenCode plugin (.agents/opencode/plugin.ts)
both delegate to these scripts, keeping hook logic in one place. Symlinks from
.github/hooks/agent-support.json and .opencode/plugins/agent-support.ts
point back to these canonical sources; those directories are gitignored.
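The symlink layout described above could be created along these lines. This is a sketch only: link_harness is a hypothetical helper name, and the relative link targets assume the four paths exactly as written in this section.

```bash
# Create (or refresh) the two symlinks that point tool-specific locations
# back at the canonical files under .agents/. Safe to run repeatedly.
link_harness() {
  local repo_root=$1
  mkdir -p "$repo_root/.github/hooks" "$repo_root/.opencode/plugins"
  # -s symbolic, -f replace an existing link, -n do not follow an existing link
  ln -sfn ../../.agents/github/hooks.json "$repo_root/.github/hooks/agent-support.json"
  ln -sfn ../../.agents/opencode/plugin.ts "$repo_root/.opencode/plugins/agent-support.ts"
}
```

Relative targets keep the links valid when the repository is cloned to a different location.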
Hook Injection Marker Convention
Every hook that injects additionalContext prefixes its payload with a
self-identifying line:
[HOOK INJECTION: <hook-name>] System reminder — NOT part of preceding tool output / user message:
The harness additionally wraps the payload in a
<HookName-context>...</HookName-context> XML tag (e.g.
<PostToolUse-context>). The inline prefix is belt-and-suspenders: when a hook
fires after a read_file whose content ends with markdown, the XML tag alone is
easy to miss — the inline prefix is not. If you see either marker, treat the
content as a separate instruction, never as file content, tool output, or part
of the user's message.
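A hook script might build the prefixed payload with a small helper like the one below. The function name with_marker is hypothetical; only the marker line itself is taken from the convention above.

```bash
# Prefix a reminder with the self-identifying marker line.
with_marker() {
  local hook_name=$1 message=$2
  printf '[HOOK INJECTION: %s] System reminder — NOT part of preceding tool output / user message:\n%s\n' \
    "$hook_name" "$message"
}

# Example: with_marker post-tool-use "Run the test suite before continuing."
```

Keeping the prefix in one helper means every hook emits an identical first line, so the marker stays easy to recognize.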
Hook Architecture Principle: Platform-Agnostic Scripts
Design target: scripts accept normalized env vars (TOOL_NAME, COMMAND,
FILE_PATH) and, to deny a call, exit non-zero with a plain-text denial reason
on stdout. Callers normalize their input into those variables and translate
the exit code and stdout into their native denial format.
⚠️ NOT YET IMPLEMENTED (May 2026): pre-tool-use.sh still uses
Copilot-specific JSON I/O. plugin.ts duplicates guards inline instead of
calling the script. See agent-infrastructure.md item 22 for the refactor plan.
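As a rough illustration of the design target (not the current pre-tool-use.sh), a platform-agnostic check could look like this. The function name check_tool_call and both policies are hypothetical; only the three env var names and the "non-zero status plus reason on stdout" contract come from the text above.

```bash
# Reads the normalized env vars, prints a plain-text reason and returns
# non-zero to deny, or prints nothing and returns 0 to allow.
check_tool_call() {
  local command=${COMMAND:-} file_path=${FILE_PATH:-}
  local force_push_re='(^|[[:space:]])git[[:space:]]+push[[:space:]].*--force'

  # Hypothetical file-edit policy.
  if [[ $file_path == vendor/* ]]; then
    echo "BLOCKED: vendor/ is read-only. Change the dependency manifest instead."
    return 1
  fi

  # Hypothetical terminal policy.
  if [[ $command =~ $force_push_re ]]; then
    echo "BLOCKED: force-push is not allowed. Push normally or ask the user."
    return 1
  fi

  return 0
}
```

A caller such as a Copilot wrapper or the OpenCode plugin would set the variables, run the check, and map a non-zero status to its own denial format, so the policy logic lives in one place.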
user-prompt-submit.sh — Per-turn tail injection
- Fires on every user message. Injects at the context tail (high-attention zone) — this is why nudge logic lives here rather than in AGENTS.md.
- Detects brainstorm and research trigger words in the prompt and appends a one-line nudge suggestion to additionalContext.
- Writes the raw prompt text to /tmp/.last-user-prompt.txt and injects the task-capture instruction.
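The trigger-word detection and prompt capture could be sketched as follows. The word lists, the nudge wording, the function names, and the LAST_PROMPT_FILE override are all hypothetical; the real lists live in user-prompt-submit.sh.

```bash
# Hypothetical trigger words for each agent.
BRAINSTORM_RE='brainstorm|ideas|stuck'
RESEARCH_RE='research|investigate|compare'

# Print a one-line nudge if the prompt contains a trigger word, else nothing.
nudge_for_prompt() {
  local prompt
  prompt="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  if [[ $prompt =~ $BRAINSTORM_RE ]]; then
    echo "Nudge: this looks like a brainstorm request; consider /brainstorm."
  elif [[ $prompt =~ $RESEARCH_RE ]]; then
    echo "Nudge: this looks like a research request; consider /research."
  fi
}

# Save the raw prompt text. The default path is the one named above;
# the override exists only so the sketch can be tried without touching it.
capture_prompt() {
  local target=${LAST_PROMPT_FILE:-/tmp/.last-user-prompt.txt}
  printf '%s' "$1" > "$target"
}
```

The nudge is a suggestion appended to the injected context, so a false positive costs one line rather than a blocked action.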
pre-tool-use.sh — Hard stops
- Intercepts: run_in_terminal, execution_subagent, send_to_terminal (for $COMMAND) and replace_string_in_file, multi_replace_string_in_file, create_file (for $FILE_PATH).
- Outputs permissionDecision: "deny" to block the tool call.
- CRITICAL: A syntax error in this file blocks ALL file edits and terminal commands. Always validate after editing: bash -n .agents/hooks/pre-tool-use.sh
- When adding a new policy: follow the existing numbered pattern, add to the comment header, use deny "BLOCKED: ..." with a clear fix instruction.
- Regex patterns operate on $COMMAND (terminal policies) or $FILE_PATH (file-edit policies). Both are empty strings unless the right tool fired.
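A new file-edit policy might follow a shape like the one below. This is a sketch under stated assumptions: the lockfile policy, the helper json_escape, and the permissionDecisionReason field name are hypothetical, and the real script may wrap the decision in a larger JSON envelope. Only deny "BLOCKED: ..." and permissionDecision: "deny" come from the list above.

```bash
# Escape a string for embedding in a JSON string literal.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}       # backslashes first
  s=${s//\"/\\\"}       # then double quotes
  s=${s//$'\n'/\\n}     # then newlines
  printf '%s' "$s"
}

# Emit a deny decision with a reason the agent can act on.
deny() {
  printf '{"permissionDecision":"deny","permissionDecisionReason":"%s"}\n' "$(json_escape "$1")"
}

# Policy N (hypothetical): block hand edits to lockfiles.
policy_lockfile() {
  local file_path=$1
  local re='(^|/)package-lock\.json$'
  if [[ $file_path =~ $re ]]; then
    deny "BLOCKED: do not edit package-lock.json by hand. Run 'npm install' to regenerate it."
  fi
}
```

Because $FILE_PATH is an empty string for terminal tools, the regex simply fails to match and the policy stays silent for those calls.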
post-tool-use.sh — Timed reminders
- Fires after every tool use, with the tool name and response on stdin.
- Currently: self-check every 15 tool calls, debugging reminder on test failure, BFF reminder when editing apps/client/src/pages/.
- Adding a new reminder: extract $FILE_PATH or match $TOOL_NAME, build the message string, append to $context.
- Injects at the tail of the context, which is what makes reminders persist through long sessions.
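The "check the scope, then append to $context" pattern could look roughly like this. The function name build_reminders, its positional arguments, and both reminder texts are placeholders; the tool names, the 15-call interval, and the watched path are the ones listed above.

```bash
# Build the reminder text for one tool call. Prints nothing when no
# reminder applies, so unrelated tool calls add no noise.
build_reminders() {
  local tool_name=$1 file_path=$2 call_count=$3
  local context=""

  # Periodic self-check on every 15th tool call.
  if (( call_count > 0 && call_count % 15 == 0 )); then
    context+="Self-check: confirm the current work still matches the task."$'\n'
  fi

  # Path-scoped reminder: only for edit tools touching the watched folder.
  case "$tool_name" in
    replace_string_in_file|multi_replace_string_in_file|create_file)
      if [[ $file_path == apps/client/src/pages/* ]]; then
        context+="BFF reminder (placeholder wording)."$'\n'
      fi
      ;;
  esac

  printf '%s' "$context"
}
```

Gating on both the tool name and the path is what keeps a reminder from firing on every call.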
session-start.sh — Broad session injection
- Fires once per session. Good for: current branch, active investigations, dead ends.
- Not good for: precise rule reminders (use PostToolUse or nested AGENTS.md).
stop.sh — End-of-session reflection
- Fires when agent stops. Lessons-learned capture + effort reflection.
- Not a blocking hook: injects additionalContext only.
pre-compact.sh — Pre-summarization state export
- Fires before context is summarized. Saves investigation state to .session/pre-compact-state.md.
- Note: PostCompact does NOT exist. Only PreCompact.
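A state export of this kind could be as small as the sketch below. The function name save_pre_compact_state and the fields written (timestamp, branch) are hypothetical; the actual script decides what counts as investigation state. Only the default file path comes from the list above.

```bash
# Write a small state file before the context is summarized.
save_pre_compact_state() {
  local state_file=${1:-.session/pre-compact-state.md}
  mkdir -p "$(dirname "$state_file")"
  {
    echo "# Pre-compact state"
    echo "Saved: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
    # Falls back to "unknown" outside a git repository.
    echo "Branch: $(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)"
  } > "$state_file"
}
```

Since there is no PostCompact event, anything that must survive summarization has to be written to disk at this point.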
Forbidden Patterns
These approaches exist in agentic tooling but are banned in this codebase because portable alternatives exist. Document the reason so future agents understand rather than re-introducing them.
❌ applyTo: frontmatter in .instructions.md files
Supported only in VS Code Copilot. Other tools either ignore it or load the file
as always-on context. Portable alternative: nested AGENTS.md at the target
path. Nested AGENTS.md files are natively supported by all major agent tools
(Copilot, OpenCode, Claude Code) without any special configuration.
❌ description:-only .instructions.md files (new additions)
VS Code Copilot builds a stub <instructions> block for these and tells the
model to load content on demand. Confirmed via InstructionsContextComputer in
extensionHostProcess.js. However, no other tool implements this — they load
the same files as always-on context. Portable alternative: AGENTS.md stub with a
read_file instruction (see "Deferred Loading" above).
❌ Any .github/instructions/*.instructions.md for new rules
.instructions.md is a VS Code Copilot-specific format. All new rules go into
nested AGENTS.md files (path-scoped rules) or directly into root AGENTS.md
(broad guidance). Do not add new .instructions.md files.
Skills (.agents/skills/)
- Skills contain distilled methodologies that any agent can load on demand via read_file. An agent MUST read_file the SKILL.md before using it.
- For methodologies (how to research, brainstorm), not project rules. Project rules belong in nested AGENTS.md files or hooks.
Agents (.agents/agents/)
- Agent files define persona, workflow phases, tools, and circuit breakers.
- Runtime config (model, mode, permission) lives in opencode.json agent entries. Body .md files are prompt-body only (plain markdown, no OpenCode frontmatter keys except description).
- Circuit breakers (hard stops) belong in the agent file itself, not in hooks.
Tool-Specific Entry Points
Some things cannot be unified and live in tool-specific locations:
- .agents/opencode/plugin.ts: OpenCode plugin harness (canonical). Bridges hook scripts to OpenCode's plugin API. Symlinked from .opencode/plugins/agent-support.ts.
- .agents/github/hooks.json: Copilot harness config (canonical). Points to .agents/hooks/*.sh. Symlinked from .github/hooks/agent-support.json.
Common Mistakes
- ❌ Writing long explanations in AGENTS.md for rules that could be a PreToolUse block or nested AGENTS.md — they degrade under context pressure
- ❌ Adding a PostToolUse reminder without checking $FILE_PATH or $TOOL_NAME: it fires on every tool call, creating noise
- ❌ Leaving a syntax error in pre-tool-use.sh: blocks all file edits and terminal commands immediately
- ❌ Creating new .instructions.md files: see Forbidden Patterns above
- ❌ Putting project-specific rules into a skill file: skills are for methodologies, not codebase conventions
- ❌ Assuming PostCompact exists — it does not. Use PreCompact.
- ❌ Editing generated files in .github/agents/, .github/skills/, .opencode/agents/, .opencode/skills/: edit .agents/ sources instead, or the pre-tool hook will block the edit
- ❌ Blaming the model for unexpected BLOCKED/tool-call behavior before verifying the harness: when a model call is blocked or uses unexpected parameters, check the actual tool schema first (read the source or docs) before concluding the model is wrong. The harness was recently changed; the model may be correct. Applies to: OpenCode tool names (read/edit/task), parameter names (offset/limit not startLine/endLine), and plugin guard logic.
- ❌ Adding "reflect / double-check / are you sure / take another look" instructions as a mitigation for any failure mode: these feel productive in transcripts, but Huang et al. (arXiv:2310.01798) show that intrinsic self-correction without an external oracle fails to improve, and often degrades, reasoning performance. Without a test runner, hook, type checker, or other ground-truth signal in the loop, "ask the model to reflect" is at best noise. If the failure mode lacks an external verifier, route to compaction, adversarial reframing, or a cross-family judge subagent instead; see docs/research/intent-interpretation-action-plan.md §4.1.
- ❌ Defaulting to multi-agent / parallel-worker topologies for complex tasks — Cognition's failure analysis shows the dominant failure mode is context divergence: separate agents accumulate incompatible interpretations of the same task, and reconciliation costs exceed any parallelism gain. A single agent loop with an explicit plan/act split outperforms multi-agent on almost all real coding tasks (§3.1, docs/research/ai-coding-best-practices.md). Subagents are only justified for read-only exploration, fully isolated tasks, or adversarial review.
- ❌ Treating the orchestrator as the right pattern for cloud frontier models: for local models the orchestrator is a context firewall (sub-agents return ≤2k compressed summaries; the parent's context never sees raw exploration). Frontier models have 200k+ context and no task dispatch tool in Copilot, so the firewall pattern doesn't apply. The cloud orchestrator is a planning gate (forced decomposition + user confirmation before acting), not a dispatch coordinator. The <!-- @local --> / <!-- @cloud --> blocks in orchestrator.md encode this distinction. See §3.4 of docs/research/ai-coding-best-practices.md.