Pillar — AI Collaboration

How to make an agent productive in your repo on day one and durably good across sessions.

Status

◐ Scoped, not yet detailed. This is the most distinctive pillar of the playbook — it captures lessons that have no analogue in pre-agent software development.

Scope

| Concern | Universal principle | Concrete pattern |
| --- | --- | --- |
| Bootstrap doc | One file an agent reads first, every session | CLAUDE.md (or AGENTS.md) at the repo root with non-negotiables + routing |
| Agent compatibility | One playbook, every agent: the bootstrap doc is universal, only the filename differs | Canonical AGENTS.md + thin per-tool pointers (.cursor/rules, .windsurfrules, copilot-instructions.md) |
| Routing table | Map "I want to change X" → "edit path Y" | AGENTS.md table; agents triage faster by group than by file |
| Persistent memory | Lessons survive session ends | MEMORY.md (index) + memory/*.md (one fact per file) pattern |
| Open Knowledge Format | Share curated knowledge as a bundle any agent can read | OKF: directory of markdown + YAML frontmatter (type required), index.md, markdown-link graph |
| Goal mode | Agent works toward a condition, not a turn count | Stop hook with goal condition; clears when the condition holds |
| Sub-agents | Long fan-outs delegate to scoped specialist agents | Sub-agent recipes per task class (search, plan, review, implement) |
| Slash commands | Repeated workflows become palette entries | /goal, /loop, /review, /clear, plus project-specific |
| System prompts | Per-role prompts (architect, reviewer, fixer) | Reusable role files; injected per task |
| Verify-first | Before acting, confirm the state is what you think it is | Default gh issue view, git fetch, pwd at session start |
| Single sub-unit | One discrete shippable change per session | Defined up front; no scope creep |
| Honest reporting | Faithful state, not optimistic state | "Tests failed: <output>", not "Tests pass after I fix the unrelated thing" |
| Duplication detection | Verify against real exports, not doc names | npm pack + read .d.ts, never trust naming similarity |
| Concurrent-merge survival | Multiple agents pushing to main | Stash-verify red, rebase clean, retry; pre-push hook covers structural drift |
| Tool & capability design | Tools are the model's API; design for a non-human user | Intent-altitude tools, constrained schemas, skills/artifacts as the product↔model abstraction |
| Prompt as versioned asset | Prompts are behavior, not config | Prompt registry; version + hash + eval delta; A/B on traffic; flag-flip rollback |
| Context management | The context window is the program the model runs | Budgeted window, top-k retrieval, edge-positioning, compaction, external memory |
| Hallucination reduction | Confident wrongness destroys trust | Ground + cite, constrain, verify (faithfulness judge), first-class abstention |
| Human-in-the-loop | Earn autonomy; humans gate by blast radius | Approve/edit/escalate ladder; corrections captured as eval + prompt signal |
| Agent evaluation | A correct generation can't be a lucky one | See quality pillar: deterministic + LLM-as-judge + production monitoring |
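
To make the "Prompt as versioned asset" row concrete, here is a minimal sketch of a registry that versions and hashes prompts and rolls back by moving a pointer. The class names, method names, and in-memory storage are all assumptions made for illustration; the playbook does not prescribe an implementation, and the eval delta and A/B pieces are left out.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PromptVersion:
    version: int
    text: str
    sha256: str  # content hash, so unchanged text is detectable


@dataclass
class PromptRegistry:
    """Hypothetical in-memory registry; a real one would persist its history."""
    _versions: dict[str, list[PromptVersion]] = field(default_factory=dict)
    _active: dict[str, int] = field(default_factory=dict)  # name -> active version

    def register(self, name: str, text: str) -> PromptVersion:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        history = self._versions.setdefault(name, [])
        # Text identical to the latest version returns it instead of bumping.
        if history and history[-1].sha256 == digest:
            return history[-1]
        entry = PromptVersion(version=len(history) + 1, text=text, sha256=digest)
        history.append(entry)
        self._active[name] = entry.version  # a new version becomes active
        return entry

    def active(self, name: str) -> PromptVersion:
        return self._versions[name][self._active[name] - 1]

    def rollback(self, name: str, version: int) -> PromptVersion:
        # The "flag flip": move the active pointer; history is never rewritten.
        if not 1 <= version <= len(self._versions[name]):
            raise ValueError(f"unknown version {version} for prompt {name!r}")
        self._active[name] = version
        return self.active(name)


# Illustrative usage with made-up prompt text.
registry = PromptRegistry()
v1 = registry.register("reviewer", "Review the diff for correctness.")
v2 = registry.register("reviewer", "Review the diff for correctness and scope creep.")
restored = registry.rollback("reviewer", 1)
```

Keeping rollback as a pointer move rather than a delete means the history stays intact, so any version can later be compared against an eval run.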

Non-negotiables

  1. CLAUDE.md / AGENTS.md is mandatory. No agent starts work without one.
  2. Persistent memory grows from lessons, not from chat. Each memory file is a fact + how to apply it.
  3. Verify-first. Confirm the actual state before acting; the state at session start may not match the state at PR open.
  4. Honest reporting. Tests that failed are reported failed. Steps skipped are reported skipped.
  5. One sub-unit per session. Quality over speed.
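
Non-negotiable 2 could be sketched as follows: each lesson is written to its own file as a fact plus how to apply it, and MEMORY.md is regenerated as an index over those files. Only the MEMORY.md and memory/*.md names come from this page; the "Fact:" and "How to apply:" line layout, the function names, and the sample lessons are assumptions made for illustration.

```python
from pathlib import Path
import tempfile


def write_memory(root: Path, slug: str, fact: str, how_to_apply: str) -> Path:
    """Store one lesson as its own file: the fact, then how to apply it."""
    memory_dir = root / "memory"
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / f"{slug}.md"
    # Assumed two-line layout; the playbook only requires "fact + how to apply it".
    path.write_text(f"Fact: {fact}\nHow to apply: {how_to_apply}\n", encoding="utf-8")
    return path


def rebuild_index(root: Path) -> str:
    """Regenerate MEMORY.md with one line per memory file, sorted for stable diffs."""
    lines = []
    for path in sorted((root / "memory").glob("*.md")):
        first_line = path.read_text(encoding="utf-8").splitlines()[0]
        lines.append(f"- memory/{path.name}: {first_line.removeprefix('Fact: ')}")
    index = "\n".join(lines) + "\n"
    (root / "MEMORY.md").write_text(index, encoding="utf-8")
    return index


# Illustrative usage in a throwaway directory; the lessons are made up.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    write_memory(repo, "rebase-before-push", "main moves while a session runs",
                 "run git fetch and rebase before pushing")
    write_memory(repo, "check-exports", "doc names can differ from real exports",
                 "read the packed .d.ts before reusing a name")
    index_text = rebuild_index(repo)
```

Regenerating the index from the files, rather than editing it by hand, keeps MEMORY.md from drifting away from what is actually stored under memory/.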

See also

Roadmap

  • universal.md
  • bootstrap-doc-pattern.md
  • agent-compatibility-pattern.md
  • memory-pattern.md
  • sub-agent-pattern.md
  • slash-commands-pattern.md
  • concurrent-agent-pattern.md
  • self-describe-pattern.md
  • open-knowledge-format-pattern.md
  • tool-design-pattern.md
  • prompt-versioning-pattern.md
  • context-management-pattern.md
  • hallucination-reduction-pattern.md
  • human-in-the-loop-pattern.md
  • verify-first-pattern.md
  • honest-confidence-pattern.md