timmy-home/reports/production/2026-03-31-claude-code-deep-dive.md

# CLAUDE CODE SOURCE CODE DEEP DIVE ANALYSIS
## /tmp/claude-code-src/src/ — 1,884 files, 512K lines of TypeScript

---

## 1. ARCHITECTURE OVERVIEW

### Top-Level Directory Structure (src/):
```
assistant/      - Kairos assistant mode (feature-gated)
bootstrap/      - Global state initialization (state.js holds session-wide mutable state)
bridge/         - Bridge to external integrations
buddy/          - Buddy/companion feature
cli/            - CLI argument parsing and entry
commands/       - Slash commands (/compact, /clear, etc.)
components/     - React/Ink UI components
constants/      - System prompts, product config, OAuth
context/        - Context management (notifications, stats)
coordinator/    - Coordinator mode for multi-agent orchestration
entrypoints/    - Multiple entry points (init.js, agentSdkTypes)
hooks/          - React hooks (useCanUseTool, etc.)
ink/            - Terminal UI framework (Ink-based)
keybindings/    - Terminal keybinding handlers
memdir/         - Memory directory system (memdir.ts)
migrations/     - Config/data migrations
native-ts/      - Native TypeScript utilities
outputStyles/   - Output formatting styles
plugins/        - Plugin system (bundled plugins)
query/          - Query loop helpers (config, deps, transitions, tokenBudget, stopHooks)
remote/         - Remote execution support
schemas/        - Zod schemas
screens/        - UI screens
server/         - Server mode
services/       - Core services (API, MCP, analytics, compact, tools, etc.)
skills/         - Skill system (bundled skills)
state/          - AppState management
tasks/          - Background task management (LocalAgentTask, LocalShellTask, RemoteAgentTask)
tools/          - All tool implementations (40+ tools)
types/          - TypeScript type definitions
upstreamproxy/  - Upstream proxy support
utils/          - Utilities (permissions, git, model, config, etc.)
vim/            - Vim mode support
voice/          - Voice input support
```

### Key Entry Files:
- `main.tsx` (4,683 lines) — CLI entry point, Commander.js argument parsing, session setup
- `query.ts` (1,729 lines) — THE MAIN AGENTIC LOOP
- `Tool.ts` (792 lines) — Tool interface/type definitions
- `tools.ts` (389 lines) — Tool registry/assembly
- `context.ts` (189 lines) — System/user context (git status, CLAUDE.md)
- `cost-tracker.ts` (323 lines) — Cost tracking
- `costHook.ts` (22 lines) — React hook for cost display on exit

---

## 2. THE AGENTIC LOOP (query.ts)

### Core Architecture:
The loop is an **async generator** — `query()` at line 219 delegates to `queryLoop()` at line 241, which is a `while(true)` loop (line 307) that yields `StreamEvent | Message | TombstoneMessage` events.

### Loop State (lines 204-217):
```typescript
type State = {
  messages: Message[]
  toolUseContext: ToolUseContext
  autoCompactTracking: AutoCompactTrackingState | undefined
  maxOutputTokensRecoveryCount: number
  hasAttemptedReactiveCompact: boolean
  maxOutputTokensOverride: number | undefined
  pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
  stopHookActive: boolean | undefined
  turnCount: number
  transition: Continue | undefined  // Why the previous iteration continued
}
```

### Each Iteration Does (in order):
1. **Skill discovery prefetch** (line 331) — fires async while model streams
2. **Tool result budget** (line 379) — `applyToolResultBudget()` limits per-message result sizes
3. **Snip compact** (line 401) — feature-gated HISTORY_SNIP trims old messages
4. **Microcompact** (line 414) — compresses tool results inline
5. **Context collapse** (line 441) — feature-gated, projects collapsed view
6. **Auto-compact** (line 454) — if above token threshold, summarizes conversation
7. **Blocking limit check** (line 637) — if tokens exceed hard limit, stop
8. **API call with streaming** (line 659) — `deps.callModel()` streams response
9. **Streaming tool execution** (line 563) — `StreamingToolExecutor` starts tools AS blocks arrive
10. **Post-sampling hooks** (line 1001)
11. **Stop decision** (line 1062) — if no tool_use blocks, check stop hooks
12. **Token budget continuation** (line 1308) — if budget not met, inject nudge and continue
13. **Tool execution** (line 1380-1408) — `runTools()` or `streamingToolExecutor.getRemainingResults()`
14. **Attachment messages** (line 1580) — memory, file changes, queued commands
15. **Max turns check** (line 1705) — if exceeded, stop
16. **State update and continue** (line 1715)

### Stop Conditions:
- No `tool_use` blocks in response → completed (line 1062)
- API error → model_error (line 996)
- User abort → aborted_streaming/aborted_tools (lines 1051, 1515)
- Blocking limit → blocking_limit (line 646)
- Max turns → max_turns (line 1711)
- Stop hook → stop_hook_prevented (line 1279)
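
The loop shape and its stop reasons can be sketched as follows. This is a minimal illustration, not the code in query.ts: `queryLoopSketch`, `Deps`, `LoopEvent`, and `isAborted` are hypothetical names, and only three of the stop reasons listed above are modeled.

```typescript
// Simplified, hypothetical types; the real Message/StreamEvent types are richer.
type ToolUse = { id: string; name: string };
type AssistantTurn = { text: string; toolUses: ToolUse[] };
type LoopEvent =
  | { type: "assistant"; turn: AssistantTurn }
  | { type: "tool_result"; toolUseId: string; output: string };
type StopReason = "completed" | "max_turns" | "aborted_streaming";

type Deps = {
  callModel: (history: LoopEvent[]) => Promise<AssistantTurn>;
  runTool: (use: ToolUse) => Promise<string>;
};

async function* queryLoopSketch(
  deps: Deps,
  maxTurns: number,
  isAborted: () => boolean = () => false,
): AsyncGenerator<LoopEvent, StopReason> {
  const history: LoopEvent[] = [];
  let turnCount = 0;
  while (true) {
    if (isAborted()) return "aborted_streaming";

    const turn = await deps.callModel(history);
    const assistantEvent: LoopEvent = { type: "assistant", turn };
    history.push(assistantEvent);
    yield assistantEvent;

    // No tool_use blocks in the response: the loop is finished.
    if (turn.toolUses.length === 0) return "completed";

    // Run each requested tool and feed the result back into the history.
    for (const use of turn.toolUses) {
      const output = await deps.runTool(use);
      const resultEvent: LoopEvent = { type: "tool_result", toolUseId: use.id, output };
      history.push(resultEvent);
      yield resultEvent;
    }

    turnCount += 1;
    if (turnCount >= maxTurns) return "max_turns";
  }
}
```

The generator's return value carries the stop reason, so a caller that needs it has to drive the generator with `next()` rather than `for await`, which discards the return value.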

### Retry/Recovery:
- **Model fallback** (line 894): on `FallbackTriggeredError`, switch to fallback model
- **Reactive compact** (line 1119): on prompt-too-long 413, try compact then retry
- **Max output tokens recovery** (line 1223): inject "resume" message, retry up to limit
- **Escalated tokens** (line 1199): if the response hits the 8K default output limit, retry at 64K
- **Context collapse drain** (line 1094): drain staged collapses before reactive compact
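
A bounded recovery policy for the max-output-tokens case might look like the sketch below. The ordering (escalate once, then inject "resume" messages up to a limit) is an assumption drawn from the bullets above; `onMaxOutputTokensHit`, the 64000 value, and the limit of 3 are illustrative, since the source does not state the exact numbers.

```typescript
// Assumed values: the source only says "64K", and the retry limit is not
// stated, so 3 is a placeholder.
const ESCALATED_MAX_OUTPUT_TOKENS = 64000;
const MAX_RECOVERY_ATTEMPTS = 3;

type RecoveryState = {
  maxOutputTokensOverride: number | undefined;
  maxOutputTokensRecoveryCount: number;
};

type RecoveryAction =
  | { kind: "escalate"; next: RecoveryState }
  | { kind: "resume"; next: RecoveryState }
  | { kind: "give_up" };

// Decide what to do after a response was cut off at the output-token limit.
function onMaxOutputTokensHit(state: RecoveryState): RecoveryAction {
  // First hit at the default limit: retry the same request with a larger limit.
  if (state.maxOutputTokensOverride === undefined) {
    return {
      kind: "escalate",
      next: { ...state, maxOutputTokensOverride: ESCALATED_MAX_OUTPUT_TOKENS },
    };
  }
  // Already escalated: ask the model to resume, but only a bounded number of times.
  if (state.maxOutputTokensRecoveryCount < MAX_RECOVERY_ATTEMPTS) {
    return {
      kind: "resume",
      next: {
        ...state,
        maxOutputTokensRecoveryCount: state.maxOutputTokensRecoveryCount + 1,
      },
    };
  }
  return { kind: "give_up" };
}
```

Keeping the counter in the loop state is what makes the retry bounded: each `resume` increments it, and the loop stops retrying once the limit is reached.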

---

## 3. TOOL SYSTEM

### Tool Interface (Tool.ts, lines 362-695):
Every tool implements the `Tool<Input, Output, Progress>` interface:
- `name: string` — unique identifier
- `inputSchema: Input` — Zod schema for validation
- `call(args, context, canUseTool, parentMessage, onProgress)` — execution
- `description(input, options)` — dynamic prompt text
- `prompt(options)` — tool prompt for system prompt
- `checkPermissions(input, context)` — tool-specific permission logic
- `isReadOnly(input)` — whether tool modifies state
- `isConcurrencySafe(input)` — whether safe to run in parallel
- `isEnabled()` — whether available in current environment
- `maxResultSizeChars` — result size limit before disk persistence
- `mapToolResultToToolResultBlockParam(content, toolUseID)` — convert to API format
- `validateInput(input, context)` — pre-execution validation
- `toAutoClassifierInput(input)` — compact representation for security classifier

### Tool Building (Tool.ts, lines 757-792):
`buildTool()` applies defaults:
- `isEnabled: () => true`
- `isConcurrencySafe: () => false` (fail-closed)
- `isReadOnly: () => false` (fail-closed)
- `checkPermissions: () => { behavior: 'allow', updatedInput }`
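
The fail-closed defaults can be illustrated with a small sketch. `ToolSketch`, `ToolDef`, and `buildToolSketch` are hypothetical, heavily reduced stand-ins for the real `Tool` interface and `buildTool()`; only the four defaults listed above are taken from the source.

```typescript
type PermissionResult<I> =
  | { behavior: "allow"; updatedInput: I }
  | { behavior: "deny"; message: string };

// Reduced stand-in for the Tool interface: only the fields needed here.
interface ToolSketch<I> {
  name: string;
  call(input: I): string;
  isEnabled(): boolean;
  isConcurrencySafe(input: I): boolean;
  isReadOnly(input: I): boolean;
  checkPermissions(input: I): PermissionResult<I>;
}

// A definition must supply name and call; everything else is optional.
type ToolDef<I> = { name: string; call(input: I): string } &
  Partial<Omit<ToolSketch<I>, "name" | "call">>;

function buildToolSketch<I>(def: ToolDef<I>): ToolSketch<I> {
  return {
    // Defaults come first so the definition can override them.
    isEnabled: () => true,
    isConcurrencySafe: () => false, // fail-closed: assume unsafe to parallelize
    isReadOnly: () => false, // fail-closed: assume the tool mutates state
    checkPermissions: (input: I): PermissionResult<I> => ({
      behavior: "allow",
      updatedInput: input,
    }),
    ...def,
  };
}

// A read-only tool opts in explicitly; a shell-like tool keeps the defaults.
const grepLike = buildToolSketch<{ pattern: string }>({
  name: "GrepLike",
  call: (input) => `searching for ${input.pattern}`,
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
});
const bashLike = buildToolSketch<{ command: string }>({
  name: "BashLike",
  call: (input) => `running ${input.command}`,
});
```

With this layout a tool author who forgets to declare `isConcurrencySafe` gets serial execution rather than an unsafe parallel run.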

### Complete Tool List (tools.ts, getAllBaseTools lines 193-250):
**Core Tools:**
- AgentTool — spawns sub-agents (THE key tool)
- BashTool — shell command execution
- FileReadTool — read files
- FileEditTool — edit files (search-and-replace)
- FileWriteTool — write entire files
- GlobTool — file pattern matching
- GrepTool — content search (ripgrep-backed)
- NotebookEditTool — Jupyter notebook editing
- WebFetchTool — HTTP fetch
- WebSearchTool — web search

**Task/Plan Tools:**
- TaskStopTool — stop agent execution
- TaskOutputTool — output from agent tasks
- TodoWriteTool — write todo items
- TaskCreateTool, TaskGetTool, TaskUpdateTool, TaskListTool — task management (v2)
- EnterPlanModeTool, ExitPlanModeV2Tool — plan mode

**Agent/Swarm Tools:**
- TeamCreateTool, TeamDeleteTool — multi-agent teams
- SendMessageTool — inter-agent communication
- ListPeersTool — list peer agents (UDS)

**Other Tools:**
- AskUserQuestionTool — ask user for input
- SkillTool — invoke registered skills
- BriefTool — brief/summary generation
- ConfigTool — configuration (ant-only)
- TungstenTool — internal (ant-only)
- LSPTool — Language Server Protocol
- ListMcpResourcesTool, ReadMcpResourceTool — MCP resources
- ToolSearchTool — search for deferred tools
- EnterWorktreeTool, ExitWorktreeTool — git worktree isolation
- SleepTool — wait for events (proactive mode)
- CronCreate/Delete/ListTool — scheduled triggers
- RemoteTriggerTool — remote triggers
- MonitorTool — shell monitoring
- PowerShellTool — Windows PowerShell
- SyntheticOutputTool — structured output
- VerifyPlanExecutionTool — verify plan execution
- SnipTool — history snipping
- WorkflowTool — workflow scripts
- WebBrowserTool — full web browser
- TerminalCaptureTool — terminal capture
- OverflowTestTool, CtxInspectTool — debugging
- REPLTool — REPL environment (ant-only)

### Tool Registration:
`assembleToolPool()` (tools.ts line 345) merges built-in + MCP tools, sorted for prompt cache stability. MCP tools are filtered by deny rules. Built-in tools take precedence on name conflicts via `uniqBy`.
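
A rough sketch of that merge, assuming built-ins are sorted and placed ahead of the sorted MCP tools so that the first occurrence of a name wins. `assembleToolPoolSketch`, `NamedTool`, and the plain name-based deny set are illustrative simplifications, not the actual signature.

```typescript
type NamedTool = { name: string; source: "builtin" | "mcp" };

// Deterministic, locale-independent ordering keeps the tool list byte-stable.
function byName(a: NamedTool, b: NamedTool): number {
  return a.name < b.name ? -1 : a.name > b.name ? 1 : 0;
}

function assembleToolPoolSketch(
  builtIn: NamedTool[],
  mcp: NamedTool[],
  deniedNames: ReadonlySet<string>,
): NamedTool[] {
  const sortedBuiltIn = [...builtIn].sort(byName);
  const sortedMcp = mcp.filter((tool) => !deniedNames.has(tool.name)).sort(byName);

  // Built-ins come first, so on a name conflict the built-in is kept.
  const seen = new Set<string>();
  const pool: NamedTool[] = [];
  for (const tool of [...sortedBuiltIn, ...sortedMcp]) {
    if (seen.has(tool.name)) continue;
    seen.add(tool.name);
    pool.push(tool);
  }
  return pool;
}
```

Because the output order depends only on tool names, adding or removing an MCP server does not reorder the built-in prefix, which is the property prompt caching relies on.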

### Tool Orchestration (services/tools/toolOrchestration.ts):
`runTools()` partitions tool calls into:
- **Concurrent batches** — if all tools in batch are `isConcurrencySafe`, run in parallel (up to 10, configurable via CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY)
- **Serial batches** — tools that are not concurrency-safe (typically those that modify state) run one at a time
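
One plausible way to do this partitioning is to group consecutive concurrency-safe calls and isolate every other call in its own batch. The sketch below assumes that rule; `partitionToolCalls`, `ToolCall`, and `Batch` are hypothetical names, and the concurrency cap that would apply when a concurrent batch runs is not modeled.

```typescript
type ToolCall = { id: string; concurrencySafe: boolean };
type Batch = { mode: "concurrent" | "serial"; calls: ToolCall[] };

// Preserve the model's call order: consecutive safe calls share a batch,
// and every unsafe call gets a batch of its own.
function partitionToolCalls(calls: ToolCall[]): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.concurrencySafe && last !== undefined && last.mode === "concurrent") {
      last.calls.push(call);
    } else {
      batches.push({
        mode: call.concurrencySafe ? "concurrent" : "serial",
        calls: [call],
      });
    }
  }
  return batches;
}
```

For the sequence read, read, write, read this yields three batches: two reads in parallel, the write alone, then the final read, so a write never overlaps with reads issued around it.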

`StreamingToolExecutor` (services/tools/StreamingToolExecutor.ts) starts tool execution AS tool_use blocks arrive during streaming, not waiting for the full response.
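
The overlap can be sketched as an executor that starts a tool the moment its block is added and only awaits the results later. `StreamingExecutorSketch` and `addTool` are hypothetical; `getRemainingResults` mirrors the method name mentioned earlier, but its behavior here is an assumption.

```typescript
type ToolUseBlock = { id: string; name: string };
type ToolResult = { toolUseId: string; output: string };
type Runner = (block: ToolUseBlock) => Promise<string>;

class StreamingExecutorSketch {
  private pending: Promise<ToolResult>[] = [];
  private readonly run: Runner;

  constructor(run: Runner) {
    this.run = run;
  }

  // Called as each tool_use block arrives from the stream: execution starts
  // immediately instead of waiting for the full response.
  addTool(block: ToolUseBlock): void {
    const started = this.run(block).then(
      (output): ToolResult => ({ toolUseId: block.id, output }),
      // Convert failures into results so an early rejection is never unhandled.
      (error): ToolResult => ({ toolUseId: block.id, output: `error: ${String(error)}` }),
    );
    this.pending.push(started);
  }

  // Called after the stream ends: wait for whatever is still running.
  async getRemainingResults(): Promise<ToolResult[]> {
    const results = await Promise.all(this.pending);
    this.pending = [];
    return results;
  }
}
```

Results come back in the order the blocks were added, regardless of which tool finishes first, because `Promise.all` preserves input order.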

---

## 4. CONTEXT/MEMORY MANAGEMENT

### Auto-Compact (services/compact/autoCompact.ts):
- Threshold: `effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS`
- `shouldAutoCompact()` checks token count via `tokenCountWithEstimation()`
- Circuit breaker: stops after 3 consecutive failures (`MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES`)
- Tries `trySessionMemoryCompaction()` first (lighter)
- Then calls `compactConversation()`, which forks a sub-agent to summarize
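
The threshold and circuit breaker combine roughly as in this sketch. The buffer value of 10000 is a placeholder (the source does not give it), and `shouldAutoCompactSketch` and `recordCompactOutcome` are simplified, hypothetical versions of the real functions.

```typescript
// Placeholder value: the real AUTOCOMPACT_BUFFER_TOKENS is not given in the source.
const AUTOCOMPACT_BUFFER_TOKENS = 10000;
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

type AutoCompactTracking = { consecutiveFailures: number };

function shouldAutoCompactSketch(
  tokenCount: number,
  effectiveContextWindow: number,
  tracking: AutoCompactTracking,
): boolean {
  // Circuit breaker: after repeated failures, stop trying to compact.
  if (tracking.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
    return false;
  }
  const threshold = effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS;
  return tokenCount > threshold;
}

// A success resets the breaker; a failure moves it one step closer to tripping.
function recordCompactOutcome(
  tracking: AutoCompactTracking,
  succeeded: boolean,
): AutoCompactTracking {
  return { consecutiveFailures: succeeded ? 0 : tracking.consecutiveFailures + 1 };
}
```

With a 200000-token window and this placeholder buffer, compaction would trigger once the estimate passes 190000 tokens, and would stop triggering after three failed attempts in a row.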

### Multi-Layer Compaction:
1. **Snip compact** — removes old messages from history (HISTORY_SNIP feature)
2. **Microcompact** — compresses individual tool results inline
3. **Context collapse** — progressive collapse of old context (CONTEXT_COLLAPSE feature)
4. **Auto-compact** — full conversation summarization (when above threshold)
5. **Reactive compact** — emergency compact on API 413 error (prompt-too-long)
6. **Session memory compact** — session memory aware compaction

### CLAUDE.md Memory System (utils/claudemd.ts):
Four-tier memory hierarchy (lines 1-26):
1. **Managed memory** — `/etc/claude-code/CLAUDE.md` (system-wide)
2. **User memory** — `~/.claude/CLAUDE.md` (user-global)
3. **Project memory** — `CLAUDE.md`, `.claude/CLAUDE.md`, `.claude/rules/*.md` (per-project)
4. **Local memory** — `CLAUDE.local.md` (per-project, gitignored)

Discovery: Traverses from CWD up to root. Files closer to CWD have higher priority.
Supports `@include` directives for file inclusion.
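
The directory traversal can be illustrated with a pure path computation. This sketch assumes POSIX-style paths and returns candidates ordered from the root down to CWD, so later entries are the higher-priority ones. `memoryCandidates` is a hypothetical helper; it ignores `.claude/rules/*.md` (which needs a directory listing), the managed and user tiers, and `@include` resolution.

```typescript
// File names from the project and local tiers described above.
const MEMORY_FILES = ["CLAUDE.md", ".claude/CLAUDE.md", "CLAUDE.local.md"];

// List every path where a memory file could live between the root and cwd.
function memoryCandidates(cwd: string): string[] {
  const segments = cwd.split("/").filter((segment) => segment.length > 0);

  // Build the chain of directories: root first, cwd last.
  const dirs: string[] = [""]; // "" stands for the filesystem root
  let current = "";
  for (const segment of segments) {
    current = `${current}/${segment}`;
    dirs.push(current);
  }

  const candidates: string[] = [];
  for (const dir of dirs) {
    for (const file of MEMORY_FILES) {
      candidates.push(`${dir}/${file}`);
    }
  }
  return candidates;
}
```

For `/repo/pkg` this produces nine candidates, starting at `/CLAUDE.md` and ending at `/repo/pkg/CLAUDE.local.md`. A loader would then read whichever of these exist and let later entries take precedence.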

### Context Assembly (context.ts):
- `getUserContext()` — loads CLAUDE.md content + current date
- `getSystemContext()` — git status snapshot (branch, last 5 commits, status)
- Both are memoized per session

---

## 5. PERMISSION/SAFETY SYSTEM

### Permission Modes (utils/permissions/PermissionMode.ts):
- `default` — ask for write operations
- `plan` — model proposes, user approves
- `auto` — AI classifier decides (TRANSCRIPT_CLASSIFIER feature)
- `bypassPermissions` — allow everything
- `acceptEdits` — allow file edits without asking
- `bubble` — bubble permission prompts to parent agent

### Permission Check Flow (utils/permissions/permissions.ts, checkRuleBasedPermissions line 1071):
1. **1a. Deny rules** — check if tool is blanket-denied
2. **1b. Ask rules** — check if tool has explicit ask rule
3. **1c. Tool-specific** — call `tool.checkPermissions()` (e.g., bash subcommand matching)
4. **1d. Tool deny** — tool implementation denied
5. **1f. Content-specific ask** — tool returned ask with rule pattern
6. **1g. Safety checks** — protected paths (.git, .claude, shell configs)
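
The ordering of these checks matters more than any single rule, which the following sketch tries to show. It is a simplification under stated assumptions: rules match on tool name only (no patterns), the protected-path list contains just `.git` and `.claude`, and `checkRuleBasedSketch`, `Rules`, and `PermissionRequest` are hypothetical names.

```typescript
type Decision = { behavior: "allow" | "ask" | "deny"; reason: string };
type Rules = { deny: string[]; ask: string[] };
type PermissionRequest = {
  toolName: string;
  path?: string;
  toolCheck: () => Decision; // stands in for tool.checkPermissions()
};

// Assumed subset of the protected locations mentioned above.
const PROTECTED_SEGMENTS = [".git", ".claude"];

function checkRuleBasedSketch(req: PermissionRequest, rules: Rules): Decision {
  // Blanket deny rules win over everything else.
  if (rules.deny.includes(req.toolName)) {
    return { behavior: "deny", reason: "deny rule" };
  }
  // Explicit ask rules come next.
  if (rules.ask.includes(req.toolName)) {
    return { behavior: "ask", reason: "ask rule" };
  }
  // Tool-specific logic may deny or ask on its own.
  const toolDecision = req.toolCheck();
  if (toolDecision.behavior !== "allow") return toolDecision;
  // Safety check: protected paths always require confirmation.
  const path = req.path;
  if (path !== undefined && path.split("/").some((seg) => PROTECTED_SEGMENTS.includes(seg))) {
    return { behavior: "ask", reason: "protected path" };
  }
  return toolDecision;
}
```

Because deny rules are evaluated first, a tool's own `allow` can never override an administrator's blanket deny.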

### Security Classifier (utils/permissions/yoloClassifier.ts):
- Used in `auto` mode
- Calls a separate Claude model (via `sideQuery()`) with the conversation transcript
- Uses compressed `toAutoClassifierInput()` from each tool
- Has its own system prompt (`auto_mode_system_prompt.txt`)
- Returns allow/deny decision with reasoning
- Falls back to prompting the user once the denial-tracking threshold is reached

### Denial Tracking (utils/permissions/denialTracking.ts):
- Tracks consecutive denials per tool
- After threshold, falls back to user prompting
- Prevents infinite deny loops
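
A minimal sketch of such a tracker, assuming a per-tool counter that resets on any allow. The threshold of 3 and the `DenialTrackerSketch` class are illustrative; the source does not state the actual threshold.

```typescript
// Placeholder: the real threshold is not given in the source.
const DENIAL_FALLBACK_THRESHOLD = 3;

class DenialTrackerSketch {
  private readonly consecutive = new Map<string, number>();

  recordDenial(toolName: string): void {
    this.consecutive.set(toolName, (this.consecutive.get(toolName) ?? 0) + 1);
  }

  // Any allow breaks the streak for that tool.
  recordAllow(toolName: string): void {
    this.consecutive.delete(toolName);
  }

  // Once the classifier has denied the same tool repeatedly, hand the
  // decision to the user instead of denying again.
  shouldFallBackToPrompt(toolName: string): boolean {
    return (this.consecutive.get(toolName) ?? 0) >= DENIAL_FALLBACK_THRESHOLD;
  }
}
```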

---

## 6. SYSTEM PROMPT

### Construction (constants/prompts.ts, getSystemPrompt line 444):
Returns a `string[]` (array of sections), assembled as:

**Static (cacheable) sections:**
1. Intro section — "You are an interactive agent..."
2. System section — tool behavior, permissions, tags
3. Doing tasks section — coding style, testing, git practices
4. Actions section — tool usage guidance
5. Using your tools section — tool-specific instructions
6. Tone and style section
7. Output efficiency section
8. `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` marker (separates cacheable from dynamic)

**Dynamic (per-session) sections (via registry):**
9. Session guidance — based on enabled tools
10. Memory — loaded from CLAUDE.md hierarchy
11. Environment info — OS, model, CWD, git info, knowledge cutoff
12. Language preference
13. Output style
14. MCP server instructions
15. Scratchpad instructions
16. Function result clearing
17. Tool result summarization
18. Numeric length anchors (ant-only)
19. Token budget instructions (feature-gated)

### Dynamic Boundary:
`SYSTEM_PROMPT_DYNAMIC_BOUNDARY` (line 114) separates globally-cacheable content from user-specific content. Everything before can use `scope: 'global'` for cross-user caching.
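
Splitting the section array at the marker could look like this. The marker's string value and the `splitAtBoundary` helper are placeholders for illustration; treating a missing marker as "nothing is globally cacheable" is a conservative assumption, not documented behavior.

```typescript
// Placeholder value; the real marker string is not shown in the source.
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

type SplitPrompt = { cacheable: string[]; dynamic: string[] };

function splitAtBoundary(sections: string[]): SplitPrompt {
  const index = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (index === -1) {
    // No marker: be conservative and treat every section as user-specific.
    return { cacheable: [], dynamic: [...sections] };
  }
  return {
    cacheable: sections.slice(0, index), // identical for all users
    dynamic: sections.slice(index + 1), // varies per session
  };
}
```

The cacheable half stays byte-identical across users and sessions, so it can be sent as a shared cache prefix, while the dynamic half changes per session.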

### System Prompt Sections Registry (constants/systemPromptSections.ts):
Uses `systemPromptSection()` and `DANGEROUS_uncachedSystemPromptSection()` to declare sections with caching behavior. `resolveSystemPromptSections()` resolves all async sections.

---

## 7. SUB-AGENT/TASK SYSTEM

### AgentTool (tools/AgentTool/AgentTool.tsx):
The main sub-agent spawning tool. Input schema (line 82):
- `description` — 3-5 word task summary
- `prompt` — the task to perform
- `subagent_type` — optional specialized agent type
- `model` — optional model override (sonnet/opus/haiku)
- `run_in_background` — async execution
- `isolation` — "worktree" or "remote" for isolation
- `cwd` — working directory override
- `name` — addressable name for SendMessage

### runAgent (tools/AgentTool/runAgent.ts, line 248):
`async function* runAgent()` — another async generator that:
1. Creates unique `agentId`
2. Resolves agent-specific model
3. Initializes agent MCP servers (if defined in frontmatter)
4. Creates agent-specific permission context
5. Calls `createSubagentContext()` to create isolated `ToolUseContext`
6. Builds agent system prompt with `getSystemPrompt()` + env details
7. Calls `query()` — THE SAME QUERY LOOP as the main thread
8. Records sidechain transcript for resume

### Agent Isolation:
- Each agent gets its own `ToolUseContext` with:
  - Cloned `readFileState` (file state cache)
  - Its own `abortController`
  - Separate permission mode
  - Can't access parent's tool JSX
- Worktree mode: creates git worktree for filesystem isolation
- Remote mode: launches on remote CCR environment
- Fork mode: shares parent's message context for prompt cache hits
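
The cloning part of this isolation can be sketched as follows. `ContextSketch` and `createSubagentContextSketch` are hypothetical, and a plain `{ aborted: boolean }` flag stands in for the real `abortController`; whether a parent abort propagates to children is not modeled.

```typescript
type ContextSketch = {
  agentId: string;
  readFileState: Map<string, string>; // path -> cached content
  abort: { aborted: boolean }; // stand-in for an AbortController
  permissionMode: string;
};

function createSubagentContextSketch(
  parent: ContextSketch,
  agentId: string,
  permissionMode: string,
): ContextSketch {
  return {
    agentId,
    // Copy the cache so the child starts warm but cannot mutate the parent's view.
    readFileState: new Map(parent.readFileState),
    // Fresh abort flag: stopping the child does not stop the parent.
    abort: { aborted: false },
    permissionMode,
  };
}
```

Copying the map gives the child the parent's knowledge of already-read files at spawn time, while later reads and edits in either agent stay local to that agent.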

### Built-in Agent Types (tools/AgentTool/built-in/):
- `generalPurposeAgent` — default agent
- `exploreAgent` — read-only exploration
- Custom agents loaded from the `.claude/agents/` directory

---

## 8. COST TRACKING

### Architecture:
- **State in bootstrap/state.js** — global mutable state: `totalCostUSD`, `modelUsage`, counters
- **cost-tracker.ts** — higher-level functions for formatting and persisting

### addToTotalSessionCost (cost-tracker.ts, line 278):
- Takes `cost`, `usage` (API response), `model`
- Accumulates per-model: inputTokens, outputTokens, cacheRead, cacheCreation, webSearchRequests
- Calculates cost via `calculateUSDCost()` (utils/modelCost.ts)
- Also tracks advisor model usage separately
- Feeds OpenTelemetry counters via `getCostCounter()?.add()`
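
Per-model accumulation can be sketched as below. The rate fields and their values are made up for illustration and are not real prices; `CostTrackerSketch` and `calculateUSDCostSketch` are hypothetical reductions of the functions named above, and web search requests, advisor usage, and OpenTelemetry are left out.

```typescript
type Usage = {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  cacheCreationTokens: number;
};

// Hypothetical rate card, in USD per million tokens.
type Rates = {
  inputPerMTok: number;
  outputPerMTok: number;
  cacheReadPerMTok: number;
  cacheCreationPerMTok: number;
};

type ModelTotals = Usage & { costUSD: number };

function calculateUSDCostSketch(usage: Usage, rates: Rates): number {
  const weighted =
    usage.inputTokens * rates.inputPerMTok +
    usage.outputTokens * rates.outputPerMTok +
    usage.cacheReadTokens * rates.cacheReadPerMTok +
    usage.cacheCreationTokens * rates.cacheCreationPerMTok;
  return weighted / 1000000;
}

class CostTrackerSketch {
  private readonly byModel = new Map<string, ModelTotals>();
  totalCostUSD = 0;

  // Add one API response's usage to the running per-model and session totals.
  add(model: string, usage: Usage, rates: Rates): void {
    const cost = calculateUSDCostSketch(usage, rates);
    const prev = this.byModel.get(model) ?? {
      inputTokens: 0,
      outputTokens: 0,
      cacheReadTokens: 0,
      cacheCreationTokens: 0,
      costUSD: 0,
    };
    this.byModel.set(model, {
      inputTokens: prev.inputTokens + usage.inputTokens,
      outputTokens: prev.outputTokens + usage.outputTokens,
      cacheReadTokens: prev.cacheReadTokens + usage.cacheReadTokens,
      cacheCreationTokens: prev.cacheCreationTokens + usage.cacheCreationTokens,
      costUSD: prev.costUSD + cost,
    });
    this.totalCostUSD += cost;
  }

  get(model: string): ModelTotals | undefined {
    return this.byModel.get(model);
  }
}
```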

### Persistence:
- `saveCurrentSessionCosts()` (line 143) — saves to project config on process exit
- `restoreCostStateForSession()` (line 130) — restores on session resume
- `formatTotalCost()` (line 228) — produces per-model breakdown string

### costHook.ts:
A simple React hook that prints cost summary and saves to config on process exit.

---

## 9. UNIQUE/NOVEL PATTERNS

### 1. Streaming Tool Execution
`StreamingToolExecutor` starts executing tools AS their blocks arrive during model streaming, not waiting for the complete response. This overlaps tool execution with model output generation.

### 2. Prompt Cache Stability Engineering
Tools are sorted alphabetically for cache stability. Built-in tools form a contiguous prefix. `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` separates globally-cacheable from user-specific content. Fork subagents inherit parent's `renderedSystemPrompt` to avoid cache busting.

### 3. Multi-Layer Context Management
Six compaction strategies (snip, microcompact, collapse, auto-compact, reactive compact, session memory compact) work in concert, each with different trigger points and tradeoffs.

### 4. Feature Gate Architecture
The codebase makes heavy use of `feature('FLAG_NAME')` from `bun:bundle` for dead code elimination at build time. Feature-gated code is completely removed from external builds, and gated modules are loaded through conditional `require()` calls inside feature blocks.

### 5. Tool Result Budget (utils/toolResultStorage.ts)
Per-message aggregate budget on tool result sizes. Large results are persisted to disk and replaced with a preview + file path. The `maxResultSizeChars` per tool controls thresholds.
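
The per-tool half of this mechanism might look like the sketch below. An in-memory `Map` stands in for disk, and the path format, the preview length parameter, and the wording of the truncation notice are all assumptions. `applyResultLimit` is a hypothetical helper, and the per-message aggregate budget is not modeled.

```typescript
// Replace an oversized tool result with a preview plus a pointer to the
// full content. `disk` is an in-memory stand-in for real file persistence.
function applyResultLimit(
  toolUseId: string,
  content: string,
  maxResultSizeChars: number,
  previewChars: number,
  disk: Map<string, string>,
): string {
  if (content.length <= maxResultSizeChars) return content;

  const path = `tool-results/${toolUseId}.txt`; // hypothetical path format
  disk.set(path, content);

  const preview = content.slice(0, previewChars);
  return `${preview}\n[truncated: ${content.length} chars total, full output saved to ${path}]`;
}
```

The model still sees enough of the output to decide whether it needs more, and can read the saved file explicitly if it does.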

### 6. Denial Tracking with Fallback
The permission system tracks consecutive denials and falls back to interactive prompting after a threshold, preventing infinite deny loops in auto mode.

### 7. Side Query Architecture
`sideQuery()` (utils/sideQuery.ts) forks lightweight model calls for classification, summarization, and memory retrieval WITHOUT blocking the main loop. Used by the YOLO classifier, compact, and skill discovery.

### 8. Agent Memory Prefetch
`startRelevantMemoryPrefetch()` fires at loop entry and is polled each iteration. Memory discovery runs in background while tools execute.

### 9. Tool Use Summary Generation
After each tool batch, the loop fires a Haiku call to generate a mobile-friendly summary. The call is asynchronous and is resolved during the next model call.

### 10. Attachment System
File changes, memory files, MCP resources, queued commands, and skill discoveries are injected as "attachment" messages between turns — invisible to the user but visible to the model.

---

## 10. COMPARISON TO HERMES — ACTIONABLE IMPROVEMENTS

### 1. STREAMING TOOL EXECUTION
Claude Code starts executing tool calls AS they stream in. Hermes should implement this — it can save seconds per turn when tools are I/O bound.

### 2. TOOL CONCURRENCY WITH PARTITIONING
Claude Code partitions tool calls into concurrent-safe and serial batches based on `isConcurrencySafe()`. Read-only tools run in parallel (up to 10). Hermes should tag tools as read-only and batch them.

### 3. MULTI-LAYER COMPACTION STRATEGY
Instead of a single compaction pass, implement layered compaction:
- Microcompact (truncate large tool results inline)
- Auto-compact (summarize when above threshold)
- Reactive compact (on API 413 errors)
This gives much better context utilization.

### 4. CLAUDE.MD HIERARCHY
The 4-tier memory system (system > user > project > local) with directory traversal and @include support is much more flexible than a flat memory file. Hermes should adopt the hierarchical discovery.

### 5. TOOL RESULT BUDGET
Persisting large tool results to disk and replacing them with previews prevents context pollution. This is critical for long sessions.

### 6. PROMPT CACHE STABILITY
Sort tools alphabetically, separate cacheable from dynamic prompt sections, and inherit parent prompts for sub-agents. Keeping the prompt prefix stable raises cache hit rates, which should reduce API costs.

### 7. CIRCUIT BREAKERS
Auto-compact has a circuit breaker (3 failures → stop). Max output tokens recovery has a limit. Hermes should implement similar guards against infinite retry loops.

### 8. STOP HOOKS
The stop hook system (query/stopHooks.ts) allows custom logic to decide whether to continue after the model stops. This enables quality gates.

### 9. TOKEN BUDGET CONTINUATION
When the user specifies a budget such as "+500k" or "spend 2M tokens", the system automatically continues the model with nudge messages until the budget is met. This is a novel UX feature.
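
For a Hermes port, the budget handling could start from something like this sketch. The accepted syntax (a `+` prefix or the word "spend", followed by a number and a `k` or `M` suffix) is inferred from the two examples above. `parseTokenBudget` and `shouldNudge` are hypothetical helpers, not Claude Code functions.

```typescript
// Parse budgets such as "+500k" or "spend 2M tokens" into a token count.
// Returns undefined when no budget is present in the text.
function parseTokenBudget(text: string): number | undefined {
  const match = /(?:\+|spend\s+)(\d+(?:\.\d+)?)\s*([km])\b/i.exec(text);
  if (match === null) return undefined;
  const amount = match[1];
  const unit = match[2];
  if (amount === undefined || unit === undefined) return undefined;
  const scale = unit.toLowerCase() === "k" ? 1000 : 1000000;
  return Math.round(parseFloat(amount) * scale);
}

// After the model stops on its own, decide whether to inject a nudge and continue.
function shouldNudge(budget: number | undefined, tokensSpent: number): boolean {
  return budget !== undefined && tokensSpent < budget;
}
```

A loop would call `shouldNudge` at the point where it would otherwise stop, and append a short "keep going" message when it returns true.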

### 10. DENY RULE ARCHITECTURE
The layered permission system with deny/ask/allow rules from multiple sources (CLI, settings, session, managed) with pattern matching (Bash(git *), etc.) is much more granular than simple tool-level allow/deny.