timmy-home/reports/production/2026-03-31-claude-code-deep-dive.md

CLAUDE CODE SOURCE CODE DEEP DIVE ANALYSIS

/tmp/claude-code-src/src/ — 1,884 files, 512K lines of TypeScript


1. ARCHITECTURE OVERVIEW

Top-Level Directory Structure (src/):

assistant/      - Kairos assistant mode (feature-gated)
bootstrap/      - Global state initialization (state.js holds session-wide mutable state)
bridge/         - Bridge to external integrations
buddy/          - Buddy/companion feature
cli/            - CLI argument parsing and entry
commands/       - Slash commands (/compact, /clear, etc.)
components/     - React/Ink UI components
constants/      - System prompts, product config, OAuth
context/        - Context management (notifications, stats)
coordinator/    - Coordinator mode for multi-agent orchestration
entrypoints/    - Multiple entry points (init.js, agentSdkTypes)
hooks/          - React hooks (useCanUseTool, etc.)
ink/            - Terminal UI framework (Ink-based)
keybindings/    - Terminal keybinding handlers
memdir/         - Memory directory system (memdir.ts)
migrations/     - Config/data migrations
native-ts/      - Native TypeScript utilities
outputStyles/   - Output formatting styles
plugins/        - Plugin system (bundled plugins)
query/          - Query loop helpers (config, deps, transitions, tokenBudget, stopHooks)
remote/         - Remote execution support
schemas/        - Zod schemas
screens/        - UI screens
server/         - Server mode
services/       - Core services (API, MCP, analytics, compact, tools, etc.)
skills/         - Skill system (bundled skills)
state/          - AppState management
tasks/          - Background task management (LocalAgentTask, LocalShellTask, RemoteAgentTask)
tools/          - All tool implementations (40+ tools)
types/          - TypeScript type definitions
upstreamproxy/  - Upstream proxy support
utils/          - Utilities (permissions, git, model, config, etc.)
vim/            - Vim mode support
voice/          - Voice input support

Key Entry Files:

  • main.tsx (4,683 lines) — CLI entry point, Commander.js argument parsing, session setup
  • query.ts (1,729 lines) — THE MAIN AGENTIC LOOP
  • Tool.ts (792 lines) — Tool interface/type definitions
  • tools.ts (389 lines) — Tool registry/assembly
  • context.ts (189 lines) — System/user context (git status, CLAUDE.md)
  • cost-tracker.ts (323 lines) — Cost tracking
  • costHook.ts (22 lines) — React hook for cost display on exit

2. THE AGENTIC LOOP (query.ts)

Core Architecture:

The loop is an async generator: query() at line 219 delegates to queryLoop() at line 241, which runs a while(true) loop (line 307) and yields StreamEvent | Message | TombstoneMessage events.

Loop State (lines 204-217):

type State = {
  messages: Message[]
  toolUseContext: ToolUseContext
  autoCompactTracking: AutoCompactTrackingState | undefined
  maxOutputTokensRecoveryCount: number
  hasAttemptedReactiveCompact: boolean
  maxOutputTokensOverride: number | undefined
  pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
  stopHookActive: boolean | undefined
  turnCount: number
  transition: Continue | undefined  // Why the previous iteration continued
}

Each Iteration Does (in order):

  1. Skill discovery prefetch (line 331) — fires async while model streams
  2. Tool result budget (line 379) — applyToolResultBudget() limits per-message result sizes
  3. Snip compact (line 401) — feature-gated HISTORY_SNIP trims old messages
  4. Microcompact (line 414) — compresses tool results inline
  5. Context collapse (line 441) — feature-gated, projects collapsed view
  6. Auto-compact (line 454) — if above token threshold, summarizes conversation
  7. Blocking limit check (line 637) — if tokens exceed hard limit, stop
  8. API call with streaming (line 659) — deps.callModel() streams response
  9. Streaming tool execution (line 563) — StreamingToolExecutor starts tools AS blocks arrive
  10. Post-sampling hooks (line 1001)
  11. Stop decision (line 1062) — if no tool_use blocks, check stop hooks
  12. Token budget continuation (line 1308) — if budget not met, inject nudge and continue
  13. Tool execution (line 1380-1408) — runTools() or streamingToolExecutor.getRemainingResults()
  14. Attachment messages (line 1580) — memory, file changes, queued commands
  15. Max turns check (line 1705) — if exceeded, stop
  16. State update and continue (line 1715)

Stop Conditions:

  • No tool_use blocks in response → completed (line 1062)
  • API error → model_error (line 996)
  • User abort → aborted_streaming/aborted_tools (lines 1051, 1515)
  • Blocking limit → blocking_limit (line 646)
  • Max turns → max_turns (line 1711)
  • Stop hook → stop_hook_prevented (line 1279)
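The iteration order and stop conditions above can be pictured with a much-reduced sketch. It is an illustration under assumptions, not the code in query.ts: queryLoopSketch, CallModel, RunTool, and the simplified Message shape are hypothetical, and compaction, hooks, and error recovery are left out.

```typescript
// Hypothetical, simplified types; the real State and Message types are richer.
type Message = { role: "user" | "assistant" | "tool"; text: string; toolUses?: string[] };
type Terminal = { reason: "completed" | "max_turns" };

// Assumed interfaces for the model call and for tool execution.
type CallModel = (messages: Message[]) => Promise<Message>;
type RunTool = (name: string) => Promise<string>;

async function* queryLoopSketch(
  initial: Message[],
  callModel: CallModel,
  runTool: RunTool,
  maxTurns: number,
): AsyncGenerator<Message, Terminal> {
  let messages = [...initial];
  let turnCount = 0;
  while (true) {
    const assistant = await callModel(messages);
    yield assistant;

    // Stop decision: a response without tool_use blocks ends the loop.
    const toolUses = assistant.toolUses ?? [];
    if (toolUses.length === 0) return { reason: "completed" };

    // Tool execution: run each requested tool and yield its result.
    const results: Message[] = [];
    for (const name of toolUses) {
      const result: Message = { role: "tool", text: await runTool(name) };
      results.push(result);
      yield result;
    }

    // Max turns check, then state update and continue.
    turnCount += 1;
    if (turnCount >= maxTurns) return { reason: "max_turns" };
    messages = [...messages, assistant, ...results];
  }
}
```

Because the loop is a generator, a caller can render each yielded message as it arrives and read the terminal reason from the generator's return value.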

Retry/Recovery:

  • Model fallback (line 894): on FallbackTriggeredError, switch to fallback model
  • Reactive compact (line 1119): on prompt-too-long 413, try compact then retry
  • Max output tokens recovery (line 1223): inject "resume" message, retry up to limit
  • Escalated tokens (line 1199): if the response hits the 8K default output limit, retry with a 64K limit
  • Context collapse drain (line 1094): drain staged collapses before reactive compact
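One way to read the escalation and max-output-token recovery items together is as a small, bounded state machine. The sketch below is an assumption about how such logic could be arranged, not a transcription of query.ts: recoverFromMaxOutputTokens and RECOVERY_LIMIT are hypothetical names, and the limit value is a placeholder because the report does not state it.

```typescript
// "64K" per the report; the exact token count is assumed.
const ESCALATED_MAX_OUTPUT_TOKENS = 64000;
// Hypothetical placeholder: the report says "up to limit" without giving the number.
const RECOVERY_LIMIT = 3;

// Mirrors two fields of the loop State shown earlier.
type RecoveryState = {
  maxOutputTokensOverride: number | undefined;
  maxOutputTokensRecoveryCount: number;
};

type RecoveryAction = "retry_escalated" | "inject_resume" | "give_up";

function recoverFromMaxOutputTokens(
  state: RecoveryState,
): { action: RecoveryAction; next: RecoveryState } {
  // First hit at the default cap: retry once with the larger cap.
  if (state.maxOutputTokensOverride === undefined) {
    return {
      action: "retry_escalated",
      next: { ...state, maxOutputTokensOverride: ESCALATED_MAX_OUTPUT_TOKENS },
    };
  }
  // Already escalated: ask the model to resume, a bounded number of times.
  if (state.maxOutputTokensRecoveryCount < RECOVERY_LIMIT) {
    return {
      action: "inject_resume",
      next: { ...state, maxOutputTokensRecoveryCount: state.maxOutputTokensRecoveryCount + 1 },
    };
  }
  // Bound reached: stop retrying so the loop cannot spin forever.
  return { action: "give_up", next: state };
}
```

The counter is what keeps recovery from turning into an unbounded retry loop.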

3. TOOL SYSTEM

Tool Interface (Tool.ts, lines 362-695):

Every tool implements the Tool<Input, Output, Progress> interface:

  • name: string — unique identifier
  • inputSchema: Input — Zod schema for validation
  • call(args, context, canUseTool, parentMessage, onProgress) — execution
  • description(input, options) — dynamic prompt text
  • prompt(options) — tool prompt for system prompt
  • checkPermissions(input, context) — tool-specific permission logic
  • isReadOnly(input) — whether tool modifies state
  • isConcurrencySafe(input) — whether safe to run in parallel
  • isEnabled() — whether available in current environment
  • maxResultSizeChars — result size limit before disk persistence
  • mapToolResultToToolResultBlockParam(content, toolUseID) — convert to API format
  • validateInput(input, context) — pre-execution validation
  • toAutoClassifierInput(input) — compact representation for security classifier

Tool Building (Tool.ts, lines 757-792):

buildTool() applies defaults:

  • isEnabled: () => true
  • isConcurrencySafe: () => false (fail-closed)
  • isReadOnly: () => false (fail-closed)
  • checkPermissions: () => { behavior: 'allow', updatedInput }
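The fail-closed defaults can be illustrated with a reduced builder. This is a sketch under assumptions: ToolSketch, ToolDef, PermissionResult, and buildToolSketch are hypothetical stand-ins with far fewer members than the real Tool interface.

```typescript
// Hypothetical, reduced versions of the real types.
type PermissionResult<I> =
  | { behavior: "allow"; updatedInput: I }
  | { behavior: "deny"; message: string };

interface ToolSketch<I> {
  name: string;
  call(input: I): string;
  isEnabled(): boolean;
  isConcurrencySafe(input: I): boolean;
  isReadOnly(input: I): boolean;
  checkPermissions(input: I): PermissionResult<I>;
}

// A definition must supply name and call; everything else is optional.
type ToolDef<I> = Pick<ToolSketch<I>, "name" | "call"> & Partial<ToolSketch<I>>;

function buildToolSketch<I>(def: ToolDef<I>): ToolSketch<I> {
  return {
    isEnabled: () => true,
    // Fail-closed: a tool that says nothing is treated as unsafe to parallelize...
    isConcurrencySafe: () => false,
    // ...and as potentially state-modifying.
    isReadOnly: () => false,
    checkPermissions: (input: I): PermissionResult<I> => ({
      behavior: "allow",
      updatedInput: input,
    }),
    // Explicit members in the definition override the defaults above.
    ...def,
  };
}
```

With this shape, a tool author has to opt in to parallel execution and read-only status; forgetting to declare them costs performance, not safety.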

Complete Tool List (tools.ts, getAllBaseTools lines 193-250):

Core Tools:

  • AgentTool — spawns sub-agents (THE key tool)
  • BashTool — shell command execution
  • FileReadTool — read files
  • FileEditTool — edit files (search-and-replace)
  • FileWriteTool — write entire files
  • GlobTool — file pattern matching
  • GrepTool — content search (ripgrep-backed)
  • NotebookEditTool — Jupyter notebook editing
  • WebFetchTool — HTTP fetch
  • WebSearchTool — web search

Task/Plan Tools:

  • TaskStopTool — stop agent execution
  • TaskOutputTool — output from agent tasks
  • TodoWriteTool — write todo items
  • TaskCreateTool, TaskGetTool, TaskUpdateTool, TaskListTool — task management (v2)
  • EnterPlanModeTool, ExitPlanModeV2Tool — plan mode

Agent/Swarm Tools:

  • TeamCreateTool, TeamDeleteTool — multi-agent teams
  • SendMessageTool — inter-agent communication
  • ListPeersTool — list peer agents (UDS)

Other Tools:

  • AskUserQuestionTool — ask user for input
  • SkillTool — invoke registered skills
  • BriefTool — brief/summary generation
  • ConfigTool — configuration (ant-only)
  • TungstenTool — internal (ant-only)
  • LSPTool — Language Server Protocol
  • ListMcpResourcesTool, ReadMcpResourceTool — MCP resources
  • ToolSearchTool — search for deferred tools
  • EnterWorktreeTool, ExitWorktreeTool — git worktree isolation
  • SleepTool — wait for events (proactive mode)
  • CronCreate/Delete/ListTool — scheduled triggers
  • RemoteTriggerTool — remote triggers
  • MonitorTool — shell monitoring
  • PowerShellTool — Windows PowerShell
  • SyntheticOutputTool — structured output
  • VerifyPlanExecutionTool — verify plan execution
  • SnipTool — history snipping
  • WorkflowTool — workflow scripts
  • WebBrowserTool — full web browser
  • TerminalCaptureTool — terminal capture
  • OverflowTestTool, CtxInspectTool — debugging
  • REPLTool — REPL environment (ant-only)

Tool Registration:

assembleToolPool() (tools.ts line 345) merges built-in + MCP tools, sorted for prompt cache stability. MCP tools are filtered by deny rules. Built-in tools take precedence on name conflicts via uniqBy.
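A minimal sketch of that merge follows, assuming tool-level deny rules can be represented as a set of names. NamedTool and assembleToolPoolSketch are hypothetical; the real function works on full Tool objects and uses uniqBy rather than the manual de-duplication shown here.

```typescript
// Hypothetical reduced tool record.
type NamedTool = { name: string; source: "builtin" | "mcp" };

function assembleToolPoolSketch(
  builtIn: NamedTool[],
  mcp: NamedTool[],
  deniedNames: Set<string>,
): NamedTool[] {
  // Code-point comparison keeps the order identical across machines and locales.
  const byName = (a: NamedTool, b: NamedTool): number =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0;

  // Only MCP tools are filtered by deny rules in this sketch.
  const allowedMcp = mcp.filter((tool) => !deniedNames.has(tool.name));

  // Built-ins first, so they form a stable prefix of the tool list.
  const ordered = [...builtIn].sort(byName).concat([...allowedMcp].sort(byName));

  // First occurrence wins, which gives built-ins precedence on name conflicts.
  const seen = new Set<string>();
  const pool: NamedTool[] = [];
  for (const tool of ordered) {
    if (seen.has(tool.name)) continue;
    seen.add(tool.name);
    pool.push(tool);
  }
  return pool;
}
```

A deterministic order matters because the tool list is part of the prompt: if the order changes between requests, a cached prefix no longer matches.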

Tool Orchestration (services/tools/toolOrchestration.ts):

runTools() partitions tool calls into:

  • Concurrent batches — if all tools in batch are isConcurrencySafe, run in parallel (up to 10, configurable via CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY)
  • Serial batches — tools that are not concurrency-safe (for example, tools that modify state) run one at a time
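The partitioning can be sketched as grouping consecutive concurrency-safe calls while preserving the model's original order. Whether the real runTools() groups exactly this way is an assumption; ToolCall, Batch, and partitionToolCalls are hypothetical names, and the concurrency cap is not modeled.

```typescript
// Hypothetical reduced shapes; concurrencySafe stands in for tool.isConcurrencySafe(input).
type ToolCall = { id: string; concurrencySafe: boolean };
type Batch = { concurrent: boolean; calls: ToolCall[] };

function partitionToolCalls(calls: ToolCall[]): Batch[] {
  const batches: Batch[] = [];
  for (const call of calls) {
    const last = batches[batches.length - 1];
    if (call.concurrencySafe && last !== undefined && last.concurrent) {
      // Extend the current run of concurrency-safe calls.
      last.calls.push(call);
    } else {
      // An unsafe call always gets its own serial batch; a safe call after
      // an unsafe one starts a new concurrent batch.
      batches.push({ concurrent: call.concurrencySafe, calls: [call] });
    }
  }
  return batches;
}
```

Under this arrangement two reads followed by an edit and another read produce three batches: a parallel pair, a serial edit, and a final read that can see the edit's effects.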

StreamingToolExecutor (services/tools/StreamingToolExecutor.ts) starts tool execution AS tool_use blocks arrive during streaming, not waiting for the full response.


4. CONTEXT/MEMORY MANAGEMENT

Auto-Compact (services/compact/autoCompact.ts):

  • Threshold: effectiveContextWindow - AUTOCOMPACT_BUFFER_TOKENS
  • shouldAutoCompact() checks token count via tokenCountWithEstimation()
  • Circuit breaker: stops after 3 consecutive failures (MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES)
  • Calls compactConversation() which forks a sub-agent to summarize
  • Also tries trySessionMemoryCompaction() first (lighter)
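The threshold check and the circuit breaker combine roughly as follows. This is a sketch under assumptions: the function names and the AutoCompactTracking shape are hypothetical, the buffer is passed in as a parameter because the report does not give the value of AUTOCOMPACT_BUFFER_TOKENS, and a strict "greater than" comparison is assumed.

```typescript
// Value stated in the report.
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

// Hypothetical reduced tracking state.
type AutoCompactTracking = { consecutiveFailures: number };

function shouldAutoCompactSketch(
  tokenCount: number,
  effectiveContextWindow: number,
  bufferTokens: number,
  tracking: AutoCompactTracking,
): boolean {
  // Circuit breaker: after repeated failures, stop attempting to compact.
  if (tracking.consecutiveFailures >= MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) return false;
  // Threshold: the context window minus a safety buffer.
  return tokenCount > effectiveContextWindow - bufferTokens;
}

function recordCompactOutcome(
  tracking: AutoCompactTracking,
  succeeded: boolean,
): AutoCompactTracking {
  // A success resets the breaker; a failure moves it one step closer to open.
  return { consecutiveFailures: succeeded ? 0 : tracking.consecutiveFailures + 1 };
}
```

The buffer leaves headroom for the next response, so compaction starts before the window is actually full.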

Multi-Layer Compaction:

  1. Snip compact — removes old messages from history (HISTORY_SNIP feature)
  2. Microcompact — compresses individual tool results inline
  3. Context collapse — progressive collapse of old context (CONTEXT_COLLAPSE feature)
  4. Auto-compact — full conversation summarization (when above threshold)
  5. Reactive compact — emergency compact on API 413 error (prompt-too-long)
  6. Session memory compact — session memory aware compaction

CLAUDE.md Memory System (utils/claudemd.ts):

Four-tier memory hierarchy (lines 1-26):

  1. Managed memory: /etc/claude-code/CLAUDE.md (system-wide)
  2. User memory: ~/.claude/CLAUDE.md (user-global)
  3. Project memory: CLAUDE.md, .claude/CLAUDE.md, .claude/rules/*.md (per-project)
  4. Local memory: CLAUDE.local.md (per-project, gitignored)

Discovery: Traverses from CWD up to root. Files closer to CWD have higher priority. Supports @include directives for file inclusion.
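The upward traversal can be sketched as generating candidate paths for every directory between the root and the CWD. Several things here are assumptions: memoryCandidates is a hypothetical name, paths are treated as POSIX-style absolute paths, .claude/rules/*.md and @include handling are omitted, and "higher priority" is modeled by listing files closer to the CWD later, on the guess that later content takes precedence.

```typescript
// Returns candidate memory file paths ordered from the filesystem root down to
// cwd, so entries nearer cwd come last. Existence checks are left to the caller.
function memoryCandidates(cwd: string): string[] {
  const parts = cwd.split("/").filter((part) => part.length > 0);
  const candidates: string[] = [];
  for (let depth = 0; depth <= parts.length; depth++) {
    // depth 0 is the root directory; its prefix is the empty string.
    const prefix = depth === 0 ? "" : "/" + parts.slice(0, depth).join("/");
    candidates.push(
      `${prefix}/CLAUDE.md`,
      `${prefix}/.claude/CLAUDE.md`,
      `${prefix}/CLAUDE.local.md`,
    );
  }
  return candidates;
}
```

For a CWD of /home/u/proj this yields twelve candidates, starting with /CLAUDE.md and ending with /home/u/proj/CLAUDE.local.md.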

Context Assembly (context.ts):

  • getUserContext() — loads CLAUDE.md content + current date
  • getSystemContext() — git status snapshot (branch, last 5 commits, status)
  • Both are memoized per session

5. PERMISSION/SAFETY SYSTEM

Permission Modes (utils/permissions/PermissionMode.ts):

  • default — ask for write operations
  • plan — model proposes, user approves
  • auto — AI classifier decides (TRANSCRIPT_CLASSIFIER feature)
  • bypassPermissions — allow everything
  • acceptEdits — allow file edits without asking
  • bubble — bubble permission prompts to parent agent

Permission Check Flow (utils/permissions/permissions.ts, checkRuleBasedPermissions line 1071):

  1. 1a. Deny rules — check if tool is blanket-denied
  2. 1b. Ask rules — check if tool has explicit ask rule
  3. 1c. Tool-specific — call tool.checkPermissions() (e.g., bash subcommand matching)
  4. 1d. Tool deny — tool implementation denied
  5. 1f. Content-specific ask — tool returned ask with rule pattern
  6. 1g. Safety checks — protected paths (.git, .claude, shell configs)
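The ordering of these checks matters more than any single check, and a reduced sketch makes the precedence explicit. Everything below is an assumption layered on the list above: Decision, RuleSet, and checkRuleBasedSketch are hypothetical, rules are matched by tool name only, protected paths are detected with a plain substring test, and a safety hit is assumed to produce an "ask".

```typescript
// Hypothetical reduced types.
type Decision = { behavior: "allow" | "ask" | "deny"; reason: string };
type RuleSet = { deny: Set<string>; ask: Set<string>; protectedPaths: string[] };

function checkRuleBasedSketch(
  toolName: string,
  input: string,
  rules: RuleSet,
  toolCheck: (input: string) => Decision, // stands in for tool.checkPermissions()
): Decision | null {
  // 1a. Blanket deny rules win over everything else.
  if (rules.deny.has(toolName)) return { behavior: "deny", reason: "deny rule" };
  // 1b. Explicit ask rules come next.
  if (rules.ask.has(toolName)) return { behavior: "ask", reason: "ask rule" };
  // 1c. Tool-specific logic.
  const toolDecision = toolCheck(input);
  // 1d. The tool implementation itself denied the call.
  if (toolDecision.behavior === "deny") return toolDecision;
  // 1f. The tool asked for confirmation on this specific content.
  if (toolDecision.behavior === "ask") return toolDecision;
  // 1g. Safety checks on protected paths apply even if the tool allowed the call.
  if (rules.protectedPaths.some((path) => input.includes(path))) {
    return { behavior: "ask", reason: "protected path" };
  }
  // Nothing decisive: the caller falls through to mode-based handling (assumed).
  return null;
}
```

Placing deny rules first means no later step can re-enable a tool that a rule has blocked.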

Security Classifier (utils/permissions/yoloClassifier.ts):

  • Used in auto mode
  • Calls a separate Claude model (via sideQuery()) with the conversation transcript
  • Uses compressed toAutoClassifierInput() from each tool
  • Has its own system prompt (auto_mode_system_prompt.txt)
  • Returns allow/deny decision with reasoning
  • Falls back to prompting after denial tracking threshold

Denial Tracking (utils/permissions/denialTracking.ts):

  • Tracks consecutive denials per tool
  • After threshold, falls back to user prompting
  • Prevents infinite deny loops
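Consecutive-denial tracking needs little more than a per-tool counter that resets on success. The sketch below assumes that design; DenialState and the three function names are hypothetical, and the threshold is a parameter because the report does not state its value.

```typescript
// Hypothetical state: consecutive denial count per tool name.
type DenialState = Map<string, number>;

function recordDenial(state: DenialState, tool: string): number {
  const count = (state.get(tool) ?? 0) + 1;
  state.set(tool, count);
  return count;
}

function recordAllowed(state: DenialState, tool: string): void {
  // An allowed call breaks the streak, so the counter starts over.
  state.delete(tool);
}

function shouldPromptUser(state: DenialState, tool: string, threshold: number): boolean {
  // Past the threshold, stop trusting automatic denials and ask the human.
  return (state.get(tool) ?? 0) >= threshold;
}
```

Handing control back to the user after repeated denials avoids a situation where the model keeps retrying an action that the classifier keeps rejecting.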

6. SYSTEM PROMPT

Construction (constants/prompts.ts, getSystemPrompt line 444):

Returns a string[] (array of sections), assembled as:

Static (cacheable) sections:

  1. Intro section — "You are an interactive agent..."
  2. System section — tool behavior, permissions, tags
  3. Doing tasks section — coding style, testing, git practices
  4. Actions section — tool usage guidance
  5. Using your tools section — tool-specific instructions
  6. Tone and style section
  7. Output efficiency section
  8. SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker (separates cacheable from dynamic)

Dynamic (per-session) sections (via registry):

  9. Session guidance — based on enabled tools
  10. Memory — loaded from CLAUDE.md hierarchy
  11. Environment info — OS, model, CWD, git info, knowledge cutoff
  12. Language preference
  13. Output style
  14. MCP server instructions
  15. Scratchpad instructions
  16. Function result clearing
  17. Tool result summarization
  18. Numeric length anchors (ant-only)
  19. Token budget instructions (feature-gated)

Dynamic Boundary:

SYSTEM_PROMPT_DYNAMIC_BOUNDARY (line 114) separates globally-cacheable content from user-specific content. Everything before can use scope: 'global' for cross-user caching.
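Since the prompt is a string[] with a marker element, splitting it into a cacheable prefix and a dynamic suffix is a simple index lookup. The sketch below is an assumption about how a consumer might do that: splitAtBoundary is a hypothetical helper, and the marker's string value is a placeholder because the report does not show it.

```typescript
// Placeholder value; the real marker string is not given in the report.
const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "__SYSTEM_PROMPT_DYNAMIC_BOUNDARY__";

function splitAtBoundary(sections: string[]): { cacheable: string[]; dynamic: string[] } {
  const index = sections.indexOf(SYSTEM_PROMPT_DYNAMIC_BOUNDARY);
  if (index === -1) {
    // Conservative assumption: without a marker, treat nothing as globally cacheable.
    return { cacheable: [], dynamic: [...sections] };
  }
  return {
    cacheable: sections.slice(0, index), // identical for every user
    dynamic: sections.slice(index + 1),  // memory, environment, and so on
  };
}
```

Keeping all user-specific text after the marker is what allows the prefix to be shared across users.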

System Prompt Sections Registry (constants/systemPromptSections.ts):

Uses systemPromptSection() and DANGEROUS_uncachedSystemPromptSection() to declare sections with caching behavior. resolveSystemPromptSections() resolves all async sections.


7. SUB-AGENT/TASK SYSTEM

AgentTool (tools/AgentTool/AgentTool.tsx):

The main sub-agent spawning tool. Input schema (line 82):

  • description — 3-5 word task summary
  • prompt — the task to perform
  • subagent_type — optional specialized agent type
  • model — optional model override (sonnet/opus/haiku)
  • run_in_background — async execution
  • isolation — "worktree" or "remote" for isolation
  • cwd — working directory override
  • name — addressable name for SendMessage

runAgent (tools/AgentTool/runAgent.ts, line 248):

async function* runAgent() — another async generator that:

  1. Creates unique agentId
  2. Resolves agent-specific model
  3. Initializes agent MCP servers (if defined in frontmatter)
  4. Creates agent-specific permission context
  5. Calls createSubagentContext() to create isolated ToolUseContext
  6. Builds agent system prompt with getSystemPrompt() + env details
  7. Calls query() — THE SAME QUERY LOOP as the main thread
  8. Records sidechain transcript for resume

Agent Isolation:

  • Each agent gets its own ToolUseContext with:
    • Cloned readFileState (file state cache)
    • Its own abortController
    • Separate permission mode
    • Can't access parent's tool JSX
  • Worktree mode: creates git worktree for filesystem isolation
  • Remote mode: launches on remote CCR environment
  • Fork mode: shares parent's message context for prompt cache hits

Built-in Agent Types (tools/AgentTool/built-in/):

  • generalPurposeAgent — default agent
  • exploreAgent — read-only exploration
  • And custom agents loaded from .claude/agents/ directory

8. COST TRACKING

Architecture:

  • State in bootstrap/state.js — global mutable state: totalCostUSD, modelUsage, counters
  • cost-tracker.ts — higher-level functions for formatting and persisting

addToTotalSessionCost (cost-tracker.ts, line 278):

  • Takes cost, usage (API response), model
  • Accumulates per-model: inputTokens, outputTokens, cacheRead, cacheCreation, webSearchRequests
  • Calculates cost via calculateUSDCost() (utils/modelCost.ts)
  • Also tracks advisor model usage separately
  • Feeds OpenTelemetry counters via getCostCounter()?.add()
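The per-model accumulation can be sketched as a running total plus a map keyed by model name. The shapes below are hypothetical reductions of the real state (web search requests, advisor usage, and OpenTelemetry counters are left out), and the cost is taken as an argument instead of being computed by calculateUSDCost().

```typescript
// Hypothetical reduced shapes.
type Usage = {
  inputTokens: number;
  outputTokens: number;
  cacheRead: number;
  cacheCreation: number;
};
type CostState = { totalCostUSD: number; modelUsage: Map<string, Usage> };

function addToTotalSessionCostSketch(
  state: CostState,
  cost: number,
  usage: Usage,
  model: string,
): void {
  const previous = state.modelUsage.get(model) ?? {
    inputTokens: 0,
    outputTokens: 0,
    cacheRead: 0,
    cacheCreation: 0,
  };
  // Accumulate token counts per model so a breakdown can be printed later.
  state.modelUsage.set(model, {
    inputTokens: previous.inputTokens + usage.inputTokens,
    outputTokens: previous.outputTokens + usage.outputTokens,
    cacheRead: previous.cacheRead + usage.cacheRead,
    cacheCreation: previous.cacheCreation + usage.cacheCreation,
  });
  // The session total is tracked separately from the per-model breakdown.
  state.totalCostUSD += cost;
}
```

Tracking cache reads and cache creation separately from plain input tokens makes it possible to see how much of a session was served from the prompt cache.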

Persistence:

  • saveCurrentSessionCosts() (line 143) — saves to project config on process exit
  • restoreCostStateForSession() (line 130) — restores on session resume
  • formatTotalCost() (line 228) — produces per-model breakdown string

costHook.ts:

A simple React hook that prints cost summary and saves to config on process exit.


9. UNIQUE/NOVEL PATTERNS

1. Streaming Tool Execution

StreamingToolExecutor starts executing tools AS their blocks arrive during model streaming, not waiting for the complete response. This overlaps tool execution with model output generation.
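The overlap can be demonstrated with a few lines that start each tool as soon as its block appears and only await the results after the stream ends. This is a sketch of the general technique, not of StreamingToolExecutor itself: Block and streamAndExecute are hypothetical, and ordering, cancellation, and concurrency-safety checks are ignored.

```typescript
// Hypothetical reduced block type.
type Block = { type: "text"; text: string } | { type: "tool_use"; id: string };

async function streamAndExecute(
  stream: AsyncIterable<Block>,
  execute: (id: string) => Promise<string>,
): Promise<string[]> {
  const pending: Promise<string>[] = [];
  for await (const block of stream) {
    if (block.type === "tool_use") {
      // Start the tool immediately and keep reading the stream; do not await here.
      pending.push(execute(block.id));
    }
  }
  // The stream is finished; collect whatever has not completed yet.
  return Promise.all(pending);
}
```

With I/O-bound tools, much of the tool latency is hidden behind the time the model spends generating the rest of its response.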

2. Prompt Cache Stability Engineering

Tools are sorted alphabetically for cache stability. Built-in tools form a contiguous prefix. SYSTEM_PROMPT_DYNAMIC_BOUNDARY separates globally-cacheable from user-specific content. Fork subagents inherit parent's renderedSystemPrompt to avoid cache busting.

3. Multi-Layer Context Management

Five distinct compaction strategies (snip, microcompact, collapse, auto-compact, reactive compact) working in concert, each with different trigger points and tradeoffs.

4. Feature Gate Architecture

Heavy use of feature('FLAG_NAME') from bun:bundle for dead code elimination at build time. Feature-gated code is completely removed from external builds. Gated modules are loaded with conditional require() calls inside feature blocks.

5. Tool Result Budget (utils/toolResultStorage.ts)

Per-message aggregate budget on tool result sizes. Large results are persisted to disk and replaced with a preview + file path. The maxResultSizeChars per tool controls thresholds.
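The persist-and-preview step for a single oversized result might look like the sketch below. It covers only the per-tool threshold, not the per-message aggregate budget, and all names are hypothetical: limitToolResult, BudgetedResult, the preview length parameter, and the persist callback, which is injected so that the example does not touch the filesystem.

```typescript
// Hypothetical result shape.
type BudgetedResult = { content: string; persistedPath: string | undefined };

function limitToolResult(
  content: string,
  maxResultSizeChars: number,
  previewChars: number,
  persist: (fullContent: string) => string, // returns the path where content was stored
): BudgetedResult {
  // Small results pass through untouched.
  if (content.length <= maxResultSizeChars) {
    return { content, persistedPath: undefined };
  }
  // Large results are stored out of band and replaced by a short preview.
  const path = persist(content);
  const preview = content.slice(0, previewChars);
  return {
    content: `${preview}\n[truncated: ${content.length} chars total, full output saved to ${path}]`,
    persistedPath: path,
  };
}
```

Because the replacement text includes the file path, the model can still read the full output later with a file-reading tool if it turns out to be needed.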

6. Denial Tracking with Fallback

The permission system tracks consecutive denials and falls back to interactive prompting after a threshold, preventing infinite deny loops in auto mode.

7. Side Query Architecture

sideQuery() (utils/sideQuery.ts) forks lightweight model calls for classification, summarization, and memory retrieval WITHOUT blocking the main loop. Used by the YOLO classifier, compact, and skill discovery.

8. Agent Memory Prefetch

startRelevantMemoryPrefetch() fires at loop entry and is polled each iteration. Memory discovery runs in background while tools execute.

9. Tool Use Summary Generation

After each tool batch, fires a Haiku call to generate a mobile-friendly summary (async, resolved during next model call).

10. Attachment System

File changes, memory files, MCP resources, queued commands, and skill discoveries are injected as "attachment" messages between turns — invisible to the user but visible to the model.


10. COMPARISON TO HERMES — ACTIONABLE IMPROVEMENTS

1. STREAMING TOOL EXECUTION

Claude Code starts executing tool calls AS they stream in. Hermes should implement this — it can save seconds per turn when tools are I/O bound.

2. TOOL CONCURRENCY WITH PARTITIONING

Claude Code partitions tool calls into concurrent-safe and serial batches based on isConcurrencySafe(). Read-only tools run in parallel (up to 10). Hermes should tag tools as read-only and batch them.

3. MULTI-LAYER COMPACTION STRATEGY

Instead of a single compact, implement layered:

  • Microcompact (truncate large tool results inline)
  • Auto-compact (summarize when above threshold)
  • Reactive compact (on API 413 errors)

Layering these uses the context window more efficiently than a single compaction pass.

4. CLAUDE.MD HIERARCHY

The 4-tier memory system (system > user > project > local) with directory traversal and @include support is much more flexible than a flat memory file. Hermes should adopt the hierarchical discovery.

5. TOOL RESULT BUDGET

Large tool results being persisted to disk and replaced with previews prevents context pollution. This is critical for long sessions.

6. PROMPT CACHE STABILITY

Sort tools alphabetically, separate cacheable from dynamic prompt sections, and inherit parent prompts for sub-agents. This keeps prompt cache hit rates high, which reduces API costs.

7. CIRCUIT BREAKERS

Auto-compact has a circuit breaker (3 failures → stop). Max output tokens recovery has a limit. Hermes should implement similar guards against infinite retry loops.

8. STOP HOOKS

The stop hook system (query/stopHooks.ts) allows custom logic to decide whether to continue after model stops. This enables quality gates.

9. TOKEN BUDGET CONTINUATION

When user specifies "+500k" or "spend 2M tokens", the system automatically continues the model with nudge messages until the budget is met. Novel UX feature.

10. DENY RULE ARCHITECTURE

The layered permission system with deny/ask/allow rules from multiple sources (CLI, settings, session, managed) with pattern matching (Bash(git *), etc.) is much more granular than simple tool-level allow/deny.
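To show what pattern-level rules add over tool-level ones, here is a sketch of matching a rule such as Bash(git *) against a tool call. The rule syntax is taken from the example in the text, but the matching semantics are an assumption: parseRule, globToRegExp, and ruleMatches are hypothetical, and "*" is treated as a simple wildcard over the whole input string.

```typescript
// Hypothetical parsed form of "Tool" or "Tool(pattern)".
type ParsedRule = { tool: string; pattern: string | undefined };

function parseRule(rule: string): ParsedRule | null {
  const match = /^(\w+)(?:\((.*)\))?$/.exec(rule);
  if (match === null) return null;
  const tool = match[1];
  if (tool === undefined) return null;
  return { tool, pattern: match[2] };
}

function globToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters except "*", then turn "*" into ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function ruleMatches(rule: string, tool: string, input: string): boolean {
  const parsed = parseRule(rule);
  if (parsed === null || parsed.tool !== tool) return false;
  // A rule without a pattern applies to every call of that tool.
  if (parsed.pattern === undefined) return true;
  return globToRegExp(parsed.pattern).test(input);
}
```

With rules like this, a user can allow git commands through the shell tool while still being prompted for every other shell command.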