[EPIC-999] Phase I — The Mirror: formal spec extraction artifacts

- module_inventory.json: 679 Python files, 298k lines, 232k SLOC - core_analysis.json: deep AST parse of 9 core modules - SPEC.md: high-level architecture, module specs, coupling risks, Phase II prep Authored-by: Ezra <ezra@hermes.vps>
2026-04-05 23:27:29 +00:00
parent 77a2aad771
commit 5f1cdfc9e4
4 changed files with 47291 additions and 0 deletions
--- a/docs/ouroboros/artifacts/core_analysis.json
+++ b/docs/ouroboros/artifacts/core_analysis.json
--- a/docs/ouroboros/artifacts/import_graph.json
+++ b/docs/ouroboros/artifacts/import_graph.json
--- a/docs/ouroboros/artifacts/module_inventory.json
+++ b/docs/ouroboros/artifacts/module_inventory.json
--- a/docs/ouroboros/specs/SPEC.md
+++ b/docs/ouroboros/specs/SPEC.md
@@ -0,0 +1,263 @@
+# Hermes Ω Specification Draft (Ouroboros Phase I)
+
+> Auto-generated by Ezra as part of EPIC-999. This document is a living artifact.
+
+## Scope
+This specification covers the core runtime of Hermes agent v0.7.x as found in the `hermes-agent` codebase.
+
+## High-Level Architecture
+
+```
+User Message
+    ↓
+Gateway (gateway/run.py) — platform adapter (Telegram, Discord, CLI, etc.)
+    ↓
+HermesCLI (cli.py) or AIAgent.chat() (run_agent.py)
+    ↓
+ModelTools (model_tools.py) — tool discovery, schema assembly, dispatch
+    ↓
+Tool Registry (tools/registry.py) — handler lookup, availability checks
+    ↓
+Individual Tool Implementations (tools/*.py)
+    ↓
+Results returned up the stack
+```
+
+## Module Specifications
+
+### `run_agent.py`
+**Lines of Code:** 8948
+
+**Classes:**
+- `_SafeWriter`
+  - *Transparent stdio wrapper that catches OSError/ValueError from broken pipes.*
+  - `__init__(self, inner)`
+  - `write(self, data)`
+  - `flush(self)`
+  - `fileno(self)`
+  - `isatty(self)`
+  - ... and 1 more methods
+- `IterationBudget`
+  - *Thread-safe iteration counter for an agent.*
+  - `__init__(self, max_total)`
+  - `consume(self)`
+  - `refund(self)`
+  - `used(self)`
+  - `remaining(self)`
+- `AIAgent`
+  - *AI Agent with tool calling capabilities.*
+  - `base_url(self)`
+  - `base_url(self, value)`
+  - `__init__(self, base_url, api_key, provider, api_mode, acp_command, acp_args, command, args, model, max_iterations, tool_delay, enabled_toolsets, disabled_toolsets, save_trajectories, verbose_logging, quiet_mode, ephemeral_system_prompt, log_prefix_chars, log_prefix, providers_allowed, providers_ignored, providers_order, provider_sort, provider_require_parameters, provider_data_collection, session_id, tool_progress_callback, tool_start_callback, tool_complete_callback, thinking_callback, reasoning_callback, clarify_callback, step_callback, stream_delta_callback, tool_gen_callback, status_callback, max_tokens, reasoning_config, prefill_messages, platform, skip_context_files, skip_memory, session_db, iteration_budget, fallback_model, credential_pool, checkpoints_enabled, checkpoint_max_snapshots, pass_session_id, persist_session)`
+  - `reset_session_state(self)`
+  - `_safe_print(self)`
+  - ... and 100 more methods
+
+**Top-Level Functions:**
+- `_install_safe_stdio()`
+- `_is_destructive_command(cmd)`
+- `_should_parallelize_tool_batch(tool_calls)`
+- `_extract_parallel_scope_path(tool_name, function_args)`
+- `_paths_overlap(left, right)`
+- `_sanitize_surrogates(text)`
+- `_sanitize_messages_surrogates(messages)`
+- `_strip_budget_warnings_from_history(messages)`
+- `main(query, model, api_key, base_url, max_turns, enabled_toolsets, disabled_toolsets, list_tools, save_trajectories, save_sample, verbose, log_prefix_chars)`
+
+**Inferred Side Effects & Invariants:**
+- Persists state to SQLite database.
+- Performs file I/O.
+- Makes HTTP network calls.
+- Uses global mutable state (risk factor).
+
+### `model_tools.py`
+**Lines of Code:** 466
+
+**Top-Level Functions:**
+- `_get_tool_loop()`
+- `_get_worker_loop()`
+- `_run_async(coro)`
+- `_discover_tools()`
+- `get_tool_definitions(enabled_toolsets, disabled_toolsets, quiet_mode)`
+- `handle_function_call(function_name, function_args, task_id, user_task, enabled_tools)`
+- `get_all_tool_names()`
+- `get_toolset_for_tool(tool_name)`
+- `get_available_toolsets()`
+- `check_toolset_requirements()`
+- ... and 1 more functions
+
+**Inferred Side Effects & Invariants:**
+- Uses global mutable state (risk factor).
+- Primarily pure Python logic / orchestration.
+
+### `cli.py`
+**Lines of Code:** 8280
+
+**Classes:**
+- `ChatConsole`
+  - *Rich Console adapter for prompt_toolkit's patch_stdout context.*
+  - `__init__(self)`
+  - `print(self)`
+- `HermesCLI`
+  - *Interactive CLI for the Hermes Agent.*
+  - `__init__(self, model, toolsets, provider, api_key, base_url, max_turns, verbose, compact, resume, checkpoints, pass_session_id)`
+  - `_invalidate(self, min_interval)`
+  - `_status_bar_context_style(self, percent_used)`
+  - `_build_context_bar(self, percent_used, width)`
+  - `_get_status_bar_snapshot(self)`
+  - ... and 106 more methods
+
+**Top-Level Functions:**
+- `_load_prefill_messages(file_path)`
+- `_parse_reasoning_config(effort)`
+- `load_cli_config()`
+- `_run_cleanup()`
+- `_git_repo_root()`
+- `_path_is_within_root(path, root)`
+- `_setup_worktree(repo_root)`
+- `_cleanup_worktree(info)`
+- `_prune_stale_worktrees(repo_root, max_age_hours)`
+- `_accent_hex()`
+- ... and 9 more functions
+
+**Inferred Side Effects & Invariants:**
+- Persists state to SQLite database.
+- Performs file I/O.
+- Spawns subprocesses / shell commands.
+- Uses global mutable state (risk factor).
+
+### `tools/registry.py`
+**Lines of Code:** 275
+
+**Classes:**
+- `ToolEntry`
+  - *Metadata for a single registered tool.*
+  - `__init__(self, name, toolset, schema, handler, check_fn, requires_env, is_async, description, emoji)`
+- `ToolRegistry`
+  - *Singleton registry that collects tool schemas + handlers from tool files.*
+  - `__init__(self)`
+  - `register(self, name, toolset, schema, handler, check_fn, requires_env, is_async, description, emoji)`
+  - `deregister(self, name)`
+  - `get_definitions(self, tool_names, quiet)`
+  - `dispatch(self, name, args)`
+  - ... and 10 more methods
+
+**Inferred Side Effects & Invariants:**
+- Primarily pure Python logic / orchestration.
+
+### `gateway/run.py`
+**Lines of Code:** 6657
+
+**Classes:**
+- `GatewayRunner`
+  - *Main gateway controller.*
+  - `__init__(self, config)`
+  - `_has_setup_skill(self)`
+  - `_load_voice_modes(self)`
+  - `_save_voice_modes(self)`
+  - `_set_adapter_auto_tts_disabled(self, adapter, chat_id, disabled)`
+  - ... and 78 more methods
+
+**Top-Level Functions:**
+- `_ensure_ssl_certs()`
+- `_normalize_whatsapp_identifier(value)`
+- `_expand_whatsapp_auth_aliases(identifier)`
+- `_resolve_runtime_agent_kwargs()`
+- `_build_media_placeholder(event)`
+- `_dequeue_pending_text(adapter, session_key)`
+- `_check_unavailable_skill(command_name)`
+- `_platform_config_key(platform)`
+- `_load_gateway_config()`
+- `_resolve_gateway_model(config)`
+- ... and 4 more functions
+
+**Inferred Side Effects & Invariants:**
+- Persists state to SQLite database.
+- Performs file I/O.
+- Spawns subprocesses / shell commands.
+- Contains async code paths.
+- Uses global mutable state (risk factor).
+
+### `hermes_state.py`
+**Lines of Code:** 1270
+
+**Classes:**
+- `SessionDB`
+  - *SQLite-backed session storage with FTS5 search.*
+  - `__init__(self, db_path)`
+  - `_execute_write(self, fn)`
+  - `_try_wal_checkpoint(self)`
+  - `close(self)`
+  - `_init_schema(self)`
+  - ... and 29 more methods
+
+**Inferred Side Effects & Invariants:**
+- Persists state to SQLite database.
+
+### `agent/context_compressor.py`
+**Lines of Code:** 676
+
+**Classes:**
+- `ContextCompressor`
+  - *Compresses conversation context when approaching the model's context limit.*
+  - `__init__(self, model, threshold_percent, protect_first_n, protect_last_n, summary_target_ratio, quiet_mode, summary_model_override, base_url, api_key, config_context_length, provider)`
+  - `update_from_response(self, usage)`
+  - `should_compress(self, prompt_tokens)`
+  - `should_compress_preflight(self, messages)`
+  - `get_status(self)`
+  - ... and 11 more methods
+
+**Inferred Side Effects & Invariants:**
+- Primarily pure Python logic / orchestration.
+
+### `agent/prompt_caching.py`
+**Lines of Code:** 72
+
+**Top-Level Functions:**
+- `_apply_cache_marker(msg, cache_marker, native_anthropic)`
+- `apply_anthropic_cache_control(api_messages, cache_ttl, native_anthropic)`
+
+**Inferred Side Effects & Invariants:**
+- Primarily pure Python logic / orchestration.
+
+### `agent/skill_commands.py`
+**Lines of Code:** 297
+
+**Top-Level Functions:**
+- `build_plan_path(user_instruction)`
+- `_load_skill_payload(skill_identifier, task_id)`
+- `_build_skill_message(loaded_skill, skill_dir, activation_note, user_instruction, runtime_note)`
+- `scan_skill_commands()`
+- `get_skill_commands()`
+- `build_skill_invocation_message(cmd_key, user_instruction, task_id, runtime_note)`
+- `build_preloaded_skills_prompt(skill_identifiers, task_id)`
+
+**Inferred Side Effects & Invariants:**
+- Uses global mutable state (risk factor).
+- Primarily pure Python logic / orchestration.
+
+## Cross-Module Dependencies
+
+Key data flow:
+1. `run_agent.py` defines `AIAgent` — the canonical conversation loop.
+2. `model_tools.py` assembles tool schemas and dispatches function calls.
+3. `tools/registry.py` maintains the central registry; all tool files import it.
+4. `gateway/run.py` adapts platform events into `AIAgent.run_conversation()` calls.
+5. `cli.py` (`HermesCLI`) provides the interactive shell and slash-command routing.
+
+## Known Coupling Risks
+
+- `run_agent.py` is ~7k SLOC and contains the core loop, todo/memory interception, context compression, and trajectory saving. High blast radius.
+- `cli.py` is ~6.5k SLOC and combines UI (Rich/prompt_toolkit), config loading, and command dispatch. Tightly coupled to display state.
+- `model_tools.py` holds a process-global `_last_resolved_tool_names`. Subagent execution saves/restores this global.
+- `tools/registry.py` is imported by ALL tool files; schema generation happens at import time.
+
+## Next Actions (Phase II Prep)
+
+1. Decompose `AIAgent` into: `ConversationLoop`, `ContextManager`, `ToolDispatcher`, `MemoryInterceptor`.
+2. Extract CLI display logic from command dispatch.
+3. Define strict interfaces between gateway → agent → tools.
+4. Write property-based tests for the conversation loop invariant: *given the same message history and tool results, the agent must produce deterministic tool_call ordering*.
+
+---
+Generated: 2026-04-05 by Ezra (Phase I)