Compare commits
3 Commits
main
...
epic-999-p
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
f9839ad278 | ||
|
|
c266661bff | ||
|
|
5f1cdfc9e4 |
4657
docs/ouroboros/artifacts/call_graph.json
Normal file
4657
docs/ouroboros/artifacts/call_graph.json
Normal file
File diff suppressed because it is too large
Load Diff
4291
docs/ouroboros/artifacts/core_analysis.json
Normal file
4291
docs/ouroboros/artifacts/core_analysis.json
Normal file
File diff suppressed because it is too large
Load Diff
39340
docs/ouroboros/artifacts/import_graph.json
Normal file
39340
docs/ouroboros/artifacts/import_graph.json
Normal file
File diff suppressed because it is too large
Load Diff
3397
docs/ouroboros/artifacts/module_inventory.json
Normal file
3397
docs/ouroboros/artifacts/module_inventory.json
Normal file
File diff suppressed because it is too large
Load Diff
74
docs/ouroboros/specs/AIAgent_DECOMPOSITION.md
Normal file
74
docs/ouroboros/specs/AIAgent_DECOMPOSITION.md
Normal file
@@ -0,0 +1,74 @@
|
|||||||
|
# AIAgent Decomposition Plan (EPIC-999 Phase II Prep)
|
||||||
|
|
||||||
|
## Current State
|
||||||
|
`run_agent.py` contains `AIAgent` — a ~7,000-SLOC class that is the highest-blast-radius module in Hermes.
|
||||||
|
|
||||||
|
## Goal
|
||||||
|
Decompose `AIAgent` into 5 focused classes with strict interfaces, enabling:
|
||||||
|
- Parallel rewrites by competing sub-agents (Phase II)
|
||||||
|
- Independent testing of loop semantics vs. model I/O vs. memory
|
||||||
|
- Future runtime replacement (Hermes Ω) without touching tool infrastructure
|
||||||
|
|
||||||
|
## Proposed Decomposition
|
||||||
|
|
||||||
|
### 1. `ConversationLoop`
|
||||||
|
**Responsibility:** Own the `while` loop invariant, iteration budget, and termination conditions.
|
||||||
|
**Interface:**
|
||||||
|
```python
|
||||||
|
class ConversationLoop:
|
||||||
|
def run(self, messages: list, tools: list, client) -> dict:
|
||||||
|
...
|
||||||
|
```
|
||||||
|
**Invariant:** Must terminate before `max_iterations` and `iteration_budget.remaining <= 0`.
|
||||||
|
|
||||||
|
### 2. `ModelDispatcher`
|
||||||
|
**Responsibility:** All interaction with `client.chat.completions.create`, including streaming, fallback activation, and response normalization.
|
||||||
|
**Interface:**
|
||||||
|
```python
|
||||||
|
class ModelDispatcher:
|
||||||
|
def call(self, model: str, messages: list, tools: list, **kwargs) -> ModelResponse:
|
||||||
|
...
|
||||||
|
```
|
||||||
|
**Invariant:** Must always return a normalized object with `.content`, `.tool_calls`, `.reasoning`.
|
||||||
|
|
||||||
|
### 3. `ToolExecutor`
|
||||||
|
**Responsibility:** Execute tool calls (sequential or concurrent), handle errors, and format results.
|
||||||
|
**Interface:**
|
||||||
|
```python
|
||||||
|
class ToolExecutor:
|
||||||
|
def execute(self, tool_calls: list, task_id: str = None) -> list[ToolResult]:
|
||||||
|
...
|
||||||
|
```
|
||||||
|
**Invariant:** Every tool_call produces exactly one ToolResult, and errors are JSON-serializable.
|
||||||
|
|
||||||
|
### 4. `MemoryInterceptor`
|
||||||
|
**Responsibility:** Intercept `memory` and `todo` tool calls before they reach the registry, plus flush memories on session end.
|
||||||
|
**Interface:**
|
||||||
|
```python
|
||||||
|
class MemoryInterceptor:
|
||||||
|
def intercept(self, tool_name: str, args: dict, task_id: str = None) -> str | None:
|
||||||
|
... # returns result if intercepted, None if pass-through
|
||||||
|
```
|
||||||
|
**Invariant:** Must not mutate agent state except through explicit `flush()` calls.
|
||||||
|
|
||||||
|
### 5. `PromptBuilder`
|
||||||
|
**Responsibility:** Assemble system prompt, inject skills, apply context compression, and manage prompt caching markers.
|
||||||
|
**Interface:**
|
||||||
|
```python
|
||||||
|
class PromptBuilder:
|
||||||
|
def build(self, user_message: str, conversation_history: list) -> list:
|
||||||
|
...
|
||||||
|
```
|
||||||
|
**Invariant:** Output list must start with a system message (or equivalent provider parameter).
|
||||||
|
|
||||||
|
## Migration Path
|
||||||
|
1. Create the 5 classes as thin facades that delegate back to `AIAgent` methods.
|
||||||
|
2. Move logic incrementally from `AIAgent` into the new classes.
|
||||||
|
3. Once `AIAgent` is a pure coordinator (~500 SLOC), freeze the interface.
|
||||||
|
4. Phase II competing agents rewrite one class at a time.
|
||||||
|
|
||||||
|
## Acceptance Criteria
|
||||||
|
- [ ] `AIAgent` reduced to < 1,000 SLOC
|
||||||
|
- [ ] Each new class has > 80% test coverage
|
||||||
|
- [ ] Full existing test suite still passes
|
||||||
|
- [ ] No behavioral regressions in shadow mode
|
||||||
263
docs/ouroboros/specs/SPEC.md
Normal file
263
docs/ouroboros/specs/SPEC.md
Normal file
@@ -0,0 +1,263 @@
|
|||||||
|
# Hermes Ω Specification Draft (Ouroboros Phase I)
|
||||||
|
|
||||||
|
> Auto-generated by Ezra as part of EPIC-999. This document is a living artifact.
|
||||||
|
|
||||||
|
## Scope
|
||||||
|
This specification covers the core runtime of Hermes agent v0.7.x as found in the `hermes-agent` codebase.
|
||||||
|
|
||||||
|
## High-Level Architecture
|
||||||
|
|
||||||
|
```
|
||||||
|
User Message
|
||||||
|
↓
|
||||||
|
Gateway (gateway/run.py) — platform adapter (Telegram, Discord, CLI, etc.)
|
||||||
|
↓
|
||||||
|
HermesCLI (cli.py) or AIAgent.chat() (run_agent.py)
|
||||||
|
↓
|
||||||
|
ModelTools (model_tools.py) — tool discovery, schema assembly, dispatch
|
||||||
|
↓
|
||||||
|
Tool Registry (tools/registry.py) — handler lookup, availability checks
|
||||||
|
↓
|
||||||
|
Individual Tool Implementations (tools/*.py)
|
||||||
|
↓
|
||||||
|
Results returned up the stack
|
||||||
|
```
|
||||||
|
|
||||||
|
## Module Specifications
|
||||||
|
|
||||||
|
### `run_agent.py`
|
||||||
|
**Lines of Code:** 8948
|
||||||
|
|
||||||
|
**Classes:**
|
||||||
|
- `_SafeWriter`
|
||||||
|
- *Transparent stdio wrapper that catches OSError/ValueError from broken pipes.*
|
||||||
|
- `__init__(self, inner)`
|
||||||
|
- `write(self, data)`
|
||||||
|
- `flush(self)`
|
||||||
|
- `fileno(self)`
|
||||||
|
- `isatty(self)`
|
||||||
|
- ... and 1 more methods
|
||||||
|
- `IterationBudget`
|
||||||
|
- *Thread-safe iteration counter for an agent.*
|
||||||
|
- `__init__(self, max_total)`
|
||||||
|
- `consume(self)`
|
||||||
|
- `refund(self)`
|
||||||
|
- `used(self)`
|
||||||
|
- `remaining(self)`
|
||||||
|
- `AIAgent`
|
||||||
|
- *AI Agent with tool calling capabilities.*
|
||||||
|
- `base_url(self)`
|
||||||
|
- `base_url(self, value)`
|
||||||
|
- `__init__(self, base_url, api_key, provider, api_mode, acp_command, acp_args, command, args, model, max_iterations, tool_delay, enabled_toolsets, disabled_toolsets, save_trajectories, verbose_logging, quiet_mode, ephemeral_system_prompt, log_prefix_chars, log_prefix, providers_allowed, providers_ignored, providers_order, provider_sort, provider_require_parameters, provider_data_collection, session_id, tool_progress_callback, tool_start_callback, tool_complete_callback, thinking_callback, reasoning_callback, clarify_callback, step_callback, stream_delta_callback, tool_gen_callback, status_callback, max_tokens, reasoning_config, prefill_messages, platform, skip_context_files, skip_memory, session_db, iteration_budget, fallback_model, credential_pool, checkpoints_enabled, checkpoint_max_snapshots, pass_session_id, persist_session)`
|
||||||
|
- `reset_session_state(self)`
|
||||||
|
- `_safe_print(self)`
|
||||||
|
- ... and 100 more methods
|
||||||
|
|
||||||
|
**Top-Level Functions:**
|
||||||
|
- `_install_safe_stdio()`
|
||||||
|
- `_is_destructive_command(cmd)`
|
||||||
|
- `_should_parallelize_tool_batch(tool_calls)`
|
||||||
|
- `_extract_parallel_scope_path(tool_name, function_args)`
|
||||||
|
- `_paths_overlap(left, right)`
|
||||||
|
- `_sanitize_surrogates(text)`
|
||||||
|
- `_sanitize_messages_surrogates(messages)`
|
||||||
|
- `_strip_budget_warnings_from_history(messages)`
|
||||||
|
- `main(query, model, api_key, base_url, max_turns, enabled_toolsets, disabled_toolsets, list_tools, save_trajectories, save_sample, verbose, log_prefix_chars)`
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Persists state to SQLite database.
|
||||||
|
- Performs file I/O.
|
||||||
|
- Makes HTTP network calls.
|
||||||
|
- Uses global mutable state (risk factor).
|
||||||
|
|
||||||
|
### `model_tools.py`
|
||||||
|
**Lines of Code:** 466
|
||||||
|
|
||||||
|
**Top-Level Functions:**
|
||||||
|
- `_get_tool_loop()`
|
||||||
|
- `_get_worker_loop()`
|
||||||
|
- `_run_async(coro)`
|
||||||
|
- `_discover_tools()`
|
||||||
|
- `get_tool_definitions(enabled_toolsets, disabled_toolsets, quiet_mode)`
|
||||||
|
- `handle_function_call(function_name, function_args, task_id, user_task, enabled_tools)`
|
||||||
|
- `get_all_tool_names()`
|
||||||
|
- `get_toolset_for_tool(tool_name)`
|
||||||
|
- `get_available_toolsets()`
|
||||||
|
- `check_toolset_requirements()`
|
||||||
|
- ... and 1 more functions
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Uses global mutable state (risk factor).
|
||||||
|
- Primarily pure Python logic / orchestration.
|
||||||
|
|
||||||
|
### `cli.py`
|
||||||
|
**Lines of Code:** 8280
|
||||||
|
|
||||||
|
**Classes:**
|
||||||
|
- `ChatConsole`
|
||||||
|
- *Rich Console adapter for prompt_toolkit's patch_stdout context.*
|
||||||
|
- `__init__(self)`
|
||||||
|
- `print(self)`
|
||||||
|
- `HermesCLI`
|
||||||
|
- *Interactive CLI for the Hermes Agent.*
|
||||||
|
- `__init__(self, model, toolsets, provider, api_key, base_url, max_turns, verbose, compact, resume, checkpoints, pass_session_id)`
|
||||||
|
- `_invalidate(self, min_interval)`
|
||||||
|
- `_status_bar_context_style(self, percent_used)`
|
||||||
|
- `_build_context_bar(self, percent_used, width)`
|
||||||
|
- `_get_status_bar_snapshot(self)`
|
||||||
|
- ... and 106 more methods
|
||||||
|
|
||||||
|
**Top-Level Functions:**
|
||||||
|
- `_load_prefill_messages(file_path)`
|
||||||
|
- `_parse_reasoning_config(effort)`
|
||||||
|
- `load_cli_config()`
|
||||||
|
- `_run_cleanup()`
|
||||||
|
- `_git_repo_root()`
|
||||||
|
- `_path_is_within_root(path, root)`
|
||||||
|
- `_setup_worktree(repo_root)`
|
||||||
|
- `_cleanup_worktree(info)`
|
||||||
|
- `_prune_stale_worktrees(repo_root, max_age_hours)`
|
||||||
|
- `_accent_hex()`
|
||||||
|
- ... and 9 more functions
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Persists state to SQLite database.
|
||||||
|
- Performs file I/O.
|
||||||
|
- Spawns subprocesses / shell commands.
|
||||||
|
- Uses global mutable state (risk factor).
|
||||||
|
|
||||||
|
### `tools/registry.py`
|
||||||
|
**Lines of Code:** 275
|
||||||
|
|
||||||
|
**Classes:**
|
||||||
|
- `ToolEntry`
|
||||||
|
- *Metadata for a single registered tool.*
|
||||||
|
- `__init__(self, name, toolset, schema, handler, check_fn, requires_env, is_async, description, emoji)`
|
||||||
|
- `ToolRegistry`
|
||||||
|
- *Singleton registry that collects tool schemas + handlers from tool files.*
|
||||||
|
- `__init__(self)`
|
||||||
|
- `register(self, name, toolset, schema, handler, check_fn, requires_env, is_async, description, emoji)`
|
||||||
|
- `deregister(self, name)`
|
||||||
|
- `get_definitions(self, tool_names, quiet)`
|
||||||
|
- `dispatch(self, name, args)`
|
||||||
|
- ... and 10 more methods
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Primarily pure Python logic / orchestration.
|
||||||
|
|
||||||
|
### `gateway/run.py`
|
||||||
|
**Lines of Code:** 6657
|
||||||
|
|
||||||
|
**Classes:**
|
||||||
|
- `GatewayRunner`
|
||||||
|
- *Main gateway controller.*
|
||||||
|
- `__init__(self, config)`
|
||||||
|
- `_has_setup_skill(self)`
|
||||||
|
- `_load_voice_modes(self)`
|
||||||
|
- `_save_voice_modes(self)`
|
||||||
|
- `_set_adapter_auto_tts_disabled(self, adapter, chat_id, disabled)`
|
||||||
|
- ... and 78 more methods
|
||||||
|
|
||||||
|
**Top-Level Functions:**
|
||||||
|
- `_ensure_ssl_certs()`
|
||||||
|
- `_normalize_whatsapp_identifier(value)`
|
||||||
|
- `_expand_whatsapp_auth_aliases(identifier)`
|
||||||
|
- `_resolve_runtime_agent_kwargs()`
|
||||||
|
- `_build_media_placeholder(event)`
|
||||||
|
- `_dequeue_pending_text(adapter, session_key)`
|
||||||
|
- `_check_unavailable_skill(command_name)`
|
||||||
|
- `_platform_config_key(platform)`
|
||||||
|
- `_load_gateway_config()`
|
||||||
|
- `_resolve_gateway_model(config)`
|
||||||
|
- ... and 4 more functions
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Persists state to SQLite database.
|
||||||
|
- Performs file I/O.
|
||||||
|
- Spawns subprocesses / shell commands.
|
||||||
|
- Contains async code paths.
|
||||||
|
- Uses global mutable state (risk factor).
|
||||||
|
|
||||||
|
### `hermes_state.py`
|
||||||
|
**Lines of Code:** 1270
|
||||||
|
|
||||||
|
**Classes:**
|
||||||
|
- `SessionDB`
|
||||||
|
- *SQLite-backed session storage with FTS5 search.*
|
||||||
|
- `__init__(self, db_path)`
|
||||||
|
- `_execute_write(self, fn)`
|
||||||
|
- `_try_wal_checkpoint(self)`
|
||||||
|
- `close(self)`
|
||||||
|
- `_init_schema(self)`
|
||||||
|
- ... and 29 more methods
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Persists state to SQLite database.
|
||||||
|
|
||||||
|
### `agent/context_compressor.py`
|
||||||
|
**Lines of Code:** 676
|
||||||
|
|
||||||
|
**Classes:**
|
||||||
|
- `ContextCompressor`
|
||||||
|
- *Compresses conversation context when approaching the model's context limit.*
|
||||||
|
- `__init__(self, model, threshold_percent, protect_first_n, protect_last_n, summary_target_ratio, quiet_mode, summary_model_override, base_url, api_key, config_context_length, provider)`
|
||||||
|
- `update_from_response(self, usage)`
|
||||||
|
- `should_compress(self, prompt_tokens)`
|
||||||
|
- `should_compress_preflight(self, messages)`
|
||||||
|
- `get_status(self)`
|
||||||
|
- ... and 11 more methods
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Primarily pure Python logic / orchestration.
|
||||||
|
|
||||||
|
### `agent/prompt_caching.py`
|
||||||
|
**Lines of Code:** 72
|
||||||
|
|
||||||
|
**Top-Level Functions:**
|
||||||
|
- `_apply_cache_marker(msg, cache_marker, native_anthropic)`
|
||||||
|
- `apply_anthropic_cache_control(api_messages, cache_ttl, native_anthropic)`
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Primarily pure Python logic / orchestration.
|
||||||
|
|
||||||
|
### `agent/skill_commands.py`
|
||||||
|
**Lines of Code:** 297
|
||||||
|
|
||||||
|
**Top-Level Functions:**
|
||||||
|
- `build_plan_path(user_instruction)`
|
||||||
|
- `_load_skill_payload(skill_identifier, task_id)`
|
||||||
|
- `_build_skill_message(loaded_skill, skill_dir, activation_note, user_instruction, runtime_note)`
|
||||||
|
- `scan_skill_commands()`
|
||||||
|
- `get_skill_commands()`
|
||||||
|
- `build_skill_invocation_message(cmd_key, user_instruction, task_id, runtime_note)`
|
||||||
|
- `build_preloaded_skills_prompt(skill_identifiers, task_id)`
|
||||||
|
|
||||||
|
**Inferred Side Effects & Invariants:**
|
||||||
|
- Uses global mutable state (risk factor).
|
||||||
|
- Primarily pure Python logic / orchestration.
|
||||||
|
|
||||||
|
## Cross-Module Dependencies
|
||||||
|
|
||||||
|
Key data flow:
|
||||||
|
1. `run_agent.py` defines `AIAgent` — the canonical conversation loop.
|
||||||
|
2. `model_tools.py` assembles tool schemas and dispatches function calls.
|
||||||
|
3. `tools/registry.py` maintains the central registry; all tool files import it.
|
||||||
|
4. `gateway/run.py` adapts platform events into `AIAgent.run_conversation()` calls.
|
||||||
|
5. `cli.py` (`HermesCLI`) provides the interactive shell and slash-command routing.
|
||||||
|
|
||||||
|
## Known Coupling Risks
|
||||||
|
|
||||||
|
- `run_agent.py` is ~7k SLOC and contains the core loop, todo/memory interception, context compression, and trajectory saving. High blast radius.
|
||||||
|
- `cli.py` is ~6.5k SLOC and combines UI (Rich/prompt_toolkit), config loading, and command dispatch. Tightly coupled to display state.
|
||||||
|
- `model_tools.py` holds a process-global `_last_resolved_tool_names`. Subagent execution saves/restores this global.
|
||||||
|
- `tools/registry.py` is imported by ALL tool files; schema generation happens at import time.
|
||||||
|
|
||||||
|
## Next Actions (Phase II Prep)
|
||||||
|
|
||||||
|
1. Decompose `AIAgent` into: `ConversationLoop`, `ContextManager`, `ToolDispatcher`, `MemoryInterceptor`.
|
||||||
|
2. Extract CLI display logic from command dispatch.
|
||||||
|
3. Define strict interfaces between gateway → agent → tools.
|
||||||
|
4. Write property-based tests for the conversation loop invariant: *given the same message history and tool results, the agent must produce deterministic tool_call ordering*.
|
||||||
|
|
||||||
|
---
|
||||||
|
Generated: 2026-04-05 by Ezra (Phase I)
|
||||||
137
docs/ouroboros/specs/test_invariants_stubs.py
Normal file
137
docs/ouroboros/specs/test_invariants_stubs.py
Normal file
@@ -0,0 +1,137 @@
|
|||||||
|
"""
|
||||||
|
Property-based test stubs for Hermes core invariants.
|
||||||
|
Part of EPIC-999 Phase I — The Mirror.
|
||||||
|
|
||||||
|
These tests define behavioral contracts that ANY rewrite of the runtime
|
||||||
|
must satisfy, including the Hermes Ω target.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import pytest
|
||||||
|
from unittest.mock import Mock, patch
|
||||||
|
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Conversation Loop Invariants
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class TestConversationLoopInvariants:
|
||||||
|
"""
|
||||||
|
Invariants for AIAgent.run_conversation and its successors.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def test_deterministic_tool_ordering(self):
|
||||||
|
"""
|
||||||
|
Given the same message history and available tools,
|
||||||
|
the agent must produce the same tool_call ordering.
|
||||||
|
|
||||||
|
(If non-determinism is introduced by temperature > 0,
|
||||||
|
this becomes a statistical test.)
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: implement with seeded mock model responses")
|
||||||
|
|
||||||
|
def test_tool_result_always_appended_to_history(self):
|
||||||
|
"""
|
||||||
|
After any tool_call is executed, its result MUST appear
|
||||||
|
in the conversation history before the next assistant turn.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: mock model with forced tool_call and verify history")
|
||||||
|
|
||||||
|
def test_iteration_budget_never_exceeded(self):
|
||||||
|
"""
|
||||||
|
The loop must terminate before api_call_count >= max_iterations
|
||||||
|
AND before iteration_budget.remaining <= 0.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: mock model to always return tool_calls; verify termination")
|
||||||
|
|
||||||
|
def test_system_prompt_presence(self):
|
||||||
|
"""
|
||||||
|
Every API call must include a system message as the first message
|
||||||
|
(or system parameter for providers that support it).
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: intercept all client.chat.completions.create calls")
|
||||||
|
|
||||||
|
def test_compression_preserves_last_n_messages(self):
|
||||||
|
"""
|
||||||
|
After context compression, the final N messages (configurable,
|
||||||
|
default ~4) must remain uncompressed to preserve local context.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: create history > threshold, compress, verify tail")
|
||||||
|
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Tool Registry Invariants
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class TestToolRegistryInvariants:
|
||||||
|
"""
|
||||||
|
Invariants for tools.registry.Registry.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def test_register_then_list_contains_tool(self):
|
||||||
|
"""
|
||||||
|
After register() is called with a valid schema and handler,
|
||||||
|
list_tools() must include the registered name.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: instantiate fresh Registry, register, assert membership")
|
||||||
|
|
||||||
|
def test_dispatch_unknown_tool_returns_error_json(self):
|
||||||
|
"""
|
||||||
|
Calling dispatch() with an unregistered tool name must return
|
||||||
|
a JSON string containing an error key, never raise raw.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: call dispatch with 'nonexistent_tool', parse result")
|
||||||
|
|
||||||
|
def test_handler_receives_task_id_kwarg(self):
|
||||||
|
"""
|
||||||
|
Registered handlers that accept **kwargs must receive task_id
|
||||||
|
when dispatch is called with one.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: register mock handler, dispatch with task_id, verify")
|
||||||
|
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# State Persistence Invariants
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class TestStatePersistenceInvariants:
|
||||||
|
"""
|
||||||
|
Invariants for hermes_state.SessionDB.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def test_saved_message_is_retrievable_by_session_id(self):
|
||||||
|
"""
|
||||||
|
After save_message(session_id, ...), get_messages(session_id)
|
||||||
|
must return the message.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: use temp SQLite DB, save, query, assert")
|
||||||
|
|
||||||
|
def test_fts_search_returns_relevant_messages(self):
|
||||||
|
"""
|
||||||
|
After indexing messages, FTS search for a unique keyword
|
||||||
|
must return the message containing it.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: seed DB with messages, search unique token")
|
||||||
|
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Context Compressor Invariants
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class TestContextCompressorInvariants:
|
||||||
|
"""
|
||||||
|
Invariants for agent.context_compressor.ContextCompressor.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def test_compression_reduces_token_count(self):
|
||||||
|
"""
|
||||||
|
compress_messages(output) must have fewer tokens than
|
||||||
|
the uncompressed input (for any input > threshold).
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: mock tokenizer, provide long history, assert reduction")
|
||||||
|
|
||||||
|
def test_compression_never_drops_system_message(self):
|
||||||
|
"""
|
||||||
|
The system message must survive compression and remain
|
||||||
|
at index 0 of the returned message list.
|
||||||
|
"""
|
||||||
|
pytest.skip("TODO: compress history with system msg, verify position")
|
||||||
Reference in New Issue
Block a user