Files
hermes-agent/tools
Alexander Whitestone 6c849a1157
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 53s
feat: warm session provisioning v2 — full acceptance criteria (#327)
Marathon sessions (100+ msgs) have lower per-tool error rates (5.7%)
than mid-length sessions (9.0%). This implements warm session
provisioning addressing all four acceptance criteria:

1. What makes marathon sessions reliable?
   - SessionProfiler analyzes error rates, tool distribution,
     proficiency gain (early vs late error rate delta)

2. Pre-seed sessions with successful tool-call examples?
   - PatternExtractor mines successful tool calls from SessionDB
   - build_warm_conversation() converts to conversation_history
   - Injected via existing run_conversation() parameter

3. Does context compression preserve proficiency?
   - analyze_compression_impact() compares parent vs child session
     error rates after compression events

4. A/B testing: warm vs cold comparison
   - compare_sessions() computes error rate improvement
   - profile action analyzes individual sessions
   - compare action runs A/B between two sessions

agent/warm_session.py (678 lines):
  - SessionProfile, WarmPattern, WarmSessionTemplate dataclasses
  - profile_session() — reliability analysis
  - extract_patterns_from_session() — mines successful patterns
  - extract_from_session_db() — batch extraction from marathon sessions
  - build_warm_conversation() — conversation_history builder
  - analyze_compression_impact() — compression preservation test
  - compare_sessions() — A/B comparison
  - save/load/list templates

tools/warm_session_tool.py (275 lines):
  7 actions: build, list, load, delete, profile, compress-check, compare

25 tests added, all passing.

Closes #327
2026-04-13 20:19:58 -04:00
..