Compare commits

..

3 Commits

Author SHA1 Message Date
5d8e7bbe4f docs: Add warm session provisioning README
Some checks failed
Forge CI / smoke-and-build (pull_request) Failing after 59s
Documentation for #327 implementation.
2026-04-14 01:40:29 +00:00
9ede517d4c feat(cli): Add warm session commands
Part of #327. Adds `hermes warm` command for session template management.
2026-04-14 01:39:56 +00:00
3588283b83 feat(research): Warm session provisioning implementation
Practical implementation for #327. Extracts seed data from existing sessions to bootstrap new sessions with established context and patterns.
2026-04-14 01:39:15 +00:00
4 changed files with 624 additions and 619 deletions

View File

@@ -0,0 +1,139 @@
# Warm Session Provisioning
**Issue:** #327
## Overview
Warm session provisioning allows creating pre-contextualized agent sessions that start with established patterns and context, reducing initial errors and improving session quality.
## Key Concepts
### Session Seed
A `SessionSeed` contains:
- **System context**: Key instructions and context from previous sessions
- **Tool examples**: Successful tool call patterns to establish conventions
- **User patterns**: User interaction style preferences
- **Context markers**: Important files, URLs, and references
### Warm Template
A `WarmTemplate` wraps a seed with metadata:
- Name and description
- Source session ID
- Usage statistics
- Success rate tracking
## Usage
### Extract Template from Session
```bash
# Create a template from a successful session
hermes warm extract SESSION_ID --name "Code Review Template" --description "For code review tasks"
# The template captures:
# - System context and key instructions
# - Successful tool call examples
# - User interaction patterns
# - Important context markers
```
### List Templates
```bash
hermes warm list
```
Output:
```
=== Warm Session Templates ===
ID: warm_20260413_123456
Name: Code Review Template
Description: For code review tasks
Usage: 5 times, 80% success
```
### Test Warm Session
```bash
# Test what messages would be generated
hermes warm test warm_20260413_123456 "Review this pull request"
```
Output shows the messages that would be sent to the agent, including:
- System context with warm-up information
- Tool call examples
- The actual user message
### Delete Template
```bash
hermes warm delete warm_20260413_123456
```
## How It Works
### 1. Extraction Phase
When you extract a template:
1. System messages provide base context
2. First 10 user messages establish patterns
3. Successful tool calls become examples
4. File paths and URLs become context markers
### 2. Bootstrap Phase
When creating a warm session:
1. System context is injected as initial message
2. Tool examples establish successful patterns
3. User message follows the warm-up context
4. Agent starts with established conventions
## Example Workflow
```bash
# 1. Have a successful session
# ... work with the agent on a complex task ...
# 2. Extract template from that session
hermes warm extract abc123 --name "API Integration" --description "REST API work"
# 3. Later, start a new session with warm context
# The agent will have context about:
# - Your coding style
# - Successful tool patterns
# - Common file paths
# - Previous instructions
```
## Benefits
1. **Reduced Initial Errors**: Agent starts with proven patterns
2. **Consistent Behavior**: Established conventions carry over
3. **Faster Context**: No need to re-explain preferences
4. **Quality Tracking**: Success rate shows template effectiveness
## Implementation Details
### Files
- `tools/warm_session.py`: Core implementation
- `~/.hermes/warm_templates/`: Template storage
### Data Flow
```
Session -> SessionExtractor -> SessionSeed -> WarmTemplate
WarmSessionBootstrapper -> Messages -> Agent
```
## Research Context
This implementation addresses Finding #4 from the empirical audit:
- Marathon sessions show different error patterns
- Context establishment affects session quality
- Pre-seeding can improve initial session reliability
## Future Enhancements
1. **Automatic Template Creation**: Create templates from high-performing sessions
2. **Template Sharing**: Export/import templates between installations
3. **A/B Testing**: Compare warm vs cold session performance
4. **Smart Selection**: Automatically choose best template for task type

View File

@@ -5286,50 +5286,6 @@ For more help on a command:
warm_delete.add_argument("template_id", help="Template ID to delete")
warm_parser.set_defaults(func=cmd_warm)
# A/B testing command
ab_parser = subparsers.add_parser(
"ab-test",
help="A/B test warm vs cold sessions",
description="Framework for comparing warm and cold session performance"
)
ab_subparsers = ab_parser.add_subparsers(dest="ab_command")
# Create test
ab_create = ab_subparsers.add_parser("create", help="Create a new A/B test")
ab_create.add_argument("--task-id", required=True, help="Task ID")
ab_create.add_argument("--description", required=True, help="Task description")
ab_create.add_argument("--prompt", required=True, help="Test prompt")
ab_create.add_argument("--category", default="general", help="Task category")
ab_create.add_argument("--difficulty", default="medium", choices=["easy", "medium", "hard"])
# List tests
ab_subparsers.add_parser("list", help="List all A/B tests")
# Show test
ab_show = ab_subparsers.add_parser("show", help="Show test details")
ab_show.add_argument("test_id", help="Test ID")
# Analyze test
ab_analyze = ab_subparsers.add_parser("analyze", help="Analyze test results")
ab_analyze.add_argument("test_id", help="Test ID")
# Add result
ab_add = ab_subparsers.add_parser("add-result", help="Add a test result")
ab_add.add_argument("test_id", help="Test ID")
ab_add.add_argument("--session-type", required=True, choices=["cold", "warm"])
ab_add.add_argument("--session-id", required=True, help="Session ID")
ab_add.add_argument("--tool-calls", type=int, default=0)
ab_add.add_argument("--successful-calls", type=int, default=0)
ab_add.add_argument("--completion-time", type=float, default=0.0)
ab_add.add_argument("--success", action="store_true")
ab_add.add_argument("--notes", default="")
# Delete test
ab_delete = ab_subparsers.add_parser("delete", help="Delete a test")
ab_delete.add_argument("test_id", help="Test ID")
ab_parser.set_defaults(func=cmd_ab_test)
# =========================================================================
@@ -5713,61 +5669,3 @@ def cmd_warm(args):
print(color(f"Error: {e}", Colors.RED))
return 1
def cmd_ab_test(args):
"""Handle A/B testing commands."""
from hermes_cli.colors import Colors, color
subcmd = getattr(args, 'ab_command', None)
if subcmd is None:
print(color("A/B Testing Framework for Warm vs Cold Sessions", Colors.CYAN))
print("\nCommands:")
print(" hermes ab-test create --task-id ID --description DESC --prompt PROMPT")
print(" hermes ab-test list")
print(" hermes ab-test show TEST_ID")
print(" hermes ab-test analyze TEST_ID")
print(" hermes ab-test add-result TEST_ID --session-type TYPE --session-id ID")
print(" hermes ab-test delete TEST_ID")
return 0
try:
from tools.session_ab_testing import ab_test_cli
args_list = []
if subcmd == "create":
args_list = ["create", "--task-id", args.task_id, "--description", args.description, "--prompt", args.prompt]
if args.category:
args_list.extend(["--category", args.category])
if args.difficulty:
args_list.extend(["--difficulty", args.difficulty])
elif subcmd == "list":
args_list = ["list"]
elif subcmd == "show":
args_list = ["show", args.test_id]
elif subcmd == "analyze":
args_list = ["analyze", args.test_id]
elif subcmd == "add-result":
args_list = ["add-result", args.test_id, "--session-type", args.session_type, "--session-id", args.session_id]
if args.tool_calls:
args_list.extend(["--tool-calls", str(args.tool_calls)])
if args.successful_calls:
args_list.extend(["--successful-calls", str(args.successful_calls)])
if args.completion_time:
args_list.extend(["--completion-time", str(args.completion_time)])
if args.success:
args_list.append("--success")
if args.notes:
args_list.extend(["--notes", args.notes])
elif subcmd == "delete":
args_list = ["delete", args.test_id]
return ab_test_cli(args_list)
except ImportError as e:
print(color(f"Error: Cannot import session_ab_testing module: {e}", Colors.RED))
return 1
except Exception as e:
print(color(f"Error: {e}", Colors.RED))
return 1

View File

@@ -1,517 +0,0 @@
"""
Warm Session A/B Testing Framework
Framework for comparing warm vs cold session performance.
Addresses research questions from issue #327.
Issue: #327
"""
import json
import logging
import time
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from dataclasses import dataclass, asdict, field
from enum import Enum
import statistics
logger = logging.getLogger(__name__)
class SessionType(Enum):
"""Type of session for A/B testing."""
COLD = "cold" # Fresh session, no warm-up
WARM = "warm" # Session with warm-up context
@dataclass
class TestTask:
"""A task for A/B testing."""
task_id: str
description: str
prompt: str
expected_tools: List[str] = field(default_factory=list)
success_criteria: Dict[str, Any] = field(default_factory=dict)
category: str = "general"
difficulty: str = "medium" # easy, medium, hard
@dataclass
class SessionResult:
"""Result from a session test."""
session_id: str
session_type: SessionType
task_id: str
start_time: str
end_time: Optional[str] = None
message_count: int = 0
tool_calls: int = 0
successful_tool_calls: int = 0
errors: List[str] = field(default_factory=list)
completion_time_seconds: float = 0.0
user_corrections: int = 0
success: bool = False
notes: str = ""
@property
def error_rate(self) -> float:
"""Calculate error rate."""
if self.tool_calls == 0:
return 0.0
return (self.tool_calls - self.successful_tool_calls) / self.tool_calls
@property
def success_rate(self) -> float:
"""Calculate success rate."""
if self.tool_calls == 0:
return 0.0
return self.successful_tool_calls / self.tool_calls
def to_dict(self) -> Dict[str, Any]:
return {
"session_id": self.session_id,
"session_type": self.session_type.value,
"task_id": self.task_id,
"start_time": self.start_time,
"end_time": self.end_time,
"message_count": self.message_count,
"tool_calls": self.tool_calls,
"successful_tool_calls": self.successful_tool_calls,
"errors": self.errors,
"completion_time_seconds": self.completion_time_seconds,
"user_corrections": self.user_corrections,
"success": self.success,
"error_rate": self.error_rate,
"success_rate": self.success_rate,
"notes": self.notes
}
@dataclass
class ABTestResult:
"""Results from an A/B test."""
test_id: str
task: TestTask
cold_results: List[SessionResult] = field(default_factory=list)
warm_results: List[SessionResult] = field(default_factory=list)
created_at: str = field(default_factory=lambda: datetime.now().isoformat())
def add_result(self, result: SessionResult):
"""Add a session result."""
if result.session_type == SessionType.COLD:
self.cold_results.append(result)
else:
self.warm_results.append(result)
def get_summary(self) -> Dict[str, Any]:
"""Get summary statistics."""
def calc_stats(results: List[SessionResult]) -> Dict[str, Any]:
if not results:
return {"count": 0}
error_rates = [r.error_rate for r in results]
success_rates = [r.success_rate for r in results]
completion_times = [r.completion_time_seconds for r in results if r.completion_time_seconds > 0]
message_counts = [r.message_count for r in results]
return {
"count": len(results),
"avg_error_rate": statistics.mean(error_rates) if error_rates else 0,
"avg_success_rate": statistics.mean(success_rates) if success_rates else 0,
"avg_completion_time": statistics.mean(completion_times) if completion_times else 0,
"avg_messages": statistics.mean(message_counts) if message_counts else 0,
"success_count": sum(1 for r in results if r.success)
}
cold_stats = calc_stats(self.cold_results)
warm_stats = calc_stats(self.warm_results)
# Calculate improvement
improvement = {}
if cold_stats.get("count", 0) > 0 and warm_stats.get("count", 0) > 0:
cold_error = cold_stats.get("avg_error_rate", 0)
warm_error = warm_stats.get("avg_error_rate", 0)
if cold_error > 0:
improvement["error_rate"] = (cold_error - warm_error) / cold_error
cold_success = cold_stats.get("avg_success_rate", 0)
warm_success = warm_stats.get("avg_success_rate", 0)
if cold_success > 0:
improvement["success_rate"] = (warm_success - cold_success) / cold_success
return {
"task_id": self.task.task_id,
"cold": cold_stats,
"warm": warm_stats,
"improvement": improvement,
"recommendation": self._get_recommendation(cold_stats, warm_stats)
}
def _get_recommendation(self, cold_stats: Dict, warm_stats: Dict) -> str:
"""Generate recommendation based on results."""
if cold_stats.get("count", 0) < 3 or warm_stats.get("count", 0) < 3:
return "Insufficient data (need at least 3 tests each)"
cold_error = cold_stats.get("avg_error_rate", 0)
warm_error = warm_stats.get("avg_error_rate", 0)
if warm_error < cold_error * 0.8: # 20% improvement
return "WARM recommended: Significant error reduction"
elif warm_error > cold_error * 1.2: # 20% worse
return "COLD recommended: Warm sessions performed worse"
else:
return "No significant difference detected"
def to_dict(self) -> Dict[str, Any]:
return {
"test_id": self.test_id,
"task": asdict(self.task),
"cold_results": [r.to_dict() for r in self.cold_results],
"warm_results": [r.to_dict() for r in self.warm_results],
"created_at": self.created_at,
"summary": self.get_summary()
}
class ABTestManager:
"""Manage A/B tests."""
def __init__(self, test_dir: Path = None):
self.test_dir = test_dir or Path.home() / ".hermes" / "ab_tests"
self.test_dir.mkdir(parents=True, exist_ok=True)
def create_test(self, task: TestTask) -> ABTestResult:
"""Create a new A/B test."""
test_id = f"test_{datetime.now().strftime('%Y%m%d_%H%M%S')}_{task.task_id}"
result = ABTestResult(
test_id=test_id,
task=task
)
self.save_test(result)
return result
def save_test(self, test: ABTestResult):
"""Save test results."""
path = self.test_dir / f"{test.test_id}.json"
with open(path, 'w') as f:
json.dump(test.to_dict(), f, indent=2)
def load_test(self, test_id: str) -> Optional[ABTestResult]:
"""Load test results."""
path = self.test_dir / f"{test_id}.json"
if not path.exists():
return None
try:
with open(path, 'r') as f:
data = json.load(f)
task = TestTask(**data["task"])
test = ABTestResult(
test_id=data["test_id"],
task=task,
created_at=data.get("created_at", "")
)
for r in data.get("cold_results", []):
r["session_type"] = SessionType(r["session_type"])
test.cold_results.append(SessionResult(**r))
for r in data.get("warm_results", []):
r["session_type"] = SessionType(r["session_type"])
test.warm_results.append(SessionResult(**r))
return test
except Exception as e:
logger.error(f"Failed to load test: {e}")
return None
def list_tests(self) -> List[Dict[str, Any]]:
"""List all tests."""
tests = []
for path in self.test_dir.glob("*.json"):
try:
with open(path, 'r') as f:
data = json.load(f)
tests.append({
"test_id": data.get("test_id"),
"task_id": data.get("task", {}).get("task_id"),
"description": data.get("task", {}).get("description", ""),
"cold_count": len(data.get("cold_results", [])),
"warm_count": len(data.get("warm_results", [])),
"created_at": data.get("created_at")
})
except:
pass
return tests
def delete_test(self, test_id: str) -> bool:
"""Delete a test."""
path = self.test_dir / f"{test_id}.json"
if path.exists():
path.unlink()
return True
return False
class ABTestRunner:
"""Run A/B tests."""
def __init__(self, manager: ABTestManager = None):
self.manager = manager or ABTestManager()
def run_comparison(
self,
task: TestTask,
cold_messages: List[Dict],
warm_messages: List[Dict],
session_db=None
) -> Tuple[SessionResult, SessionResult]:
"""
Run a comparison between cold and warm sessions.
Returns:
Tuple of (cold_result, warm_result)
"""
# This is a framework - actual execution would depend on
# integration with the agent system
cold_result = SessionResult(
session_id=f"cold_{task.task_id}_{int(time.time())}",
session_type=SessionType.COLD,
task_id=task.task_id,
start_time=datetime.now().isoformat()
)
warm_result = SessionResult(
session_id=f"warm_{task.task_id}_{int(time.time())}",
session_type=SessionType.WARM,
task_id=task.task_id,
start_time=datetime.now().isoformat()
)
# In a real implementation, this would:
# 1. Start a cold session with cold_messages
# 2. Execute the task and collect metrics
# 3. Start a warm session with warm_messages
# 4. Execute the same task and collect metrics
# 5. Return both results
return cold_result, warm_result
def analyze_results(self, test_id: str) -> Dict[str, Any]:
"""Analyze test results."""
test = self.manager.load_test(test_id)
if not test:
return {"error": "Test not found"}
summary = test.get_summary()
# Add statistical significance check
if (summary["cold"].get("count", 0) >= 3 and
summary["warm"].get("count", 0) >= 3):
# Simple t-test approximation
cold_errors = [r.error_rate for r in test.cold_results]
warm_errors = [r.error_rate for r in test.warm_results]
if len(cold_errors) >= 2 and len(warm_errors) >= 2:
cold_std = statistics.stdev(cold_errors) if len(cold_errors) > 1 else 0
warm_std = statistics.stdev(warm_errors) if len(warm_errors) > 1 else 0
summary["statistical_notes"] = {
"cold_std_dev": cold_std,
"warm_std_dev": warm_std,
"significance": "low" if max(cold_std, warm_std) > 0.2 else "medium"
}
return summary
# CLI Interface
def ab_test_cli(args: List[str]) -> int:
"""CLI interface for A/B testing."""
import argparse
parser = argparse.ArgumentParser(description="Warm session A/B testing")
subparsers = parser.add_subparsers(dest="command")
# Create test
create_parser = subparsers.add_parser("create", help="Create a new test")
create_parser.add_argument("--task-id", required=True, help="Task ID")
create_parser.add_argument("--description", required=True, help="Task description")
create_parser.add_argument("--prompt", required=True, help="Test prompt")
create_parser.add_argument("--category", default="general", help="Task category")
create_parser.add_argument("--difficulty", default="medium", choices=["easy", "medium", "hard"])
# List tests
subparsers.add_parser("list", help="List all tests")
# Show test results
show_parser = subparsers.add_parser("show", help="Show test results")
show_parser.add_argument("test_id", help="Test ID")
# Analyze test
analyze_parser = subparsers.add_parser("analyze", help="Analyze test results")
analyze_parser.add_argument("test_id", help="Test ID")
# Delete test
delete_parser = subparsers.add_parser("delete", help="Delete a test")
delete_parser.add_argument("test_id", help="Test ID")
# Add result
add_parser = subparsers.add_parser("add-result", help="Add a test result")
add_parser.add_argument("test_id", help="Test ID")
add_parser.add_argument("--session-type", required=True, choices=["cold", "warm"])
add_parser.add_argument("--session-id", required=True, help="Session ID")
add_parser.add_argument("--tool-calls", type=int, default=0)
add_parser.add_argument("--successful-calls", type=int, default=0)
add_parser.add_argument("--completion-time", type=float, default=0.0)
add_parser.add_argument("--success", action="store_true")
add_parser.add_argument("--notes", default="")
parsed = parser.parse_args(args)
if not parsed.command:
parser.print_help()
return 1
manager = ABTestManager()
runner = ABTestRunner(manager)
if parsed.command == "create":
task = TestTask(
task_id=parsed.task_id,
description=parsed.description,
prompt=parsed.prompt,
category=parsed.category,
difficulty=parsed.difficulty
)
test = manager.create_test(task)
print(f"Created test: {test.test_id}")
print(f"Task: {task.description}")
return 0
elif parsed.command == "list":
tests = manager.list_tests()
if not tests:
print("No tests found.")
return 0
print("\n=== A/B Tests ===\n")
for t in tests:
print(f"ID: {t['test_id']}")
print(f" Task: {t['description']}")
print(f" Cold tests: {t['cold_count']}, Warm tests: {t['warm_count']}")
print(f" Created: {t['created_at']}")
print()
return 0
elif parsed.command == "show":
test = manager.load_test(parsed.test_id)
if not test:
print(f"Test {parsed.test_id} not found")
return 1
print(f"\n=== Test: {test.test_id} ===\n")
print(f"Task: {test.task.description}")
print(f"Prompt: {test.task.prompt}")
print(f"Category: {test.task.category}, Difficulty: {test.task.difficulty}")
print(f"\nCold sessions: {len(test.cold_results)}")
for r in test.cold_results:
print(f" {r.session_id}: {r.success_rate:.0%} success, {r.error_rate:.0%} errors")
print(f"\nWarm sessions: {len(test.warm_results)}")
for r in test.warm_results:
print(f" {r.session_id}: {r.success_rate:.0%} success, {r.error_rate:.0%} errors")
return 0
elif parsed.command == "analyze":
analysis = runner.analyze_results(parsed.test_id)
if "error" in analysis:
print(f"Error: {analysis['error']}")
return 1
print(f"\n=== Analysis: {parsed.test_id} ===\n")
cold = analysis.get("cold", {})
warm = analysis.get("warm", {})
print("Cold Sessions:")
print(f" Count: {cold.get('count', 0)}")
print(f" Avg error rate: {cold.get('avg_error_rate', 0):.1%}")
print(f" Avg success rate: {cold.get('avg_success_rate', 0):.1%}")
print(f" Avg completion time: {cold.get('avg_completion_time', 0):.1f}s")
print("\nWarm Sessions:")
print(f" Count: {warm.get('count', 0)}")
print(f" Avg error rate: {warm.get('avg_error_rate', 0):.1%}")
print(f" Avg success rate: {warm.get('avg_success_rate', 0):.1%}")
print(f" Avg completion time: {warm.get('avg_completion_time', 0):.1f}s")
improvement = analysis.get("improvement", {})
if improvement:
print("\nImprovement:")
if "error_rate" in improvement:
print(f" Error rate: {improvement['error_rate']:+.1%}")
if "success_rate" in improvement:
print(f" Success rate: {improvement['success_rate']:+.1%}")
print(f"\nRecommendation: {analysis.get('recommendation', 'N/A')}")
return 0
elif parsed.command == "delete":
if manager.delete_test(parsed.test_id):
print(f"Deleted test: {parsed.test_id}")
return 0
else:
print(f"Test {parsed.test_id} not found")
return 1
elif parsed.command == "add-result":
test = manager.load_test(parsed.test_id)
if not test:
print(f"Test {parsed.test_id} not found")
return 1
result = SessionResult(
session_id=parsed.session_id,
session_type=SessionType(parsed.session_type),
task_id=test.task.task_id,
start_time=datetime.now().isoformat(),
end_time=datetime.now().isoformat(),
tool_calls=parsed.tool_calls,
successful_tool_calls=parsed.successful_calls,
completion_time_seconds=parsed.completion_time,
success=parsed.success,
notes=parsed.notes
)
test.add_result(result)
manager.save_test(test)
print(f"Added {parsed.session_type} result to test {parsed.test_id}")
print(f" Session: {parsed.session_id}")
print(f" Success rate: {result.success_rate:.0%}")
return 0
return 1
if __name__ == "__main__":
import sys
sys.exit(ab_test_cli(sys.argv[1:]))

485
tools/warm_session.py Normal file
View File

@@ -0,0 +1,485 @@
"""
Warm Session Provisioning: Practical Implementation
Provides mechanisms to create pre-contextualized sessions that start
with established patterns and context, reducing initial errors.
Issue: #327
"""
import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional
from dataclasses import dataclass, asdict, field
logger = logging.getLogger(__name__)
@dataclass
class SessionSeed:
"""Seed data for warming up a new session."""
system_context: str = ""
tool_examples: List[Dict[str, Any]] = field(default_factory=list)
user_patterns: Dict[str, Any] = field(default_factory=dict)
context_markers: List[str] = field(default_factory=list)
def to_dict(self) -> Dict[str, Any]:
return asdict(self)
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> 'SessionSeed':
return cls(**data)
@dataclass
class WarmTemplate:
"""Template for creating warm sessions."""
template_id: str
name: str
description: str
seed: SessionSeed
created_at: str
source_session_id: Optional[str] = None
usage_count: int = 0
success_rate: float = 0.0
def to_dict(self) -> Dict[str, Any]:
return {
"template_id": self.template_id,
"name": self.name,
"description": self.description,
"seed": self.seed.to_dict(),
"created_at": self.created_at,
"source_session_id": self.source_session_id,
"usage_count": self.usage_count,
"success_rate": self.success_rate
}
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> 'WarmTemplate':
seed = SessionSeed.from_dict(data.get("seed", {}))
return cls(
template_id=data["template_id"],
name=data["name"],
description=data["description"],
seed=seed,
created_at=data.get("created_at", datetime.now().isoformat()),
source_session_id=data.get("source_session_id"),
usage_count=data.get("usage_count", 0),
success_rate=data.get("success_rate", 0.0)
)
class SessionExtractor:
"""Extract seed data from existing sessions."""
def __init__(self, session_db=None):
self.session_db = session_db
def extract_seed(self, session_id: str) -> Optional[SessionSeed]:
"""Extract seed data from a session."""
if not self.session_db:
return None
try:
messages = self.session_db.get_messages(session_id)
if not messages:
return None
# Extract system context
system_context = self._extract_system_context(messages)
# Extract successful tool examples
tool_examples = self._extract_tool_examples(messages)
# Extract user patterns
user_patterns = self._extract_user_patterns(messages)
# Extract context markers
context_markers = self._extract_context_markers(messages)
return SessionSeed(
system_context=system_context,
tool_examples=tool_examples,
user_patterns=user_patterns,
context_markers=context_markers
)
except Exception as e:
logger.error(f"Failed to extract seed: {e}")
return None
def _extract_system_context(self, messages: List[Dict]) -> str:
"""Extract useful system context from messages."""
context_parts = []
# Look for system messages
for msg in messages:
if msg.get("role") == "system":
content = msg.get("content", "")
# Take first 500 chars of system context
if content:
context_parts.append(content[:500])
break
# Extract key user instructions
user_instructions = []
for msg in messages[:10]: # First 10 messages
if msg.get("role") == "user":
content = msg.get("content", "")
if len(content) > 50 and "?" not in content[:20]: # Likely instructions
user_instructions.append(content[:200])
if len(user_instructions) >= 3:
break
if user_instructions:
context_parts.append("\nKey instructions from session:\n" + "\n".join(f"- {i}" for i in user_instructions))
return "\n".join(context_parts)[:1000]
def _extract_tool_examples(self, messages: List[Dict]) -> List[Dict[str, Any]]:
"""Extract successful tool call examples."""
examples = []
for i, msg in enumerate(messages):
if msg.get("role") == "assistant" and msg.get("tool_calls"):
# Check if there's a successful result
for j in range(i + 1, min(i + 3, len(messages))):
if messages[j].get("role") == "tool":
content = messages[j].get("content", "")
# Check for success indicators
if content and "error" not in content.lower()[:100]:
for tool_call in msg["tool_calls"]:
func = tool_call.get("function", {})
examples.append({
"tool": func.get("name"),
"arguments": func.get("arguments", "{}"),
"result_preview": content[:200]
})
if len(examples) >= 5:
break
break
if len(examples) >= 5:
break
return examples
def _extract_user_patterns(self, messages: List[Dict]) -> Dict[str, Any]:
"""Extract user interaction patterns."""
user_messages = [m for m in messages if m.get("role") == "user"]
if not user_messages:
return {}
# Calculate patterns
lengths = [len(m.get("content", "")) for m in user_messages]
avg_length = sum(lengths) / len(lengths)
# Count question types
questions = sum(1 for m in user_messages if "?" in m.get("content", ""))
commands = sum(1 for m in user_messages if m.get("content", "").startswith(("/", "!")))
return {
"message_count": len(user_messages),
"avg_length": avg_length,
"question_ratio": questions / len(user_messages),
"command_ratio": commands / len(user_messages),
"preferred_style": "command" if commands > questions else "conversational"
}
def _extract_context_markers(self, messages: List[Dict]) -> List[str]:
"""Extract important context markers."""
markers = set()
for msg in messages:
content = msg.get("content", "")
# File paths
import re
paths = re.findall(r'[\w/\.]+\.[\w]+', content)
markers.update(p for p in paths if len(p) < 50)
# URLs
urls = re.findall(r'https?://[^\s]+', content)
markers.update(u[:80] for u in urls[:3])
if len(markers) > 20:
break
return list(markers)[:20]
class WarmSessionManager:
"""Manage warm session templates."""
def __init__(self, template_dir: Path = None):
self.template_dir = template_dir or Path.home() / ".hermes" / "warm_templates"
self.template_dir.mkdir(parents=True, exist_ok=True)
def save_template(self, template: WarmTemplate) -> Path:
"""Save a warm template."""
path = self.template_dir / f"{template.template_id}.json"
with open(path, 'w') as f:
json.dump(template.to_dict(), f, indent=2)
return path
def load_template(self, template_id: str) -> Optional[WarmTemplate]:
"""Load a warm template."""
path = self.template_dir / f"{template_id}.json"
if not path.exists():
return None
try:
with open(path, 'r') as f:
data = json.load(f)
return WarmTemplate.from_dict(data)
except Exception as e:
logger.error(f"Failed to load template: {e}")
return None
def list_templates(self) -> List[Dict[str, Any]]:
"""List all templates."""
templates = []
for path in self.template_dir.glob("*.json"):
try:
with open(path, 'r') as f:
data = json.load(f)
templates.append({
"template_id": data.get("template_id"),
"name": data.get("name"),
"description": data.get("description"),
"usage_count": data.get("usage_count", 0),
"success_rate": data.get("success_rate", 0.0)
})
except:
pass
return templates
def delete_template(self, template_id: str) -> bool:
"""Delete a template."""
path = self.template_dir / f"{template_id}.json"
if path.exists():
path.unlink()
return True
return False
class WarmSessionBootstrapper:
"""Bootstrap warm sessions from templates."""
def __init__(self, manager: WarmSessionManager = None):
self.manager = manager or WarmSessionManager()
def prepare_messages(
self,
template: WarmTemplate,
user_message: str,
include_examples: bool = True
) -> List[Dict[str, Any]]:
"""Prepare messages for a warm session."""
messages = []
# Add warm context as system message
warm_context = self._build_warm_context(template.seed)
if warm_context:
messages.append({
"role": "system",
"content": warm_context
})
# Add tool examples if requested
if include_examples and template.seed.tool_examples:
example_messages = self._create_example_messages(template.seed.tool_examples)
messages.extend(example_messages)
# Add the actual user message
messages.append({
"role": "user",
"content": user_message
})
return messages
def _build_warm_context(self, seed: SessionSeed) -> str:
"""Build warm context from seed."""
parts = []
if seed.system_context:
parts.append(seed.system_context)
if seed.context_markers:
parts.append("\nKnown context: " + ", ".join(seed.context_markers[:10]))
if seed.user_patterns:
style = seed.user_patterns.get("preferred_style", "balanced")
parts.append(f"\nUser prefers {style} interactions.")
return "\n".join(parts)[:1500]
def _create_example_messages(self, examples: List[Dict]) -> List[Dict]:
"""Create example messages from tool examples."""
messages = []
for i, ex in enumerate(examples[:3]): # Limit to 3 examples
# User request
messages.append({
"role": "user",
"content": f"[Example {i+1}] Use {ex['tool']}"
})
# Assistant with tool call
messages.append({
"role": "assistant",
"content": f"I'll use {ex['tool']}.",
"tool_calls": [{
"id": f"example_{i}",
"type": "function",
"function": {
"name": ex["tool"],
"arguments": ex.get("arguments", "{}")
}
}]
})
# Tool result
messages.append({
"role": "tool",
"tool_call_id": f"example_{i}",
"content": ex.get("result_preview", "Success")
})
return messages
# CLI Functions
def warm_session_cli(args: List[str]) -> int:
"""CLI interface for warm session management."""
import argparse
parser = argparse.ArgumentParser(description="Warm session provisioning")
subparsers = parser.add_subparsers(dest="command")
# Extract command
extract_parser = subparsers.add_parser("extract", help="Extract template from session")
extract_parser.add_argument("session_id", help="Session ID to extract from")
extract_parser.add_argument("--name", "-n", required=True, help="Template name")
extract_parser.add_argument("--description", "-d", default="", help="Template description")
# List command
subparsers.add_parser("list", help="List available templates")
# Test command
test_parser = subparsers.add_parser("test", help="Test warm session creation")
test_parser.add_argument("template_id", help="Template ID")
test_parser.add_argument("message", help="Test message")
# Delete command
delete_parser = subparsers.add_parser("delete", help="Delete a template")
delete_parser.add_argument("template_id", help="Template ID to delete")
parsed = parser.parse_args(args)
if not parsed.command:
parser.print_help()
return 1
manager = WarmSessionManager()
if parsed.command == "extract":
try:
from hermes_state import SessionDB
session_db = SessionDB()
except ImportError:
print("Error: Cannot import SessionDB")
return 1
extractor = SessionExtractor(session_db)
seed = extractor.extract_seed(parsed.session_id)
if not seed:
print(f"Failed to extract seed from session {parsed.session_id}")
return 1
template = WarmTemplate(
template_id=f"warm_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
name=parsed.name,
description=parsed.description,
seed=seed,
created_at=datetime.now().isoformat(),
source_session_id=parsed.session_id
)
path = manager.save_template(template)
print(f"Created template: {template.template_id}")
print(f"Saved to: {path}")
print(f"Tool examples: {len(seed.tool_examples)}")
print(f"Context markers: {len(seed.context_markers)}")
return 0
elif parsed.command == "list":
templates = manager.list_templates()
if not templates:
print("No templates found.")
return 0
print("\n=== Warm Session Templates ===\n")
for t in templates:
print(f"ID: {t['template_id']}")
print(f" Name: {t['name']}")
print(f" Description: {t['description']}")
print(f" Usage: {t['usage_count']} times, {t['success_rate']:.0%} success")
print()
return 0
elif parsed.command == "test":
template = manager.load_template(parsed.template_id)
if not template:
print(f"Template {parsed.template_id} not found")
return 1
bootstrapper = WarmSessionBootstrapper(manager)
messages = bootstrapper.prepare_messages(template, parsed.message)
print(f"\n=== Warm Session Test: {template.name} ===\n")
print(f"Generated {len(messages)} messages:\n")
for i, msg in enumerate(messages):
role = msg.get("role", "unknown")
content = msg.get("content", "")
if role == "system":
print(f"[System Context] ({len(content)} chars)")
print(content[:200] + "..." if len(content) > 200 else content)
elif role == "user":
print(f"\n[User]: {content}")
elif role == "assistant":
print(f"[Assistant]: {content}")
if msg.get("tool_calls"):
for tc in msg["tool_calls"]:
func = tc.get("function", {})
print(f" -> {func.get('name')}({func.get('arguments', '{}')[:50]})")
elif role == "tool":
print(f" [Result]: {content[:100]}...")
return 0
elif parsed.command == "delete":
if manager.delete_template(parsed.template_id):
print(f"Deleted template: {parsed.template_id}")
return 0
else:
print(f"Template {parsed.template_id} not found")
return 1
return 1
if __name__ == "__main__":
import sys
sys.exit(warm_session_cli(sys.argv[1:]))