Compare commits


2 Commits

Author SHA1 Message Date
a6f3ae34a3 docs(templates): Add example for session templates
Add example script demonstrating session template usage:
1. Listing existing templates
2. Getting templates by task type
3. Injecting templates into messages
4. Usage tracking

Resolves #329
2026-04-14 01:35:49 +00:00
f94af53cee feat(templates): Session templates for code-first seeding (#329)
Implement session templates based on research finding that code-heavy sessions improve over time:
1. Task type classification (code, file, research, mixed)
2. Template extraction from successful sessions
3. Template storage in ~/.hermes/session-templates/
4. Template injection into new sessions
5. CLI interface for template management

Resolves #329
2026-04-14 01:35:02 +00:00
6 changed files with 473 additions and 1181 deletions


@@ -1,113 +0,0 @@
# Warm Session Provisioning: Revised Hypothesis
**Research Document v2.0**
**Issue:** #327
**Date:** April 2026
**Status:** Revised Based on Empirical Data
## Executive Summary
Initial hypothesis: Marathon sessions (100+ messages) have lower error rates, suggesting agents improve with experience. This was **partially incorrect**.
**Actual finding:** Error rates INCREASE within marathon sessions (avg first-half: 26.8%, second-half: 32.7%). Sessions don't improve - they degrade.
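The headline split can be reproduced from raw session logs with a simple half-split over tool results. A minimal sketch, assuming messages are dicts with `role`/`content` keys and that a tool result counts as an error when its content mentions "error" or "failed" (both are assumptions, not the audit's actual logic):

```python
def half_split_error_rates(messages):
    """Return (first_half_rate, second_half_rate) of tool-result errors."""
    def is_error(msg):
        # Stand-in predicate; the real audit's error detection may differ.
        content = (msg.get("content") or "").lower()
        return "error" in content or "failed" in content

    tool_msgs = [m for m in messages if m.get("role") == "tool"]
    if len(tool_msgs) < 2:
        return 0.0, 0.0
    mid = len(tool_msgs) // 2
    first, second = tool_msgs[:mid], tool_msgs[mid:]
    return (sum(map(is_error, first)) / len(first),
            sum(map(is_error, second)) / len(second))
```

Averaging these pairs across sessions gives per-session degradation figures of the kind quoted above.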
## Corrected Understanding
### What the Data Actually Shows
1. **Error rates increase over time** within sessions
2. **Marathon sessions appear more reliable** in aggregate because:
- Only well-guided sessions survive to 100+ messages
- Users who correct errors keep sessions alive
- Selection bias: failed sessions end early
3. **User guidance drives success**, not agent adaptation
### Revised Hypothesis
The "proficiency" observed in marathon sessions comes from:
- **User expertise**: Users who know how to guide the agent
- **Established context**: Shared reference points reduce ambiguity
- **Error correction patterns**: Users develop strategies to fix agent mistakes
- **Session survivorship**: Only well-managed sessions reach marathon length
## New Research Direction
### 1. User Guidance Patterns
Instead of agent proficiency, study user strategies:
- How do expert users phrase requests?
- What correction patterns work best?
- How do users establish context?
### 2. Context Window Management
Long sessions may suffer from context degradation:
- Attention dilution over many messages
- Lost context from early messages
- Compression artifacts
### 3. Warm Session v2: User-Guided Templates
Instead of pre-seeding agent patterns, pre-seed user guidance:
- Effective prompt templates
- Error correction strategies
- Context establishment patterns
## Implementation Plan
### Phase 1: User Pattern Analysis
- Analyze successful user strategies
- Extract effective prompt patterns
- Identify error correction techniques
### Phase 2: Guidance Templates
- Create user-facing templates
- Document effective patterns
- Provide prompt engineering guidance
### Phase 3: Context Management
- Optimize context window usage
- Implement smart context refresh
- Prevent attention degradation
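One way to implement the "smart context refresh" above is to keep all system messages and as many recent messages as fit a token budget. A sketch under stated assumptions: the 4-characters-per-token estimate, the per-message overhead, and the default budget are illustrative values, not measured ones.

```python
def refresh_context(messages, token_budget=8000):
    """Trim a message list to a token budget, preserving system messages."""
    def est_tokens(msg):
        # Rough heuristic: ~4 characters per token, plus per-message overhead.
        return len(msg.get("content") or "") // 4 + 4

    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    budget = token_budget - sum(est_tokens(m) for m in system)
    kept = []
    for msg in reversed(rest):  # walk newest to oldest
        cost = est_tokens(msg)
        if cost > budget:
            break
        budget -= cost
        kept.append(msg)
    return system + kept[::-1]  # restore chronological order
```

Dropping oldest-first preserves the recent shared reference points that the revised hypothesis credits for marathon-session success.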
### Phase 4: A/B Testing
- Test guided vs unguided sessions
- Measure error reduction from user guidance
- Statistical validation
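For the statistical-validation step, a two-proportion z-test is one reasonable choice for comparing error rates between the guided and unguided arms. A stdlib-only sketch; the counts in the test below are hypothetical, chosen to mirror the 26.8% vs 32.7% split reported above:

```python
import math

def two_proportion_z(errors_a, n_a, errors_b, n_b):
    """Two-sided two-proportion z-test on error counts from two arms."""
    p_a, p_b = errors_a / n_a, errors_b / n_b
    pooled = (errors_a + errors_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value
```

With 268/1000 vs 327/1000 errors this yields z ≈ -2.89, p ≈ 0.004, i.e. a difference of that size would clear a 0.05 threshold comfortably at n = 1000 per arm.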
## Key Metrics
1. **Error Rate by Position**
- First 10 messages: baseline
- Messages 10-50: degradation rate
- Messages 50+: long-session behavior
2. **User Intervention Rate**
- How often users correct errors
- Success rate of corrections
- Patterns in effective corrections
3. **Context Window Utilization**
- Token usage over time
- Information retention rate
- Compression effectiveness
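Metric 1 ("Error Rate by Position") can be bucketed directly from a message list. A sketch using the same assumed schema and stand-in error predicate as elsewhere in this document (both assumptions):

```python
def error_rate_by_position(messages):
    """Bucket tool-result errors into the position ranges of Metric 1."""
    buckets = {"first_10": [], "msg_10_50": [], "msg_50_plus": []}
    for i, msg in enumerate(messages):
        if msg.get("role") != "tool":
            continue
        content = (msg.get("content") or "").lower()
        is_err = "error" in content or "failed" in content
        if i < 10:
            buckets["first_10"].append(is_err)
        elif i < 50:
            buckets["msg_10_50"].append(is_err)
        else:
            buckets["msg_50_plus"].append(is_err)
    # None marks an empty bucket (e.g. a session shorter than 50 messages).
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}
```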
## Paper Contributions (Revised)
1. **Counterintuitive finding**: Longer sessions have HIGHER error rates
2. **Selection bias**: Marathon sessions represent survivorship bias
3. **User expertise matters more than agent adaptation**
4. **Context degradation over long sessions**
## Next Steps
1. ✅ Correct initial hypothesis
2. ⏳ Analyze user guidance patterns
3. ⏳ Extract effective prompt strategies
4. ⏳ Create user-facing guidance templates
5. ⏳ Optimize context window management
6. ⏳ Run A/B tests on guided sessions
7. ⏳ Write paper with corrected findings
## References
- Empirical Audit 2026-04-12, Finding 4
- Follow-up Analysis: Comment on #327 (2026-04-13)
- Issue #327 (original hypothesis)


@@ -0,0 +1,89 @@
#!/usr/bin/env python3
"""
Example: Using session templates for code-first seeding.
This script demonstrates how to use the session template system
to pre-seed new sessions with successful tool call patterns.
"""
import sys
from pathlib import Path
# Add the parent directory to the path
sys.path.insert(0, str(Path(__file__).parent.parent))
from tools.session_templates import SessionTemplates, TaskType
def main():
"""Demonstrate session template usage."""
# Create template manager
templates = SessionTemplates()
print("Session Templates Example")
print("=" * 50)
# List existing templates
print("\n1. Existing templates:")
template_list = templates.list_templates()
if template_list:
for t in template_list:
print(f" - {t.name}: {t.task_type.value} ({len(t.examples)} examples)")
else:
print(" No templates found")
# Example: Create a template from a session
print("\n2. Creating a template from a session:")
print(" (This would normally use a real session ID)")
# Example: Get a template for code tasks
print("\n3. Getting a template for CODE tasks:")
code_template = templates.get_template(TaskType.CODE)
if code_template:
print(f" Found template: {code_template.name}")
print(f" Type: {code_template.task_type.value}")
print(f" Examples: {len(code_template.examples)}")
# Show first example
if code_template.examples:
example = code_template.examples[0]
print(f" First example: {example.tool_name}")
print(f" Arguments: {example.arguments}")
print(f" Result preview: {example.result[:100]}...")
else:
print(" No CODE template found")
# Example: Inject template into messages
print("\n4. Injecting template into messages:")
if code_template:
# Create sample messages
messages = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Help me write some code"}
]
# Inject template
updated_messages = templates.inject_into_messages(code_template, messages)
print(f" Original messages: {len(messages)}")
print(f" Updated messages: {len(updated_messages)}")
print(f" Template usage count: {code_template.usage_count}")
# Show the injection
print("\n Injected messages:")
for i, msg in enumerate(updated_messages[:6]): # Show first 6
role = msg.get('role', 'unknown')
content = msg.get('content', '')
if content:
content_preview = content[:50] + "..." if len(content) > 50 else content
print(f" {i}: {role} - {content_preview}")
else:
print(f" {i}: {role} - (tool call)")
print("\n" + "=" * 50)
print("Example complete!")
if __name__ == "__main__":
main()


@@ -5258,34 +5258,6 @@ For more help on a command:
sessions_parser.set_defaults(func=cmd_sessions)
# User guidance command (research #327 revised)
guidance_parser = subparsers.add_parser(
"guidance",
help="User guidance pattern analysis (research)",
description="Analyze effective user strategies for agent sessions"
)
guidance_subparsers = guidance_parser.add_subparsers(dest="guidance_command")
# Guidance analyze command
guidance_analyze = guidance_subparsers.add_parser("analyze", help="Analyze user guidance in a session")
guidance_analyze.add_argument("session_id", help="Session ID to analyze")
# Guidance create-template command
guidance_create = guidance_subparsers.add_parser("create-template", help="Create guidance template from sessions")
guidance_create.add_argument("session_ids", nargs="+", help="Session IDs to analyze")
guidance_create.add_argument("--name", "-n", help="Template name")
# Guidance list-templates command
guidance_subparsers.add_parser("list-templates", help="List available guidance templates")
# Guidance generate-guide command
guidance_guide = guidance_subparsers.add_parser("generate-guide", help="Generate user guide from template")
guidance_guide.add_argument("profile_id", help="Profile ID to generate guide from")
guidance_parser.set_defaults(func=cmd_guidance)
# =========================================================================
# insights command
# =========================================================================
@@ -5626,48 +5598,3 @@ Examples:
if __name__ == "__main__":
main()
def cmd_guidance(args):
"""Handle user guidance pattern analysis commands."""
from hermes_cli.colors import Colors, color
subcmd = getattr(args, 'guidance_command', None)
if subcmd is None:
print(color("User Guidance Pattern Analysis (Research #327 Revised)", Colors.CYAN))
print("\nCommands:")
print(" hermes guidance analyze SESSION_ID - Analyze user guidance patterns")
print(" hermes guidance create-template SESSION_IDS - Create guidance template")
print(" hermes guidance list-templates - List available templates")
print(" hermes guidance generate-guide PROFILE_ID - Generate user guide")
print("\nNote: Research shows user guidance matters more than agent experience.")
return 0
# Import user guidance module
try:
from tools.user_guidance import guidance_command
# Convert args to list for the module
args_list = []
if subcmd == "analyze":
args_list = ["analyze", args.session_id]
elif subcmd == "create-template":
args_list = ["create-template"] + args.session_ids
if hasattr(args, 'name') and args.name:
args_list.extend(["--name", args.name])
elif subcmd == "list-templates":
args_list = ["list-templates"]
elif subcmd == "generate-guide":
args_list = ["generate-guide", args.profile_id]
return guidance_command(args_list)
except ImportError as e:
print(color(f"Error: Cannot import user_guidance module: {e}", Colors.RED))
print("Make sure tools/user_guidance.py exists")
return 1
except Exception as e:
print(color(f"Error: {e}", Colors.RED))
return 1


@@ -1,229 +0,0 @@
#!/usr/bin/env python3
"""
Test script for user guidance pattern analysis.
This script tests the revised approach for issue #327,
focusing on user guidance patterns rather than agent proficiency.
Issue: #327 (Revised hypothesis)
"""
import sys
import os
from pathlib import Path
# Add the tools directory to path
sys.path.insert(0, str(Path(__file__).parent.parent))
def test_user_guidance_analysis():
"""Test user guidance analysis functionality."""
print("=== Testing User Guidance Analysis ===\n")
try:
from tools.user_guidance import UserGuidanceAnalyzer
from hermes_state import SessionDB
session_db = SessionDB()
analyzer = UserGuidanceAnalyzer(session_db)
# Get a session to analyze
# NOTE: reaches the underlying DB handle via a bound method; the query is read-only despite going through execute_write
sessions = session_db.get_messages.__self__.execute_write(
"SELECT id FROM sessions ORDER BY started_at DESC LIMIT 1"
)
if not sessions:
print("No sessions found in database.")
return False
session_id = sessions[0][0]
print(f"Analyzing session: {session_id}\n")
analysis = analyzer.analyze_user_guidance(session_id)
if "error" in analysis:
print(f"Analysis error: {analysis['error']}")
return False
print(f"Message count: {analysis['message_count']}")
print("\nPrompt Patterns:")
for p in analysis.get("prompt_patterns", [])[:3]:
print(f"  {p['type']}: {'✓' if p.get('success') else '✗'} ({p['length']} chars)")
print("\nCorrection Patterns:")
for c in analysis.get("correction_patterns", [])[:2]:
print(f" {c['error_content'][:50]}... -> {c['user_correction'][:50]}...")
print("\nSuccess Metrics:")
metrics = analysis.get("success_metrics", {})
print(f" Tool calls: {metrics.get('tool_calls', 0)}")
print(f" Success rate: {metrics.get('success_rate', 0):.0%}")
print(f" User corrections: {metrics.get('user_corrections', 0)}")
return True
except Exception as e:
print(f"Test failed: {e}")
return False
def test_guidance_template_creation():
"""Test guidance template creation."""
print("\n=== Testing Guidance Template Creation ===\n")
try:
from tools.user_guidance import UserGuidanceAnalyzer, GuidanceTemplateGenerator
from hermes_state import SessionDB
session_db = SessionDB()
analyzer = UserGuidanceAnalyzer(session_db)
generator = GuidanceTemplateGenerator(analyzer)
# Get sessions
# NOTE: reaches the underlying DB handle via a bound method; the query is read-only despite going through execute_write
sessions = session_db.get_messages.__self__.execute_write(
"SELECT id FROM sessions ORDER BY started_at DESC LIMIT 3"
)
if not sessions:
print("No sessions found.")
return False
session_ids = [s[0] for s in sessions]
print(f"Creating template from {len(session_ids)} sessions\n")
profile = generator.create_guidance_template(
session_ids,
name="Test Guidance Template"
)
print(f"Profile ID: {profile.profile_id}")
print(f"Name: {profile.name}")
print(f"Prompt patterns: {len(profile.prompt_patterns)}")
print(f"Correction patterns: {len(profile.correction_patterns)}")
# Save the template
from tools.user_guidance import GuidanceTemplateManager
manager = GuidanceTemplateManager()
path = manager.save_template(profile)
print(f"Saved to: {path}")
return True
except Exception as e:
print(f"Test failed: {e}")
return False
def test_user_guide_generation():
"""Test user guide generation."""
print("\n=== Testing User Guide Generation ===\n")
try:
from tools.user_guidance import UserGuidanceProfile, PromptPattern, CorrectionPattern, ContextStrategy, generate_user_guide
# Create a test profile
profile = UserGuidanceProfile(
profile_id="test_guidance_001",
name="Test User Guidance",
description="Test profile for guide generation",
prompt_patterns=[
PromptPattern(
pattern_type="polite_request",
template="Please [action] [details]",
success_rate=0.85,
usage_count=15,
context_requirements=[]
),
PromptPattern(
pattern_type="question",
template="How do I [action]?",
success_rate=0.75,
usage_count=20,
context_requirements=[]
)
],
correction_patterns=[
CorrectionPattern(
error_type="file_not_found",
correction_strategy="direct",
effectiveness=0.90,
common_phrases=["Use the correct path: [path]", "The file is at [location]"]
),
CorrectionPattern(
error_type="command_not_found",
correction_strategy="example",
effectiveness=0.80,
common_phrases=["Try: [command]", "Use [alternative] instead"]
)
],
context_strategies=[
ContextStrategy(
strategy_type="file_reference",
description="Reference specific files",
effectiveness=0.85,
token_cost=10
),
ContextStrategy(
strategy_type="code_example",
description="Provide code examples",
effectiveness=0.90,
token_cost=50
)
],
created_at="2026-04-13T00:00:00",
source_analysis="Test sessions"
)
guide = generate_user_guide(profile)
print("Generated User Guide:")
print("=" * 50)
print(guide[:1000] + "..." if len(guide) > 1000 else guide)
return True
except Exception as e:
print(f"Test failed: {e}")
return False
def main():
"""Run all tests."""
print("User Guidance Pattern Analysis Test Suite")
print("=" * 50)
tests = [
("User Guidance Analysis", test_user_guidance_analysis),
("Guidance Template Creation", test_guidance_template_creation),
("User Guide Generation", test_user_guide_generation)
]
results = []
for name, test_func in tests:
print(f"\nRunning: {name}")
try:
result = test_func()
results.append((name, result))
print(f"Result: {'PASS' if result else 'FAIL'}")
except Exception as e:
print(f"Error: {e}")
results.append((name, False))
print("\n" + "=" * 50)
print("Test Results:")
passed = sum(1 for _, result in results if result)
total = len(results)
for name, result in results:
status = "✓ PASS" if result else "✗ FAIL"
print(f" {status}: {name}")
print(f"\nPassed: {passed}/{total}")
return 0 if passed == total else 1
if __name__ == "__main__":
sys.exit(main())

tools/session_templates.py (new file, 384 lines)

@@ -0,0 +1,384 @@
"""
Session templates for code-first seeding.
Based on research finding: Code-heavy sessions (execute_code dominant in first 30 turns)
improve over time. File-heavy sessions degrade. The key is deterministic feedback loops.
This module provides:
1. Template extraction from successful sessions
2. Task type classification (code, file, research)
3. Template storage in ~/.hermes/session-templates/
4. Template injection into new sessions
"""
import json
import logging
import os
import sqlite3
import time
from pathlib import Path
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, asdict
from enum import Enum
logger = logging.getLogger(__name__)
# Default template directory
DEFAULT_TEMPLATE_DIR = Path.home() / ".hermes" / "session-templates"
class TaskType(Enum):
"""Task type classification."""
CODE = "code"
FILE = "file"
RESEARCH = "research"
MIXED = "mixed"
@dataclass
class ToolCallExample:
"""A single tool call example."""
tool_name: str
arguments: Dict[str, Any]
result: str
success: bool
def to_dict(self) -> Dict[str, Any]:
return asdict(self)
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> 'ToolCallExample':
return cls(**data)
@dataclass
class SessionTemplate:
"""A session template with tool call examples."""
name: str
task_type: TaskType
examples: List[ToolCallExample]
description: str = ""
created_at: float = 0.0
usage_count: int = 0
def __post_init__(self):
if self.created_at == 0.0:
self.created_at = time.time()
def to_dict(self) -> Dict[str, Any]:
data = asdict(self)
data['task_type'] = self.task_type.value
return data
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> 'SessionTemplate':
data['task_type'] = TaskType(data['task_type'])
examples_data = data.get('examples', [])
data['examples'] = [ToolCallExample.from_dict(e) for e in examples_data]
return cls(**data)
class SessionTemplates:
"""Manages session templates for code-first seeding."""
def __init__(self, template_dir: Optional[Path] = None):
self.template_dir = template_dir or DEFAULT_TEMPLATE_DIR
self.template_dir.mkdir(parents=True, exist_ok=True)
self.templates: Dict[str, SessionTemplate] = {}
self._load_templates()
def _load_templates(self):
"""Load all templates from disk."""
for template_file in self.template_dir.glob("*.json"):
try:
with open(template_file, 'r') as f:
data = json.load(f)
template = SessionTemplate.from_dict(data)
self.templates[template.name] = template
except Exception as e:
logger.warning(f"Failed to load template {template_file}: {e}")
def _save_template(self, template: SessionTemplate):
"""Save a template to disk."""
template_file = self.template_dir / f"{template.name}.json"
with open(template_file, 'w') as f:
json.dump(template.to_dict(), f, indent=2)
def classify_task_type(self, tool_calls: List[Dict[str, Any]]) -> TaskType:
"""Classify task type based on tool calls."""
if not tool_calls:
return TaskType.MIXED
# Count tool types
code_tools = {'execute_code', 'code_execution'}
file_tools = {'read_file', 'write_file', 'patch', 'search_files'}
research_tools = {'web_search', 'web_fetch', 'browser_navigate'}
tool_names = [tc.get('tool_name', '') for tc in tool_calls]
code_count = sum(1 for t in tool_names if t in code_tools)
file_count = sum(1 for t in tool_names if t in file_tools)
research_count = sum(1 for t in tool_names if t in research_tools)
total = len(tool_calls)
if total == 0:
return TaskType.MIXED
# Determine dominant type (60% threshold)
if code_count / total > 0.6:
return TaskType.CODE
elif file_count / total > 0.6:
return TaskType.FILE
elif research_count / total > 0.6:
return TaskType.RESEARCH
else:
return TaskType.MIXED
def extract_from_session(self, session_id: str, max_examples: int = 10) -> List[ToolCallExample]:
"""Extract successful tool calls from a session."""
db_path = Path.home() / ".hermes" / "state.db"
if not db_path.exists():
return []
try:
conn = sqlite3.connect(str(db_path))
conn.row_factory = sqlite3.Row
# Get messages with tool calls
cursor = conn.execute("""
SELECT role, content, tool_calls, tool_name
FROM messages
WHERE session_id = ?
ORDER BY timestamp
LIMIT 100
""", (session_id,))
messages = cursor.fetchall()
conn.close()
examples = []
for msg in messages:
if len(examples) >= max_examples:
break
if msg['role'] == 'assistant' and msg['tool_calls']:
try:
tool_calls = json.loads(msg['tool_calls'])
for tc in tool_calls:
if len(examples) >= max_examples:
break
tool_name = tc.get('function', {}).get('name')
if not tool_name:
continue
try:
arguments = json.loads(tc.get('function', {}).get('arguments', '{}'))
except (json.JSONDecodeError, TypeError):
arguments = {}
examples.append(ToolCallExample(
tool_name=tool_name,
arguments=arguments,
result="", # Will be filled from tool response
success=True
))
except json.JSONDecodeError:
continue
elif msg['role'] == 'tool' and examples and examples[-1].result == "":
examples[-1].result = msg['content'] or ""
return examples
except Exception as e:
logger.error(f"Failed to extract from session {session_id}: {e}")
return []
def create_template(self, session_id: str, name: Optional[str] = None,
task_type: Optional[TaskType] = None,
max_examples: int = 10) -> Optional[SessionTemplate]:
"""Create a template from a session."""
examples = self.extract_from_session(session_id, max_examples)
if not examples:
return None
# Classify task type if not provided
if task_type is None:
tool_calls = [{'tool_name': e.tool_name} for e in examples]
task_type = self.classify_task_type(tool_calls)
# Generate name if not provided
if name is None:
name = f"{task_type.value}_{session_id[:8]}_{int(time.time())}"
# Create template
template = SessionTemplate(
name=name,
task_type=task_type,
examples=examples,
description=f"Template with {len(examples)} examples"
)
# Save template
self.templates[name] = template
self._save_template(template)
logger.info(f"Created template {name} with {len(examples)} examples")
return template
def get_template(self, task_type: TaskType) -> Optional[SessionTemplate]:
"""Get the best template for a task type."""
matching = [t for t in self.templates.values() if t.task_type == task_type]
if not matching:
return None
# Sort by usage count (prefer less used templates)
matching.sort(key=lambda t: t.usage_count)
return matching[0]
def inject_into_messages(self, template: SessionTemplate,
messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Inject template examples into messages."""
if not template.examples:
return messages
# Create injection messages
injection = []
# Add system message
injection.append({
"role": "system",
"content": f"Session template: {template.name} ({template.task_type.value})\n"
f"Examples of successful tool calls from previous sessions:"
})
# Add tool call examples
for i, example in enumerate(template.examples):
# Assistant message with tool call
injection.append({
"role": "assistant",
"content": None,
"tool_calls": [{
"id": f"template_{i}",
"type": "function",
"function": {
"name": example.tool_name,
"arguments": json.dumps(example.arguments)
}
}]
})
# Tool response
injection.append({
"role": "tool",
"tool_call_id": f"template_{i}",
"content": example.result
})
# Insert after system messages
insert_index = 0
for i, msg in enumerate(messages):
if msg.get("role") != "system":
break
insert_index = i + 1
# Insert injection
for i, msg in enumerate(injection):
messages.insert(insert_index + i, msg)
# Update usage count
template.usage_count += 1
self._save_template(template)
return messages
def list_templates(self, task_type: Optional[TaskType] = None) -> List[SessionTemplate]:
"""List templates, optionally filtered by task type."""
templates = list(self.templates.values())
if task_type:
templates = [t for t in templates if t.task_type == task_type]
templates.sort(key=lambda t: t.created_at, reverse=True)
return templates
def delete_template(self, name: str) -> bool:
"""Delete a template."""
if name not in self.templates:
return False
del self.templates[name]
template_file = self.template_dir / f"{name}.json"
if template_file.exists():
template_file.unlink()
logger.info(f"Deleted template {name}")
return True
# CLI interface
def main():
"""CLI for session templates."""
import argparse
parser = argparse.ArgumentParser(description="Session Templates")
subparsers = parser.add_subparsers(dest="command")
# List templates
list_parser = subparsers.add_parser("list", help="List templates")
list_parser.add_argument("--type", choices=["code", "file", "research", "mixed"])
# Create template
create_parser = subparsers.add_parser("create", help="Create template from session")
create_parser.add_argument("session_id", help="Session ID")
create_parser.add_argument("--name", help="Template name")
create_parser.add_argument("--type", choices=["code", "file", "research", "mixed"])
create_parser.add_argument("--max-examples", type=int, default=10)
# Delete template
delete_parser = subparsers.add_parser("delete", help="Delete template")
delete_parser.add_argument("name", help="Template name")
args = parser.parse_args()
templates = SessionTemplates()
if args.command == "list":
task_type = TaskType(args.type) if args.type else None
template_list = templates.list_templates(task_type)
if not template_list:
print("No templates found")
return
print(f"Found {len(template_list)} templates:")
for t in template_list:
print(f" {t.name}: {t.task_type.value} ({len(t.examples)} examples, used {t.usage_count} times)")
elif args.command == "create":
task_type = TaskType(args.type) if args.type else None
template = templates.create_template(
args.session_id,
name=args.name,
task_type=task_type,
max_examples=args.max_examples
)
if template:
print(f"Created template: {template.name}")
print(f" Type: {template.task_type.value}")
print(f" Examples: {len(template.examples)}")
else:
print("Failed to create template")
elif args.command == "delete":
if templates.delete_template(args.name):
print(f"Deleted template: {args.name}")
else:
print(f"Template not found: {args.name}")
else:
parser.print_help()
if __name__ == "__main__":
main()


@@ -1,766 +0,0 @@
"""
User Guidance Patterns for Effective Agent Sessions
This module analyzes user strategies that lead to successful agent sessions,
focusing on prompt patterns, error correction techniques, and context management.
Issue: #327 (Revised hypothesis)
"""
import json
import logging
from datetime import datetime
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
from dataclasses import dataclass, asdict
import re
logger = logging.getLogger(__name__)
@dataclass
class PromptPattern:
"""Effective prompt pattern."""
pattern_type: str # "instruction", "context", "constraint", "example"
template: str
success_rate: float
usage_count: int
context_requirements: Optional[List[str]] = None
def to_dict(self) -> Dict[str, Any]:
return asdict(self)
@dataclass
class CorrectionPattern:
"""User error correction pattern."""
error_type: str
correction_strategy: str # "direct", "example", "reframe", "constraint"
effectiveness: float # Success rate of this correction
common_phrases: List[str]
@dataclass
class ContextStrategy:
"""Context establishment strategy."""
strategy_type: str # "reference", "example", "constraint", "background"
description: str
effectiveness: float
token_cost: int # Approximate token usage
@dataclass
class UserGuidanceProfile:
"""Profile of effective user guidance strategies."""
profile_id: str
name: str
description: str
prompt_patterns: List[PromptPattern]
correction_patterns: List[CorrectionPattern]
context_strategies: List[ContextStrategy]
created_at: str
source_analysis: Optional[str] = None
version: str = "1.0"
def to_dict(self) -> Dict[str, Any]:
return {
"profile_id": self.profile_id,
"name": self.name,
"description": self.description,
"prompt_patterns": [p.to_dict() for p in self.prompt_patterns],
"correction_patterns": [asdict(c) for c in self.correction_patterns],
"context_strategies": [asdict(c) for c in self.context_strategies],
"created_at": self.created_at,
"source_analysis": self.source_analysis,
"version": self.version
}
@classmethod
def from_dict(cls, data: Dict[str, Any]) -> 'UserGuidanceProfile':
"""Create profile from dictionary."""
prompt_patterns = [
PromptPattern(**p) for p in data.get("prompt_patterns", [])
]
correction_patterns = [
CorrectionPattern(**c) for c in data.get("correction_patterns", [])
]
context_strategies = [
ContextStrategy(**c) for c in data.get("context_strategies", [])
]
return cls(
profile_id=data["profile_id"],
name=data["name"],
description=data["description"],
prompt_patterns=prompt_patterns,
correction_patterns=correction_patterns,
context_strategies=context_strategies,
created_at=data.get("created_at", datetime.now().isoformat()),
source_analysis=data.get("source_analysis"),
version=data.get("version", "1.0")
)
class UserGuidanceAnalyzer:
"""Analyze user guidance patterns in sessions."""
def __init__(self, session_db=None):
self.session_db = session_db
def analyze_user_guidance(self, session_id: str) -> Dict[str, Any]:
"""
Analyze user guidance patterns in a session.
Returns:
Dict with user guidance analysis including:
- prompt_patterns: Effective prompt structures
- correction_patterns: Error correction strategies
- context_strategies: How users establish context
- success_indicators: What makes guidance effective
"""
if not self.session_db:
return {"error": "No session database available"}
try:
messages = self.session_db.get_messages(session_id)
if not messages:
return {"error": "No messages found"}
analysis = {
"session_id": session_id,
"message_count": len(messages),
"user_messages": self._extract_user_messages(messages),
"prompt_patterns": self._analyze_prompt_patterns(messages),
"correction_patterns": self._analyze_corrections(messages),
"context_strategies": self._analyze_context_strategies(messages),
"success_metrics": self._calculate_success_metrics(messages)
}
return analysis
except Exception as e:
logger.error(f"User guidance analysis failed: {e}")
return {"error": str(e)}
def _extract_user_messages(self, messages: List[Dict]) -> List[Dict]:
"""Extract user messages with context."""
user_messages = []
for i, msg in enumerate(messages):
if msg.get("role") == "user":
# Get surrounding context
context_before = []
context_after = []
# Previous assistant message
if i > 0 and messages[i-1].get("role") == "assistant":
context_before.append(messages[i-1].get("content", "")[:200])
# Next assistant message
if i < len(messages) - 1 and messages[i+1].get("role") == "assistant":
context_after.append(messages[i+1].get("content", "")[:200])
user_messages.append({
"content": msg.get("content", ""),
"position": i,
"context_before": context_before,
"context_after": context_after
})
return user_messages
def _analyze_prompt_patterns(self, messages: List[Dict]) -> List[Dict[str, Any]]:
"""Analyze prompt patterns for effectiveness."""
patterns = []
user_messages = [m for m in messages if m.get("role") == "user"]
for msg in user_messages:
content = msg.get("content", "")
# Identify prompt types
if content.startswith(("Please", "Could you", "Can you")):
patterns.append({
"type": "polite_request",
"content": content,
"length": len(content),
"success": self._check_prompt_success(messages, msg)
})
elif "?" in content:
patterns.append({
"type": "question",
"content": content,
"length": len(content),
"success": self._check_prompt_success(messages, msg)
})
elif content.startswith(("/", "!")):
patterns.append({
"type": "command",
"content": content,
"length": len(content),
"success": self._check_prompt_success(messages, msg)
})
elif len(content) > 200:
patterns.append({
"type": "detailed_request",
"content": content,
"length": len(content),
"success": self._check_prompt_success(messages, msg)
})
return patterns
def _check_prompt_success(self, messages: List[Dict], user_msg: Dict) -> bool:
"""Check if a prompt led to successful execution."""
# Find the user message position
user_pos = None
for i, msg in enumerate(messages):
if msg == user_msg:
user_pos = i
break
if user_pos is None:
return False
# Check if there's a successful tool call after this message
for i in range(user_pos + 1, min(user_pos + 5, len(messages))):
msg = messages[i]
if msg.get("role") == "assistant" and msg.get("tool_calls"):
# Check if tool result indicates success
for j in range(i + 1, min(i + 3, len(messages))):
if messages[j].get("role") == "tool":
content = messages[j].get("content", "")
if "error" not in content.lower() and "failed" not in content.lower():
return True
return False
def _analyze_corrections(self, messages: List[Dict]) -> List[Dict[str, Any]]:
"""Analyze error correction patterns."""
corrections = []
# Look for error patterns followed by corrections
for i in range(len(messages) - 2):
msg1 = messages[i]
msg2 = messages[i + 1]
msg3 = messages[i + 2]
# Pattern: Assistant error -> User correction -> Assistant success
if (msg1.get("role") == "tool" and
("error" in msg1.get("content", "").lower() or "failed" in msg1.get("content", "").lower()) and
msg2.get("role") == "user" and
msg3.get("role") == "assistant"):
corrections.append({
"error_content": msg1.get("content", "")[:200],
"user_correction": msg2.get("content", ""),
"assistant_response": msg3.get("content", "")[:200],
"success": self._check_correction_success(messages, i + 2)
})
return corrections
def _check_correction_success(self, messages: List[Dict], assistant_pos: int) -> bool:
"""Check if a correction led to success."""
# Look for successful tool calls after correction
for i in range(assistant_pos + 1, min(assistant_pos + 3, len(messages))):
if messages[i].get("role") == "tool":
content = messages[i].get("content", "")
if "error" not in content.lower() and "failed" not in content.lower():
return True
return False
def _analyze_context_strategies(self, messages: List[Dict]) -> List[Dict[str, Any]]:
"""Analyze how users establish context."""
strategies = []
user_messages = [m for m in messages if m.get("role") == "user"]
for msg in user_messages[:10]: # Analyze first 10 user messages
content = msg.get("content", "")
# Identify context establishment strategies
            # Match file-path-like tokens such as ./script.py or /usr/bin/tool.sh
            if re.search(r'[/.][\w/.-]+\.\w+', content):
strategies.append({
"type": "file_reference",
"content": content[:200],
"tokens": len(content.split())
})
elif "```" in content:
strategies.append({
"type": "code_example",
"content": content[:200],
"tokens": len(content.split())
})
elif len(content) > 300:
strategies.append({
"type": "detailed_background",
"content": content[:200],
"tokens": len(content.split())
})
return strategies
def _calculate_success_metrics(self, messages: List[Dict]) -> Dict[str, Any]:
"""Calculate success metrics for the session."""
tool_calls = 0
successful_tool_calls = 0
user_corrections = 0
successful_corrections = 0
for i, msg in enumerate(messages):
if msg.get("role") == "assistant" and msg.get("tool_calls"):
tool_calls += 1
if msg.get("role") == "tool":
content = msg.get("content", "")
if "error" not in content.lower() and "failed" not in content.lower():
successful_tool_calls += 1
# Count corrections
if (msg.get("role") == "user" and i > 0 and
messages[i-1].get("role") == "tool" and
("error" in messages[i-1].get("content", "").lower() or
"failed" in messages[i-1].get("content", "").lower())):
user_corrections += 1
return {
"tool_calls": tool_calls,
"successful_tool_calls": successful_tool_calls,
"success_rate": successful_tool_calls / tool_calls if tool_calls > 0 else 0,
"user_corrections": user_corrections,
"messages_per_correction": len(messages) / user_corrections if user_corrections > 0 else 0
}
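# Illustrative sketch (hypothetical helper, not part of the module API): the
# success-rate heuristic used by _calculate_success_metrics above, reduced to
# a standalone function. A tool result counts as successful when its content
# mentions neither "error" nor "failed" -- a deliberately crude proxy.
def _demo_tool_success_rate(messages):
    tool_results = [m for m in messages if m.get("role") == "tool"]
    ok = [
        m for m in tool_results
        if "error" not in m.get("content", "").lower()
        and "failed" not in m.get("content", "").lower()
    ]
    # Fraction of tool results that look successful; 0.0 when there are none
    return len(ok) / len(tool_results) if tool_results else 0.0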
class GuidanceTemplateGenerator:
"""Generate user guidance templates from analysis."""
def __init__(self, analyzer: UserGuidanceAnalyzer = None):
self.analyzer = analyzer or UserGuidanceAnalyzer()
def create_guidance_template(self, session_ids: List[str], name: str = None) -> UserGuidanceProfile:
"""
Create a guidance template from multiple sessions.
Args:
session_ids: List of session IDs to analyze
name: Template name
Returns:
UserGuidanceProfile with extracted patterns
"""
all_patterns = []
all_corrections = []
all_strategies = []
for session_id in session_ids:
analysis = self.analyzer.analyze_user_guidance(session_id)
if "error" in analysis:
logger.warning(f"Skipping session {session_id}: {analysis['error']}")
continue
all_patterns.extend(analysis.get("prompt_patterns", []))
            all_corrections.extend(analysis.get("correction_patterns", []))
all_strategies.extend(analysis.get("context_strategies", []))
# Aggregate patterns
prompt_patterns = self._aggregate_prompt_patterns(all_patterns)
correction_patterns = self._aggregate_corrections(all_corrections)
context_strategies = self._aggregate_strategies(all_strategies)
profile = UserGuidanceProfile(
profile_id=f"guidance_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
            name=name or "User Guidance Template",
description=f"Extracted from {len(session_ids)} sessions",
prompt_patterns=prompt_patterns,
correction_patterns=correction_patterns,
context_strategies=context_strategies,
created_at=datetime.now().isoformat(),
source_analysis=f"Sessions: {', '.join(session_ids[:5])}{'...' if len(session_ids) > 5 else ''}"
)
return profile
def _aggregate_prompt_patterns(self, patterns: List[Dict]) -> List[PromptPattern]:
"""Aggregate prompt patterns by type."""
pattern_groups = {}
for p in patterns:
ptype = p.get("type", "unknown")
if ptype not in pattern_groups:
pattern_groups[ptype] = {"count": 0, "successes": 0, "examples": []}
pattern_groups[ptype]["count"] += 1
if p.get("success"):
pattern_groups[ptype]["successes"] += 1
if len(pattern_groups[ptype]["examples"]) < 3:
pattern_groups[ptype]["examples"].append(p.get("content", "")[:100])
result = []
for ptype, data in pattern_groups.items():
success_rate = data["successes"] / data["count"] if data["count"] > 0 else 0
# Create template from examples
template = self._create_template_from_examples(data["examples"], ptype)
result.append(PromptPattern(
pattern_type=ptype,
template=template,
success_rate=success_rate,
usage_count=data["count"],
context_requirements=[]
))
return result
def _aggregate_corrections(self, corrections: List[Dict]) -> List[CorrectionPattern]:
"""Aggregate correction patterns."""
correction_groups = {}
for c in corrections:
# Simplify error type
error_content = c.get("error_content", "").lower()
if "filenotfound" in error_content or "no such file" in error_content:
error_type = "file_not_found"
elif "permission" in error_content:
error_type = "permission_denied"
elif "command not found" in error_content:
error_type = "command_not_found"
else:
error_type = "general_error"
if error_type not in correction_groups:
correction_groups[error_type] = {"count": 0, "successes": 0, "examples": []}
correction_groups[error_type]["count"] += 1
if c.get("success"):
correction_groups[error_type]["successes"] += 1
if len(correction_groups[error_type]["examples"]) < 3:
correction_groups[error_type]["examples"].append(c.get("user_correction", "")[:100])
result = []
for error_type, data in correction_groups.items():
effectiveness = data["successes"] / data["count"] if data["count"] > 0 else 0
# Determine correction strategy
if data["examples"]:
first_example = data["examples"][0].lower()
if "try" in first_example or "instead" in first_example:
strategy = "reframe"
elif "use" in first_example or "run" in first_example:
strategy = "direct"
elif "like this" in first_example or "example" in first_example:
strategy = "example"
else:
strategy = "constraint"
else:
strategy = "unknown"
result.append(CorrectionPattern(
error_type=error_type,
correction_strategy=strategy,
effectiveness=effectiveness,
common_phrases=data["examples"][:3]
))
return result
def _aggregate_strategies(self, strategies: List[Dict]) -> List[ContextStrategy]:
"""Aggregate context strategies."""
strategy_groups = {}
for s in strategies:
stype = s.get("type", "unknown")
if stype not in strategy_groups:
strategy_groups[stype] = {"count": 0, "tokens": []}
strategy_groups[stype]["count"] += 1
strategy_groups[stype]["tokens"].append(s.get("tokens", 0))
result = []
for stype, data in strategy_groups.items():
avg_tokens = sum(data["tokens"]) / len(data["tokens"]) if data["tokens"] else 0
result.append(ContextStrategy(
strategy_type=stype,
description=f"Used {data['count']} times, avg {avg_tokens:.0f} tokens",
effectiveness=0.5, # Would need more analysis
token_cost=int(avg_tokens)
))
return result
def _create_template_from_examples(self, examples: List[str], ptype: str) -> str:
"""Create a template from examples."""
if not examples:
return f"Example {ptype} prompt"
# Simple template creation
if ptype == "polite_request":
return "Please [action] [details]"
elif ptype == "question":
return "How do I [action]?"
elif ptype == "command":
return "/[command] [arguments]"
elif ptype == "detailed_request":
return "I need to [goal]. Specifically, [details]. Context: [background]"
else:
return examples[0][:50] + "..."
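# Illustrative sketch (hypothetical helper, not part of the module API): the
# substring-based error bucketing that _aggregate_corrections applies above,
# as a standalone classifier over a raw error string.
def _demo_classify_error(error_content):
    text = error_content.lower()
    if "filenotfound" in text or "no such file" in text:
        return "file_not_found"
    if "permission" in text:
        return "permission_denied"
    if "command not found" in text:
        return "command_not_found"
    # Anything unrecognized falls into a catch-all bucket
    return "general_error"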
class GuidanceTemplateManager:
"""Manage user guidance templates."""
def __init__(self, template_dir: Path = None):
self.template_dir = template_dir or Path.home() / ".hermes" / "guidance_templates"
self.template_dir.mkdir(parents=True, exist_ok=True)
def save_template(self, profile: UserGuidanceProfile) -> Path:
"""Save a guidance template."""
template_path = self.template_dir / f"{profile.profile_id}.json"
with open(template_path, 'w') as f:
json.dump(profile.to_dict(), f, indent=2)
logger.info(f"Saved guidance template {profile.profile_id} to {template_path}")
return template_path
def load_template(self, profile_id: str) -> Optional[UserGuidanceProfile]:
"""Load a guidance template."""
template_path = self.template_dir / f"{profile_id}.json"
if not template_path.exists():
logger.warning(f"Template {profile_id} not found")
return None
try:
with open(template_path, 'r') as f:
data = json.load(f)
return UserGuidanceProfile.from_dict(data)
except Exception as e:
logger.error(f"Failed to load template {profile_id}: {e}")
return None
def list_templates(self) -> List[Dict[str, Any]]:
"""List all available templates."""
templates = []
for template_path in self.template_dir.glob("*.json"):
try:
with open(template_path, 'r') as f:
data = json.load(f)
templates.append({
"profile_id": data.get("profile_id"),
"name": data.get("name"),
"description": data.get("description"),
"created_at": data.get("created_at"),
"prompt_patterns": len(data.get("prompt_patterns", [])),
"correction_patterns": len(data.get("correction_patterns", []))
})
except Exception as e:
logger.warning(f"Failed to read template {template_path}: {e}")
return templates
def generate_user_guide(profile: UserGuidanceProfile) -> str:
"""Generate a user-facing guide from a guidance profile."""
guide = f"""# Effective Agent Session Guide
**Template:** {profile.name}
**Generated:** {profile.created_at}
**Source:** {profile.source_analysis or "Multiple sessions"}
## Effective Prompt Patterns
"""
    for pattern in sorted(profile.prompt_patterns, key=lambda x: x.success_rate, reverse=True):
        guide += f"### {pattern.pattern_type.replace('_', ' ').title()}\n\n"
        guide += f"**Success Rate:** {pattern.success_rate:.0%}\n"
        guide += f"**Usage:** {pattern.usage_count} times\n"
        guide += f"**Template:** `{pattern.template}`\n\n"
    guide += "## Error Correction Strategies\n\n"
    for correction in sorted(profile.correction_patterns, key=lambda x: x.effectiveness, reverse=True):
        guide += f"### {correction.error_type.replace('_', ' ').title()}\n\n"
        guide += f"**Effectiveness:** {correction.effectiveness:.0%}\n"
        guide += f"**Strategy:** {correction.correction_strategy}\n"
        if correction.common_phrases:
            guide += f'**Example:** "{correction.common_phrases[0]}"\n'
        guide += "\n"
    guide += "## Context Establishment Tips\n\n"
    for strategy in profile.context_strategies:
        guide += f"- **{strategy.strategy_type.replace('_', ' ').title()}:** {strategy.description}\n"
guide += """
## Key Insights
1. **Be specific:** Vague prompts lead to errors
2. **Provide context:** Help the agent understand your environment
3. **Use examples:** Show what you want when possible
4. **Correct effectively:** Use the strategies above when errors occur
5. **Manage context:** Don't overload with unnecessary information
## Remember
- Agent sessions degrade over time (error rates increase)
- Your guidance matters more than agent "experience"
- Use the patterns above to improve success rates
"""
return guide
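# Illustrative sketch (hypothetical helper, not part of the module API):
# rendering a single prompt-pattern section in the same markdown shape that
# generate_user_guide builds above, so the formatting can be checked in
# isolation.
def _demo_guide_section(pattern_type, success_rate, usage_count, template):
    title = pattern_type.replace("_", " ").title()
    return (
        f"### {title}\n\n"
        f"**Success Rate:** {success_rate:.0%}\n"
        f"**Usage:** {usage_count} times\n"
        f"**Template:** `{template}`\n\n"
    )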
# CLI Integration
def guidance_command(args):
"""CLI command for user guidance analysis."""
import argparse
parser = argparse.ArgumentParser(description="User guidance pattern analysis")
subparsers = parser.add_subparsers(dest="command")
# Analyze command
analyze_parser = subparsers.add_parser("analyze", help="Analyze user guidance in a session")
analyze_parser.add_argument("session_id", help="Session ID to analyze")
# Create template command
create_parser = subparsers.add_parser("create-template", help="Create guidance template from sessions")
create_parser.add_argument("session_ids", nargs="+", help="Session IDs to analyze")
create_parser.add_argument("--name", "-n", help="Template name")
# List templates command
subparsers.add_parser("list-templates", help="List available guidance templates")
# Generate guide command
guide_parser = subparsers.add_parser("generate-guide", help="Generate user guide from template")
guide_parser.add_argument("profile_id", help="Profile ID to generate guide from")
# Parse args
parsed = parser.parse_args(args)
if not parsed.command:
parser.print_help()
return 1
# Import session DB
try:
from hermes_state import SessionDB
session_db = SessionDB()
except ImportError:
print("Error: Cannot import SessionDB")
return 1
if parsed.command == "analyze":
analyzer = UserGuidanceAnalyzer(session_db)
analysis = analyzer.analyze_user_guidance(parsed.session_id)
print(f"\n=== User Guidance Analysis: {parsed.session_id} ===\n")
if "error" in analysis:
print(f"Error: {analysis['error']}")
return 1
print(f"Messages: {analysis['message_count']}")
print("\nPrompt Patterns:")
for p in analysis.get("prompt_patterns", [])[:5]:
            print(f"  {p['type']}: {'✓' if p.get('success') else '✗'} ({p['length']} chars)")
print("\nCorrection Patterns:")
for c in analysis.get("correction_patterns", [])[:3]:
print(f" {c['error_content'][:50]}... -> {c['user_correction'][:50]}...")
print("\nSuccess Metrics:")
metrics = analysis.get("success_metrics", {})
print(f" Tool calls: {metrics.get('tool_calls', 0)}")
print(f" Success rate: {metrics.get('success_rate', 0):.0%}")
print(f" User corrections: {metrics.get('user_corrections', 0)}")
return 0
elif parsed.command == "create-template":
analyzer = UserGuidanceAnalyzer(session_db)
generator = GuidanceTemplateGenerator(analyzer)
profile = generator.create_guidance_template(
parsed.session_ids,
name=parsed.name
)
manager = GuidanceTemplateManager()
path = manager.save_template(profile)
print(f"Created guidance template: {profile.profile_id}")
print(f"Saved to: {path}")
print(f"Prompt patterns: {len(profile.prompt_patterns)}")
print(f"Correction patterns: {len(profile.correction_patterns)}")
return 0
elif parsed.command == "list-templates":
manager = GuidanceTemplateManager()
templates = manager.list_templates()
if not templates:
print("No templates found.")
return 0
print("\n=== Available Guidance Templates ===\n")
for t in templates:
print(f"ID: {t['profile_id']}")
print(f"Name: {t['name']}")
print(f"Description: {t['description']}")
print(f"Prompt patterns: {t['prompt_patterns']}")
print(f"Correction patterns: {t['correction_patterns']}")
print()
return 0
elif parsed.command == "generate-guide":
manager = GuidanceTemplateManager()
profile = manager.load_template(parsed.profile_id)
if not profile:
print(f"Template {parsed.profile_id} not found")
return 1
guide = generate_user_guide(profile)
print(guide)
# Also save to file
guide_path = manager.template_dir / f"{parsed.profile_id}_guide.md"
with open(guide_path, 'w') as f:
f.write(guide)
print(f"\nGuide saved to: {guide_path}")
return 0
return 1
if __name__ == "__main__":
import sys
sys.exit(guidance_command(sys.argv[1:]))