Update agent configuration for maximum tool-calling iterations
- Increased the default maximum tool-calling iterations from 20 to 60 in the CLI configuration and related files, allowing for more complex tasks. - Updated documentation and comments to reflect the new recommended range for iterations, enhancing user guidance. - Implemented backward compatibility for loading max iterations from the root-level configuration, ensuring a smooth transition for existing users. - Adjusted the setup wizard to prompt for the maximum iterations setting, improving user experience during configuration.
This commit is contained in:
145
TODO.md
145
TODO.md
@@ -441,4 +441,149 @@
|
||||
|
||||
---
|
||||
|
||||
## 14. Learning Machine / Dynamic Memory System 🧠
|
||||
|
||||
*Inspired by [Dash](~/agent-codebases/dash) - a self-learning data agent.*
|
||||
|
||||
**Problem:** Agent starts fresh every session. Valuable learnings from debugging, error patterns, successful approaches, and user preferences are lost.
|
||||
|
||||
**Dash's Key Insight:** Separate **Knowledge** (static, curated) from **Learnings** (dynamic, discovered):
|
||||
|
||||
| System | What It Stores | How It Evolves |
|
||||
|--------|---------------|----------------|
|
||||
| **Knowledge** (Skills) | Validated approaches, templates, best practices | Curated by user |
|
||||
| **Learnings** | Error patterns, gotchas, discovered fixes | Managed automatically |
|
||||
|
||||
**Tools to implement:**
|
||||
- [ ] `save_learning(topic, learning, context?)` - Record a discovered pattern
|
||||
```python
|
||||
save_learning(
|
||||
topic="python-ssl",
|
||||
learning="On Ubuntu 22.04, SSL certificate errors often fixed by: apt install ca-certificates",
|
||||
context="Debugging requests SSL failure"
|
||||
)
|
||||
```
|
||||
- [ ] `search_learnings(query)` - Find relevant past learnings
|
||||
```python
|
||||
search_learnings("SSL certificate error Python")
|
||||
# Returns: "On Ubuntu 22.04, SSL certificate errors often fixed by..."
|
||||
```
|
||||
|
||||
**User Profile & Memory:**
|
||||
- [ ] `user_profile` - Structured facts about user preferences
|
||||
```yaml
|
||||
# ~/.hermes/user_profile.yaml
|
||||
coding_style:
|
||||
python_formatter: black
|
||||
type_hints: always
|
||||
test_framework: pytest
|
||||
preferences:
|
||||
verbosity: detailed
|
||||
confirm_destructive: true
|
||||
environment:
|
||||
os: linux
|
||||
shell: bash
|
||||
default_python: 3.11
|
||||
```
|
||||
- [ ] `user_memory` - Unstructured observations the agent learns
|
||||
```yaml
|
||||
# ~/.hermes/user_memory.yaml
|
||||
- "User prefers tabs over spaces despite black's defaults"
|
||||
- "User's main project is ~/work/myapp - a Django app"
|
||||
- "User often works late - don't ask about timezone"
|
||||
```
|
||||
|
||||
**When to learn:**
|
||||
- After fixing an error that took multiple attempts
|
||||
- When user corrects the agent's approach
|
||||
- When a workaround is discovered for a tool limitation
|
||||
- When user expresses a preference
|
||||
|
||||
**Storage:** Vector database (ChromaDB) or simple YAML with embedding search.
|
||||
|
||||
**Files to create:** `tools/learning_tools.py`, `learning/store.py`, `~/.hermes/learnings/`
|
||||
|
||||
---
|
||||
|
||||
## 15. Layered Context Architecture 📊
|
||||
|
||||
*Inspired by Dash's "Six Layers of Context" - grounding responses in multiple sources.*
|
||||
|
||||
**Problem:** Context sources are ad-hoc. No clear hierarchy or strategy for what context to include when.
|
||||
|
||||
**Proposed Layers for Hermes:**
|
||||
|
||||
| Layer | Source | When Loaded | Example |
|
||||
|-------|--------|-------------|---------|
|
||||
| 1. **Project Context** | `.hermes/context.md` | Auto on cwd | "This is a FastAPI project using PostgreSQL" |
|
||||
| 2. **Skills** | `skills/*.md` | On request | "How to set up React project" |
|
||||
| 3. **User Profile** | `~/.hermes/user_profile.yaml` | Always | "User prefers pytest, uses black" |
|
||||
| 4. **Learnings** | `~/.hermes/learnings/` | Semantic search | "SSL fix for Ubuntu" |
|
||||
| 5. **External Knowledge** | Web search, docs | On demand | Current API docs, Stack Overflow |
|
||||
| 6. **Runtime Introspection** | Tool calls | Real-time | File contents, terminal output |
|
||||
|
||||
**Benefits:**
|
||||
- Clear mental model for what context is available
|
||||
- Prioritization: local > learned > external
|
||||
- Debugging: "Why did agent do X?" → check which layers contributed
|
||||
|
||||
**Files to modify:** `run_agent.py` (context loading), new `context/layers.py`
|
||||
|
||||
---
|
||||
|
||||
## 16. Evaluation System with LLM Grading 📏
|
||||
|
||||
*Inspired by Dash's evaluation framework.*
|
||||
|
||||
**Problem:** `batch_runner.py` runs test cases but lacks quality assessment.
|
||||
|
||||
**Dash's Approach:**
|
||||
- **String matching** (default) - Check if expected strings appear
|
||||
- **LLM grader** (-g flag) - GPT evaluates response quality
|
||||
- **Result comparison** (-r flag) - Compare against golden output
|
||||
|
||||
**Implementation for Hermes:**
|
||||
|
||||
- [ ] **Test case format:**
|
||||
```python
|
||||
TestCase(
|
||||
name="create_python_project",
|
||||
prompt="Create a new Python project with FastAPI and tests",
|
||||
expected_strings=["requirements.txt", "main.py", "test_"], # Basic check
|
||||
golden_actions=["write:main.py", "write:requirements.txt", "terminal:pip install"],
|
||||
grader_criteria="Should create complete project structure with working code"
|
||||
)
|
||||
```
|
||||
|
||||
- [ ] **LLM grader mode:**
|
||||
```python
|
||||
def grade_response(response: str, criteria: str) -> Grade:
|
||||
"""Use GPT to evaluate response quality."""
|
||||
prompt = f"""
|
||||
Evaluate this agent response against the criteria.
|
||||
Criteria: {criteria}
|
||||
Response: {response}
|
||||
|
||||
Score (1-5) and explain why.
|
||||
"""
|
||||
# Returns: Grade(score=4, explanation="Created all files but tests are minimal")
|
||||
```
|
||||
|
||||
- [ ] **Action comparison mode:**
|
||||
- Record tool calls made during test
|
||||
- Compare against expected actions
|
||||
- "Expected terminal call to pip install, got npm install"
|
||||
|
||||
- [ ] **CLI flags:**
|
||||
```bash
|
||||
python batch_runner.py eval test_cases.yaml # String matching
|
||||
python batch_runner.py eval test_cases.yaml -g # + LLM grading
|
||||
python batch_runner.py eval test_cases.yaml -r # + Result comparison
|
||||
python batch_runner.py eval test_cases.yaml -v # Verbose (show responses)
|
||||
```
|
||||
|
||||
**Files to modify:** `batch_runner.py`, new `evals/test_cases.py`, new `evals/grader.py`
|
||||
|
||||
---
|
||||
|
||||
*Last updated: $(date +%Y-%m-%d)* 🤖
|
||||
|
||||
@@ -146,8 +146,10 @@ compression:
|
||||
# Agent Behavior
|
||||
# =============================================================================
|
||||
agent:
|
||||
# Maximum conversation turns before stopping
|
||||
max_turns: 20
|
||||
# Maximum tool-calling iterations per conversation
|
||||
# Higher = more room for complex tasks, but costs more tokens
|
||||
# Recommended: 20-30 for focused tasks, 50-100 for open exploration
|
||||
max_turns: 60
|
||||
|
||||
# Enable verbose logging
|
||||
verbose: false
|
||||
|
||||
26
cli.py
26
cli.py
@@ -95,7 +95,7 @@ def load_cli_config() -> Dict[str, Any]:
|
||||
"summary_model": "google/gemini-2.0-flash-001", # Fast/cheap model for summaries
|
||||
},
|
||||
"agent": {
|
||||
"max_turns": 20,
|
||||
"max_turns": 60, # Default max tool-calling iterations
|
||||
"verbose": False,
|
||||
"system_prompt": "",
|
||||
"personalities": {
|
||||
@@ -145,6 +145,10 @@ def load_cli_config() -> Dict[str, Any]:
|
||||
defaults[key].update(file_config[key])
|
||||
else:
|
||||
defaults[key] = file_config[key]
|
||||
|
||||
# Handle root-level max_turns (backwards compat) - copy to agent.max_turns
|
||||
if "max_turns" in file_config and "agent" not in file_config:
|
||||
defaults["agent"]["max_turns"] = file_config["max_turns"]
|
||||
except Exception as e:
|
||||
print(f"[Warning] Failed to load cli-config.yaml: {e}")
|
||||
|
||||
@@ -547,7 +551,7 @@ class HermesCLI:
|
||||
toolsets: List[str] = None,
|
||||
api_key: str = None,
|
||||
base_url: str = None,
|
||||
max_turns: int = 20,
|
||||
max_turns: int = 60,
|
||||
verbose: bool = False,
|
||||
compact: bool = False,
|
||||
):
|
||||
@@ -559,7 +563,7 @@ class HermesCLI:
|
||||
toolsets: List of toolsets to enable (default: all)
|
||||
api_key: API key (default: from environment)
|
||||
base_url: API base URL (default: OpenRouter)
|
||||
max_turns: Maximum conversation turns
|
||||
max_turns: Maximum tool-calling iterations (default: 60)
|
||||
verbose: Enable verbose logging
|
||||
compact: Use compact display mode
|
||||
"""
|
||||
@@ -577,7 +581,17 @@ class HermesCLI:
|
||||
|
||||
# API key: custom endpoint (OPENAI_API_KEY) takes precedence over OpenRouter
|
||||
self.api_key = api_key or os.getenv("OPENAI_API_KEY") or os.getenv("OPENROUTER_API_KEY")
|
||||
self.max_turns = max_turns if max_turns != 20 else CLI_CONFIG["agent"].get("max_turns", 20)
|
||||
# Max turns priority: CLI arg > env var > config file (agent.max_turns or root max_turns) > default
|
||||
if max_turns != 60: # CLI arg was explicitly set
|
||||
self.max_turns = max_turns
|
||||
elif os.getenv("HERMES_MAX_ITERATIONS"):
|
||||
self.max_turns = int(os.getenv("HERMES_MAX_ITERATIONS"))
|
||||
elif CLI_CONFIG["agent"].get("max_turns"):
|
||||
self.max_turns = CLI_CONFIG["agent"]["max_turns"]
|
||||
elif CLI_CONFIG.get("max_turns"): # Backwards compat: root-level max_turns
|
||||
self.max_turns = CLI_CONFIG["max_turns"]
|
||||
else:
|
||||
self.max_turns = 60
|
||||
|
||||
# Parse and validate toolsets
|
||||
self.enabled_toolsets = toolsets
|
||||
@@ -1377,7 +1391,7 @@ def main(
|
||||
model: str = None,
|
||||
api_key: str = None,
|
||||
base_url: str = None,
|
||||
max_turns: int = 20,
|
||||
max_turns: int = 60,
|
||||
verbose: bool = False,
|
||||
compact: bool = False,
|
||||
list_tools: bool = False,
|
||||
@@ -1396,7 +1410,7 @@ def main(
|
||||
model: Model to use (default: anthropic/claude-opus-4-20250514)
|
||||
api_key: API key for authentication
|
||||
base_url: Base URL for the API
|
||||
max_turns: Maximum conversation turns (default: 20)
|
||||
max_turns: Maximum tool-calling iterations (default: 60)
|
||||
verbose: Enable verbose logging
|
||||
compact: Use compact display mode
|
||||
list_tools: List available tools and exit
|
||||
|
||||
@@ -360,8 +360,12 @@ class GatewayRunner:
|
||||
toolset = toolset_map.get(source.platform, "hermes-telegram")
|
||||
|
||||
def run_sync():
|
||||
# Read from env var or use default (same as CLI)
|
||||
max_iterations = int(os.getenv("HERMES_MAX_ITERATIONS", "60"))
|
||||
|
||||
agent = AIAgent(
|
||||
model=os.getenv("HERMES_MODEL", "anthropic/claude-sonnet-4"),
|
||||
max_iterations=max_iterations,
|
||||
quiet_mode=True,
|
||||
enabled_toolsets=[toolset],
|
||||
ephemeral_system_prompt=context_prompt,
|
||||
|
||||
@@ -201,6 +201,13 @@ OPTIONAL_ENV_VARS = {
|
||||
"url": None,
|
||||
"password": True,
|
||||
},
|
||||
# Agent configuration
|
||||
"HERMES_MAX_ITERATIONS": {
|
||||
"description": "Maximum tool-calling iterations per conversation (default: 25 for messaging, 10 for CLI)",
|
||||
"prompt": "Max iterations",
|
||||
"url": None,
|
||||
"password": False,
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
|
||||
@@ -693,7 +693,28 @@ def run_setup_wizard(args):
|
||||
# else: Keep current (selected_backend is None)
|
||||
|
||||
# =========================================================================
|
||||
# Step 5: Context Compression
|
||||
# Step 5: Agent Settings
|
||||
# =========================================================================
|
||||
print_header("Agent Settings")
|
||||
|
||||
# Max iterations
|
||||
current_max = get_env_value('HERMES_MAX_ITERATIONS') or '60'
|
||||
print_info("Maximum tool-calling iterations per conversation.")
|
||||
print_info("Higher = more complex tasks, but costs more tokens.")
|
||||
print_info("Recommended: 30-60 for most tasks, 100+ for open exploration.")
|
||||
|
||||
max_iter_str = prompt("Max iterations", current_max)
|
||||
try:
|
||||
max_iter = int(max_iter_str)
|
||||
if max_iter > 0:
|
||||
save_env_value("HERMES_MAX_ITERATIONS", str(max_iter))
|
||||
config['max_turns'] = max_iter
|
||||
print_success(f"Max iterations set to {max_iter}")
|
||||
except ValueError:
|
||||
print_warning("Invalid number, keeping current value")
|
||||
|
||||
# =========================================================================
|
||||
# Step 6: Context Compression
|
||||
# =========================================================================
|
||||
print_header("Context Compression")
|
||||
print_info("Automatically summarize old messages when context gets too long.")
|
||||
@@ -718,7 +739,7 @@ def run_setup_wizard(args):
|
||||
config.setdefault('compression', {})['enabled'] = False
|
||||
|
||||
# =========================================================================
|
||||
# Step 6: Messaging Platforms (Optional)
|
||||
# Step 7: Messaging Platforms (Optional)
|
||||
# =========================================================================
|
||||
print_header("Messaging Platforms (Optional)")
|
||||
print_info("Connect to messaging platforms to chat with Hermes from anywhere.")
|
||||
@@ -812,7 +833,7 @@ def run_setup_wizard(args):
|
||||
print_success("Discord allowlist configured")
|
||||
|
||||
# =========================================================================
|
||||
# Step 7: Additional Tools (Optional)
|
||||
# Step 8: Additional Tools (Optional)
|
||||
# =========================================================================
|
||||
print_header("Additional Tools (Optional)")
|
||||
print_info("These tools extend the agent's capabilities.")
|
||||
|
||||
48
run_agent.py
48
run_agent.py
@@ -585,7 +585,7 @@ class AIAgent:
|
||||
base_url: str = None,
|
||||
api_key: str = None,
|
||||
model: str = "anthropic/claude-sonnet-4-20250514", # OpenRouter format
|
||||
max_iterations: int = 10,
|
||||
max_iterations: int = 60, # Default tool-calling iterations
|
||||
tool_delay: float = 1.0,
|
||||
enabled_toolsets: List[str] = None,
|
||||
disabled_toolsets: List[str] = None,
|
||||
@@ -1966,11 +1966,47 @@ class AIAgent:
|
||||
final_response = f"I apologize, but I encountered repeated errors: {error_msg}"
|
||||
break
|
||||
|
||||
# Handle max iterations reached
|
||||
if api_call_count >= self.max_iterations:
|
||||
print(f"⚠️ Reached maximum iterations ({self.max_iterations}). Stopping to prevent infinite loop.")
|
||||
if final_response is None:
|
||||
final_response = "I've reached the maximum number of iterations. Here's what I found so far."
|
||||
# Handle max iterations reached - ask model to summarize what it found
|
||||
if api_call_count >= self.max_iterations and final_response is None:
|
||||
print(f"⚠️ Reached maximum iterations ({self.max_iterations}). Requesting summary...")
|
||||
|
||||
# Inject a user message asking for a summary
|
||||
summary_request = (
|
||||
"You've reached the maximum number of tool-calling iterations allowed. "
|
||||
"Please provide a final response summarizing what you've found and accomplished so far, "
|
||||
"without calling any more tools."
|
||||
)
|
||||
messages.append({"role": "user", "content": summary_request})
|
||||
|
||||
# Make one final API call WITHOUT tools to force a text response
|
||||
try:
|
||||
api_messages = messages.copy()
|
||||
if self.ephemeral_system_prompt:
|
||||
api_messages = [{"role": "system", "content": self.ephemeral_system_prompt}] + api_messages
|
||||
|
||||
summary_response = self.client.chat.completions.create(
|
||||
model=self.model,
|
||||
messages=api_messages,
|
||||
# No tools parameter - forces text response
|
||||
extra_headers=self.extra_headers,
|
||||
extra_body=self.extra_body,
|
||||
)
|
||||
|
||||
if summary_response.choices and summary_response.choices[0].message.content:
|
||||
final_response = summary_response.choices[0].message.content
|
||||
# Strip think blocks from final response
|
||||
if "<think>" in final_response:
|
||||
import re
|
||||
final_response = re.sub(r'<think>.*?</think>\s*', '', final_response, flags=re.DOTALL).strip()
|
||||
|
||||
# Add to messages for session continuity
|
||||
messages.append({"role": "assistant", "content": final_response})
|
||||
else:
|
||||
final_response = "I reached the iteration limit and couldn't generate a summary."
|
||||
|
||||
except Exception as e:
|
||||
logging.warning(f"Failed to get summary response: {e}")
|
||||
final_response = f"I reached the maximum iterations ({self.max_iterations}) but couldn't summarize. Error: {str(e)}"
|
||||
|
||||
# Determine if conversation completed successfully
|
||||
completed = final_response is not None and api_call_count < self.max_iterations
|
||||
|
||||
Reference in New Issue
Block a user