Moved "architecture" dir to "docs" for clarity
This commit is contained in:
104
docs/agents.md
Normal file
104
docs/agents.md
Normal file
@@ -0,0 +1,104 @@
|
||||
# Agents
|
||||
|
||||
The agent is the core loop that orchestrates LLM calls and tool execution.
|
||||
|
||||
## AIAgent Class
|
||||
|
||||
The main agent is implemented in `run_agent.py`:
|
||||
|
||||
```python
|
||||
class AIAgent:
|
||||
def __init__(
|
||||
self,
|
||||
model: str = "anthropic/claude-sonnet-4",
|
||||
api_key: str = None,
|
||||
base_url: str = "https://openrouter.ai/api/v1",
|
||||
max_turns: int = 20,
|
||||
enabled_toolsets: list = None,
|
||||
disabled_toolsets: list = None,
|
||||
verbose_logging: bool = False,
|
||||
):
|
||||
# Initialize OpenAI client, load tools based on toolsets
|
||||
...
|
||||
|
||||
def chat(self, user_message: str, task_id: str = None) -> str:
|
||||
# Main entry point - runs the agent loop
|
||||
...
|
||||
```
|
||||
|
||||
## Agent Loop
|
||||
|
||||
The core loop in `_run_agent_loop()`:
|
||||
|
||||
```
|
||||
1. Add user message to conversation
|
||||
2. Call LLM with tools
|
||||
3. If LLM returns tool calls:
|
||||
- Execute each tool
|
||||
- Add tool results to conversation
|
||||
- Go to step 2
|
||||
4. If LLM returns text response:
|
||||
- Return response to user
|
||||
```
|
||||
|
||||
```python
|
||||
while turns < max_turns:
|
||||
response = client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
tools=tool_schemas,
|
||||
)
|
||||
|
||||
if response.tool_calls:
|
||||
for tool_call in response.tool_calls:
|
||||
result = await execute_tool(tool_call)
|
||||
messages.append(tool_result_message(result))
|
||||
turns += 1
|
||||
else:
|
||||
return response.content
|
||||
```
|
||||
|
||||
## Conversation Management
|
||||
|
||||
Messages are stored as a list of dicts following OpenAI format:
|
||||
|
||||
```python
|
||||
messages = [
|
||||
{"role": "system", "content": "You are a helpful assistant..."},
|
||||
{"role": "user", "content": "Search for Python tutorials"},
|
||||
{"role": "assistant", "content": None, "tool_calls": [...]},
|
||||
{"role": "tool", "tool_call_id": "...", "content": "..."},
|
||||
{"role": "assistant", "content": "Here's what I found..."},
|
||||
]
|
||||
```
|
||||
|
||||
## Reasoning Context
|
||||
|
||||
For models that support reasoning (chain-of-thought), the agent:
|
||||
1. Extracts `reasoning_content` from API responses
|
||||
2. Stores it in `assistant_msg["reasoning"]` for trajectory export
|
||||
3. Passes it back via `reasoning_content` field on subsequent turns
|
||||
|
||||
## Trajectory Export
|
||||
|
||||
Conversations can be exported for training:
|
||||
|
||||
```python
|
||||
agent = AIAgent(save_trajectories=True)
|
||||
agent.chat("Do something")
|
||||
# Saves to trajectories/*.jsonl in ShareGPT format
|
||||
```
|
||||
|
||||
## Batch Processing
|
||||
|
||||
For processing multiple prompts, use `batch_runner.py`:
|
||||
|
||||
```bash
|
||||
python batch_runner.py \
|
||||
--dataset_file=prompts.jsonl \
|
||||
--batch_size=20 \
|
||||
--num_workers=4 \
|
||||
--run_name=my_run
|
||||
```
|
||||
|
||||
See `batch_runner.py` for parallel execution with checkpointing.
|
||||
124
docs/llm_client.md
Normal file
124
docs/llm_client.md
Normal file
@@ -0,0 +1,124 @@
|
||||
# LLM Client
|
||||
|
||||
Hermes Agent uses the OpenAI Python SDK with OpenRouter as the backend, providing access to many models through a single API.
|
||||
|
||||
## Configuration
|
||||
|
||||
```python
|
||||
from openai import OpenAI
|
||||
|
||||
client = OpenAI(
|
||||
api_key=os.getenv("OPENROUTER_API_KEY"),
|
||||
base_url="https://openrouter.ai/api/v1"
|
||||
)
|
||||
```
|
||||
|
||||
## Supported Models
|
||||
|
||||
Any model available on [OpenRouter](https://openrouter.ai/models):
|
||||
|
||||
```python
|
||||
# Anthropic
|
||||
model = "anthropic/claude-sonnet-4"
|
||||
model = "anthropic/claude-opus-4"
|
||||
|
||||
# OpenAI
|
||||
model = "openai/gpt-4o"
|
||||
model = "openai/o1"
|
||||
|
||||
# Google
|
||||
model = "google/gemini-2.0-flash"
|
||||
|
||||
# Open models
|
||||
model = "meta-llama/llama-3.3-70b-instruct"
|
||||
model = "deepseek/deepseek-chat-v3"
|
||||
model = "moonshotai/kimi-k2.5"
|
||||
```
|
||||
|
||||
## Tool Calling
|
||||
|
||||
Standard OpenAI function calling format:
|
||||
|
||||
```python
|
||||
response = client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
tools=[
|
||||
{
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "web_search",
|
||||
"description": "Search the web",
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"query": {"type": "string"}
|
||||
},
|
||||
"required": ["query"]
|
||||
}
|
||||
}
|
||||
}
|
||||
],
|
||||
)
|
||||
|
||||
# Check for tool calls
|
||||
if response.choices[0].message.tool_calls:
|
||||
for tool_call in response.choices[0].message.tool_calls:
|
||||
name = tool_call.function.name
|
||||
args = json.loads(tool_call.function.arguments)
|
||||
# Execute tool...
|
||||
```
|
||||
|
||||
## Reasoning Models
|
||||
|
||||
Some models return reasoning/thinking content:
|
||||
|
||||
```python
|
||||
# Access reasoning if available
|
||||
message = response.choices[0].message
|
||||
if hasattr(message, 'reasoning_content') and message.reasoning_content:
|
||||
reasoning = message.reasoning_content
|
||||
# Store for trajectory export
|
||||
```
|
||||
|
||||
## Provider Selection
|
||||
|
||||
OpenRouter allows selecting specific providers:
|
||||
|
||||
```python
|
||||
response = client.chat.completions.create(
|
||||
model=model,
|
||||
messages=messages,
|
||||
extra_body={
|
||||
"provider": {
|
||||
"order": ["Anthropic", "Google"], # Preferred providers
|
||||
"ignore": ["Novita"], # Providers to skip
|
||||
}
|
||||
}
|
||||
)
|
||||
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
Common errors and handling:
|
||||
|
||||
```python
|
||||
try:
|
||||
response = client.chat.completions.create(...)
|
||||
except openai.RateLimitError:
|
||||
# Back off and retry
|
||||
except openai.APIError as e:
|
||||
# Check e.code for specific errors
|
||||
# 400 = bad request (often provider-specific)
|
||||
# 502 = bad gateway (retry with different provider)
|
||||
```
|
||||
|
||||
## Cost Tracking
|
||||
|
||||
OpenRouter returns usage info:
|
||||
|
||||
```python
|
||||
usage = response.usage
|
||||
print(f"Tokens: {usage.prompt_tokens} + {usage.completion_tokens}")
|
||||
print(f"Cost: ${usage.cost:.6f}") # If available
|
||||
```
|
||||
121
docs/message_graph.md
Normal file
121
docs/message_graph.md
Normal file
@@ -0,0 +1,121 @@
|
||||
# Message Format & Trajectories
|
||||
|
||||
Hermes Agent uses two message formats: the **API format** for LLM calls and the **trajectory format** for training data export.
|
||||
|
||||
## API Message Format
|
||||
|
||||
Standard OpenAI chat format used during execution:
|
||||
|
||||
```python
|
||||
messages = [
|
||||
# System prompt
|
||||
{"role": "system", "content": "You are a helpful assistant with tools..."},
|
||||
|
||||
# User query
|
||||
{"role": "user", "content": "Search for Python tutorials"},
|
||||
|
||||
# Assistant with tool call
|
||||
{
|
||||
"role": "assistant",
|
||||
"content": None,
|
||||
"tool_calls": [{
|
||||
"id": "call_abc123",
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "web_search",
|
||||
"arguments": "{\"query\": \"Python tutorials\"}"
|
||||
}
|
||||
}]
|
||||
},
|
||||
|
||||
# Tool result
|
||||
{
|
||||
"role": "tool",
|
||||
"tool_call_id": "call_abc123",
|
||||
"content": "{\"results\": [...]}"
|
||||
},
|
||||
|
||||
# Final response
|
||||
{"role": "assistant", "content": "Here's what I found..."}
|
||||
]
|
||||
```
|
||||
|
||||
## Trajectory Format (ShareGPT)
|
||||
|
||||
Exported for training in ShareGPT format:
|
||||
|
||||
```json
|
||||
{
|
||||
"conversations": [
|
||||
{"from": "system", "value": "You are a helpful assistant..."},
|
||||
{"from": "human", "value": "Search for Python tutorials"},
|
||||
{"from": "gpt", "value": "<tool_call>\n{\"name\": \"web_search\", \"arguments\": {\"query\": \"Python tutorials\"}}\n</tool_call>"},
|
||||
{"from": "tool", "value": "<tool_response>\n{\"results\": [...]}\n</tool_response>"},
|
||||
{"from": "gpt", "value": "Here's what I found..."}
|
||||
],
|
||||
"tools": "[{\"type\": \"function\", \"function\": {...}}]",
|
||||
"source": "hermes-agent"
|
||||
}
|
||||
```
|
||||
|
||||
## Reasoning Content
|
||||
|
||||
For models that output reasoning/chain-of-thought:
|
||||
|
||||
**During execution** (API format):
|
||||
```python
|
||||
# Stored internally but not sent back to model in content
|
||||
assistant_msg = {
|
||||
"role": "assistant",
|
||||
"content": "Here's what I found...",
|
||||
"reasoning": "Let me think about this step by step..." # Internal only
|
||||
}
|
||||
```
|
||||
|
||||
**In trajectory export** (reasoning wrapped in tags):
|
||||
```json
|
||||
{
|
||||
"from": "gpt",
|
||||
"value": "<think>\nLet me think about this step by step...\n</think>\nHere's what I found..."
|
||||
}
|
||||
```
|
||||
|
||||
## Conversion Flow
|
||||
|
||||
```
|
||||
API Response → Internal Storage → Trajectory Export
|
||||
↓ ↓ ↓
|
||||
tool_calls reasoning field <tool_call> tags
|
||||
reasoning_content <think> tags
|
||||
```
|
||||
|
||||
The conversion happens in `_convert_to_trajectory_format()` in `run_agent.py`.
|
||||
|
||||
## Ephemeral System Prompts
|
||||
|
||||
Batch processing supports ephemeral system prompts that guide behavior during execution but are NOT saved to trajectories:
|
||||
|
||||
```python
|
||||
# During execution: full system prompt + ephemeral guidance
|
||||
messages = [
|
||||
{"role": "system", "content": SYSTEM_PROMPT + "\n\n" + ephemeral_prompt},
|
||||
...
|
||||
]
|
||||
|
||||
# In saved trajectory: only the base system prompt
|
||||
trajectory = {
|
||||
"conversations": [
|
||||
{"from": "system", "value": SYSTEM_PROMPT}, # No ephemeral
|
||||
...
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Trajectory Compression
|
||||
|
||||
Long trajectories can be compressed for training using `trajectory_compressor.py`:
|
||||
|
||||
- Protects first/last N turns
|
||||
- Summarizes middle turns with LLM
|
||||
- Targets specific token budget
|
||||
- See `configs/trajectory_compression.yaml` for settings
|
||||
133
docs/tools.md
Normal file
133
docs/tools.md
Normal file
@@ -0,0 +1,133 @@
|
||||
# Tools
|
||||
|
||||
Tools are functions that extend the agent's capabilities. Each tool is defined with an OpenAI-compatible JSON schema and an async handler function.
|
||||
|
||||
## Tool Structure
|
||||
|
||||
Each tool module in `tools/` exports:
|
||||
1. **Schema definitions** - OpenAI function-calling format
|
||||
2. **Handler functions** - Async functions that execute the tool
|
||||
|
||||
```python
|
||||
# Example: tools/web_tools.py
|
||||
|
||||
# Schema definition
|
||||
WEB_SEARCH_SCHEMA = {
|
||||
"type": "function",
|
||||
"function": {
|
||||
"name": "web_search",
|
||||
"description": "Search the web for information",
|
||||
"parameters": {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"query": {"type": "string", "description": "Search query"}
|
||||
},
|
||||
"required": ["query"]
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
# Handler function
|
||||
async def web_search(query: str) -> dict:
|
||||
"""Execute web search and return results."""
|
||||
# Implementation...
|
||||
return {"results": [...]}
|
||||
```
|
||||
|
||||
## Tool Categories
|
||||
|
||||
| Category | Module | Tools |
|
||||
|----------|--------|-------|
|
||||
| **Web** | `web_tools.py` | `web_search`, `web_extract`, `web_crawl` |
|
||||
| **Terminal** | `terminal_tool.py` | `terminal` (local/docker/singularity/modal backends) |
|
||||
| **Browser** | `browser_tool.py` | `browser_navigate`, `browser_click`, `browser_type`, etc. |
|
||||
| **Vision** | `vision_tools.py` | `vision_analyze` |
|
||||
| **Image Gen** | `image_generation_tool.py` | `image_generate` |
|
||||
| **Reasoning** | `mixture_of_agents_tool.py` | `mixture_of_agents` |
|
||||
| **Skills** | `skills_tool.py` | `skills_categories`, `skills_list`, `skill_view` |
|
||||
|
||||
## Tool Registration
|
||||
|
||||
Tools are registered in `model_tools.py`:
|
||||
|
||||
```python
|
||||
# model_tools.py
|
||||
TOOL_SCHEMAS = [
|
||||
*WEB_TOOL_SCHEMAS,
|
||||
*TERMINAL_TOOL_SCHEMAS,
|
||||
*BROWSER_TOOL_SCHEMAS,
|
||||
# ...
|
||||
]
|
||||
|
||||
TOOL_HANDLERS = {
|
||||
"web_search": web_search,
|
||||
"terminal": terminal_tool,
|
||||
"browser_navigate": browser_navigate,
|
||||
# ...
|
||||
}
|
||||
```
|
||||
|
||||
## Toolsets
|
||||
|
||||
Tools are grouped into **toolsets** for logical organization (see `toolsets.py`):
|
||||
|
||||
```python
|
||||
TOOLSETS = {
|
||||
"web": {
|
||||
"description": "Web search and content extraction",
|
||||
"tools": ["web_search", "web_extract", "web_crawl"]
|
||||
},
|
||||
"terminal": {
|
||||
"description": "Command execution",
|
||||
"tools": ["terminal"]
|
||||
},
|
||||
# ...
|
||||
}
|
||||
```
|
||||
|
||||
## Adding a New Tool
|
||||
|
||||
1. Create handler function in `tools/your_tool.py`
|
||||
2. Define JSON schema following OpenAI format
|
||||
3. Register in `model_tools.py` (schemas and handlers)
|
||||
4. Add to appropriate toolset in `toolsets.py`
|
||||
5. Update `tools/__init__.py` exports
|
||||
|
||||
## Stateful Tools
|
||||
|
||||
Some tools maintain state across calls within a session:
|
||||
|
||||
- **Terminal**: Keeps container/sandbox running between commands
|
||||
- **Browser**: Maintains browser session for multi-step navigation
|
||||
|
||||
State is managed per `task_id` and cleaned up automatically.
|
||||
|
||||
## Skills Tools (Progressive Disclosure)
|
||||
|
||||
Skills are on-demand knowledge documents. They use **progressive disclosure** to minimize tokens:
|
||||
|
||||
```
|
||||
Level 0: skills_categories() → ["mlops", "devops"] (~50 tokens)
|
||||
Level 1: skills_list(category) → [{name, description}, ...] (~3k tokens)
|
||||
Level 2: skill_view(name) → Full content + metadata (varies)
|
||||
Level 3: skill_view(name, path) → Specific reference file (varies)
|
||||
```
|
||||
|
||||
Skill directory structure:
|
||||
```
|
||||
skills/
|
||||
└── mlops/
|
||||
└── axolotl/
|
||||
├── SKILL.md # Main instructions (required)
|
||||
├── references/ # Additional docs
|
||||
└── templates/ # Output formats, configs
|
||||
```
|
||||
|
||||
SKILL.md uses YAML frontmatter:
|
||||
```yaml
|
||||
---
|
||||
name: axolotl
|
||||
description: Fine-tuning LLMs with Axolotl
|
||||
tags: [Fine-Tuning, LoRA, DPO]
|
||||
---
|
||||
```
|
||||
Reference in New Issue
Block a user