---
sidebar_position: 3
title: "Agent Loop Internals"
description: "Detailed walkthrough of AIAgent execution, API modes, tools, callbacks, and fallback behavior"
---

# Agent Loop Internals

The core orchestration engine is `run_agent.py`'s `AIAgent`.
|
|
|
|
## Core responsibilities
|
|
|
|
`AIAgent` is responsible for:
|
|
|
|
- assembling the effective prompt and tool schemas
|
|
- selecting the correct provider/API mode
|
|
- making interruptible model calls
|
|
- executing tool calls (sequentially or concurrently)
|
|
- maintaining session history
|
|
- handling compression, retries, and fallback models
|
|
|
|
## API modes
|
|
|
|
Hermes currently supports three API execution modes:
|
|
|
|
| API mode | Used for |
|
|
|----------|----------|
|
|
| `chat_completions` | OpenAI-compatible chat endpoints, including OpenRouter and most custom endpoints |
|
|
| `codex_responses` | OpenAI Codex / Responses API path |
|
|
| `anthropic_messages` | Native Anthropic Messages API |
|
|
|
|
The mode is resolved from explicit args, provider selection, and base URL heuristics.
|
|
|
|
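That resolution order (explicit argument, then provider, then base-URL heuristics) can be sketched as a small helper. This is illustrative only: the function name and the specific URL heuristics are assumptions, not the actual Hermes implementation.

```python
def resolve_api_mode(explicit, provider, base_url):
    """Pick an API mode: an explicit argument wins, then provider
    selection, then base-URL heuristics, defaulting to chat completions."""
    if explicit:
        return explicit
    if provider == "anthropic":
        return "anthropic_messages"
    if base_url:
        url = base_url.lower()
        if "anthropic" in url:
            return "anthropic_messages"
        if "/responses" in url or "codex" in url:
            return "codex_responses"
    # OpenAI-compatible chat endpoints are the common default
    return "chat_completions"
```

With this shape, an OpenRouter base URL falls through to `chat_completions`, while an explicit mode always short-circuits the heuristics.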
## Turn lifecycle

```text
run_conversation()
  -> generate effective task_id
  -> append current user message
  -> load or build cached system prompt
  -> maybe preflight-compress
  -> build api_messages
  -> inject ephemeral prompt layers
  -> apply prompt caching if appropriate
  -> make interruptible API call
  -> if tool calls: execute them, append tool results, loop
  -> if final text: persist, cleanup, return response
```
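The tail of that lifecycle, the call/tool/loop cycle, can be sketched as a minimal Python loop. All names here (`call_model`, `execute_tools`, the message dict shape) are hypothetical stand-ins for the real machinery:

```python
def run_conversation(call_model, execute_tools, history, user_message, max_iters=10):
    """Minimal sketch of the turn loop: call the model, run any requested
    tools, feed the results back, and stop when the model returns final text."""
    history.append({"role": "user", "content": user_message})
    for _ in range(max_iters):
        reply = call_model(history)  # interruptible API call in the real agent
        if reply.get("tool_calls"):
            history.append(reply)
            # tool results are appended in order, then we loop back to the model
            history.extend(execute_tools(reply["tool_calls"]))
            continue
        history.append(reply)  # final assistant text: persist and return
        return reply["content"]
    raise RuntimeError("iteration budget exhausted")
```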
## Interruptible API calls

Hermes wraps API requests so they can be interrupted from the CLI or gateway.

This matters because:

- the agent may be in a long LLM call
- the user may send a new message mid-flight
- background systems may need cancellation semantics
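One common way to implement this kind of wrapper, shown here as a sketch rather than the actual Hermes code, is to race the request task against a cancellation event with `asyncio.wait`:

```python
import asyncio

async def interruptible_call(coro, cancel_event):
    """Race an API request against a cancellation signal; whichever
    finishes first wins, and the loser is cancelled."""
    request = asyncio.ensure_future(coro)
    canceller = asyncio.ensure_future(cancel_event.wait())
    done, pending = await asyncio.wait(
        {request, canceller}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()
    if request in done:
        return request.result()
    raise asyncio.CancelledError("interrupted by user")
```

Setting the event from anywhere (a CLI signal handler, a gateway message) cancels the in-flight request without waiting for it to finish.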
## Tool execution modes

Hermes uses two execution strategies:

- sequential execution for single or interactive tools
- concurrent execution for multiple non-interactive tools

Concurrent tool execution preserves message/result ordering when reinserting tool responses into conversation history.
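The ordering guarantee is easy to get with `asyncio.gather`, which returns results in submission order regardless of completion order. A sketch (the message shape is an assumption):

```python
import asyncio

async def execute_tools_concurrently(tool_calls, run_tool):
    """Run non-interactive tools concurrently, but emit results in the
    original call order so history stays consistent (gather preserves order)."""
    results = await asyncio.gather(*(run_tool(tc) for tc in tool_calls))
    return [
        {"role": "tool", "tool_call_id": tc["id"], "content": result}
        for tc, result in zip(tool_calls, results)
    ]
```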
## Callback surfaces

`AIAgent` supports platform/integration callbacks such as:

- `tool_progress_callback`
- `thinking_callback`
- `reasoning_callback`
- `clarify_callback`
- `step_callback`
- `stream_delta_callback`
- `tool_gen_callback`
- `status_callback`

These are how the CLI, gateway, and ACP integrations stream intermediate progress and drive interactive approval/clarification flows.
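A typical shape for such a surface, sketched here with hypothetical wiring (only the callback names come from the list above), is a bag of optional callables that the agent invokes when set and skips otherwise:

```python
class AgentCallbacks:
    """Hypothetical sketch: integrations pass optional callables and the
    agent fires whichever are set as events occur. Unset ones are no-ops."""

    def __init__(self, stream_delta_callback=None, status_callback=None,
                 tool_progress_callback=None):
        self.stream_delta_callback = stream_delta_callback
        self.status_callback = status_callback
        self.tool_progress_callback = tool_progress_callback

    def emit(self, name, *args):
        cb = getattr(self, name, None)
        if cb is not None:
            cb(*args)

# e.g. a CLI collecting streamed deltas:
deltas = []
cbs = AgentCallbacks(stream_delta_callback=deltas.append)
cbs.emit("stream_delta_callback", "Hel")
cbs.emit("stream_delta_callback", "lo")
cbs.emit("status_callback", "thinking")  # unset: silently skipped
```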
## Budget and fallback behavior

Hermes tracks a shared iteration budget across parent and subagents. It also injects budget pressure hints near the end of the available iteration window.

Fallback model support allows the agent to switch providers/models when the primary route fails in supported failure paths.
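Both behaviors can be sketched in a few lines; the class, method names, and thresholds below are illustrative, not the real Hermes internals:

```python
class IterationBudget:
    """Sketch of a budget shared between a parent agent and its subagents:
    every consumer decrements the same counter, and pressure hints can
    fire as the window nears exhaustion."""

    def __init__(self, total):
        self.remaining = total

    def consume(self):
        if self.remaining <= 0:
            raise RuntimeError("iteration budget exhausted")
        self.remaining -= 1

    def under_pressure(self, threshold=3):
        return self.remaining <= threshold


def call_with_fallback(models, call):
    """Try each model route in order until one succeeds."""
    last_error = None
    for model in models:
        try:
            return call(model)
        except Exception as err:
            last_error = err
    raise last_error
```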
## Compression and persistence

Before and during long runs, Hermes may:

- flush memory before context loss
- compress middle conversation turns
- split the session lineage into a new session ID after compression
- preserve recent context and structural tool-call/result consistency
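The "compress middle turns" step can be sketched as replacing the middle of a long history with a summary while keeping the head (system prompt) and the most recent turns intact. The function and its parameters are assumptions for illustration:

```python
def compress_middle_turns(history, keep_head, keep_tail, summarize):
    """Replace the middle of a long conversation with one summary message,
    preserving the system prompt and the most recent turns."""
    if len(history) <= keep_head + keep_tail:
        return history  # short enough: nothing to compress
    middle = history[keep_head:len(history) - keep_tail]
    summary = {"role": "assistant", "content": summarize(middle)}
    return history[:keep_head] + [summary] + history[-keep_tail:]
```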
## Key files to read next

- `run_agent.py`
- `agent/prompt_builder.py`
- `agent/context_compressor.py`
- `agent/prompt_caching.py`
- `model_tools.py`
## Related docs

- [Provider Runtime Resolution](./provider-runtime.md)
- [Prompt Assembly](./prompt-assembly.md)
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
- [Tools Runtime](./tools-runtime.md)