Files
hermes-agent/website/docs/developer-guide/prompt-assembly.md

90 lines
3.0 KiB
Markdown
Raw Permalink Normal View History

---
sidebar_position: 5
title: "Prompt Assembly"
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
---
# Prompt Assembly
Hermes deliberately separates:
- **cached system prompt state**
- **ephemeral API-call-time additions**
This is one of the most important design choices in the project because it affects:
- token usage
- prompt caching effectiveness
- session continuity
- memory correctness
Primary files:
- `run_agent.py`
- `agent/prompt_builder.py`
- `tools/memory_tool.py`
## Cached system prompt layers
The cached system prompt is assembled in roughly this order:
1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
2. tool-aware behavior guidance
3. Honcho static block (when active)
4. optional system message
5. frozen MEMORY snapshot
6. frozen USER profile snapshot
7. skills index
8. context files (`AGENTS.md`, `.cursorrules`, `.cursor/rules/*.mdc`) — SOUL.md is **not** included here when it was already loaded as the identity in step 1
9. timestamp / optional session ID
10. platform hint
When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
## API-call-time-only layers
These are intentionally *not* persisted as part of the cached system prompt:
- `ephemeral_system_prompt`
- prefill messages
- gateway-derived session context overlays
- later-turn Honcho recall injected into the current-turn user message
This separation keeps the stable prefix stable for caching.
## Memory snapshots
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
## Context files
docs: fix stale and incorrect documentation across 18 files Cross-referenced all 84 docs pages against the actual codebase and corrected every discrepancy found. Reference docs: - faq.md: Fix non-existent commands (/stats→/usage, /context→/usage, hermes models→hermes model, hermes config get→hermes config show, hermes gateway logs→cat gateway.log, async→sync chat() call) - cli-commands.md: Fix --provider choices list (remove providers not in argparse), add undocumented -s/--skills flag - slash-commands.md: Add missing /queue and /resume commands, fix /approve args_hint to show [session|always] - tools-reference.md: Remove duplicate vision and web toolset sections - environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add copilot-acp, remove alibaba to match actual argparse choices) Configuration & user guide: - configuration.md: Fix approval_mode→approvals.mode (manual not ask), checkpoints.enabled default true not false, human_delay defaults (500/2000→800/2500), remove non-existent delegation.max_iterations and delegation.default_toolsets, fix website_blocklist nesting under security:, add .hermes.md and CLAUDE.md to context files table with priority system explanation - security.md: Fix website_blocklist nesting under security: - context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support, document priority-based first-match-wins loading behavior - cli.md: Fix personalities config nesting (top-level, not under agent:) - delegation.md: Fix model override docs (config-level, not per-call tool parameter) - rl-training.md: Fix log directory (tinker-atropos/logs/→ ~/.hermes/logs/rl_training/) - tts.md: Fix Discord delivery format (voice bubble with fallback, not just file attachment) - git-worktrees.md: Remove outdated v0.2.0 version reference Developer guide: - prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority system for context files - agent-loop.md: Fix callback list (remove non-existent message_callback, add stream_delta_callback, tool_gen_callback, status_callback) Messaging & guides: - webhooks.md: Fix command (hermes setup gateway→hermes gateway setup) - tips.md: Fix session idle timeout (120min→24h), config file (gateway.json→config.yaml) - build-a-hermes-plugin.md: Fix plugin.yaml provides: format (provides_tools/provides_hooks as lists), note register_command() as not yet implemented
2026-03-24 07:53:07 -07:00
`agent/prompt_builder.py` scans and sanitizes project context files using a **priority system** — only one type is loaded (first match wins):
docs: fix stale and incorrect documentation across 18 files Cross-referenced all 84 docs pages against the actual codebase and corrected every discrepancy found. Reference docs: - faq.md: Fix non-existent commands (/stats→/usage, /context→/usage, hermes models→hermes model, hermes config get→hermes config show, hermes gateway logs→cat gateway.log, async→sync chat() call) - cli-commands.md: Fix --provider choices list (remove providers not in argparse), add undocumented -s/--skills flag - slash-commands.md: Add missing /queue and /resume commands, fix /approve args_hint to show [session|always] - tools-reference.md: Remove duplicate vision and web toolset sections - environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add copilot-acp, remove alibaba to match actual argparse choices) Configuration & user guide: - configuration.md: Fix approval_mode→approvals.mode (manual not ask), checkpoints.enabled default true not false, human_delay defaults (500/2000→800/2500), remove non-existent delegation.max_iterations and delegation.default_toolsets, fix website_blocklist nesting under security:, add .hermes.md and CLAUDE.md to context files table with priority system explanation - security.md: Fix website_blocklist nesting under security: - context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support, document priority-based first-match-wins loading behavior - cli.md: Fix personalities config nesting (top-level, not under agent:) - delegation.md: Fix model override docs (config-level, not per-call tool parameter) - rl-training.md: Fix log directory (tinker-atropos/logs/→ ~/.hermes/logs/rl_training/) - tts.md: Fix Discord delivery format (voice bubble with fallback, not just file attachment) - git-worktrees.md: Remove outdated v0.2.0 version reference Developer guide: - prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority system for context files - agent-loop.md: Fix callback list (remove non-existent message_callback, add stream_delta_callback, tool_gen_callback, status_callback) Messaging & guides: - webhooks.md: Fix command (hermes setup gateway→hermes gateway setup) - tips.md: Fix session idle timeout (120min→24h), config file (gateway.json→config.yaml) - build-a-hermes-plugin.md: Fix plugin.yaml provides: format (provides_tools/provides_hooks as lists), note register_command() as not yet implemented
2026-03-24 07:53:07 -07:00
1. `.hermes.md` / `HERMES.md` (walks to git root)
2. `AGENTS.md` (recursive directory walk)
3. `CLAUDE.md` (CWD only)
4. `.cursorrules` / `.cursor/rules/*.mdc` (CWD only)
`SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice.
Long files are truncated before injection.
## Skills index
The skills system contributes a compact skills index to the prompt when skills tooling is available.
## Why prompt assembly is split this way
The architecture is intentionally optimized to:
- preserve provider-side prompt caching
- avoid mutating history unnecessarily
- keep memory semantics understandable
- let gateway/ACP/CLI add context without poisoning persistent prompt state
## Related docs
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
- [Session Storage](./session-storage.md)
- [Gateway Internals](./gateway-internals.md)