docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
---
sidebar_position: 5
title: "Prompt Assembly"
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
---
# Prompt Assembly
Hermes deliberately separates:
- **cached system prompt state**
- **ephemeral API-call-time additions**
This is one of the most important design choices in the project because it affects:
- token usage
- prompt caching effectiveness
- session continuity
- memory correctness
Primary files:
- `run_agent.py`
- `agent/prompt_builder.py`
- `tools/memory_tool.py`
## Cached system prompt layers
The cached system prompt is assembled in roughly this order:
2026-03-18 04:18:08 -07:00
1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py`
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
2. tool-aware behavior guidance
3. Honcho static block (when active)
4. optional system message
5. frozen MEMORY snapshot
6. frozen USER profile snapshot
7. skills index
2026-03-18 04:18:08 -07:00
8. context files (`AGENTS.md` , `.cursorrules` , `.cursor/rules/*.mdc` ) — SOUL.md is **not ** included here when it was already loaded as the identity in step 1
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
9. timestamp / optional session ID
10. platform hint
2026-03-18 04:18:08 -07:00
When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead.
docs: deep quality pass — expand 10 thin pages, fix specific issues (#4134)
Developer guide stubs expanded to full documentation:
- trajectory-format.md: 56→233 lines (JSONL format, ShareGPT example,
normalization rules, reasoning markup, replay code)
- session-storage.md: 66→388 lines (SQLite schema, migration table,
FTS5 search syntax, lineage queries, Python API examples)
- context-compression-and-caching.md: 72→321 lines (dual compression
system, config defaults, 4-phase algorithm, before/after example,
prompt caching mechanics, cache-aware patterns)
- tools-runtime.md: 65→246 lines (registry API, dispatch flow,
availability checking, error wrapping, approval flow)
- prompt-assembly.md: 89→246 lines (concrete assembled prompt example,
SOUL.md injection, context file discovery table)
User-facing pages expanded:
- docker.md: 62→224 lines (volumes, env forwarding, docker-compose,
resource limits, troubleshooting)
- updating.md: 79→167 lines (update behavior, version checking,
rollback instructions, Nix users)
- skins.md: 80→206 lines (all color/spinner/branding keys, built-in
skin descriptions, full custom skin YAML template)
Hub pages improved:
- integrations/index.md: 25→82 lines (web search backends table,
TTS/browser providers, quick config example)
- features/overview.md: added Integrations section with 6 missing links
Specific fixes:
- configuration.md: removed duplicate Gateway Streaming section
- mcp.md: removed internal "PR work" language
- plugins.md: added inline minimal plugin example (self-contained)
13 files changed, ~1700 lines added. Docusaurus build verified clean.
2026-03-30 20:30:11 -07:00
### Concrete example: assembled system prompt
Here is a simplified view of what the final system prompt looks like when all layers are present (comments show the source of each section):
```
# Layer 1: Agent Identity (from ~/.hermes/SOUL.md)
You are Hermes, an AI assistant created by Nous Research.
You are an expert software engineer and researcher.
You value correctness, clarity, and efficiency.
...
# Layer 2: Tool-aware behavior guidance
You have persistent memory across sessions. Save durable facts using
the memory tool: user preferences, environment details, tool quirks,
and stable conventions. Memory is injected into every turn, so keep
it compact and focused on facts that will still matter later.
...
When the user references something from a past conversation or you
suspect relevant cross-session context exists, use session_search
to recall it before asking them to repeat themselves.
# Tool-use enforcement (for GPT/Codex models only)
You MUST use your tools to take action — do not describe what you
would do or plan to do without actually doing it.
...
# Layer 3: Honcho static block (when active)
[Honcho personality/context data]
# Layer 4: Optional system message (from config or API)
[User-configured system message override]
# Layer 5: Frozen MEMORY snapshot
## Persistent Memory
- User prefers Python 3.12, uses pyproject.toml
- Default editor is nvim
- Working on project "atlas" in ~/code/atlas
- Timezone: US/Pacific
# Layer 6: Frozen USER profile snapshot
## User Profile
- Name: Alice
- GitHub: alice-dev
# Layer 7: Skills index
## Skills (mandatory)
Before replying, scan the skills below. If one clearly matches
your task, load it with skill_view(name) and follow its instructions.
...
<available_skills>
software-development:
- code-review: Structured code review workflow
- test-driven-development: TDD methodology
research:
- arxiv: Search and summarize arXiv papers
</available_skills>
# Layer 8: Context files (from project directory)
# Project Context
The following project context files have been loaded and should be followed:
## AGENTS.md
This is the atlas project. Use pytest for testing. The main
entry point is src/atlas/main.py. Always run `make lint` before
committing.
# Layer 9: Timestamp + session
Current time: 2026-03-30T14:30:00-07:00
Session: abc123
# Layer 10: Platform hint
You are a CLI AI Agent. Try not to use markdown but simple text
renderable inside a terminal.
```
## How SOUL.md appears in the prompt
`SOUL.md` lives at `~/.hermes/SOUL.md` and serves as the agent's identity — the very first section of the system prompt. The loading logic in `prompt_builder.py` works as follows:
```python
# From agent/prompt_builder.py (simplified)
def load_soul_md() -> Optional[str]:
soul_path = get_hermes_home() / "SOUL.md"
if not soul_path.exists():
return None
content = soul_path.read_text(encoding="utf-8").strip()
content = _scan_context_content(content, "SOUL.md") # Security scan
content = _truncate_content(content, "SOUL.md") # Cap at 20k chars
return content
```
When `load_soul_md()` returns content, it replaces the hardcoded `DEFAULT_AGENT_IDENTITY` . The `build_context_files_prompt()` function is then called with `skip_soul=True` to prevent SOUL.md from appearing twice (once as identity, once as a context file).
If `SOUL.md` doesn't exist, the system falls back to:
```
You are Hermes Agent, an intelligent AI assistant created by Nous Research.
You are helpful, knowledgeable, and direct. You assist users with a wide
range of tasks including answering questions, writing and editing code,
analyzing information, creative work, and executing actions via your tools.
You communicate clearly, admit uncertainty when appropriate, and prioritize
being genuinely useful over being verbose unless otherwise directed below.
Be targeted and efficient in your exploration and investigations.
```
## How context files are injected
`build_context_files_prompt()` uses a **priority system ** — only one project context type is loaded (first match wins):
```python
# From agent/prompt_builder.py (simplified)
def build_context_files_prompt(cwd=None, skip_soul=False):
cwd_path = Path(cwd).resolve()
# Priority: first match wins — only ONE project context loaded
project_context = (
_load_hermes_md(cwd_path) # 1. .hermes.md / HERMES.md (walks to git root)
or _load_agents_md(cwd_path) # 2. AGENTS.md (cwd only)
or _load_claude_md(cwd_path) # 3. CLAUDE.md (cwd only)
or _load_cursorrules(cwd_path) # 4. .cursorrules / .cursor/rules/*.mdc
)
sections = []
if project_context:
sections.append(project_context)
# SOUL.md from HERMES_HOME (independent of project context)
if not skip_soul:
soul_content = load_soul_md()
if soul_content:
sections.append(soul_content)
if not sections:
return ""
return (
"# Project Context\n\n"
"The following project context files have been loaded "
"and should be followed:\n\n"
+ "\n".join(sections)
)
```
### Context file discovery details
| Priority | Files | Search scope | Notes |
|----------|-------|-------------|-------|
| 1 | `.hermes.md` , `HERMES.md` | CWD up to git root | Hermes-native project config |
| 2 | `AGENTS.md` | CWD only | Common agent instruction file |
| 3 | `CLAUDE.md` | CWD only | Claude Code compatibility |
| 4 | `.cursorrules` , `.cursor/rules/*.mdc` | CWD only | Cursor compatibility |
All context files are:
- **Security scanned** — checked for prompt injection patterns (invisible unicode, "ignore previous instructions", credential exfiltration attempts)
- **Truncated** — capped at 20,000 characters using 70/20 head/tail ratio with a truncation marker
- **YAML frontmatter stripped** — `.hermes.md` frontmatter is removed (reserved for future config overrides)
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
## API-call-time-only layers
These are intentionally * not * persisted as part of the cached system prompt:
- `ephemeral_system_prompt`
- prefill messages
- gateway-derived session context overlays
- later-turn Honcho recall injected into the current-turn user message
This separation keeps the stable prefix stable for caching.
## Memory snapshots
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
## Context files
docs: fix stale and incorrect documentation across 18 files
Cross-referenced all 84 docs pages against the actual codebase and
corrected every discrepancy found.
Reference docs:
- faq.md: Fix non-existent commands (/stats→/usage, /context→/usage,
hermes models→hermes model, hermes config get→hermes config show,
hermes gateway logs→cat gateway.log, async→sync chat() call)
- cli-commands.md: Fix --provider choices list (remove providers not
in argparse), add undocumented -s/--skills flag
- slash-commands.md: Add missing /queue and /resume commands, fix
/approve args_hint to show [session|always]
- tools-reference.md: Remove duplicate vision and web toolset sections
- environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add
copilot-acp, remove alibaba to match actual argparse choices)
Configuration & user guide:
- configuration.md: Fix approval_mode→approvals.mode (manual not ask),
checkpoints.enabled default true not false, human_delay defaults
(500/2000→800/2500), remove non-existent delegation.max_iterations
and delegation.default_toolsets, fix website_blocklist nesting
under security:, add .hermes.md and CLAUDE.md to context files
table with priority system explanation
- security.md: Fix website_blocklist nesting under security:
- context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support,
document priority-based first-match-wins loading behavior
- cli.md: Fix personalities config nesting (top-level, not under agent:)
- delegation.md: Fix model override docs (config-level, not per-call
tool parameter)
- rl-training.md: Fix log directory (tinker-atropos/logs/→
~/.hermes/logs/rl_training/)
- tts.md: Fix Discord delivery format (voice bubble with fallback,
not just file attachment)
- git-worktrees.md: Remove outdated v0.2.0 version reference
Developer guide:
- prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority
system for context files
- agent-loop.md: Fix callback list (remove non-existent
message_callback, add stream_delta_callback, tool_gen_callback,
status_callback)
Messaging & guides:
- webhooks.md: Fix command (hermes setup gateway→hermes gateway setup)
- tips.md: Fix session idle timeout (120min→24h), config file
(gateway.json→config.yaml)
- build-a-hermes-plugin.md: Fix plugin.yaml provides: format
(provides_tools/provides_hooks as lists), note register_command()
as not yet implemented
2026-03-24 07:53:07 -07:00
`agent/prompt_builder.py` scans and sanitizes project context files using a **priority system ** — only one type is loaded (first match wins):
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
docs: fix stale and incorrect documentation across 18 files
Cross-referenced all 84 docs pages against the actual codebase and
corrected every discrepancy found.
Reference docs:
- faq.md: Fix non-existent commands (/stats→/usage, /context→/usage,
hermes models→hermes model, hermes config get→hermes config show,
hermes gateway logs→cat gateway.log, async→sync chat() call)
- cli-commands.md: Fix --provider choices list (remove providers not
in argparse), add undocumented -s/--skills flag
- slash-commands.md: Add missing /queue and /resume commands, fix
/approve args_hint to show [session|always]
- tools-reference.md: Remove duplicate vision and web toolset sections
- environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add
copilot-acp, remove alibaba to match actual argparse choices)
Configuration & user guide:
- configuration.md: Fix approval_mode→approvals.mode (manual not ask),
checkpoints.enabled default true not false, human_delay defaults
(500/2000→800/2500), remove non-existent delegation.max_iterations
and delegation.default_toolsets, fix website_blocklist nesting
under security:, add .hermes.md and CLAUDE.md to context files
table with priority system explanation
- security.md: Fix website_blocklist nesting under security:
- context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support,
document priority-based first-match-wins loading behavior
- cli.md: Fix personalities config nesting (top-level, not under agent:)
- delegation.md: Fix model override docs (config-level, not per-call
tool parameter)
- rl-training.md: Fix log directory (tinker-atropos/logs/→
~/.hermes/logs/rl_training/)
- tts.md: Fix Discord delivery format (voice bubble with fallback,
not just file attachment)
- git-worktrees.md: Remove outdated v0.2.0 version reference
Developer guide:
- prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority
system for context files
- agent-loop.md: Fix callback list (remove non-existent
message_callback, add stream_delta_callback, tool_gen_callback,
status_callback)
Messaging & guides:
- webhooks.md: Fix command (hermes setup gateway→hermes gateway setup)
- tips.md: Fix session idle timeout (120min→24h), config file
(gateway.json→config.yaml)
- build-a-hermes-plugin.md: Fix plugin.yaml provides: format
(provides_tools/provides_hooks as lists), note register_command()
as not yet implemented
2026-03-24 07:53:07 -07:00
1. `.hermes.md` / `HERMES.md` (walks to git root)
feat: progressive subdirectory hint discovery (#5291)
As the agent navigates into subdirectories via tool calls (read_file,
terminal, search_files, etc.), automatically discover and load project
context files (AGENTS.md, CLAUDE.md, .cursorrules) from those directories.
Previously, context files were only loaded from the CWD at session start.
If the agent moved into backend/, frontend/, or any subdirectory with its
own AGENTS.md, those instructions were never seen.
Now, SubdirectoryHintTracker watches tool call arguments for file paths
and shell commands, resolves directories, and loads hint files on first
access. Discovered hints are appended to the tool result so the model
gets relevant context at the moment it starts working in a new area —
without modifying the system prompt (preserving prompt caching).
Features:
- Extracts paths from tool args (path, workdir) and shell commands
- Loads AGENTS.md, CLAUDE.md, .cursorrules (first match per directory)
- Deduplicates — each directory loaded at most once per session
- Ignores paths outside the working directory
- Truncates large hint files at 8K chars
- Works on both sequential and concurrent tool execution paths
Inspired by Block/goose SubdirectoryHintTracker.
2026-04-05 12:33:47 -07:00
2. `AGENTS.md` (CWD at startup; subdirectories discovered progressively during the session via `agent/subdirectory_hints.py` )
docs: fix stale and incorrect documentation across 18 files
Cross-referenced all 84 docs pages against the actual codebase and
corrected every discrepancy found.
Reference docs:
- faq.md: Fix non-existent commands (/stats→/usage, /context→/usage,
hermes models→hermes model, hermes config get→hermes config show,
hermes gateway logs→cat gateway.log, async→sync chat() call)
- cli-commands.md: Fix --provider choices list (remove providers not
in argparse), add undocumented -s/--skills flag
- slash-commands.md: Add missing /queue and /resume commands, fix
/approve args_hint to show [session|always]
- tools-reference.md: Remove duplicate vision and web toolset sections
- environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add
copilot-acp, remove alibaba to match actual argparse choices)
Configuration & user guide:
- configuration.md: Fix approval_mode→approvals.mode (manual not ask),
checkpoints.enabled default true not false, human_delay defaults
(500/2000→800/2500), remove non-existent delegation.max_iterations
and delegation.default_toolsets, fix website_blocklist nesting
under security:, add .hermes.md and CLAUDE.md to context files
table with priority system explanation
- security.md: Fix website_blocklist nesting under security:
- context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support,
document priority-based first-match-wins loading behavior
- cli.md: Fix personalities config nesting (top-level, not under agent:)
- delegation.md: Fix model override docs (config-level, not per-call
tool parameter)
- rl-training.md: Fix log directory (tinker-atropos/logs/→
~/.hermes/logs/rl_training/)
- tts.md: Fix Discord delivery format (voice bubble with fallback,
not just file attachment)
- git-worktrees.md: Remove outdated v0.2.0 version reference
Developer guide:
- prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority
system for context files
- agent-loop.md: Fix callback list (remove non-existent
message_callback, add stream_delta_callback, tool_gen_callback,
status_callback)
Messaging & guides:
- webhooks.md: Fix command (hermes setup gateway→hermes gateway setup)
- tips.md: Fix session idle timeout (120min→24h), config file
(gateway.json→config.yaml)
- build-a-hermes-plugin.md: Fix plugin.yaml provides: format
(provides_tools/provides_hooks as lists), note register_command()
as not yet implemented
2026-03-24 07:53:07 -07:00
3. `CLAUDE.md` (CWD only)
4. `.cursorrules` / `.cursor/rules/*.mdc` (CWD only)
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
2026-03-18 04:18:08 -07:00
`SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice.
docs: add ACP and internal systems implementation guides
- add ACP user and developer docs covering setup, lifecycle, callbacks,
permissions, tool rendering, and runtime behavior
- add developer guides for agent loop, provider runtime resolution,
prompt assembly, context caching/compression, gateway internals,
session storage, tools runtime, trajectories, and cron internals
- refresh architecture, quickstart, installation, CLI reference, and
environments docs to link the new implementation pages and ACP support
2026-03-14 00:29:48 -07:00
Long files are truncated before injection.
## Skills index
The skills system contributes a compact skills index to the prompt when skills tooling is available.
## Why prompt assembly is split this way
The architecture is intentionally optimized to:
- preserve provider-side prompt caching
- avoid mutating history unnecessarily
- keep memory semantics understandable
- let gateway/ACP/CLI add context without poisoning persistent prompt state
## Related docs
- [Context Compression & Prompt Caching ](./context-compression-and-caching.md )
- [Session Storage ](./session-storage.md )
- [Gateway Internals ](./gateway-internals.md )