--- sidebar_position: 5 title: "Prompt Assembly" description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers" --- # Prompt Assembly Hermes deliberately separates: - **cached system prompt state** - **ephemeral API-call-time additions** This is one of the most important design choices in the project because it affects: - token usage - prompt caching effectiveness - session continuity - memory correctness Primary files: - `run_agent.py` - `agent/prompt_builder.py` - `tools/memory_tool.py` ## Cached system prompt layers The cached system prompt is assembled in roughly this order: 1. agent identity — `SOUL.md` from `HERMES_HOME` when available, otherwise falls back to `DEFAULT_AGENT_IDENTITY` in `prompt_builder.py` 2. tool-aware behavior guidance 3. Honcho static block (when active) 4. optional system message 5. frozen MEMORY snapshot 6. frozen USER profile snapshot 7. skills index 8. context files (`AGENTS.md`, `.cursorrules`, `.cursor/rules/*.mdc`) — SOUL.md is **not** included here when it was already loaded as the identity in step 1 9. timestamp / optional session ID 10. platform hint When `skip_context_files` is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded `DEFAULT_AGENT_IDENTITY` is used instead. ## API-call-time-only layers These are intentionally *not* persisted as part of the cached system prompt: - `ephemeral_system_prompt` - prefill messages - gateway-derived session context overlays - later-turn Honcho recall injected into the current-turn user message This separation keeps the stable prefix stable for caching. ## Memory snapshots Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs. ## Context files `agent/prompt_builder.py` scans and sanitizes project context files using a **priority system** — only one type is loaded (first match wins): 1. `.hermes.md` / `HERMES.md` (walks to git root) 2. `AGENTS.md` (recursive directory walk) 3. `CLAUDE.md` (CWD only) 4. `.cursorrules` / `.cursor/rules/*.mdc` (CWD only) `SOUL.md` is loaded separately via `load_soul_md()` for the identity slot. When it loads successfully, `build_context_files_prompt(skip_soul=True)` prevents it from appearing twice. Long files are truncated before injection. ## Skills index The skills system contributes a compact skills index to the prompt when skills tooling is available. ## Why prompt assembly is split this way The architecture is intentionally optimized to: - preserve provider-side prompt caching - avoid mutating history unnecessarily - keep memory semantics understandable - let gateway/ACP/CLI add context without poisoning persistent prompt state ## Related docs - [Context Compression & Prompt Caching](./context-compression-and-caching.md) - [Session Storage](./session-storage.md) - [Gateway Internals](./gateway-internals.md)