- add ACP user and developer docs covering setup, lifecycle, callbacks, permissions, tool rendering, and runtime behavior - add developer guides for agent loop, provider runtime resolution, prompt assembly, context caching/compression, gateway internals, session storage, tools runtime, trajectories, and cron internals - refresh architecture, quickstart, installation, CLI reference, and environments docs to link the new implementation pages and ACP support
86 lines
2.3 KiB
Markdown
86 lines
2.3 KiB
Markdown
---
|
|
sidebar_position: 5
|
|
title: "Prompt Assembly"
|
|
description: "How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers"
|
|
---
|
|
|
|
# Prompt Assembly
|
|
|
|
Hermes deliberately separates:
|
|
|
|
- **cached system prompt state**
|
|
- **ephemeral API-call-time additions**
|
|
|
|
This is one of the most important design choices in the project because it affects:
|
|
|
|
- token usage
|
|
- prompt caching effectiveness
|
|
- session continuity
|
|
- memory correctness
|
|
|
|
Primary files:
|
|
|
|
- `run_agent.py`
|
|
- `agent/prompt_builder.py`
|
|
- `tools/memory_tool.py`
|
|
|
|
## Cached system prompt layers
|
|
|
|
The cached system prompt is assembled in roughly this order:
|
|
|
|
1. default agent identity
|
|
2. tool-aware behavior guidance
|
|
3. Honcho static block (when active)
|
|
4. optional system message
|
|
5. frozen MEMORY snapshot
|
|
6. frozen USER profile snapshot
|
|
7. skills index
|
|
8. context files (`AGENTS.md`, `SOUL.md`, `.cursorrules`, `.cursor/rules/*.mdc`)
|
|
9. timestamp / optional session ID
|
|
10. platform hint
|
|
|
|
## API-call-time-only layers
|
|
|
|
These are intentionally *not* persisted as part of the cached system prompt:
|
|
|
|
- `ephemeral_system_prompt`
|
|
- prefill messages
|
|
- gateway-derived session context overlays
|
|
- later-turn Honcho recall injected into the current-turn user message
|
|
|
|
This separation keeps the stable prefix stable for caching.
|
|
|
|
## Memory snapshots
|
|
|
|
Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.
|
|
|
|
## Context files
|
|
|
|
`agent/prompt_builder.py` scans and sanitizes:
|
|
|
|
- `AGENTS.md`
|
|
- `SOUL.md`
|
|
- `.cursorrules`
|
|
- `.cursor/rules/*.mdc`
|
|
|
|
Long files are truncated before injection.
|
|
|
|
## Skills index
|
|
|
|
The skills system contributes a compact skills index to the prompt when skills tooling is available.
|
|
|
|
## Why prompt assembly is split this way
|
|
|
|
The architecture is intentionally optimized to:
|
|
|
|
- preserve provider-side prompt caching
|
|
- avoid mutating history unnecessarily
|
|
- keep memory semantics understandable
|
|
- let gateway/ACP/CLI add context without poisoning persistent prompt state
|
|
|
|
## Related docs
|
|
|
|
- [Context Compression & Prompt Caching](./context-compression-and-caching.md)
|
|
- [Session Storage](./session-storage.md)
|
|
- [Gateway Internals](./gateway-internals.md)
|