Files

Teknium ee3f3e756d docs: fix stale and incorrect documentation across 18 files

Cross-referenced all 84 docs pages against the actual codebase and
corrected every discrepancy found.

Reference docs:
- faq.md: Fix non-existent commands (/stats→/usage, /context→/usage,
  hermes models→hermes model, hermes config get→hermes config show,
  hermes gateway logs→cat gateway.log, async→sync chat() call)
- cli-commands.md: Fix --provider choices list (remove providers not
  in argparse), add undocumented -s/--skills flag
- slash-commands.md: Add missing /queue and /resume commands, fix
  /approve args_hint to show [session|always]
- tools-reference.md: Remove duplicate vision and web toolset sections
- environment-variables.md: Fix HERMES_INFERENCE_PROVIDER list (add
  copilot-acp, remove alibaba to match actual argparse choices)

Configuration & user guide:
- configuration.md: Fix approval_mode→approvals.mode (manual not ask),
  checkpoints.enabled default true not false, human_delay defaults
  (500/2000→800/2500), remove non-existent delegation.max_iterations
  and delegation.default_toolsets, fix website_blocklist nesting
  under security:, add .hermes.md and CLAUDE.md to context files
  table with priority system explanation
- security.md: Fix website_blocklist nesting under security:
- context-files.md: Add .hermes.md/HERMES.md and CLAUDE.md support,
  document priority-based first-match-wins loading behavior
- cli.md: Fix personalities config nesting (top-level, not under agent:)
- delegation.md: Fix model override docs (config-level, not per-call
  tool parameter)
- rl-training.md: Fix log directory (tinker-atropos/logs/→
  ~/.hermes/logs/rl_training/)
- tts.md: Fix Discord delivery format (voice bubble with fallback,
  not just file attachment)
- git-worktrees.md: Remove outdated v0.2.0 version reference

Developer guide:
- prompt-assembly.md: Add .hermes.md, CLAUDE.md, document priority
  system for context files
- agent-loop.md: Fix callback list (remove non-existent
  message_callback, add stream_delta_callback, tool_gen_callback,
  status_callback)

Messaging & guides:
- webhooks.md: Fix command (hermes setup gateway→hermes gateway setup)
- tips.md: Fix session idle timeout (120min→24h), config file
  (gateway.json→config.yaml)
- build-a-hermes-plugin.md: Fix plugin.yaml provides: format
  (provides_tools/provides_hooks as lists), note register_command()
  as not yet implemented

2026-03-24 07:53:07 -07:00

3.0 KiB

Raw Permalink Blame History

sidebar_position, title, description

sidebar_position	title	description
5	Prompt Assembly	How Hermes builds the system prompt, preserves cache stability, and injects ephemeral layers

Prompt Assembly

Hermes deliberately separates:

cached system prompt state
ephemeral API-call-time additions

This is one of the most important design choices in the project because it affects:

token usage
prompt caching effectiveness
session continuity
memory correctness

Primary files:

run_agent.py
agent/prompt_builder.py
tools/memory_tool.py

Cached system prompt layers

The cached system prompt is assembled in roughly this order:

agent identity — SOUL.md from HERMES_HOME when available, otherwise falls back to DEFAULT_AGENT_IDENTITY in prompt_builder.py
tool-aware behavior guidance
Honcho static block (when active)
optional system message
frozen MEMORY snapshot
frozen USER profile snapshot
skills index
context files (AGENTS.md, .cursorrules, .cursor/rules/*.mdc) — SOUL.md is not included here when it was already loaded as the identity in step 1
timestamp / optional session ID
platform hint

When skip_context_files is set (e.g., subagent delegation), SOUL.md is not loaded and the hardcoded DEFAULT_AGENT_IDENTITY is used instead.

API-call-time-only layers

These are intentionally not persisted as part of the cached system prompt:

ephemeral_system_prompt
prefill messages
gateway-derived session context overlays
later-turn Honcho recall injected into the current-turn user message

This separation keeps the stable prefix stable for caching.

Memory snapshots

Local memory and user profile data are injected as frozen snapshots at session start. Mid-session writes update disk state but do not mutate the already-built system prompt until a new session or forced rebuild occurs.

Context files

agent/prompt_builder.py scans and sanitizes project context files using a priority system — only one type is loaded (first match wins):

.hermes.md / HERMES.md (walks to git root)
AGENTS.md (recursive directory walk)
CLAUDE.md (CWD only)
.cursorrules / .cursor/rules/*.mdc (CWD only)

SOUL.md is loaded separately via load_soul_md() for the identity slot. When it loads successfully, build_context_files_prompt(skip_soul=True) prevents it from appearing twice.

Long files are truncated before injection.

Skills index

The skills system contributes a compact skills index to the prompt when skills tooling is available.

Why prompt assembly is split this way

The architecture is intentionally optimized to:

preserve provider-side prompt caching
avoid mutating history unnecessarily
keep memory semantics understandable
let gateway/ACP/CLI add context without poisoning persistent prompt state

3.0 KiB Raw Permalink Blame History