This commit is contained in:
170
ARCHITECTURE.md
Normal file
170
ARCHITECTURE.md
Normal file
@@ -0,0 +1,170 @@
|
||||
# Architecture
|
||||
|
||||
High-level system design of the Hermes/Timmy sovereign AI agent framework.
|
||||
|
||||
## Layers
|
||||
|
||||
The system has three layers, top to bottom:
|
||||
|
||||
```
|
||||
SOUL.md (Bitcoin) Immutable moral framework, on-chain inscription
|
||||
|
|
||||
~/.timmy/ (Sovereign) Identity, specs, papers, evolution tracking
|
||||
|
|
||||
~/.hermes/ (Operational) Running agent, profiles, skills, cron, sessions
|
||||
|
|
||||
Fleet (VPS Agents) Ezra, Bezalel, Allegro — remote workers, Gitea, Ansible
|
||||
```
|
||||
|
||||
## Core Components
|
||||
|
||||
### Agent Loop (run_agent.py)
|
||||
|
||||
Synchronous, tool-call driven conversation loop. The AIAgent class manages:
|
||||
- API call budget with iteration tracking
|
||||
- Context compression (automatic when window fills)
|
||||
- Checkpoint system (max 50 snapshots)
|
||||
- Trajectory saving for training
|
||||
- Tool use enforcement for models that describe tools instead of calling them
|
||||
|
||||
```
|
||||
while api_call_count < max_iterations:
|
||||
response = LLM(messages, tools)
|
||||
if response.tool_calls:
|
||||
for call in response.tool_calls:
|
||||
result = handle(call)
|
||||
messages.append(result)
|
||||
else:
|
||||
return response.content
|
||||
```
|
||||
|
||||
### Tool System
|
||||
|
||||
Central singleton registry with 47 static tools across 21+ toolsets, plus dynamic MCP tools.
|
||||
|
||||
Key mechanisms:
|
||||
- **Approval system** — manual/smart/off modes, dangerous command detection
|
||||
- **Composite toolsets** — e.g., debugging = terminal + web + file
|
||||
- **Subagent delegation** — isolated contexts, max depth 2, max 3 concurrent
|
||||
- **Mixture of Agents** — routes through 4+ frontier LLMs, synthesizes responses
|
||||
- **Terminal backends** — local, docker, ssh, modal, daytona, singularity
|
||||
|
||||
### Gateway (Multi-Platform)
|
||||
|
||||
25 messaging platform adapters in `gateway/run.py` (8,852 lines):
|
||||
|
||||
telegram, discord, slack, whatsapp, homeassistant, signal, matrix,
|
||||
mattermost, dingtalk, feishu, wecom, weixin, sms, email, webhook,
|
||||
bluebubbles, + API server
|
||||
|
||||
Each platform has its own adapter implementing BasePlatformAdapter.
|
||||
|
||||
### Profiles
|
||||
|
||||
15+ named agent configurations in `~/.hermes/profiles/<name>/`. Each profile is self-contained:
|
||||
- Own config.yaml, SOUL.md, skills/, auth.json
|
||||
- Own state.db, memory_store.db, sessions/
|
||||
- Isolated credentials and tool access
|
||||
|
||||
### Cron Integration
|
||||
|
||||
File-based lock scheduler, gateway calls tick() every 60 seconds.
|
||||
- Jobs in `~/.hermes/cron/jobs.json`
|
||||
- Supports SILENT_MARKER for no-news suppression
|
||||
- Delivery to 15 platforms auto-resolved from origin
|
||||
|
||||
### Context Compression
|
||||
|
||||
ContextCompressor with 5-step pipeline:
|
||||
1. Prune old tool results (cheap)
|
||||
2. Protect head messages (system prompt + first exchange)
|
||||
3. Protect tail by token budget (~20K tokens)
|
||||
4. Summarize middle turns with auxiliary LLM
|
||||
5. Iterative summary updates on subsequent compactions
|
||||
|
||||
### Auxiliary Client Router
|
||||
|
||||
Multi-provider resolution chain with automatic fallback:
|
||||
- Text: OpenRouter → Nous Portal → Custom → Codex OAuth → Anthropic → Direct providers
|
||||
- Vision: Selected provider → OpenRouter → Nous Portal → Codex → Anthropic → Custom
|
||||
- Auto-fallback on 402/credit-exhaustion
|
||||
|
||||
## Data Flow
|
||||
|
||||
```
|
||||
User Message
|
||||
|
|
||||
v
|
||||
Gateway (platform adapter)
|
||||
|
|
||||
v
|
||||
Session Store (SQLite, state.db)
|
||||
|
|
||||
v
|
||||
Agent Loop (run_agent.py)
|
||||
|
|
||||
+---> Tool Registry (47 tools + MCP)
|
||||
| |
|
||||
| +---> Terminal (local/docker/ssh/modal)
|
||||
| +---> File System
|
||||
| +---> Web (search, browse, scrape)
|
||||
| +---> Memory (holographic, fact_store)
|
||||
| +---> Subagents (delegated, isolated)
|
||||
|
|
||||
+---> Auxiliary Client (vision, compression, search)
|
||||
|
|
||||
+---> Context Compressor (if window full)
|
||||
|
|
||||
v
|
||||
Response → Gateway → Platform → User
|
||||
```
|
||||
|
||||
## SOUL.md → Architecture Mapping
|
||||
|
||||
| SOUL.md Value | Architectural Mechanism |
|
||||
|------------------------|------------------------------------------------|
|
||||
| Sovereignty | Local-first, no phone-home, forkable code |
|
||||
| Service | Tool system, multi-platform gateway |
|
||||
| Honesty | Source distinction, refusal over fabrication |
|
||||
| Humility | Small-model support, graceful degradation |
|
||||
| Courage | Crisis detection, dark content handling |
|
||||
| Silence | SILENT_MARKER in cron, brevity defaults |
|
||||
| When a Man Is Dying | Crisis protocol integration, 988 routing |
|
||||
|
||||
## External Dependencies
|
||||
|
||||
| Component | Dependency | Sovereignty Posture |
|
||||
|------------------------|-------------------|------------------------------|
|
||||
| LLM Inference | OpenRouter/Nous | Fallback to local Ollama |
|
||||
| Vision | Provider chain | Local Gemma 3 available |
|
||||
| Messaging | Platform APIs | 25 adapters, no lock-in |
|
||||
| Storage | SQLite (local) | Full control |
|
||||
| Deployment | Ansible (local) | Sovereign, no cloud CI |
|
||||
| Source Control | Gitea (self-host) | Full control |
|
||||
|
||||
## Novel Contributions
|
||||
|
||||
1. **On-Chain Soul** — Moral framework inscribed on Bitcoin as immutable conscience. Values as permanent, forkable inscription rather than mutable system prompt.
|
||||
|
||||
2. **Poka-Yoke Guardrails** — Five lightweight runtime guardrails eliminating entire failure categories (1,400+ failures prevented). Paper-ready for NeurIPS/ICML.
|
||||
|
||||
3. **Sovereign Fleet Architecture** — Declarative deployment for heterogeneous agent fleets. 45min manual → 47s automated with Ansible pipeline.
|
||||
|
||||
4. **Source Distinction** — Three-tier provenance tagging (retrieved/generated/mixed) for epistemic honesty in LLM outputs.
|
||||
|
||||
5. **Refusal Over Fabrication** — Detecting and preventing ungrounded hedging in LLM responses.
|
||||
|
||||
## What's Undocumented
|
||||
|
||||
Known documentation gaps (opportunities for future work):
|
||||
- Profiles system (creation, isolation guarantees)
|
||||
- Skills Hub registry protocol
|
||||
- Fleet routing logic
|
||||
- Checkpoint system mechanics
|
||||
- Per-profile credential isolation
|
||||
|
||||
---
|
||||
|
||||
*For detailed code-level analysis, see [hermes-agent-architecture-report.md](hermes-agent-architecture-report.md).*
|
||||
|
||||
*Sovereignty and service always.*
|
||||
Reference in New Issue
Block a user