---
sidebar_position: 7
title: "Gateway Internals"
description: "How the messaging gateway boots, authorizes users, routes sessions, and delivers messages"
---

# Gateway Internals

The messaging gateway is the long-running process that connects Hermes to 14+ external messaging platforms through a unified architecture.

## Key Files

| File | Purpose |
|------|---------|
| `gateway/run.py` | `GatewayRunner` — main loop, slash commands, message dispatch (~7,200 lines) |
| `gateway/session.py` | `SessionStore` — conversation persistence and session key construction |
| `gateway/delivery.py` | Outbound message delivery to target platforms/channels |
| `gateway/pairing.py` | DM pairing flow for user authorization |
| `gateway/channel_directory.py` | Maps chat IDs to human-readable names for cron delivery |
| `gateway/hooks.py` | Hook discovery, loading, and lifecycle event dispatch |
| `gateway/mirror.py` | Cross-session message mirroring for `send_message` |
| `gateway/status.py` | Token lock management for profile-scoped gateway instances |
| `gateway/builtin_hooks/` | Always-registered hooks (e.g., BOOT.md system prompt hook) |
| `gateway/platforms/` | Platform adapters (one per messaging platform) |

## Architecture Overview

```text
┌─────────────────────────────────────────────────┐
│                  GatewayRunner                  │
│                                                 │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐       │
│  │ Telegram │  │ Discord  │  │  Slack   │  ...  │
│  │ Adapter  │  │ Adapter  │  │ Adapter  │       │
│  └─────┬────┘  └─────┬────┘  └─────┬────┘       │
│        │             │             │            │
│        └─────────────┼─────────────┘            │
│                      ▼                          │
│              _handle_message()                  │
│                      │                          │
│         ┌────────────┼────────────┐             │
│         ▼            ▼            ▼             │
│   Slash command   AIAgent     Queue/BG          │
│     dispatch     creation     sessions          │
│                      │                          │
│                      ▼                          │
│                SessionStore                     │
│            (SQLite persistence)                 │
└─────────────────────────────────────────────────┘
```

## Message Flow

When a message arrives from any platform:

1. **Platform adapter** receives the raw event and normalizes it into a `MessageEvent`
2. **Base adapter** checks the active-session guard:
   - If an agent is running for this session → queue the message, set the interrupt event
   - If the message is `/approve`, `/deny`, or `/stop` → bypass the guard (dispatched inline)
3. **GatewayRunner._handle_message()** receives the event:
   - Resolve the session key via `_session_key_for_source()` (format: `agent:main:{platform}:{chat_type}:{chat_id}`)
   - Check authorization (see Authorization below)
   - Check if it's a slash command → dispatch to the command handler
   - Check if an agent is already running → intercept commands like `/stop`, `/status`
   - Otherwise → create an `AIAgent` instance and run the conversation
4. **Response** is sent back through the platform adapter

### Session Key Format

Session keys encode the full routing context:

```text
agent:main:{platform}:{chat_type}:{chat_id}
```

For example: `agent:main:telegram:private:123456789`

Thread-aware platforms (Telegram forum topics, Discord threads, Slack threads) may include thread IDs in the `chat_id` portion. **Never construct session keys manually** — always use `build_session_key()` from `gateway/session.py`.

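
For logging or debugging it can help to split a key back into its fields. The helper below is a hypothetical sketch and not part of `gateway/session.py`. It assumes the five-field layout shown above and that anything after the fourth colon belongs to `chat_id`. The field names `prefix` and `agent` are illustrative labels for the literal `agent:main` portion.

```python
from typing import NamedTuple


class SessionKeyParts(NamedTuple):
    """Fields of a session key (hypothetical helper type, for inspection only)."""
    prefix: str
    agent: str
    platform: str
    chat_type: str
    chat_id: str


def parse_session_key(key: str) -> SessionKeyParts:
    """Split a key of the form agent:main:{platform}:{chat_type}:{chat_id}.

    maxsplit=4 keeps any further colons inside the chat_id field.
    """
    parts = key.split(":", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected session key: {key!r}")
    return SessionKeyParts(*parts)


parts = parse_session_key("agent:main:telegram:private:123456789")
print(parts.platform, parts.chat_type, parts.chat_id)
```

This is read-only inspection. Building keys should still go through `build_session_key()`.
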
### Two-Level Message Guard

When an agent is actively running, incoming messages pass through two sequential guards:

1. **Level 1 — Base adapter** (`gateway/platforms/base.py`): Checks `_active_sessions`. If the session is active, queues the message in `_pending_messages` and sets an interrupt event. This catches messages *before* they reach the gateway runner.

2. **Level 2 — Gateway runner** (`gateway/run.py`): Checks `_running_agents`. Intercepts specific commands (`/stop`, `/new`, `/queue`, `/status`, `/approve`, `/deny`) and routes them appropriately. Everything else triggers `running_agent.interrupt()`.

Commands that must reach the runner while the agent is blocked (like `/approve`) are dispatched **inline** via `await self._message_handler(event)` — they bypass the background task system to avoid race conditions.

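
The two guards can be modelled as two small decision functions. This is a simplified, synchronous sketch. The class name, the `route()` and `runner_action()` helpers, and the returned labels are hypothetical. Only the attribute names `_active_sessions` and `_pending_messages` and the command lists come from the description above.

```python
BYPASS_COMMANDS = {"/approve", "/deny", "/stop"}
RUNNER_INTERCEPTS = {"/stop", "/new", "/queue", "/status", "/approve", "/deny"}


def _command_of(text: str):
    """Return the leading slash command of a message, or None."""
    words = text.split()
    return words[0] if words and words[0].startswith("/") else None


class AdapterGuardSketch:
    """Level 1: illustrative stand-in for the base adapter's session guard."""

    def __init__(self):
        self._active_sessions = set()
        self._pending_messages = {}
        self.interrupted = set()  # stands in for the interrupt event

    def route(self, session_key: str, text: str) -> str:
        if session_key not in self._active_sessions:
            return "forward"  # no agent running: hand straight to the runner
        if _command_of(text) in BYPASS_COMMANDS:
            return "inline"  # dispatched inline, never queued
        self._pending_messages.setdefault(session_key, []).append(text)
        self.interrupted.add(session_key)
        return "queued"


def runner_action(text: str, agent_running: bool) -> str:
    """Level 2: what the runner does with a message that reaches it."""
    if not agent_running:
        return "start_agent"
    if _command_of(text) in RUNNER_INTERCEPTS:
        return "handle_command"
    return "interrupt"


guard = AdapterGuardSketch()
guard._active_sessions.add("agent:main:telegram:private:1")
print(guard.route("agent:main:telegram:private:1", "one more thing"))
print(guard.route("agent:main:telegram:private:1", "/approve"))
```
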
## Authorization

The gateway uses a multi-layer authorization check, evaluated in order:

1. **Gateway-wide allow-all** (`GATEWAY_ALLOW_ALL_USERS`) — if set, all users are authorized
2. **Platform allowlist** (e.g., `TELEGRAM_ALLOWED_USERS`) — comma-separated user IDs
3. **DM pairing** — authenticated users can pair new users via a pairing code
4. **Admin escalation** — some commands require admin status beyond basic authorization

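
The first three layers can be sketched as a single ordered check. This is an illustration under assumptions, not the gateway's code: the function name is hypothetical, the set of values treated as "set" is a guess, and the `{PLATFORM}_ALLOWED_USERS` naming is extrapolated from the Telegram example. Admin escalation is a per-command check and is left out.

```python
def is_authorized(user_id: str, platform: str, env: dict, paired_users: set) -> bool:
    """Evaluate allow-all, then the platform allowlist, then DM pairing."""
    # Layer 1: gateway-wide allow-all (truthy values are an assumption).
    if env.get("GATEWAY_ALLOW_ALL_USERS", "").strip().lower() in {"1", "true", "yes"}:
        return True
    # Layer 2: comma-separated allowlist for this platform.
    raw = env.get(f"{platform.upper()}_ALLOWED_USERS", "")
    allowed = {entry.strip() for entry in raw.split(",") if entry.strip()}
    if user_id in allowed:
        return True
    # Layer 3: users who were paired earlier.
    return user_id in paired_users


env = {"TELEGRAM_ALLOWED_USERS": "111, 222"}
print(is_authorized("222", "telegram", env, paired_users=set()))
print(is_authorized("333", "telegram", env, paired_users={"333"}))
print(is_authorized("444", "telegram", env, paired_users=set()))
```

In a real process `env` would be `os.environ`. Passing it in keeps the sketch easy to test.
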
### DM Pairing Flow

```text
Admin:    /pair
Gateway:  "Pairing code: ABC123. Share with the user."
New user: ABC123
Gateway:  "Paired! You're now authorized."
```

Pairing state is managed by `gateway/pairing.py`; it is persisted and survives restarts.

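
A minimal sketch of how one-time codes and restart-safe state could fit together is shown below. Everything here is hypothetical: the class, the JSON file, and the six-character code format are assumptions made for illustration, and the storage used by `gateway/pairing.py` is not described in this page.

```python
import json
import secrets
import string
import tempfile
from pathlib import Path

ALPHABET = string.ascii_uppercase + string.digits


class PairingStoreSketch:
    """Issue one-time codes and remember paired users in a JSON file."""

    def __init__(self, path: Path):
        self.path = path
        self.state = {"pending": [], "paired": []}
        if path.exists():
            self.state = json.loads(path.read_text())

    def _save(self) -> None:
        self.path.write_text(json.dumps(self.state))

    def issue_code(self) -> str:
        code = "".join(secrets.choice(ALPHABET) for _ in range(6))
        self.state["pending"].append(code)
        self._save()
        return code

    def redeem(self, code: str, user_id: str) -> bool:
        if code not in self.state["pending"]:
            return False
        self.state["pending"].remove(code)  # codes are single-use
        self.state["paired"].append(user_id)
        self._save()
        return True


store_path = Path(tempfile.mkdtemp()) / "pairing.json"
store = PairingStoreSketch(store_path)
code = store.issue_code()
store.redeem(code, "new-user")

# A fresh instance reads the same file, which simulates a gateway restart.
print("new-user" in PairingStoreSketch(store_path).state["paired"])
```
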
## Slash Command Dispatch

All slash commands in the gateway flow through the same resolution pipeline:

1. `resolve_command()` from `hermes_cli/commands.py` maps input to canonical name (handles aliases, prefix matching)
2. The canonical name is checked against `GATEWAY_KNOWN_COMMANDS`
3. Handler in `_handle_message()` dispatches based on canonical name
4. Some commands are gated on config (`gateway_config_gate` on `CommandDef`)

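
Step 1 can be illustrated with a toy resolver. This sketch does not reproduce `resolve_command()`: the command table, the aliases, and the choice to return `None` for ambiguous prefixes are all assumptions made for the example.

```python
# Hypothetical registry: canonical name -> aliases.
COMMANDS = {
    "model": ("m",),
    "new": ("reset",),
    "status": (),
    "stop": (),
}


def resolve_command_sketch(text: str):
    """Map '/model', '/m', or '/mod' to 'model'; None if unknown or ambiguous."""
    words = text.lstrip("/").split()
    if not words:
        return None
    name = words[0].lower()
    if name in COMMANDS:  # exact canonical name
        return name
    for canonical, aliases in COMMANDS.items():
        if name in aliases:  # alias lookup
            return canonical
    matches = [c for c in COMMANDS if c.startswith(name)]
    return matches[0] if len(matches) == 1 else None  # unique prefix only


for raw in ("/model gpt", "/m", "/mod", "/st", "/reset"):
    print(raw, "->", resolve_command_sketch(raw))
```

In this toy table `/st` matches both `status` and `stop`, so the sketch declines to guess.
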
### Running-Agent Guard

Commands that must NOT execute while the agent is processing are rejected early:

```python
if _quick_key in self._running_agents:
    if canonical == "model":
        return "⏳ Agent is running — wait for it to finish or /stop first."
```

Bypass commands (`/stop`, `/new`, `/approve`, `/deny`, `/queue`, `/status`) have special handling.

## Config Sources

The gateway reads configuration from multiple sources:

| Source | What it provides |
|--------|-----------------|
| `~/.hermes/.env` | API keys, bot tokens, platform credentials |
| `~/.hermes/config.yaml` | Model settings, tool configuration, display options |
| Environment variables | Override any of the above |

Unlike the CLI (which uses `load_cli_config()` with hardcoded defaults), the gateway reads `config.yaml` directly with a YAML loader. As a result, a config key that exists in the CLI's defaults dict but is missing from the user's config file has a value in the CLI and no value in the gateway, so the two may behave differently.

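
The difference is easiest to see with plain dictionaries standing in for the parsed YAML. The defaults, key names, and both loader functions below are hypothetical. They only model "merge over defaults" versus "use the file as-is".

```python
# Hypothetical stand-in for the CLI's hardcoded defaults.
CLI_DEFAULTS = {
    "display": {"compact": False},
    "model": {"name": "default-model"},
}


def load_cli_style(user_config: dict) -> dict:
    """Merge the user's config over the defaults, section by section."""
    merged = {section: dict(values) for section, values in CLI_DEFAULTS.items()}
    for section, values in user_config.items():
        merged.setdefault(section, {}).update(values)
    return merged


def load_gateway_style(user_config: dict) -> dict:
    """Use the parsed file as-is, with no defaults filled in."""
    return user_config


# Parsed config.yaml that sets a model but has no display section.
user_config = {"model": {"name": "my-model"}}

cli = load_cli_style(user_config)
gateway = load_gateway_style(user_config)

print(cli["display"]["compact"])                  # default filled in by the merge
print(gateway.get("display", {}).get("compact"))  # key is simply absent
```

Code on the gateway side therefore needs its own fallback for any key it reads, for example with `dict.get()` and an explicit default.
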
## Platform Adapters

Each messaging platform has an adapter in `gateway/platforms/`:

```text
gateway/platforms/
├── base.py              # BaseAdapter — shared logic for all platforms
├── telegram.py          # Telegram Bot API (long polling or webhook)
├── discord.py           # Discord bot via discord.py
├── slack.py             # Slack Socket Mode
├── whatsapp.py          # WhatsApp Business Cloud API
├── signal.py            # Signal via signal-cli REST API
├── matrix.py            # Matrix via matrix-nio (optional E2EE)
├── mattermost.py        # Mattermost WebSocket API
├── email_adapter.py     # Email via IMAP/SMTP
├── sms.py               # SMS via Twilio
├── dingtalk.py          # DingTalk WebSocket
├── feishu.py            # Feishu/Lark WebSocket or webhook
├── wecom.py             # WeCom (WeChat Work) callback
└── homeassistant.py     # Home Assistant conversation integration
```

Adapters implement a common interface:
- `connect()` / `disconnect()` — lifecycle management
- `send_message()` — outbound message delivery
- `on_message()` — inbound message normalization → `MessageEvent`

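
The shape of that interface can be sketched with an abstract base class and a toy in-memory adapter. The class names, the method signatures, the `MessageEventSketch` fields, and the choice to make the I/O methods coroutines are all assumptions. The real `BaseAdapter` in `gateway/platforms/base.py` may differ.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class MessageEventSketch:
    """Hypothetical normalized event; the real MessageEvent has its own fields."""
    platform: str
    chat_id: str
    text: str


class AdapterSketch(ABC):
    @abstractmethod
    async def connect(self) -> None: ...

    @abstractmethod
    async def disconnect(self) -> None: ...

    @abstractmethod
    async def send_message(self, chat_id: str, text: str) -> None: ...

    @abstractmethod
    def on_message(self, raw: dict) -> MessageEventSketch: ...


class EchoAdapter(AdapterSketch):
    """Toy adapter that records outbound messages instead of sending them."""

    def __init__(self):
        self.connected = False
        self.sent = []

    async def connect(self) -> None:
        self.connected = True

    async def disconnect(self) -> None:
        self.connected = False

    async def send_message(self, chat_id: str, text: str) -> None:
        self.sent.append((chat_id, text))

    def on_message(self, raw: dict) -> MessageEventSketch:
        # Normalize a platform-specific payload into the shared event type.
        return MessageEventSketch("echo", str(raw["chat"]), raw["body"])


async def demo():
    adapter = EchoAdapter()
    await adapter.connect()
    event = adapter.on_message({"chat": 42, "body": "hi"})
    await adapter.send_message(event.chat_id, f"echo: {event.text}")
    await adapter.disconnect()
    return adapter.sent


print(asyncio.run(demo()))
```
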
### Token Locks

Adapters that connect with unique credentials call `acquire_scoped_lock()` in `connect()` and `release_scoped_lock()` in `disconnect()`. This prevents two profiles from using the same bot token simultaneously.

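
One way such a lock can work is an exclusively created lock file keyed by a hash of the credential. The sketch below is illustrative only. The function names, arguments, and lock directory are hypothetical and deliberately differ from `acquire_scoped_lock()` and `release_scoped_lock()`, whose signatures are not shown in this page.

```python
import hashlib
import os
import tempfile
from pathlib import Path

LOCK_DIR = Path(tempfile.mkdtemp())  # hypothetical lock directory


def _lock_path(scope: str, identity: str) -> Path:
    # Hash the credential so the raw token never appears in a file name.
    digest = hashlib.sha256(identity.encode()).hexdigest()[:16]
    return LOCK_DIR / f"{scope}-{digest}.lock"


def acquire_lock_sketch(scope: str, identity: str, owner: str) -> bool:
    """Return True if this owner now holds the lock, False if it is taken."""
    try:
        # O_EXCL makes creation fail when the lock file already exists.
        fd = os.open(_lock_path(scope, identity), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(owner)
    return True


def release_lock_sketch(scope: str, identity: str, owner: str) -> None:
    """Remove the lock, but only if this owner created it."""
    path = _lock_path(scope, identity)
    if path.exists() and path.read_text() == owner:
        path.unlink()


print(acquire_lock_sketch("telegram", "bot-token", "profile-a"))
print(acquire_lock_sketch("telegram", "bot-token", "profile-b"))
release_lock_sketch("telegram", "bot-token", "profile-a")
print(acquire_lock_sketch("telegram", "bot-token", "profile-b"))
```

A production lock would also need to detect stale files left behind by a crashed process, which this sketch does not attempt.
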
## Delivery Path

Outgoing deliveries (`gateway/delivery.py`) handle:

- **Direct reply** — send response back to the originating chat
- **Home channel delivery** — route cron job outputs and background results to a configured home channel
- **Explicit target delivery** — `send_message` tool specifying `telegram:-1001234567890`
- **Cross-platform delivery** — deliver to a different platform than the originating message

Cron job deliveries are NOT mirrored into gateway session history — they live in their own cron session only. This is a deliberate design choice to avoid message alternation violations.

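
An explicit target such as `telegram:-1001234567890` has a `platform:chat_id` shape. The helpers below are a hypothetical sketch of splitting such a string and of telling direct from cross-platform delivery. They are not functions from `gateway/delivery.py`.

```python
def parse_target(target: str):
    """Split 'platform:chat_id' at the first colon."""
    platform, sep, chat_id = target.partition(":")
    if not sep or not platform or not chat_id:
        raise ValueError(f"expected 'platform:chat_id', got {target!r}")
    return platform, chat_id


def is_cross_platform(origin_platform: str, target: str) -> bool:
    """True when the target lives on a different platform than the origin."""
    return parse_target(target)[0] != origin_platform


print(parse_target("telegram:-1001234567890"))
print(is_cross_platform("discord", "telegram:-1001234567890"))
```
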
## Hooks

Gateway hooks are Python modules that respond to lifecycle events:

### Gateway Hook Events

| Event | When fired |
|-------|-----------|
| `gateway:startup` | Gateway process starts |
| `session:start` | New conversation session begins |
| `session:end` | Session completes or times out |
| `session:reset` | User resets session with `/new` |
| `agent:start` | Agent begins processing a message |
| `agent:step` | Agent completes one tool-calling iteration |
| `agent:end` | Agent finishes and returns response |
| `command:*` | Any slash command is executed |

Hooks are discovered from `gateway/builtin_hooks/` (always active) and `~/.hermes/hooks/` (user-installed). Each hook is a directory with a `HOOK.yaml` manifest and `handler.py`.

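
Event dispatch with a wildcard such as `command:*` can be sketched with glob-style matching. The registry class, the handler signature, and the use of `fnmatch` are assumptions made for illustration. How `gateway/hooks.py` actually matches events and what `HOOK.yaml` contains are not shown here.

```python
from fnmatch import fnmatchcase


class HookRegistrySketch:
    """Map event patterns to handlers and fan events out to the matches."""

    def __init__(self):
        self._handlers = []  # list of (pattern, handler) pairs

    def register(self, pattern: str, handler) -> None:
        self._handlers.append((pattern, handler))

    def emit(self, event: str, context: dict) -> list:
        results = []
        for pattern, handler in self._handlers:
            if fnmatchcase(event, pattern):  # 'command:*' matches 'command:model'
                results.append(handler(event, context))
        return results


registry = HookRegistrySketch()
registry.register("command:*", lambda event, ctx: f"saw {event}")
registry.register("session:start", lambda event, ctx: f"hello {ctx['user']}")

print(registry.emit("command:model", {}))
print(registry.emit("session:start", {"user": "alice"}))
print(registry.emit("agent:end", {}))
```
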
## Memory Provider Integration

When a memory provider plugin (e.g., Honcho) is enabled:

1. Gateway creates an `AIAgent` per message with the session ID
2. The `MemoryManager` initializes the provider with the session context
3. Provider tools (e.g., `honcho_profile`, `viking_search`) are routed through:

   ```text
   AIAgent._invoke_tool()
     → self._memory_manager.handle_tool_call(name, args)
       → provider.handle_tool_call(name, args)
   ```

4. On session end/reset, `on_session_end()` fires for cleanup and final data flush

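
The routing chain in step 3 amounts to two levels of delegation. The classes and the `invoke_tool()` function below are hypothetical stand-ins. Only the method names `handle_tool_call` and `on_session_end` and the tool name `honcho_profile` come from the text above, and the `owns()` check is an assumed way of deciding which tools belong to the provider.

```python
class ProviderSketch:
    """Stand-in for a memory provider plugin."""

    tool_names = {"honcho_profile"}

    def __init__(self):
        self.flushed = False

    def handle_tool_call(self, name: str, args: dict) -> str:
        return f"{name} handled with {args}"

    def on_session_end(self) -> None:
        self.flushed = True  # final flush and cleanup would happen here


class MemoryManagerSketch:
    """Stand-in for the manager that sits between the agent and the provider."""

    def __init__(self, provider: ProviderSketch):
        self.provider = provider

    def owns(self, name: str) -> bool:
        return name in self.provider.tool_names

    def handle_tool_call(self, name: str, args: dict) -> str:
        return self.provider.handle_tool_call(name, args)


def invoke_tool(manager: MemoryManagerSketch, name: str, args: dict, builtin_tools: dict):
    """Route provider tools through the manager, everything else to built-ins."""
    if manager.owns(name):
        return manager.handle_tool_call(name, args)
    return builtin_tools[name](**args)


manager = MemoryManagerSketch(ProviderSketch())
builtin_tools = {"add": lambda a, b: a + b}

print(invoke_tool(manager, "honcho_profile", {"query": "prefs"}, builtin_tools))
print(invoke_tool(manager, "add", {"a": 1, "b": 2}, builtin_tools))
manager.provider.on_session_end()
```
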
### Memory Flush Lifecycle

When a session is reset, resumed, or expires:
1. Built-in memories are flushed to disk
2. Memory provider's `on_session_end()` hook fires
3. A temporary `AIAgent` runs a memory-only conversation turn
4. Context is then discarded or archived

## Background Maintenance

The gateway runs periodic maintenance alongside message handling:

- **Cron ticking** — checks job schedules and fires due jobs
- **Session expiry** — cleans up abandoned sessions after timeout
- **Memory flush** — proactively flushes memory before session expiry
- **Cache refresh** — refreshes model lists and provider status

## Process Management

The gateway runs as a long-lived process, managed via:

- `hermes gateway start` / `hermes gateway stop` — manual control
- `systemctl` (Linux) or `launchctl` (macOS) — service management
- PID file at `~/.hermes/gateway.pid` — profile-scoped process tracking

**Profile-scoped vs global**: `start_gateway()` uses profile-scoped PID files. `hermes gateway stop` stops only the current profile's gateway. `hermes gateway stop --all` uses global `ps aux` scanning to kill all gateway processes (used during updates).

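
A PID file is typically checked by reading the number and probing the process with signal 0. The function below is a generic POSIX sketch of that technique with a hypothetical name. It is not the check that `start_gateway()` performs, and it should not be used on Windows, where `os.kill` treats signal values differently.

```python
import os
import tempfile
from pathlib import Path


def gateway_is_running(pid_file: Path) -> bool:
    """Return True if the PID recorded in pid_file belongs to a live process."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # no PID file, or unreadable contents
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending a signal
    except ProcessLookupError:
        return False  # stale PID file: no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


# Demonstrate with this process's own PID in a temporary file.
pid_file = Path(tempfile.mkdtemp()) / "gateway.pid"
pid_file.write_text(str(os.getpid()))
print(gateway_is_running(pid_file))
```

A stale file left by a crashed gateway shows up as the `ProcessLookupError` branch, at which point the file can be removed safely.
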
## Related Docs

- [Session Storage](./session-storage.md)
- [Cron Internals](./cron-internals.md)
- [ACP Internals](./acp-internals.md)
- [Agent Loop Internals](./agent-loop.md)
- [Messaging Gateway (User Guide)](/docs/user-guide/messaging)