Enhanced documentation to reflect the new interactive setup command for configuring messaging platforms (Telegram, Discord, Slack, WhatsApp). Updated sections in AGENTS.md, README.md, and messaging.md to provide clear instructions on using the 'hermes gateway setup' command, improving user experience and accessibility for platform configuration.
598 lines
19 KiB
Markdown
598 lines
19 KiB
Markdown
# Messaging Platform Integrations (Gateway)
|
|
|
|
Hermes Agent can connect to messaging platforms like Telegram, Discord, and WhatsApp to serve as a conversational AI assistant.
|
|
|
|
## Quick Start
|
|
|
|
The easiest way to configure messaging is the interactive wizard:
|
|
|
|
```bash
|
|
hermes gateway setup # Configure Telegram, Discord, Slack, WhatsApp
|
|
```
|
|
|
|
This walks you through each platform with arrow-key selection, handles tokens, allowlists, and home channels, and offers to start/restart the gateway when done.
|
|
|
|
**Or configure manually** by editing `~/.hermes/.env`:
|
|
|
|
```bash
|
|
# Set your bot token(s)
|
|
echo 'TELEGRAM_BOT_TOKEN="your_telegram_bot_token"' >> ~/.hermes/.env
|
|
echo 'DISCORD_BOT_TOKEN="your_discord_bot_token"' >> ~/.hermes/.env
|
|
```
|
|
|
|
**Then start the gateway:**
|
|
|
|
```bash
|
|
hermes gateway # Run in foreground (useful for debugging)
|
|
hermes gateway install # Install as a system service (runs in background)
|
|
|
|
# Manage the service
|
|
hermes gateway start
|
|
hermes gateway stop
|
|
hermes gateway restart
|
|
hermes gateway status
|
|
```
|
|
|
|
## Architecture Overview
|
|
|
|
```text
|
|
┌─────────────────────────────────────────────────────────────────┐
|
|
│ Hermes Gateway │
|
|
├─────────────────────────────────────────────────────────────────┤
|
|
│ │
|
|
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
|
│ │ Telegram │ │ Discord │ │ WhatsApp │ │ Slack │ │
|
|
│ │ Adapter │ │ Adapter │ │ Adapter │ │ Adapter │ │
|
|
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
|
|
│ │ │ │ │ │
|
|
│ └─────────────┼────────────┼─────────────┘ │
|
|
│ │ │
|
|
│ ┌────────▼────────┐ │
|
|
│ │ Session Store │ │
|
|
│ │ (per-chat) │ │
|
|
│ └────────┬────────┘ │
|
|
│ │ │
|
|
│ ┌────────▼────────┐ │
|
|
│ │ AIAgent │ │
|
|
│ │ (run_agent) │ │
|
|
│ └─────────────────┘ │
|
|
│ │
|
|
└─────────────────────────────────────────────────────────────────┘
|
|
```
|
|
|
|
## Session Management
|
|
|
|
### Session Persistence
|
|
|
|
Sessions persist across messages until they reset. The agent remembers your conversation context.
|
|
|
|
### Reset Policies
|
|
|
|
Sessions reset based on configurable policies:
|
|
|
|
| Policy | Default | Description |
|
|
|--------|---------|-------------|
|
|
| Daily | 4:00 AM | Reset at a specific hour each day |
|
|
| Idle | 120 min | Reset after N minutes of inactivity |
|
|
| Both | (combined) | Whichever triggers first |
|
|
|
|
### Manual Reset
|
|
|
|
Send `/new` or `/reset` as a message to start fresh.
|
|
|
|
### Context Management
|
|
|
|
| Command | Description |
|
|
|---------|-------------|
|
|
| `/compress` | Manually compress conversation context (saves memories, then summarizes) |
|
|
| `/usage` | Show token usage and context window status for the current session |
|
|
|
|
### Per-Platform Overrides
|
|
|
|
Configure different reset policies per platform:
|
|
|
|
```json
|
|
{
|
|
"reset_by_platform": {
|
|
"telegram": { "mode": "idle", "idle_minutes": 240 },
|
|
"discord": { "mode": "idle", "idle_minutes": 60 }
|
|
}
|
|
}
|
|
```
|
|
|
|
## Platform Setup
|
|
|
|
### Telegram
|
|
|
|
1. **Create a bot** via [@BotFather](https://t.me/BotFather)
|
|
2. **Get your token** (looks like `123456789:ABCdefGHIjklMNOpqrsTUVwxyz`)
|
|
3. **Set environment variable:**
|
|
```bash
|
|
export TELEGRAM_BOT_TOKEN="your_token_here"
|
|
```
|
|
4. **Optional: Set home channel** for cron job delivery:
|
|
```bash
|
|
export TELEGRAM_HOME_CHANNEL="-1001234567890"
|
|
export TELEGRAM_HOME_CHANNEL_NAME="My Notes"
|
|
```
|
|
|
|
**Requirements:**
|
|
```bash
|
|
pip install python-telegram-bot>=20.0
|
|
```
|
|
|
|
### Discord
|
|
|
|
1. **Create an application** at [Discord Developer Portal](https://discord.com/developers/applications)
|
|
2. **Create a bot** under your application
|
|
3. **Get the bot token**
|
|
4. **Enable required intents:**
|
|
- Message Content Intent
|
|
- Server Members Intent (optional)
|
|
5. **Invite to your server** using OAuth2 URL generator (scopes: `bot`, `applications.commands`)
|
|
6. **Set environment variable:**
|
|
```bash
|
|
export DISCORD_BOT_TOKEN="your_token_here"
|
|
```
|
|
7. **Optional: Set home channel:**
|
|
```bash
|
|
export DISCORD_HOME_CHANNEL="123456789012345678"
|
|
export DISCORD_HOME_CHANNEL_NAME="#bot-updates"
|
|
```
|
|
|
|
**Requirements:**
|
|
```bash
|
|
pip install discord.py>=2.0
|
|
```
|
|
|
|
### WhatsApp
|
|
|
|
WhatsApp uses a built-in bridge powered by [Baileys](https://github.com/WhiskeySockets/Baileys) that connects via WhatsApp Web.
|
|
|
|
**Two modes:**
|
|
|
|
- **`bot` mode (recommended):** Use a dedicated phone number for the bot. Other people message that number directly. All `fromMe` messages are treated as bot echo-backs and ignored.
|
|
- **`self-chat` mode:** Use your own WhatsApp account. You talk to the agent by messaging yourself (WhatsApp → "Message Yourself").
|
|
|
|
**Setup:**
|
|
|
|
```bash
|
|
hermes whatsapp
|
|
```
|
|
|
|
The wizard walks you through mode selection, allowlist configuration, dependency installation, and QR code pairing. For bot mode, you'll need a second phone number with WhatsApp installed on some device (dual-SIM with WhatsApp Business app is the easiest approach).
|
|
|
|
Then start the gateway:
|
|
|
|
```bash
|
|
hermes gateway
|
|
```
|
|
|
|
**Environment variables:**
|
|
|
|
```bash
|
|
WHATSAPP_ENABLED=true
|
|
WHATSAPP_MODE=bot # "bot" (separate number) or "self-chat" (message yourself)
|
|
WHATSAPP_ALLOWED_USERS=15551234567 # Comma-separated phone numbers with country code
|
|
```
|
|
|
|
**Getting a second number for bot mode:**
|
|
|
|
| Option | Cost | Notes |
|
|
|--------|------|-------|
|
|
| WhatsApp Business app + dual-SIM | Free (if you have dual-SIM) | Install alongside personal WhatsApp, no second phone needed |
|
|
| Google Voice | Free (US only) | voice.google.com, verify WhatsApp via the Google Voice app |
|
|
| Prepaid SIM | $3-10/month | Any carrier; verify once, phone can go in a drawer on WiFi |
|
|
|
|
Agent responses are prefixed with "⚕ **Hermes Agent**" for easy identification.
|
|
|
|
> **Re-pairing:** If WhatsApp Web sessions disconnect (protocol updates, phone reset), re-pair with `hermes whatsapp`.
|
|
|
|
## Configuration
|
|
|
|
There are **three ways** to configure the gateway (in order of precedence):
|
|
|
|
### 1. Environment Variables (`.env` file) - Recommended for Quick Setup
|
|
|
|
Add to your `~/.hermes/.env` file:
|
|
|
|
```bash
|
|
# =============================================================================
|
|
# MESSAGING PLATFORM TOKENS
|
|
# =============================================================================
|
|
|
|
# Telegram - get from @BotFather on Telegram
|
|
TELEGRAM_BOT_TOKEN=your_telegram_bot_token
|
|
TELEGRAM_ALLOWED_USERS=123456789,987654321 # Security: restrict to these user IDs
|
|
|
|
# Optional: Default channel for cron job delivery
|
|
TELEGRAM_HOME_CHANNEL=-1001234567890
|
|
TELEGRAM_HOME_CHANNEL_NAME="My Notes"
|
|
|
|
# Discord - get from Discord Developer Portal
|
|
DISCORD_BOT_TOKEN=your_discord_bot_token
|
|
DISCORD_ALLOWED_USERS=123456789012345678 # Security: restrict to these user IDs
|
|
|
|
# Optional: Default channel for cron job delivery
|
|
DISCORD_HOME_CHANNEL=123456789012345678
|
|
DISCORD_HOME_CHANNEL_NAME="#bot-updates"
|
|
|
|
# Slack - get from Slack API (api.slack.com/apps)
|
|
SLACK_BOT_TOKEN=xoxb-your-slack-bot-token
|
|
SLACK_APP_TOKEN=xapp-your-slack-app-token # Required for Socket Mode
|
|
SLACK_ALLOWED_USERS=U01234ABCDE # Security: restrict to these user IDs
|
|
|
|
# Optional: Default channel for cron job delivery
|
|
# SLACK_HOME_CHANNEL=C01234567890
|
|
|
|
# WhatsApp - pair via: hermes whatsapp
|
|
WHATSAPP_ENABLED=true
|
|
WHATSAPP_ALLOWED_USERS=15551234567 # Phone numbers with country code
|
|
|
|
# =============================================================================
|
|
# AGENT SETTINGS
|
|
# =============================================================================
|
|
|
|
# Max tool-calling iterations per conversation (default: 60)
|
|
HERMES_MAX_ITERATIONS=60
|
|
|
|
# Working directory for terminal commands (default: home ~)
|
|
MESSAGING_CWD=/home/myuser
|
|
|
|
# =============================================================================
|
|
# TOOL PROGRESS NOTIFICATIONS
|
|
# =============================================================================
|
|
|
|
# Tool progress is now configured in config.yaml:
|
|
# display:
|
|
# tool_progress: all # off | new | all | verbose
|
|
|
|
# =============================================================================
|
|
# SESSION SETTINGS
|
|
# =============================================================================
|
|
|
|
# Reset sessions after N minutes of inactivity (default: 120)
|
|
SESSION_IDLE_MINUTES=120
|
|
|
|
# Daily reset hour in 24h format (default: 4 = 4am)
|
|
SESSION_RESET_HOUR=4
|
|
```
|
|
|
|
### 2. Gateway Config File (`~/.hermes/gateway.json`) - Full Control
|
|
|
|
For advanced configuration, create `~/.hermes/gateway.json`:
|
|
|
|
```json
|
|
{
|
|
"platforms": {
|
|
"telegram": {
|
|
"enabled": true,
|
|
"token": "your_telegram_token",
|
|
"home_channel": {
|
|
"platform": "telegram",
|
|
"chat_id": "-1001234567890",
|
|
"name": "My Notes"
|
|
}
|
|
},
|
|
"discord": {
|
|
"enabled": true,
|
|
"token": "your_discord_token",
|
|
"home_channel": {
|
|
"platform": "discord",
|
|
"chat_id": "123456789012345678",
|
|
"name": "#bot-updates"
|
|
}
|
|
}
|
|
},
|
|
"default_reset_policy": {
|
|
"mode": "both",
|
|
"at_hour": 4,
|
|
"idle_minutes": 120
|
|
},
|
|
"reset_by_platform": {
|
|
"discord": {
|
|
"mode": "idle",
|
|
"idle_minutes": 60
|
|
}
|
|
},
|
|
"always_log_local": true
|
|
}
|
|
```
|
|
|
|
## Platform-Specific Toolsets
|
|
|
|
Each platform has its own toolset for security:
|
|
|
|
| Platform | Toolset | Capabilities |
|
|
|----------|---------|--------------|
|
|
| CLI | `hermes-cli` | Full access (terminal, browser, etc.) |
|
|
| Telegram | `hermes-telegram` | Full tools including terminal |
|
|
| Discord | `hermes-discord` | Full tools including terminal |
|
|
| WhatsApp | `hermes-whatsapp` | Full tools including terminal |
|
|
| Slack | `hermes-slack` | Full tools including terminal |
|
|
|
|
## User Experience Features
|
|
|
|
### Typing Indicator
|
|
|
|
The gateway keeps the "typing..." indicator active throughout processing, refreshing every 4 seconds. This lets users know the bot is working even during long tool-calling sequences.
|
|
|
|
### Tool Progress Notifications
|
|
|
|
When `tool_progress` is enabled in `config.yaml`, the bot sends status messages as it works:
|
|
|
|
```text
|
|
💻 `ls -la`...
|
|
🔍 web_search...
|
|
📄 web_extract...
|
|
🎨 image_generate...
|
|
```
|
|
|
|
Terminal commands show the actual command (truncated to 50 chars). Other tools just show the tool name.
|
|
|
|
**Modes:**
|
|
- `new`: Only sends message when switching to a different tool (less spam)
|
|
- `all`: Sends message for every single tool call
|
|
|
|
### Working Directory
|
|
|
|
- **CLI (`hermes` command)**: Uses current directory where you run the command
|
|
- **Messaging**: Uses `MESSAGING_CWD` (default: home directory `~`)
|
|
|
|
This is intentional: CLI users are in a terminal and expect the agent to work in their current directory, while messaging users need a consistent starting location.
|
|
|
|
### Max Iterations
|
|
|
|
If the agent hits the max iteration limit while working, instead of a generic error, it asks the model to summarize what it found so far. This gives you a useful response even when the task couldn't be fully completed.
|
|
|
|
## Voice Messages (TTS)
|
|
|
|
The `text_to_speech` tool generates audio that the gateway delivers as native voice messages on each platform:
|
|
|
|
| Platform | Delivery | Format |
|
|
|----------|----------|--------|
|
|
| Telegram | Voice bubble (plays inline) | Opus `.ogg` — native from OpenAI/ElevenLabs, converted via ffmpeg for Edge TTS |
|
|
| Discord | Audio file attachment | MP3 |
|
|
| WhatsApp | Audio file attachment | MP3 |
|
|
| CLI | Saved to `~/voice-memos/` | MP3 |
|
|
|
|
**Providers:**
|
|
- **Edge TTS** (default) — Free, no API key, 322 voices in 74 languages
|
|
- **ElevenLabs** — Premium quality, requires `ELEVENLABS_API_KEY`
|
|
- **OpenAI TTS** — Good quality, requires `OPENAI_API_KEY`
|
|
|
|
Voice and provider are configured by the user in `~/.hermes/config.yaml` under the `tts:` key. The model only sends text; it does not choose the voice.
|
|
|
|
The tool returns a `MEDIA:<path>` tag that the gateway sending pipeline intercepts and delivers as a native audio message. If `[[audio_as_voice]]` is present (Opus format available), Telegram sends it as a voice bubble instead of an audio file.
|
|
|
|
**Telegram voice bubbles & ffmpeg:**
|
|
|
|
Telegram requires Opus/OGG format for native voice bubbles (the round, inline-playable kind). **OpenAI and ElevenLabs** produce Opus natively when on Telegram — no extra setup needed. **Edge TTS** (the default free provider) outputs MP3 and needs `ffmpeg` to convert:
|
|
|
|
```bash
|
|
sudo apt install ffmpeg # Ubuntu/Debian
|
|
brew install ffmpeg # macOS
|
|
sudo dnf install ffmpeg # Fedora
|
|
```
|
|
|
|
Without ffmpeg, Edge TTS audio is sent as a regular audio file (still playable, but shows as a rectangular music player instead of a voice bubble).
|
|
|
|
## Cron Job Delivery
|
|
|
|
Cron jobs are executed automatically by the gateway daemon. When the gateway is running (via `hermes gateway` or `hermes gateway install`), it ticks the scheduler every 60 seconds and runs due jobs.
|
|
|
|
When scheduling cron jobs, you can specify where the output should be delivered:
|
|
|
|
```text
|
|
User: "Remind me to check the server in 30 minutes"
|
|
|
|
Agent uses: schedule_cronjob(
|
|
prompt="Check server status...",
|
|
schedule="30m",
|
|
deliver="origin" # Back to this chat
|
|
)
|
|
```
|
|
|
|
### Delivery Options
|
|
|
|
| Option | Description |
|
|
|--------|-------------|
|
|
| `"origin"` | Back to where the job was created |
|
|
| `"local"` | Save to local files only |
|
|
| `"telegram"` | Telegram home channel |
|
|
| `"discord"` | Discord home channel |
|
|
| `"telegram:123456"` | Specific Telegram chat |
|
|
|
|
## Dynamic Context Injection
|
|
|
|
The agent knows where it is via injected context:
|
|
|
|
```text
|
|
## Current Session Context
|
|
|
|
**Source:** Telegram (group: Dev Team, ID: -1001234567890)
|
|
**Connected Platforms:** local, telegram, discord
|
|
|
|
**Home Channels:**
|
|
- telegram: My Notes (ID: -1001234567890)
|
|
- discord: #bot-updates (ID: 123456789012345678)
|
|
|
|
**Delivery options for scheduled tasks:**
|
|
- "origin" → Back to this chat (Dev Team)
|
|
- "local" → Save to local files only
|
|
- "telegram" → Home channel (My Notes)
|
|
- "discord" → Home channel (#bot-updates)
|
|
```
|
|
|
|
## CLI Commands
|
|
|
|
| Command | Description |
|
|
|---------|-------------|
|
|
| `/platforms` | Show gateway configuration and status |
|
|
| `--gateway` | Start the gateway (CLI flag) |
|
|
|
|
## Troubleshooting
|
|
|
|
### "python-telegram-bot not installed"
|
|
|
|
```bash
|
|
pip install python-telegram-bot>=20.0
|
|
```
|
|
|
|
### "discord.py not installed"
|
|
|
|
```bash
|
|
pip install discord.py>=2.0
|
|
```
|
|
|
|
### "No platforms connected"
|
|
|
|
1. Check your environment variables are set
|
|
2. Check your tokens are valid
|
|
3. Try `/platforms` to see configuration status
|
|
|
|
### Session not persisting
|
|
|
|
1. Check `~/.hermes/sessions/` exists
|
|
2. Check session policies aren't too aggressive
|
|
3. Verify no errors in gateway logs
|
|
|
|
## Adding a New Platform
|
|
|
|
To add a new messaging platform:
|
|
|
|
### 1. Create the adapter
|
|
|
|
Create `gateway/platforms/your_platform.py`:
|
|
|
|
```python
|
|
from gateway.platforms.base import BasePlatformAdapter, MessageEvent, SendResult
|
|
from gateway.config import Platform, PlatformConfig
|
|
|
|
class YourPlatformAdapter(BasePlatformAdapter):
|
|
def __init__(self, config: PlatformConfig):
|
|
super().__init__(config, Platform.YOUR_PLATFORM)
|
|
|
|
async def connect(self) -> bool:
|
|
# Connect to the platform
|
|
...
|
|
|
|
async def disconnect(self) -> None:
|
|
# Disconnect
|
|
...
|
|
|
|
async def send(self, chat_id: str, content: str, ...) -> SendResult:
|
|
# Send a message
|
|
...
|
|
|
|
async def get_chat_info(self, chat_id: str) -> Dict[str, Any]:
|
|
# Get chat information
|
|
...
|
|
```
|
|
|
|
### 2. Register the platform
|
|
|
|
Add to `gateway/config.py`:
|
|
|
|
```python
|
|
class Platform(Enum):
|
|
# ... existing ...
|
|
YOUR_PLATFORM = "your_platform"
|
|
```
|
|
|
|
### 3. Add to gateway runner
|
|
|
|
Update `gateway/run.py` `_create_adapter()`:
|
|
|
|
```python
|
|
elif platform == Platform.YOUR_PLATFORM:
|
|
from gateway.platforms.your_platform import YourPlatformAdapter
|
|
return YourPlatformAdapter(config)
|
|
```
|
|
|
|
### 4. Create a toolset (optional)
|
|
|
|
Add to `toolsets.py`:
|
|
|
|
```python
|
|
"hermes-your-platform": {
|
|
"description": "Your platform toolset",
|
|
"tools": [...],
|
|
"includes": []
|
|
}
|
|
```
|
|
|
|
### 5. Configure
|
|
|
|
Add environment variables to `.env`:
|
|
|
|
```bash
|
|
YOUR_PLATFORM_TOKEN=...
|
|
YOUR_PLATFORM_HOME_CHANNEL=...
|
|
```
|
|
|
|
## Service Management
|
|
|
|
### Linux (systemd)
|
|
|
|
```bash
|
|
# Install as user service
|
|
./scripts/hermes-gateway install
|
|
|
|
# Manage
|
|
systemctl --user start hermes-gateway
|
|
systemctl --user stop hermes-gateway
|
|
systemctl --user restart hermes-gateway
|
|
systemctl --user status hermes-gateway
|
|
|
|
# View logs
|
|
journalctl --user -u hermes-gateway -f
|
|
|
|
# Enable lingering (keeps running after logout)
|
|
sudo loginctl enable-linger $USER
|
|
```
|
|
|
|
### macOS (launchd)
|
|
|
|
```bash
|
|
# Install
|
|
./scripts/hermes-gateway install
|
|
|
|
# Manage
|
|
launchctl start ai.hermes.gateway
|
|
launchctl stop ai.hermes.gateway
|
|
|
|
# View logs
|
|
tail -f ~/.hermes/logs/gateway.log
|
|
```
|
|
|
|
### Manual (any platform)
|
|
|
|
```bash
|
|
# Run in foreground (for testing/debugging)
|
|
./scripts/hermes-gateway run
|
|
|
|
# Or via CLI (also foreground)
|
|
python cli.py --gateway
|
|
```
|
|
|
|
## Interrupting the Agent
|
|
|
|
Send any message while the agent is working to interrupt it. The message becomes the next prompt after the agent stops. Key behaviors:
|
|
|
|
- **In-progress terminal commands are killed immediately** -- SIGTERM first, SIGKILL after 1 second if the process resists. Works on local, Docker, SSH, Singularity, and Modal backends.
|
|
- **Tool calls are cancelled** -- if the model generated multiple tool calls in one batch, only the currently-executing one runs. The rest are skipped.
|
|
- **Multiple messages are combined** -- if you send "Stop!" then "Do X instead" while the agent is stopping, both messages are joined into one prompt (separated by newline).
|
|
- **`/stop` command** -- interrupts without queuing a follow-up message.
|
|
- **Priority processing** -- interrupt signals bypass command parsing and session creation for minimal latency.
|
|
|
|
## Storage Locations
|
|
|
|
| Path | Purpose |
|
|
|------|---------|
|
|
| `~/.hermes/gateway.json` | Gateway configuration |
|
|
| `~/.hermes/sessions/sessions.json` | Session index |
|
|
| `~/.hermes/sessions/{id}.jsonl` | Conversation transcripts |
|
|
| `~/.hermes/cron/output/` | Cron job outputs |
|
|
| `~/.hermes/logs/gateway.log` | Gateway logs (macOS launchd) |
|