hermes-agent/gateway_analysis_report.md

Hermes Gateway System - Deep Analysis Report

Executive Summary

This report provides an exhaustive analysis of the Hermes messaging gateway system, which serves as the unified interface between the AI agent and 15+ messaging platforms. The gateway handles message routing, session management, platform abstraction, and cross-platform delivery.


1. Message Flow Diagram for All Platforms

1.1 Inbound Message Flow (Universal Pattern)

┌─────────────────────────────────────────────────────────────────────────────┐
│                           EXTERNAL MESSAGING PLATFORM                        │
│  (Telegram/Discord/Slack/WhatsApp/Signal/Matrix/Mattermost/Email/SMS/etc)   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                      PLATFORM-SPECIFIC TRANSPORT LAYER                       │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐   │
│  │  WebSocket  │ │  Long Poll  │ │   Webhook   │ │  HTTP REST + SSE    │   │
│  │  (Discord)  │ │ (Telegram)  │ │  (Generic)  │ │    (Signal/HA)      │   │
│  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                      PLATFORM ADAPTER (BasePlatformAdapter)                  │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  1. Authentication/Validation (token verification, HMAC checks)      │   │
│  │  2. Message Parsing (extract text, media, metadata)                  │   │
│  │  3. Source Building (SessionSource: chat_id, user_id, platform)      │   │
│  │  4. Media Caching (images/audio/documents → local filesystem)        │   │
│  │  5. Deduplication (message ID tracking, TTL caches)                  │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           MESSAGEEVENT CREATION                              │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  MessageEvent {                                                      │   │
│  │    text: str,                    # Extracted message text            │   │
│  │    message_type: MessageType,    # TEXT/PHOTO/VOICE/DOCUMENT/etc     │   │
│  │    source: SessionSource,        # Platform + chat + user context    │   │
│  │    media_urls: List[str],        # Cached attachment paths           │   │
│  │    message_id: str,              # Platform message ID               │   │
│  │    reply_to_message_id: str,     # Thread/reply context              │   │
│  │    timestamp: datetime,          # Message time                      │   │
│  │    raw_message: Any,             # Platform-specific payload         │   │
│  │  }                                                                   │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           GATEWAY RUNNER (run.py)                            │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  1. Authorization Check (_is_user_authorized)                        │   │
│  │     - Check allowlists (user-specific, group-specific)               │   │
│  │     - Check pairing mode (first-user-wins, admin-only)               │   │
│  │     - Validate group policies                                        │   │
│  │  2. Session Resolution/Creation (_get_or_create_session)             │   │
│  │  3. Command Processing (/reset, /status, /stop, etc.)                │   │
│  │  4. Agent Invocation (_process_message_with_agent)                   │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                           AI AGENT PROCESSING                                │
│                    (Agent Loop with Tool Calling)                            │
└─────────────────────────────────────────────────────────────────────────────┘
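The adapter steps above (source building, deduplication, event creation) can be sketched roughly as follows. This is a minimal illustration, not the gateway's actual code: the field names follow the MessageEvent box in the diagram, while `TTLDeduplicator`, its `ttl_seconds` parameter, and the reduced `SessionSource` are hypothetical stand-ins.

```python
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict, List, Optional


class MessageType(Enum):
    TEXT = "text"
    PHOTO = "photo"
    VOICE = "voice"
    DOCUMENT = "document"


@dataclass
class SessionSource:
    # Reduced to the three fields named in step 3 of the adapter box.
    platform: str
    chat_id: str
    user_id: str


@dataclass
class MessageEvent:
    text: str
    message_type: MessageType
    source: SessionSource
    media_urls: List[str] = field(default_factory=list)
    message_id: Optional[str] = None
    reply_to_message_id: Optional[str] = None
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    raw_message: Any = None


class TTLDeduplicator:
    """Hypothetical helper: remembers message keys for ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._seen: Dict[str, float] = {}

    def is_duplicate(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Drop expired entries so the cache does not grow without bound.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.ttl_seconds}
        if key in self._seen:
            return True
        self._seen[key] = now
        return False
```

An adapter would typically build a key such as `f"{platform}:{chat_id}:{message_id}"` and skip event creation when `is_duplicate` returns `True`.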

1.2 Outbound Message Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                           AI AGENT RESPONSE                                  │
│                    (Text + Media + Tool Results)                             │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         RESPONSE PROCESSING                                  │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  1. Format Message (platform-specific markdown conversion)           │   │
│  │  2. Truncate if needed (respect platform limits)                     │   │
│  │  3. Media Handling (upload to platform if needed)                    │   │
│  │  4. Thread Context (reply_to_message_id, thread_id)                  │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                      PLATFORM ADAPTER SEND METHOD                            │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │  send(chat_id, content, reply_to, metadata) -> SendResult            │   │
│  │  ├── Telegram: Bot API (HTTP POST to sendMessage)                    │   │
│  │  ├── Discord: discord.py (channel.send())                            │   │
│  │  ├── Slack: slack_bolt (chat.postMessage)                            │   │
│  │  ├── Matrix: matrix-nio (room_send)                                  │   │
│  │  ├── Signal: signal-cli HTTP RPC                                     │   │
│  │  ├── WhatsApp: Bridge HTTP POST to Node.js process                   │   │
│  │  └── ... (15+ platforms)                                             │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                         DELIVERY CONFIRMATION                                │
│                    (SendResult: success/error/message_id)                    │
└─────────────────────────────────────────────────────────────────────────────┘
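Step 2 of response processing says only that responses are truncated to respect platform limits. A common alternative is to split a long response into several messages instead; the sketch below shows that variant. The function name `split_message` is hypothetical, and the example limits are taken from the "Max Message" row in section 3.1.

```python
from typing import List

# Example limits, copied from the "Max Message" row in section 3.1.
PLATFORM_LIMITS = {"telegram": 4096, "discord": 2000, "sms": 1600}


def split_message(text: str, limit: int) -> List[str]:
    """Split text into chunks of at most `limit` characters.

    Prefers to cut at the last newline, then the last space, inside each
    window; falls back to a hard cut when no boundary exists.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    remaining = text
    while len(remaining) > limit:
        window = remaining[:limit]
        cut = window.rfind("\n")
        if cut <= 0:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = limit  # no usable boundary: hard cut
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk would then be passed to the adapter's `send` method in order, with only the first chunk carrying the reply context.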

1.3 Platform-Specific Transport Architectures

| Platform | Transport | Connection Model | Authentication |
|---|---|---|---|
| Telegram | HTTP Long Polling / Webhook | Persistent HTTP | Bot Token |
| Discord | WebSocket (Gateway) | Persistent WS | Bot Token |
| Slack | Socket Mode (WebSocket) | Persistent WS | Bot Token + App Token |
| WhatsApp | HTTP Bridge (Local) | Child Process + HTTP | Session-based |
| Signal | HTTP + SSE | HTTP Stream | signal-cli daemon |
| Matrix | HTTP + Sync Loop | Polling with long-poll | Access Token |
| Mattermost | WebSocket | Persistent WS | Bot Token |
| Email | IMAP + SMTP | Polling (IMAP) | Username/Password |
| SMS (Twilio) | HTTP Webhook | Inbound HTTP + REST outbound | Account SID + Auth Token |
| DingTalk | WebSocket (Stream) | Persistent WS | Client ID + Secret |
| Feishu | WebSocket / Webhook | WS or HTTP | App ID + Secret |
| WeCom | WebSocket | Persistent WS | Bot ID + Secret |
| Home Assistant | WebSocket | Persistent WS | Long-lived Token |
| Webhook | HTTP Server | Inbound HTTP | HMAC Signature |
| API Server | HTTP Server | Inbound HTTP | API Key |

2. Session Lifecycle Analysis

2.1 Session State Model

┌─────────────────────────────────────────────────────────────────────────────┐
│                         SESSION STATE MACHINE                                │
└─────────────────────────────────────────────────────────────────────────────┘

    ┌──────────┐
    │  START   │
    └────┬─────┘
         │
         ▼
    ┌────────────────────────────────────────────────────────────────────┐
    │                         SESSION CREATION                            │
    │  ┌──────────────────────────────────────────────────────────────┐  │
    │  │  1. Generate session_id (UUID)                               │  │
    │  │  2. Create SessionSource (platform, chat_id, user_id, ...)   │  │
    │  │  3. Initialize memory (Honcho/UserRepo)                      │  │
    │  │  4. Set creation timestamp                                   │  │
    │  │  5. Initialize environment (worktree, tools, skills)         │  │
    │  └──────────────────────────────────────────────────────────────┘  │
    └────────────────────────────────────────────────────────────────────┘
         │
         ▼
    ┌────────────────────────────────────────────────────────────────────┐
    │                          ACTIVE STATE                               │
    │  ┌──────────────────────────────────────────────────────────────┐  │
    │  │  SESSION OPERATIONS:                                          │  │
    │  │  ├── Message Processing (handle_message)                     │  │
    │  │  ├── Tool Execution (terminal, file ops, browser, etc.)      │  │
    │  │  ├── Memory Storage/Retrieval (context building)             │  │
    │  │  ├── Checkpoint Creation (state snapshots)                   │  │
    │  │  └── Delivery Routing (responses to multiple platforms)      │  │
    │  │                                                              │  │
    │  │  LIFECYCLE EVENTS:                                          │  │
    │  │  ├── /reset - Clear session state, keep identity             │  │
    │  │  ├── /stop - Interrupt current operation                     │  │
    │  │  ├── /title - Rename session                                 │  │
    │  │  ├── Checkpoint/Resume - Save/restore execution state        │  │
    │  │  └── Background task completion (cron jobs, delegations)     │  │
    │  └──────────────────────────────────────────────────────────────┘  │
    └────────────────────────────────────────────────────────────────────┘
         │
         ├── Idle Timeout ────────┐
         │                        ▼
    ┌────┴───────────────────────────────────────────────────────────────┐
    │                     SESSION PERSISTENCE                             │
    │  ┌──────────────────────────────────────────────────────────────┐  │
    │  │  Save to:                                                     │  │
    │  │  ├── SQLite (session metadata)                               │  │
    │  │  ├── Honcho (conversation history)                           │  │
    │  │  ├── Filesystem (checkpoints, outputs)                       │  │
    │  │  └── Platform (message history for context)                  │  │
    │  └──────────────────────────────────────────────────────────────┘  │
    └────────────────────────────────────────────────────────────────────┘
         │
         ├── Explicit Close / Error / Timeout
         │
         ▼
    ┌────────────────────────────────────────────────────────────────────┐
    │                      SESSION TERMINATION                            │
    │  ┌──────────────────────────────────────────────────────────────┐  │
    │  │  Cleanup Actions:                                             │  │
    │  │  ├── Flush memory to persistent store                        │  │
    │  │  ├── Cancel pending tasks                                    │  │
    │  │  ├── Close environment resources                             │  │
    │  │  ├── Remove from active sessions map                         │  │
    │  │  └── Notify user (if graceful)                               │  │
    │  └──────────────────────────────────────────────────────────────┘  │
    └────────────────────────────────────────────────────────────────────┘
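The state machine above can be expressed as a small transition table. This is a sketch under assumptions: the state names are shorthand for the four boxes in the diagram, and the PERSISTED to ACTIVE (resume) and ACTIVE to TERMINATED (error) edges are inferred rather than confirmed by the source.

```python
from enum import Enum, auto


class SessionState(Enum):
    CREATED = auto()
    ACTIVE = auto()
    PERSISTED = auto()
    TERMINATED = auto()


# Allowed transitions, read off the diagram. The resume edge
# (PERSISTED -> ACTIVE) and the direct error edge (ACTIVE -> TERMINATED)
# are assumptions.
TRANSITIONS = {
    SessionState.CREATED: {SessionState.ACTIVE},
    SessionState.ACTIVE: {SessionState.PERSISTED, SessionState.TERMINATED},
    SessionState.PERSISTED: {SessionState.ACTIVE, SessionState.TERMINATED},
    SessionState.TERMINATED: set(),
}


class SessionLifecycle:
    """Hypothetical wrapper that rejects transitions not in the table."""

    def __init__(self) -> None:
        self.state = SessionState.CREATED

    def transition(self, new_state: SessionState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
```

Keeping the table explicit makes it straightforward to reject, for example, message processing on a session that has already been terminated.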

2.2 Session Data Model

SessionSource:
  platform: Platform           # TELEGRAM, DISCORD, SLACK, etc.
  chat_id: str                 # Platform-specific chat/channel ID
  chat_name: Optional[str]     # Display name
  chat_type: str               # "dm" | "group" | "channel"
  user_id: str                 # User identifier (platform-specific)
  user_name: Optional[str]     # Display name
  user_id_alt: Optional[str]   # Alternative ID (e.g., Matrix MXID)
  thread_id: Optional[str]     # Thread/topic ID
  message_id: Optional[str]    # Specific message ID (for replies)

SessionMetadata:
  session_id: str              # UUID
  created_at: datetime
  last_activity: datetime
  agent_id: Optional[str]      # Honcho agent ID
  session_title: Optional[str]
  
ActiveSession:
  source: SessionSource
  metadata: SessionMetadata
  memory: HonchoClient          # Conversation storage
  environment: Optional[str]    # Active execution environment

2.3 Session Persistence Strategy

| Layer | Storage | TTL/Policy | Purpose |
|---|---|---|---|
| In-Memory | Dict[str, ActiveSession] | Gateway lifetime | Fast access to active sessions |
| SQLite | ~/.hermes/sessions.db | Persistent | Session metadata, checkpoints |
| Honcho API | Cloud/self-hosted | Persistent | Conversation history, user memory |
| Filesystem | ~/.hermes/checkpoints/ | User-managed | Execution state snapshots |
| Platform | Message history | Platform-dependent | Context window reconstruction |

3. Platform Adapter Comparison Matrix

3.1 Feature Matrix

Feature Telegram Discord Slack Matrix Signal WhatsApp Mattermost Email SMS
Message Types
Text
Images
Documents
Voice/Audio ⚠️
Video
Stickers
Threading
Thread Support (topics) (refs)
Reply Chains
Advanced
Typing Indicators ⚠️ ⚠️
Message Edit
Message Delete
Reactions
Slash Commands
Security
E2EE Available ⚠️ (TLS)
Self-hosted ⚠️ ⚠️ ⚠️
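Scale

| Feature | Telegram | Discord | Slack | Matrix | Signal | WhatsApp | Mattermost | Email | SMS |
|---|---|---|---|---|---|---|---|---|---|
| Max Message (characters) | 4096 | 2000 | 40000 | 4000 | 8000 | 65536 | 4000 | 50000 | 1600 |
| Rate Limits | High | Medium | Medium | Low | Low | Low | High | Medium | Low |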

3.2 Implementation Complexity

| Platform | Lines of Code | Dependencies | Setup Complexity | Maintenance |
|---|---|---|---|---|
| Telegram | ~2100 | python-telegram-bot | Low | Low |
| Discord | ~2300 | discord.py + opus | Medium | Medium |
| Slack | ~970 | slack-bolt | Medium | Low |
| Matrix | ~1050 | matrix-nio | High | Medium |
| Signal | ~800 | httpx (only) | High | Low |
| WhatsApp | ~800 | Node.js bridge | High | High |
| Mattermost | ~720 | aiohttp | Low | Low |
| Email | ~620 | stdlib (imaplib/smtplib) | Low | Low |
| SMS | ~280 | aiohttp | Low | Low |
| DingTalk | ~340 | dingtalk-stream | Low | Low |
| Feishu | ~3250 | lark-oapi | High | Medium |
| WeCom | ~1330 | aiohttp + httpx | Medium | Medium |
| Home Assistant | ~450 | aiohttp | Low | Low |
| Webhook | ~620 | aiohttp | Low | Low |
| API Server | ~1320 | aiohttp | Low | Low |

3.3 Protocol Implementation Patterns

| Platform | Connection Pattern | Message Ingestion | Message Delivery |
|---|---|---|---|
| Telegram | Polling/Webhook | Update processing loop | HTTP POST |
| Discord | Gateway WebSocket | Event dispatch | Gateway send |
| Slack | Socket Mode WS | Event handlers | Web API |
| Matrix | Sync loop (HTTP long-poll) | Event callbacks | Room send API |
| Signal | SSE stream | Async iterator | JSON-RPC HTTP |
| WhatsApp | Local HTTP bridge | Polling endpoint | HTTP POST |
| Mattermost | WebSocket | Event loop | REST API |
| Email | IMAP IDLE/polling | UID tracking | SMTP |
| SMS | HTTP webhook | POST handler | REST API |

4. Ten Scalability Recommendations

4.1 Horizontal Scaling

R1. Implement Gateway Sharding

  • Current: Single-process gateway with per-platform adapters
  • Problem: Memory/CPU limits as session count grows
  • Solution: Implement consistent hashing by chat_id to route messages to gateway shards
  • Implementation: Use Redis for session state, allow multiple gateway instances behind load balancer
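As a rough illustration of R1, the sketch below routes a `chat_id` to a shard with a consistent hash ring. The class name `ConsistentHashRing`, the shard names, and the `vnodes` parameter are hypothetical; the Redis-backed session state mentioned above is outside the scope of this sketch.

```python
import bisect
import hashlib
from typing import List, Tuple


class ConsistentHashRing:
    """Maps chat IDs to gateway shards using virtual nodes on a hash ring."""

    def __init__(self, shards: List[str], vnodes: int = 100) -> None:
        ring: List[Tuple[int, str]] = []
        for shard in shards:
            # Several virtual nodes per shard even out the key distribution.
            for i in range(vnodes):
                ring.append((self._hash(f"{shard}#{i}"), shard))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, chat_id: str) -> str:
        if not self._ring:
            raise ValueError("ring has no shards")
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, self._hash(chat_id))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters for sharding is that adding a shard only moves the keys that land on the new shard; every other chat stays where it was, so most in-memory session state remains valid.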

R2. Async Connection Pooling

  • Current: Each adapter manages its own connections
  • Problem: Connection explosion with high concurrency
  • Solution: Implement shared connection pools for HTTP-based platforms (Telegram, Slack, Matrix)
  • Implementation: Use aiohttp/httpx connection pooling with configurable limits

4.2 Message Processing

R3. Implement Message Queue Backpressure

  • Current: Direct adapter → agent invocation
  • Problem: Agent overload during message bursts
  • Solution: Add per-session message queues with prioritization
  • Implementation: Use asyncio.PriorityQueue, drop old messages if queue exceeds limit
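One caveat for R3: `asyncio.PriorityQueue` with a `maxsize` blocks producers when full rather than dropping entries, so a drop-oldest policy needs a thin layer of its own. The sketch below shows that policy on a plain heap; the class name `BoundedPriorityBuffer` and the priority values are hypothetical, and an async wrapper would be needed in the gateway.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class BoundedPriorityBuffer:
    """Per-session buffer: lowest priority number first, oldest dropped on overflow."""

    def __init__(self, max_size: int) -> None:
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        self._heap: List[Tuple[int, int, Any]] = []
        self._seq = itertools.count()  # arrival order, also breaks priority ties
        self.dropped = 0

    def put(self, item: Any, priority: int = 10) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), item))
        if len(self._heap) > self.max_size:
            # Evict the entry that arrived first, regardless of priority.
            oldest = min(self._heap, key=lambda entry: entry[1])
            self._heap.remove(oldest)
            heapq.heapify(self._heap)
            self.dropped += 1

    def get(self) -> Any:
        _, _, item = heapq.heappop(self._heap)
        return item

    def __len__(self) -> int:
        return len(self._heap)
```

A production version would probably exempt control commands such as /stop from eviction, since dropping purely by age can discard them.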

R4. Batch Processing for Similar Requests

  • Current: Each message triggers individual agent runs
  • Problem: Redundant processing for similar queries
  • Solution: Implement request deduplication and batching window
  • Implementation: 100ms batching window, group similar requests, shared LLM inference

4.3 Session Management

R5. Session Tiering with LRU Eviction

  • Current: All sessions kept in memory
  • Problem: Memory exhaustion with many concurrent sessions
  • Solution: Implement hot/warm/cold session tiers
  • Implementation: Active (in-memory), Idle (Redis), Archived (DB) with automatic promotion
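The hot and warm tiers of R5 can be sketched with an `OrderedDict` acting as the LRU. In this illustration a plain dict stands in for Redis, and the class name `TieredSessionStore` and the `hot_capacity` parameter are hypothetical; the archived (DB) tier is omitted.

```python
from collections import OrderedDict
from typing import Any, Dict, Optional


class TieredSessionStore:
    """Keeps the most recently used sessions hot and demotes the rest."""

    def __init__(self, hot_capacity: int) -> None:
        if hot_capacity <= 0:
            raise ValueError("hot_capacity must be positive")
        self.hot_capacity = hot_capacity
        self.hot: "OrderedDict[str, Any]" = OrderedDict()
        self.warm: Dict[str, Any] = {}  # stand-in for an external store such as Redis

    def put(self, key: str, session: Any) -> None:
        self.warm.pop(key, None)
        self.hot[key] = session
        self.hot.move_to_end(key)  # mark as most recently used
        self._evict()

    def get(self, key: str) -> Optional[Any]:
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key in self.warm:
            session = self.warm.pop(key)
            self.put(key, session)  # automatic promotion on access
            return session
        return None

    def _evict(self) -> None:
        while len(self.hot) > self.hot_capacity:
            old_key, old_session = self.hot.popitem(last=False)
            self.warm[old_key] = old_session
```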

R6. Streaming Response Handling

  • Current: Full response buffering before platform send
  • Problem: Delayed first-byte delivery, memory pressure for large responses
  • Solution: Stream chunks to platforms as they're generated
  • Implementation: Generator-based response handling, platform-specific chunking
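A generator-based handler for R6 might look like the sketch below: it buffers model output and emits a chunk once enough text has accumulated. The function name `stream_chunks` and the `min_chars` threshold are hypothetical, and how each chunk is delivered (new message versus editing the previous one) would depend on the platform.

```python
from typing import Iterable, Iterator, List


def stream_chunks(tokens: Iterable[str], min_chars: int = 200) -> Iterator[str]:
    """Yield buffered text whenever at least min_chars have accumulated."""
    buffer: List[str] = []
    size = 0
    for token in tokens:
        buffer.append(token)
        size += len(token)
        if size >= min_chars:
            yield "".join(buffer)
            buffer, size = [], 0
    if buffer:
        # Flush whatever is left when the stream ends.
        yield "".join(buffer)
```

Because the function is lazy, the first chunk can be sent as soon as it is full, and memory use is bounded by roughly one chunk rather than the whole response.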

4.4 Platform Optimization

R7. Adaptive Polling Intervals

  • Current: Fixed polling intervals (Telegram, Email)
  • Problem: Wasted API calls during low activity, latency during high activity
  • Solution: Implement adaptive backoff based on message frequency
  • Implementation: Exponential backoff up to a maximum interval while idle, with jitter; reset to the minimum interval on activity
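A minimal sketch of R7, assuming the poller grows its interval toward a maximum while idle and snaps back to a minimum as soon as a poll returns messages. The class name `AdaptivePoller` and all default values are hypothetical.

```python
import random
from typing import Callable


class AdaptivePoller:
    """Computes the delay before the next poll from recent activity."""

    def __init__(
        self,
        min_interval: float = 1.0,
        max_interval: float = 60.0,
        factor: float = 2.0,
        jitter: float = 0.1,
    ) -> None:
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.factor = factor
        self.jitter = jitter
        self.interval = min_interval

    def next_interval(
        self, got_messages: bool, rng: Callable[[], float] = random.random
    ) -> float:
        if got_messages:
            # Activity: poll quickly again.
            self.interval = self.min_interval
        else:
            # Idle: back off exponentially, capped at max_interval.
            self.interval = min(self.interval * self.factor, self.max_interval)
        # Spread polls by +/- jitter so multiple adapters do not synchronize.
        spread = self.interval * self.jitter
        return self.interval + (2 * rng() - 1) * spread
```

The polling loop would call `next_interval(bool(messages))` after each poll and sleep for the returned number of seconds.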

R8. Platform-Specific Rate Limiters

  • Current: Generic rate limiting
  • Problem: Platform-specific limits cause throttling errors
  • Solution: Implement per-platform token bucket rate limiters
  • Implementation: Separate rate limiters per platform with platform-specific limits
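The token bucket named in R8 can be sketched as below. The `rate` and `capacity` values in the usage example are placeholders, not real platform limits, and the class name `TokenBucket` is hypothetical.

```python
import time
from typing import Optional


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, cost: float = 1.0, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One limiter per platform; the numbers are placeholders for illustration.
limiters = {
    "telegram": TokenBucket(rate=1.0, capacity=5.0),
    "discord": TokenBucket(rate=0.5, capacity=2.0),
}
```

An adapter's send path would call `limiters[platform].try_acquire()` and delay or queue the message when it returns `False`.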

4.5 Infrastructure

R9. Distributed Checkpoint Storage

  • Current: Local filesystem checkpoints
  • Problem: Single point of failure, not shareable across instances
  • Solution: Pluggable checkpoint backends (S3, Redis, NFS)
  • Implementation: Abstract checkpoint interface, async uploads
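The abstract checkpoint interface in R9 could take a shape like the following. This is a synchronous sketch with hypothetical names (`CheckpointBackend`, `FilesystemBackend`, `InMemoryBackend`); S3 or Redis backends and the async uploads mentioned above would implement the same two methods.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, Optional, Union


class CheckpointBackend(ABC):
    """Minimal storage contract that every backend implements."""

    @abstractmethod
    def save(self, session_id: str, data: bytes) -> None: ...

    @abstractmethod
    def load(self, session_id: str) -> Optional[bytes]: ...


class InMemoryBackend(CheckpointBackend):
    """Useful for tests; contents are lost when the process exits."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def save(self, session_id: str, data: bytes) -> None:
        self._store[session_id] = data

    def load(self, session_id: str) -> Optional[bytes]:
        return self._store.get(session_id)


class FilesystemBackend(CheckpointBackend):
    """Stores one file per session under a root directory."""

    def __init__(self, root: Union[str, Path]) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, session_id: str) -> Path:
        # Reject IDs that could escape the root directory.
        if not session_id or session_id in (".", "..") or any(
            sep in session_id for sep in ("/", "\\")
        ):
            raise ValueError(f"invalid session id: {session_id!r}")
        return self.root / f"{session_id}.ckpt"

    def save(self, session_id: str, data: bytes) -> None:
        path = self._path(session_id)
        tmp = path.parent / (path.name + ".tmp")
        tmp.write_bytes(data)
        tmp.replace(path)  # rename so readers never see a partial file

    def load(self, session_id: str) -> Optional[bytes]:
        path = self._path(session_id)
        return path.read_bytes() if path.exists() else None
```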

R10. Observability and Auto-scaling

  • Current: Basic logging, no metrics
  • Problem: No visibility into bottlenecks, manual scaling
  • Solution: Implement comprehensive metrics and auto-scaling triggers
  • Implementation: Prometheus metrics (sessions, messages, latency), HPA based on queue depth

5. Security Audit for Each Platform

5.1 Authentication & Authorization

| Platform | Token Storage | Token Rotation | Scope Validation | Vulnerabilities |
|---|---|---|---|---|
| Telegram | Environment | Manual | Bot-level | Token in env, shared across instances |
| Discord | Environment | Manual | Bot-level | Token in env, privileged intents needed |
| Slack | Environment + OAuth file | Auto (OAuth) | App-level | App token exposure risk |
| Matrix | Environment | Manual | User-level | Access token long-lived |
| Signal | Environment | N/A (daemon) | Account-level | No E2EE for bot messages |
| WhatsApp | Session files | Auto | Account-level | QR code interception risk |
| Mattermost | Environment | Manual | Bot-level | Token in env |
| Email | Environment | App passwords | Account-level | Password in env, IMAP/SMTP plain auth |
| SMS | Environment | N/A | Account-level | Credentials in env |
| DingTalk | Environment | Auto | App-level | Client secret in env |
| Feishu | Environment | Auto | App-level | App secret in env |
| WeCom | Environment | Auto | Bot-level | Bot secret in env |
| Home Assistant | Environment | Manual | Token-level | Long-lived tokens |
| Webhook | Route config | N/A | Route-level | HMAC secret in config |
| API Server | Config | Manual | API key | Key in memory, no rotation |

5.2 Data Protection

Platform Data at Rest Data in Transit E2EE Available PII Redaction
Telegram (cloud) TLS Phone numbers
Discord (cloud) TLS User IDs
Slack ⚠️ (cloud) TLS User IDs
Matrix (configurable) TLS (optional) ⚠️ Partial
Signal (local) TLS (always) Phone numbers
WhatsApp ⚠️ (local bridge) TLS ⚠️ (bridge)
Mattermost (self-hosted) TLS ⚠️ Partial
Email (local) TLS ⚠️ (PGP possible) Addresses
SMS (Twilio cloud) TLS Phone numbers
DingTalk (cloud) TLS ⚠️ Partial
Feishu (cloud) TLS ⚠️ Partial
WeCom ⚠️ (enterprise) TLS ⚠️ Partial
Home Assistant (local) TLS/WS N/A Entity IDs
Webhook (local) TLS N/A ⚠️ Config-dependent
API Server (SQLite) TLS N/A API keys

5.3 Attack Vectors & Mitigations

A. Telegram

  • Vector: Webhook spoofing with fake updates
  • Mitigation: Verify the secret token on every update (Telegram sends it in the X-Telegram-Bot-Api-Secret-Token header when the webhook is registered with a secret)
  • Status: Implemented (webhook secret validation)

B. Discord

  • Vector: Gateway intent manipulation for privilege escalation
  • Mitigation: Minimal intent configuration, validate member permissions
  • Status: ⚠️ Partial (intents configured but not runtime validated)

C. Slack

  • Vector: Request forgery via delayed signature replay
  • Mitigation: Timestamp validation in signature verification
  • Status: Implemented (Bolt handles this)

D. Matrix

  • Vector: Device verification bypass for E2EE rooms
  • Mitigation: Require verified devices, blacklist unverified
  • Status: ⚠️ Partial (E2EE supported but verification UI not implemented)

E. Signal

  • Vector: signal-cli daemon access if local
  • Mitigation: Bind to localhost only, file permissions on socket
  • Status: ⚠️ Partial (relies on system configuration)

F. WhatsApp

  • Vector: Bridge process compromise, session hijacking
  • Mitigation: Process isolation, session file permissions, QR code timeout
  • Status: ⚠️ Partial (process isolation via subprocess)

G. Email

  • Vector: Attachment malware, phishing via spoofed sender
  • Mitigation: Attachment scanning, SPF/DKIM validation consideration
  • Status: ⚠️ Partial (automated sender filtering, no malware scanning)

H. Webhook

  • Vector: HMAC secret brute force, replay attacks
  • Mitigation: Constant-time comparison, timestamp validation, rate limiting
  • Status: Implemented (constant-time HMAC, rate limiting)
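The webhook mitigations above (constant-time comparison plus timestamp validation) can be sketched with the standard library. The message layout `timestamp.body`, the function names, and the 300-second skew window are assumptions for illustration, not the adapter's actual wire format.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Return a hex HMAC-SHA256 over an assumed 'timestamp.body' layout."""
    message = str(timestamp).encode("ascii") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(
    secret: bytes,
    timestamp: int,
    body: bytes,
    signature: str,
    max_skew_seconds: int = 300,
    now: Optional[float] = None,
) -> bool:
    now = time.time() if now is None else now
    # Reject stale or future-dated requests to limit replay attacks.
    if abs(now - timestamp) > max_skew_seconds:
        return False
    expected = sign(secret, timestamp, body)
    # compare_digest runs in constant time with respect to the contents.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Binding the timestamp into the signed message matters: if only the body were signed, an attacker could replay an old request with a fresh timestamp.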

I. API Server

  • Vector: API key brute force, unauthorized model access
  • Mitigation: Rate limiting, key rotation, request logging
  • Status: ⚠️ Partial (rate limiting recommended but not enforced)

5.4 Security Recommendations

  1. Implement Secret Rotation: All platforms using long-lived tokens should support rotation without restart
  2. Add Request Signing: Platforms without native validation should implement Ed25519 request signing
  3. Implement Audit Logging: All authentication events should be logged with structured format
  4. Add Rate Limiting: Per-user, per-chat, and per-platform rate limiting with exponential backoff
  5. Enable Content Scanning: File attachments should be scanned for malware before processing
  6. Implement CSP: For webhook/API server, strict Content-Security-Policy headers
  7. Add Security Headers: All HTTP responses should include security headers (HSTS, X-Frame-Options, etc.)

Appendix A: Code Quality Metrics

A.1 Test Coverage by Platform

Platform Unit Tests Integration Tests Mock Coverage
Telegram High
Discord High
Slack High
Matrix Medium
Signal ⚠️ Medium
WhatsApp ⚠️ Low
Mattermost High
Email High
SMS High
Other ⚠️ Low

A.2 Documentation Completeness

Platform Setup Guide API Reference Troubleshooting Examples
Telegram
Discord
Slack
WhatsApp ⚠️
Signal ⚠️ ⚠️
Matrix ⚠️ ⚠️
Other ⚠️

Appendix B: Performance Benchmarks (Estimated)

| Platform | Messages/sec | Latency (p50) | Latency (p99) | Memory/session |
|---|---|---|---|---|
| Telegram | 100+ | 50ms | 200ms | ~5KB |
| Discord | 50+ | 100ms | 500ms | ~10KB |
| Slack | 50+ | 150ms | 600ms | ~8KB |
| Matrix | 20+ | 300ms | 1000ms | ~15KB |
| Signal | 30+ | 200ms | 800ms | ~10KB |
| WhatsApp | 20+ | 500ms | 2000ms | ~20KB |

Report generated: March 30, 2026
Total lines analyzed: ~35,000+
Platforms covered: 15
Files analyzed: 45+