Files

Teknium 43d468cea8 docs: comprehensive documentation audit — fix stale info, expand thin pages, add depth (#5393 )

Major changes across 20 documentation pages:

Staleness fixes:
- Fix FAQ: wrong import path (hermes.agent → run_agent)
- Fix FAQ: stale Gemini 2.0 model → Gemini 3 Flash
- Fix integrations/index: missing MiniMax TTS provider
- Fix integrations/index: web_crawl is not a registered tool
- Fix sessions: add all 19 session sources (was only 5)
- Fix cron: add all 18 delivery targets (was only telegram/discord)
- Fix webhooks: add all delivery targets
- Fix overview: add missing MCP, memory providers, credential pools
- Fix all line-number references → use function name searches instead
- Update file size estimates (run_agent ~9200, gateway ~7200, cli ~8500)

Expanded thin pages (< 150 lines → substantial depth):
- honcho.md: 43 → 108 lines — added feature comparison, tools, config, CLI
- overview.md: 49 → 55 lines — added MCP, memory providers, credential pools
- toolsets-reference.md: 57 → 175 lines — added explanations, config examples,
  custom toolsets, wildcards, platform differences table
- optional-skills-catalog.md: 74 → 153 lines — added 25+ missing skills across
  communication, devops, mlops (18!), productivity, research categories
- integrations/index.md: 82 → 115 lines — added messaging, HA, plugins sections
- cron-internals.md: 90 → 195 lines — added job JSON example, lifecycle states,
  tick cycle, delivery targets, script-backed jobs, CLI interface
- gateway-internals.md: 111 → 250 lines — added architecture diagram, message
  flow, two-level guard, platform adapters, token locks, process management
- agent-loop.md: 112 → 235 lines — added entry points, API mode resolution,
  turn lifecycle detail, message alternation rules, tool execution flow,
  callback table, budget tracking, compression details
- architecture.md: 152 → 295 lines — added system overview diagram, data flow
  diagrams, design principles table, dependency chain

Other depth additions:
- context-references.md: added platform availability, compression interaction,
  common patterns sections
- slash-commands.md: added quick commands config example, alias resolution
- image-generation.md: added platform delivery table
- tools-reference.md: added tool counts, MCP tools note
- index.md: updated platform count (5 → 14+), tool count (40+ → 47)

2026-04-05 19:45:50 -07:00

7.0 KiB

Raw Permalink Blame History

title, sidebar_label, sidebar_position

title	sidebar_label	sidebar_position
Integrations	Overview	0

Integrations

Hermes Agent connects to external systems for AI inference, tool servers, IDE workflows, programmatic access, and more. These integrations extend what Hermes can do and where it can run.

AI Providers & Routing

Hermes supports multiple AI inference providers out of the box. Use hermes model to configure interactively, or set them in config.yaml.

AI Providers — OpenRouter, Anthropic, OpenAI, Google, and any OpenAI-compatible endpoint. Hermes auto-detects capabilities like vision, streaming, and tool use per provider.
Provider Routing — Fine-grained control over which underlying providers handle your OpenRouter requests. Optimize for cost, speed, or quality with sorting, whitelists, blacklists, and explicit priority ordering.
Fallback Providers — Automatic failover to backup LLM providers when your primary model encounters errors. Includes primary model fallback and independent auxiliary task fallback for vision, compression, and web extraction.

Tool Servers (MCP)

MCP Servers — Connect Hermes to external tool servers via Model Context Protocol. Access tools from GitHub, databases, file systems, browser stacks, internal APIs, and more without writing native Hermes tools. Supports both stdio and SSE transports, per-server tool filtering, and capability-aware resource/prompt registration.

Web Search Backends

The web_search and web_extract tools support four backend providers, configured via config.yaml or hermes tools:

Backend	Env Var	Search	Extract	Crawl
Firecrawl (default)	`FIRECRAWL_API_KEY`	✔	✔	✔
Parallel	`PARALLEL_API_KEY`	✔	✔	—
Tavily	`TAVILY_API_KEY`	✔	✔	✔
Exa	`EXA_API_KEY`	✔	✔	—

Quick setup example:

web:
  backend: firecrawl    # firecrawl | parallel | tavily | exa

If web.backend is not set, the backend is auto-detected from whichever API key is available. Self-hosted Firecrawl is also supported via FIRECRAWL_API_URL.

Browser Automation

Hermes includes full browser automation with multiple backend options for navigating websites, filling forms, and extracting information:

Browserbase — Managed cloud browsers with anti-bot tooling, CAPTCHA solving, and residential proxies
Browser Use — Alternative cloud browser provider
Local Chrome via CDP — Connect to your running Chrome instance using /browser connect
Local Chromium — Headless local browser via the agent-browser CLI

See Browser Automation for setup and usage.

Voice & TTS Providers

Text-to-speech and speech-to-text across all messaging platforms:

| Provider | Quality | Cost | API Key | ||----------|---------|------|---------| || Edge TTS (default) | Good | Free | None needed | || ElevenLabs | Excellent | Paid | ELEVENLABS_API_KEY | || OpenAI TTS | Good | Paid | VOICE_TOOLS_OPENAI_KEY | || MiniMax | Good | Paid | MINIMAX_API_KEY | || NeuTTS | Good | Free | None needed |

Speech-to-text supports three providers: local Whisper (free, runs on-device), Groq (fast cloud), and OpenAI Whisper API. Voice message transcription works across Telegram, Discord, WhatsApp, and other messaging platforms. See Voice & TTS and Voice Mode for details.

IDE & Editor Integration

IDE Integration (ACP) — Use Hermes Agent inside ACP-compatible editors such as VS Code, Zed, and JetBrains. Hermes runs as an ACP server, rendering chat messages, tool activity, file diffs, and terminal commands inside your editor.

Programmatic Access

API Server — Expose Hermes as an OpenAI-compatible HTTP endpoint. Any frontend that speaks the OpenAI format — Open WebUI, LobeChat, LibreChat, NextChat, ChatBox — can connect and use Hermes as a backend with its full toolset.

Memory & Personalization

Built-in Memory — Persistent, curated memory via MEMORY.md and USER.md files. The agent maintains bounded stores of personal notes and user profile data that survive across sessions.
Memory Providers — Plug in external memory backends for deeper personalization. Seven providers are supported: Honcho (dialectic reasoning), OpenViking (tiered retrieval), Mem0 (cloud extraction), Hindsight (knowledge graphs), Holographic (local SQLite), RetainDB (hybrid search), and ByteRover (CLI-based).

Messaging Platforms

Hermes runs as a gateway bot on 14+ messaging platforms, all configured through the same gateway subsystem:

Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Mattermost, Email, SMS, DingTalk, Feishu/Lark, WeCom, Home Assistant, Webhooks

See the Messaging Gateway overview for the platform comparison table and setup guide.

Home Automation

Home Assistant — Control smart home devices via four dedicated tools (ha_list_entities, ha_get_state, ha_list_services, ha_call_service). The Home Assistant toolset activates automatically when HASS_TOKEN is configured.

Plugins

Plugin System — Extend Hermes with custom tools, lifecycle hooks, and CLI commands without modifying core code. Plugins are discovered from ~/.hermes/plugins/, project-local .hermes/plugins/, and pip-installed entry points.
Build a Plugin — Step-by-step guide for creating Hermes plugins with tools, hooks, and CLI commands.

Training & Evaluation

RL Training — Generate trajectory data from agent sessions for reinforcement learning and model fine-tuning. Supports Atropos environments with customizable reward functions.
Batch Processing — Run the agent across hundreds of prompts in parallel, generating structured ShareGPT-format trajectory data for training data generation or evaluation.

7.0 KiB Raw Permalink Blame History