Merge main's faster-whisper support (local, free) with our Groq support into a unified three-provider STT pipeline: local > groq > openai. Provider priority ensures free options are tried first. Each provider has its own transcriber function with model auto-correction, env-overridable endpoints, and proper error handling. 74 tests cover the full provider matrix, fallback chains, model correction, config loading, validation edge cases, and dispatch.
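The provider fallback chain can be sketched as follows (the function and variable names here are illustrative, not the actual implementation):

```python
# Sketch of the local > groq > openai fallback chain. Each provider gets a
# transcriber callable; unconfigured providers are skipped, failing ones
# fall through to the next in priority order.

PROVIDER_PRIORITY = ["local", "groq", "openai"]

def transcribe(audio_path, transcribers):
    """Try each configured provider in priority order; raise if all fail."""
    errors = {}
    for name in PROVIDER_PRIORITY:
        fn = transcribers.get(name)
        if fn is None:
            continue  # provider not configured (e.g. missing API key)
        try:
            return name, fn(audio_path)
        except Exception as exc:  # any provider failure triggers fallback
            errors[name] = exc
    raise RuntimeError(f"all STT providers failed: {errors}")
```

For example, if the local transcriber raises (faster-whisper not installed), the call falls through and returns the Groq result instead.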
309 lines · 14 KiB · Plaintext
# Hermes Agent Environment Configuration
# Copy this file to .env and fill in your API keys

# =============================================================================
# LLM PROVIDER (OpenRouter)
# =============================================================================
# OpenRouter provides access to many models through one API
# All LLM calls go through OpenRouter - no direct provider keys needed
# Get your key at: https://openrouter.ai/keys
OPENROUTER_API_KEY=

# Default model to use (OpenRouter format: provider/model)
# Examples: anthropic/claude-opus-4.6, openai/gpt-4o, google/gemini-3-flash-preview, zhipuai/glm-4-plus
LLM_MODEL=anthropic/claude-opus-4.6

# =============================================================================
# LLM PROVIDER (z.ai / GLM)
# =============================================================================
# z.ai provides access to ZhipuAI GLM models (GLM-4-Plus, etc.)
# Get your key at: https://z.ai or https://open.bigmodel.cn
GLM_API_KEY=
# GLM_BASE_URL=https://api.z.ai/api/paas/v4 # Override default base URL

# =============================================================================
# LLM PROVIDER (Kimi / Moonshot)
# =============================================================================
# Kimi Code provides access to Moonshot AI coding models (kimi-k2.5, etc.)
# Get your key at: https://platform.kimi.ai (Kimi Code console)
# Keys prefixed with sk-kimi- use the Kimi Code API (api.kimi.com) by default.
# Legacy keys from platform.moonshot.ai need the KIMI_BASE_URL override below.
KIMI_API_KEY=
# KIMI_BASE_URL=https://api.kimi.com/coding/v1 # Default for sk-kimi- keys
# KIMI_BASE_URL=https://api.moonshot.ai/v1 # For legacy Moonshot keys
# KIMI_BASE_URL=https://api.moonshot.cn/v1 # For Moonshot China keys

# =============================================================================
# LLM PROVIDER (MiniMax)
# =============================================================================
# MiniMax provides access to MiniMax models (global endpoint)
# Get your key at: https://www.minimax.io
MINIMAX_API_KEY=
# MINIMAX_BASE_URL=https://api.minimax.io/v1 # Override default base URL

# MiniMax China endpoint (for users in mainland China)
MINIMAX_CN_API_KEY=
# MINIMAX_CN_BASE_URL=https://api.minimaxi.com/v1 # Override default base URL

# =============================================================================
# TOOL API KEYS
# =============================================================================

# Firecrawl API Key - Web search, extract, and crawl
# Get at: https://firecrawl.dev/
FIRECRAWL_API_KEY=

# FAL.ai API Key - Image generation
# Get at: https://fal.ai/
FAL_KEY=

# Honcho - Cross-session AI-native user modeling (optional)
# Builds a persistent understanding of the user across sessions and tools.
# Get at: https://app.honcho.dev
# Also requires ~/.honcho/config.json with enabled=true (see README).
HONCHO_API_KEY=
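# A minimal ~/.honcho/config.json satisfying the enabled=true requirement above
# (a sketch; the README may define additional keys):
#   { "enabled": true }
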
# =============================================================================
# TERMINAL TOOL CONFIGURATION (mini-swe-agent backend)
# =============================================================================
# Terminal backend is configured in ~/.hermes/config.yaml (terminal.backend).
# Use 'hermes setup' or 'hermes config set terminal.backend docker' to change.
# Supported: local, docker, singularity, modal, ssh
#
# Only override here if you need to force a backend without touching config.yaml:
# TERMINAL_ENV=local

# Container images (for singularity/docker/modal backends)
# TERMINAL_DOCKER_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20
# TERMINAL_SINGULARITY_IMAGE=docker://nikolaik/python-nodejs:python3.11-nodejs20
TERMINAL_MODAL_IMAGE=nikolaik/python-nodejs:python3.11-nodejs20

# Working directory for terminal commands
# For local backend: "." means current directory (resolved automatically)
# For remote backends (ssh/docker/modal/singularity): use an absolute path
# INSIDE the target environment, or leave unset for the backend's default
# (/root for modal, / for docker, ~ for ssh). Do NOT use a host-local path.
# Usually managed by config.yaml (terminal.cwd) — uncomment to override
# TERMINAL_CWD=.

# Default command timeout in seconds
TERMINAL_TIMEOUT=60

# Cleanup inactive environments after this many seconds
TERMINAL_LIFETIME_SECONDS=300

# =============================================================================
# SSH REMOTE EXECUTION (for TERMINAL_ENV=ssh)
# =============================================================================
# Run terminal commands on a remote server via SSH.
# Agent code stays on your machine; commands execute remotely.
#
# SECURITY BENEFITS:
# - Agent cannot read your .env file (API keys protected)
# - Agent cannot modify its own code
# - Remote server acts as an isolated sandbox
# - Can safely configure passwordless sudo on the remote
#
# TERMINAL_SSH_HOST=192.168.1.100
# TERMINAL_SSH_USER=agent
# TERMINAL_SSH_PORT=22
# TERMINAL_SSH_KEY=~/.ssh/id_rsa

# =============================================================================
# SUDO SUPPORT (works with ALL terminal backends)
# =============================================================================
# If set, enables sudo commands by piping the password via `sudo -S`.
# Works with: local, docker, singularity, modal, and ssh backends.
#
# SECURITY WARNING: Password stored in plaintext. Only use on trusted machines.
#
# ALTERNATIVES:
# - For SSH backend: Configure passwordless sudo on the remote server
# - For containers: Run as root inside the container (no sudo needed)
# - For local: Configure /etc/sudoers for specific commands
# - For CLI: Leave unset - you'll be prompted interactively with a 45s timeout
#
# SUDO_PASSWORD=your_password_here
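# Example /etc/sudoers.d entry for the "local" alternative above (a sketch; the
# "agent" user and the apt-get command are placeholders, not from this file):
#   agent ALL=(ALL) NOPASSWD: /usr/bin/apt-get
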
# =============================================================================
# MODAL CLOUD BACKEND (Optional - for TERMINAL_ENV=modal)
# =============================================================================
# Modal uses CLI authentication, not environment variables.
# Run: pip install modal && modal setup
# This will authenticate via browser and store credentials locally.
# No API key needed in .env - Modal handles auth automatically.

# =============================================================================
# BROWSER TOOL CONFIGURATION (agent-browser + Browserbase)
# =============================================================================
# Browser automation requires the Browserbase cloud service for remote browser execution.
# This allows the agent to navigate websites, fill forms, and extract information.
#
# STEALTH MODES:
# - Basic Stealth: ALWAYS active (random fingerprints, auto CAPTCHA solving)
# - Advanced Stealth: Requires BROWSERBASE_ADVANCED_STEALTH=true (Scale Plan only)

# Browserbase API Key - Cloud browser execution
# Get at: https://browserbase.com/
BROWSERBASE_API_KEY=

# Browserbase Project ID - From your Browserbase dashboard
BROWSERBASE_PROJECT_ID=

# Enable residential proxies for better CAPTCHA solving (default: true)
# Routes traffic through residential IPs, significantly improves success rate
BROWSERBASE_PROXIES=true

# Enable advanced stealth mode (default: false, requires Scale Plan)
# Uses custom Chromium build to avoid bot detection altogether
BROWSERBASE_ADVANCED_STEALTH=false

# Browser session timeout in seconds (default: 300)
# Sessions are cleaned up after this duration of inactivity
BROWSER_SESSION_TIMEOUT=300

# Browser inactivity timeout - auto-cleanup inactive sessions (default: 120 = 2 min)
# Browser sessions are automatically closed after this period of no activity
BROWSER_INACTIVITY_TIMEOUT=120

# =============================================================================
# SESSION LOGGING
# =============================================================================
# Session trajectories are automatically saved to the logs/ directory
# Format: logs/session_YYYYMMDD_HHMMSS_UUID.json
# Contains full conversation history in trajectory format for debugging/replay

# =============================================================================
# VOICE TRANSCRIPTION & OPENAI TTS
# =============================================================================
# Required for voice message transcription (Whisper) and OpenAI TTS voices.
# Uses OpenAI's API directly (not via OpenRouter).
# Named VOICE_TOOLS_OPENAI_KEY to avoid interference with OpenRouter.
# Get at: https://platform.openai.com/api-keys
VOICE_TOOLS_OPENAI_KEY=

# =============================================================================
# SLACK INTEGRATION
# =============================================================================
# Slack Bot Token - From Slack App settings (OAuth & Permissions)
# Get at: https://api.slack.com/apps
# SLACK_BOT_TOKEN=xoxb-...

# Slack App Token - For Socket Mode (App-Level Tokens in Slack App settings)
# SLACK_APP_TOKEN=xapp-...

# Slack allowed users (comma-separated Slack user IDs)
# SLACK_ALLOWED_USERS=

# WhatsApp (built-in Baileys bridge — run `hermes whatsapp` to pair)
# WHATSAPP_ENABLED=false
# WHATSAPP_ALLOWED_USERS=15551234567

# Email (IMAP/SMTP — send and receive emails as Hermes)
# For Gmail: enable 2FA → create an App Password at https://myaccount.google.com/apppasswords
# EMAIL_ADDRESS=hermes@gmail.com
# EMAIL_PASSWORD=xxxx xxxx xxxx xxxx
# EMAIL_IMAP_HOST=imap.gmail.com
# EMAIL_IMAP_PORT=993
# EMAIL_SMTP_HOST=smtp.gmail.com
# EMAIL_SMTP_PORT=587
# EMAIL_POLL_INTERVAL=15
# EMAIL_ALLOWED_USERS=your@email.com
# EMAIL_HOME_ADDRESS=your@email.com

# Web UI (browser-based chat interface on local network)
# Access from phone/tablet/desktop at http://<your-ip>:8765
# WEB_UI_ENABLED=false
# WEB_UI_PORT=8765
# WEB_UI_HOST=127.0.0.1 # Use 0.0.0.0 to expose on LAN
# WEB_UI_TOKEN= # Auto-generated if empty

# Gateway-wide: allow ALL users without an allowlist (default: false = deny)
# Only set to true if you intentionally want open access.
# GATEWAY_ALLOW_ALL_USERS=false

# =============================================================================
# RESPONSE PACING
# =============================================================================
# Human-like delays between message chunks on messaging platforms.
# Makes the bot feel less robotic.
# HERMES_HUMAN_DELAY_MODE=off # off | natural | custom
# HERMES_HUMAN_DELAY_MIN_MS=800 # Min delay in ms (custom mode)
# HERMES_HUMAN_DELAY_MAX_MS=2500 # Max delay in ms (custom mode)

# =============================================================================
# DEBUG OPTIONS
# =============================================================================
WEB_TOOLS_DEBUG=false
VISION_TOOLS_DEBUG=false
MOA_TOOLS_DEBUG=false
IMAGE_TOOLS_DEBUG=false

# =============================================================================
# CONTEXT COMPRESSION (Auto-shrinks long conversations)
# =============================================================================
# When the conversation approaches the model's context limit, middle turns are
# automatically summarized to free up space.
#
# Context compression is configured in ~/.hermes/config.yaml under compression:
# CONTEXT_COMPRESSION_ENABLED=true # Enable auto-compression (default: true)
# CONTEXT_COMPRESSION_THRESHOLD=0.85 # Compress at 85% of context limit
# Model is set via compression.summary_model in config.yaml (default: google/gemini-3-flash-preview)
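# Sketch of the corresponding config.yaml block (only summary_model is named
# above; the enabled/threshold key names are assumptions):
#   compression:
#     enabled: true        # assumed key name
#     threshold: 0.85      # assumed key name
#     summary_model: google/gemini-3-flash-preview
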
# =============================================================================
# RL TRAINING (Tinker + Atropos)
# =============================================================================
# Run reinforcement learning training on language models using the Tinker API.
# Requires the rl-server to be running (from the tinker-atropos package).

# Tinker API Key - RL training service
# Get at: https://tinker-console.thinkingmachines.ai/keys
TINKER_API_KEY=

# Weights & Biases API Key - Experiment tracking and metrics
# Get at: https://wandb.ai/authorize
WANDB_API_KEY=

# RL API Server URL (default: http://localhost:8080)
# Change if running the rl-server on a different host/port
# RL_API_URL=http://localhost:8080

# =============================================================================
# SKILLS HUB (GitHub integration for skill search/install/publish)
# =============================================================================

# GitHub Personal Access Token — for higher API rate limits on skill search/install
# Get at: https://github.com/settings/tokens (Fine-grained recommended)
# GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx

# GitHub App credentials (optional — for bot identity on PRs)
# GITHUB_APP_ID=
# GITHUB_APP_PRIVATE_KEY_PATH=
# GITHUB_APP_INSTALLATION_ID=

# Groq API key (free tier — used for Whisper STT in voice mode)
# GROQ_API_KEY=

# =============================================================================
# STT PROVIDER SELECTION
# =============================================================================
# Default STT provider is "local" (faster-whisper) — runs on your machine, no API key needed.
# Install with: pip install faster-whisper
# Model downloads automatically on first use (~150 MB for "base").
# To use cloud providers instead, set GROQ_API_KEY or VOICE_TOOLS_OPENAI_KEY above.
# Provider priority: local > groq > openai
# Configure in config.yaml: stt.provider: local | groq | openai
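# Example ~/.hermes/config.yaml fragment for the setting above (a sketch):
#   stt:
#     provider: local   # local | groq | openai
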
# =============================================================================
# STT ADVANCED OVERRIDES (optional)
# =============================================================================
# Override default STT models per provider (normally set via stt.model in config.yaml)
# STT_GROQ_MODEL=whisper-large-v3-turbo
# STT_OPENAI_MODEL=whisper-1

# Override STT provider endpoints (for proxies or self-hosted instances)
# GROQ_BASE_URL=https://api.groq.com/openai/v1
# STT_OPENAI_BASE_URL=https://api.openai.com/v1