| name | description | tags | trigger |
|---|---|---|---|
| lazarus-pit-recovery | Resurrect a downed Hermes agent — fallback inference paths, profile recovery, Telegram reconnection. When one falls, all hands rally. | | Agent is down, unresponsive, or has invalid credentials and needs to be brought back online |
# Lazarus Pit — Agent Recovery Protocol
When an agent goes down, ALL available agents rally to bring it back.
## Step 1: Assess Current Fleet State

```bash
# Check running agents
ps aux | grep hermes | grep -v grep
systemctl list-units 'hermes-*' --all

# Check running inference backends
ps aux | grep -E 'ollama|llama-server' | grep -v grep
curl -s http://localhost:11434/api/tags   # Ollama models
```
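The checks above can be wrapped in a small helper that prints one status line per unit. A minimal sketch (the `fleet_state` name is mine), assuming systemd and the `hermes-<name>` unit naming used later in this runbook:

```bash
# fleet_state: print "<unit> <active-state>" for every hermes-* unit.
fleet_state() {
  systemctl list-units 'hermes-*' --all --plain --no-legend \
    | awk '{print $1}' \
    | while read -r unit; do
        printf '%s %s\n' "$unit" "$(systemctl is-active "$unit")"
      done
}
# Usage: fleet_state
```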
## Step 2: Identify the Problem

Common failure modes:

- Invalid API key (Kimi/OpenAI/etc) → Switch to local inference
- Invalid Telegram bot token → Get a fresh token from @BotFather or reuse an available one
- Model not loaded → Pull via Ollama or start llama-server
- Service crashed → Check logs:

```bash
journalctl -u hermes-<name> --since "1 hour ago"
```
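For the invalid-token case, the Bot API's `getMe` method is a quick validity probe: a valid token returns `{"ok":true,...}`. A hedged sketch (the helper name is mine, not part of Hermes):

```bash
# check_bot_token: probe Telegram's getMe; prints VALID or INVALID.
check_bot_token() {
  local token="$1"
  if curl -sf "https://api.telegram.org/bot${token}/getMe" | grep -q '"ok":true'; then
    echo VALID
  else
    echo INVALID
  fi
}
# Usage: check_bot_token "$TELEGRAM_BOT_TOKEN"
```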
## Step 3: Local Inference Fallback Chain

Priority order:

1. **Ollama** (easiest) — check available models:

   ```bash
   ollama list
   ```

   - Gemma 3:4b (fast, low memory)
   - Gemma 3:27b (better quality, more RAM)

   ```bash
   ollama serve &         # If not running
   ollama run gemma3:4b   # Test
   ```

2. **TurboQuant llama.cpp** (best memory efficiency)

   ```bash
   cd /root/llama-cpp-turboquant/
   ./build/bin/llama-server \
     -m /path/to/model.gguf \
     --host 0.0.0.0 --port 8080 \
     -c 4096 --cache-type-k turbo4 --cache-type-v turbo4
   ```

   - turbo4: 3.8x KV compression, minimal quality loss
   - turbo2: 6.4x compression, noticeable quality loss

3. **Standard llama.cpp** — same as above without the `--cache-type-k`/`--cache-type-v` flags
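The priority order above can be encoded as a probe that prints the first healthy OpenAI-compatible `base_url`. A sketch under two assumptions: Ollama answers on `/api/tags` (as in Step 1) and llama-server exposes a `/health` endpoint:

```bash
# pick_backend: print the base_url of the first local backend that responds.
pick_backend() {
  if curl -sf http://localhost:11434/api/tags >/dev/null; then
    echo "http://localhost:11434/v1"   # Ollama
  elif curl -sf http://localhost:8080/health >/dev/null; then
    echo "http://localhost:8080/v1"    # llama-server (TurboQuant or standard)
  else
    echo "no local backend up" >&2
    return 1
  fi
}
# Usage: base_url="$(pick_backend)" || exit 1
```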
## Step 4: Configure Profile

```bash
# Profile locations
ls ~/.hermes/profiles/   # Hermes profiles
ls /root/wizards/        # Wizard directories

# Key files to edit
~/.hermes/profiles/<name>/config.yaml   # Model + provider config
~/.hermes/profiles/<name>/.env          # API keys + bot tokens
/root/wizards/<name>/home/.env          # Alternative .env location
```
Ollama `config.yaml`:

```yaml
model: gemma3:4b
providers:
  ollama:
    base_url: http://localhost:11434/v1
```
llama.cpp `config.yaml`:

```yaml
model: local-model
providers:
  llama-cpp:
    base_url: http://localhost:8080/v1
```
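Either `base_url` can be sanity-checked before launch. This sketch assumes both backends serve an OpenAI-style model list at `<base_url>/models` (worth verifying on your build); the helper name is mine:

```bash
# check_base_url: OK if the endpoint returns an OpenAI-style model list.
check_base_url() {
  curl -sf "${1%/}/models" | grep -q '"data"' && echo OK || echo DOWN
}
# Usage: check_base_url http://localhost:11434/v1
```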
## Step 5: Connect Telegram

```bash
# Add bot token to .env
echo 'TELEGRAM_BOT_TOKEN=<token>' >> ~/.hermes/profiles/<name>/.env

# Add channel
echo 'TELEGRAM_ALLOWED_CHATS=-1003664764329' >> ~/.hermes/profiles/<name>/.env
```
## Step 6: Launch & Verify

```bash
# Start service
systemctl start hermes-<name>

# Or manual:
HERMES_PROFILE=<name> hermes gateway run
```
## Step 7: Validate

- Send a test message in Telegram
- Check the response arrives
- Verify logs:

```bash
journalctl -u hermes-<name> -f
```
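The manual check above can also be scripted against the Bot API's `sendMessage` method, using the allowed chat ID from Step 5 (the `send_test` name is mine):

```bash
# send_test: post a smoke-test message; OK if the API accepted it.
send_test() {
  local token="$1" chat="$2"
  curl -sf "https://api.telegram.org/bot${token}/sendMessage" \
       -d chat_id="${chat}" -d text='lazarus-pit: recovery smoke test' \
    | grep -q '"ok":true' && echo OK || echo FAIL
}
# Usage: send_test "$TELEGRAM_BOT_TOKEN" -1003664764329
```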
## Pitfalls

- Qin profile has INVALID Kimi keys and bot token as of 2026-04 — needs fresh creds
- Allegro and Ezra tokens are IN USE — don't steal from running agents
- CPU-only inference is slow (~35s for Gemma 3:4b) — acceptable for chat, not for coding
- TurboQuant requires a custom llama.cpp build — standard Ollama doesn't support it
- Token masking — `systemctl show` masks env vars; check `.env` files directly
## Known Bot Inventory
| Agent | Status | Backend | Notes |
|---|---|---|---|
| Ezra | ACTIVE | Kimi | Don't touch |
| Allegro | ACTIVE | Kimi | Don't touch |
| Bezalel | AVAILABLE | Ollama/llama.cpp | Recovery candidate |
| Qin | BROKEN | - | Needs fresh creds |
| Adagio | AVAILABLE | - | Token may be invalid |