---
name: lazarus-pit-recovery
description: "Resurrect a downed Hermes agent — fallback inference paths, profile recovery, Telegram reconnection. When one falls, all hands rally."
tags: [recovery, agents, ollama, llama-cpp, turboquant, telegram, lazarus]
trigger: "Agent is down, unresponsive, or has invalid credentials and needs to be brought back online"
---
# Lazarus Pit — Agent Recovery Protocol
When an agent goes down, ALL available agents rally to bring it back.
## Step 1: Assess Current Fleet State
```bash
# Check running agents
ps aux | grep hermes | grep -v grep
systemctl list-units 'hermes-*' --all
# Check running inference backends
ps aux | grep -E 'ollama|llama-server' | grep -v grep
curl -s http://localhost:11434/api/tags # Ollama models
```
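
For a larger fleet, a per-agent status loop is quicker than reading `ps` output. A minimal sketch; the agent names are taken from the inventory table below, and the `hermes-<name>` unit naming from Step 6:

```bash
# One-line status per agent; names are illustrative -- use your own fleet
for name in ezra allegro bezalel qin adagio; do
    printf '%-10s %s\n' "$name" "$(systemctl is-active "hermes-$name" 2>/dev/null)"
done
```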
## Step 2: Identify the Problem
Common failure modes (a triage sketch follows this list):
- **Invalid API key** (Kimi/OpenAI/etc.) → Switch to local inference
- **Invalid Telegram bot token** → Get a fresh token from @BotFather or reuse an available one
- **Model not loaded** → Pull via Ollama or start llama-server
- **Service crashed** → Check logs: `journalctl -u hermes-<name> --since "1 hour ago"`
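
A quick triage sketch combining the checks above; the profile name and the grep pattern are illustrative guesses, not a documented Hermes log format:

```bash
name=qin   # example: the broken profile from the inventory below

# Surface recent error-ish log lines from the service
journalctl -u "hermes-$name" --since "1 hour ago" --no-pager \
    | grep -iE 'error|invalid|unauthorized' | tail -20

# Confirm the local inference backend is reachable at all
curl -s -o /dev/null -w 'ollama: HTTP %{http_code}\n' http://localhost:11434/api/tags
```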
## Step 3: Local Inference Fallback Chain
Priority order (a probe sketch follows this list):
1. **Ollama** (easiest) — Check available models: `ollama list`
   - Gemma 3:4b (fast, low memory)
   - Gemma 3:27b (better quality, more RAM)
   ```bash
   ollama serve &        # If not running
   ollama run gemma3:4b  # Test
   ```
2. **TurboQuant llama.cpp** (best memory efficiency)
   ```bash
   cd /root/llama-cpp-turboquant/
   ./build/bin/llama-server \
     -m /path/to/model.gguf \
     --host 0.0.0.0 --port 8080 \
     -c 4096 --cache-type-k turbo4 --cache-type-v turbo4
   ```
   - turbo4: 3.8x KV compression, minimal quality loss
   - turbo2: 6.4x compression, noticeable quality loss
3. **Standard llama.cpp** — Same as above without `--cache-type` flags
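
A sketch that walks the chain in priority order and reports the first live backend, assuming both servers expose the OpenAI-compatible `/v1/models` route on the ports used above:

```bash
# Probe backends in fallback priority order; stop at the first that answers
for url in http://localhost:11434/v1/models http://localhost:8080/v1/models; do
    if curl -sf "$url" >/dev/null; then
        echo "live backend: $url"
        break
    fi
done
```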
## Step 4: Configure Profile
```bash
# Profile locations
ls ~/.hermes/profiles/ # Hermes profiles
ls /root/wizards/ # Wizard directories
# Key files to edit
~/.hermes/profiles/<name>/config.yaml # Model + provider config
~/.hermes/profiles/<name>/.env # API keys + bot tokens
/root/wizards/<name>/home/.env # Alternative .env location
```
### Ollama config.yaml:
```yaml
model: gemma3:4b
providers:
  ollama:
    base_url: http://localhost:11434/v1
```
### llama.cpp config.yaml:
```yaml
model: local-model
providers:
  llama-cpp:
    base_url: http://localhost:8080/v1
```
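
Before pointing a profile at either backend, confirm the `/v1` endpoint actually answers a chat request. A sketch against the llama.cpp server above; for Ollama, swap the port to 11434 and the model to `gemma3:4b`:

```bash
# Expect a JSON chat completion back; an error body means the backend is up
# but misconfigured, no response at all means it is down
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "local-model", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 8}'
```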
## Step 5: Connect Telegram
```bash
# Add bot token to .env
echo 'TELEGRAM_BOT_TOKEN=<token>' >> ~/.hermes/profiles/<name>/.env
# Add channel
echo 'TELEGRAM_ALLOWED_CHATS=-1003664764329' >> ~/.hermes/profiles/<name>/.env
```
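
Before wiring a token in, confirm Telegram accepts it; `getMe` is the Bot API's standard liveness check and returns the bot's username on success:

```bash
# Pull the token back out of the .env and ask Telegram who it belongs to
TOKEN=$(grep '^TELEGRAM_BOT_TOKEN=' ~/.hermes/profiles/<name>/.env | cut -d= -f2-)
curl -s "https://api.telegram.org/bot${TOKEN}/getMe"
```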
## Step 6: Launch & Verify
```bash
# Start service
systemctl start hermes-<name>
# Or manual:
HERMES_PROFILE=<name> hermes gateway run
```
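
If the agent should survive reboots, enable the unit as well. A minimal sketch, assuming the `hermes-<name>` unit file already exists:

```bash
systemctl enable --now hermes-<name>                 # start now and at boot
systemctl status hermes-<name> --no-pager | head -5  # quick sanity check
```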
## Step 7: Validate
- Send test message in Telegram
- Check that a response arrives
- Verify logs: `journalctl -u hermes-<name> -f` (a scripted smoke test follows)
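
The Telegram leg can be scripted too. A sketch reusing `TOKEN` from the `getMe` check in Step 5 and the chat ID from the same step; note this proves the token and channel membership, not that the agent itself replies:

```bash
# Expect {"ok":true,...}; the agent's reply still needs a human eyeball
curl -s "https://api.telegram.org/bot${TOKEN}/sendMessage" \
    -d chat_id=-1003664764329 \
    -d text='lazarus-pit: smoke test'
```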
## Pitfalls
- **Qin profile** has INVALID Kimi keys and bot token as of 2026-04 — needs fresh creds
- **Allegro and Ezra tokens** are IN USE — don't steal from running agents
- **CPU-only inference** is slow (~35s for Gemma 3:4b) — acceptable for chat, not for coding
- **TurboQuant requires custom llama.cpp build** — standard Ollama doesn't support it
- **Token masking** — `systemctl show` masks env vars; check .env files directly (see the sketch below)
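
To see which profiles have a bot token set without echoing secrets into the terminal or shell history, a small sketch over both .env locations from Step 4:

```bash
# Report presence, not value, of bot tokens across all known profiles
for env in ~/.hermes/profiles/*/.env /root/wizards/*/home/.env; do
    [ -f "$env" ] || continue
    if grep -q '^TELEGRAM_BOT_TOKEN=' "$env"; then
        echo "token set:     $env"
    else
        echo "token missing: $env"
    fi
done
```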
## Known Bot Inventory
| Agent | Status | Backend | Notes |
|-------|--------|---------|-------|
| Ezra | ACTIVE | Kimi | Don't touch |
| Allegro | ACTIVE | Kimi | Don't touch |
| Bezalel | AVAILABLE | Ollama/llama.cpp | Recovery candidate |
| Qin | BROKEN | - | Needs fresh creds |
| Adagio | AVAILABLE | - | Token may be invalid |