[GEMINI-03] llama.cpp fleet manager — start, stop, monitor, swap models across all machines #401

Closed
opened 2026-04-08 10:52:56 +00:00 by Timmy · 0 comments
Owner

Part of Epic: #398

We have llama-server on 4 machines. No unified way to manage them.

Build a fleet manager that:

  • Lists all llama-servers (Mac + 3 VPSes) with health, model, tok/s
  • Starts/stops/restarts any server via SSH
  • Swaps models (download GGUF, stop server, start with new model)
  • Reports aggregate fleet capacity (total tok/s, total context)
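
The status and capacity reporting could start from a small data model like the sketch below. The field names, the sample values, and the idea of summing only healthy servers are assumptions; in the real tool each snapshot would be populated by polling the llama-server HTTP endpoints on every host.

```python
from dataclasses import dataclass

# Hypothetical per-server snapshot; in the real tool these values would
# come from polling each llama-server instance over HTTP.
@dataclass
class ServerStatus:
    name: str          # fleet-local name, e.g. "bezalel"
    model: str         # currently loaded GGUF model
    healthy: bool      # did the last health check succeed?
    tok_per_s: float   # measured generation throughput
    ctx_size: int      # configured context window

def fleet_capacity(servers: list[ServerStatus]) -> dict:
    """Aggregate tok/s and context across healthy servers only."""
    up = [s for s in servers if s.healthy]
    return {
        "servers_up": len(up),
        "total_tok_s": sum(s.tok_per_s for s in up),
        "total_ctx": sum(s.ctx_size for s in up),
    }
```

Unhealthy servers are excluded from the totals so that the aggregate reflects capacity actually available right now, not the nominal fleet size.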

This replaces what Timmy currently does ad hoc with SSH one-liners. Make it a proper tool.

Acceptance Criteria

  • `fleet-llama.py status` shows all 4 servers
  • `fleet-llama.py restart bezalel` restarts via SSH
  • `fleet-llama.py swap hermes qwen2.5-coder-7b` downloads and serves
  • Health monitoring with Telegram alerts on failure
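
The CLI surface in the acceptance criteria could be sketched with `argparse` subcommands as below. The host map, the `systemd` unit name, and the use of `ssh` via `subprocess` are all assumptions about how the servers are deployed, not confirmed details of this fleet.

```python
import argparse
import subprocess

# Hypothetical host map; the real tool would read this from a config file.
HOSTS = {
    "bezalel": "bezalel.example.net",
    "hermes": "hermes.example.net",
}

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="fleet-llama.py")
    sub = p.add_subparsers(dest="cmd", required=True)
    sub.add_parser("status", help="show health/model/tok/s for all servers")
    r = sub.add_parser("restart", help="restart one server via SSH")
    r.add_argument("server", choices=sorted(HOSTS))
    s = sub.add_parser("swap", help="download a GGUF and serve it")
    s.add_argument("server", choices=sorted(HOSTS))
    s.add_argument("model")
    return p

def restart(server: str) -> None:
    # Assumes llama-server runs as a systemd service on the remote host.
    subprocess.run(
        ["ssh", HOSTS[server], "sudo", "systemctl", "restart", "llama-server"],
        check=True,
    )
```

With this shape, `fleet-llama.py restart bezalel` and `fleet-llama.py swap hermes qwen2.5-coder-7b` parse directly into the subcommands above; `swap` would then chain a model download, a stop, and a start with the new `--model` path.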
gemini was assigned by Timmy 2026-04-08 10:52:56 +00:00
bezalel was assigned by Timmy 2026-04-08 17:01:17 +00:00

Reference: Timmy_Foundation/timmy-config#401