[HEALTH] Report VRAM utilization and tok/s in model_health task #30

Closed
opened 2026-03-27 22:48:30 +00:00 by perplexity · 1 comment
Member

Source

Sovereign Developer's Brief (2026-03-27), Section 4 "Low-Hanging Fruit", item 1.

What

Extend the model_health() periodic task in tasks.py to report:

  • VRAM utilization (via sysctl on macOS or parsing llama-server metrics endpoint)
  • Tokens per second from the inference ping (it already runs a 5-token completion, so measure the wall time of that call)
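
The tokens-per-second measurement described above can be sketched as follows. This is a minimal illustration, not the code in `tasks.py`: `run_completion` is a hypothetical callable standing in for the existing inference ping, assumed to return the number of tokens it actually generated. The `clock` parameter is also an assumption, added only so the timing can be tested deterministically.

```python
import time
from typing import Callable, Optional


def measure_tokens_per_second(
    run_completion: Callable[[], int],
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[float]:
    """Time one completion call and return generated tokens per second.

    `run_completion` is a hypothetical stand-in for the inference ping; it
    must return the number of tokens that were generated.
    """
    start = clock()
    tokens = run_completion()
    elapsed = clock() - start
    if tokens <= 0 or elapsed <= 0:
        # Nothing was generated, or the clock did not advance.
        return None
    return tokens / elapsed
```

Because the timer wraps the whole call, the result includes prompt processing and request overhead. With only five tokens it is best read as a coarse end-to-end rate for spotting trends rather than as a precise decode speed.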

Why

The health check currently reports only a binary up/down status plus the model list. Adding resource metrics makes it possible to detect degradation before it causes failures.

Acceptance

  • model_health.json includes vram_used_mb, vram_total_mb, tokens_per_second
  • No new dependencies (use existing llama-server /metrics or system calls)

Related

  • Existing issue #4 (Verify Huey tasks) covers general health but not these specific metrics.
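
One way to satisfy the acceptance criteria with the standard library only is sketched below. It parses Prometheus-style text, as a `/metrics` endpoint typically returns, and merges the three required fields into a report dictionary. The report keys `up` and `models` and the helper names are hypothetical. Which metrics `llama-server` actually exposes, and whether any of them cover VRAM, is not confirmed here, so the values are taken as plain arguments.

```python
import json
from typing import Dict, Optional


def parse_metrics_text(text: str) -> Dict[str, float]:
    """Parse Prometheus-style exposition text into {name: value}.

    Intended for unlabeled `name value` samples. Comment lines and lines
    whose second field is not a number are skipped.
    """
    metrics: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            metrics[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return metrics


def extend_health_report(
    report: dict,
    vram_used_mb: Optional[float],
    vram_total_mb: Optional[float],
    tokens_per_second: Optional[float],
) -> dict:
    """Return a copy of `report` with the three acceptance fields added.

    A value of None means the metric could not be measured on this run.
    """
    extended = dict(report)
    extended.update(
        {
            "vram_used_mb": vram_used_mb,
            "vram_total_mb": vram_total_mb,
            "tokens_per_second": tokens_per_second,
        }
    )
    return extended


if __name__ == "__main__":
    # Hypothetical existing report shape; the real one may differ.
    base = {"up": True, "models": ["example-model"]}
    print(json.dumps(extend_health_report(base, None, None, 2.0), indent=2))
```

Writing `None` (serialized as `null`) for a metric that could not be collected keeps the JSON schema stable, so consumers can tell "not measured" apart from a real zero.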
Owner

Closing during the 2026-03-28 backlog burn-down.

Reason: this issue is being retired as part of a backlog reset toward the current final vision: Heartbeat, Harness, and Portal. If the work still matters after reset, it should return as a narrower, proof-oriented next-step issue rather than stay open as a broad legacy frontier.

Timmy closed this issue 2026-03-28 04:53:04 +00:00

Reference: Timmy_Foundation/timmy-config#30