Files
timmy-home/docs/PREDICTIVE_RESOURCE_ALLOCATION.md
Alexander Whitestone 2c31ae8972
Some checks failed
Smoke Test / smoke (pull_request) Failing after 19s
docs: predictive resource allocation — operator guide (#749)
Closes #749
2026-04-16 01:51:19 +00:00

2.5 KiB

Predictive Resource Allocation

Forecasts near-term fleet demand from historical telemetry so the operator can pre-provision resources before a surge hits.

How It Works

The predictor reads two data sources:

  1. Metric logs (metrics/local_*.jsonl) — request cadence, token volume, caller mix, success/failure rates
  2. Heartbeat logs (heartbeat/ticks_*.jsonl) — Gitea availability, local inference health

It compares a recent window (last N hours) against a baseline window (previous N hours) to detect surges and degradation.

Output Contract

{
  "resource_mode": "steady|surge",
  "dispatch_posture": "normal|degraded",
  "horizon_hours": 6,
  "recent_request_rate": 12.5,
  "baseline_request_rate": 8.0,
  "predicted_request_rate": 15.0,
  "surge_factor": 1.56,
  "demand_level": "elevated|normal|low|critical",
  "gitea_outages": 0,
  "inference_failures": 2,
  "top_callers": [...],
  "recommended_actions": ["..."]
}

Demand Levels

Surge Factor Level Meaning
> 3.0 critical Extreme surge, immediate action needed
> 1.5 elevated Notable increase, pre-warm recommended
> 1.0 normal Slight increase, monitor
<= 1.0 low Flat or declining

Posture Signals

Signal Effect
Surge factor > 1.5 resource_mode: surge + pre-warm recommendation
Gitea outages >= 1 dispatch_posture: degraded + cache recommendation
Inference failures >= 2 resource_mode: surge + reliability investigation
Heavy batch callers Throttle recommendation
High caller failure rates Investigation recommendation

Usage

# Markdown report
python3 scripts/predictive_resource_allocator.py

# JSON output
python3 scripts/predictive_resource_allocator.py --json

# Custom paths and horizon
python3 scripts/predictive_resource_allocator.py \
    --metrics metrics/local_20260329.jsonl \
    --heartbeat heartbeat/ticks_20260329.jsonl \
    --horizon 12

Tests

python3 -m pytest tests/test_predictive_resource_allocator.py -v

The predictor generates contextual recommendations:

  • Pre-warm local inference — surge detected, warm up before next window
  • Throttle background jobs — heavy batch work consuming capacity
  • Investigate failure rates — specific callers failing at high rates
  • Investigate model reliability — inference health degraded
  • Cache forge state — Gitea availability issues
  • Maintain current allocation — no issues detected