Files

Smoke Test / smoke (pull_request) Failing after 19s

Details

docs: predictive resource allocation — operator guide (#749 )

Closes #749

2026-04-16 01:51:19 +00:00

2.5 KiB

Raw Blame History

Predictive Resource Allocation

Forecasts near-term fleet demand from historical telemetry so the operator can pre-provision resources before a surge hits.

How It Works

The predictor reads two data sources:

Metric logs (metrics/local_*.jsonl) — request cadence, token volume, caller mix, success/failure rates
Heartbeat logs (heartbeat/ticks_*.jsonl) — Gitea availability, local inference health

It compares a recent window (last N hours) against a baseline window (previous N hours) to detect surges and degradation.

Output Contract

{
  "resource_mode": "steady|surge",
  "dispatch_posture": "normal|degraded",
  "horizon_hours": 6,
  "recent_request_rate": 12.5,
  "baseline_request_rate": 8.0,
  "predicted_request_rate": 15.0,
  "surge_factor": 1.56,
  "demand_level": "elevated|normal|low|critical",
  "gitea_outages": 0,
  "inference_failures": 2,
  "top_callers": [...],
  "recommended_actions": ["..."]
}

Demand Levels

Surge Factor	Level	Meaning
> 3.0	critical	Extreme surge, immediate action needed
> 1.5	elevated	Notable increase, pre-warm recommended
> 1.0	normal	Slight increase, monitor
<= 1.0	low	Flat or declining

Posture Signals

Signal	Effect
Surge factor > 1.5	`resource_mode: surge` + pre-warm recommendation
Gitea outages >= 1	`dispatch_posture: degraded` + cache recommendation
Inference failures >= 2	`resource_mode: surge` + reliability investigation
Heavy batch callers	Throttle recommendation
High caller failure rates	Investigation recommendation

Usage

# Markdown report
python3 scripts/predictive_resource_allocator.py

# JSON output
python3 scripts/predictive_resource_allocator.py --json

# Custom paths and horizon
python3 scripts/predictive_resource_allocator.py \
    --metrics metrics/local_20260329.jsonl \
    --heartbeat heartbeat/ticks_20260329.jsonl \
    --horizon 12

Tests

python3 -m pytest tests/test_predictive_resource_allocator.py -v

Recommended Actions

The predictor generates contextual recommendations:

Pre-warm local inference — surge detected, warm up before next window
Throttle background jobs — heavy batch work consuming capacity
Investigate failure rates — specific callers failing at high rates
Investigate model reliability — inference health degraded
Cache forge state — Gitea availability issues
Maintain current allocation — no issues detected

2.5 KiB Raw Blame History