Cost control for L40S pod ($0.79/hr).
Idle watchdog (cron every 15 min):
- Tracks last inference request timestamp
- If idle > 30 min, stops pod via RunPod GraphQL API
- Logs stop/start events with timestamps to cost log
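A minimal sketch of the idle check described above, not the code in big_brain_idle_watchdog.py. The names check_idle, read_last_request, ts_file, and cost_log are hypothetical, and the RunPod GraphQL call is passed in as a stop_pod callable rather than spelled out here. It assumes the last request time is stored as an epoch float in a file, and that a missing or unreadable timestamp means "take no action".

```python
import time
from pathlib import Path
from typing import Callable, Optional

IDLE_LIMIT_S = 30 * 60  # 30-minute idle threshold from the description


def read_last_request(ts_file: Path) -> Optional[float]:
    """Return the stored epoch timestamp, or None if missing or unreadable."""
    try:
        return float(ts_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def check_idle(
    ts_file: Path,
    cost_log: Path,
    stop_pod: Callable[[], None],  # hypothetical wrapper around the API call
    now: Optional[float] = None,
) -> bool:
    """Stop the pod if it has been idle longer than IDLE_LIMIT_S.

    Returns True if a stop was issued, False otherwise.
    """
    now = time.time() if now is None else now
    last = read_last_request(ts_file)
    if last is None or now - last <= IDLE_LIMIT_S:
        return False
    stop_pod()
    # Append a timestamped stop event to the cost log (format is an assumption).
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    with cost_log.open("a") as fh:
        fh.write(f"{stamp} stop idle_s={int(now - last)}\n")
    return True
```

Passing now and stop_pod as arguments keeps the check testable without touching the real pod or the clock.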
Auto-resume manager:
- Import before inference to ensure pod is RUNNING
- If stopped, resumes and polls until Ollama responds
- Updates timestamp on each request
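The resume-and-poll behaviour above could look roughly like the following. This is a sketch under stated assumptions, not the contents of big_brain_manager.py: ensure_running and its parameters are hypothetical names, the status, resume, and Ollama readiness checks are injected callables, and the timeout and poll interval defaults are illustrative only.

```python
import time
from typing import Callable


def ensure_running(
    get_status: Callable[[], str],     # hypothetical: returns the pod status
    resume_pod: Callable[[], None],    # hypothetical: issues the resume call
    ollama_ready: Callable[[], bool],  # hypothetical: True once Ollama answers
    timeout_s: float = 300.0,          # illustrative default
    poll_s: float = 5.0,               # illustrative default
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Resume the pod if it is not RUNNING, then wait until Ollama responds."""
    if get_status() != "RUNNING":
        resume_pod()
    deadline = time.monotonic() + timeout_s
    # Poll until the readiness check passes or the deadline is reached.
    while not ollama_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError("Ollama did not respond before the timeout")
        sleep(poll_s)
```

A caller would run this before each inference request and then write the current time to the timestamp store that the watchdog reads.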
Components:
- big_brain_idle_watchdog.py: idle check + pod stop
- big_brain_manager.py: auto-resume + status
- 11 tests covering all states and edge cases
Closes #577