Some checks failed
Tests / test (pull_request) Failing after 25m4s
Tests / e2e (pull_request) Successful in 3m19s
Contributor Attribution Check / check-attribution (pull_request) Failing after 14s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 14s
Error rate peaks at 18:00 (9.4%) during evening cron batches vs 4.0% at 09:00 during interactive work. Route cron tasks to stronger models during off-hours when user is not present to correct errors.

New agent/time_aware_routing.py:
- resolve_time_aware_model(): routes based on hour, error rate, task type
  - Interactive sessions: always use base model (user corrects errors)
  - Cron during business hours: use base model (low error rate)
  - Cron during off-hours with high error rate (>6%): upgrade to strong model
- get_hour_error_rate(): error rates by hour from empirical audit
- is_off_hours(): 18:00-05:59 = off-hours
- RoutingDecision: model, provider, reason, hour, error_rate
- get_routing_report(): 24h forecast of routing decisions

Config via env vars:
- CRON_STRONG_MODEL (default: xiaomi/mimo-v2-pro)
- CRON_CHEAP_MODEL (default: qwen2.5:7b)
- CRON_ERROR_THRESHOLD (default: 6.0%)

Tests: tests/test_time_aware_routing.py (9 tests)

Closes #889
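The routing rules described above can be sketched roughly as follows. This is a minimal illustration written from the commit message and the tests, not the contents of agent/time_aware_routing.py. Assumptions: the error-rate table holds only the four hours quoted on this page (the audited table presumably covers more), the provider is passed through unchanged on upgrade, CRON_ERROR_THRESHOLD is read as a plain number such as `6.0`, and `RoutingDecision` carries an `is_off_hours` field because the tests read one.

```python
import os
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumption: only the rates quoted in the commit message and tests are listed
# here; every other hour falls back to the 4.0% default.
_KNOWN_ERROR_RATES = {9: 4.0, 12: 4.0, 18: 9.4, 19: 8.1}
_DEFAULT_ERROR_RATE = 4.0


@dataclass
class RoutingDecision:
    model: str
    provider: Optional[str]
    reason: str
    hour: int
    error_rate: float
    is_off_hours: bool  # assumed field; the tests access d.is_off_hours


def get_hour_error_rate(hour: int) -> float:
    """Return the error rate (percent) for an hour, or the default."""
    return _KNOWN_ERROR_RATES.get(hour, _DEFAULT_ERROR_RATE)


def is_off_hours(hour: int) -> bool:
    """Off-hours run from 18:00 through 05:59."""
    return hour >= 18 or hour < 6


def resolve_time_aware_model(model, provider=None, is_cron=False, hour=None):
    """Pick a model for a task based on hour, error rate, and task type."""
    if hour is None:
        hour = datetime.now().hour
    rate = get_hour_error_rate(hour)
    off = is_off_hours(hour)

    # Interactive sessions always keep the base model: the user corrects errors.
    if not is_cron:
        return RoutingDecision(model, provider, "Interactive session: base model",
                               hour, rate, off)

    threshold = float(os.environ.get("CRON_ERROR_THRESHOLD", "6.0"))
    if off and rate > threshold:
        strong = os.environ.get("CRON_STRONG_MODEL", "xiaomi/mimo-v2-pro")
        reason = f"Cron off-hours, error rate {rate}% > {threshold}%: strong model"
        return RoutingDecision(strong, provider, reason, hour, rate, off)

    # Cron with an acceptable error rate keeps the base model.
    return RoutingDecision(model, provider, "Cron, low error rate: base model",
                           hour, rate, off)
```

With these assumptions, a cron task at 18:00 is upgraded to the strong model, while the same task at 10:00 and any interactive session stay on the base model.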
59 lines
1.7 KiB
Python
"""Tests for time-aware model routing."""

import pytest
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

from agent.time_aware_routing import (
    resolve_time_aware_model,
    get_hour_error_rate,
    is_off_hours,
    get_routing_report,
)


class TestErrorRates:
    def test_evening_high_error(self):
        assert get_hour_error_rate(18) == 9.4
        assert get_hour_error_rate(19) == 8.1

    def test_morning_low_error(self):
        assert get_hour_error_rate(9) == 4.0
        assert get_hour_error_rate(12) == 4.0

    def test_default_for_unknown(self):
        assert get_hour_error_rate(15) == 4.0


class TestOffHours:
    def test_evening_is_off_hours(self):
        assert is_off_hours(20) is True
        assert is_off_hours(2) is True

    def test_business_hours_not_off(self):
        assert is_off_hours(9) is False
        assert is_off_hours(14) is False


class TestRouting:
    def test_interactive_uses_base_model(self):
        d = resolve_time_aware_model("my-model", "my-provider", is_cron=False, hour=18)
        assert d.model == "my-model"
        assert "Interactive" in d.reason

    def test_cron_low_error_uses_base(self):
        d = resolve_time_aware_model("cheap-model", is_cron=True, hour=10)
        assert d.model == "cheap-model"

    def test_cron_high_error_upgrades(self):
        d = resolve_time_aware_model("cheap-model", is_cron=True, hour=18)
        assert d.model != "cheap-model"
        assert d.is_off_hours is True

    def test_routing_report(self):
        report = get_routing_report()
        assert "Time-Aware Model Routing" in report
        assert "18:00" in report
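
The off-hours tests above probe hours well inside each window (20, 2, 9, 14) but not the edges of the stated 18:00-05:59 range. A boundary check could look like the sketch below. The local `is_off_hours` is a hypothetical stand-in written from the commit message so the snippet runs on its own; in the real test file the imported function would be used instead, and `BOUNDARY_CASES` and `failing_boundaries` are illustrative names.

```python
def is_off_hours(hour: int) -> bool:
    """Stand-in for agent.time_aware_routing.is_off_hours (assumed behavior)."""
    return hour >= 18 or hour < 6


# (hour, expected) pairs at the edges of the 18:00-05:59 window,
# plus the wrap-around at midnight.
BOUNDARY_CASES = [
    (5, True),    # last off-hours hour in the morning
    (6, False),   # first business hour
    (17, False),  # last business hour
    (18, True),   # first off-hours hour in the evening
    (23, True),
    (0, True),
]


def failing_boundaries(fn=is_off_hours):
    """Return the (hour, actual) pairs where fn disagrees with the expectation."""
    return [(hour, fn(hour)) for hour, expected in BOUNDARY_CASES
            if fn(hour) is not expected]
```

An empty list from `failing_boundaries()` means every edge of the window behaves as the commit message describes.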