feat: quality analysis — bug fixes, mobile tests, HITL checklist
Senior architect review findings + remediations:
BUG FIX — critical interface mismatch
- TimmyAirLLMAgent only exposed print_response(); dashboard route calls
agent.run() → AttributeError when AirLLM backend is selected.
Added run() → RunResult(content) as primary inference entry point;
print_response() now delegates to run() so both call sites share
one inference path.
- Added RunResult dataclass for Agno-compatible structured return.
BUG FIX — hardcoded model name in health status partial
- health_status.html rendered literal "llama3.2" regardless of
OLLAMA_MODEL env var. Route now passes settings.ollama_model to
the template context; partial renders {{ model }} instead.
FEATURE — /mobile-test HITL checklist page
- 22 human-executable test scenarios across: Layout, Touch & Input,
Chat behaviour, Health, Scroll, Notch/Home Bar, Live UI.
- Pass/Fail/Skip buttons with sessionStorage state persistence.
- Live progress bar + final score summary.
- TEST link added to Mission Control header for quick access on phone.
TEST — 32 new automated mobile quality tests (M1xx–M6xx)
- M1xx: viewport/meta tags (8 tests)
- M2xx: touch target sizing — 44 px min-height, manipulation (4 tests)
- M3xx: iOS zoom prevention, autocapitalize, enterkeyhint (5 tests)
- M4xx: HTMX robustness — hx-sync drop, disabled-elt, polling (5 tests)
- M5xx: safe-area insets, overscroll, dvh units (5 tests)
- M6xx: AirLLM interface contract — run(), RunResult, delegation (5 tests)
Total test count: 61 → 93 (all passing).
https://claude.ai/code/session_01RBuRCBXZNkAQQXXGiJNDmt