feat: quality analysis — bug fixes, mobile tests, HITL checklist
Senior architect review findings + remediations:
BUG FIX — critical interface mismatch
- TimmyAirLLMAgent only exposed print_response(); dashboard route calls
agent.run() → AttributeError when AirLLM backend is selected.
Added run() → RunResult(content) as primary inference entry point;
print_response() now delegates to run() so both call sites share
one inference path.
- Added RunResult dataclass for Agno-compatible structured return.
BUG FIX — hardcoded model name in health status partial
- health_status.html rendered literal "llama3.2" regardless of
OLLAMA_MODEL env var. Route now passes settings.ollama_model to
the template context; partial renders {{ model }} instead.
FEATURE — /mobile-test HITL checklist page
- 22 human-executable test scenarios across: Layout, Touch & Input,
Chat behaviour, Health, Scroll, Notch/Home Bar, Live UI.
- Pass/Fail/Skip buttons with sessionStorage state persistence.
- Live progress bar + final score summary.
- TEST link added to Mission Control header for quick access on phone.
TEST — 32 new automated mobile quality tests (M1xx–M6xx)
- M1xx: viewport/meta tags (8 tests)
- M2xx: touch target sizing — 44 px min-height, manipulation (4 tests)
- M3xx: iOS zoom prevention, autocapitalize, enterkeyhint (5 tests)
- M4xx: HTMX robustness — hx-sync drop, disabled-elt, polling (5 tests)
- M5xx: safe-area insets, overscroll, dvh units (5 tests)
- M6xx: AirLLM interface contract — run(), RunResult, delegation (5 tests)
Total test count: 61 → 93 (all passing).
https://claude.ai/code/session_01RBuRCBXZNkAQQXXGiJNDmt
2026-02-21 17:21:47 +00:00
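The interface fix described in the commit message above can be sketched as follows. This is a minimal illustration, not the project's actual code: `TimmyAirLLMAgent`, `RunResult`, `run()` and `print_response()` are named in the commit message, while the `_generate()` helper and its echo behaviour are hypothetical stand-ins for the real AirLLM inference call.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    """Structured return value exposing the generated text as ``content``."""

    content: str


class TimmyAirLLMAgent:
    """Sketch of an agent whose two public methods share one inference path."""

    def _generate(self, message: str) -> str:
        # Hypothetical placeholder for the real model call.
        return f"echo: {message}"

    def run(self, message: str) -> RunResult:
        # Primary entry point: callers such as a dashboard route read .content.
        return RunResult(content=self._generate(message))

    def print_response(self, message: str) -> None:
        # Delegates to run() so there is only one inference path.
        print(self.run(message).content)
```

Because `print_response()` only wraps `run()`, a contract test can assert on `run()` alone and still cover both call sites.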
"""Mobile-first quality tests: automated validation of mobile UX requirements.

These tests verify the HTML, CSS, and HTMX attributes that make the dashboard
work correctly on phones.  No browser / Playwright required: we parse the
static assets and server responses directly.

Categories:
    M1xx  Viewport & meta tags
    M2xx  Touch target sizing
    M3xx  iOS keyboard & zoom prevention
    M4xx  HTMX robustness (double-submit, sync)
    M5xx  Safe-area / notch support
"""

import re
from pathlib import Path

# ── helpers ───────────────────────────────────────────────────────────────────
def _css() -> str:
    """Read the main stylesheet."""
refactor: Phase 3 — reorganize tests into module-mirroring subdirectories
Move 97 test files from flat tests/ into 13 subdirectories:
tests/dashboard/ (8 files — routes, mobile, mission control)
tests/swarm/ (17 files — coordinator, docker, routing, tasks)
tests/timmy/ (12 files — agent, backends, CLI, tools)
tests/self_coding/ (14 files — git safety, indexer, self-modify)
tests/lightning/ (3 files — L402, LND, interface)
tests/creative/ (8 files — assembler, director, image/music/video)
tests/integrations/ (10 files — chat bridge, telegram, voice, websocket)
tests/mcp/ (4 files — bootstrap, discovery, executor)
tests/spark/ (3 files — engine, tools, events)
tests/hands/ (3 files — registry, oracle, phase5)
tests/scripture/ (1 file)
tests/infrastructure/ (3 files — router cascade, API)
tests/security/ (3 files — XSS, regression)
Fix Path(__file__) reference in test_mobile_scenarios.py for new depth.
Add __init__.py to all test subdirectories.
Tests: 1503 passed, 9 failed (pre-existing), 53 errors (pre-existing)
https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 21:21:28 +00:00
    css_path = Path(__file__).parent.parent.parent / "static" / "style.css"
    return css_path.read_text()


def _index_html(client) -> str:
    return client.get("/").text


def _timmy_panel_html(client) -> str:
    """Fetch the Timmy chat panel (loaded dynamically from index via HTMX)."""
    return client.get("/agents/default/panel").text


# ── M1xx — Viewport & meta tags ───────────────────────────────────────────────
def test_M101_viewport_meta_present(client):
    """viewport meta tag must exist for correct mobile scaling."""
    html = _index_html(client)
    assert 'name="viewport"' in html


def test_M102_viewport_includes_width_device_width(client):
    html = _index_html(client)
    assert "width=device-width" in html


def test_M103_viewport_includes_initial_scale_1(client):
    html = _index_html(client)
    assert "initial-scale=1" in html


def test_M104_viewport_includes_viewport_fit_cover(client):
    """viewport-fit=cover is required for iPhone notch / Dynamic Island support."""
    html = _index_html(client)
    assert "viewport-fit=cover" in html


def test_M105_apple_mobile_web_app_capable(client):
    """Enables full-screen / standalone mode when added to iPhone home screen."""
    html = _index_html(client)
    assert "apple-mobile-web-app-capable" in html


def test_M106_theme_color_meta_present(client):
    """theme-color sets the browser chrome colour on Android Chrome."""
    html = _index_html(client)
    assert 'name="theme-color"' in html


def test_M107_apple_status_bar_style_present(client):
    html = _index_html(client)
    assert "apple-mobile-web-app-status-bar-style" in html


def test_M108_lang_attribute_on_html(client):
    """lang attribute aids screen readers and mobile TTS."""
    html = _index_html(client)
    assert '<html lang="en"' in html


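The M102 to M104 checks above look for directive substrings anywhere in the page. A stricter variant could read the directives from the viewport tag itself. The sketch below assumes nothing about the dashboard templates: `viewport_settings()` and the `SAMPLE` markup are hypothetical, and only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class _ViewportParser(HTMLParser):
    """Collect the content attribute of <meta name="viewport">."""

    def __init__(self) -> None:
        super().__init__()
        self.content = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("name") == "viewport":
            self.content = attr_map.get("content") or ""


def viewport_settings(html: str) -> dict:
    """Return viewport directives as a dict, or {} if the tag is missing."""
    parser = _ViewportParser()
    parser.feed(html)
    if parser.content is None:
        return {}
    settings = {}
    for part in parser.content.split(","):
        key, _, value = part.partition("=")
        if key.strip():
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample markup, not taken from the dashboard templates.
SAMPLE = (
    '<html lang="en"><head>'
    '<meta name="viewport" '
    'content="width=device-width, initial-scale=1, viewport-fit=cover">'
    "</head></html>"
)
settings = viewport_settings(SAMPLE)
```

With a helper like this, a test could assert `settings["viewport-fit"] == "cover"` and would fail if the directive only appeared elsewhere in the page.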
# ── M2xx — Touch target sizing ────────────────────────────────────────────────
def test_M201_send_button_min_height_44px():
    """SEND button must be at least 44 × 44 px (Apple HIG minimum)."""
    css = _css()
    # Inside the mobile media query the send button must have min-height: 44px
    assert "min-height: 44px" in css


def test_M203_send_button_min_width_64px():
    """Send button needs sufficient width so it isn't accidentally missed."""
    css = _css()
    assert "min-width: 64px" in css


def test_M204_touch_action_manipulation_on_buttons():
    """touch-action: manipulation removes 300ms tap delay on mobile browsers."""
    css = _css()
    assert "touch-action: manipulation" in css


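The sizing tests above match one exact declaration string, so a stylesheet that used `min-height: 48px` would fail even though it satisfies the "at least 44 px" requirement. A threshold-based check could look like the sketch below. The `px_values()` helper and the `SAMPLE_CSS` fragment are hypothetical and not part of the test module.

```python
import re


def px_values(css: str, prop: str) -> list:
    """Return every pixel value declared for ``prop`` in ``css`` as integers."""
    # The lookbehind stops "height" from matching inside "min-height".
    pattern = rf"(?<![\w-]){re.escape(prop)}\s*:\s*(\d+)px"
    return [int(value) for value in re.findall(pattern, css)]


# Hypothetical stylesheet fragment used only for illustration.
SAMPLE_CSS = """
.send-btn { min-height: 48px; min-width: 64px; }
.chip { min-height: 24px; }
"""

heights = px_values(SAMPLE_CSS, "min-height")
meets_minimum = any(h >= 44 for h in heights)
```

This still does not tie the value to a particular selector. It only relaxes the exact-string match into a numeric comparison.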
# ── M3xx — iOS keyboard & zoom prevention ─────────────────────────────────────
def test_M301_input_font_size_16px_in_mobile_query():
    """iOS Safari zooms in when input font-size < 16px. Must be exactly 16px."""
    css = _css()
    # The mobile media-query block must override to 16px
    # NOTE: with re.DOTALL the (.*) group captures everything after the query,
    # so the check below also passes if the rule appears later in the file.
    mobile_block_match = re.search(r"@media\s*\(max-width:\s*768px\)(.*)", css, re.DOTALL)
    assert mobile_block_match, "Mobile media query not found"
    mobile_block = mobile_block_match.group(1)
    assert "font-size: 16px" in mobile_block


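To restrict a check like M301 to the media-query block itself, the block body can be extracted by counting braces. The following is a minimal sketch under stated assumptions: `media_block()` and `SAMPLE_CSS` are hypothetical, and braces inside CSS strings or comments are not handled.

```python
import re


def media_block(css: str, query_pattern: str):
    """Return the body of the first @media rule matching query_pattern, or None."""
    match = re.search(r"@media\s*" + query_pattern + r"\s*\{", css)
    if match is None:
        return None
    depth = 1  # we are just past the opening brace of the @media rule
    start = match.end()
    for index in range(start, len(css)):
        char = css[index]
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                return css[start:index]
    return None  # unbalanced braces


# Hypothetical stylesheet fragment used only for illustration.
SAMPLE_CSS = """
@media (max-width: 768px) {
  .chat-input { font-size: 16px; }
}
.desktop-only { font-size: 13px; }
"""

block = media_block(SAMPLE_CSS, r"\(max-width:\s*768px\)")
```

Here `block` contains only the rules nested inside the media query, so a declaration that appears later in the file no longer satisfies the assertion.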
def test_M302_input_autocapitalize_none(client):
    """autocapitalize=none prevents iOS from capitalising chat commands."""
    html = _timmy_panel_html(client)
    assert 'autocapitalize="none"' in html


def test_M303_input_autocorrect_off(client):
    """autocorrect=off prevents iOS from mangling technical / proper-noun input."""
    html = _timmy_panel_html(client)
    assert 'autocorrect="off"' in html


def test_M304_input_enterkeyhint_send(client):
    """enterkeyhint=send labels the iOS return key 'Send' for clearer UX."""
    html = _timmy_panel_html(client)
    assert 'enterkeyhint="send"' in html


def test_M305_input_spellcheck_false(client):
    """spellcheck=false prevents red squiggles on technical terms."""
    html = _timmy_panel_html(client)
    assert 'spellcheck="false"' in html


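The M302 to M305 checks above pass if the attribute string appears anywhere in the panel markup. To confirm that the attributes sit on a form field, the markup can be parsed with the standard-library `html.parser`. This is a sketch only: `field_attrs()`, the field name `message`, and the `SAMPLE` markup are hypothetical and not taken from the dashboard templates.

```python
from html.parser import HTMLParser


class _FieldAttrs(HTMLParser):
    """Record the attributes of every <input> and <textarea> start tag."""

    def __init__(self) -> None:
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "textarea"):
            self.fields.append(dict(attrs))


def field_attrs(html: str) -> list:
    """Return one attribute dict per form field found in ``html``."""
    parser = _FieldAttrs()
    parser.feed(html)
    return parser.fields


# Hypothetical sample markup used only for illustration.
SAMPLE = (
    '<form><input type="text" name="message" autocapitalize="none" '
    'autocorrect="off" spellcheck="false" enterkeyhint="send"></form>'
)
fields = field_attrs(SAMPLE)
```

A test built on this helper could select the chat field by name and assert each attribute value on that element alone.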
# ── M4xx — HTMX robustness ────────────────────────────────────────────────────
def test_M401_form_hx_sync_drop(client):
|
|
|
|
|
|
"""hx-sync=this:drop discards duplicate submissions (fast double-tap)."""
|
2026-02-22 16:21:32 -05:00
|
|
|
|
html = _timmy_panel_html(client)
|
feat: quality analysis — bug fixes, mobile tests, HITL checklist
Senior architect review findings + remediations:
BUG FIX — critical interface mismatch
- TimmyAirLLMAgent only exposed print_response(); dashboard route calls
agent.run() → AttributeError when AirLLM backend is selected.
Added run() → RunResult(content) as primary inference entry point;
print_response() now delegates to run() so both call sites share
one inference path.
- Added RunResult dataclass for Agno-compatible structured return.
BUG FIX — hardcoded model name in health status partial
- health_status.html rendered literal "llama3.2" regardless of
OLLAMA_MODEL env var. Route now passes settings.ollama_model to
the template context; partial renders {{ model }} instead.
FEATURE — /mobile-test HITL checklist page
- 22 human-executable test scenarios across: Layout, Touch & Input,
Chat behaviour, Health, Scroll, Notch/Home Bar, Live UI.
- Pass/Fail/Skip buttons with sessionStorage state persistence.
- Live progress bar + final score summary.
- TEST link added to Mission Control header for quick access on phone.
TEST — 32 new automated mobile quality tests (M1xx–M6xx)
- M1xx: viewport/meta tags (8 tests)
- M2xx: touch target sizing — 44 px min-height, manipulation (4 tests)
- M3xx: iOS zoom prevention, autocapitalize, enterkeyhint (5 tests)
- M4xx: HTMX robustness — hx-sync drop, disabled-elt, polling (5 tests)
- M5xx: safe-area insets, overscroll, dvh units (5 tests)
- M6xx: AirLLM interface contract — run(), RunResult, delegation (5 tests)
Total test count: 61 → 93 (all passing).
https://claude.ai/code/session_01RBuRCBXZNkAQQXXGiJNDmt
2026-02-21 17:21:47 +00:00
|
|
|
|
assert 'hx-sync="this:drop"' in html


def test_M402_form_hx_disabled_elt(client):
    """hx-disabled-elt disables the SEND button while a request is in-flight."""
    html = _timmy_panel_html(client)
    assert "hx-disabled-elt" in html


def test_M403_form_hx_indicator(client):
    """hx-indicator wires up the loading spinner to the in-flight state."""
    html = _timmy_panel_html(client)
    assert "hx-indicator" in html


def test_M404_health_panel_auto_refreshes(client):
    """Health panel must poll via HTMX trigger — 'every 30s' confirms this."""
    html = _index_html(client)
    assert "every 30s" in html


def test_M405_chat_log_loads_history_on_boot(client):
    """Chat log fetches history via hx-trigger=load so it's populated on open."""
    html = _index_html(client)
    assert 'hx-trigger="load"' in html
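

# The M4xx checks above match plain substrings, so text such as "every 30s"
# in ordinary page copy would also satisfy them. The block below is a minimal
# sketch of a stricter check, assuming only the standard-library html.parser.
# `elements_with_attr` and `has_polling_trigger` are hypothetical helpers for
# illustration; they are not part of this test suite.

```python
from html.parser import HTMLParser


class _AttrCollector(HTMLParser):
    """Collect every (tag, attrs) pair seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; names are lowercased.
        self.elements.append((tag, dict(attrs)))


def elements_with_attr(html, name):
    """Return the attribute dicts of all elements that carry ``name``."""
    parser = _AttrCollector()
    parser.feed(html)
    parser.close()
    return [attrs for _tag, attrs in parser.elements if name in attrs]


def has_polling_trigger(html, interval="every 30s"):
    """True if some element's hx-trigger attribute contains ``interval``."""
    return any(
        interval in (attrs.get("hx-trigger") or "")
        for attrs in elements_with_attr(html, "hx-trigger")
    )
```

# A test could then assert has_polling_trigger(_index_html(client)) instead of
# a bare substring match; whether the extra strictness is worth it depends on
# how often the templates change.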


# ── M5xx — Safe-area / notch support ─────────────────────────────────────────


def test_M501_safe_area_inset_top_in_header():
    """Header padding must accommodate the iPhone notch / status bar."""
    css = _css()
    assert "safe-area-inset-top" in css


def test_M502_safe_area_inset_bottom_in_footer():
    """Chat footer padding must clear the iPhone home indicator bar."""
    css = _css()
    assert "safe-area-inset-bottom" in css


def test_M503_overscroll_behavior_none():
    """overscroll-behavior: none prevents the jarring rubber-band effect."""
    css = _css()
    assert "overscroll-behavior: none" in css


def test_M504_webkit_overflow_scrolling_touch():
    """-webkit-overflow-scrolling: touch gives momentum scrolling on iOS."""
    css = _css()
    assert "-webkit-overflow-scrolling: touch" in css


def test_M505_dvh_units_used():
    """Dynamic viewport height (dvh) accounts for collapsing browser chrome."""
    css = _css()
    assert "dvh" in css
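

# The substring check in test_M505 would also pass on a class name or comment
# that happens to contain "dvh". The block below is a minimal sketch of a
# narrower check, assuming dvh lengths are written with a leading digit
# (e.g. "100dvh"). `uses_dvh` is a hypothetical helper for illustration; it
# is not part of this test suite.

```python
import re

# Matches a numeric length in dvh units, e.g. "100dvh" or "92.5dvh".
# The lookbehind rejects digits that are part of an identifier or a
# longer number, so "x100dvh" does not count.
_DVH_LENGTH = re.compile(r"(?<![\w.-])\d+(?:\.\d+)?dvh\b")


def uses_dvh(css):
    """True if ``css`` contains at least one numeric dvh length."""
    return _DVH_LENGTH.search(css) is not None
```

# With this helper the assertion would read: assert uses_dvh(_css()).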