- Add QUALITY_ANALYSIS.md — 10-point architect review covering
architecture coherence, completeness (~35-40% vs vision), mobile UX,
security, test coverage, code quality, and DX
- Fix P0 XSS: mobile.html chat input now uses DOM textContent instead
of innerHTML string interpolation with raw user input
- Fix P0 XSS: swarm_live.html agent/auction rendering rewritten with
safe DOM methods (_t/_el helpers) — no more ${agent.name} in innerHTML
- Add M7xx test category (4 new tests) covering XSS prevention assertions;
total suite now 232 passing (was 228)
- HITL session guide included in analysis with step-by-step phone test
instructions and critical scenario priority ordering
https://claude.ai/code/session_0183Nzcy7TMqjrAopnTtygds
Timmy Time — Senior Architect Quality Analysis
Date: 2026-02-21
Branch: claude/quality-analysis-mobile-testing-0zgPi
Test Suite: 228/228 passing ✅
Executive Summary
Timmy Time has a strong Python backend skeleton and a working HTMX UI, but the project is at a critical architectural fork: a second, fully-detached React frontend was introduced that uses 100% mock/static data with zero API connectivity. This split creates the illusion of a richer app than exists. Completeness against the stated vision is ~35-40%. The mobile HITL framework is the standout quality asset.
1. Architecture Coherence — CRITICAL ⚠️
Score: 3/10
Finding: Dual Frontend, Zero Integration
The project ships two separate UIs that both claim to be "Mission Control":
| UI | Tech | Backend Connected? |
|---|---|---|
| src/dashboard/ | FastAPI + Jinja2 + HTMX | ✅ Yes: real Timmy chat, health, history |
| dashboard-web/ | React + TypeScript + Vite | ❌ No: 100% static mock data |
The React dashboard (dashboard-web/client/src/lib/data.ts) exports MOCK_CHAT, MOCK_HEALTH, MOCK_NOTIFICATIONS, MOCK_TASKS, MOCK_WS_EVENTS — every data source is hardcoded. There is not a single fetch() call to the FastAPI backend. The ChatPanel simulates responses with setTimeout(). The StatusSidebar shows a hardcoded Ollama status — it never calls /health/status.
Impact: The React UI is a clickable mockup, not a product. A new developer would not know which frontend is authoritative.
Finding: React App Has No Build Config
dashboard-web/client/ contains src/ and index.html but no package.json, vite.config.ts, or tsconfig.json in that directory. The app imports from @/components/ui/* (shadcn/ui) but the components/ui/ directory does not exist in the repo. The React app is not buildable as committed.
2. Completeness Against Vision — 35-40%
Score: 4/10
| Feature | Roadmap | Status |
|---|---|---|
| Agno + Ollama + SQLite dashboard | v1.0.0 | ✅ Complete |
| HTMX chat with history | v1.0.0 | ✅ Complete |
| AirLLM big-brain backend | v1.0.0 | ✅ Complete |
| CLI (chat/think/status) | v1.0.0 | ✅ Complete |
| Swarm registry + coordinator | v2.0.0 | ⚠️ Skeleton only — no real agents |
| Agent personas (Echo, Mace, Forge…) | v2.0.0 | ❌ Catalog only — never instantiated |
| MCP tools integration | v2.0.0 | ❌ Not started |
| Voice NLU | v2.0.0 | ⚠️ Backend module — no live UI |
| Push notifications | v2.0.0 | ⚠️ Backend module — never triggered |
| Siri Shortcuts | v2.0.0 | ⚠️ Endpoint stub only |
| WebSocket live swarm feed | v2.0.0 | ⚠️ Server-side ready — no UI consumer |
| L402 / Lightning payments | v3.0.0 | ⚠️ Mock implementation only |
| Real LND gRPC backend | v3.0.0 | ❌ Not started |
| Single .app bundle | v3.0.0 | ❌ Not started |
| React dashboard (live data) | — | ❌ All mock data |
| Mobile HITL checklist | — | ✅ Complete (27 scenarios) |
3. Mobile UX Audit
Score: 7/10 (HTMX UI) / 2/10 (React UI)
HTMX Dashboard — Strong
The HTMX-served dashboard has solid mobile foundations verified by the automated test suite:
- ✅ `viewport-fit=cover`: Dynamic Island / notch support
- ✅ `apple-mobile-web-app-capable`: Home Screen PWA mode
- ✅ `safe-area-inset-top/bottom`: padding clears notch and home indicator
- ✅ `overscroll-behavior: none`: no rubber-band on main page
- ✅ `-webkit-overflow-scrolling: touch`: momentum scroll in chat
- ✅ `dvh` units: correct height on iOS with collapsing chrome
- ✅ 44px touch targets on SEND button and inputs
- ✅ `font-size: 16px` in mobile query: iOS zoom prevention
- ✅ `enterkeyhint="send"`: Send-labelled keyboard key
- ✅ HTMX `hx-sync="this:drop"`: double-tap protection
- ✅ HTMX `hx-disabled-elt`: in-flight button lockout
Gap: Mobile Quick Actions Page (/mobile)
The /mobile route template shows a "Mobile only" page with quick action tiles and a JS-based chat — but it uses CSS display: none on desktop via .mobile-only with an @media (min-width: 769px) rule. The desktop fallback shows a placeholder. This is a valid progressive enhancement approach but the page is not linked from the main nav bar.
React Dashboard — Mobile Not Functional
The React dashboard uses hidden lg:flex for the left sidebar (desktop only) and an AnimatePresence slide-in overlay for mobile. The mobile UX architecture is correct. However, because all data is mock, tapping "Chat" produces a simulated response from a setTimeout, not from Ollama. This is not tested and not usable.
4. Human-in-the-Loop (HITL) Mobile Testing
Score: 8/10
The /mobile-test route is the standout quality feature. It provides:
- 21 structured test scenarios across 7 categories (Layout, Touch, Chat, Health, Scroll, Notch, Live UI)
- PASS/FAIL/SKIP buttons with sessionStorage persistence across scroll
- Live pass rate counter and progress bar
- Accessible on any phone via local network URL
- ← MISSION CONTROL back-link for easy navigation
Gaps to improve:
- No server-side results storage — results lost when tab closes
- No shareable/exportable report (screenshot required for handoff)
- React dashboard has no equivalent HITL page
- No automated Playwright/Selenium mobile tests that could catch regressions
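The first two gaps (no server-side storage, no exportable report) could be closed with a small SQLite-backed store. The following is a minimal sketch only: the `hitl_results` table and the `init_store`, `save_result`, and `export_report` names are hypothetical and do not exist in the repo, and computing the pass rate over non-skipped scenarios is an assumption about how the live counter works.

```python
import json
import sqlite3

VALID_OUTCOMES = {"PASS", "FAIL", "SKIP"}


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) results table if it is missing."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hitl_results ("
        "scenario_id TEXT PRIMARY KEY, outcome TEXT NOT NULL)"
    )
    conn.commit()


def save_result(conn: sqlite3.Connection, scenario_id: str, outcome: str) -> None:
    """Store one verdict; re-marking a scenario overwrites the old verdict."""
    if outcome not in VALID_OUTCOMES:
        raise ValueError(f"outcome must be one of {sorted(VALID_OUTCOMES)}")
    conn.execute(
        "INSERT OR REPLACE INTO hitl_results (scenario_id, outcome) VALUES (?, ?)",
        (scenario_id, outcome),
    )
    conn.commit()


def export_report(conn: sqlite3.Connection) -> str:
    """Return a JSON report that can be shared instead of a screenshot."""
    rows = conn.execute(
        "SELECT scenario_id, outcome FROM hitl_results ORDER BY scenario_id"
    ).fetchall()
    results = {scenario_id: outcome for scenario_id, outcome in rows}
    # Assumption: skipped scenarios do not count toward the pass rate.
    judged = [outcome for outcome in results.values() if outcome != "SKIP"]
    passed = sum(1 for outcome in judged if outcome == "PASS")
    pass_rate = round(100 * passed / len(judged), 1) if judged else 0.0
    return json.dumps({"results": results, "pass_rate_percent": pass_rate}, indent=2)
```

A POST route behind the PASS/FAIL/SKIP buttons could call `save_result`, and a GET route could return `export_report` for handoff; both routes would be new work.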
5. Security Assessment
Score: 5/10
XSS Vulnerability — /mobile template
mobile.html line ~85 uses raw innerHTML string interpolation with user-supplied message content:
// mobile.html — VULNERABLE
chat.innerHTML += `
<div class="chat-message user">
<div>${message}</div> <!-- message is user input, not escaped -->
</div>
`;
If a user types <img src=x onerror=alert(1)>, it executes. This is a DOM-based XSS via innerHTML. Fix: insert the message with textContent or document.createTextNode(message), or escape HTML before insertion.
The swarm_live.html has the same pattern with WebSocket data:
container.innerHTML = agents.map(agent => `...${agent.name}...`).join('');
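If an agent name contains HTML with an event-handler attribute (for example <img src=x onerror=...>), it executes in the page context. A literal <script> tag inserted through innerHTML is not run by the browser, but the injection is exploitable all the same.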
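Where the markup is built on the server instead, the "escape HTML before insertion" option can be sketched in Python with the standard library. The `render_user_message` helper is hypothetical and is not part of the repo; it only illustrates that escaped input renders as text rather than as markup.

```python
import html


def render_user_message(message: str) -> str:
    """Return a chat-message fragment with the user text HTML-escaped."""
    # html.escape converts &, <, > (and quotes when quote=True) to entities,
    # so the browser displays the payload instead of parsing it as a tag.
    safe = html.escape(message, quote=True)
    return f'<div class="chat-message user"><div>{safe}</div></div>'


payload = "<img src=x onerror=alert(1)>"
fragment = render_user_message(payload)
```

On the client side, the equivalent fix is to create the element and assign the message through textContent, so no escaping step is needed at all.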
Hardcoded Secrets
l402_proxy.py: _MACAROON_SECRET = "timmy-macaroon-secret".encode() (default)
payment_handler.py: _HMAC_SECRET = "timmy-sovereign-sats".encode() (default)
Both read from environment variables first, which is correct, but the fallback defaults should not be usable secret strings. They should be None, with a startup assertion requiring the variables to be set.
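A startup check along these lines would make a missing secret fail fast. This is a sketch under assumptions: the `require_secret` helper and the environment variable names in the usage comment are hypothetical, since the analysis does not state which variables the two modules actually read.

```python
import os


def require_secret(env_var: str) -> bytes:
    """Return the secret from the environment or refuse to start."""
    value = os.environ.get(env_var)
    if not value:
        # No fallback string: an unset or empty variable is a hard error.
        raise RuntimeError(f"{env_var} must be set before starting the server")
    return value.encode()


# Hypothetical usage at import time in l402_proxy.py / payment_handler.py:
# _MACAROON_SECRET = require_secret("L402_MACAROON_SECRET")
# _HMAC_SECRET = require_secret("L402_HMAC_SECRET")
```

Tests can set the variables explicitly (for example through a fixture), which keeps the test suite independent of any production default.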
No Route Authentication
All /swarm/spawn, /swarm/tasks, /marketplace, /agents/timmy/chat endpoints have no auth guard. On a --host 0.0.0.0 server, anyone on the local network can post tasks or clear chat history. Acceptable for v1 local-only use but must be documented and gated before LAN exposure.
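One possible gate before LAN exposure is a shared token checked on every mutating request. The sketch below shows only the comparison step; the `is_authorized` name, the idea of a single shared token, and how the token reaches the server (for example a request header) are all assumptions, and wiring it into the FastAPI routes is left out.

```python
import hmac
from typing import Optional


def is_authorized(presented: Optional[str], expected: str) -> bool:
    """Compare a presented token with the configured one in constant time."""
    if not presented or not expected:
        # A missing token, or an unconfigured server, never authorizes.
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

A route would reject the request with 401 when this returns False. Per-user accounts or sessions would need a different design.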
6. Test Coverage
Score: 7/10
| Suite | Tests | Quality |
|---|---|---|
| Agent unit | 13 | Good |
| Backends | 14 | Good |
| Mobile scenarios | 32 | Excellent — covers M1xx-M6xx categories |
| Swarm | 29+10+16 | Good |
| L402 proxy | 13 | Good |
| Voice NLU | 15 | Good |
| Dashboard routes | 18+18 | Good |
| WebSocket | 3 | Thin — no reconnect or message-type tests |
| React components | 0 | Missing entirely |
| End-to-end (Playwright) | 0 | Missing |
Key gaps:
- No tests for the XSS vulnerabilities
- No tests for the `/mobile` quick-chat endpoint
- WebSocket tests don't cover reconnection logic or malformed payloads
- React app has zero test coverage
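The missing XSS tests could start as a static check over the template source. The sketch below is a heuristic only, not a substitute for browser-level tests: `find_unsafe_innerhtml` is a hypothetical helper, and the regular expression only flags an innerHTML assignment that contains a `${...}` interpolation before the statement's first semicolon.

```python
import re
from typing import List

# innerHTML followed by = or +=, then any text up to the first ';' that
# contains a template-literal interpolation.
_UNSAFE_INNERHTML = re.compile(r"innerHTML\s*\+?=[^;]*?\$\{")


def find_unsafe_innerhtml(source: str) -> List[int]:
    """Return 1-based line numbers where a flagged assignment starts."""
    return [
        source.count("\n", 0, match.start()) + 1
        for match in _UNSAFE_INNERHTML.finditer(source)
    ]


vulnerable = "chat.innerHTML += `\n<div>${message}</div>\n`;"
safe = (
    "const div = document.createElement('div');\n"
    "div.textContent = message;\n"
    "chat.appendChild(div);"
)
```

A test would read mobile.html and swarm_live.html and assert that the returned list is empty for each.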
- React app has zero test coverage
7. Code Quality
Score: 7/10
Strengths:
- Clean module separation (`timmy/`, `swarm/`, `dashboard/routes/`, `timmy_serve/`)
- Consistent use of dataclasses for domain models
- Good docstrings on all public functions
- SQLite-backed persistence for both Agno memory and swarm registry
- pydantic-settings config with `.env` override support
Weaknesses:
- Swarm `coordinator.py` uses a module-level singleton `coordinator = SwarmCoordinator()`: not injectable, hard to test in isolation
- `swarm/registry.py` opens a new SQLite connection on every call (no connection pool)
- `dashboard/routes/swarm.py` creates a new `Jinja2Templates` instance; it should reuse the one from `app.py`
- React components import from `@/components/ui/*`, which doesn't exist in the committed tree
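The first two weaknesses can be addressed together by passing dependencies in rather than creating them at import time. The sketch below is illustrative only: the `Registry` class, its `agents` schema, and `build_coordinator` are hypothetical, and the real `SwarmCoordinator` has a larger interface than the single method shown here.

```python
import sqlite3
from typing import List


class Registry:
    """Hypothetical registry that reuses one connection for all calls."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS agents ("
            "agent_id TEXT PRIMARY KEY, name TEXT NOT NULL)"
        )

    def register(self, agent_id: str, name: str) -> None:
        # Using the connection as a context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT INTO agents (agent_id, name) VALUES (?, ?)",
                (agent_id, name),
            )

    def list_names(self) -> List[str]:
        rows = self._conn.execute("SELECT name FROM agents ORDER BY name").fetchall()
        return [row[0] for row in rows]


class SwarmCoordinator:
    """Coordinator that receives its registry instead of creating it."""

    def __init__(self, registry: Registry) -> None:
        self.registry = registry

    def spawn_agent(self, agent_id: str, name: str) -> None:
        self.registry.register(agent_id, name)


def build_coordinator(db_path: str = ":memory:") -> SwarmCoordinator:
    """Factory: the app passes a file path, tests use an in-memory database."""
    return SwarmCoordinator(Registry(sqlite3.connect(db_path)))
```

Each test can then build its own coordinator with an isolated in-memory database. In a threaded server, a shared `sqlite3` connection also needs attention to the `check_same_thread` setting, which this sketch does not cover.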
8. Developer Experience
Score: 6/10
Strengths:
- README is excellent — copy-paste friendly, covers Mac quickstart, phone access, troubleshooting
- DEVELOPMENT_REPORT.md provides full history of what was built and why
- `.env.example` covers all config variables
- Self-TDD watchdog CLI is a creative addition
Weaknesses:
- No `docker-compose.yml`: setup requires manual Python venv + Ollama install
- Two apps (FastAPI + React) with no single `make dev` command to start both
- `STATUS.md` says v1.0.0 but development is well past that (version drift)
- React app missing from the quickstart instructions entirely
9. Backend Architecture
Score: 7/10
The FastAPI backend is well-structured. The swarm subsystem follows a clean coordinator pattern. The L402 mock is architecturally correct (the interface matches what real LND calls would require).
Gaps:
- Swarm "agents" are database records: `spawn_agent()` registers a record but no Python process is actually launched. `agent_runner.py` uses `subprocess.Popen` to run `python -m swarm.agent_runner` but no `__main__` block exists in that file.
- The bidding system (`bidder.py`) runs an asyncio auction but there are no actual bidder agents submitting bids, so auctions will always time out with no winner.
- Voice TTS (`voice_tts.py`) requires `pyttsx3` (optional dep) but the voice route offers no graceful fallback message when pyttsx3 is absent.
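The auction gap can be reproduced in isolation. The sketch below is not the code in `bidder.py`: the `Bid` dataclass, the queue-based collection, and the lowest-bid-wins rule are all assumptions made for illustration. It shows that an auction with no bid submitters ends at the deadline with no winner, while the same loop picks a winner once bids arrive.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    agent_id: str
    amount_sats: int


async def run_auction(bids: asyncio.Queue, timeout_s: float) -> Optional[Bid]:
    """Collect bids until the deadline, then return the lowest (or None)."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    received: List[Bid] = []
    while True:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            received.append(await asyncio.wait_for(bids.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break  # deadline reached while waiting for the next bid
    # Assumption: the cheapest bid wins.
    return min(received, key=lambda bid: bid.amount_sats) if received else None


async def demo():
    empty: asyncio.Queue = asyncio.Queue()
    no_winner = await run_auction(empty, timeout_s=0.05)  # nobody submits

    busy: asyncio.Queue = asyncio.Queue()
    await busy.put(Bid("echo", 120))
    await busy.put(Bid("forge", 80))
    winner = await run_auction(busy, timeout_s=0.05)
    return no_winner, winner
```

Until real bidder agents exist, a test double that puts bids on the queue would at least let the coordinator path after a successful auction be exercised.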
10. Prioritized Defects
| Priority | ID | Issue | File |
|---|---|---|---|
| P0 | SEC-01 | XSS via innerHTML with unsanitized user input | mobile.html:85, swarm_live.html:72 |
| P0 | ARCH-01 | React dashboard 100% mock — no backend calls | dashboard-web/client/src/ |
| P0 | ARCH-02 | React app not buildable — missing package.json, shadcn/ui | dashboard-web/client/ |
| P1 | SEC-02 | Hardcoded L402/HMAC secrets without startup assertion | l402_proxy.py, payment_handler.py |
| P1 | FUNC-01 | Swarm spawn creates DB record but no process | swarm/agent_runner.py |
| P1 | FUNC-02 | Auction always fails — no real bid submitters | swarm/bidder.py |
| P2 | UX-01 | /mobile route not in desktop nav | base.html, index.html |
| P2 | TEST-01 | WebSocket reconnection not tested | tests/test_websocket.py |
| P2 | DX-01 | No single dev startup command | README.md |
| P3 | PERF-01 | SQLite connection opened per-query in registry | swarm/registry.py |
HITL Mobile Test Session Guide
To run a complete human-in-the-loop mobile test session right now:
# 1. Start the dashboard
source .venv/bin/activate
uvicorn dashboard.app:app --host 0.0.0.0 --port 8000 --reload
# 2. Find your local IP
ipconfig getifaddr en0 # macOS
hostname -I # Linux
# 3. Open on your phone (same Wi-Fi)
http://<YOUR_IP>:8000/mobile-test
# 4. Work through the 21 scenarios, marking PASS / FAIL / SKIP
# 5. Screenshot the SUMMARY section for your records
# ─── Also test the main dashboard on mobile ───────────────────────
http://<YOUR_IP>:8000 # Main Mission Control
http://<YOUR_IP>:8000/mobile # Quick Actions (mobile-optimized)
Critical scenarios to test first:
- T01 — iOS zoom prevention (tap input, watch for zoom)
- C02 — Multi-turn memory (tell Timmy your name, ask it back)
- C04 — Offline graceful error (stop Ollama, send message)
- N01/N02 — Notch / home bar clearance (notched iPhone)
Recommended Next Prompt for Development
Connect the React dashboard (dashboard-web/) to the live FastAPI backend.
Priority order:
1. FIX BUILD FIRST: Add package.json, vite.config.ts, tailwind.config.ts, and
tsconfig.json to dashboard-web/client/. Install shadcn/ui so the existing
component imports resolve. Verify `npm run dev` starts the app.
2. CHAT (highest user value): Replace ChatPanel mock with real fetch to
POST /agents/timmy/chat. Show the actual Timmy response from Ollama.
Implement loading state (matches the existing isTyping UI).
3. HEALTH: Replace MOCK_HEALTH in StatusSidebar with a polling fetch to
GET /health/status (every 30s, matching HTMX behaviour).
4. SWARM WEBSOCKET: Open a real WebSocket to ws://localhost:8000/swarm/ws
and pipe state updates into SwarmPanel — replacing MOCK_WS_EVENTS.
5. SECURITY: Fix XSS in mobile.html and swarm_live.html — replace innerHTML
string interpolation with safe DOM methods (textContent / createTextNode).
Use React Query (TanStack) for data fetching with stale-while-revalidate.
Keep the existing HTMX dashboard running in parallel — the React app should
be the forward-looking UI.
Analysis by Claude Code — Senior Architect Review Timmy Time Dashboard | branch: claude/quality-analysis-mobile-testing-0zgPi