forked from Rockachopa/Timmy-time-dashboard
- Add QUALITY_ANALYSIS.md — 10-point architect review covering
architecture coherence, completeness (~35-40% vs vision), mobile UX,
security, test coverage, code quality, and DX
- Fix P0 XSS: mobile.html chat input now uses DOM textContent instead
of innerHTML string interpolation with raw user input
- Fix P0 XSS: swarm_live.html agent/auction rendering rewritten with
safe DOM methods (_t/_el helpers) — no more ${agent.name} in innerHTML
- Add M7xx test category (4 new tests) covering XSS prevention assertions;
total suite now 232 passing (was 228)
- HITL session guide included in analysis with step-by-step phone test
instructions and critical scenario priority ordering
https://claude.ai/code/session_0183Nzcy7TMqjrAopnTtygds
307 lines
13 KiB
Markdown
307 lines
13 KiB
Markdown
# Timmy Time — Senior Architect Quality Analysis
|
|
**Date:** 2026-02-21
|
|
**Branch:** `claude/quality-analysis-mobile-testing-0zgPi`
|
|
**Test Suite:** 228/228 passing ✅
|
|
|
|
---
|
|
|
|
## Executive Summary
|
|
|
|
Timmy Time has a strong Python backend skeleton and a working HTMX UI, but the project is at a **critical architectural fork**: a second, fully-detached React frontend was introduced that uses 100% mock/static data with zero API connectivity. This split creates the illusion of a richer app than exists. Completeness against the stated vision is **~35-40%**. The mobile HITL framework is the standout quality asset.
|
|
|
|
---
|
|
|
|
## 1. Architecture Coherence — CRITICAL ⚠️
|
|
|
|
**Score: 3/10**
|
|
|
|
### Finding: Dual Frontend, Zero Integration
|
|
|
|
The project ships two separate UIs that both claim to be "Mission Control":
|
|
|
|
| UI | Tech | Backend Connected? |
|
|
|----|------|--------------------|
|
|
| `src/dashboard/` | FastAPI + Jinja2 + HTMX | ✅ Yes — real Timmy chat, health, history |
|
|
| `dashboard-web/` | React + TypeScript + Vite | ❌ No — 100% static mock data |
|
|
|
|
The React dashboard (`dashboard-web/client/src/lib/data.ts`) exports `MOCK_CHAT`, `MOCK_HEALTH`, `MOCK_NOTIFICATIONS`, `MOCK_TASKS`, `MOCK_WS_EVENTS` — every data source is hardcoded. There is **not a single `fetch()` call** to the FastAPI backend. The `ChatPanel` simulates responses with `setTimeout()`. The `StatusSidebar` shows a hardcoded Ollama status — it never calls `/health/status`.
|
|
|
|
**Impact:** The React UI is a clickable mockup, not a product. A new developer would not know which frontend is authoritative.
|
|
|
|
### Finding: React App Has No Build Config
|
|
|
|
`dashboard-web/client/` contains `src/` and `index.html` but no `package.json`, `vite.config.ts`, or `tsconfig.json` in that directory. The app imports from `@/components/ui/*` (shadcn/ui) but the `components/ui/` directory does not exist in the repo. The React app is **not buildable as committed**.
|
|
|
|
---
|
|
|
|
## 2. Completeness Against Vision — 35-40%
|
|
|
|
**Score: 4/10**
|
|
|
|
| Feature | Roadmap | Status |
|
|
|---------|---------|--------|
|
|
| Agno + Ollama + SQLite dashboard | v1.0.0 | ✅ Complete |
|
|
| HTMX chat with history | v1.0.0 | ✅ Complete |
|
|
| AirLLM big-brain backend | v1.0.0 | ✅ Complete |
|
|
| CLI (chat/think/status) | v1.0.0 | ✅ Complete |
|
|
| Swarm registry + coordinator | v2.0.0 | ⚠️ Skeleton only — no real agents |
|
|
| Agent personas (Echo, Mace, Forge…) | v2.0.0 | ❌ Catalog only — never instantiated |
|
|
| MCP tools integration | v2.0.0 | ❌ Not started |
|
|
| Voice NLU | v2.0.0 | ⚠️ Backend module — no live UI |
|
|
| Push notifications | v2.0.0 | ⚠️ Backend module — never triggered |
|
|
| Siri Shortcuts | v2.0.0 | ⚠️ Endpoint stub only |
|
|
| WebSocket live swarm feed | v2.0.0 | ⚠️ Server-side ready — no UI consumer |
|
|
| L402 / Lightning payments | v3.0.0 | ⚠️ Mock implementation only |
|
|
| Real LND gRPC backend | v3.0.0 | ❌ Not started |
|
|
| Single `.app` bundle | v3.0.0 | ❌ Not started |
|
|
| React dashboard (live data) | — | ❌ All mock data |
|
|
| Mobile HITL checklist | — | ✅ Complete (27 scenarios) |
|
|
|
|
---
|
|
|
|
## 3. Mobile UX Audit
|
|
|
|
**Score: 7/10 (HTMX UI) / 2/10 (React UI)**
|
|
|
|
### HTMX Dashboard — Strong
|
|
|
|
The HTMX-served dashboard has solid mobile foundations verified by the automated test suite:
|
|
|
|
- ✅ `viewport-fit=cover` — Dynamic Island / notch support
|
|
- ✅ `apple-mobile-web-app-capable` — Home Screen PWA mode
|
|
- ✅ `safe-area-inset-top/bottom` — padding clears notch and home indicator
|
|
- ✅ `overscroll-behavior: none` — no rubber-band on main page
|
|
- ✅ `-webkit-overflow-scrolling: touch` — momentum scroll in chat
|
|
- ✅ `dvh` units — correct height on iOS with collapsing chrome
|
|
- ✅ 44px touch targets on SEND button and inputs
|
|
- ✅ `font-size: 16px` in mobile query — iOS zoom prevention
|
|
- ✅ `enterkeyhint="send"` — Send-labelled keyboard key
|
|
- ✅ HTMX `hx-sync="this:drop"` — double-tap protection
|
|
- ✅ HTMX `hx-disabled-elt` — in-flight button lockout
|
|
|
|
### Gap: Mobile Quick Actions Page (`/mobile`)
|
|
|
|
The `/mobile` route template shows a "Mobile only" page with quick action tiles and a JS-based chat — but it uses **CSS `display: none` on desktop** via `.mobile-only` with an `@media (min-width: 769px)` rule. The desktop fallback shows a placeholder. This is a valid progressive enhancement approach but the page is not linked from the main nav bar.
|
|
|
|
### React Dashboard — Mobile Not Functional
|
|
|
|
The React dashboard uses `hidden lg:flex` for the left sidebar (desktop only) and an `AnimatePresence` slide-in overlay for mobile. The mobile UX architecture is correct. However, because all data is mock, tapping "Chat" produces a simulated response from a setTimeout, not from Ollama. This is not tested and not usable.
|
|
|
|
---
|
|
|
|
## 4. Human-in-the-Loop (HITL) Mobile Testing
|
|
|
|
**Score: 8/10**
|
|
|
|
The `/mobile-test` route is the standout quality feature. It provides:
|
|
|
|
- 21 structured test scenarios across 7 categories (Layout, Touch, Chat, Health, Scroll, Notch, Live UI)
|
|
- PASS/FAIL/SKIP buttons with sessionStorage persistence across scroll
|
|
- Live pass rate counter and progress bar
|
|
- Accessible on any phone via local network URL
|
|
- ← MISSION CONTROL back-link for easy navigation
|
|
|
|
**Gaps to improve:**
|
|
- No server-side results storage — results lost when tab closes
|
|
- No shareable/exportable report (screenshot required for handoff)
|
|
- React dashboard has no equivalent HITL page
|
|
- No automated Playwright/Selenium mobile tests that could catch regressions
|
|
|
|
---
|
|
|
|
## 5. Security Assessment
|
|
|
|
**Score: 5/10**
|
|
|
|
### XSS Vulnerability — `/mobile` template
|
|
|
|
`mobile.html` line ~85 uses raw `innerHTML` string interpolation with user-supplied message content:
|
|
|
|
```javascript
|
|
// mobile.html — VULNERABLE
|
|
chat.innerHTML += `
|
|
<div class="chat-message user">
|
|
<div>${message}</div> <!-- message is user input, not escaped -->
|
|
</div>
|
|
`;
|
|
```
|
|
|
|
If a user types `<img src=x onerror=alert(1)>`, it executes. This is a stored XSS via `innerHTML`. Fix: use `document.createTextNode(message)` or escape HTML before insertion.
|
|
|
|
The `swarm_live.html` has the same pattern with WebSocket data:
|
|
```javascript
|
|
container.innerHTML = agents.map(agent => `...${agent.name}...`).join('');
|
|
```
|
|
If agent names contain `<script>` tags (or any HTML), this executes in context.
|
|
|
|
### Hardcoded Secrets
|
|
|
|
`l402_proxy.py`: `_MACAROON_SECRET = "timmy-macaroon-secret".encode()` (default)
|
|
`payment_handler.py`: `_HMAC_SECRET = "timmy-sovereign-sats".encode()` (default)
|
|
|
|
Both fall back to env var reads which is correct, but the defaults should not be production-safe strings — they should be None with a startup assertion requiring them to be set.
|
|
|
|
### No Route Authentication
|
|
|
|
All `/swarm/spawn`, `/swarm/tasks`, `/marketplace`, `/agents/timmy/chat` endpoints have no auth guard. On a `--host 0.0.0.0` server, anyone on the local network can post tasks or clear chat history. Acceptable for v1 local-only use but must be documented and gated before LAN exposure.
|
|
|
|
---
|
|
|
|
## 6. Test Coverage
|
|
|
|
**Score: 7/10**
|
|
|
|
| Suite | Tests | Quality |
|
|
|-------|-------|---------|
|
|
| Agent unit | 13 | Good |
|
|
| Backends | 14 | Good |
|
|
| Mobile scenarios | 32 | Excellent — covers M1xx-M6xx categories |
|
|
| Swarm | 29+10+16 | Good |
|
|
| L402 proxy | 13 | Good |
|
|
| Voice NLU | 15 | Good |
|
|
| Dashboard routes | 18+18 | Good |
|
|
| WebSocket | 3 | Thin — no reconnect or message-type tests |
|
|
| React components | 0 | Missing entirely |
|
|
| End-to-end (Playwright) | 0 | Missing |
|
|
|
|
**Key gaps:**
|
|
1. No tests for the XSS vulnerabilities
|
|
2. No tests for the `/mobile` quick-chat endpoint
|
|
3. WebSocket tests don't cover reconnection logic or malformed payloads
|
|
4. React app has zero test coverage
|
|
|
|
---
|
|
|
|
## 7. Code Quality
|
|
|
|
**Score: 7/10**
|
|
|
|
**Strengths:**
|
|
- Clean module separation (`timmy/`, `swarm/`, `dashboard/routes/`, `timmy_serve/`)
|
|
- Consistent use of dataclasses for domain models
|
|
- Good docstrings on all public functions
|
|
- SQLite-backed persistence for both Agno memory and swarm registry
|
|
- pydantic-settings config with `.env` override support
|
|
|
|
**Weaknesses:**
|
|
- Swarm `coordinator.py` uses a module-level singleton `coordinator = SwarmCoordinator()` — not injectable, hard to test in isolation
|
|
- `swarm/registry.py` opens a new SQLite connection on every call (no connection pool)
|
|
- `dashboard/routes/swarm.py` creates a new `Jinja2Templates` instance — it should reuse the one from `app.py`
|
|
- React components import from `@/components/ui/*` which don't exist in the committed tree
|
|
|
|
---
|
|
|
|
## 8. Developer Experience
|
|
|
|
**Score: 6/10**
|
|
|
|
**Strengths:**
|
|
- README is excellent — copy-paste friendly, covers Mac quickstart, phone access, troubleshooting
|
|
- DEVELOPMENT_REPORT.md provides full history of what was built and why
|
|
- `.env.example` covers all config variables
|
|
- Self-TDD watchdog CLI is a creative addition
|
|
|
|
**Weaknesses:**
|
|
- No `docker-compose.yml` — setup requires manual Python venv + Ollama install
|
|
- Two apps (FastAPI + React) with no single `make dev` command to start both
|
|
- `STATUS.md` says v1.0.0 but development is well past that — version drift
|
|
- React app missing from the quickstart instructions entirely
|
|
|
|
---
|
|
|
|
## 9. Backend Architecture
|
|
|
|
**Score: 7/10**
|
|
|
|
The FastAPI backend is well-structured. The swarm subsystem follows a clean coordinator pattern. The L402 mock is architecturally correct (the interface matches what real LND calls would require).
|
|
|
|
**Gaps:**
|
|
- Swarm "agents" are database records — `spawn_agent()` registers a record but no Python process is actually launched. `agent_runner.py` uses `subprocess.Popen` to run `python -m swarm.agent_runner` but no `__main__` block exists in that file.
|
|
- The bidding system (`bidder.py`) runs an asyncio auction but there are no actual bidder agents submitting bids — auctions will always time out with no winner.
|
|
- Voice TTS (`voice_tts.py`) requires `pyttsx3` (optional dep) but the voice route offers no graceful fallback message when pyttsx3 is absent.
|
|
|
|
---
|
|
|
|
## 10. Prioritized Defects
|
|
|
|
| Priority | ID | Issue | File |
|
|
|----------|----|-------|------|
|
|
| P0 | SEC-01 | XSS via innerHTML with unsanitized user input | `mobile.html:85`, `swarm_live.html:72` |
|
|
| P0 | ARCH-01 | React dashboard 100% mock — no backend calls | `dashboard-web/client/src/` |
|
|
| P0 | ARCH-02 | React app not buildable — missing package.json, shadcn/ui | `dashboard-web/client/` |
|
|
| P1 | SEC-02 | Hardcoded L402/HMAC secrets without startup assertion | `l402_proxy.py`, `payment_handler.py` |
|
|
| P1 | FUNC-01 | Swarm spawn creates DB record but no process | `swarm/agent_runner.py` |
|
|
| P1 | FUNC-02 | Auction always fails — no real bid submitters | `swarm/bidder.py` |
|
|
| P2 | UX-01 | `/mobile` route not in desktop nav | `base.html`, `index.html` |
|
|
| P2 | TEST-01 | WebSocket reconnection not tested | `tests/test_websocket.py` |
|
|
| P2 | DX-01 | No single dev startup command | `README.md` |
|
|
| P3 | PERF-01 | SQLite connection opened per-query in registry | `swarm/registry.py` |
|
|
|
|
---
|
|
|
|
## HITL Mobile Test Session Guide
|
|
|
|
To run a complete human-in-the-loop mobile test session right now:
|
|
|
|
```bash
|
|
# 1. Start the dashboard
|
|
source .venv/bin/activate
|
|
uvicorn dashboard.app:app --host 0.0.0.0 --port 8000 --reload
|
|
|
|
# 2. Find your local IP
|
|
ipconfig getifaddr en0 # macOS
|
|
hostname -I # Linux
|
|
|
|
# 3. Open on your phone (same Wi-Fi)
|
|
http://<YOUR_IP>:8000/mobile-test
|
|
|
|
# 4. Work through the 21 scenarios, marking PASS / FAIL / SKIP
|
|
# 5. Screenshot the SUMMARY section for your records
|
|
|
|
# ─── Also test the main dashboard on mobile ───────────────────────
|
|
http://<YOUR_IP>:8000 # Main Mission Control
|
|
http://<YOUR_IP>:8000/mobile # Quick Actions (mobile-optimized)
|
|
```
|
|
|
|
**Critical scenarios to test first:**
|
|
- T01 — iOS zoom prevention (tap input, watch for zoom)
|
|
- C02 — Multi-turn memory (tell Timmy your name, ask it back)
|
|
- C04 — Offline graceful error (stop Ollama, send message)
|
|
- N01/N02 — Notch / home bar clearance (notched iPhone)
|
|
|
|
---
|
|
|
|
## Recommended Next Prompt for Development
|
|
|
|
```
|
|
Connect the React dashboard (dashboard-web/) to the live FastAPI backend.
|
|
|
|
Priority order:
|
|
|
|
1. FIX BUILD FIRST: Add package.json, vite.config.ts, tailwind.config.ts, and
|
|
tsconfig.json to dashboard-web/client/. Install shadcn/ui so the existing
|
|
component imports resolve. Verify `npm run dev` starts the app.
|
|
|
|
2. CHAT (highest user value): Replace ChatPanel mock with real fetch to
|
|
POST /agents/timmy/chat. Show the actual Timmy response from Ollama.
|
|
Implement loading state (matches the existing isTyping UI).
|
|
|
|
3. HEALTH: Replace MOCK_HEALTH in StatusSidebar with a polling fetch to
|
|
GET /health/status (every 30s, matching HTMX behaviour).
|
|
|
|
4. SWARM WEBSOCKET: Open a real WebSocket to ws://localhost:8000/swarm/ws
|
|
and pipe state updates into SwarmPanel — replacing MOCK_WS_EVENTS.
|
|
|
|
5. SECURITY: Fix XSS in mobile.html and swarm_live.html — replace innerHTML
|
|
string interpolation with safe DOM methods (textContent / createTextNode).
|
|
|
|
Use React Query (TanStack) for data fetching with stale-while-revalidate.
|
|
Keep the existing HTMX dashboard running in parallel — the React app should
|
|
be the forward-looking UI.
|
|
```
|
|
|
|
---
|
|
|
|
*Analysis by Claude Code — Senior Architect Review*
|
|
*Timmy Time Dashboard | branch: claude/quality-analysis-mobile-testing-0zgPi*
|