timmy-time-app/SPEC.md
2026-03-14 20:38:02 -04:00


Timmy Time App — MVP Specification

Overview

Native iPad app that serves as the primary interface to a sovereign AI agent (Timmy) running on a Mac server. All heavy computation stays on the Mac. The iPad handles UI, sensor capture, and local preprocessing.

Target Platform

  • iPadOS 26.1+
  • iPad Pro 13" (primary target)
  • Swift 6 / SwiftUI
  • Minimum deployment target: iPadOS 26.1

Server Requirements

The Mac runs the existing Timmy dashboard (FastAPI), extended with new API endpoints built specifically for the app. All communication happens over Tailscale (private network).

API Endpoints (to build on Mac side)

Chat

POST /api/v1/chat
  Body: { "message": "string", "session_id": "string", "attachments": [...] }
  Response: streaming text/event-stream (SSE)

GET /api/v1/chat/history?session_id=X&limit=50
  Response: { "messages": [...] }
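
The streamed reply could be framed as below. This is a minimal sketch of the server-side event framing only; the per-event payload shape and the `[DONE]` sentinel are assumptions, not part of the spec.

```python
# Sketch: frame model output chunks as text/event-stream (SSE) events.
# The {"session_id", "delta"} payload and the "[DONE]" sentinel are
# illustrative assumptions, not defined by this spec.
import json

def sse_events(token_chunks, session_id):
    """Yield SSE frames for a sequence of model output chunks."""
    for chunk in token_chunks:
        payload = json.dumps({"session_id": session_id, "delta": chunk})
        yield f"data: {payload}\n\n"
    # Final sentinel so the iPad client knows the stream is complete.
    yield "data: [DONE]\n\n"

frames = list(sse_events(["Here's ", "what I found..."], session_id="abc"))
```

On the Mac side this generator would be wrapped in FastAPI's `StreamingResponse` with `media_type="text/event-stream"`; the iPad's ChatService parses each `data:` line as it arrives.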

Upload / Media

POST /api/v1/upload
  Body: multipart/form-data with file
  Response: { "id": "string", "type": "image|audio|document|url",
              "summary": "string", "metadata": {...} }

The upload endpoint auto-detects the media type and routes it to the appropriate processor:

  • Images (jpg, png, heic) → vision model analysis
  • Audio (m4a, mp3, wav, caf) → Whisper transcription
  • Documents (pdf, txt, md) → text extraction
  • URLs (detected from text) → web page extraction

Status

GET /api/v1/status
  Response: { "timmy": "online|offline", "model": "qwen3:30b",
              "ollama": "running|stopped", "uptime": "..." }
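
On the app side, the StatusBar only needs a boolean out of this payload. A sketch of that check, assuming (not specified above) that the app treats Timmy as reachable only when both the agent and Ollama report healthy:

```python
# Sketch: derive the StatusBar's online/offline indicator from the
# /api/v1/status response. The "both must be healthy" rule is an
# assumption about the app's behavior, not stated in the spec.
import json

def is_healthy(raw: str) -> bool:
    status = json.loads(raw)
    return status.get("timmy") == "online" and status.get("ollama") == "running"

example = '{"timmy": "online", "model": "qwen3:30b", "ollama": "running", "uptime": "3d"}'
```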

App Structure

TimmyTime/
├── TimmyTimeApp.swift          # App entry point
├── Models/
│   ├── Message.swift           # Chat message model
│   ├── Attachment.swift        # Media attachment model
│   └── ServerConfig.swift      # Server URL, auth config
├── Views/
│   ├── ChatView.swift          # Main chat interface
│   ├── MessageBubble.swift     # Individual message rendering
│   ├── AttachmentPicker.swift  # Photo/file/camera picker
│   ├── VoiceButton.swift       # Hold-to-talk microphone
│   ├── SettingsView.swift      # Server URL config
│   └── StatusBar.swift         # Connection/model status
├── Services/
│   ├── ChatService.swift       # HTTP + SSE streaming client
│   ├── UploadService.swift     # Multipart file upload
│   ├── AudioRecorder.swift     # AVFoundation mic recording
│   └── PersistenceService.swift # Local chat history (SwiftData)
├── Assets.xcassets/            # App icons, colors
└── Info.plist

Screen Layout (iPad Landscape)

┌──────────────────────────────────────────────────────────┐
│ ◉ Timmy Time          qwen3:30b ● Online         ⚙️     │
├──────────────────────────────────────────────────────────┤
│                                                          │
│   ┌─────────────────────────────┐                        │
│   │  Timmy:                     │                        │
│   │  Here's what I found...     │                        │
│   └─────────────────────────────┘                        │
│                                                          │
│                        ┌─────────────────────────────┐   │
│                        │  You:                       │   │
│                        │  [📷 photo.jpg]             │   │
│                        │  What's in this image?      │   │
│                        └─────────────────────────────┘   │
│                                                          │
│   ┌─────────────────────────────┐                        │
│   │  Timmy:                     │                        │
│   │  I see a circuit board...   │                        │
│   │  ▋ (streaming)              │                        │
│   └─────────────────────────────┘                        │
│                                                          │
├──────────────────────────────────────────────────────────┤
│  📎  📷  🎤 │  Type a message...                    ➤   │
└──────────────────────────────────────────────────────────┘

📎 = file picker    📷 = camera    🎤 = hold to talk    ➤ = send

Data Flow

  1. User types/speaks/attaches media
  2. App sends to Mac: POST /api/v1/chat (text) or /api/v1/upload (media)
  3. Upload endpoint processes media, returns summary + ID
  4. Chat endpoint receives message + attachment IDs, streams response
  5. App renders streaming response in real time
  6. Chat saved to local SwiftData store for offline viewing
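
Steps 2-4 imply a two-request handshake: the upload response's `id` is what the chat request carries in `attachments`. A sketch of composing that chat body; the assumption that `attachments` holds upload IDs (rather than full objects) is illustrative:

```python
# Sketch: compose the POST /api/v1/chat body from prior /api/v1/upload
# responses. Carrying bare upload IDs in "attachments" is an assumption
# about the wire format, not fixed by the spec.

def build_chat_request(message, session_id, upload_results=()):
    """Build the chat request body, referencing already-uploaded media by id."""
    return {
        "message": message,
        "session_id": session_id,
        "attachments": [u["id"] for u in upload_results],
    }

upload_response = {"id": "att_1", "type": "image",
                   "summary": "a circuit board", "metadata": {}}
body = build_chat_request("What's in this image?", "s1", [upload_response])
```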

Design Principles

  • Touch-first. Everything reachable by thumb. No tiny tap targets.
  • Sovereign. No cloud dependencies. All traffic stays on Tailscale.
  • Media-rich. Images, audio, links displayed inline, not as file names.
  • Fast. Streaming responses start appearing immediately.
  • Simple. One screen. Chat is the interface. Everything else is secondary.

Color Palette

  • Dark mode primary (easier on eyes, looks good on OLED iPad Pro)
  • Accent color: match Timmy's personality — warm gold or sovereign blue
  • Message bubbles: subtle differentiation between user and Timmy

Phase 2 Features (post-MVP)

  • Apple Pencil: draw on images, handwriting input
  • Core ML: on-device Whisper for instant transcription
  • Split View: chat + status side by side
  • Drag and drop from other apps
  • Share Sheet extension ("Send to Timmy")
  • Push notifications for long-running task completion