docs: project README and MVP specification

Author: Alexander Whitestone
Date: 2026-03-14 20:38:02 -04:00
parent 52a45186ba
commit 93058a3e54
2 changed files with 194 additions and 2 deletions


@@ -1,3 +1,52 @@
# Timmy Time
Sovereign AI command center for iPad. Chat with Timmy, share images, voice,
links, and documents. All processing happens on your Mac — the iPad is the
eyes, ears, and hands; the Mac is the brain.
## Architecture
```
iPad (native SwiftUI app) Mac (headless server)
┌─────────────────────────┐ ┌──────────────────────────┐
│ Chat UI │ │ FastAPI backend │
│ Camera / Photos │──────→ │ POST /api/v1/chat │
│ Microphone │ Tail- │ POST /api/v1/upload │
│ File picker │ scale │ WS /api/v1/chat/ws │
│ URL sharing │ │ │
│ │ ←───── │ Media pipeline: │
│ Streaming responses │ │ Images → Vision model │
│ Voice playback │ │ Audio → Whisper STT │
│ Rich media display │ │ URLs → Web extract │
│ │ │ Docs → OCR/parse │
│ │ │ │
│ │ │ Timmy agent (Agno) │
│ │ │ Ollama (qwen3:30b) │
└─────────────────────────┘ └──────────────────────────┘
```
## Requirements
- iPad Pro with iPadOS 26.1+
- Mac with macOS 14+ running the Timmy dashboard server
- Tailscale connecting both devices
- Xcode 26+ on Mac for building
## MVP Features
- [x] Project structure
- [ ] Chat screen with streaming responses
- [ ] Image attachment (camera + photo library)
- [ ] Voice input (mic → Whisper on Mac)
- [ ] URL link sharing and summarization
- [ ] Persistent chat history
- [ ] Server-side API endpoints for the app
## Future
- Apple Pencil drawing/annotation
- On-device models (Core ML)
- LiDAR / ARKit spatial awareness
- Siri Shortcuts integration
- Push notifications
- Video processing

SPEC.md (new file)

@@ -0,0 +1,143 @@
# Timmy Time App — MVP Specification
## Overview
Native iPad app that serves as the primary interface to a sovereign AI agent
(Timmy) running on a Mac server. All heavy computation stays on the Mac.
The iPad handles UI, sensor capture, and local preprocessing.
## Target Platform
- iPadOS 26.1+
- iPad Pro 13" (primary target)
- Swift 6 / SwiftUI
- Minimum deployment target: iPadOS 26.1
## Server Requirements
The Mac runs the existing Timmy dashboard (FastAPI) with new API endpoints
added specifically for the app. Communication over Tailscale (private network).
### API Endpoints (to build on Mac side)
#### Chat
```
POST /api/v1/chat
Body: { "message": "string", "session_id": "string", "attachments": [...] }
Response: streaming text/event-stream (SSE)
GET /api/v1/chat/history?session_id=X&limit=50
Response: { "messages": [...] }
```
#### Upload / Media
```
POST /api/v1/upload
Body: multipart/form-data with file
Response: { "id": "string", "type": "image|audio|document|url",
"summary": "string", "metadata": {...} }
```
The upload endpoint auto-detects media type and routes to the right processor:
- Images (jpg, png, heic) → vision model analysis
- Audio (m4a, mp3, wav, caf) → Whisper transcription
- Documents (pdf, txt, md) → text extraction
- URLs (detected from text) → web page extraction
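The routing table above could start as a simple extension map (a sketch under the formats listed; `detect_media_type` and `EXTENSION_MAP` are assumed names, not the server's actual code):

```python
# Sketch: map an uploaded filename (or pasted text) to a processor type,
# mirroring the routing table above. detect_media_type is hypothetical.
from pathlib import Path

EXTENSION_MAP = {
    ".jpg": "image", ".jpeg": "image", ".png": "image", ".heic": "image",
    ".m4a": "audio", ".mp3": "audio", ".wav": "audio", ".caf": "audio",
    ".pdf": "document", ".txt": "document", ".md": "document",
}

def detect_media_type(name_or_text: str) -> str:
    """Return image|audio|document|url, matching the upload response schema."""
    if name_or_text.startswith(("http://", "https://")):
        return "url"
    ext = Path(name_or_text).suffix.lower()
    if ext in EXTENSION_MAP:
        return EXTENSION_MAP[ext]
    raise ValueError(f"unsupported media type: {name_or_text}")
```

Raising on unknown extensions (rather than guessing) keeps the failure visible to the app, which can then show a clear "unsupported file" message.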
#### Status
```
GET /api/v1/status
Response: { "timmy": "online|offline", "model": "qwen3:30b",
"ollama": "running|stopped", "uptime": "..." }
```
## App Structure
```
TimmyTime/
├── TimmyTimeApp.swift # App entry point
├── Models/
│ ├── Message.swift # Chat message model
│ ├── Attachment.swift # Media attachment model
│ └── ServerConfig.swift # Server URL, auth config
├── Views/
│ ├── ChatView.swift # Main chat interface
│ ├── MessageBubble.swift # Individual message rendering
│ ├── AttachmentPicker.swift # Photo/file/camera picker
│ ├── VoiceButton.swift # Hold-to-talk microphone
│ ├── SettingsView.swift # Server URL config
│ └── StatusBar.swift # Connection/model status
├── Services/
│ ├── ChatService.swift # HTTP + SSE streaming client
│ ├── UploadService.swift # Multipart file upload
│ ├── AudioRecorder.swift # AVFoundation mic recording
│ └── PersistenceService.swift # Local chat history (SwiftData)
├── Assets.xcassets/ # App icons, colors
└── Info.plist
```
## Screen Layout (iPad Landscape)
```
┌──────────────────────────────────────────────────────────┐
│ ◉ Timmy Time qwen3:30b ● Online ⚙️ │
├──────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────┐ │
│ │ Timmy: │ │
│ │ Here's what I found... │ │
│ └─────────────────────────────┘ │
│ │
│ ┌─────────────────────────────┐ │
│ │ You: │ │
│ │ [📷 photo.jpg] │ │
│ │ What's in this image? │ │
│ └─────────────────────────────┘ │
│ │
│ ┌─────────────────────────────┐ │
│ │ Timmy: │ │
│ │ I see a circuit board... │ │
│ │ ▋ (streaming) │ │
│ └─────────────────────────────┘ │
│ │
├──────────────────────────────────────────────────────────┤
│ 📎 📷 🎤 │ Type a message... ➤ │
└──────────────────────────────────────────────────────────┘
📎 = file picker 📷 = camera 🎤 = hold to talk ➤ = send
```
## Data Flow
1. User types/speaks/attaches media
2. App sends to Mac: POST /api/v1/chat (text) or /api/v1/upload (media)
3. Upload endpoint processes media, returns summary + ID
4. Chat endpoint receives message + attachment IDs, streams response
5. App renders streaming response in real time
6. Chat saved to local SwiftData store for offline viewing
## Design Principles
- **Touch-first.** Everything reachable by thumb. No tiny tap targets.
- **Sovereign.** No cloud dependencies. All traffic stays on Tailscale.
- **Media-rich.** Images, audio, links displayed inline, not as file names.
- **Fast.** Streaming responses start appearing immediately.
- **Simple.** One screen. Chat is the interface. Everything else is secondary.
## Color Palette
- Dark mode primary (easier on eyes, looks good on OLED iPad Pro)
- Accent color: match Timmy's personality — warm gold or sovereign blue
- Message bubbles: subtle differentiation between user and Timmy
## Phase 2 Features (post-MVP)
- Apple Pencil: draw on images, handwriting input
- Core ML: on-device Whisper for instant transcription
- Split View: chat + status side by side
- Drag and drop from other apps
- Share Sheet extension ("Send to Timmy")
- Push notifications for long-running task completion