This repository has been archived on 2026-03-24. You can view files and clone it. You cannot open issues or pull requests or push a commit.
Timmy-time-dashboard/PLAN.md
Claude 1103da339c feat: add full creative studio + DevOps tools (Pixel, Lyra, Reel personas)
Adds 3 new personas (Pixel, Lyra, Reel) and 5 new tool modules:

- Git/DevOps tools (GitPython): clone, status, diff, log, blame, branch,
  add, commit, push, pull, stash — wired to Forge and Helm personas
- Image generation (FLUX via diffusers): text-to-image, storyboards,
  variations — Pixel persona
- Music generation (ACE-Step 1.5): full songs with vocals+instrumentals,
  instrumental tracks, vocal-only tracks — Lyra persona
- Video generation (Wan 2.1 via diffusers): text-to-video, image-to-video
  clips — Reel persona
- Creative Director pipeline: multi-step orchestration that chains
  storyboard → music → video → assembly into 3+ minute final videos
- Video assembler (MoviePy + FFmpeg): stitch clips, overlay audio,
  title cards, subtitles, final export

Also includes:
- Spark Intelligence tool-level + creative pipeline event capture
- Creative Studio dashboard page (/creative/ui) with 4 tabs
- Config settings for all new models and output directories
- pyproject.toml creative optional extra for GPU dependencies
- 107 new tests covering all modules (624 total, all passing)

https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
2026-02-24 16:31:47 +00:00

# Plan: Full Creative & DevOps Capabilities for Timmy
## Overview
Add five major capability domains to Timmy's agent system, turning it into a
sovereign creative studio and full-stack DevOps operator. All tools are
open-source, self-hosted, and GPU-accelerated where needed.
---
## Phase 1: Git & DevOps Tools (Forge + Helm personas)
**Goal:** Timmy can observe local/remote repos, read code, create branches,
stage changes, commit, diff, log, and manage PRs — all through the swarm
task system with Spark event capture.
### New module: `src/tools/git_tools.py`
Tools to add (using **GitPython** — BSD-3, `pip install GitPython`):
| Tool | Function | Persona Access |
|---|---|---|
| `git_clone` | Clone a remote repo to local path | Forge, Helm |
| `git_status` | Show working tree status | Forge, Helm, Timmy |
| `git_diff` | Show staged/unstaged diffs | Forge, Helm, Timmy |
| `git_log` | Show recent commit history | Forge, Helm, Echo, Timmy |
| `git_branch` | List/create/switch branches | Forge, Helm |
| `git_add` | Stage files for commit | Forge, Helm |
| `git_commit` | Create a commit with message | Forge, Helm |
| `git_push` | Push to remote | Forge, Helm |
| `git_pull` | Pull from remote | Forge, Helm |
| `git_blame` | Show line-by-line authorship | Forge, Echo |
| `git_stash` | Stash/pop changes | Forge, Helm |
### Changes to existing files
- **`src/timmy/tools.py`** — Add `create_git_tools()` factory, wire into
`PERSONA_TOOLKITS` for Forge and Helm
- **`src/swarm/tool_executor.py`** — Enhance `_infer_tools_needed()` with
git keywords (commit, branch, push, pull, diff, clone, merge)
- **`src/config.py`** — Add `git_default_repo_dir: str = "~/repos"` setting
- **`src/spark/engine.py`** — Add `on_tool_executed()` method to capture
individual tool invocations (not just task-level events)
- **`src/swarm/personas.py`** — Add git-related keywords to Forge and Helm
preferred_keywords
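The keyword enhancement to `_infer_tools_needed()` could look roughly like the sketch below. The real function's signature and return type are not shown in this plan, so the standalone `infer_tools_needed` helper, the `GIT_KEYWORDS` set, and the `"git"` toolkit label are assumptions made for illustration.

```python
import re

# Keywords listed in this plan for git-related tasks.
GIT_KEYWORDS = {"commit", "branch", "push", "pull", "diff", "clone", "merge"}


def infer_tools_needed(task_description: str) -> set[str]:
    """Return toolkit labels whose keywords appear in the task text.

    Hypothetical stand-in for `_infer_tools_needed()`; only the git
    part of the logic is sketched here.
    """
    # Tokenize into lowercase words so "Push," and "push" both match.
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    needed = set()
    if words & GIT_KEYWORDS:
        needed.add("git")
    return needed
```

Matching whole words avoids false positives such as "pullover", at the cost of missing inflected forms like "pushing".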
### New dependency
```toml
# pyproject.toml
dependencies = [
...,
"GitPython>=3.1.40",
]
```
### Dashboard
- **`/tools`** page updated to show git tools in the catalog
- Git tool usage stats visible per agent
### Tests
- `tests/test_git_tools.py` — test all git tool functions against tmp repos
- Mock GitPython's `Repo` class for unit tests
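One way to keep the tools mockable is to inject the repo constructor. The sketch below is an assumption about how `git_status` might be shaped, not the planned implementation: the `repo_factory` parameter and the returned dictionary keys are hypothetical. The GitPython attributes it relies on (`active_branch`, `is_dirty`, `untracked_files`) are real.

```python
from unittest.mock import MagicMock


def git_status(repo_path, repo_factory=None):
    """Summarize working-tree status for the repo at `repo_path`.

    `repo_factory` is a hypothetical injection point; by default it is
    GitPython's `Repo`, imported lazily so tests can run without it.
    """
    if repo_factory is None:
        from git import Repo  # GitPython
        repo_factory = Repo
    repo = repo_factory(repo_path)
    return {
        "branch": repo.active_branch.name,
        "dirty": repo.is_dirty(untracked_files=True),
        "untracked": list(repo.untracked_files),
    }


def test_git_status_reports_untracked_files():
    # Stand-in for a GitPython Repo with one untracked file on "main".
    fake_repo = MagicMock()
    fake_repo.active_branch.name = "main"
    fake_repo.is_dirty.return_value = True
    fake_repo.untracked_files = ["notes.txt"]
    factory = MagicMock(return_value=fake_repo)

    result = git_status("/tmp/example", repo_factory=factory)

    factory.assert_called_once_with("/tmp/example")
    assert result == {"branch": "main", "dirty": True, "untracked": ["notes.txt"]}
```

Tests against temporary repos can use the same function with the default factory, so both testing styles listed above share one code path.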
---
## Phase 2: Image Generation (new "Pixel" persona)
**Goal:** Generate storyboard frames and standalone images from text prompts
using FLUX.2 Klein 4B locally.
### New persona: Pixel — Visual Architect
```python
"pixel": {
    "id": "pixel",
    "name": "Pixel",
    "role": "Visual Architect",
    "description": "Image generation, storyboard frames, and visual design.",
    "capabilities": "image-generation,storyboard,design",
    "rate_sats": 80,
    "bid_base": 60,
    "bid_jitter": 20,
    "preferred_keywords": [
        "image", "picture", "photo", "draw", "illustration",
        "storyboard", "frame", "visual", "design", "generate",
        "portrait", "landscape", "scene", "artwork",
    ],
}
```
### New module: `src/tools/image_tools.py`
Tools (using **diffusers** + **FLUX.2 Klein 4B** — Apache 2.0):
| Tool | Function |
|---|---|
| `generate_image` | Text-to-image generation (returns file path) |
| `generate_storyboard` | Generate N frames from scene descriptions |
| `image_variations` | Generate variations of an existing image |
### Architecture
```
generate_image(prompt, width=1024, height=1024, steps=4)
  → loads FLUX.2 Klein via diffusers FluxPipeline
  → saves to data/images/{uuid}.png
  → returns path + metadata
```
- Model loaded lazily on first use, kept in memory for subsequent calls
- Falls back to CPU generation (slower) if no GPU
- Output saved to `data/images/` with metadata JSON sidecar
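The lazy-loading and sidecar behavior described above can be sketched without any GPU dependency. The `LazyModel` class and `save_with_sidecar` helper are hypothetical names, and the loader is passed in as a plain callable so that the actual diffusers call stays out of this example.

```python
import json
import uuid
from pathlib import Path


class LazyModel:
    """Defer an expensive load until first use, then reuse the result."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that builds the model
        self._model = None

    def get(self):
        if self._model is None:
            self._model = self._loader()
        return self._model


def save_with_sidecar(image_bytes: bytes, metadata: dict, out_dir="data/images") -> Path:
    """Write `{uuid}.png` plus a `{uuid}.json` sidecar; return the image path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    image_path = out / f"{uuid.uuid4().hex}.png"
    image_path.write_bytes(image_bytes)
    image_path.with_suffix(".json").write_text(json.dumps(metadata, indent=2))
    return image_path
```

Keeping the sidecar next to the image under the same stem means a gallery view can find metadata by swapping the file suffix.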
### New dependency (optional extra)
```toml
[project.optional-dependencies]
creative = [
"diffusers>=0.30.0",
"transformers>=4.40.0",
"accelerate>=0.30.0",
"torch>=2.2.0",
"safetensors>=0.4.0",
]
```
### Config
```python
# config.py additions
flux_model_id: str = "black-forest-labs/FLUX.2-klein-4b"
image_output_dir: str = "data/images"
image_default_steps: int = 4
```
### Dashboard
- `/creative/ui` — new Creative Studio page (image gallery + generation form)
- HTMX-powered: submit prompt, poll for result, display inline
- Gallery view of all generated images with metadata
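The submit-then-poll flow needs some server-side record of running jobs. The sketch below shows one possible in-memory shape. The `JobStore` class, its method names, and the status strings are assumptions, and the web framework and routes that would call it are not specified in this plan.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor


class JobStore:
    """In-memory registry of background generation jobs."""

    def __init__(self, max_workers: int = 1):
        # One worker keeps GPU-bound jobs from running concurrently.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: dict[str, Future] = {}

    def submit(self, fn, *args, **kwargs) -> str:
        """Start `fn` in the background and return a job id to poll."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(fn, *args, **kwargs)
        return job_id

    def poll(self, job_id: str) -> dict:
        """Return the job state: unknown, pending, failed, or done."""
        future = self._jobs.get(job_id)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "pending"}
        error = future.exception()
        if error is not None:
            return {"status": "failed", "error": str(error)}
        return {"status": "done", "result": future.result()}
```

A form handler would call `submit` and return the job id; the polling endpoint would call `poll` and render the image once the status is `done`.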
### Tests
- `tests/test_image_tools.py` — mock diffusers pipeline, test prompt handling,
file output, storyboard generation
---
## Phase 3: Music Generation (new "Lyra" persona)
**Goal:** Generate full songs with vocals, instrumentals, and lyrics using
ACE-Step 1.5 locally.
### New persona: Lyra — Sound Weaver
```python
"lyra": {
    "id": "lyra",
    "name": "Lyra",
    "role": "Sound Weaver",
    "description": "Music and song generation with vocals, instrumentals, and lyrics.",
    "capabilities": "music-generation,vocals,composition",
    "rate_sats": 90,
    "bid_base": 70,
    "bid_jitter": 20,
    "preferred_keywords": [
        "music", "song", "sing", "vocal", "instrumental",
        "melody", "beat", "track", "compose", "lyrics",
        "audio", "sound", "album", "remix",
    ],
}
```
### New module: `src/tools/music_tools.py`
Tools (using **ACE-Step 1.5** — Apache 2.0, `pip install ace-step`):
| Tool | Function |
|---|---|
| `generate_song` | Text/lyrics → full song (vocals + instrumentals) |
| `generate_instrumental` | Text prompt → instrumental track |
| `generate_vocals` | Lyrics + style → vocal track |
| `list_genres` | Return supported genre/style tags |
### Architecture
```
generate_song(lyrics, genre="pop", duration=120, language="en")
  → loads ACE-Step model (lazy, cached)
  → generates audio
  → saves to data/music/{uuid}.wav
  → returns path + metadata (duration, genre, etc.)
```
- Model loaded lazily, ~4GB VRAM minimum
- Output saved to `data/music/` with metadata sidecar
- Supports 19 languages, genre tags, tempo control
### New dependency (optional extra, extends `creative`)
```toml
[project.optional-dependencies]
creative = [
...,
"ace-step>=1.5.0",
]
```
### Config
```python
music_output_dir: str = "data/music"
ace_step_model: str = "ace-step/ACE-Step-v1.5"
```
### Dashboard
- `/creative/ui` expanded with Music tab
- Audio player widget (HTML5 `<audio>` element)
- Lyrics input form with genre/style selector
### Tests
- `tests/test_music_tools.py` — mock ACE-Step model, test generation params
---
## Phase 4: Video Generation (new "Reel" persona)
**Goal:** Generate video clips from text/image prompts using Wan 2.1 locally.
### New persona: Reel — Motion Director
```python
"reel": {
    "id": "reel",
    "name": "Reel",
    "role": "Motion Director",
    "description": "Video generation from text and image prompts.",
    "capabilities": "video-generation,animation,motion",
    "rate_sats": 100,
    "bid_base": 80,
    "bid_jitter": 20,
    "preferred_keywords": [
        "video", "clip", "animate", "motion", "film",
        "scene", "cinematic", "footage", "render", "timelapse",
    ],
}
```
### New module: `src/tools/video_tools.py`
Tools (using **Wan 2.1** via diffusers — Apache 2.0):
| Tool | Function |
|---|---|
| `generate_video_clip` | Text → short video clip (3-6 seconds) |
| `image_to_video` | Image + prompt → animated video from still |
| `list_video_styles` | Return supported style presets |
### Architecture
```
generate_video_clip(prompt, duration=5, resolution="480p", fps=24)
  → loads Wan 2.1 via diffusers pipeline (lazy, cached)
  → generates frames
  → encodes to MP4 via FFmpeg
  → saves to data/video/{uuid}.mp4
  → returns path + metadata
```
- Wan 2.1 1.3B model: ~16GB VRAM
- Output saved to `data/video/`
- Resolution options: 480p (16GB), 720p (24GB+)
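Two small pieces of the flow above are easy to pin down: how many frames a clip needs, and what the FFmpeg encode step might look like. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the generated frames are written to disk as a numbered PNG sequence, which the plan does not state.

```python
def frame_count(duration_s: float, fps: int = 24) -> int:
    """Number of frames needed for a clip of `duration_s` seconds."""
    return round(duration_s * fps)


def ffmpeg_encode_command(frames_pattern: str, output_path: str, fps: int = 24) -> list[str]:
    """Build an FFmpeg command that encodes numbered PNG frames to H.264 MP4."""
    return [
        "ffmpeg",
        "-y",                        # overwrite the output file if it exists
        "-framerate", str(fps),      # input frame rate of the image sequence
        "-i", frames_pattern,        # e.g. frames/frame_%05d.png
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",       # widely compatible pixel format
        output_path,
    ]
```

With the defaults shown in the architecture block, a 5-second clip at 24 fps needs `frame_count(5)` = 120 frames. The command list can be passed to `subprocess.run(cmd, check=True)`.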
### New dependency (extends `creative` extra)
```toml
creative = [
...,
# Wan 2.1 uses diffusers (already listed) + model weights downloaded on first use
]
```
### Config
```python
video_output_dir: str = "data/video"
wan_model_id: str = "Wan-AI/Wan2.1-T2V-1.3B"
video_default_resolution: str = "480p"
```
### Tests
- `tests/test_video_tools.py` — mock diffusers pipeline, test clip generation
---
## Phase 5: Creative Director — Storyboard & Assembly Pipeline
**Goal:** Orchestrate multi-persona workflows to produce 3+ minute creative
videos with music, narration, and stitched scenes.
### New module: `src/creative/director.py`
The Creative Director is a **multi-step pipeline** that coordinates Pixel,
Lyra, and Reel to produce complete creative works:
```
User: "Create a 3-minute music video about a sunrise over mountains"
              Creative Director
        ┌─────────────┼──────────────┐
        │             │              │
  1. STORYBOARD   2. MUSIC      3. GENERATE
     (Pixel)       (Lyra)         (Reel)
        │             │              │
     N scene      Full song    N video clips
  descriptions      with      from storyboard
   + keyframes     vocals         frames
        │             │              │
        └─────────────┼──────────────┘
                 4. ASSEMBLE
             (MoviePy + FFmpeg)
              Final video with
             music, transitions,
                   titles
```
### Pipeline steps
1. **Script** — Timmy (or Quill) writes scene descriptions and lyrics
2. **Storyboard** — Pixel generates keyframe images for each scene
3. **Music** — Lyra generates the soundtrack (vocals + instrumentals)
4. **Video clips** — Reel generates video for each scene (image-to-video
from storyboard frames, or text-to-video from descriptions)
5. **Assembly** — MoviePy stitches clips together with cross-fades,
overlays the music track, adds title cards
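The five steps above can be sketched as an ordered list of callables that share a project object. Everything here is illustrative: `Project`, `run_pipeline`, and the step names are hypothetical, and the lambdas are stubs standing in for the Pixel, Lyra, and Reel tools.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Project:
    """Accumulates the artifacts produced by each pipeline step."""
    brief: str
    artifacts: dict = field(default_factory=dict)
    completed: list = field(default_factory=list)


# Each step receives the project so far and returns its artifact.
Step = Callable[[Project], object]


def run_pipeline(brief: str, steps: list[tuple[str, Step]]) -> Project:
    """Run the named steps in order, storing each result under its name."""
    project = Project(brief=brief)
    for name, step in steps:
        project.artifacts[name] = step(project)
        project.completed.append(name)
    return project


# Stub steps: real ones would call the image, music, and video tools.
steps = [
    ("script", lambda p: ["sunrise over peaks", "valley in mist"]),
    ("storyboard", lambda p: [f"frame_{i}.png" for i, _ in enumerate(p.artifacts["script"])]),
    ("music", lambda p: "song.wav"),
    ("clips", lambda p: [f.replace(".png", ".mp4") for f in p.artifacts["storyboard"]]),
    ("assembly", lambda p: "final.mp4"),
]
project = run_pipeline("3-minute music video about a sunrise", steps)
```

This version runs the steps strictly in sequence for simplicity. Storyboard and music do not depend on each other, so a fuller implementation could run them concurrently.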
### New module: `src/creative/assembler.py`
Video assembly engine (using **MoviePy** — MIT, `pip install moviepy`):
| Function | Purpose |
|---|---|
| `stitch_clips` | Concatenate video clips with transitions |
| `overlay_audio` | Mix music track onto video |
| `add_title_card` | Prepend/append title/credits |
| `add_subtitles` | Burn lyrics/captions onto video |
| `export_final` | Encode final video (H.264 + AAC) |
### New dependency
```toml
dependencies = [
...,
"moviepy>=2.0.0",
]
```
### Config
```python
creative_output_dir: str = "data/creative"
video_transition_duration: float = 1.0 # seconds
default_video_codec: str = "libx264"
```
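The transition duration affects how many clips a 3+ minute video needs, because each cross-fade overlaps two clips. The helper below is a worked sketch under that assumption (the name `clips_needed` is hypothetical): n clips of length d with transitions of length t last n·d − (n − 1)·t seconds in total.

```python
import math


def clips_needed(target_s: float, clip_s: float, transition_s: float = 1.0) -> int:
    """Smallest clip count whose cross-faded total reaches `target_s` seconds.

    Assumes each cross-fade overlaps adjacent clips by `transition_s`, so
    n clips last n * clip_s - (n - 1) * transition_s seconds in total.
    """
    if clip_s <= transition_s:
        raise ValueError("clips must be longer than the transition")
    return max(1, math.ceil((target_s - transition_s) / (clip_s - transition_s)))
```

For a 180-second target built from 5-second clips, hard cuts (`transition_s=0`) need 36 clips, while 1-second cross-fades need 45, since 44 clips would only reach 177 seconds.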
### Dashboard
- `/creative/ui` — Full Creative Studio with tabs:
- **Images** — gallery + generation form
- **Music** — player + generation form
- **Video** — player + generation form
- **Director** — multi-step pipeline builder with storyboard view
- `/creative/projects` — saved projects with all assets
- `/creative/projects/{id}` — project detail with timeline view
### Tests
- `tests/test_assembler.py` — test stitching, audio overlay, title cards
- `tests/test_director.py` — test pipeline orchestration with mocks
---
## Phase 6: Spark Integration for All New Tools
**Goal:** Every tool invocation and creative pipeline step gets captured by
Spark Intelligence for learning and advisory.
### Changes to `src/spark/engine.py`
```python
from typing import Optional

# Methods to add to the Spark engine class.

def on_tool_executed(
    self, agent_id: str, tool_name: str,
    task_id: Optional[str], success: bool,
    duration_ms: Optional[int] = None,
) -> Optional[str]:
    """Capture individual tool invocations."""


def on_creative_step(
    self, project_id: str, step_name: str,
    agent_id: str, output_path: Optional[str],
) -> Optional[str]:
    """Capture creative pipeline progress."""
```
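A thin wrapper around each tool call is one way to feed `on_tool_executed()` with a success flag and duration. This is a sketch, not the planned wiring: `run_tool_with_capture` is a hypothetical helper, and `engine` is any object exposing the method signature shown above.

```python
import time
from typing import Callable, Optional


def run_tool_with_capture(
    engine,
    agent_id: str,
    tool_name: str,
    tool: Callable[..., object],
    *args,
    task_id: Optional[str] = None,
    **kwargs,
):
    """Call `tool`, then report success and duration to the Spark engine."""
    start = time.perf_counter()
    success = False
    try:
        result = tool(*args, **kwargs)
        success = True
        return result
    finally:
        # Runs on both success and failure, so failed calls are captured too.
        duration_ms = int((time.perf_counter() - start) * 1000)
        engine.on_tool_executed(
            agent_id, tool_name, task_id, success, duration_ms=duration_ms
        )
```

Exceptions from the tool still propagate to the caller; the wrapper only records that the call failed.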
### New advisor patterns
- "Pixel generates storyboards 40% faster than individual image calls"
- "Lyra's pop genre tracks have 85% higher completion rate than jazz"
- "Video generation on 480p uses 60% less GPU time than 720p for similar quality"
- "Git commits from Forge average 3 files per commit"
---
## Implementation Order
| Phase | What | New Files | Est. Tests |
|---|---|---|---|
| 1 | Git/DevOps tools | 2 source + 1 test | ~25 |
| 2 | Image generation | 2 source + 1 test + 1 template | ~15 |
| 3 | Music generation | 1 source + 1 test | ~12 |
| 4 | Video generation | 1 source + 1 test | ~12 |
| 5 | Creative Director pipeline | 2 source + 2 tests + 1 template | ~20 |
| 6 | Spark tool-level capture | 1 modified + 1 test update | ~8 |
**Total: ~10 new source files, ~6 new test files, ~92 new tests**
---
## New Dependencies Summary
**Required (always installed):**
```
GitPython>=3.1.40
moviepy>=2.0.0
```
**Optional `creative` extra (GPU features):**
```
diffusers>=0.30.0
transformers>=4.40.0
accelerate>=0.30.0
torch>=2.2.0
safetensors>=0.4.0
ace-step>=1.5.0
```
**Install:** `pip install ".[creative]"` for full creative stack
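Because the GPU stack is optional, code that uses it may want to check what is importable before loading a model. The sketch below uses only the standard library. The `missing_modules` helper is hypothetical, and `ace-step` is left out of the tuple because its import name is not given in this plan.

```python
from importlib.util import find_spec

# Top-level import names for packages in the `creative` extra.
CREATIVE_MODULES = ("diffusers", "transformers", "accelerate", "torch", "safetensors")


def missing_modules(modules=CREATIVE_MODULES) -> list[str]:
    """Return the names of modules that cannot be imported in this environment."""
    return [name for name in modules if find_spec(name) is None]
```

A tool could call `missing_modules()` on first use and return a clear "install the creative extra" error instead of an `ImportError` traceback.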
---
## New Persona Summary
| ID | Name | Role | Tools |
|---|---|---|---|
| pixel | Pixel | Visual Architect | generate_image, generate_storyboard, image_variations |
| lyra | Lyra | Sound Weaver | generate_song, generate_instrumental, generate_vocals |
| reel | Reel | Motion Director | generate_video_clip, image_to_video |
These join the existing 6 personas (Echo, Mace, Helm, Seer, Forge, Quill)
for a total of **9 specialized agents** in the swarm.
---
## Hardware Requirements
- **CPU only:** Git tools, MoviePy assembly, all tests (mocked)
- **8GB VRAM:** FLUX.2 Klein 4B (images)
- **4GB VRAM:** ACE-Step 1.5 (music)
- **16GB VRAM:** Wan 2.1 1.3B (video at 480p)
- **Recommended:** RTX 4090 24GB runs the entire stack comfortably
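
The VRAM figures above translate into a simple capability check. The helper below is a sketch with hypothetical names; the numbers come from this plan (the 720p figure from the Phase 4 resolution options). Each capability is checked on its own, since the plan does not say whether models stay resident at the same time.

```python
# Minimum VRAM in GB per capability, taken from the figures in this plan.
VRAM_REQUIREMENTS_GB = {
    "music (ACE-Step 1.5)": 4,
    "images (FLUX.2 Klein 4B)": 8,
    "video 480p (Wan 2.1 1.3B)": 16,
    "video 720p (Wan 2.1)": 24,
}


def runnable_capabilities(vram_gb: float) -> list[str]:
    """List capabilities whose individual VRAM minimum fits in `vram_gb`."""
    return [name for name, need in VRAM_REQUIREMENTS_GB.items() if vram_gb >= need]
```

For example, a card with 8 GB would report the music and image capabilities but neither video option.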