Plan: Full Creative & DevOps Capabilities for Timmy
Overview
Add five major capability domains to Timmy's agent system, turning it into a sovereign creative studio and full-stack DevOps operator. All tools are open-source, self-hosted, and GPU-accelerated where needed.
Phase 1: Git & DevOps Tools (Forge + Helm personas)
Goal: Timmy can observe local/remote repos, read code, create branches, stage changes, commit, diff, log, and manage PRs — all through the swarm task system with Spark event capture.
New module: src/tools/git_tools.py
Tools to add (using GitPython — BSD-3, pip install GitPython):
| Tool | Function | Persona Access |
|---|---|---|
| git_clone | Clone a remote repo to local path | Forge, Helm |
| git_status | Show working tree status | Forge, Helm, Timmy |
| git_diff | Show staged/unstaged diffs | Forge, Helm, Timmy |
| git_log | Show recent commit history | Forge, Helm, Echo, Timmy |
| git_branch | List/create/switch branches | Forge, Helm |
| git_add | Stage files for commit | Forge, Helm |
| git_commit | Create a commit with message | Forge, Helm |
| git_push | Push to remote | Forge, Helm |
| git_pull | Pull from remote | Forge, Helm |
| git_blame | Show line-by-line authorship | Forge, Echo |
| git_stash | Stash/pop changes | Forge, Helm |
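The Persona Access column can be read as a simple allow-list. The following is a minimal sketch of that idea, not the project's actual wiring: `TOOL_ACCESS`, `can_use()`, and `tools_for()` are hypothetical names introduced here for illustration (the plan itself routes tools through `PERSONA_TOOLKITS`).

```python
# Hypothetical allow-list mirroring the table; names are illustrative only.
READ_WRITE = {"forge", "helm"}

TOOL_ACCESS: dict[str, set[str]] = {
    "git_clone": READ_WRITE,
    "git_status": READ_WRITE | {"timmy"},
    "git_diff": READ_WRITE | {"timmy"},
    "git_log": READ_WRITE | {"echo", "timmy"},
    "git_branch": READ_WRITE,
    "git_add": READ_WRITE,
    "git_commit": READ_WRITE,
    "git_push": READ_WRITE,
    "git_pull": READ_WRITE,
    "git_blame": {"forge", "echo"},
    "git_stash": READ_WRITE,
}


def can_use(persona: str, tool: str) -> bool:
    """Return True if the persona is listed for the tool; unknown tools are denied."""
    return persona.lower() in TOOL_ACCESS.get(tool, set())


def tools_for(persona: str) -> list[str]:
    """List the git tools a persona may call, in table order."""
    return [tool for tool in TOOL_ACCESS if can_use(persona, tool)]
```

Under this reading, read-only personas such as Timmy and Echo never receive tools that mutate a repository.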
Changes to existing files
- src/timmy/tools.py: Add create_git_tools() factory, wire into PERSONA_TOOLKITS for Forge and Helm
- src/swarm/tool_executor.py: Enhance _infer_tools_needed() with git keywords (commit, branch, push, pull, diff, clone, merge)
- src/config.py: Add git_default_repo_dir: str = "~/repos" setting
- src/spark/engine.py: Add on_tool_executed() method to capture individual tool invocations (not just task-level events)
- src/swarm/personas.py: Add git-related keywords to Forge and Helm preferred_keywords
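The keyword enhancement to `_infer_tools_needed()` might look roughly like the sketch below. The keyword list comes from the plan; the keyword-to-tool mapping, the `infer_git_tools()` name, and the whole-word matching rule are assumptions, since the existing function's signature is not shown here.

```python
import re

# Keywords come from the plan; the keyword -> tool mapping is an assumption.
GIT_KEYWORDS: dict[str, str] = {
    "commit": "git_commit",
    "branch": "git_branch",
    "push": "git_push",
    "pull": "git_pull",
    "diff": "git_diff",
    "clone": "git_clone",
    "merge": "git_branch",  # no merge tool is planned; git_branch is a stand-in
}


def infer_git_tools(task_description: str) -> list[str]:
    """Return git tool names suggested by whole-word keyword matches."""
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    tools: list[str] = []
    for keyword, tool in GIT_KEYWORDS.items():
        if keyword in words and tool not in tools:
            tools.append(tool)
    return tools
```

Whole-word matching keeps the sketch predictable but misses inflected forms such as "pushing"; a real implementation may want stemming or substring matching instead.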
New dependency
# pyproject.toml
dependencies = [
...,
"GitPython>=3.1.40",
]
Dashboard
- /tools page updated to show git tools in the catalog
- Git tool usage stats visible per agent
Tests
- tests/test_git_tools.py: test all git tool functions against tmp repos
- Mock GitPython's Repo class for unit tests
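A mocked unit test could follow the pattern below. This is a sketch under assumptions: the `git_status()` shown here takes a repo object as an argument (the planned tool may instead take a path), and `make_fake_repo()` is a hypothetical helper. The attributes it reads (`active_branch.name`, `is_dirty(untracked_files=True)`, `untracked_files`) are part of GitPython's `Repo` interface.

```python
from unittest.mock import MagicMock


def git_status(repo) -> dict:
    """Summarise working-tree state from a GitPython-style Repo object."""
    return {
        "branch": repo.active_branch.name,
        "dirty": repo.is_dirty(untracked_files=True),
        "untracked": list(repo.untracked_files),
    }


def make_fake_repo(branch: str, dirty: bool, untracked: list[str]) -> MagicMock:
    """Build a stand-in for git.Repo exposing only what git_status reads."""
    repo = MagicMock()
    repo.active_branch.name = branch
    repo.is_dirty.return_value = dirty
    repo.untracked_files = untracked
    return repo


def test_git_status_reports_untracked_files() -> None:
    repo = make_fake_repo("main", dirty=True, untracked=["notes.txt"])
    status = git_status(repo)
    assert status == {"branch": "main", "dirty": True, "untracked": ["notes.txt"]}
    # Verify the tool asked GitPython to include untracked files.
    repo.is_dirty.assert_called_once_with(untracked_files=True)
```

Passing the repo in keeps the unit test free of disk and network access; the tmp-repo tests can then cover the real GitPython calls.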
Phase 2: Image Generation (new "Pixel" persona)
Goal: Generate storyboard frames and standalone images from text prompts using FLUX.2 Klein 4B locally.
New persona: Pixel — Visual Architect
"pixel": {
"id": "pixel",
"name": "Pixel",
"role": "Visual Architect",
"description": "Image generation, storyboard frames, and visual design.",
"capabilities": "image-generation,storyboard,design",
"rate_sats": 80,
"bid_base": 60,
"bid_jitter": 20,
"preferred_keywords": [
"image", "picture", "photo", "draw", "illustration",
"storyboard", "frame", "visual", "design", "generate",
"portrait", "landscape", "scene", "artwork",
],
}
New module: src/tools/image_tools.py
Tools (using diffusers + FLUX.2 Klein 4B — Apache 2.0):
| Tool | Function |
|---|---|
| generate_image | Text-to-image generation (returns file path) |
| generate_storyboard | Generate N frames from scene descriptions |
| image_variations | Generate variations of an existing image |
Architecture
generate_image(prompt, width=1024, height=1024, steps=4)
→ loads FLUX.2 Klein via diffusers FluxPipeline
→ saves to data/images/{uuid}.png
→ returns path + metadata
- Model loaded lazily on first use, kept in memory for subsequent calls
- Falls back to CPU generation (slower) if no GPU
- Output saved to data/images/ with metadata JSON sidecar
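The lazy-load and sidecar behaviour described above can be sketched without the model itself. In this sketch the `loader` argument is a hypothetical stand-in for whatever constructs the diffusers pipeline, and it is assumed to return a callable that maps a prompt to PNG bytes; the real pipeline returns image objects rather than bytes.

```python
import json
import uuid
from functools import lru_cache
from pathlib import Path
from typing import Callable

# Hypothetical loader type: returns a callable mapping a prompt to PNG bytes.
PipelineLoader = Callable[[], Callable[[str], bytes]]


@lru_cache(maxsize=1)
def _get_pipeline(loader: PipelineLoader) -> Callable[[str], bytes]:
    """Call the loader once and reuse its result (lazy, cached)."""
    return loader()


def generate_image(prompt: str, loader: PipelineLoader,
                   output_dir: str = "data/images",
                   width: int = 1024, height: int = 1024, steps: int = 4) -> dict:
    """Render a prompt, save {uuid}.png plus a {uuid}.json sidecar, return metadata."""
    pipeline = _get_pipeline(loader)
    png_bytes = pipeline(prompt)

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    image_id = uuid.uuid4().hex
    image_path = out_dir / f"{image_id}.png"
    image_path.write_bytes(png_bytes)

    metadata = {"id": image_id, "prompt": prompt, "width": width,
                "height": height, "steps": steps, "path": str(image_path)}
    # The sidecar sits next to the image and shares its stem.
    image_path.with_suffix(".json").write_text(json.dumps(metadata, indent=2))
    return metadata
```

Caching on the loader means the expensive model construction happens on the first call only, which matches the "kept in memory for subsequent calls" note.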
New dependency (optional extra)
[project.optional-dependencies]
creative = [
"diffusers>=0.30.0",
"transformers>=4.40.0",
"accelerate>=0.30.0",
"torch>=2.2.0",
"safetensors>=0.4.0",
]
Config
# config.py additions
flux_model_id: str = "black-forest-labs/FLUX.2-klein-4b"
image_output_dir: str = "data/images"
image_default_steps: int = 4
Dashboard
- /creative/ui: new Creative Studio page (image gallery + generation form)
- HTMX-powered: submit prompt, poll for result, display inline
- Gallery view of all generated images with metadata
Tests
- tests/test_image_tools.py: mock diffusers pipeline, test prompt handling, file output, storyboard generation
Phase 3: Music Generation (new "Lyra" persona)
Goal: Generate full songs with vocals, instrumentals, and lyrics using ACE-Step 1.5 locally.
New persona: Lyra — Sound Weaver
"lyra": {
"id": "lyra",
"name": "Lyra",
"role": "Sound Weaver",
"description": "Music and song generation with vocals, instrumentals, and lyrics.",
"capabilities": "music-generation,vocals,composition",
"rate_sats": 90,
"bid_base": 70,
"bid_jitter": 20,
"preferred_keywords": [
"music", "song", "sing", "vocal", "instrumental",
"melody", "beat", "track", "compose", "lyrics",
"audio", "sound", "album", "remix",
],
}
New module: src/tools/music_tools.py
Tools (using ACE-Step 1.5 — Apache 2.0, pip install ace-step):
| Tool | Function |
|---|---|
| generate_song | Text/lyrics → full song (vocals + instrumentals) |
| generate_instrumental | Text prompt → instrumental track |
| generate_vocals | Lyrics + style → vocal track |
| list_genres | Return supported genre/style tags |
Architecture
generate_song(lyrics, genre="pop", duration=120, language="en")
→ loads ACE-Step model (lazy, cached)
→ generates audio
→ saves to data/music/{uuid}.wav
→ returns path + metadata (duration, genre, etc.)
- Model loaded lazily, ~4GB VRAM minimum
- Output saved to data/music/ with metadata sidecar
- Supports 19 languages, genre tags, tempo control
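The input and output contract of generate_song can be exercised before the model is wired in. The sketch below assumes a placeholder audio writer (`write_silent_wav()`, a hypothetical helper that emits silence using the standard-library `wave` module) where the real tool would call ACE-Step; the sample rate and validation rules are also assumptions.

```python
import uuid
import wave
from pathlib import Path

SAMPLE_RATE = 44_100  # assumed sample rate for the placeholder audio


def write_silent_wav(path: Path, duration: int) -> None:
    """Write mono 16-bit silence; a stand-in for real model output."""
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(b"\x00\x00" * SAMPLE_RATE * duration)


def generate_song(lyrics: str, genre: str = "pop", duration: int = 120,
                  language: str = "en", output_dir: str = "data/music") -> dict:
    """Validate inputs, write {uuid}.wav into output_dir, return path + metadata."""
    if not lyrics.strip():
        raise ValueError("lyrics must not be empty")
    if duration <= 0:
        raise ValueError("duration must be positive")

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    song_path = out_dir / f"{uuid.uuid4().hex}.wav"
    write_silent_wav(song_path, duration)  # the real tool would call ACE-Step here
    return {"path": str(song_path), "duration": duration,
            "genre": genre, "language": language}
```

A stub like this lets dashboard and pipeline work proceed on machines without a GPU.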
New dependency (optional extra, extends creative)
[project.optional-dependencies]
creative = [
...,
"ace-step>=1.5.0",
]
Config
music_output_dir: str = "data/music"
ace_step_model: str = "ace-step/ACE-Step-v1.5"
Dashboard
- /creative/ui expanded with Music tab
- Audio player widget (HTML5 <audio> element)
- Lyrics input form with genre/style selector
Tests
- tests/test_music_tools.py: mock ACE-Step model, test generation params
Phase 4: Video Generation (new "Reel" persona)
Goal: Generate video clips from text/image prompts using Wan 2.1 locally.
New persona: Reel — Motion Director
"reel": {
"id": "reel",
"name": "Reel",
"role": "Motion Director",
"description": "Video generation from text and image prompts.",
"capabilities": "video-generation,animation,motion",
"rate_sats": 100,
"bid_base": 80,
"bid_jitter": 20,
"preferred_keywords": [
"video", "clip", "animate", "motion", "film",
"scene", "cinematic", "footage", "render", "timelapse",
],
}
New module: src/tools/video_tools.py
Tools (using Wan 2.1 via diffusers — Apache 2.0):
| Tool | Function |
|---|---|
| generate_video_clip | Text → short video clip (3–6 seconds) |
| image_to_video | Image + prompt → animated video from still |
| list_video_styles | Return supported style presets |
Architecture
generate_video_clip(prompt, duration=5, resolution="480p", fps=24)
→ loads Wan 2.1 via diffusers pipeline (lazy, cached)
→ generates frames
→ encodes to MP4 via FFmpeg
→ saves to data/video/{uuid}.mp4
→ returns path + metadata
- Wan 2.1 1.3B model: ~16GB VRAM
- Output saved to data/video/
- Resolution options: 480p (16GB), 720p (24GB+)
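The "encodes to MP4 via FFmpeg" step can be sketched as a command builder. This assumes the generated frames are first written to disk as numbered PNG files, which the plan does not specify; `frame_count()` and `build_ffmpeg_command()` are hypothetical helper names. The FFmpeg flags used are standard ones for encoding an image sequence.

```python
def frame_count(duration: float, fps: int = 24) -> int:
    """Number of frames needed for a clip of the given length."""
    return round(duration * fps)


def build_ffmpeg_command(frame_pattern: str, output_path: str,
                         fps: int = 24, codec: str = "libx264") -> list[str]:
    """Build an FFmpeg argument list that encodes numbered PNG frames to MP4."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-framerate", str(fps),    # input frame rate for the image sequence
        "-i", frame_pattern,       # e.g. frames/frame_%05d.png
        "-c:v", codec,
        "-pix_fmt", "yuv420p",     # widely compatible pixel format for H.264
        output_path,
    ]
```

The list can be passed to `subprocess.run(cmd, check=True)`. With the defaults above, a 5-second clip at 24 fps corresponds to 120 frames.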
New dependency (extends creative extra)
creative = [
...,
# Wan 2.1 uses diffusers (already listed) + model weights downloaded on first use
]
Config
video_output_dir: str = "data/video"
wan_model_id: str = "Wan-AI/Wan2.1-T2V-1.3B"
video_default_resolution: str = "480p"
Tests
- tests/test_video_tools.py: mock diffusers pipeline, test clip generation
Phase 5: Creative Director — Storyboard & Assembly Pipeline
Goal: Orchestrate multi-persona workflows to produce 3+ minute creative videos with music, narration, and stitched scenes.
New module: src/creative/director.py
The Creative Director is a multi-step pipeline that coordinates Pixel, Lyra, and Reel to produce complete creative works:
User: "Create a 3-minute music video about a sunrise over mountains"
                    │
            Creative Director
          ┌─────────┼──────────┐
          │         │          │
  1. STORYBOARD  2. MUSIC  3. GENERATE
     (Pixel)      (Lyra)     (Reel)
          │         │          │
  N scene       Full song  N video clips
  descriptions  with       from storyboard
  + keyframes   vocals     frames
          │         │          │
          └─────────┼──────────┘
                    │
               4. ASSEMBLE
           (MoviePy + FFmpeg)
                    │
            Final video with
            music, transitions,
            titles
Pipeline steps
- Script — Timmy (or Quill) writes scene descriptions and lyrics
- Storyboard — Pixel generates keyframe images for each scene
- Music — Lyra generates the soundtrack (vocals + instrumentals)
- Video clips — Reel generates video for each scene (image-to-video from storyboard frames, or text-to-video from descriptions)
- Assembly — MoviePy stitches clips together with cross-fades, overlays the music track, adds title cards
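The steps above can be sketched as a small orchestrator. This is an illustration under assumptions, not the planned director.py: `Scene`, `Project`, and `run_pipeline()` are hypothetical names, the script step is taken as already done (scene descriptions and lyrics are inputs), and each persona is represented by an injected callable so the flow can run with mocks.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Scene:
    description: str
    keyframe: Optional[str] = None  # image path from the storyboard step
    clip: Optional[str] = None      # video path from the video step


@dataclass
class Project:
    scenes: list[Scene]
    lyrics: str
    soundtrack: Optional[str] = None
    final_video: Optional[str] = None
    log: list[str] = field(default_factory=list)


def run_pipeline(
    project: Project,
    make_keyframe: Callable[[str], str],         # Pixel stand-in
    make_song: Callable[[str], str],             # Lyra stand-in
    make_clip: Callable[[str, str], str],        # Reel stand-in: (keyframe, description)
    assemble: Callable[[list[str], str], str],   # assembler stand-in: (clips, soundtrack)
) -> Project:
    """Run storyboard, music, video, and assembly in order, logging each step."""
    for scene in project.scenes:
        scene.keyframe = make_keyframe(scene.description)
    project.log.append("storyboard")

    project.soundtrack = make_song(project.lyrics)
    project.log.append("music")

    for scene in project.scenes:
        scene.clip = make_clip(scene.keyframe, scene.description)
    project.log.append("video")

    project.final_video = assemble([s.clip for s in project.scenes], project.soundtrack)
    project.log.append("assembly")
    return project
```

The sketch runs the steps sequentially for clarity; storyboard and music do not depend on each other and could run in parallel.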
New module: src/creative/assembler.py
Video assembly engine (using MoviePy — MIT, pip install moviepy):
| Function | Purpose |
|---|---|
| stitch_clips | Concatenate video clips with transitions |
| overlay_audio | Mix music track onto video |
| add_title_card | Prepend/append title/credits |
| add_subtitles | Burn lyrics/captions onto video |
| export_final | Encode final video (H.264 + AAC) |
New dependency
dependencies = [
...,
"moviepy>=2.0.0",
]
Config
creative_output_dir: str = "data/creative"
video_transition_duration: float = 1.0 # seconds
default_video_codec: str = "libx264"
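Because clips are only a few seconds long, the transition setting has a noticeable effect on how many clips a long video needs. The arithmetic can be sketched as follows, assuming each cross-fade overlaps two adjacent clips by the full transition duration (so every transition shortens the total by that amount); both function names are hypothetical.

```python
import math


def stitched_duration(clip_durations: list[float], transition: float = 1.0) -> float:
    """Length of the final video when adjacent clips overlap by `transition` seconds."""
    if not clip_durations:
        return 0.0
    return sum(clip_durations) - transition * (len(clip_durations) - 1)


def clips_needed(target: float, clip_length: float, transition: float = 1.0) -> int:
    """Smallest number of equal-length clips whose stitched duration reaches `target`."""
    if clip_length <= transition:
        raise ValueError("clip_length must exceed the transition duration")
    # n clips give n * clip_length - (n - 1) * transition seconds in total.
    return max(1, math.ceil((target - transition) / (clip_length - transition)))
```

For a 180-second target with 5-second clips and 1.0-second cross-fades, this gives 45 clips (45 × 5 − 44 × 1 = 181 seconds), compared with 36 clips if there were no overlap.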
Dashboard
- /creative/ui: Full Creative Studio with tabs:
  - Images: gallery + generation form
  - Music: player + generation form
  - Video: player + generation form
  - Director: multi-step pipeline builder with storyboard view
- /creative/projects: saved projects with all assets
- /creative/projects/{id}: project detail with timeline view
Tests
- tests/test_assembler.py: test stitching, audio overlay, title cards
- tests/test_director.py: test pipeline orchestration with mocks
Phase 6: Spark Integration for All New Tools
Goal: Every tool invocation and creative pipeline step gets captured by Spark Intelligence for learning and advisory.
Changes to src/spark/engine.py
def on_tool_executed(
    self, agent_id: str, tool_name: str,
    task_id: Optional[str], success: bool,
    duration_ms: Optional[int] = None,
) -> Optional[str]:
    """Capture individual tool invocations."""

def on_creative_step(
    self, project_id: str, step_name: str,
    agent_id: str, output_path: Optional[str],
) -> Optional[str]:
    """Capture creative pipeline progress."""
New advisor patterns
- "Pixel generates storyboards 40% faster than individual image calls"
- "Lyra's pop genre tracks have 85% higher completion rate than jazz"
- "Video generation on 480p uses 60% less GPU time than 720p for similar quality"
- "Git commits from Forge average 3 files per commit"
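One way tools could feed on_tool_executed() is a timing decorator applied where tools are registered. The sketch below is an assumption about the call site, not the planned implementation: `RecordingEngine` is a stand-in that only mirrors the method signature from the plan, and `captured()` is a hypothetical helper.

```python
import time
from functools import wraps
from typing import Any, Callable, Optional


class RecordingEngine:
    """Stand-in for the Spark engine that just stores the events it receives."""

    def __init__(self) -> None:
        self.events: list[dict] = []

    def on_tool_executed(self, agent_id: str, tool_name: str,
                         task_id: Optional[str], success: bool,
                         duration_ms: Optional[int] = None) -> Optional[str]:
        self.events.append({"agent_id": agent_id, "tool_name": tool_name,
                            "task_id": task_id, "success": success,
                            "duration_ms": duration_ms})
        return f"evt-{len(self.events)}"


def captured(engine: RecordingEngine, agent_id: str,
             task_id: Optional[str] = None) -> Callable:
    """Decorator factory: report each call's outcome and duration to the engine."""
    def decorator(tool: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(tool)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            success = False
            try:
                result = tool(*args, **kwargs)
                success = True
                return result
            finally:
                # Runs on both success and failure, so errors are recorded too.
                duration_ms = int((time.perf_counter() - start) * 1000)
                engine.on_tool_executed(agent_id, tool.__name__, task_id,
                                        success, duration_ms)
        return wrapper
    return decorator
```

Recording in a `finally` block means failed invocations still produce an event with `success=False`, and the original exception propagates unchanged.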
Implementation Order
| Phase | What | New Files | Est. Tests |
|---|---|---|---|
| 1 | Git/DevOps tools | 2 source + 1 test | ~25 |
| 2 | Image generation | 2 source + 1 test + 1 template | ~15 |
| 3 | Music generation | 1 source + 1 test | ~12 |
| 4 | Video generation | 1 source + 1 test | ~12 |
| 5 | Creative Director pipeline | 2 source + 2 tests + 1 template | ~20 |
| 6 | Spark tool-level capture | 1 modified + 1 test update | ~8 |
Total: ~10 new source files, ~6 new test files, ~92 new tests
New Dependencies Summary
Required (always installed):
GitPython>=3.1.40
moviepy>=2.0.0
Optional creative extra (GPU features):
diffusers>=0.30.0
transformers>=4.40.0
accelerate>=0.30.0
torch>=2.2.0
safetensors>=0.4.0
ace-step>=1.5.0
Install: pip install ".[creative]" for full creative stack
New Persona Summary
| ID | Name | Role | Tools |
|---|---|---|---|
| pixel | Pixel | Visual Architect | generate_image, generate_storyboard, image_variations |
| lyra | Lyra | Sound Weaver | generate_song, generate_instrumental, generate_vocals, list_genres |
| reel | Reel | Motion Director | generate_video_clip, image_to_video, list_video_styles |
These join the existing 6 personas (Echo, Mace, Helm, Seer, Forge, Quill) for a total of 9 specialized agents in the swarm.
Hardware Requirements
- CPU only: Git tools, MoviePy assembly, all tests (mocked)
- 8GB VRAM: FLUX.2 Klein 4B (images)
- 4GB VRAM: ACE-Step 1.5 (music)
- 16GB VRAM: Wan 2.1 1.3B (video at 480p)
- Recommended: RTX 4090 24GB runs the entire stack comfortably
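The VRAM figures above could drive a simple feature gate at startup. This is a minimal sketch using only the numbers listed here; `VRAM_REQUIREMENTS_GB` and `available_features()` are hypothetical names, and the check treats each model in isolation rather than accounting for several models resident at once.

```python
# Minimum VRAM (GB) per feature, copied from the list above.
VRAM_REQUIREMENTS_GB: dict[str, int] = {
    "music": 4,    # ACE-Step 1.5
    "images": 8,   # FLUX.2 Klein 4B
    "video": 16,   # Wan 2.1 1.3B at 480p
}


def available_features(vram_gb: float) -> list[str]:
    """Return GPU features whose stated minimum fits in the given VRAM."""
    return [name for name, need in VRAM_REQUIREMENTS_GB.items() if vram_gb >= need]
```

Git tools and MoviePy assembly are CPU-only and would stay enabled regardless of the result.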