This repository has been archived on 2026-03-24. You can view files and clone it. You cannot open issues or pull requests or push a commit.
Timmy-time-dashboard/MULTIMODAL_BACKLOG.md
2026-04-15 05:12:49 +00:00


# Gemma 4 Multimodal Backlog
## Epic 1: Visual QA for Nexus World
- **Goal:** Use Gemma 4's vision to audit screenshots of the Three.js Nexus world for layout inconsistencies and UI bugs.
- **Tasks:**
  - [x] Capture automated screenshots of all primary Nexus zones.
  - [ ] Analyze images for clipping, overlapping UI elements, and lighting glitches.
  - [ ] Generate a structured bug report with coordinates and suggested fixes.
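
One possible shape for the structured bug report in Epic 1 is sketched below. It assumes the vision model can be prompted to return a JSON array of findings. The `BugReport` fields, the `bbox` convention, and the `parse_reports` helper are all hypothetical and not part of the existing codebase.

```python
import json
from dataclasses import dataclass

# Hypothetical categories, taken from the three defect types named in Epic 1.
ALLOWED_CATEGORIES = {"clipping", "overlap", "lighting"}


@dataclass
class BugReport:
    zone: str            # Nexus zone the screenshot was taken in
    category: str        # one of ALLOWED_CATEGORIES
    bbox: tuple          # assumed (x, y, width, height) in screenshot pixels
    description: str
    suggested_fix: str


def parse_reports(raw_json, zone):
    """Validate a JSON array of findings (e.g. model output) into BugReport objects."""
    reports = []
    for item in json.loads(raw_json):
        if item.get("category") not in ALLOWED_CATEGORIES:
            continue  # drop findings outside the audited categories
        bbox = item.get("bbox")
        if not (isinstance(bbox, list) and len(bbox) == 4):
            continue  # drop findings without usable coordinates
        reports.append(
            BugReport(
                zone=zone,
                category=item["category"],
                bbox=tuple(bbox),
                description=item.get("description", ""),
                suggested_fix=item.get("suggested_fix", ""),
            )
        )
    return reports
```

Validating model output before it reaches the report keeps malformed or off-topic findings out of the bug tracker. The exact schema would need to match whatever prompt the audit actually uses.
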
## Epic 2: The Testament Visual Consistency Audit
- **Goal:** Ensure the generated image assets for The Testament align with the narrative mood and visual manifest.
- **Tasks:**
  - [ ] Compare generated assets against `visual_manifest.json` descriptions.
  - [ ] Flag images that diverge from the "Cinematic Noir, 35mm, high contrast" aesthetic.
  - [ ] Refine prompts for divergent beats and trigger re-renders.
## Epic 3: Sovereign Heart Emotive Stillness
- **Goal:** Develop a system for selecting the most emotive static image based on the sentiment of generated TTS.
- **Tasks:**
  - [ ] Analyze TTS output for emotional valence and arousal.
  - [ ] Map sentiment kernels to the visual asset library.
  - [ ] Implement a "breathing" transition logic between assets for an expressive presence.
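
The mapping step in Epic 3 could be approached as a nearest-neighbor lookup in valence-arousal space. The sketch below assumes each asset in the library is annotated with a valence and an arousal score. The `Asset` type, the score ranges, and the file names are illustrative assumptions, not existing project code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    path: str
    valence: float  # assumed range: -1.0 (negative) to 1.0 (positive)
    arousal: float  # assumed range: 0.0 (calm) to 1.0 (excited)


def pick_asset(valence, arousal, library):
    """Return the asset closest to the given point by Euclidean distance."""
    if not library:
        raise ValueError("asset library is empty")
    return min(
        library,
        key=lambda a: math.hypot(a.valence - valence, a.arousal - arousal),
    )


# Placeholder library; real entries would come from the visual asset library.
LIBRARY = [
    Asset("calm.png", 0.3, 0.1),
    Asset("joy.png", 0.8, 0.7),
    Asset("grief.png", -0.7, 0.2),
]
```

For example, `pick_asset(0.9, 0.8, LIBRARY)` selects `joy.png`. A weighted distance or a learned embedding may fit better once real sentiment output is available.
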
## Epic 4: Multimodal Architecture Synthesis
- **Goal:** Extract and synthesize architectural patterns from visual research papers.
- **Tasks:**
  - [ ] Ingest PDF research papers on agentic workflows.
  - [ ] Analyze diagrams and charts to extract structural logic.
  - [ ] Synthesize findings into `Sovereign_Knowledge_Graph.md`.
## General Tasks
- [x] **Task 1:** Add Gemma 4 entries to `KNOWN_MODEL_CAPABILITIES` and vision fallback chain in `src/infrastructure/models/multimodal.py`. Gemma 4 is a multimodal model supporting vision, text, tools, JSON, and streaming. ✅ PR #1493
- [x] **Task 3:** Add a `ModelCapability.VIDEO` enum member for future video understanding models. ✅ PR #1494
- [ ] **Task 4:** Implement `get_model_for_content("video")` routing with appropriate fallback chain.
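
As a rough illustration of Task 4, the sketch below routes a content type through an ordered capability chain and returns the first available model that supports one of them. Only the names `ModelCapability`, `KNOWN_MODEL_CAPABILITIES`, and `get_model_for_content` come from this backlog. Their shapes, the `FALLBACK_CHAINS` table, the return value, and the model names are assumptions and may differ from `src/infrastructure/models/multimodal.py`.

```python
from enum import Enum, auto


class ModelCapability(Enum):
    """Hypothetical mirror of the project's enum; only VIDEO is named in the backlog."""
    TEXT = auto()
    VISION = auto()
    VIDEO = auto()


# Assumed shape: model name -> set of capabilities. Model names are placeholders.
KNOWN_MODEL_CAPABILITIES = {
    "text-only-model": {ModelCapability.TEXT},
    "vision-model": {ModelCapability.TEXT, ModelCapability.VISION},
}

# Assumed shape: content type -> capabilities to try, most specific first.
FALLBACK_CHAINS = {
    "video": [ModelCapability.VIDEO, ModelCapability.VISION, ModelCapability.TEXT],
    "image": [ModelCapability.VISION, ModelCapability.TEXT],
    "text": [ModelCapability.TEXT],
}


def get_model_for_content(content_type, available=None):
    """Return (model_name, capability) for the first match in the chain, or None."""
    models = available if available is not None else list(KNOWN_MODEL_CAPABILITIES)
    chain = FALLBACK_CHAINS.get(content_type, [ModelCapability.TEXT])
    for capability in chain:
        for name in models:
            if capability in KNOWN_MODEL_CAPABILITIES.get(name, set()):
                return name, capability
    return None
```

Returning the matched capability alongside the model lets the caller know when a video request was downgraded, for instance to frame-by-frame vision analysis, and adapt its preprocessing accordingly.
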