Timmy-time-dashboard/MULTIMODAL_BACKLOG.md

Alexander Whitestone c441a6c903 docs: mark Task 3 as completed in multimodal backlog

PR #1494 adds VIDEO to ModelCapability enum.

2026-04-15 05:12:49 +00:00

Gemma 4 Multimodal Backlog

Epic 1: Visual QA for Nexus World

Goal: Use Gemma 4's vision capability to audit screenshots of the Three.js Nexus world for layout inconsistencies and UI bugs.
Tasks:
- Capture automated screenshots of all primary Nexus zones.
- Analyze images for clipping, overlapping UI elements, and lighting glitches.
- Generate a structured bug report with coordinates and suggested fixes.
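
One possible shape for the structured bug report is sketched below. The field names, the zone name, and the JSON layout are all assumptions made for illustration; the backlog does not define a report schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BugFinding:
    """One visual defect found in a screenshot (hypothetical schema)."""
    zone: str            # Nexus zone the screenshot was taken in
    category: str        # e.g. "clipping", "overlap", or "lighting"
    x: int               # left edge of the affected region, in pixels
    y: int               # top edge of the affected region, in pixels
    width: int
    height: int
    suggested_fix: str


def build_report(findings):
    """Group findings by zone and serialise them as a JSON string."""
    by_zone = {}
    for finding in findings:
        by_zone.setdefault(finding.zone, []).append(asdict(finding))
    return json.dumps({"total": len(findings), "zones": by_zone}, indent=2)


if __name__ == "__main__":
    example = [
        BugFinding("lobby", "overlap", 10, 20, 100, 50, "Shift the panel right"),
    ]
    print(build_report(example))
```

Keeping the coordinates as a pixel bounding box makes each finding easy to overlay on the original screenshot during review.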

Epic 2: The Testament Visual Consistency Audit

Goal: Ensure the generated image assets for The Testament align with the narrative mood and visual manifest.
Tasks:
- Compare generated assets against visual_manifest.json descriptions.
- Flag images that diverge from the "Cinematic Noir, 35mm, high contrast" aesthetic.
- Refine prompts for divergent beats and trigger re-renders.
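
The flagging step could be prototyped as follows. This sketch assumes that the vision model has already produced a text caption for each generated image and that beats are keyed by a string identifier; neither detail comes from the backlog, and plain keyword matching is only a crude stand-in for a real aesthetic judgment.

```python
# Style terms taken from the aesthetic named in the task list.
REQUIRED_STYLE = ("cinematic noir", "35mm", "high contrast")


def missing_style_terms(caption):
    """Return the required style terms that do not appear in a caption."""
    lowered = caption.lower()
    return [term for term in REQUIRED_STYLE if term not in lowered]


def flag_divergent(captions):
    """Map each divergent beat id to the style terms its caption lacks.

    `captions` is a hypothetical {beat_id: model_caption} dictionary.
    """
    flagged = {}
    for beat_id, caption in captions.items():
        missing = missing_style_terms(caption)
        if missing:
            flagged[beat_id] = missing
    return flagged


if __name__ == "__main__":
    sample = {
        "beat-01": "Cinematic noir alley, 35mm grain, high contrast shadows",
        "beat-02": "Bright pastel cartoon of a garden",
    }
    print(flag_divergent(sample))
```

The beats returned by `flag_divergent` are the candidates for prompt refinement and re-rendering.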

Epic 3: Sovereign Heart Emotive Stillness

Goal: Develop a system that selects the static image whose emotional tone best matches the sentiment of the generated TTS audio.
Tasks:
- Analyze TTS output for emotional valence and arousal.
- Map sentiment kernels to the visual asset library.
- Implement a "breathing" transition logic between assets for an expressive presence.
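
A minimal sketch of the mapping and transition steps, assuming that sentiment is expressed as a (valence, arousal) pair in the range -1 to 1 and that every asset has been tagged with such a pair. The asset names and coordinates below are invented for illustration, and the cosine ease is just one way to get a slow "breathing" crossfade.

```python
import math

# Hypothetical asset library: file name -> (valence, arousal) tag.
ASSETS = {
    "calm.png": (0.3, -0.6),
    "joy.png": (0.8, 0.6),
    "grief.png": (-0.7, -0.4),
    "alarm.png": (-0.6, 0.8),
}


def select_asset(valence, arousal, assets=ASSETS):
    """Pick the asset whose tag is nearest to the measured sentiment."""
    target = (valence, arousal)
    return min(assets, key=lambda name: math.dist(assets[name], target))


def crossfade_weight(t):
    """Cosine ease from 0 to 1 as t goes from 0 to 1.

    The result is the opacity of the incoming asset; the outgoing
    asset would use 1 minus this value.
    """
    t = min(max(t, 0.0), 1.0)
    return 0.5 - 0.5 * math.cos(math.pi * t)


if __name__ == "__main__":
    print(select_asset(0.7, 0.5))
    print([round(crossfade_weight(step / 4), 3) for step in range(5)])
```

Nearest-neighbour lookup keeps the mapping easy to inspect: adding an asset only requires tagging it with a valence and arousal value.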

Epic 4: Multimodal Architecture Synthesis

Goal: Extract architectural patterns from the diagrams and charts in research papers and synthesize them into a single reference.
Tasks:
- Ingest PDF research papers on agentic workflows.
- Analyze diagrams and charts to extract structural logic.
- Synthesize findings into Sovereign_Knowledge_Graph.md.

General Tasks

Task 1: Add Gemma 4 entries to KNOWN_MODEL_CAPABILITIES and vision fallback chain in src/infrastructure/models/multimodal.py. Gemma 4 is a multimodal model supporting vision, text, tools, JSON, and streaming. ✅ PR #1493
Task 3: Add a ModelCapability.VIDEO enum member for future video understanding models. ✅ PR #1494
Task 4: Implement get_model_for_content("video") routing with appropriate fallback chain.
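
The routing described in Task 4 might look roughly like the sketch below. It is not the code in src/infrastructure/models/multimodal.py: the model identifiers ("gemma4", "hypothetical-video-model"), the `CONTENT_ROUTES` table, and the `available` parameter are assumptions made so that the example is self-contained. Only the names `ModelCapability`, `VIDEO`, `KNOWN_MODEL_CAPABILITIES`, and `get_model_for_content` come from the task list.

```python
from enum import Enum, auto


class ModelCapability(Enum):
    TEXT = auto()
    VISION = auto()
    TOOLS = auto()
    JSON = auto()
    STREAMING = auto()
    VIDEO = auto()  # member added for future video understanding models


C = ModelCapability

# Model names are placeholders, not confirmed identifiers.
KNOWN_MODEL_CAPABILITIES = {
    "gemma4": {C.TEXT, C.VISION, C.TOOLS, C.JSON, C.STREAMING},
    "hypothetical-video-model": {C.TEXT, C.VISION, C.VIDEO},
}

# Hypothetical table: content type -> (required capability, fallback chain).
CONTENT_ROUTES = {
    "image": (C.VISION, ["gemma4", "hypothetical-video-model"]),
    "video": (C.VIDEO, ["hypothetical-video-model", "gemma4"]),
}


def get_model_for_content(content_type, available):
    """Return the first available model in the chain that has the
    capability required for `content_type`, or None if there is none."""
    required, chain = CONTENT_ROUTES[content_type]
    for name in chain:
        capabilities = KNOWN_MODEL_CAPABILITIES.get(name, set())
        if name in available and required in capabilities:
            return name
    return None


if __name__ == "__main__":
    print(get_model_for_content("image", {"gemma4"}))
    print(get_model_for_content("video", {"gemma4"}))
```

Checking the capability set at every step of the chain means a model listed as a fallback is skipped when it lacks the required capability, so in this sketch a video request returns `None` when only a vision model is available.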
