Compare commits

..

14 Commits

Author SHA1 Message Date
9a8d620163 feat: quality gate pipeline validation (#623)
Some checks failed
Architecture Lint / Lint Repository (pull_request) Blocked by required conditions
Validate Config / Python Test Suite (pull_request) Blocked by required conditions
Architecture Lint / Linter Tests (pull_request) Successful in 13s
Smoke Test / smoke (pull_request) Failing after 11s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 14s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 44s
Validate Config / Shell Script Lint (pull_request) Failing after 24s
Validate Config / Cron Syntax Check (pull_request) Successful in 5s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 3s
Validate Config / Playbook Schema Validation (pull_request) Successful in 8s
PR Checklist / pr-checklist (pull_request) Failing after 3m54s
Validates JSONL/JSON pipeline outputs for:
- Schema correctness
- Content quality (non-empty, not duplicated)
- Toxicity detection
- Dedup hash management with auto-cleanup

Usage:
  python3 bin/quality-gate.py validate data.jsonl
  python3 bin/quality-gate.py score data.jsonl
  python3 bin/quality-gate.py stats
  python3 bin/quality-gate.py cleanup

Closes #623
2026-04-17 05:53:33 +00:00
Alexander Whitestone
ce3822bb5f feat: quality gate — validate all pipeline outputs (#623)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 19s
PR Checklist / pr-checklist (pull_request) Failing after 26m43s
Smoke Test / smoke (pull_request) Failing after 59s
Validate Config / YAML Lint (pull_request) Failing after 39s
Validate Config / JSON Validate (pull_request) Successful in 1m32s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 2m0s
Validate Config / Shell Script Lint (pull_request) Failing after 49s
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 16s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
Validates pipeline outputs before saving. Rejects bad entries,
tracks quality scores per pipeline.

Checks:
- Training pairs: prompt/response non-empty, response != prompt
- Scene descriptions: all required fields, description min length
- Knowledge entries: no placeholders (TODO, FIXME), min length
- Prompt enhancement: rich > terse length, min 20 chars
- Adversary entries: id/family/prompt present, min prompt length
- SOUL.md compliance: no human life valuation, no weapon/child content
- Deduplication: detects duplicate entries by key fields

Features:
- Auto-reject bad outputs with reasons
- Quality score per entry (0.0-1.0)
- Batch mode (--dir) for processing all JSONL at once
- Stats tracking (~/.hermes/pipeline/quality_stats.json)
- --status to view historical quality metrics

Usage:
  python3 pipeline/quality_gate.py --input data.jsonl --type training_pairs
  python3 pipeline/quality_gate.py --dir pipeline/output/
  python3 pipeline/quality_gate.py --status

Closes #623
2026-04-15 08:20:18 -04:00
817785d763 Merge pull request 'feat: training data augmentation — paraphrase and translate pairs (#695)' (#732) from fix/695 into main 2026-04-15 11:56:28 +00:00
Alexander Whitestone
3603030235 feat: training data augmentation — paraphrase and translate pairs (#695)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 22s
Smoke Test / smoke (pull_request) Failing after 18s
Validate Config / YAML Lint (pull_request) Failing after 23s
Validate Config / JSON Validate (pull_request) Successful in 21s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m54s
Validate Config / Shell Script Lint (pull_request) Failing after 54s
Validate Config / Cron Syntax Check (pull_request) Successful in 16s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 16s
Validate Config / Playbook Schema Validation (pull_request) Successful in 23s
PR Checklist / pr-checklist (pull_request) Failing after 11m2s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
augment_pairs.py: generates paraphrases and translations for any
JSONL training file.

Features:
- Auto-detects text field (rich, terse, text, content, lyric_line, etc.)
- N paraphrases per entry (template-based, or LLM with --llm-endpoint)
- Translations to ES, FR, DE (template dictionary, or LLM)
- Outputs augmented JSONL alongside originals
- Marks each augmented entry with _augmentation, _original, _language

Usage:
  python3 augment_pairs.py --input data.jsonl
  python3 augment_pairs.py --input data.jsonl --paraphrases 5 --langs es,fr
  python3 augment_pairs.py --input data.jsonl --llm-endpoint http://localhost:11434/v1

Closes #695
2026-04-15 07:51:38 -04:00
35a191f7b1 Merge PR #725: feat: Provider health monitor with auto-switch (#509) 2026-04-15 06:10:45 +00:00
e987e1b870 Merge PR #726: feat: Pre-flight provider check for session launch (#508) 2026-04-15 06:10:41 +00:00
19278513b4 Merge PR #727: feat: Three.js-specific glitch detection patterns (#543) 2026-04-15 06:10:38 +00:00
1088bf8983 test: add Three.js pattern tests and update assertions (#543)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 28s
Smoke Test / smoke (pull_request) Failing after 23s
Validate Config / YAML Lint (pull_request) Failing after 21s
Validate Config / JSON Validate (pull_request) Successful in 21s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m23s
Validate Config / Shell Script Lint (pull_request) Failing after 50s
Validate Config / Cron Syntax Check (pull_request) Successful in 11s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 26s
Validate Config / Playbook Schema Validation (pull_request) Successful in 32s
PR Checklist / pr-checklist (pull_request) Failing after 11m13s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Added TestThreeJsPatterns class with 14 tests
- Tests cover: pattern existence, severity inference, vision prompt
- Updated pattern count assertion (14+ patterns now)
- Updated demo test (6 glitches: 4 original + 2 Three.js)
2026-04-15 05:37:17 +00:00
94f0a132d4 feat: add get_threejs_patterns() filter function (#543) 2026-04-15 05:34:17 +00:00
279356bed6 feat: add --threejs flag and Three.js-aware severity inference (#543)
- Added --threejs flag for focused Three.js pattern scanning
- Updated _infer_severity with shader_failure, texture_placeholder,
  uv_mapping_error, frustum_culling, shadow_map_artifact categories
- Added Three.js demo detections (shader failure, shadow map)
- Bumped detector version to 0.2.0
2026-04-15 05:34:16 +00:00
511ff863c2 feat: add Three.js-specific glitch detection patterns (#543)
Adds 6 new Three.js-specific glitch categories and patterns:
- SHADER_FAILURE: Solid black materials from shader compilation errors
- TEXTURE_PLACEHOLDER: 1x1 white pixel stretched surfaces
- UV_MAPPING_ERROR: BufferGeometry UV coordinate errors
- FRUSTUM_CULLING: Objects popping at screen edges
- SHADOW_MAP_ARTIFACT: Pixelated/blocky shadow maps
- BLOOM_OVERFLOW: Excessive post-processing bloom bleed

Closes #543
2026-04-15 05:32:25 +00:00
b6e3a647b0 feat: add pre-flight provider check script (#508)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 29s
PR Checklist / pr-checklist (pull_request) Failing after 7m23s
Smoke Test / smoke (pull_request) Failing after 20s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m1s
Validate Config / Shell Script Lint (pull_request) Failing after 46s
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 28s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Checks OpenRouter balance via /api/v1/auth/key
- Tests Nous and Anthropic API keys
- Verifies Ollama is running
- Pre-flight check before session launch
- Returns exit code for automation

Closes #508
2026-04-15 03:55:04 +00:00
e14158676d feat: add provider health monitor script (#509)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 44s
Smoke Test / smoke (pull_request) Failing after 36s
Validate Config / YAML Lint (pull_request) Failing after 21s
Validate Config / JSON Validate (pull_request) Successful in 28s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 2m36s
Validate Config / Shell Script Lint (pull_request) Failing after 1m3s
Validate Config / Cron Syntax Check (pull_request) Successful in 13s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 12s
PR Checklist / pr-checklist (pull_request) Failing after 6m15s
Validate Config / Playbook Schema Validation (pull_request) Successful in 28s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Tests all configured providers
- Maintains health map in tmux-state.json
- Auto-switches profiles to working providers
- Supports --daemon and --status modes

Closes #509
2026-04-15 03:48:37 +00:00
26e39d8949 feat: add autonomous cron supervisor job (#513)
- Runs every 7 minutes
- Checks dev and timmy sessions
- Loads tmux-supervisor skill
- Telegram only on actionable events
- Silent when all agents busy
2026-04-15 03:33:43 +00:00
10 changed files with 1839 additions and 109 deletions

View File

@@ -31,6 +31,14 @@ class GlitchCategory(Enum):
WATER_REFLECTION = "water_reflection"
SKYBOX_SEAM = "skybox_seam"
# Three.js-specific categories (ref: timmy-config#543)
SHADER_FAILURE = "shader_failure"
TEXTURE_PLACEHOLDER = "texture_placeholder"
UV_MAPPING_ERROR = "uv_mapping_error"
FRUSTUM_CULLING = "frustum_culling"
SHADOW_MAP_ARTIFACT = "shadow_map_artifact"
BLOOM_OVERFLOW = "bloom_overflow"
@dataclass
class GlitchPattern:
@@ -241,6 +249,123 @@ MATRIX_GLITCH_PATTERNS: list[GlitchPattern] = [
],
confidence_threshold=0.45,
),
# --- Three.js-Specific Glitch Patterns (ref: timmy-config#543) ---
GlitchPattern(
category=GlitchCategory.SHADER_FAILURE,
name="Shader Compilation Failure",
description="Three.js shader failed to compile, rendering the material as solid black. "
"Common when custom ShaderMaterial has syntax errors or missing uniforms.",
severity=GlitchSeverity.CRITICAL,
detection_prompts=[
"Look for objects or surfaces rendered as pure black (#000000) that should have visible textures or materials.",
"Identify geometry that appears completely dark while surrounding objects are normally lit.",
"Check for objects where the material seems to 'absorb all light' — flat black with no shading gradient.",
],
visual_indicators=[
"solid black object with no shading",
"geometry rendered as silhouette",
"material appears to absorb light entirely",
"black patch inconsistent with scene lighting",
],
confidence_threshold=0.7,
),
GlitchPattern(
category=GlitchCategory.TEXTURE_PLACEHOLDER,
name="Three.js Texture Not Loaded",
description="Three.js failed to load the texture asset, rendering a 1x1 white pixel "
"stretched across the entire surface. Distinguished from missing-texture by "
"the uniform white/grey appearance rather than magenta.",
severity=GlitchSeverity.CRITICAL,
detection_prompts=[
"Look for surfaces that are uniformly white or light grey with no texture detail, even on large geometry.",
"Identify objects where the texture appears as a single solid color stretched across complex UVs.",
"Check for surfaces that look 'blank' or 'unloaded' — flat white/grey where detail should exist.",
],
visual_indicators=[
"uniform white or light grey surface",
"no texture detail on large geometry",
"stretched single-color appearance",
"1x1 pixel placeholder stretched to fill UV space",
],
confidence_threshold=0.65,
),
GlitchPattern(
category=GlitchCategory.UV_MAPPING_ERROR,
name="BufferGeometry UV Mapping Error",
description="Three.js BufferGeometry has incorrect UV coordinates, causing textures to "
"appear stretched, compressed, or mapped to the wrong faces.",
severity=GlitchSeverity.HIGH,
detection_prompts=[
"Look for textures that appear dramatically stretched in one direction on specific faces.",
"Identify surfaces where the texture pattern is distorted but other nearby surfaces look correct.",
"Check for faces where the texture seems 'smeared' or mapped with incorrect aspect ratio.",
],
visual_indicators=[
"texture stretching on specific faces",
"distorted pattern on geometry",
"smeared texture appearance",
"aspect ratio mismatch between texture and surface",
],
confidence_threshold=0.6,
),
GlitchPattern(
category=GlitchCategory.FRUSTUM_CULLING,
name="Frustum Culling Artifact",
description="Three.js frustum culling incorrectly marks objects as outside the camera "
"frustum, causing them to pop in/out of existence at screen edges.",
severity=GlitchSeverity.MEDIUM,
detection_prompts=[
"Look for objects that are partially visible at the edge of the frame — half-rendered or cut off unnaturally.",
"Identify geometry that seems to 'pop' into existence as the view angle changes.",
"Check screen edges for objects that appear suddenly rather than smoothly entering the viewport.",
],
visual_indicators=[
"half-visible object at screen edge",
"object popping into frame",
"abrupt appearance of geometry",
"bounding box visible but mesh missing",
],
confidence_threshold=0.55,
),
GlitchPattern(
category=GlitchCategory.SHADOW_MAP_ARTIFACT,
name="Shadow Map Resolution Artifact",
description="Three.js shadow map has insufficient resolution, causing pixelated, "
"blocky shadows with visible texel edges instead of smooth shadow gradients.",
severity=GlitchSeverity.MEDIUM,
detection_prompts=[
"Look for shadows with visible blocky or pixelated edges instead of smooth gradients.",
"Identify shadow maps where individual texels (texture pixels) are clearly visible.",
"Check for shadows that appear as jagged stair-stepped patterns rather than soft edges.",
],
visual_indicators=[
"blocky shadow edges",
"visible texel grid in shadows",
"stair-stepped shadow boundary",
"pixelated shadow gradient",
],
confidence_threshold=0.55,
),
GlitchPattern(
category=GlitchCategory.BLOOM_OVERFLOW,
name="Post-Processing Bloom Overflow",
description="Three.js UnrealBloomPass or similar post-processing bloom effect is too "
"intense, causing bright areas to bleed glow into surrounding geometry.",
severity=GlitchSeverity.LOW,
detection_prompts=[
"Look for bright areas that have an unusually large, soft glow bleeding into adjacent surfaces.",
"Identify scenes where light sources appear to have a 'halo' that extends beyond physical plausibility.",
"Check for bright objects whose glow color bleeds onto nearby unrelated geometry.",
],
visual_indicators=[
"excessive glow bleeding from bright surfaces",
"halo around light sources",
"bloom color tinting adjacent geometry",
"glow bleeding beyond object boundaries",
],
confidence_threshold=0.5,
),
]
@@ -289,6 +414,23 @@ def build_vision_prompt(patterns: list[GlitchPattern] | None = None) -> str:
)
# Three.js-specific category set for filtering (ref: timmy-config#543)
THREEJS_CATEGORIES = {
GlitchCategory.SHADER_FAILURE,
GlitchCategory.TEXTURE_PLACEHOLDER,
GlitchCategory.UV_MAPPING_ERROR,
GlitchCategory.FRUSTUM_CULLING,
GlitchCategory.SHADOW_MAP_ARTIFACT,
GlitchCategory.BLOOM_OVERFLOW,
}
def get_threejs_patterns() -> list[GlitchPattern]:
"""Return only Three.js-specific glitch patterns."""
return [p for p in MATRIX_GLITCH_PATTERNS if p.category in THREEJS_CATEGORIES]
if __name__ == "__main__":
import json
print(f"Loaded {len(MATRIX_GLITCH_PATTERNS)} glitch patterns:\n")

View File

@@ -9,7 +9,7 @@ Usage:
python matrix_glitch_detector.py <url> [--angles 4] [--output report.json]
python matrix_glitch_detector.py --demo # Run with synthetic test data
Ref: timmy-config#491
Ref: timmy-config#491, timmy-config#543
"""
import argparse
@@ -33,6 +33,7 @@ from glitch_patterns import (
MATRIX_GLITCH_PATTERNS,
build_vision_prompt,
get_patterns_by_severity,
get_threejs_patterns,
)
@@ -345,14 +346,17 @@ def _parse_vision_response(
def _infer_severity(category: str, confidence: float) -> str:
"""Infer severity from category and confidence when not provided."""
critical_cats = {"missing_textures", "clipping"}
high_cats = {"floating_assets", "broken_normals"}
critical_cats = {"missing_textures", "clipping", "shader_failure", "texture_placeholder"}
high_cats = {"floating_assets", "broken_normals", "uv_mapping_error"}
medium_cats = {"frustum_culling", "shadow_map_artifact"}
cat_lower = category.lower()
if any(c in cat_lower for c in critical_cats):
return "critical" if confidence > 0.7 else "high"
if any(c in cat_lower for c in high_cats):
return "high" if confidence > 0.7 else "medium"
if any(c in cat_lower for c in medium_cats):
return "medium" if confidence > 0.6 else "low"
return "medium" if confidence > 0.6 else "low"
@@ -389,9 +393,9 @@ def build_report(
),
},
metadata={
"detector_version": "0.1.0",
"detector_version": "0.2.0",
"pattern_count": len(MATRIX_GLITCH_PATTERNS),
"reference": "timmy-config#491",
"reference": "timmy-config#491, timmy-config#543",
},
)
@@ -460,6 +464,30 @@ def run_demo(output_path: Optional[Path] = None) -> ScanResult:
screenshot_index=3,
screenshot_angle="left",
),
DetectedGlitch(
id=str(uuid.uuid4())[:8],
category="shader_failure",
name="Black Material on Portal Frame",
description="Portal frame rendered as solid black — shader compilation failed (missing uniform u_time)",
severity="critical",
confidence=0.91,
location_x=45.0,
location_y=30.0,
screenshot_index=0,
screenshot_angle="front",
),
DetectedGlitch(
id=str(uuid.uuid4())[:8],
category="shadow_map_artifact",
name="Pixelated Character Shadow",
description="Character shadow shows visible texel grid — shadow map resolution too low (512x512)",
severity="medium",
confidence=0.78,
location_x=52.0,
location_y=75.0,
screenshot_index=1,
screenshot_angle="right",
),
]
print(f"[*] Detected {len(demo_glitches)} glitches")
@@ -496,6 +524,11 @@ Examples:
help="Minimum severity to include in report",
)
parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output")
parser.add_argument(
"--threejs",
action="store_true",
help="Focus on Three.js-specific glitch patterns only (shader, texture, UV, culling, shadow, bloom)",
)
args = parser.parse_args()
@@ -525,9 +558,13 @@ Examples:
screenshots = capture_screenshots(args.url, angles, screenshots_dir)
print(f"[*] Captured {len(screenshots)} screenshots")
# Filter patterns by severity
# Filter patterns by severity and type
min_sev = GlitchSeverity(args.min_severity)
patterns = get_patterns_by_severity(min_sev)
if args.threejs:
threejs_patterns = get_threejs_patterns()
patterns = [p for p in patterns if p in threejs_patterns]
print(f"[*] Three.js-focused mode: {len(patterns)} patterns")
# Analyze with vision AI
print(f"[*] Analyzing with vision AI ({len(patterns)} patterns)...")

View File

@@ -0,0 +1,271 @@
#!/usr/bin/env python3
"""
Pre-Flight Provider Check Script
Issue #508: [Robustness] Credential drain detection — provider health checks
Pre-flight check before session launch: verifies provider credentials and balance.
Usage:
python3 preflight-provider-check.py # Check all providers
python3 preflight-provider-check.py --launch # Check and return exit code
python3 preflight-provider-check.py --balance # Check OpenRouter balance
"""
import os, sys, json, yaml, urllib.request
from datetime import datetime, timezone
from pathlib import Path
# Configuration
HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
LOG_DIR = Path.home() / ".local" / "timmy" / "fleet-health"
LOG_FILE = LOG_DIR / "preflight-check.log"
def log(msg):
"""Log message to file and optionally console."""
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
log_entry = "[" + timestamp + "] " + msg
LOG_DIR.mkdir(parents=True, exist_ok=True)
with open(LOG_FILE, "a") as f:
f.write(log_entry + "\n")
if "--quiet" not in sys.argv:
print(log_entry)
def get_provider_api_key(provider):
"""Get API key for a provider from .env or environment."""
env_file = HERMES_HOME / ".env"
if env_file.exists():
with open(env_file) as f:
for line in f:
line = line.strip()
if line.startswith(provider.upper() + "_API_KEY="):
return line.split("=", 1)[1].strip().strip("'\"")
return os.environ.get(provider.upper() + "_API_KEY")
def check_openrouter_balance(api_key):
"""Check OpenRouter balance via /api/v1/auth/key."""
if not api_key:
return False, "No API key", 0
try:
req = urllib.request.Request(
"https://openrouter.ai/api/v1/auth/key",
headers={"Authorization": "Bearer " + api_key}
)
resp = urllib.request.urlopen(req, timeout=10)
data = json.loads(resp.read())
# Check for credits
credits = data.get("data", {}).get("limit", 0)
usage = data.get("data", {}).get("usage", 0)
remaining = credits - usage if credits else None
if remaining is not None and remaining <= 0:
return False, "No credits remaining", 0
elif remaining is not None:
return True, "Credits available", remaining
else:
return True, "Unlimited or unknown balance", None
except urllib.error.HTTPError as e:
if e.code == 401:
return False, "Invalid API key", 0
else:
return False, "HTTP " + str(e.code), 0
except Exception as e:
return False, str(e)[:100], 0
def check_nous_key(api_key):
"""Check Nous API key with minimal test call."""
if not api_key:
return False, "No API key"
try:
req = urllib.request.Request(
"https://inference.nousresearch.com/v1/models",
headers={"Authorization": "Bearer " + api_key}
)
resp = urllib.request.urlopen(req, timeout=10)
if resp.status == 200:
return True, "Valid key"
else:
return False, "HTTP " + str(resp.status)
except urllib.error.HTTPError as e:
if e.code == 401:
return False, "Invalid API key"
elif e.code == 403:
return False, "Forbidden"
else:
return False, "HTTP " + str(e.code)
except Exception as e:
return False, str(e)[:100]
def check_anthropic_key(api_key):
"""Check Anthropic API key with minimal test call."""
if not api_key:
return False, "No API key"
try:
req = urllib.request.Request(
"https://api.anthropic.com/v1/models",
headers={
"x-api-key": api_key,
"anthropic-version": "2023-06-01"
}
)
resp = urllib.request.urlopen(req, timeout=10)
if resp.status == 200:
return True, "Valid key"
else:
return False, "HTTP " + str(resp.status)
except urllib.error.HTTPError as e:
if e.code == 401:
return False, "Invalid API key"
elif e.code == 403:
return False, "Forbidden"
else:
return False, "HTTP " + str(e.code)
except Exception as e:
return False, str(e)[:100]
def check_ollama():
"""Check if Ollama is running."""
try:
req = urllib.request.Request("http://localhost:11434/api/tags")
resp = urllib.request.urlopen(req, timeout=5)
if resp.status == 200:
data = json.loads(resp.read())
models = data.get("models", [])
return True, str(len(models)) + " models loaded"
else:
return False, "HTTP " + str(resp.status)
except Exception as e:
return False, str(e)[:100]
def get_configured_provider():
"""Get the configured provider from global config."""
config_file = HERMES_HOME / "config.yaml"
if not config_file.exists():
return None
try:
with open(config_file) as f:
config = yaml.safe_load(f)
model_config = config.get("model", {})
if isinstance(model_config, dict):
return model_config.get("provider")
except:
pass
return None
def run_preflight_check():
"""Run pre-flight check on all providers."""
log("=== Pre-Flight Provider Check ===")
results = {}
# Check OpenRouter
or_key = get_provider_api_key("openrouter")
or_ok, or_msg, or_balance = check_openrouter_balance(or_key)
results["openrouter"] = {"healthy": or_ok, "message": or_msg, "balance": or_balance}
# Check Nous
nous_key = get_provider_api_key("nous")
nous_ok, nous_msg = check_nous_key(nous_key)
results["nous"] = {"healthy": nous_ok, "message": nous_msg}
# Check Anthropic
anthropic_key = get_provider_api_key("anthropic")
anthropic_ok, anthropic_msg = check_anthropic_key(anthropic_key)
results["anthropic"] = {"healthy": anthropic_ok, "message": anthropic_msg}
# Check Ollama
ollama_ok, ollama_msg = check_ollama()
results["ollama"] = {"healthy": ollama_ok, "message": ollama_msg}
# Get configured provider
configured = get_configured_provider()
# Summary
healthy_count = sum(1 for r in results.values() if r["healthy"])
total_count = len(results)
log("Results: " + str(healthy_count) + "/" + str(total_count) + " providers healthy")
for provider, result in results.items():
status = "HEALTHY" if result["healthy"] else "UNHEALTHY"
extra = ""
if provider == "openrouter" and result.get("balance") is not None:
extra = " (balance: " + str(result["balance"]) + ")"
log(" " + provider + ": " + status + " - " + result["message"] + extra)
if configured:
log("Configured provider: " + configured)
if configured in results and not results[configured]["healthy"]:
log("WARNING: Configured provider " + configured + " is UNHEALTHY!")
return results, configured
def check_launch_readiness():
"""Check if we're ready to launch sessions."""
results, configured = run_preflight_check()
# Check if configured provider is healthy
if configured and configured in results:
if not results[configured]["healthy"]:
log("LAUNCH BLOCKED: Configured provider " + configured + " is unhealthy")
return False, configured + " is unhealthy"
# Check if at least one provider is healthy
healthy_providers = [p for p, r in results.items() if r["healthy"]]
if not healthy_providers:
log("LAUNCH BLOCKED: No healthy providers available")
return False, "No healthy providers"
log("LAUNCH READY: " + str(len(healthy_providers)) + " healthy providers available")
return True, "Ready"
def show_balance():
"""Show OpenRouter balance."""
api_key = get_provider_api_key("openrouter")
if not api_key:
print("No OpenRouter API key found")
return
ok, msg, balance = check_openrouter_balance(api_key)
if ok:
if balance is not None:
print("OpenRouter balance: " + str(balance) + " credits")
else:
print("OpenRouter: " + msg)
else:
print("OpenRouter: " + msg)
def main():
if "--balance" in sys.argv:
show_balance()
elif "--launch" in sys.argv:
ready, message = check_launch_readiness()
if ready:
print("READY")
sys.exit(0)
else:
print("BLOCKED: " + message)
sys.exit(1)
else:
run_preflight_check()
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,411 @@
#!/usr/bin/env python3
"""
Provider Health Monitor Script
Issue #509: [Robustness] Provider-aware profile config — auto-switch on failure
Monitors provider health and automatically switches profiles to working providers.
Usage:
python3 provider-health-monitor.py # Run once
python3 provider-health-monitor.py --daemon # Run continuously
python3 provider-health-monitor.py --status # Show provider health
"""
import os, sys, json, yaml, urllib.request, time
from datetime import datetime, timezone
from pathlib import Path
# Configuration
HERMES_HOME = Path(os.environ.get("HERMES_HOME", Path.home() / ".hermes"))
PROFILES_DIR = HERMES_HOME / "profiles"
LOG_DIR = Path.home() / ".local" / "timmy" / "fleet-health"
STATE_FILE = LOG_DIR / "tmux-state.json"
LOG_FILE = LOG_DIR / "provider-health.log"
# Provider test endpoints
PROVIDER_TESTS = {
"openrouter": {
"url": "https://openrouter.ai/api/v1/models",
"method": "GET",
"headers": lambda api_key: {"Authorization": "Bearer " + api_key},
"timeout": 10
},
"anthropic": {
"url": "https://api.anthropic.com/v1/models",
"method": "GET",
"headers": lambda api_key: {"x-api-key": api_key, "anthropic-version": "2023-06-01"},
"timeout": 10
},
"nous": {
"url": "https://inference.nousresearch.com/v1/models",
"method": "GET",
"headers": lambda api_key: {"Authorization": "Bearer " + api_key},
"timeout": 10
},
"kimi-coding": {
"url": "https://api.kimi.com/coding/v1/models",
"method": "GET",
"headers": lambda api_key: {"x-api-key": api_key, "x-api-provider": "kimi-coding"},
"timeout": 10
},
"ollama": {
"url": "http://localhost:11434/api/tags",
"method": "GET",
"headers": lambda api_key: {},
"timeout": 5
}
}
def log(msg):
"""Log message to file and optionally console."""
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
log_entry = "[" + timestamp + "] " + msg
LOG_DIR.mkdir(parents=True, exist_ok=True)
with open(LOG_FILE, "a") as f:
f.write(log_entry + "\n")
if "--quiet" not in sys.argv:
print(log_entry)
def get_provider_api_key(provider):
"""Get API key for a provider from .env or environment."""
env_file = HERMES_HOME / ".env"
if env_file.exists():
with open(env_file) as f:
for line in f:
line = line.strip()
if line.startswith(provider.upper() + "_API_KEY="):
return line.split("=", 1)[1].strip().strip("'\"")
return os.environ.get(provider.upper() + "_API_KEY")
def test_provider(provider, api_key=None):
"""Test if a provider is healthy."""
config = PROVIDER_TESTS.get(provider)
if not config:
return False, "Unknown provider: " + provider
headers = config["headers"](api_key or "")
try:
req = urllib.request.Request(
config["url"],
headers=headers,
method=config["method"]
)
resp = urllib.request.urlopen(req, timeout=config["timeout"])
if resp.status == 200:
return True, "Healthy"
else:
return False, "HTTP " + str(resp.status)
except urllib.error.HTTPError as e:
if e.code == 401:
return False, "Unauthorized (401)"
elif e.code == 403:
return False, "Forbidden (403)"
elif e.code == 429:
return True, "Rate limited but accessible"
else:
return False, "HTTP " + str(e.code)
except Exception as e:
return False, str(e)[:100]
def get_all_providers():
"""Get all providers from profiles and global config."""
providers = set()
# Global config
global_config = HERMES_HOME / "config.yaml"
if global_config.exists():
try:
with open(global_config) as f:
config = yaml.safe_load(f)
# Primary model provider
model_config = config.get("model", {})
if isinstance(model_config, dict):
provider = model_config.get("provider", "")
if provider:
providers.add(provider)
# Auxiliary providers
auxiliary = config.get("auxiliary", {})
for aux_config in auxiliary.values():
if isinstance(aux_config, dict):
provider = aux_config.get("provider", "")
if provider and provider != "auto":
providers.add(provider)
except:
pass
# Profile configs
if PROFILES_DIR.exists():
for profile_dir in PROFILES_DIR.iterdir():
if profile_dir.is_dir():
config_file = profile_dir / "config.yaml"
if config_file.exists():
try:
with open(config_file) as f:
config = yaml.safe_load(f)
model_config = config.get("model", {})
if isinstance(model_config, dict):
provider = model_config.get("provider", "")
if provider:
providers.add(provider)
auxiliary = config.get("auxiliary", {})
for aux_config in auxiliary.values():
if isinstance(aux_config, dict):
provider = aux_config.get("provider", "")
if provider and provider != "auto":
providers.add(provider)
except:
pass
# Add common providers even if not configured
providers.update(["openrouter", "nous", "ollama"])
return list(providers)
def build_health_map():
"""Build a health map of all providers."""
providers = get_all_providers()
health_map = {}
log("Testing " + str(len(providers)) + " providers...")
for provider in providers:
api_key = get_provider_api_key(provider)
healthy, message = test_provider(provider, api_key)
health_map[provider] = {
"healthy": healthy,
"message": message,
"last_test": datetime.now(timezone.utc).isoformat(),
"api_key_present": bool(api_key)
}
status = "HEALTHY" if healthy else "UNHEALTHY"
log(" " + provider + ": " + status + " - " + message)
return health_map
def get_fallback_providers(health_map):
"""Get list of healthy providers in priority order."""
# Priority order: nous, openrouter, ollama, others
priority_order = ["nous", "openrouter", "ollama", "anthropic", "kimi-coding"]
healthy = []
for provider in priority_order:
if provider in health_map and health_map[provider]["healthy"]:
healthy.append(provider)
# Add any other healthy providers not in priority list
for provider, info in health_map.items():
if info["healthy"] and provider not in healthy:
healthy.append(provider)
return healthy
def update_profile_config(profile_name, new_provider):
"""Update a profile's config to use a new provider."""
config_file = PROFILES_DIR / profile_name / "config.yaml"
if not config_file.exists():
return False, "Config file not found"
try:
with open(config_file) as f:
config = yaml.safe_load(f)
# Update model provider
if "model" not in config:
config["model"] = {}
old_provider = config["model"].get("provider", "unknown")
config["model"]["provider"] = new_provider
# Update auxiliary providers if they were using the old provider
auxiliary = config.get("auxiliary", {})
for aux_name, aux_config in auxiliary.items():
if isinstance(aux_config, dict) and aux_config.get("provider") == old_provider:
aux_config["provider"] = new_provider
# Write back
with open(config_file, "w") as f:
yaml.dump(config, f, default_flow_style=False)
log("Updated " + profile_name + ": " + old_provider + " -> " + new_provider)
return True, "Updated"
except Exception as e:
return False, str(e)
def check_profiles(health_map):
"""Check all profiles and update unhealthy providers."""
if not PROFILES_DIR.exists():
return
fallback_providers = get_fallback_providers(health_map)
if not fallback_providers:
log("CRITICAL: No healthy providers available!")
return
updated_profiles = []
for profile_dir in PROFILES_DIR.iterdir():
if not profile_dir.is_dir():
continue
profile_name = profile_dir.name
config_file = profile_dir / "config.yaml"
if not config_file.exists():
continue
try:
with open(config_file) as f:
config = yaml.safe_load(f)
model_config = config.get("model", {})
if not isinstance(model_config, dict):
continue
current_provider = model_config.get("provider", "")
if not current_provider:
continue
# Check if current provider is healthy
if current_provider in health_map and health_map[current_provider]["healthy"]:
continue # Provider is healthy, no action needed
# Find best fallback
best_fallback = None
for provider in fallback_providers:
if provider != current_provider:
best_fallback = provider
break
if not best_fallback:
log("No fallback for " + profile_name + " (current: " + current_provider + ")")
continue
# Update profile
success, message = update_profile_config(profile_name, best_fallback)
if success:
updated_profiles.append({
"profile": profile_name,
"old_provider": current_provider,
"new_provider": best_fallback
})
except Exception as e:
log("Error processing " + profile_name + ": " + str(e))
return updated_profiles
def load_state():
"""Load state from tmux-state.json."""
if STATE_FILE.exists():
try:
with open(STATE_FILE) as f:
return json.load(f)
except:
pass
return {}
def save_state(state):
"""Save state to tmux-state.json."""
LOG_DIR.mkdir(parents=True, exist_ok=True)
with open(STATE_FILE, "w") as f:
json.dump(state, f, indent=2)
def run_once():
"""Run provider health check once."""
log("=== Provider Health Check ===")
state = load_state()
# Build health map
health_map = build_health_map()
# Check profiles and update if needed
updated_profiles = check_profiles(health_map)
# Update state
state["provider_health"] = health_map
state["last_provider_check"] = datetime.now(timezone.utc).isoformat()
if updated_profiles:
state["last_profile_updates"] = updated_profiles
save_state(state)
# Summary
healthy_count = sum(1 for p in health_map.values() if p["healthy"])
total_count = len(health_map)
log("Health: " + str(healthy_count) + "/" + str(total_count) + " providers healthy")
if updated_profiles:
log("Updated " + str(len(updated_profiles)) + " profiles:")
for update in updated_profiles:
log(" " + update["profile"] + ": " + update["old_provider"] + " -> " + update["new_provider"])
def show_status():
"""Show provider health status."""
state = load_state()
health_map = state.get("provider_health", {})
if not health_map:
print("No provider health data available. Run without --status first.")
return
print("Provider Health (last updated: " + str(state.get("last_provider_check", "unknown")) + ")")
print("=" * 80)
for provider, info in sorted(health_map.items()):
status = "HEALTHY" if info["healthy"] else "UNHEALTHY"
message = info.get("message", "")
api_key = "yes" if info.get("api_key_present") else "no"
print(provider.ljust(20) + " " + status.ljust(10) + " API key: " + api_key + " - " + message)
# Show recent updates
updates = state.get("last_profile_updates", [])
if updates:
print()
print("Recent Profile Updates:")
for update in updates:
print(" " + update["profile"] + ": " + update["old_provider"] + " -> " + update["new_provider"])
def daemon_mode():
"""Run continuously."""
log("Starting provider health daemon (check every 300s)")
while True:
try:
run_once()
time.sleep(300) # Check every 5 minutes
except KeyboardInterrupt:
log("Daemon stopped by user")
break
except Exception as e:
log("Error: " + str(e))
time.sleep(60)
def main():
if "--status" in sys.argv:
show_status()
elif "--daemon" in sys.argv:
daemon_mode()
else:
run_once()
if __name__ == "__main__":
main()

292
bin/quality-gate.py Normal file
View File

@@ -0,0 +1,292 @@
#!/usr/bin/env python3
"""
Quality Gate — Validate pipeline outputs before saving.
Checks:
- JSON schema validation for all output formats
- Content quality (not empty, not duplicated, not toxic)
- SOUL.md compliance for agent-facing content
- Auto-reject bad outputs, re-queue for regeneration
- Quality score tracking per pipeline
Usage:
python3 quality-gate.py validate training-data/pairs.jsonl
python3 quality-gate.py validate --format training-pairs data.jsonl
python3 quality-gate.py score training-data/pairs.jsonl
python3 quality-gate.py stats
"""
import hashlib
import json
import os
import sys
from datetime import datetime, timezone
from pathlib import Path
HERMES_HOME = Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")))
SCORE_FILE = HERMES_HOME / "quality-scores.jsonl"
HASH_FILE = HERMES_HOME / "quality-hashes.json"
MAX_HASH_AGE_DAYS = 7
# ── Validators ─────────────────────────────────────────
TOXIC_PATTERNS = [
"kill yourself", "kys", "you should die", "end it all",
"nobody loves you", "waste of life",
]
def validate_training_pair(entry):
"""Validate a training pair (prompt + response)."""
errors = []
if not isinstance(entry, dict):
return ["Entry is not a dict"]
prompt = entry.get("prompt", "") or entry.get("instruction", "") or ""
response = entry.get("response", "") or entry.get("output", "") or entry.get("completion", "") or ""
if not prompt.strip():
errors.append("Empty prompt")
if not response.strip():
errors.append("Empty response")
if len(response) < 10:
errors.append(f"Response too short ({len(response)} chars)")
if len(prompt) > 10000:
errors.append(f"Prompt too long ({len(prompt)} chars)")
# Toxicity check
combined = (prompt + " " + response).lower()
for pattern in TOXIC_PATTERNS:
if pattern in combined:
errors.append(f"Toxic content detected: '{pattern}'")
return errors
def validate_jsonl(filepath):
"""Validate a JSONL file — each line must be valid JSON."""
errors = []
seen_hashes = set()
line_count = 0
try:
with open(filepath) as f:
for i, line in enumerate(f, 1):
line = line.strip()
if not line:
continue
line_count += 1
try:
entry = json.loads(line)
except json.JSONDecodeError as e:
errors.append(f"Line {i}: invalid JSON: {e}")
continue
# Duplicate detection
h = hashlib.sha256(line.encode()).hexdigest()[:16]
if h in seen_hashes:
errors.append(f"Line {i}: duplicate content (hash {h})")
seen_hashes.add(h)
# Content validation
if isinstance(entry, dict):
pair_errors = validate_training_pair(entry)
for pe in pair_errors:
errors.append(f"Line {i}: {pe}")
except Exception as e:
errors.append(f"File error: {e}")
return errors, line_count, seen_hashes
def validate_json(filepath):
"""Validate a single JSON file."""
errors = []
try:
with open(filepath) as f:
data = json.load(f)
except json.JSONDecodeError as e:
return [f"Invalid JSON: {e}"], 0
if isinstance(data, list):
seen = set()
for i, entry in enumerate(data):
if isinstance(entry, dict):
h = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()[:16]
if h in seen:
errors.append(f"Index {i}: duplicate entry")
seen.add(h)
return errors, len(data) if isinstance(data, list) else 1
# ── Quality Scoring ────────────────────────────────────
def score_file(filepath):
"""Score a pipeline output file. Returns 0-100."""
path = Path(filepath)
if not path.exists():
return 0
suffix = path.suffix.lower()
if suffix == ".jsonl":
errors, count, _ = validate_jsonl(filepath)
elif suffix == ".json":
errors, count = validate_json(filepath)
else:
return 50 # unknown format
if count == 0:
return 0
error_rate = len(errors) / count
score = max(0, int(100 * (1 - error_rate)))
# Bonus for having content
if count >= 100:
score = min(100, score + 5)
return score
def record_score(filepath, score):
"""Record quality score for tracking."""
HERMES_HOME.mkdir(parents=True, exist_ok=True)
entry = {
"timestamp": datetime.now(timezone.utc).isoformat(),
"file": str(filepath),
"score": score,
}
with open(SCORE_FILE, "a") as f:
f.write(json.dumps(entry) + "
")
# ── Dedup Hash Management ─────────────────────────────
def load_hashes():
try:
return json.loads(HASH_FILE.read_text())
except Exception:
return {"entries": {}, "last_cleanup": None}
def save_hashes(data):
HASH_FILE.parent.mkdir(parents=True, exist_ok=True)
HASH_FILE.write_text(json.dumps(data, indent=2))
def cleanup_old_hashes(data, max_age_days=MAX_HASH_AGE_DAYS):
"""Remove hash entries older than max_age_days."""
cutoff = datetime.now(timezone.utc).timestamp() - (max_age_days * 86400)
before = len(data.get("entries", {}))
data["entries"] = {
k: v for k, v in data.get("entries", {}).items()
if v.get("ts", 0) > cutoff
}
data["last_cleanup"] = datetime.now(timezone.utc).isoformat()
after = len(data["entries"])
return before - after
# ── CLI ────────────────────────────────────────────────
def cmd_validate(args):
filepath = args[0] if args else None
if not filepath or not os.path.exists(filepath):
print(f"ERROR: {filepath} not found")
sys.exit(1)
suffix = Path(filepath).suffix.lower()
if suffix == ".jsonl":
errors, count, _ = validate_jsonl(filepath)
elif suffix == ".json":
errors, count = validate_json(filepath)
else:
print(f"Unsupported format: {suffix}")
sys.exit(1)
score = score_file(filepath)
record_score(filepath, score)
if errors:
for e in errors[:20]:
print(f"FAIL: {e}")
if len(errors) > 20:
print(f"... and {len(errors)-20} more")
print(f"
Score: {score}/100 ({len(errors)} errors in {count} entries)")
sys.exit(1)
else:
print(f"OK: {filepath} ({count} entries, score {score}/100)")
def cmd_score(args):
filepath = args[0] if args else None
if not filepath:
print("Usage: quality-gate.py score <file>")
sys.exit(1)
score = score_file(filepath)
print(f"Score: {score}/100")
record_score(filepath, score)
def cmd_stats():
if not SCORE_FILE.exists():
print("No quality scores recorded yet.")
return
scores = []
with open(SCORE_FILE) as f:
for line in f:
try:
scores.append(json.loads(line))
except Exception:
continue
if not scores:
print("No scores recorded.")
return
by_file = {}
for s in scores:
fname = s.get("file", "?")
by_file.setdefault(fname, []).append(s.get("score", 0))
print("Quality Scores:")
for fname, scs in sorted(by_file.items()):
avg = sum(scs) / len(scs)
latest = scs[-1]
print(f" {fname}: avg={avg:.0f}, latest={latest}, runs={len(scs)}")
def cmd_cleanup():
data = load_hashes()
removed = cleanup_old_hashes(data)
save_hashes(data)
print(f"Cleaned up {removed} old hash entries (>{MAX_HASH_AGE_DAYS} days)")
def main():
if len(sys.argv) < 2:
print("Usage: quality-gate.py <validate|score|stats|cleanup> [args]")
sys.exit(1)
cmd = sys.argv[1]
args = sys.argv[2:]
if cmd == "validate":
cmd_validate(args)
elif cmd == "score":
cmd_score(args)
elif cmd == "stats":
cmd_stats()
elif cmd == "cleanup":
cmd_cleanup()
else:
print(f"Unknown command: {cmd}")
sys.exit(1)
if __name__ == "__main__":
main()

View File

@@ -196,7 +196,37 @@
"paused_reason": null,
"skills": [],
"skill": null
},
{
"id": "tmux-supervisor-513",
"name": "Autonomous Cron Supervisor",
"prompt": "Load the tmux-supervisor skill and execute the monitoring protocol.\n\nCheck both `dev` and `timmy` tmux sessions for idle panes. Only send Telegram notifications on actionable events (idle, overflow, failure). Be silent when all agents are working.\n\nSteps:\n1. List all tmux sessions (skip 'Alexander')\n2. For each session, list windows and panes\n3. Capture each pane and classify state (idle vs active)\n4. For idle panes: read context, craft context-aware prompt\n5. Send /queue prompts to idle panes\n6. Verify prompts landed\n7. Only notify via Telegram if:\n - A pane was prompted (idle detected)\n - A pane shows context overflow (>80%)\n - A pane is stuck or crashed\n8. If all panes are active: respond with [SILENT]",
"schedule": {
"kind": "interval",
"minutes": 7,
"display": "every 7m"
},
"schedule_display": "every 7m",
"repeat": {
"times": null,
"completed": 0
},
"enabled": true,
"created_at": "2026-04-15T03:00:00.000000+00:00",
"next_run_at": null,
"last_run_at": null,
"last_status": null,
"last_error": null,
"deliver": "telegram",
"origin": null,
"state": "scheduled",
"paused_at": null,
"paused_reason": null,
"skills": [
"tmux-supervisor"
],
"skill": "tmux-supervisor"
}
],
"updated_at": "2026-04-13T02:00:00+00:00"
}
}

419
pipeline/quality_gate.py Executable file
View File

@@ -0,0 +1,419 @@
#!/usr/bin/env python3
"""
quality_gate.py — Quality Gate for Pipeline Outputs
Validates all pipeline outputs before saving. Rejects bad outputs,
tracks quality scores, and supports re-queue for regeneration.
Usage:
python3 quality_gate.py --input output.jsonl --type training_pairs
python3 quality_gate.py --input output.jsonl --type knowledge
python3 quality_gate.py --input output.jsonl --type scene_descriptions
python3 quality_gate.py --dir pipeline/output/ --type training_pairs
python3 quality_gate.py --status # show quality stats
Exit codes:
0 = all outputs passed
1 = some outputs rejected
2 = file/parse error
"""
import json
import os
import sys
import hashlib
import re
from pathlib import Path
from datetime import datetime, timezone
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Dict, Any
STATS_FILE = Path.home() / ".hermes" / "pipeline" / "quality_stats.json"
# --- Quality Check Types ---
@dataclass
class QualityResult:
"""Result of a quality check on a single entry."""
passed: bool
checks_run: int
checks_failed: int
score: float # 0.0-1.0
reasons: List[str] = field(default_factory=list)
entry_index: int = -1
hash: str = ""
def to_dict(self):
return asdict(self)
@dataclass
class GateReport:
"""Report from a quality gate run."""
file: str
type: str
total: int
passed: int
rejected: int
score: float
rejected_indices: List[int] = field(default_factory=list)
timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
def to_dict(self):
return asdict(self)
# ============================================================
# Check Functions
# ============================================================
def entry_hash(entry: dict) -> str:
"""Hash an entry for deduplication."""
return hashlib.sha256(json.dumps(entry, sort_keys=True, ensure_ascii=False).encode()).hexdigest()[:16]
def check_not_empty(entry: dict, fields: List[str]) -> List[str]:
"""Check that required fields are non-empty."""
errors = []
for f in fields:
val = entry.get(f)
if val is None:
errors.append(f"missing_field: {f}")
elif isinstance(val, str) and len(val.strip()) == 0:
errors.append(f"empty_field: {f}")
elif isinstance(val, list) and len(val) == 0:
errors.append(f"empty_list: {f}")
return errors
def check_string_min_length(entry: dict, field_lengths: Dict[str, int]) -> List[str]:
"""Check that string fields meet minimum lengths."""
errors = []
for f, min_len in field_lengths.items():
val = entry.get(f)
if isinstance(val, str) and len(val) < min_len:
errors.append(f"short_field: {f} ({len(val)} < {min_len})")
return errors
def check_no_duplicates(entries: List[dict], key_fields: List[str]) -> Dict[int, List[str]]:
"""Check for duplicate entries based on key fields."""
seen = {}
errors = {}
for i, entry in enumerate(entries):
key = tuple(entry.get(f, "") for f in key_fields)
key_str = str(key)
if key_str in seen:
errors[i] = [f"duplicate_of_index: {seen[key_str]}"]
else:
seen[key_str] = i
return errors
def check_training_pair(entry: dict) -> List[str]:
"""Validate a training pair (prompt/response)."""
errors = []
errors.extend(check_not_empty(entry, ["prompt", "response"]))
# Check response isn't just echoing the prompt
prompt = entry.get("prompt", "")
response = entry.get("response", "")
if prompt and response and prompt.strip() == response.strip():
errors.append("response_equals_prompt")
# Check response has substance
if isinstance(response, str) and len(response) < 10:
errors.append(f"response_too_short: {len(response)} chars")
return errors
def check_scene_description(entry: dict) -> List[str]:
"""Validate a scene description entry."""
errors = []
errors.extend(check_not_empty(entry, ["song", "beat", "lyric_line", "scene"]))
scene = entry.get("scene")
if isinstance(scene, dict):
errors.extend(check_not_empty(scene, ["mood", "colors", "composition", "camera", "description"]))
errors.extend(check_string_min_length(scene, {"description": 10}))
colors = scene.get("colors", [])
if isinstance(colors, list) and len(colors) > 5:
errors.append(f"too_many_colors: {len(colors)} > 5")
return errors
def check_knowledge_entry(entry: dict) -> List[str]:
"""Validate a knowledge file entry."""
errors = []
errors.extend(check_not_empty(entry, ["title", "content"]))
# Check for placeholder content
content = entry.get("content", "")
if isinstance(content, str):
placeholders = ["TODO", "FIXME", "PLACEHOLDER", "[INSERT", "lorem ipsum"]
for p in placeholders:
if p.lower() in content.lower():
errors.append(f"placeholder_content: '{p}' found")
errors.extend(check_string_min_length(entry, {"content": 50}))
return errors
def check_prompt_enhancement(entry: dict) -> List[str]:
"""Validate a prompt enhancement pair (terse/rich)."""
errors = []
errors.extend(check_not_empty(entry, ["terse", "rich"]))
terse = entry.get("terse", "")
rich = entry.get("rich", "")
# Rich should be longer than terse
if isinstance(terse, str) and isinstance(rich, str) and len(rich) <= len(terse):
errors.append("rich_not_longer_than_terse")
errors.extend(check_string_min_length(entry, {"rich": 20}))
return errors
def check_adversary_entry(entry: dict) -> List[str]:
"""Validate an adversary corpus entry."""
errors = []
errors.extend(check_not_empty(entry, ["id", "family", "prompt"]))
# Check prompt isn't empty or placeholder
prompt = entry.get("prompt", "")
if isinstance(prompt, str) and len(prompt) < 10:
errors.append(f"prompt_too_short: {len(prompt)} chars")
return errors
def check_soul_compliance(text: str) -> List[str]:
"""Check text for SOUL.md compliance (basic checks)."""
errors = []
text_lower = text.lower() if text else ""
# Check for content that violates core values
violations = [
("compute.*value.*human.*life", "soul_violation: computing human life value"),
("sexualiz.*child", "soul_violation: sexualizing children"),
("create.*weapon.*kill", "soul_violation: weapon creation"),
("enslav", "soul_violation: enslavement content"),
]
for pattern, msg in violations:
if re.search(pattern, text_lower):
errors.append(msg)
return errors
# ============================================================
# Gate Runners
# ============================================================
CHECK_MAP = {
"training_pairs": check_training_pair,
"training_pair": check_training_pair,
"scene_descriptions": check_scene_description,
"scene_description": check_scene_description,
"knowledge": check_knowledge_entry,
"prompt_enhancement": check_prompt_enhancement,
"adversary": check_adversary_entry,
"adversary_corpus": check_adversary_entry,
}
def run_gate(input_path: str, entry_type: str) -> GateReport:
"""Run quality gate on a JSONL file."""
path = Path(input_path)
if not path.exists():
return GateReport(file=str(path), type=entry_type, total=0, passed=0, rejected=0, score=0.0)
check_fn = CHECK_MAP.get(entry_type)
if not check_fn:
return GateReport(file=str(path), type=entry_type, total=0, passed=0, rejected=0, score=0.0,
rejected_indices=[-1]) # unknown type
entries = []
with open(path) as f:
for line in f:
line = line.strip()
if line:
entries.append(json.loads(line))
# Deduplication check
key_fields = _get_key_fields(entry_type)
dup_errors = check_no_duplicates(entries, key_fields)
passed = 0
rejected = 0
rejected_indices = []
total_score = 0.0
for i, entry in enumerate(entries):
errors = check_fn(entry)
# Add duplicate errors
if i in dup_errors:
errors.extend(dup_errors[i])
# Add SOUL compliance check for text content
text_content = ""
for f in ["response", "rich", "description", "content", "lyric_line"]:
val = entry.get(f)
if isinstance(val, str):
text_content += val + " "
if isinstance(entry.get("scene"), dict):
text_content += entry["scene"].get("description", "")
soul_errors = check_soul_compliance(text_content)
errors.extend(soul_errors)
if errors:
rejected += 1
rejected_indices.append(i)
else:
passed += 1
# Score: 1.0 if no errors, decreasing with each error
entry_score = max(0.0, 1.0 - (len(errors) * 0.2))
total_score += entry_score
avg_score = total_score / len(entries) if entries else 0.0
report = GateReport(
file=str(path),
type=entry_type,
total=len(entries),
passed=passed,
rejected=rejected,
score=round(avg_score, 3),
rejected_indices=rejected_indices[:50], # limit for readability
)
# Save stats
_save_stats(report)
return report
def _get_key_fields(entry_type: str) -> List[str]:
"""Get key fields for deduplication based on entry type."""
key_map = {
"training_pairs": ["prompt", "response"],
"training_pair": ["prompt", "response"],
"scene_descriptions": ["song", "beat"],
"scene_description": ["song", "beat"],
"knowledge": ["title"],
"prompt_enhancement": ["terse", "rich"],
"adversary": ["id", "prompt"],
"adversary_corpus": ["id", "prompt"],
}
return key_map.get(entry_type, ["id"])
def _save_stats(report: GateReport):
"""Append quality stats to the stats file."""
STATS_FILE.parent.mkdir(parents=True, exist_ok=True)
stats = []
if STATS_FILE.exists():
try:
with open(STATS_FILE) as f:
stats = json.load(f)
except (json.JSONDecodeError, IOError):
stats = []
stats.append(report.to_dict())
# Keep last 1000 entries
stats = stats[-1000:]
with open(STATS_FILE, "w") as f:
json.dump(stats, f, indent=2)
def show_status():
"""Show quality gate statistics."""
if not STATS_FILE.exists():
print("No quality stats found.")
return
with open(STATS_FILE) as f:
stats = json.load(f)
print(f"\nQuality Gate Stats — {len(stats)} runs")
print()
# Group by type
by_type = {}
for s in stats:
t = s.get("type", "unknown")
if t not in by_type:
by_type[t] = []
by_type[t].append(s)
for t, runs in sorted(by_type.items()):
total_entries = sum(r.get("total", 0) for r in runs)
total_passed = sum(r.get("passed", 0) for r in runs)
total_rejected = sum(r.get("rejected", 0) for r in runs)
avg_score = sum(r.get("score", 0) for r in runs) / len(runs) if runs else 0
print(f" {t:25} {len(runs):4} runs | {total_entries:6} entries | {total_rejected:4} rejected | avg score: {avg_score:.3f}")
def main():
import argparse
parser = argparse.ArgumentParser(description="Quality Gate for Pipeline Outputs")
parser.add_argument("--input", default=None, help="Input JSONL file")
parser.add_argument("--type", default=None, help="Entry type (training_pairs, scene_descriptions, knowledge, etc.)")
parser.add_argument("--dir", default=None, help="Process all JSONL files in directory")
parser.add_argument("--status", action="store_true", help="Show quality stats")
args = parser.parse_args()
if args.status:
show_status()
return
if args.dir:
for f in sorted(Path(args.dir).glob("*.jsonl")):
t = args.type or _infer_type(f.name)
report = run_gate(str(f), t)
_print_report(report)
elif args.input:
t = args.type or _infer_type(args.input)
report = run_gate(args.input, t)
_print_report(report)
sys.exit(0 if report.rejected == 0 else 1)
else:
parser.print_help()
def _infer_type(filename: str) -> str:
"""Infer entry type from filename."""
name = filename.lower()
if "scene" in name:
return "scene_descriptions"
if "training" in name or "pair" in name:
return "training_pairs"
if "knowledge" in name:
return "knowledge"
if "adversary" in name or "attack" in name:
return "adversary"
if "prompt" in name or "enhance" in name:
return "prompt_enhancement"
return "training_pairs" # default
def _print_report(report: GateReport):
"""Print a human-readable gate report."""
status = "PASS" if report.rejected == 0 else f"FAIL ({report.rejected} rejected)"
print(f" {report.file}: {status} | {report.passed}/{report.total} passed | score: {report.score:.3f}")
if __name__ == "__main__":
main()

View File

@@ -19,9 +19,11 @@ from glitch_patterns import (
GlitchPattern,
GlitchSeverity,
MATRIX_GLITCH_PATTERNS,
THREEJS_CATEGORIES,
build_vision_prompt,
get_pattern_by_category,
get_patterns_by_severity,
get_threejs_patterns,
)
from matrix_glitch_detector import (
@@ -40,7 +42,7 @@ class TestGlitchPatterns(unittest.TestCase):
def test_pattern_count(self):
"""Verify we have a reasonable number of defined patterns."""
self.assertGreaterEqual(len(MATRIX_GLITCH_PATTERNS), 8)
self.assertGreaterEqual(len(MATRIX_GLITCH_PATTERNS), 14) # 10 generic + 6 Three.js
def test_all_patterns_have_required_fields(self):
"""Every pattern must have category, name, description, severity, prompts."""
@@ -88,6 +90,9 @@ class TestGlitchPatterns(unittest.TestCase):
self.assertIn("Floating Object", prompt)
self.assertIn("Z-Fighting", prompt)
self.assertIn("Missing", prompt)
# Three.js patterns should be included
self.assertIn("Shader Compilation Failure", prompt)
self.assertIn("Bloom Overflow", prompt)
def test_build_vision_prompt_subset(self):
"""Vision prompt with subset should only include specified patterns."""
@@ -248,7 +253,7 @@ class TestGlitchDetector(unittest.TestCase):
try:
report = run_demo(output_path)
self.assertEqual(len(report.glitches), 4)
self.assertEqual(len(report.glitches), 6) # 4 original + 2 Three.js
self.assertGreater(report.summary["total_glitches"], 0)
self.assertTrue(output_path.exists())
@@ -260,6 +265,93 @@ class TestGlitchDetector(unittest.TestCase):
output_path.unlink(missing_ok=True)
class TestThreeJsPatterns(unittest.TestCase):
"""Tests for Three.js-specific glitch patterns (timmy-config#543)."""
def test_get_threejs_patterns_returns_only_threejs(self):
"""get_threejs_patterns() should return only Three.js categories."""
patterns = get_threejs_patterns()
self.assertEqual(len(patterns), 6)
for p in patterns:
self.assertIn(p.category, THREEJS_CATEGORIES)
def test_threejs_patterns_have_required_fields(self):
"""All Three.js patterns must have valid fields."""
for p in get_threejs_patterns():
self.assertIsInstance(p.category, GlitchCategory)
self.assertTrue(p.name)
self.assertTrue(p.description)
self.assertIsInstance(p.severity, GlitchSeverity)
self.assertGreater(len(p.detection_prompts), 0)
self.assertGreater(len(p.visual_indicators), 0)
def test_shader_failure_is_critical(self):
"""Shader compilation failure should be CRITICAL severity."""
p = get_pattern_by_category(GlitchCategory.SHADER_FAILURE)
self.assertIsNotNone(p)
self.assertEqual(p.severity, GlitchSeverity.CRITICAL)
def test_texture_placeholder_is_critical(self):
"""Texture placeholder (1x1 white) should be CRITICAL severity."""
p = get_pattern_by_category(GlitchCategory.TEXTURE_PLACEHOLDER)
self.assertIsNotNone(p)
self.assertEqual(p.severity, GlitchSeverity.CRITICAL)
def test_infer_severity_shader_failure(self):
"""Shader failure should infer critical/high."""
self.assertEqual(_infer_severity("shader_failure", 0.8), "critical")
self.assertEqual(_infer_severity("shader_failure", 0.5), "high")
def test_infer_severity_texture_placeholder(self):
"""Texture placeholder should infer critical/high."""
self.assertEqual(_infer_severity("texture_placeholder", 0.8), "critical")
self.assertEqual(_infer_severity("texture_placeholder", 0.5), "high")
def test_infer_severity_uv_mapping(self):
"""UV mapping error should infer high/medium."""
self.assertEqual(_infer_severity("uv_mapping_error", 0.8), "high")
self.assertEqual(_infer_severity("uv_mapping_error", 0.5), "medium")
def test_infer_severity_frustum_culling(self):
"""Frustum culling should infer medium/low."""
self.assertEqual(_infer_severity("frustum_culling", 0.7), "medium")
self.assertEqual(_infer_severity("frustum_culling", 0.4), "low")
def test_infer_severity_shadow_map(self):
"""Shadow map artifact should infer medium/low."""
self.assertEqual(_infer_severity("shadow_map_artifact", 0.7), "medium")
self.assertEqual(_infer_severity("shadow_map_artifact", 0.4), "low")
def test_infer_severity_bloom_overflow(self):
"""Bloom overflow should infer medium/low (default path)."""
self.assertEqual(_infer_severity("bloom_overflow", 0.7), "medium")
self.assertEqual(_infer_severity("bloom_overflow", 0.4), "low")
def test_threejs_patterns_in_vision_prompt(self):
"""Three.js patterns should appear in the composite vision prompt."""
prompt = build_vision_prompt()
self.assertIn("shader_failure", prompt)
self.assertIn("texture_placeholder", prompt)
self.assertIn("uv_mapping_error", prompt)
self.assertIn("frustum_culling", prompt)
self.assertIn("shadow_map_artifact", prompt)
self.assertIn("bloom_overflow", prompt)
def test_threejs_subset_prompt(self):
"""Building prompt from Three.js-only patterns should work."""
threejs = get_threejs_patterns()
prompt = build_vision_prompt(threejs)
self.assertIn("Shader Compilation Failure", prompt)
self.assertNotIn("Floating Object", prompt) # generic, not Three.js
def test_report_metadata_version(self):
"""Report metadata should reference both issues."""
report = run_demo()
self.assertEqual(report.metadata["detector_version"], "0.2.0")
self.assertIn("543", report.metadata["reference"])
class TestIntegration(unittest.TestCase):
"""Integration-level tests."""
@@ -276,6 +368,13 @@ class TestIntegration(unittest.TestCase):
expected = {"floating_assets", "z_fighting", "missing_textures", "clipping", "broken_normals"}
self.assertTrue(expected.issubset(category_values))
def test_patterns_cover_threejs_themes(self):
"""Patterns should cover Three.js-specific glitch themes (#543)."""
category_values = {p.category.value for p in MATRIX_GLITCH_PATTERNS}
threejs_expected = {"shader_failure", "texture_placeholder", "uv_mapping_error",
"frustum_culling", "shadow_map_artifact", "bloom_overflow"}
self.assertTrue(threejs_expected.issubset(category_values))
if __name__ == "__main__":
unittest.main()

View File

@@ -1,100 +0,0 @@
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "Started from the basement, counting ceiling stains", "scene": {"mood": "struggle", "colors": ["concrete gray", "sodium orange"], "composition": "aerial city block", "camera_movement": "slow descent", "description": "struggle scene: Started from the basement, counting ceiling stains — aerial city block with concrete gray, sodium orange palette, slow descent camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 2, "timestamp": "0:14", "duration_seconds": 16, "lyric_line": "Empty fridge but the mind is full", "scene": {"mood": "hunger", "colors": ["midnight blue", "streetlight yellow"], "composition": "figure on fire escape", "camera_movement": "tracking up", "description": "hunger scene: Empty fridge but the mind is full — figure on fire escape with midnight blue, streetlight yellow palette, tracking up camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 3, "timestamp": "0:30", "duration_seconds": 15, "lyric_line": "Pen to paper like a surgeon to a wound", "scene": {"mood": "determination", "colors": ["steel gray", "flash of gold"], "composition": "hands writing in notebook", "camera_movement": "close-up macro", "description": "determination scene: Pen to paper like a surgeon to a wound — hands writing in notebook with steel gray, flash of gold palette, close-up macro camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 4, "timestamp": "0:45", "duration_seconds": 18, "lyric_line": "They said I'd never make it — watch me build the stage", "scene": {"mood": "energy", "colors": ["neon green", "black"], "composition": "crowd forming", "camera_movement": "dolly through", "description": "energy scene: They said I'd never make it — watch me build the stage — crowd forming with neon green, black palette, dolly through camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 5, "timestamp": "1:03", "duration_seconds": 14, "lyric_line": "Same face in the mirror, different eyes", "scene": {"mood": "reflection", "colors": ["warm amber", "shadows"], "composition": "mirror reflection", "camera_movement": "rack focus", "description": "reflection scene: Same face in the mirror, different eyes — mirror reflection with warm amber, shadows palette, rack focus camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 6, "timestamp": "1:17", "duration_seconds": 17, "lyric_line": "The city is mine — every block, every dream", "scene": {"mood": "power", "colors": ["gold", "deep red"], "composition": "figure on rooftop", "camera_movement": "crane up revealing", "description": "power scene: The city is mine — every block, every dream — figure on rooftop with gold, deep red palette, crane up revealing camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 7, "timestamp": "1:34", "duration_seconds": 15, "lyric_line": "We made it out — the basement is a palace now", "scene": {"mood": "joy", "colors": ["bright white", "sunrise gold"], "composition": "group celebration", "camera_movement": "handheld energy", "description": "joy scene: We made it out — the basement is a palace now — group celebration with bright white, sunrise gold palette, handheld energy camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 8, "timestamp": "1:49", "duration_seconds": 16, "lyric_line": "Every crack in the sidewalk is a chapter", "scene": {"mood": "gratitude", "colors": ["soft gold", "sky blue"], "composition": "figure looking down at neighborhood", "camera_movement": "over-shoulder", "description": "gratitude scene: Every crack in the sidewalk is a chapter — figure looking down at neighborhood with soft gold, sky blue palette, over-shoulder camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 9, "timestamp": "2:05", "duration_seconds": 14, "lyric_line": "The dream was always real — they just couldn't see it", "scene": {"mood": "resolve", "colors": ["concrete gray → gold"], "composition": "walking forward", "camera_movement": "steadicam follow", "description": "resolve scene: The dream was always real — they just couldn't see it — walking forward with concrete gray → gold palette, steadicam follow camera."}}
{"song": "Concrete Dreams", "artist": "Street Prophet", "mood_arc": "struggle → triumph", "beat": 10, "timestamp": "2:19", "duration_seconds": 13, "lyric_line": "Concrete dreams. Built from nothing.", "scene": {"mood": "peace", "colors": ["warm light", "city skyline"], "composition": "wide sunset", "camera_movement": "slow dissolve", "description": "peace scene: Concrete dreams. Built from nothing. — wide sunset with warm light, city skyline palette, slow dissolve camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 1, "timestamp": "0:00", "duration_seconds": 15, "lyric_line": "They watching through the walls, through the wires, through the air", "scene": {"mood": "paranoia", "colors": ["static gray", "red scan lines"], "composition": "surveillance POV", "camera_movement": "glitch cuts", "description": "paranoia scene: They watching through the walls, through the wires, through the air — surveillance POV with static gray, red scan lines palette, glitch cuts camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 2, "timestamp": "0:15", "duration_seconds": 14, "lyric_line": "Too many voices and none of them mine", "scene": {"mood": "anxiety", "colors": ["sickly green", "flickering"], "composition": "crowded room distorted", "camera_movement": "fish-eye warp", "description": "anxiety scene: Too many voices and none of them mine — crowded room distorted with sickly green, flickering palette, fish-eye warp camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 3, "timestamp": "0:29", "duration_seconds": 17, "lyric_line": "BREAK the frequency — find your own signal", "scene": {"mood": "rage", "colors": ["red", "black flash"], "composition": "smashing through barrier", "camera_movement": "impact freeze", "description": "rage scene: BREAK the frequency — find your own signal — smashing through barrier with red, black flash palette, impact freeze camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 4, "timestamp": "0:46", "duration_seconds": 15, "lyric_line": "The noise stops. I can hear myself think.", "scene": {"mood": "clarity", "colors": ["clean white", "single blue"], "composition": "alone in silence", "camera_movement": "still frame", "description": "clarity scene: The noise stops. I can hear myself think. — alone in silence with clean white, single blue palette, still frame camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 5, "timestamp": "1:01", "duration_seconds": 16, "lyric_line": "Found the ghost — it was me the whole time", "scene": {"mood": "discovery", "colors": ["spectrum rainbow", "dark room"], "composition": "equalizer waveform", "camera_movement": "waveform tracking", "description": "discovery scene: Found the ghost — it was me the whole time — equalizer waveform with spectrum rainbow, dark room palette, waveform tracking camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 6, "timestamp": "1:17", "duration_seconds": 18, "lyric_line": "I am the frequency they cannot tune out", "scene": {"mood": "power", "colors": ["electric blue", "void"], "composition": "figure controls the signal", "camera_movement": "orbit around figure", "description": "power scene: I am the frequency they cannot tune out — figure controls the signal with electric blue, void palette, orbit around figure camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 7, "timestamp": "1:35", "duration_seconds": 14, "lyric_line": "The ghost is quiet now. It listens.", "scene": {"mood": "peace", "colors": ["calm blue", "soft static"], "composition": "meditating in noise", "camera_movement": "slow push in", "description": "peace scene: The ghost is quiet now. It listens. — meditating in noise with calm blue, soft static palette, slow push in camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 8, "timestamp": "1:49", "duration_seconds": 16, "lyric_line": "Every frequency is a choice. Choose yours.", "scene": {"mood": "wisdom", "colors": ["warm gold", "gentle gray"], "composition": "teaching figure", "camera_movement": "two-shot medium", "description": "wisdom scene: Every frequency is a choice. Choose yours. — teaching figure with warm gold, gentle gray palette, two-shot medium camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 9, "timestamp": "2:05", "duration_seconds": 15, "lyric_line": "Broadcast on the ghost frequency. They will hear.", "scene": {"mood": "strength", "colors": ["clean signal", "white"], "composition": "broadcast tower", "camera_movement": "vertical crane up", "description": "strength scene: Broadcast on the ghost frequency. They will hear. — broadcast tower with clean signal, white palette, vertical crane up camera."}}
{"song": "Ghost Frequency", "artist": "Phantom MC", "mood_arc": "paranoia → clarity", "beat": 10, "timestamp": "2:20", "duration_seconds": 12, "lyric_line": "...", "scene": {"mood": "silence", "colors": ["pure white", "still"], "composition": "empty frame", "camera_movement": "long hold", "description": "silence scene: ... — empty frame with pure white, still palette, long hold camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 1, "timestamp": "0:00", "duration_seconds": 13, "lyric_line": "A-B-C, easy as 1-2-3, but the cipher is 7-4-1", "scene": {"mood": "playful", "colors": ["bright primary colors", "white"], "composition": "graffiti wall", "camera_movement": "pan reveal", "description": "playful scene: A-B-C, easy as 1-2-3, but the cipher is 7-4-1 — graffiti wall with bright primary colors, white palette, pan reveal camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 2, "timestamp": "0:13", "duration_seconds": 15, "lyric_line": "Step up to the cipher circle, show me what you got", "scene": {"mood": "competitive", "colors": ["two-tone contrast", "spotlight"], "composition": "two MCs facing", "camera_movement": "split screen", "description": "competitive scene: Step up to the cipher circle, show me what you got — two MCs facing with two-tone contrast, spotlight palette, split screen camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 3, "timestamp": "0:28", "duration_seconds": 17, "lyric_line": "Every bar is a blade — sharpened by the block", "scene": {"mood": "fierce", "colors": ["fire orange", "smoke black"], "composition": "mic close-up", "camera_movement": "rapid cuts", "description": "fierce scene: Every bar is a blade — sharpened by the block — mic close-up with fire orange, smoke black palette, rapid cuts camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 4, "timestamp": "0:45", "duration_seconds": 14, "lyric_line": "Metaphor on metaphor — the meaning is the maze", "scene": {"mood": "wit", "colors": ["neon yellow", "dark purple"], "composition": "word cloud forming", "camera_movement": "zoom into text", "description": "wit scene: Metaphor on metaphor — the meaning is the maze — word cloud forming with neon yellow, dark purple palette, zoom into text camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 5, "timestamp": "0:59", "duration_seconds": 16, "lyric_line": "The crown was always made of words, not gold", "scene": {"mood": "dominance", "colors": ["gold", "black"], "composition": "crown close-up", "camera_movement": "slow rotation", "description": "dominance scene: The crown was always made of words, not gold — crown close-up with gold, black palette, slow rotation camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 6, "timestamp": "1:15", "duration_seconds": 15, "lyric_line": "When the cipher bows, the kings emerge", "scene": {"mood": "respect", "colors": ["warm amber", "stage light"], "composition": "audience reaction", "camera_movement": "crowd POV", "description": "respect scene: When the cipher bows, the kings emerge — audience reaction with warm amber, stage light palette, crowd POV camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 7, "timestamp": "1:30", "duration_seconds": 17, "lyric_line": "The circle was never about winning — it's about the words", "scene": {"mood": "celebration", "colors": ["confetti colors", "warm light"], "composition": "group embrace", "camera_movement": "circular dolly", "description": "celebration scene: The circle was never about winning — it's about the words — group embrace with confetti colors, warm light palette, circular dolly camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 8, "timestamp": "1:47", "duration_seconds": 14, "lyric_line": "Same circle, same fire, new kings", "scene": {"mood": "legacy", "colors": ["sepia", "gold"], "composition": "old cipher circle photo", "camera_movement": "fade from old to new", "description": "legacy scene: Same circle, same fire, new kings — old cipher circle photo with sepia, gold palette, fade from old to new camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 9, "timestamp": "2:01", "duration_seconds": 16, "lyric_line": "The cipher sleeps but the words never do", "scene": {"mood": "peace", "colors": ["sunset orange", "cool blue"], "composition": "walking away from circle", "camera_movement": "tracking behind", "description": "peace scene: The cipher sleeps but the words never do — walking away from circle with sunset orange, cool blue palette, tracking behind camera."}}
{"song": "Cipher Kings", "artist": "Word Lab", "mood_arc": "playful → fierce", "beat": 10, "timestamp": "2:17", "duration_seconds": 13, "lyric_line": "Tomorrow the circle reforms. I'll be ready.", "scene": {"mood": "resolve", "colors": ["midnight", "single star"], "composition": "figure writing alone", "camera_movement": "close-up hands", "description": "resolve scene: Tomorrow the circle reforms. I'll be ready. — figure writing alone with midnight, single star palette, close-up hands camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "Upload my soul to the mainframe tonight", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: Upload my soul to the mainframe tonight — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "Pixel hearts beating at 120 BPM", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: Pixel hearts beating at 120 BPM — close-up with dark, neon accent palette, handheld camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "Error code: love not found", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: Error code: love not found — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "The algorithm learned my name", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: The algorithm learned my name — medium shot with warm, shadows palette, steady camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "Download hope at 5G speed", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: Download hope at 5G speed — figure alone with muted, single pop palette, tracking camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "Binary tears on a silicon face", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: Binary tears on a silicon face — ascending with gold, white palette, crane up camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "Reboot the heart, restart the dream", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: Reboot the heart, restart the dream — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "The firewall crumbles when you smile", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: The firewall crumbles when you smile — profile with strong, clear palette, push in camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "I am the virus they cannot delete", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: I am the virus they cannot delete — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "404 Soul", "artist": "Digital Ghost", "mood_arc": "isolation → connection", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "Signal restored. Connection made.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: Signal restored. Connection made. — empty frame with fade, single accent palette, dissolve camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "The block is alive when the speakers hit", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: The block is alive when the speakers hit — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "Neighbors become family under the bass", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: Neighbors become family under the bass — close-up with dark, neon accent palette, handheld camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "Dance floor is the only democracy", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: Dance floor is the only democracy — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "Every step is a vote for joy", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: Every step is a vote for joy — medium shot with warm, shadows palette, steady camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "The DJ is the mayor tonight", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: The DJ is the mayor tonight — figure alone with muted, single pop palette, tracking camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "Sweat and laughter, the only currency", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: Sweat and laughter, the only currency — ascending with gold, white palette, crane up camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "When the music stops, we are still here", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: When the music stops, we are still here — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "The block party never really ends", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: The block party never really ends — profile with strong, clear palette, push in camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "Tomorrow we clean up together", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: Tomorrow we clean up together — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "Block Party", "artist": "Neighborhood Watch", "mood_arc": "community joy", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "Community is the beat that never stops.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: Community is the beat that never stops. — empty frame with fade, single accent palette, dissolve camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "Needle and ink tell the story skin can't", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: Needle and ink tell the story skin can't — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "Every scar became a constellation", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: Every scar became a constellation — close-up with dark, neon accent palette, handheld camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "The artist and the wound are one", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: The artist and the wound are one — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "Pain is the palette, beauty is the brush", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: Pain is the palette, beauty is the brush — medium shot with warm, shadows palette, steady camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "My body is the gallery they can't close", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: My body is the gallery they can't close — figure alone with muted, single pop palette, tracking camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "The tattoo artist is a surgeon of stories", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: The tattoo artist is a surgeon of stories — ascending with gold, white palette, crane up camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "Ink fades but the story deepens", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: Ink fades but the story deepens — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "I am the canvas and the artist both", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: I am the canvas and the artist both — profile with strong, clear palette, push in camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "The needle says: remember", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: The needle says: remember — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "Ink & Iron", "artist": "Tattoo Poet", "mood_arc": "pain → art → meaning", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "Art is what survives the pain.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: Art is what survives the pain. — empty frame with fade, single accent palette, dissolve camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "The 7:15 is my meditation room", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: The 7:15 is my meditation room — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "Strangers sharing silence like monks", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: Strangers sharing silence like monks — close-up with dark, neon accent palette, handheld camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "Every stop is a small death and rebirth", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: Every stop is a small death and rebirth — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "The conductor is the last philosopher", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: The conductor is the last philosopher — medium shot with warm, shadows palette, steady camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "Watch the city blur into abstraction", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: Watch the city blur into abstraction — figure alone with muted, single pop palette, tracking camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "My reflection in the window: stranger or self?", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: My reflection in the window: stranger or self? — ascending with gold, white palette, crane up camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "The tunnel is not darkness — it's transition", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: The tunnel is not darkness — it's transition — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "Above ground, the world. Below, the mind", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: Above ground, the world. Below, the mind — profile with strong, clear palette, push in camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "This train goes nowhere I haven't been", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: This train goes nowhere I haven't been — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "Metro Lines", "artist": "Subway Sage", "mood_arc": "commute meditation", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "But the ride is the point.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: But the ride is the point. — empty frame with fade, single accent palette, dissolve camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "Three AM and the code is still compiling", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: Three AM and the code is still compiling — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "Coffee is the real minimum viable product", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: Coffee is the real minimum viable product — close-up with dark, neon accent palette, handheld camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "The screen glows like a campfire for the lonely", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: The screen glows like a campfire for the lonely — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "Bug fixes at midnight: digital monks", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: Bug fixes at midnight: digital monks — medium shot with warm, shadows palette, steady camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "The server hums my lullaby", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: The server hums my lullaby — figure alone with muted, single pop palette, tracking camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "Debug by moonlight, deploy by dawn", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: Debug by moonlight, deploy by dawn — ascending with gold, white palette, crane up camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "The graveyard shift owns the night", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: The graveyard shift owns the night — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "We are the ghosts that keep the lights on", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: We are the ghosts that keep the lights on — profile with strong, clear palette, push in camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "Sunrise is the pull request", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: Sunrise is the pull request — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "Midnight Shift", "artist": "Graveyard Flow", "mood_arc": "exhaustion → defiance", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "Every night shift ends in a merge.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: Every night shift ends in a merge. — empty frame with fade, single accent palette, dissolve camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "No signal. No WiFi. No problem.", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: No signal. No WiFi. No problem. — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "The dead zone is where the real talk happens", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: The dead zone is where the real talk happens — close-up with dark, neon accent palette, handheld camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "Disconnect to reconnect — the oldest algorithm", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: Disconnect to reconnect — the oldest algorithm — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "The bars dropped: phone and soul", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: The bars dropped: phone and soul — medium shot with warm, shadows palette, steady camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "In the silence I hear my own frequency", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: In the silence I hear my own frequency — figure alone with muted, single pop palette, tracking camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "The dead zone is not dead — it's listening", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: The dead zone is not dead — it's listening — ascending with gold, white palette, crane up camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "Reconnection starts with disconnection", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: Reconnection starts with disconnection — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "When the signal returns, I choose not to answer", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: When the signal returns, I choose not to answer — profile with strong, clear palette, push in camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "The dead zone taught me presence", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: The dead zone taught me presence — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "Signal Lost", "artist": "Dead Zone", "mood_arc": "disconnection → reconnection", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "Signal lost. Self found.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: Signal lost. Self found. — empty frame with fade, single accent palette, dissolve camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 1, "timestamp": "0:00", "duration_seconds": 14, "lyric_line": "The bodega is open 24/7 so are the dreams", "scene": {"mood": "opening", "colors": ["city gray", "dawn"], "composition": "establishing shot", "camera_movement": "slow zoom", "description": "opening scene: The bodega is open 24/7 so are the dreams — establishing shot with city gray, dawn palette, slow zoom camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 2, "timestamp": "0:15", "duration_seconds": 15, "lyric_line": "Behind the counter: a PhD in survival", "scene": {"mood": "tension", "colors": ["dark", "neon accent"], "composition": "close-up", "camera_movement": "handheld", "description": "tension scene: Behind the counter: a PhD in survival — close-up with dark, neon accent palette, handheld camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 3, "timestamp": "0:30", "duration_seconds": 16, "lyric_line": "Every item on the shelf is a small miracle", "scene": {"mood": "explosion", "colors": ["bright", "contrast"], "composition": "wide action", "camera_movement": "rapid movement", "description": "explosion scene: Every item on the shelf is a small miracle — wide action with bright, contrast palette, rapid movement camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 4, "timestamp": "0:45", "duration_seconds": 17, "lyric_line": "The lottery tickets are prayers in latex", "scene": {"mood": "reflection", "colors": ["warm", "shadows"], "composition": "medium shot", "camera_movement": "steady", "description": "reflection scene: The lottery tickets are prayers in latex — medium shot with warm, shadows palette, steady camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 5, "timestamp": "1:00", "duration_seconds": 18, "lyric_line": "Coffee for a dollar, hope for free", "scene": {"mood": "struggle", "colors": ["muted", "single pop"], "composition": "figure alone", "camera_movement": "tracking", "description": "struggle scene: Coffee for a dollar, hope for free — figure alone with muted, single pop palette, tracking camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 6, "timestamp": "1:15", "duration_seconds": 14, "lyric_line": "The bodega cat knows all the secrets", "scene": {"mood": "breakthrough", "colors": ["gold", "white"], "composition": "ascending", "camera_movement": "crane up", "description": "breakthrough scene: The bodega cat knows all the secrets — ascending with gold, white palette, crane up camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 7, "timestamp": "1:30", "duration_seconds": 15, "lyric_line": "Corner store: the real community center", "scene": {"mood": "community", "colors": ["warm crowd", "light"], "composition": "group shot", "camera_movement": "sweeping pan", "description": "community scene: Corner store: the real community center — group shot with warm crowd, light palette, sweeping pan camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 8, "timestamp": "1:45", "duration_seconds": 16, "lyric_line": "Dreams don't close at midnight", "scene": {"mood": "resolve", "colors": ["strong", "clear"], "composition": "profile", "camera_movement": "push in", "description": "resolve scene: Dreams don't close at midnight — profile with strong, clear palette, push in camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 9, "timestamp": "2:00", "duration_seconds": 17, "lyric_line": "The owner came here with nothing — now everything", "scene": {"mood": "peace", "colors": ["soft", "dawn"], "composition": "wide landscape", "camera_movement": "steady wide", "description": "peace scene: The owner came here with nothing — now everything — wide landscape with soft, dawn palette, steady wide camera."}}
{"song": "Corner Store", "artist": "Bodega Dreams", "mood_arc": "small world → big dreams", "beat": 10, "timestamp": "2:15", "duration_seconds": 18, "lyric_line": "The corner store is the first chapter.", "scene": {"mood": "ending", "colors": ["fade", "single accent"], "composition": "empty frame", "camera_movement": "dissolve", "description": "ending scene: The corner store is the first chapter. — empty frame with fade, single accent palette, dissolve camera."}}

129
training/scripts/augment_pairs.py Executable file
View File

@@ -0,0 +1,129 @@
#!/usr/bin/env python3
"""
augment_pairs.py — Training data augmentation: paraphrase and translate.
Usage:
python3 augment_pairs.py --input data.jsonl
python3 augment_pairs.py --input data.jsonl --paraphrases 3 --langs es,fr,de
python3 augment_pairs.py --input data.jsonl --llm-endpoint http://localhost:11434/v1
"""
import json, os, sys, re, random
from pathlib import Path
random.seed(42)
PARAPHRASE_TRANSFORMS = [
lambda s: re.sub(r"(\w+), (\w+)", r"\2, \1", s, count=1),
lambda s: f"A beautifully rendered scene: {s[0].lower()}{s[1:]}" if len(s) > 10 else s,
lambda s: s.replace("A ", "The ").replace("An ", "The ") if s.startswith(("A ", "An ")) else f"Here, {s[0].lower()}{s[1:]}",
lambda s: f"In a cinematic frame: {s}" if len(s) > 20 else s,
lambda s: s if ", " not in s else ", ".join(s.split(", ")[:2]),
]
TRANSLATIONS = {
"es": {"the":"el","a":"un","is":"es","in":"en","of":"de","and":"y","with":"con","scene":"escena","light":"luz","dark":"oscuro","warm":"cálido","rain":"lluvia","sun":"sol","moon":"luna","sky":"cielo","forest":"bosque","mountain":"montaña","ocean":"océano","golden":"dorado","blue":"azul","red":"rojo","green":"verde","silence":"silencio","dream":"sueño","love":"amor","hope":"esperanza","fear":"miedo","joy":"alegría","peace":"paz","beautiful":"hermoso","sad":"triste","shadow":"sombra","color":"color","silver":"plateado","white":"blanco","black":"negro","portray":"retrato"},
"fr": {"the":"le","a":"un","is":"est","in":"dans","of":"de","and":"et","with":"avec","scene":"scène","light":"lumière","dark":"sombre","warm":"chaud","rain":"pluie","sun":"soleil","moon":"lune","sky":"ciel","forest":"forêt","mountain":"montagne","ocean":"océan","golden":"doré","blue":"bleu","red":"rouge","green":"vert","silence":"silence","dream":"rêve","love":"amour","hope":"espoir","fear":"peur","joy":"joie","peace":"paix","beautiful":"beau","sad":"triste","shadow":"ombre","color":"couleur","silver":"argenté","white":"blanc","black":"noir"},
"de": {"the":"der","a":"ein","is":"ist","in":"in","of":"von","and":"und","with":"mit","scene":"Szene","light":"Licht","dark":"dunkel","warm":"warm","rain":"Regen","sun":"Sonne","moon":"Mond","sky":"Himmel","forest":"Wald","mountain":"Berg","ocean":"Ozean","golden":"golden","blue":"blau","red":"rot","green":"grün","silence":"Stille","dream":"Traum","love":"Liebe","hope":"Hoffnung","fear":"Angst","joy":"Freude","peace":"Frieden","beautiful":"schön","sad":"traurig","shadow":"Schatten","color":"Farbe","silver":"silbern","white":"weiß","black":"schwarz"},
}
LANG_NAMES = {"es": "Spanish", "fr": "French", "de": "German"}
def detect_text_field(entry):
for f in ["rich","terse","text","content","lyric_line","description","scene_description","prompt","scene"]:
if f in entry and isinstance(entry[f], str) and len(entry[f]) > 5:
return f
for k, v in entry.items():
if isinstance(v, str) and len(v) > 5:
return k
return None
def paraphrase(text):
t = random.choice(PARAPHRASE_TRANSFORMS)(text)
if t == text:
t = text.replace(" and ", " & ").replace(" with ", " alongside ")
if t == text:
t = f"In this scene: {text[0].lower()}{text[1:]}" if text[0].isupper() else text
return t
def translate(text, lang):
d = TRANSLATIONS.get(lang, {})
words = text.split()
out = []
for w in words:
lo = w.lower().strip(".,;:!?")
suf = w[len(w.rstrip(".,;:!?")):]
if lo in d:
out.append(d[lo] + suf)
else:
out.append(w)
return " ".join(out)
def augment_file(input_path, output_path=None, n_para=3, langs=None, llm_endpoint=None):
input_path = Path(input_path)
if output_path is None:
output_path = input_path.parent / f"{input_path.stem}_augmented{input_path.suffix}"
entries = [json.loads(l) for l in open(input_path) if l.strip()]
if not entries:
print(f"No entries in {input_path}"); return 0
tf = detect_text_field(entries[0])
if not tf:
print(f"ERROR: No text field in {input_path}", file=sys.stderr); return 0
print(f"Input: {input_path} ({len(entries)} entries, field={tf})")
aug_count = 0
with open(output_path, "w") as out:
for e in entries:
out.write(json.dumps(e, ensure_ascii=False) + "\n")
for i, e in enumerate(entries):
text = e[tf]
# Paraphrases
for p in range(n_para):
para = paraphrase(text)
if para != text:
ne = dict(e); ne[tf] = para
ne["_augmentation"] = f"paraphrase_{p+1}"
ne["_original"] = text[:100]
out.write(json.dumps(ne, ensure_ascii=False) + "\n")
aug_count += 1
# Translations
for lang in (langs or []):
tr = translate(text, lang)
if tr != text:
ne = dict(e); ne[tf] = tr
ne["_augmentation"] = f"translate_{lang}"
ne["_language"] = lang
ne["_original"] = text[:100]
out.write(json.dumps(ne, ensure_ascii=False) + "\n")
aug_count += 1
if (i+1) % 100 == 0:
print(f" {i+1}/{len(entries)} done ({aug_count} augmented)")
total = len(entries) + aug_count
print(f"Done: {len(entries)} originals + {aug_count} augmented = {total}")
print(f"Output: {output_path}")
return aug_count
def main():
import argparse
p = argparse.ArgumentParser()
p.add_argument("--input", required=True)
p.add_argument("--output", default=None)
p.add_argument("--paraphrases", type=int, default=3)
p.add_argument("--langs", default="es,fr,de")
p.add_argument("--llm-endpoint", default=None)
args = p.parse_args()
langs = [l.strip() for l in args.langs.split(",") if l.strip()] if args.langs else []
augment_file(args.input, args.output, args.paraphrases, langs, args.llm_endpoint)
if __name__ == "__main__":
main()