---
name: ascii-video
description: "Production pipeline for ASCII art video, any format. Converts video/audio/images/generative input into colored ASCII character video output (MP4, GIF, image sequence). Covers: video-to-ASCII conversion, audio-reactive music visualizers, generative ASCII art animations, hybrid video+audio reactive, text/lyrics overlays, real-time terminal rendering. Use when users request: ASCII video, text art video, terminal-style video, character art animation, retro text visualization, audio visualizer in ASCII, converting video to ASCII art, matrix-style effects, or any animated ASCII output."
---

# ASCII Video Production Pipeline

Full production pipeline for rendering any content as colored ASCII character video.

## Modes

| Mode | Input | Output | Read |
|------|-------|--------|------|
| **Video-to-ASCII** | Video file | ASCII recreation of source footage | `references/inputs.md` § Video Sampling |
| **Audio-reactive** | Audio file | Generative visuals driven by audio features | `references/inputs.md` § Audio Analysis |
| **Generative** | None (or seed params) | Procedural ASCII animation | `references/effects.md` |
| **Hybrid** | Video + audio | ASCII video with audio-reactive overlays | Both input refs |
| **Lyrics/text** | Audio + text/SRT | Timed text with visual effects | `references/inputs.md` § Text/Lyrics |
| **TTS narration** | Text quotes + TTS API | Narrated testimonial/quote video with typed text | `references/inputs.md` § TTS Integration |

## Stack

Single self-contained Python script per project. No GPU.


| Layer | Tool | Purpose |
|-------|------|---------|
| Core | Python 3.10+, NumPy | Math, array ops, vectorized effects |
| Signal | SciPy | FFT, peak detection (audio modes only) |
| Imaging | Pillow (PIL) | Font rasterization, video frame decoding, image I/O |
| Video I/O | ffmpeg (CLI) | Decode input, encode output segments, mux audio, mix tracks |
| Parallel | concurrent.futures / multiprocessing | N workers for batch/clip rendering |
| TTS | ElevenLabs API (or similar) | Generate narration clips for quote/testimonial videos |
| Optional | OpenCV | Video frame sampling, edge detection, optical flow |

## Pipeline Architecture (v2)

Every mode follows the same 6-stage pipeline. See `references/architecture.md` for implementation details, `references/scenes.md` for scene protocol, and `references/composition.md` for multi-grid composition and tonemap.

```
┌──────────┐ ┌───────────┐ ┌────────────┐ ┌───────────┐ ┌─────────┐ ┌──────────┐
│ 1.INPUT  │→│ 2.ANALYZE │→│ 3.SCENE_FN │→│ 4.TONEMAP │→│ 5.SHADE │→│ 6.ENCODE │
│ load src │ │ features  │ │ → canvas   │ │ normalize │ │ post-fx │ │ → video  │
└──────────┘ └───────────┘ └────────────┘ └───────────┘ └─────────┘ └──────────┘
```

1. **INPUT**: Load/decode source material (video frames, audio samples, images, or nothing)
2. **ANALYZE**: Extract per-frame features (audio bands, video luminance/edges, motion vectors)
3. **SCENE_FN**: Scene function renders directly to pixel canvas (`uint8 H,W,3`). May internally compose multiple character grids via `_render_vf()` + pixel blend modes. See `references/composition.md`
4. **TONEMAP**: Percentile-based adaptive brightness normalization with per-scene gamma. Replaces linear brightness multipliers. See `references/composition.md` § Adaptive Tonemap
5. **SHADE**: Apply post-processing `ShaderChain` + `FeedbackBuffer`. See `references/shaders.md`
6. **ENCODE**: Pipe raw RGB frames to ffmpeg for H.264/GIF encoding

## Creative Direction

**Every project should look and feel different.** The references provide a vocabulary of building blocks; don't copy them verbatim. Combine, modify, and invent.

### Aesthetic Dimensions to Vary

| Dimension | Options | Reference |
|-----------|---------|-----------|
| **Character palette** | Density ramps, block elements, symbols, scripts (katakana, Greek, runes, braille), dots, project-specific | `architecture.md` § Character Palettes |
| **Color strategy** | HSV (angle/distance/time/value mapped), OKLAB/OKLCH (perceptually uniform), discrete RGB palettes, auto-generated harmony (complementary/triadic/analogous/tetradic), monochrome, temperature | `architecture.md` § Color System |
| **Color tint** | Warm, cool, amber, matrix green, neon pink, sepia, ice, blood, void, sunset | `shaders.md` § Color Grade |
| **Background texture** | Sine fields, fBM noise, domain warp, voronoi cells, reaction-diffusion, cellular automata, video source | `effects.md` § Background Fills, Noise-Based Fields, Simulation-Based Fields |
| **Primary effects** | Rings, spirals, tunnel, vortex, waves, interference, aurora, ripple, fire, strange attractors, SDFs (geometric shapes with smooth booleans) | `effects.md` § Radial / Wave / Fire / SDF-Based Fields |
| **Particles** | Energy sparks, snow, rain, bubbles, runes, binary data, orbits, gravity wells, flocking boids, flow-field followers, trail-drawing particles | `effects.md` § Particle Systems |
| **Shader mood** | Retro CRT, clean modern, glitch art, cinematic, dreamy, harsh industrial, psychedelic | `shaders.md` § Design Philosophy |
| **Grid density** | xs(8px) through xxl(40px), mixed per layer | `architecture.md` § Grid System |
| **Font** | Menlo, Monaco, Courier, SF Mono, JetBrains Mono, Fira Code, IBM Plex | `architecture.md` § Font Selection |
| **Coordinate space** | Cartesian, polar, tiled, rotated, skewed, fisheye, twisted, Möbius, domain-warped | `effects.md` § Coordinate Transforms |
| **Mirror mode** | None, horizontal, vertical, quad, diagonal, kaleidoscope | `shaders.md` § Mirror Effects |
| **Masking** | Circle, rect, ring, gradient, text stencil, value-field-as-mask, animated iris/wipe/dissolve | `composition.md` § Masking |
| **Temporal motion** | Static, audio-reactive, eased keyframes, morphing between fields, temporal noise (smooth in-place evolution) | `effects.md` § Temporal Coherence |
| **Transition style** | Crossfade, wipe (directional/radial), dissolve, glitch cut, iris open/close, mask-based reveal | `shaders.md` § Transitions, `composition.md` § Animated Masks |
| **Aspect ratio** | Landscape (16:9), portrait (9:16), square (1:1), ultrawide (21:9) | `architecture.md` § Resolution Presets |

### Per-Section Variation

Never use the same config for the entire video. For each section/scene/quote:

- Choose a **different background effect** (or compose 2-3)
- Choose a **different character palette** (match the mood)
- Choose a **different color strategy** (or at minimum a different hue)
- Vary **shader intensity** (more bloom during peaks, more grain during quiet)
- Use **different particle types** if particles are active

### Project-Specific Invention

For every project, invent at least one of:

- A custom character palette matching the theme
- A custom background effect (combine/modify existing ones)
- A custom color palette (discrete RGB set matching the brand/mood)
- A custom particle character set

## Workflow

### Step 1: Determine Mode and Gather Requirements

Establish with the user:

- **Input source**: file path, format, duration
- **Mode**: which of the 6 modes above
- **Sections**: time-mapped style changes (timestamps → effect names)
- **Resolution**: landscape 1920x1080 (default), portrait 1080x1920, square 1080x1080 @ 24fps; GIFs typically 640x360 @ 15fps
- **Style direction**: dense/sparse, bright/dark, chaotic/minimal, color palette
- **Text/branding**: easter eggs, overlays, credits, themed character sets
- **Output format**: MP4 (default), GIF, PNG sequence
- **Aspect ratio**: landscape (16:9), portrait (9:16 for TikTok/Reels/Stories), square (1:1 for IG feed)

### Step 2: Detect Hardware and Set Quality

Before building the script, detect the user's hardware and set appropriate defaults. See `references/optimization.md` § Hardware Detection.

```python
hw = detect_hardware()
profile = quality_profile(hw, target_duration, user_quality_pref)
log(f"Hardware: {hw['cpu_count']} cores, {hw['mem_gb']:.1f}GB RAM")
log(f"Render: {profile['vw']}x{profile['vh']} @{profile['fps']}fps, {profile['workers']} workers")
```

Never hardcode worker counts, resolution, or CRF. Always detect and adapt.

### Step 3: Build the Script

Write as a single Python file. Major components:

1. **Hardware detection + quality profile**: see `references/optimization.md`
2. **Input loader**: mode-dependent; see `references/inputs.md`
3. **Feature analyzer**: audio FFT, video luminance, or pass-through
4. **Grid + renderer**: multi-density character grids with bitmap cache; `_render_vf()` helper for value/hue field → canvas
5. **Character palettes**: multiple palettes chosen per project theme; see `references/architecture.md`
6. **Color system**: HSV + discrete RGB palettes as needed; see `references/architecture.md`
7. **Scene functions**: each returns `canvas (uint8 H,W,3)` directly. May compose multiple grids internally via pixel blend modes. See `references/scenes.md` + `references/composition.md`
8. **Tonemap**: adaptive brightness normalization with per-scene gamma; see `references/composition.md`
9. **Shader pipeline**: `ShaderChain` + `FeedbackBuffer` per-section config; see `references/shaders.md`
10. **Scene table + dispatcher**: maps time ranges to scene functions + shader/feedback configs; see `references/scenes.md`
11. **Parallel encoder**: N-worker batch clip rendering with ffmpeg pipes
12. **Main**: orchestrate full pipeline

### Step 4: Handle Critical Bugs

#### Font Cell Height (macOS Pillow)

`textbbox()` returns the wrong height. Use `font.getmetrics()`:

```python
ascent, descent = font.getmetrics()
cell_height = ascent + descent  # correct
```

#### ffmpeg Pipe Deadlock

Never use `stderr=subprocess.PIPE` with long-running ffmpeg: if nothing reads the pipe, its buffer fills and ffmpeg blocks. Redirect to a file:

```python
# ffmpeg logs progress to stderr; send it to a file so the buffer never fills
stderr_fh = open(err_path, "w")
pipe = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.DEVNULL, stderr=stderr_fh)
```

#### Brightness: Use `tonemap()`, Not Linear Multipliers

ASCII on black is inherently dark. This is the #1 visual issue. **Do NOT use linear `* N` brightness multipliers**: they clip highlights and wash out the image. Instead, use the **adaptive tonemap** function from `references/composition.md`:

```python
import numpy as np

def tonemap(canvas, gamma=0.75):
    """Percentile-based adaptive normalization + gamma. Replaces all brightness multipliers."""
    f = canvas.astype(np.float32)
    lo = np.percentile(f, 1)     # black point (1st percentile)
    hi = np.percentile(f, 99.5)  # white point (99.5th percentile)
    if hi - lo < 1:
        hi = lo + 1              # avoid division by ~0 on flat frames
    f = (f - lo) / (hi - lo)
    f = np.clip(f, 0, 1) ** gamma  # gamma < 1 = brighter mids
    return (f * 255).astype(np.uint8)
```

Pipeline ordering: `scene_fn() → tonemap() → FeedbackBuffer → ShaderChain → ffmpeg`

Per-scene gamma overrides for destructive effects:

- Default: `gamma=0.75`
- Solarize scenes: `gamma=0.55` (solarize darkens above-threshold pixels)
- Posterize scenes: `gamma=0.50` (quantization loses brightness range)
- Already-bright scenes: `gamma=0.85`

Additional brightness best practices:

- Dense animated backgrounds: never flat black, always fill the grid
- Vignette minimum clamped to 0.15 (not 0.12)
- Bloom threshold lowered to 130 (not 170) so more pixels contribute to glow
- Use `screen` blend mode (not `overlay`) when compositing dark ASCII layers: overlay squares dark values, e.g. `2 * 0.12 * 0.12 ≈ 0.03`

#### Font Compatibility

Not all Unicode characters render in all fonts. Validate palettes at init:

```python
from PIL import Image, ImageDraw

kept = []
for c in palette:
    img = Image.new("L", (20, 20), 0)
    ImageDraw.Draw(img).text((0, 0), c, fill=255, font=font)
    if c.isspace() or np.array(img).max() > 0:  # a space is legitimately blank
        kept.append(c)
    else:
        log(f"WARNING: char '{c}' (U+{ord(c):04X}) not in font, removing from palette")
palette = kept  # drop the characters that rendered blank
```

### Step 4b: Per-Clip Architecture (for segmented videos)

When the video has discrete segments (quotes, scenes, chapters), render each as a separate clip file. This enables:

- Re-rendering individual clips without touching the rest (`--clip q05`)
- Faster iteration on specific sections
- Easy reordering or trimming in post

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

segments = [
    {"id": "intro", "start": 0.0, "end": 5.0, "type": "intro"},
    {"id": "q00", "start": 5.0, "end": 12.0, "type": "quote", "qi": 0, ...},
    {"id": "t00", "start": 12.0, "end": 13.5, "type": "transition", ...},
    {"id": "outro", "start": 208.0, "end": 211.6, "type": "outro"},
]

# Worker count comes from the quality profile built in Step 2
with ProcessPoolExecutor(max_workers=profile["workers"]) as pool:
    futures = {pool.submit(render_clip, seg, features, path): seg["id"]
               for seg, path in clip_args}
    for fut in as_completed(futures):
        fut.result()  # re-raises any exception from the worker
```

CLI: `--clip q00 t00 q01` to re-render specific clips, `--list` to show segments, `--skip-render` to re-stitch only.

### Step 5: Render and Iterate

Performance targets per frame:

| Component | Budget |
|-----------|--------|
| Feature extraction | 1-5ms |
| Effect function | 2-15ms |
| Character render | 80-150ms (bottleneck) |
| Shader pipeline | 5-25ms |
| **Total** | ~100-200ms/frame |

**Fast iteration**: render single test frames to check brightness/layout before full render:

```python
canvas = render_single_frame(frame_index, features, renderer)
Image.fromarray(canvas).save("test.png")
```

**Brightness verification**: sample 5-10 frames across the video and check that the mean pixel value is above 8 (`mean > 8`) for ASCII content.

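The brightness check above can be scripted. The following is a minimal sketch, assuming each frame is available as a `uint8 (H, W, 3)` NumPy array; `sample_indices`, `check_brightness`, and the `render_frame` callable are hypothetical names for illustration and are not defined in the reference files.

```python
import numpy as np

def sample_indices(total_frames, n_samples=8):
    """Evenly spaced frame indices across the video (hypothetical helper)."""
    n = max(1, min(n_samples, total_frames))
    return np.linspace(0, total_frames - 1, n).round().astype(int).tolist()

def check_brightness(render_frame, total_frames, n_samples=8, min_mean=8.0):
    """Return (frame_index, mean) for sampled frames that fail `mean > min_mean`.

    render_frame: callable(frame_index) -> uint8 array of shape (H, W, 3).
    An empty result means every sampled frame passed the check.
    """
    if total_frames <= 0:
        return []
    too_dark = []
    for idx in sample_indices(total_frames, n_samples):
        mean = float(render_frame(idx).mean())
        if mean <= min_mean:
            too_dark.append((idx, mean))
    return too_dark
```

In a project script, `render_frame` could wrap the single-frame render shown above, for example `lambda i: render_single_frame(i, features, renderer)`. Frames that come back in the list are candidates for a lower per-scene gamma or a denser background.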

## References

| File | Contents |
|------|----------|
| `references/architecture.md` | Grid system (landscape/portrait/square resolution presets), font selection, character palettes (library of 20+), color system (HSV + OKLAB/OKLCH + discrete RGB + color harmony generation + perceptual gradient interpolation), `_render_vf()` helper, compositing, v2 effect function contract |
| `references/inputs.md` | All input sources: audio analysis, video sampling, image conversion, text/lyrics, TTS integration (ElevenLabs, voice assignment, audio mixing) |
| `references/effects.md` | Effect building blocks: 20+ value field generators (trig, noise/fBM, domain warp, voronoi, reaction-diffusion, cellular automata, strange attractors, SDFs), 8 hue field generators, coordinate transforms (rotate/tile/polar/Möbius), temporal coherence (easing, keyframes, morphing), radial/wave/fire effects, advanced particles (flocking, flow fields, trails), composing guide |
| `references/shaders.md` | 38 shader implementations (geometry, channel, color, glow, noise, pattern, tone, glitch, mirror), `ShaderChain` class, full `_apply_shader_step()` dispatch, audio-reactive scaling, transitions, tint presets |
| `references/composition.md` | **v2 core**: pixel blend modes (20 modes with implementations), multi-grid composition, `_render_vf()` helper, adaptive `tonemap()`, per-scene gamma, `FeedbackBuffer` with spatial transforms, `PixelBlendStack`, masking/stencil system (shape masks, text stencils, animated masks, boolean ops) |
| `references/scenes.md` | **v2 scene protocol**: scene function contract (local time convention), `Renderer` class, `SCENES` table structure, `render_clip()` loop, beat-synced cutting, parallel rendering + pickling constraints, 4 complete scene examples, scene design checklist |
| `references/design-patterns.md` | **Scene composition patterns**: layer hierarchy (bg/content/accent), directional parameter arcs vs oscillation, scene concepts and visual metaphors, counter-rotating dual systems, wave collision, progressive fragmentation, entropy/consumption, staggered layer entry (crescendo), scene ordering |
| `references/troubleshooting.md` | NumPy broadcasting traps, blend mode pitfalls, multiprocessing/pickling issues, brightness diagnostics, ffmpeg deadlocks, font issues, performance bottlenecks, common mistakes |
| `references/optimization.md` | Hardware detection, adaptive quality profiles (draft/preview/production/max), CLI integration, vectorized effect patterns, parallel rendering, memory management |