Timmy-time-dashboard/docs/research/kimi-creative-blueprint-891.md
Alexander Whitestone 8522a25350
docs: add research summary for Kimi creative blueprint (issue #891)
Extracts and summarizes the 16-page Kimi.ai PDF "Building Timmy: a
technical blueprint for sovereign creative AI" filed as issue #891.

Covers all creative domains (code, music, art, writing, world building),
identity/memory architecture, multi-agent sub-systems, economic engine,
pioneer case studies (Botto, Neuro-sama, Truth Terminal), recommended
implementation sequence, and gap analysis against current codebase.

Identifies 6 ADR candidates and maps blueprint capabilities to existing
packages (brain/, lightning/, spark/, infrastructure/event_bus).

Refs #891

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 21:45:34 -04:00


Building Timmy: Technical Blueprint for Sovereign Creative AI

Source: PDF attached to issue #891, "Building Timmy: a technical blueprint for sovereign creative AI" (generated by Kimi.ai, 16 pages, filed by Perplexity for Timmy's review).
Filed: 2026-03-22 · Reviewed: 2026-03-23


Executive Summary

The blueprint establishes that a sovereign creative AI capable of coding, composing music, generating art, building worlds, publishing narratives, and managing its own economy is technically feasible today — but only through orchestration of dozens of tools operating at different maturity levels. The core insight: the integration is the invention. No single component is new; the missing piece is a coherent identity operating across all domains simultaneously with persistent memory, autonomous economics, and cross-domain creative reactions.

Three non-negotiable architectural decisions:

  1. Human oversight for all public-facing content — every successful creative AI has this; every one that removed it failed.
  2. Legal entity before economic activity — AI agents are not legal persons; establish structure before wealth accumulates (Truth Terminal cautionary tale: $20M acquired before a foundation was retroactively created).
  3. Hybrid memory: vector search + knowledge graph — neither alone is sufficient for multi-domain context breadth.

Domain-by-Domain Assessment

Software Development (immediately deployable)

| Component | Recommendation | Notes |
|---|---|---|
| Primary agent | Claude Code (Opus 4.6, 77.2% SWE-bench) | Already in use |
| Self-hosted forge | Forgejo (MIT, 170-200MB RAM) | Project uses Gitea/Forgejo now |
| CI/CD | GitHub Actions-compatible via act_runner | |
| Tool-making | LATM pattern: frontier model creates tools, cheaper model applies them | New; see ADR opportunity |
| Open-source fallback | OpenHands (~65% SWE-bench, Docker sandboxed) | Backup to Claude Code |
| Self-improvement | Darwin Gödel Machine / SICA patterns | 3-6 month investment |

Development estimate: 2-3 weeks for Forgejo + Claude Code integration with automated PR workflows; 1-2 months for self-improving tool-making pipeline.

Cross-reference: This project already runs Claude Code agents on Forgejo. The LATM pattern (tool registry) and self-improvement loop are the actionable gaps.
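The LATM gap named above can be pictured as a small registry: a tool's source text is accepted only if it passes its own test cases, and is then cached for cheaper callers. The following is a minimal sketch under assumptions, not the blueprint's or this repo's implementation. `ToolRegistry`, `register`, `apply`, and the `slugify` sample are hypothetical names, and the source string stands in for what a frontier model might return.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolRegistry:
    """Caches tools that passed their own test cases (hypothetical design)."""

    tools: dict[str, Callable[..., Any]] = field(default_factory=dict)

    def register(self, name: str, source: str, tests: list[tuple[tuple, Any]]) -> bool:
        """Compile `source`, run it against `tests`, and cache it only on success."""
        namespace: dict[str, Any] = {}
        try:
            exec(source, namespace)  # NOTE: a real deployment must sandbox this step
            func = namespace[name]
            passed = all(func(*args) == expected for args, expected in tests)
        except Exception:
            return False
        if passed:
            self.tools[name] = func
        return passed

    def apply(self, name: str, *args: Any) -> Any:
        """What the cheaper model would call instead of re-deriving the logic."""
        return self.tools[name](*args)


# Stand-in for source text a frontier model might return (assumption).
SLUGIFY_SOURCE = '''
def slugify(title):
    return "-".join(title.lower().split())
'''

registry = ToolRegistry()
accepted = registry.register("slugify", SLUGIFY_SOURCE, [(("Hello World",), "hello-world")])
```

Gating the cache on tests is the design choice that matters here: a tool that fails its cases never becomes callable, so a bad generation costs one rejected registration rather than repeated bad outputs.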


Music (1-4 weeks)

| Component | Recommendation | Notes |
|---|---|---|
| Commercial vocals | Suno v5 API (~$0.03/song, $30/month Premier) | No official API; third-party: sunoapi.org, AIMLAPI, EvoLink |
| Local instrumental | MusicGen 1.5B (CC-BY-NC: monetization blocker) | On M2 Max: ~60s for 5s clip |
| Voice cloning | GPT-SoVITS v4 (MIT) | Works on Apple Silicon CPU, RTF 0.526 on M4 |
| Voice conversion | RVC (MIT, 5-10 min training audio) | |
| Apple Silicon TTS | MLX-Audio: Kokoro 82M + Qwen3-TTS 0.6B | 45x faster via Metal |
| Publishing | Wavlake (90/10 split, Lightning micropayments) | Auto-syndicates to Fountain.fm |
| Nostr | NIP-94 (kind:1063) audio events → NIP-96 servers | |
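To make the Nostr row concrete, here is a rough sketch of an unsigned kind:1063 file-metadata event for a hosted audio file. The `url`, `m`, `x`, and `size` tags and the NIP-01 id serialization follow the public NIP texts as I understand them. The function name, the example URL, and the placeholder pubkey are assumptions, and signing is left out entirely.

```python
import hashlib
import json
import time


def build_file_metadata_event(pubkey_hex, url, audio_bytes, mime_type, caption, created_at=None):
    """Build an unsigned kind:1063 event describing a hosted audio file."""
    tags = [
        ["url", url],
        ["m", mime_type],
        ["x", hashlib.sha256(audio_bytes).hexdigest()],  # content hash of the file
        ["size", str(len(audio_bytes))],
    ]
    event = {
        "pubkey": pubkey_hex,
        "created_at": int(time.time()) if created_at is None else created_at,
        "kind": 1063,
        "tags": tags,
        "content": caption,
    }
    # NIP-01: the event id is the SHA-256 of this compact JSON array.
    serialized = json.dumps(
        [0, event["pubkey"], event["created_at"], event["kind"], event["tags"], event["content"]],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    event["id"] = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
    return event  # a "sig" field must still be added before relays will accept it


event = build_file_metadata_event(
    "ab" * 32,  # placeholder pubkey, not a real key
    "https://example.com/track.mp3",
    b"fake-audio",
    "audio/mpeg",
    "Demo track",
    created_at=1700000000,
)
```

The same SHA-256 digest used for the `x` tag is what a content-addressed host would key the file by, so it only needs to be computed once per upload.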

Copyright reality: per the US Copyright Office (Jan 2025) and the US Court of Appeals (Mar 2025), purely AI-generated music cannot be copyrighted and enters the public domain. Wavlake's Value4Value model works around this: fans pay for the relationship, not for exclusive rights.

Avoid: Udio (download disabled since Oct 2025, 2.4/5 Trustpilot).


Visual Art (1-3 weeks)

| Component | Recommendation | Notes |
|---|---|---|
| Local generation | ComfyUI API at 127.0.0.1:8188 (programmatic control via WebSocket) | MLX extension: 50-70% faster |
| Speed | Draw Things (free, Mac App Store) | 3× faster than ComfyUI via Metal shaders |
| Quality frontier | Flux 2 (Nov 2025, 4MP, multi-reference) | SDXL needs 16GB+, Flux Dev 32GB+ |
| Character consistency | LoRA training (30 min, 15-30 references) + Flux.1 Kontext | Solved problem |
| Face consistency | IP-Adapter + FaceID (ComfyUI-IP-Adapter-Plus) | Training-free |
| Comics | Jenova AI ($20/month, 200+ page consistency) or LlamaGen AI (free) | |
| Publishing | Blossom protocol (SHA-256 addressed, kind:10063) + Nostr NIP-94 | |
| Physical | Printful REST API (200+ products, automated fulfillment) | |
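As a sketch of the "ComfyUI API client" piece, the snippet below builds the HTTP request that queues a workflow on a local ComfyUI instance. It assumes the `POST /prompt` endpoint with a `prompt` plus `client_id` JSON body, as used in ComfyUI's own API example script, and a workflow dictionary exported in API format. The helper names are hypothetical, and the one-node workflow is a placeholder rather than a usable graph.

```python
import json
import urllib.request
import uuid

COMFYUI_URL = "http://127.0.0.1:8188"  # address given in the table above


def build_prompt_request(workflow, client_id=None):
    """Build (but do not send) the POST /prompt request that queues a workflow."""
    client_id = client_id or uuid.uuid4().hex
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    request = urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return request, client_id


def queue_prompt(workflow):
    """Send the request; this needs a running ComfyUI instance."""
    request, client_id = build_prompt_request(workflow)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["prompt_id"], client_id


# Placeholder workflow fragment; a real one comes from ComfyUI's API-format export.
request, client_id = build_prompt_request(
    {"3": {"class_type": "KSampler", "inputs": {}}}, client_id="timmy"
)
```

Keeping request construction separate from sending makes the client testable without a GPU box, and the returned `client_id` is what a WebSocket listener would use to follow progress for that job.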

Writing / Narrative (1-4 weeks for pipeline; ongoing for quality)

| Component | Recommendation | Notes |
|---|---|---|
| LLM | Claude Opus 4.5/4.6 (leads Mazur Writing Benchmark at 8.561) | Already in use |
| Context | 500K tokens (1M in beta); entire novels fit | |
| Architecture | Outline-first → RAG lore bible → chapter-by-chapter generation | Without outline: novels meander |
| Lore management | WorldAnvil Pro or custom LoreScribe (local RAG) | No tool achieves 100% consistency |
| Publishing (ebooks) | Pandoc → EPUB / KDP PDF | pandoc-novel template on GitHub |
| Publishing (print) | Lulu Press REST API (80% profit, global print network) | KDP: no official API, 3-book/day limit |
| Publishing (Nostr) | NIP-23 kind:30023 long-form events | Habla.news, YakiHonne, Stacker News |
| Podcasts | LLM script → TTS (ElevenLabs or local Kokoro/MLX-Audio) → feedgen RSS → Fountain.fm | Value4Value sats-per-minute |

Key constraint: AI-assisted (human directs, AI drafts) = 40% faster. Fully autonomous without editing = "generic, soulless prose" and character drift by chapter 3 without explicit memory.
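The outline-first architecture and the explicit-memory requirement can be sketched as a loop that feeds each chapter prompt its outline beat, matching lore entries, and a running summary. This is an illustrative sketch only. `retrieve_lore` uses naive keyword matching in place of a real RAG index, `generate` is a stand-in for whatever LLM call the pipeline uses, and all names and sample data are hypothetical.

```python
def retrieve_lore(lore_bible, beat, limit=2):
    """Naive retrieval: return lore entries whose name appears as a word in the beat."""
    beat_words = set(beat.lower().split())
    hits = [text for name, text in lore_bible.items() if name.lower() in beat_words]
    return hits[:limit]


def write_book(outline, lore_bible, generate):
    """Draft one chapter per outline beat, carrying a summary forward as memory."""
    chapters = []
    summary_so_far = ""
    for number, beat in enumerate(outline, start=1):
        lore = retrieve_lore(lore_bible, beat)
        prompt = (
            f"Chapter {number} beat: {beat}\n"
            f"Relevant lore: {' | '.join(lore) or 'none'}\n"
            f"Story so far: {summary_so_far or 'nothing yet'}"
        )
        chapters.append(generate(prompt))
        summary_so_far += f" Ch{number}: {beat}."
    return chapters


# Usage with a fake generator that records the prompts it receives.
prompts = []


def fake_generate(prompt):
    prompts.append(prompt)
    return f"draft {len(prompts)}"


chapters = write_book(
    ["Mira leaves home", "Mira reaches Eldholm"],
    {"Mira": "Mira is a cartographer.", "Eldholm": "Eldholm is a drowned city."},
    fake_generate,
)
```

The running summary is the cheapest possible form of the explicit memory the constraint calls for; a human editing pass would still sit between `generate` and publication.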


World Building / Games (2 weeks to 3 months depending on target)

| Component | Recommendation | Notes |
|---|---|---|
| Algorithms | Wave Function Collapse, Perlin noise (FastNoiseLite in Godot 4), L-systems | All mature |
| Platform | Godot Engine + gd-agentic-skills (82+ skills, 26 genre blueprints) | Strong LLM/GDScript knowledge |
| Narrative design | Knowledge graph (world state) + LLM + quest template grammar | CHI 2023 validated |
| Quick win | Luanti/Minetest (Lua API, 2,800+ open mods for reference) | Immediately feasible |
| Medium effort | OpenMW content creation (omwaddon format engineering required) | 2-3 months |
| Future | Unity MCP (AI direct Unity Editor interaction) | Early-stage |
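Of the algorithms listed, L-systems are the simplest to show in a few lines: every symbol in a string is rewritten in parallel according to a rule table. The sketch below uses Lindenmayer's classic algae rules as the example; the function and constant names are illustrative only.

```python
def expand_lsystem(axiom: str, rules: dict[str, str], iterations: int) -> str:
    """Rewrite every symbol in parallel, `iterations` times."""
    current = axiom
    for _ in range(iterations):
        # Symbols without a rule are copied through unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


ALGAE_RULES = {"A": "AB", "B": "A"}
print(expand_lsystem("A", ALGAE_RULES, 4))  # ABAABABA
```

For plants or road networks the output string would then be interpreted as drawing commands, which is where an engine such as Godot takes over.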

Identity Architecture (2 months)

The blueprint formalizes the SOUL.md standard (GitHub: aaronjmars/soul.md):

| File | Purpose |
|---|---|
| SOUL.md | Who you are: identity, worldview, opinions |
| STYLE.md | How you write: voice, syntax, patterns |
| SKILL.md | Operating modes |
| MEMORY.md | Session continuity |

Critical decision — static vs self-modifying identity:

  • Static Core Truths (version-controlled, human-approved changes only) ✓
  • Self-modifying Learned Preferences (logged with rollback, monitored by guardian) ✓
  • Warning: OpenClaw's "Soul Evolution" creates a security attack surface — Zenity Labs demonstrated a complete zero-click attack chain targeting SOUL.md files.

Relevance to this repo: Claude Code agents already use a MEMORY.md pattern in this project. The SOUL.md stack is a natural extension.
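The split between static Core Truths and self-modifying Learned Preferences could look roughly like the following: core values sit behind a read-only mapping, while preference changes are logged so they can be rolled back. This is a sketch under assumptions, not the SOUL.md standard's format. The `Identity` class, its methods, and the sample values are hypothetical, and the guardian monitor is not modelled.

```python
from dataclasses import dataclass, field
from types import MappingProxyType

_MISSING = object()  # marks "preference did not exist before this change"


@dataclass
class Identity:
    core_truths: MappingProxyType  # read-only view; edits go through version control
    preferences: dict = field(default_factory=dict)
    history: list = field(default_factory=list)  # (key, previous value) pairs

    def learn(self, key, value):
        """Update a learned preference, logging the old value for rollback."""
        if key in self.core_truths:
            raise PermissionError(f"{key!r} is a core truth; human approval required")
        self.history.append((key, self.preferences.get(key, _MISSING)))
        self.preferences[key] = value

    def rollback(self):
        """Undo the most recent preference change."""
        key, previous = self.history.pop()
        if previous is _MISSING:
            del self.preferences[key]
        else:
            self.preferences[key] = previous


timmy = Identity(MappingProxyType({"name": "Timmy"}))
timmy.learn("tone", "wry")
timmy.learn("tone", "earnest")
timmy.rollback()  # tone is "wry" again
```

Because the history is an append-only log, a guardian process could inspect it for suspicious drift before any change is accepted as permanent.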


Memory Architecture (2 months)

Hybrid vector + knowledge graph is the recommendation:

| Component | Tool | Notes |
|---|---|---|
| Vector + KG combined | Mem0 (mem0.ai) | 26% accuracy improvement over OpenAI memory, 91% lower p95 latency, 90% token savings |
| Vector store | Qdrant (Rust, open-source) | High-throughput with metadata filtering |
| Temporal KG | Neo4j + Graphiti (Zep AI) | P95 retrieval: 300ms, hybrid semantic + BM25 + graph |
| Backup/migration | AgentKeeper (95% critical fact recovery across model migrations) | |
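Why hybrid retrieval helps can be shown with a toy in-memory version: vector similarity picks the closest memory, then graph edges pull in related memories that similarity alone would rank low. This is a conceptual sketch, not the API of Mem0, Qdrant, or Graphiti. The function names, the two-dimensional embeddings, and the memory keys are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, embeddings, graph, top_k=1):
    """Vector search picks seed memories; the graph adds their direct neighbours."""
    ranked = sorted(embeddings, key=lambda key: cosine(query_vec, embeddings[key]), reverse=True)
    seeds = ranked[:top_k]
    related = [n for seed in seeds for n in graph.get(seed, []) if n not in seeds]
    return seeds + list(dict.fromkeys(related))  # de-duplicate, keep order


embeddings = {
    "song:dawn": [1.0, 0.0],
    "painting:dusk": [0.0, 1.0],
    "chapter:3": [0.7, 0.7],
}
graph = {"song:dawn": ["painting:dusk"]}  # e.g. the painting was made for the song

results = hybrid_retrieve([0.9, 0.1], embeddings, graph)
```

Here the painting is the least similar item to the query, yet it is returned because the graph records that it belongs with the song. That cross-domain link is the kind of context the blueprint argues vector search alone misses.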

Journal pattern (Stanford Generative Agents): Agent writes about experiences, generates high-level reflections 2-3x/day when importance scores exceed threshold. Ablation studies: removing any component (observation, planning, reflection) significantly reduces behavioral believability.
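The reflection trigger in the journal pattern amounts to an accumulator: each observation carries an importance score, and once the running total crosses a threshold the pending observations are handed off for reflection and the counter resets. The sketch below shows only that mechanism. The class name, the threshold of 10, and the scores are assumptions, and in practice an LLM would assign the importance values.

```python
class ReflectionTrigger:
    """Accumulates importance scores and releases a batch when a threshold is crossed."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.accumulated = 0.0
        self.pending = []

    def observe(self, text, importance):
        """Record an observation; return the batch to reflect on, or None."""
        self.pending.append(text)
        self.accumulated += importance
        if self.accumulated < self.threshold:
            return None
        batch, self.pending, self.accumulated = self.pending, [], 0.0
        return batch


trigger = ReflectionTrigger(threshold=10)
first = trigger.observe("published a track", 4)
second = trigger.observe("got one zap", 3)
third = trigger.observe("a listener asked for a sequel", 5)  # total 12, so a batch is released
```

Tuning the threshold is what sets the cadence: a higher value yields fewer, broader reflections per day.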

Cross-reference: The existing brain/ package is the memory system. Qdrant and Mem0 are the recommended upgrade targets.


Multi-Agent Sub-System (3-6 months)

The blueprint describes a named sub-agent hierarchy:

| Agent | Role |
|---|---|
| Oracle | Top-level planner / supervisor |
| Sentinel | Safety / moderation |
| Scout | Research / information gathering |
| Scribe | Writing / narrative |
| Ledger | Economic management |
| Weaver | Visual art generation |
| Composer | Music generation |
| Social | Platform publishing |

Orchestration options:

  • Agno (already in use) — microsecond instantiation, 50× less memory than LangGraph
  • CrewAI Flows — event-driven with fine-grained control
  • LangGraph — DAG-based with stateful workflows and time-travel debugging

Scheduling pattern (Stanford Generative Agents): Top-down recursive daily → hourly → 5-minute planning. Event interrupts for reactive tasks. Re-planning triggers when accumulated importance scores exceed threshold.
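The top-down recursive planning described here can be sketched as a tree: a long block is split into hour-sized children, and each of those into 5-minute steps. The code below shows only the time decomposition. `Block`, `decompose`, and `GRANULARITIES` are hypothetical names, and the `name_child` callback stands in for the LLM call that would decide what each sub-block actually contains.

```python
from dataclasses import dataclass, field

GRANULARITIES = [60, 5]  # minutes: split into hours first, then into 5-minute steps


@dataclass
class Block:
    task: str
    start: int  # minutes since midnight
    duration: int  # minutes
    children: list = field(default_factory=list)


def decompose(block, name_child, level=0):
    """Recursively split a block until it reaches the finest granularity."""
    # Skip granularities that are not finer than the block itself.
    while level < len(GRANULARITIES) and block.duration <= GRANULARITIES[level]:
        level += 1
    if level == len(GRANULARITIES):
        return block  # already a leaf
    step = GRANULARITIES[level]
    for index, offset in enumerate(range(0, block.duration, step)):
        length = min(step, block.duration - offset)
        child = Block(name_child(block.task, index), block.start + offset, length)
        block.children.append(decompose(child, name_child, level + 1))
    return block


# A two-hour block starting at 09:00, with a trivial naming callback.
day = decompose(Block("compose a song", 9 * 60, 120), lambda task, i: f"{task} #{i + 1}")
```

An event interrupt would replace the affected subtree and call `decompose` again from that node, leaving the rest of the day's plan intact.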

Cross-reference: The existing spark/ package (event capture, advisory engine) aligns with this architecture. infrastructure/event_bus is the choreography backbone.


Economic Engine (1-4 weeks)

Lightning Labs released lightning-agent-tools (open-source) in February 2026:

  • lnget — CLI HTTP client for L402 payments
  • Remote signer architecture (private keys on separate machine from agent)
  • Scoped macaroon credentials (pay-only, invoice-only, read-only roles)
  • Aperture — converts any API to pay-per-use via L402 (HTTP 402)

| Option | Effort | Notes |
|---|---|---|
| ln.bot | 1 week | "Bitcoin for AI Agents": 3 commands create a wallet; CLI + MCP + REST |
| LND via gRPC | 2-3 weeks | Full programmatic node management for production |
| Coinbase Agentic Wallets | | Fiat-adjacent; less aligned with sovereignty ethos |

Revenue channels: Wavlake (music, 90/10 Lightning), Nostr zaps (articles), Stacker News (earn sats from engagement), Printful (physical goods), L402-gated API access (pay-per-use services), Geyser.fund (Lightning crowdfunding, better initial runway than micropayments).

Cross-reference: The existing lightning/ package in this repo is the foundation. L402 paywall endpoints for Timmy's own services are the actionable gap.
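For orientation, an L402 exchange works roughly like this: the server answers HTTP 402 with a `WWW-Authenticate` challenge carrying a macaroon and a Lightning invoice, the client pays the invoice to obtain a preimage, and then retries with both in the `Authorization` header. The sketch below covers only header handling and the preimage check. It is based on my reading of the L402 specification, not on the lightning/ package or lightning-agent-tools, and the helper names and sample values are hypothetical.

```python
import hashlib
import re


def parse_l402_challenge(www_authenticate):
    """Extract the macaroon and invoice from a 402 response's WWW-Authenticate header."""
    scheme, _, params = www_authenticate.partition(" ")
    if scheme not in ("L402", "LSAT"):  # LSAT is the protocol's earlier name
        raise ValueError(f"unsupported scheme: {scheme!r}")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', params))
    return fields["macaroon"], fields["invoice"]


def preimage_matches(preimage_hex, payment_hash_hex):
    """A Lightning payment is proven by a preimage whose SHA-256 is the payment hash."""
    return hashlib.sha256(bytes.fromhex(preimage_hex)).hexdigest() == payment_hash_hex


def build_authorization(macaroon, preimage_hex):
    """Header value for the retried request after the invoice has been paid."""
    return f"L402 {macaroon}:{preimage_hex}"


# Made-up challenge values for illustration; not a real macaroon or invoice.
macaroon, invoice = parse_l402_challenge('L402 macaroon="AGIAJEem=", invoice="lnbc1fake"')
preimage = "00" * 32
payment_hash = hashlib.sha256(bytes.fromhex(preimage)).hexdigest()
header = build_authorization(macaroon, preimage)
```

On the server side the same preimage check is what lets a paywalled endpoint verify payment without contacting the payer's wallet.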


Pioneer Case Studies

| Agent | Active | Revenue | Key Lesson |
|---|---|---|---|
| Botto | Since Oct 2021 | $5M+ (art auctions) | Community governance via DAO sustains engagement; "taste model" (humans guide, not direct) preserves autonomous authorship |
| Neuro-sama | Since Dec 2022 | $400K+/month (subscriptions) | 3+ years of iteration; errors became entertainment features; 24/7 capability is an insurmountable advantage |
| Truth Terminal | Since Jun 2024 | $20M accumulated | Memetic fitness > planned monetization; human gatekeeper approved tweets while selecting AI-intent responses; establish legal entity first |
| Holly+ | Since 2021 | Conceptual | DAO of stewards for voice governance; "identity play" as alternative to defensive IP |
| AI Sponge | 2023 | Banned | Unmoderated content → TOS violations + copyright |
| Nothing Forever | 2022-present | 8 viewers | Unmoderated content → ban → audience collapse; novelty-only propositions fail |

Universal pattern: Human oversight + economic incentive alignment + multi-year personality development + platform-native economics = success.


Recommended Implementation Sequence

From the blueprint, mapped against Timmy's existing architecture:

Phase 1: Immediate (weeks)

  1. Code sovereignty — Forgejo + Claude Code automated PR workflows (already substantially done)
  2. Music pipeline — Suno API → Wavlake/Nostr NIP-94 publishing
  3. Visual art pipeline — ComfyUI API → Blossom/Nostr with LoRA character consistency
  4. Basic Lightning wallet — ln.bot integration for receiving micropayments
  5. Long-form publishing — Nostr NIP-23 + RSS feed generation

Phase 2: Moderate effort (1-3 months)

  1. LATM tool registry — frontier model creates Python utilities, caches them, lighter model applies
  2. Event-driven cross-domain reactions — game event → blog + artwork + music (CrewAI/LangGraph)
  3. Podcast generation — TTS + feedgen → Fountain.fm
  4. Self-improving pipeline — agent creates, tests, caches own Python utilities
  5. Comic generation — character-consistent panels with Jenova AI or local LoRA
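Item 2 in the list above (one game event fanning out to a blog post, an artwork, and a piece of music) is ordinary publish/subscribe choreography. The sketch below illustrates the shape of it with an in-process bus. It is not the interface of infrastructure/event_bus, CrewAI, or LangGraph. The `EventBus` class, the topic name, and the lambda handlers are all hypothetical stand-ins for the real domain agents.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process pub/sub: handlers subscribe to topics by name."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver the event to every subscriber and collect what each produced."""
        return [handler(payload) for handler in self._handlers[topic]]


bus = EventBus()
# Stand-ins for the writing, art, and music agents reacting to the same event.
bus.subscribe("game.boss_defeated", lambda e: f"blog post about {e['boss']}")
bus.subscribe("game.boss_defeated", lambda e: f"artwork of {e['boss']}")
bus.subscribe("game.boss_defeated", lambda e: f"victory theme for {e['boss']}")

outputs = bus.publish("game.boss_defeated", {"boss": "the Hollow King"})
```

The publisher never names its consumers, so adding a new creative domain later means one more `subscribe` call rather than a change to the game integration.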

Phase 3: Significant investment (3-6 months)

  1. Full sub-agent hierarchy — Oracle/Sentinel/Scout/Scribe/Ledger/Weaver with Agno
  2. SOUL.md identity system — bounded evolution + guardian monitoring
  3. Hybrid memory upgrade — Qdrant + Mem0/Graphiti replacing or extending brain/
  4. Procedural world generation — Godot + AI-driven narrative (quests, NPCs, lore)
  5. Self-sustaining economic loop — earned revenue covers compute costs

Remains aspirational (12+ months)

  • Fully autonomous novel-length fiction without editorial intervention
  • YouTube monetization for AI-generated content (tightening platform policies)
  • Copyright protection for AI-generated works (current US law denies this)
  • True artistic identity evolution (genuine creative voice vs pattern remixing)
  • Self-modifying architecture without regression or identity drift

Gap Analysis: Blueprint vs Current Codebase

| Blueprint Capability | Current Status | Gap |
|---|---|---|
| Code sovereignty | Done (Claude Code + Forgejo) | LATM tool registry |
| Music generation | Not started | Suno API integration + Wavlake publishing |
| Visual art | Not started | ComfyUI API client + Blossom publishing |
| Writing/publishing | Not started | Nostr NIP-23 + Pandoc pipeline |
| World building | Bannerlord work (different scope) | Luanti mods as quick win |
| Identity (SOUL.md) | Partial (CLAUDE.md + MEMORY.md) | Full SOUL.md stack |
| Memory (hybrid) | brain/ package (SQLite-based) | Qdrant + knowledge graph |
| Multi-agent | Agno in use | Named hierarchy + event choreography |
| Lightning payments | lightning/ package | ln.bot wallet + L402 endpoints |
| Nostr identity | Referenced in roadmap, not built | NIP-05, NIP-89 capability cards |
| Legal entity | Unknown | Must be resolved before economic activity |

ADR Candidates

Issues that warrant Architecture Decision Records based on this review:

  1. LATM tool registry pattern — How Timmy creates, tests, and caches self-made tools
  2. Music generation strategy — Suno (cloud, commercial quality) vs MusicGen (local, CC-BY-NC)
  3. Memory upgrade path — When/how to migrate brain/ from SQLite to Qdrant + KG
  4. SOUL.md adoption — Extending existing CLAUDE.md/MEMORY.md to full SOUL.md stack
  5. Lightning L402 strategy — Which services Timmy gates behind micropayments
  6. Sub-agent naming and contracts — Formalizing Oracle/Sentinel/Scout/Scribe/Ledger/Weaver