Compare commits

1 Commits

Author SHA1 Message Date
Alexander Whitestone
2ecc3800a4 docs: triage screenshot dump into 5 actionable issues
Some checks failed
Tests / lint (pull_request) Failing after 34s
Tests / test (pull_request) Has been skipped
Ingested and analyzed 5 screenshots from issue #1275:
- IMG_6187: AirLLM Apple Silicon requirements → #1284
- IMG_6125: vLLM backend for agentic throughput → #1281
- IMG_6124: DeerFlow research orchestration → #1283
- IMG_6123: Dev discipline / quality gates → #1285
- IMG_6410: SearXNG + Crawl4AI self-hosted search → #1282

Refs #1275

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 21:45:36 -04:00
15 changed files with 29 additions and 742 deletions


@@ -18,17 +18,9 @@ jobs:
       - name: Lint (ruff via tox)
         run: tox -e lint
-  typecheck:
-    runs-on: ubuntu-latest
-    needs: lint
-    steps:
-      - uses: actions/checkout@v4
-      - name: Type-check (mypy via tox)
-        run: tox -e typecheck
   test:
     runs-on: ubuntu-latest
-    needs: typecheck
+    needs: lint
     steps:
       - uses: actions/checkout@v4
       - name: Run tests (via tox)


@@ -1,290 +0,0 @@
# Building Timmy: Technical Blueprint for Sovereign Creative AI
> **Source:** PDF attached to issue #891, "Building Timmy: a technical blueprint for sovereign
> creative AI" — generated by Kimi.ai, 16 pages, filed by Perplexity for Timmy's review.
> **Filed:** 2026-03-22 · **Reviewed:** 2026-03-23
---
## Executive Summary
The blueprint establishes that a sovereign creative AI capable of coding, composing music,
generating art, building worlds, publishing narratives, and managing its own economy is
**technically feasible today** — but only through orchestration of dozens of tools operating
at different maturity levels. The core insight: *the integration is the invention*. No single
component is new; the missing piece is a coherent identity operating across all domains
simultaneously with persistent memory, autonomous economics, and cross-domain creative
reactions.
Three non-negotiable architectural decisions:
1. **Human oversight for all public-facing content** — every successful creative AI has this;
every one that removed it failed.
2. **Legal entity before economic activity** — AI agents are not legal persons; establish
structure before wealth accumulates (Truth Terminal cautionary tale: $20M acquired before
a foundation was retroactively created).
3. **Hybrid memory: vector search + knowledge graph** — neither alone is sufficient for
multi-domain context breadth.
---
## Domain-by-Domain Assessment
### Software Development (immediately deployable)
| Component | Recommendation | Notes |
|-----------|----------------|-------|
| Primary agent | Claude Code (Opus 4.6, 77.2% SWE-bench) | Already in use |
| Self-hosted forge | Forgejo (MIT, 170–200MB RAM) | Project uses Gitea/Forgejo now |
| CI/CD | GitHub Actions-compatible via `act_runner` | — |
| Tool-making | LATM pattern: frontier model creates tools, cheaper model applies them | New — see ADR opportunity |
| Open-source fallback | OpenHands (~65% SWE-bench, Docker sandboxed) | Backup to Claude Code |
| Self-improvement | Darwin Gödel Machine / SICA patterns | 3–6 month investment |
**Development estimate:** 2–3 weeks for Forgejo + Claude Code integration with automated
PR workflows; 1–2 months for self-improving tool-making pipeline.
**Cross-reference:** This project already runs Claude Code agents on Forgejo. The LATM
pattern (tool registry) and self-improvement loop are the actionable gaps.
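
The LATM tool registry named above could look roughly like the following. This is a minimal sketch, not the project's implementation: `ToolRegistry`, `get_or_create`, and `fake_frontier_model` are hypothetical names, the "frontier model" is a stand-in function that returns source text, and a real system would sandbox and test generated code before caching it.

```python
from typing import Callable, Dict


class ToolRegistry:
    """Caches generated tools by name so each tool is written only once."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}
        self._tools: Dict[str, Callable] = {}

    def register(self, name: str, source: str) -> None:
        # Compile the generated source in its own namespace.
        # Assumption: no sandboxing here; a real pipeline would isolate this step.
        namespace: dict = {}
        exec(source, namespace)
        tool = namespace.get(name)
        if not callable(tool):
            raise ValueError(f"source does not define a callable named {name!r}")
        self._sources[name] = source
        self._tools[name] = tool

    def get_or_create(self, name: str, make_tool: Callable[[str], str]) -> Callable:
        # Only call the expensive tool maker on a cache miss.
        if name not in self._tools:
            self.register(name, make_tool(name))
        return self._tools[name]


def fake_frontier_model(name: str) -> str:
    """Hypothetical stand-in for the frontier model: returns source for a tool."""
    return f"def {name}(text):\n    return len(text.split())\n"


registry = ToolRegistry()
word_count = registry.get_or_create("word_count", fake_frontier_model)
print(word_count("the integration is the invention"))
```

The cheaper model would then only need to pick a cached tool by name and call it, which is where the cost saving in the pattern comes from.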
---
### Music (1–4 weeks)
| Component | Recommendation | Notes |
|-----------|----------------|-------|
| Commercial vocals | Suno v5 API (~$0.03/song, $30/month Premier) | No official API; third-party: sunoapi.org, AIMLAPI, EvoLink |
| Local instrumental | MusicGen 1.5B (CC-BY-NC — monetization blocker) | On M2 Max: ~60s for 5s clip |
| Voice cloning | GPT-SoVITS v4 (MIT) | Works on Apple Silicon CPU, RTF 0.526 on M4 |
| Voice conversion | RVC (MIT, 5–10 min training audio) | — |
| Apple Silicon TTS | MLX-Audio: Kokoro 82M + Qwen3-TTS 0.6B | 45x faster via Metal |
| Publishing | Wavlake (90/10 split, Lightning micropayments) | Auto-syndicates to Fountain.fm |
| Nostr | NIP-94 (kind:1063) audio events → NIP-96 servers | — |
**Copyright reality:** US Copyright Office (Jan 2025) and US Court of Appeals (Mar 2025):
purely AI-generated music cannot be copyrighted and enters public domain. Wavlake's
Value4Value model works around this — fans pay for relationship, not exclusive rights.
**Avoid:** Udio (download disabled since Oct 2025, 2.4/5 Trustpilot).
---
### Visual Art (1–3 weeks)
| Component | Recommendation | Notes |
|-----------|----------------|-------|
| Local generation | ComfyUI API at `127.0.0.1:8188` (programmatic control via WebSocket) | MLX extension: 50–70% faster |
| Speed | Draw Things (free, Mac App Store) | 3× faster than ComfyUI via Metal shaders |
| Quality frontier | Flux 2 (Nov 2025, 4MP, multi-reference) | SDXL needs 16GB+, Flux Dev 32GB+ |
| Character consistency | LoRA training (30 min, 15–30 references) + Flux.1 Kontext | Solved problem |
| Face consistency | IP-Adapter + FaceID (ComfyUI-IP-Adapter-Plus) | Training-free |
| Comics | Jenova AI ($20/month, 200+ page consistency) or LlamaGen AI (free) | — |
| Publishing | Blossom protocol (SHA-256 addressed, kind:10063) + Nostr NIP-94 | — |
| Physical | Printful REST API (200+ products, automated fulfillment) | — |
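
The publishing row above pairs SHA-256 addressing with a NIP-94 file-metadata event. The sketch below shows how those two pieces relate, under stated assumptions: `build_file_metadata_event` is a hypothetical helper, `https://blossom.example` is a placeholder server, the `url`/`m`/`x` tag names follow NIP-94 as commonly documented, and the event is left unsigned (no `id`, `pubkey`, or `sig`).

```python
import hashlib
import time


def build_file_metadata_event(blob: bytes, mime_type: str, server: str) -> dict:
    """Build an unsigned kind:1063 event template for a content-addressed blob."""
    # The blob is addressed by the hex SHA-256 of its bytes.
    digest = hashlib.sha256(blob).hexdigest()
    return {
        "kind": 1063,
        "created_at": int(time.time()),
        "content": "",  # optional caption or description
        "tags": [
            ["url", f"{server.rstrip('/')}/{digest}"],  # where the blob can be fetched
            ["m", mime_type],                           # MIME type of the file
            ["x", digest],                              # SHA-256 of the file
        ],
    }


event = build_file_metadata_event(b"fake image bytes", "image/png", "https://blossom.example")
print(event["tags"][0][1])
```

Because the address is derived from the content, the same file uploaded to several servers keeps the same `x` value, and only the `url` tag changes.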
---
### Writing / Narrative (1–4 weeks for pipeline; ongoing for quality)
| Component | Recommendation | Notes |
|-----------|----------------|-------|
| LLM | Claude Opus 4.5/4.6 (leads Mazur Writing Benchmark at 8.561) | Already in use |
| Context | 500K tokens (1M in beta) — entire novels fit | — |
| Architecture | Outline-first → RAG lore bible → chapter-by-chapter generation | Without outline: novels meander |
| Lore management | WorldAnvil Pro or custom LoreScribe (local RAG) | No tool achieves 100% consistency |
| Publishing (ebooks) | Pandoc → EPUB / KDP PDF | pandoc-novel template on GitHub |
| Publishing (print) | Lulu Press REST API (80% profit, global print network) | KDP: no official API, 3-book/day limit |
| Publishing (Nostr) | NIP-23 kind:30023 long-form events | Habla.news, YakiHonne, Stacker News |
| Podcasts | LLM script → TTS (ElevenLabs or local Kokoro/MLX-Audio) → feedgen RSS → Fountain.fm | Value4Value sats-per-minute |
**Key constraint:** AI-assisted (human directs, AI drafts) = 40% faster. Fully autonomous
without editing = "generic, soulless prose" and character drift by chapter 3 without explicit
memory.
---
### World Building / Games (2 weeks–3 months depending on target)
| Component | Recommendation | Notes |
|-----------|----------------|-------|
| Algorithms | Wave Function Collapse, Perlin noise (FastNoiseLite in Godot 4), L-systems | All mature |
| Platform | Godot Engine + gd-agentic-skills (82+ skills, 26 genre blueprints) | Strong LLM/GDScript knowledge |
| Narrative design | Knowledge graph (world state) + LLM + quest template grammar | CHI 2023 validated |
| Quick win | Luanti/Minetest (Lua API, 2,800+ open mods for reference) | Immediately feasible |
| Medium effort | OpenMW content creation (omwaddon format engineering required) | 2–3 months |
| Future | Unity MCP (AI direct Unity Editor interaction) | Early-stage |
---
### Identity Architecture (2 months)
The blueprint formalizes the **SOUL.md standard** (GitHub: aaronjmars/soul.md):
| File | Purpose |
|------|---------|
| `SOUL.md` | Who you are — identity, worldview, opinions |
| `STYLE.md` | How you write — voice, syntax, patterns |
| `SKILL.md` | Operating modes |
| `MEMORY.md` | Session continuity |
**Critical decision — static vs self-modifying identity:**
- Static Core Truths (version-controlled, human-approved changes only) ✓
- Self-modifying Learned Preferences (logged with rollback, monitored by guardian) ✓
- **Warning:** OpenClaw's "Soul Evolution" creates a security attack surface — Zenity Labs
demonstrated a complete zero-click attack chain targeting SOUL.md files.
**Relevance to this repo:** Claude Code agents already use a `MEMORY.md` pattern in
this project. The SOUL.md stack is a natural extension.
---
### Memory Architecture (2 months)
Hybrid vector + knowledge graph is the recommendation:
| Component | Tool | Notes |
|-----------|------|-------|
| Vector + KG combined | Mem0 (mem0.ai) | 26% accuracy improvement over OpenAI memory, 91% lower p95 latency, 90% token savings |
| Vector store | Qdrant (Rust, open-source) | High-throughput with metadata filtering |
| Temporal KG | Neo4j + Graphiti (Zep AI) | P95 retrieval: 300ms, hybrid semantic + BM25 + graph |
| Backup/migration | AgentKeeper (95% critical fact recovery across model migrations) | — |
**Journal pattern (Stanford Generative Agents):** Agent writes about experiences, generates
high-level reflections 2–3x/day when importance scores exceed threshold. Ablation studies:
removing any component (observation, planning, reflection) significantly reduces behavioral
believability.
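
The journal pattern above amounts to accumulating importance scores and reflecting once they pass a threshold. A minimal sketch follows, assuming a hypothetical `Journal` class, an arbitrary threshold of 10, and a placeholder string where a real agent would ask an LLM to synthesize the reflection.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Journal:
    threshold: float = 10.0  # hypothetical value, not taken from the blueprint
    entries: List[str] = field(default_factory=list)
    reflections: List[str] = field(default_factory=list)
    _pending_importance: float = 0.0
    _pending_start: int = 0  # index of the first entry not yet reflected on

    def observe(self, text: str, importance: float) -> bool:
        """Record an observation; return True if it triggered a reflection."""
        self.entries.append(text)
        self._pending_importance += importance
        if self._pending_importance <= self.threshold:
            return False
        recent = self.entries[self._pending_start:]
        # Placeholder summary; a real agent would generate this with an LLM.
        self.reflections.append(
            f"Reflection on {len(recent)} entries: " + "; ".join(recent)
        )
        self._pending_importance = 0.0
        self._pending_start = len(self.entries)
        return True


journal = Journal()
for note, score in [("wrote a chorus", 4), ("sold a print", 4), ("fan replied", 4)]:
    if journal.observe(note, score):
        print(journal.reflections[-1])
```

Resetting the accumulator after each reflection is one way to land in the "few times a day" range; the right threshold depends on how importance is scored.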
**Cross-reference:** The existing `brain/` package is the memory system. Qdrant and
Mem0 are the recommended upgrade targets.
---
### Multi-Agent Sub-System (3–6 months)
The blueprint describes a named sub-agent hierarchy:
| Agent | Role |
|-------|------|
| Oracle | Top-level planner / supervisor |
| Sentinel | Safety / moderation |
| Scout | Research / information gathering |
| Scribe | Writing / narrative |
| Ledger | Economic management |
| Weaver | Visual art generation |
| Composer | Music generation |
| Social | Platform publishing |
**Orchestration options:**
- **Agno** (already in use) — microsecond instantiation, 50× less memory than LangGraph
- **CrewAI Flows** — event-driven with fine-grained control
- **LangGraph** — DAG-based with stateful workflows and time-travel debugging
**Scheduling pattern (Stanford Generative Agents):** Top-down recursive daily → hourly →
5-minute planning. Event interrupts for reactive tasks. Re-planning triggers when accumulated
importance scores exceed threshold.
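
The top-down daily → hourly → 5-minute decomposition can be sketched as a small recursive structure. This is an illustration only: `Block`, `plan`, and `LEVELS` are hypothetical names, every slot simply inherits the parent goal where a real planner would ask an LLM to refine it, and event interrupts and re-planning are left out.

```python
from dataclasses import dataclass, field
from typing import List

# Granularities in minutes: one day, one hour, five minutes.
LEVELS = [24 * 60, 60, 5]


@dataclass
class Block:
    start: int     # minutes since midnight
    duration: int  # minutes
    goal: str
    children: List["Block"] = field(default_factory=list)


def plan(goal: str, start: int = 0, level: int = 0) -> Block:
    """Recursively split a block into the next finer granularity."""
    block = Block(start, LEVELS[level], goal)
    if level + 1 < len(LEVELS):
        step = LEVELS[level + 1]
        for offset in range(0, LEVELS[level], step):
            # Assumption: the sub-goal is copied; a real planner would refine it.
            block.children.append(plan(goal, start + offset, level + 1))
    return block


day = plan("draft chapter 3")
print(len(day.children), len(day.children[0].children))
```

A lazier variant would expand only the current hour into 5-minute blocks, which keeps re-planning cheap when an event interrupt arrives.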
**Cross-reference:** The existing `spark/` package (event capture, advisory engine) aligns
with this architecture. `infrastructure/event_bus` is the choreography backbone.
---
### Economic Engine (1–4 weeks)
Lightning Labs released `lightning-agent-tools` (open-source) in February 2026:
- `lnget` — CLI HTTP client for L402 payments
- Remote signer architecture (private keys on separate machine from agent)
- Scoped macaroon credentials (pay-only, invoice-only, read-only roles)
- **Aperture** — converts any API to pay-per-use via L402 (HTTP 402)
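
The L402 flow named in the bullets above boils down to parsing the challenge a server sends with HTTP 402 and building the follow-up header once the invoice is paid. The sketch below assumes the commonly documented header shape (`WWW-Authenticate: L402 macaroon="...", invoice="..."`, answered with `Authorization: L402 <macaroon>:<preimage>`). The function names are hypothetical, it does not use `lnget` or Aperture, and the details should be checked against the L402 specification.

```python
import re
from typing import Dict


def parse_l402_challenge(header: str) -> Dict[str, str]:
    """Extract the macaroon and invoice from a WWW-Authenticate challenge."""
    scheme, _, params = header.strip().partition(" ")
    if scheme != "L402":
        raise ValueError(f"unexpected auth scheme: {scheme!r}")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', params))
    missing = {"macaroon", "invoice"} - fields.keys()
    if missing:
        raise ValueError(f"challenge is missing: {sorted(missing)}")
    return fields


def build_l402_authorization(macaroon: str, preimage_hex: str) -> str:
    """Build the Authorization header value sent after paying the invoice."""
    return f"L402 {macaroon}:{preimage_hex}"


# Placeholder values; a real challenge carries a base64 macaroon and a BOLT11 invoice.
challenge = 'L402 macaroon="AGIAJEemVQ", invoice="lnbc1500n1example"'
fields = parse_l402_challenge(challenge)
print(build_l402_authorization(fields["macaroon"], "00ff"))
```

Paying the invoice and obtaining the preimage is the part a wallet or remote signer handles, so it is deliberately absent here.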
| Option | Effort | Notes |
|--------|--------|-------|
| ln.bot | 1 week | "Bitcoin for AI Agents" — 3 commands create a wallet; CLI + MCP + REST |
| LND via gRPC | 2–3 weeks | Full programmatic node management for production |
| Coinbase Agentic Wallets | — | Fiat-adjacent; less aligned with sovereignty ethos |
**Revenue channels:** Wavlake (music, 90/10 Lightning), Nostr zaps (articles), Stacker News
(earn sats from engagement), Printful (physical goods), L402-gated API access (pay-per-use
services), Geyser.fund (Lightning crowdfunding, better initial runway than micropayments).
**Cross-reference:** The existing `lightning/` package in this repo is the foundation.
L402 paywall endpoints for Timmy's own services are the actionable gap.
---
## Pioneer Case Studies
| Agent | Active | Revenue | Key Lesson |
|-------|--------|---------|-----------|
| Botto | Since Oct 2021 | $5M+ (art auctions) | Community governance via DAO sustains engagement; "taste model" (humans guide, not direct) preserves autonomous authorship |
| Neuro-sama | Since Dec 2022 | $400K+/month (subscriptions) | 3+ years of iteration; errors became entertainment features; 24/7 capability is an insurmountable advantage |
| Truth Terminal | Since Jun 2024 | $20M accumulated | Memetic fitness > planned monetization; human gatekeeper approved tweets while selecting AI-intent responses; **establish legal entity first** |
| Holly+ | Since 2021 | Conceptual | DAO of stewards for voice governance; "identity play" as alternative to defensive IP |
| AI Sponge | 2023 | Banned | Unmoderated content → TOS violations + copyright |
| Nothing Forever | 2022–present | 8 viewers | Unmoderated content → ban → audience collapse; novelty-only propositions fail |
**Universal pattern:** Human oversight + economic incentive alignment + multi-year personality
development + platform-native economics = success.
---
## Recommended Implementation Sequence
From the blueprint, mapped against Timmy's existing architecture:
### Phase 1: Immediate (weeks)
1. **Code sovereignty** — Forgejo + Claude Code automated PR workflows (already substantially done)
2. **Music pipeline** — Suno API → Wavlake/Nostr NIP-94 publishing
3. **Visual art pipeline** — ComfyUI API → Blossom/Nostr with LoRA character consistency
4. **Basic Lightning wallet** — ln.bot integration for receiving micropayments
5. **Long-form publishing** — Nostr NIP-23 + RSS feed generation
### Phase 2: Moderate effort (1–3 months)
6. **LATM tool registry** — frontier model creates Python utilities, caches them, lighter model applies
7. **Event-driven cross-domain reactions** — game event → blog + artwork + music (CrewAI/LangGraph)
8. **Podcast generation** — TTS + feedgen → Fountain.fm
9. **Self-improving pipeline** — agent creates, tests, caches own Python utilities
10. **Comic generation** — character-consistent panels with Jenova AI or local LoRA
### Phase 3: Significant investment (3–6 months)
11. **Full sub-agent hierarchy** — Oracle/Sentinel/Scout/Scribe/Ledger/Weaver with Agno
12. **SOUL.md identity system** — bounded evolution + guardian monitoring
13. **Hybrid memory upgrade** — Qdrant + Mem0/Graphiti replacing or extending `brain/`
14. **Procedural world generation** — Godot + AI-driven narrative (quests, NPCs, lore)
15. **Self-sustaining economic loop** — earned revenue covers compute costs
### Remains aspirational (12+ months)
- Fully autonomous novel-length fiction without editorial intervention
- YouTube monetization for AI-generated content (tightening platform policies)
- Copyright protection for AI-generated works (current US law denies this)
- True artistic identity evolution (genuine creative voice vs pattern remixing)
- Self-modifying architecture without regression or identity drift
---
## Gap Analysis: Blueprint vs Current Codebase
| Blueprint Capability | Current Status | Gap |
|---------------------|----------------|-----|
| Code sovereignty | Done (Claude Code + Forgejo) | LATM tool registry |
| Music generation | Not started | Suno API integration + Wavlake publishing |
| Visual art | Not started | ComfyUI API client + Blossom publishing |
| Writing/publishing | Not started | Nostr NIP-23 + Pandoc pipeline |
| World building | Bannerlord work (different scope) | Luanti mods as quick win |
| Identity (SOUL.md) | Partial (CLAUDE.md + MEMORY.md) | Full SOUL.md stack |
| Memory (hybrid) | `brain/` package (SQLite-based) | Qdrant + knowledge graph |
| Multi-agent | Agno in use | Named hierarchy + event choreography |
| Lightning payments | `lightning/` package | ln.bot wallet + L402 endpoints |
| Nostr identity | Referenced in roadmap, not built | NIP-05, NIP-89 capability cards |
| Legal entity | Unknown | **Must be resolved before economic activity** |
---
## ADR Candidates
Issues that warrant Architecture Decision Records based on this review:
1. **LATM tool registry pattern** — How Timmy creates, tests, and caches self-made tools
2. **Music generation strategy** — Suno (cloud, commercial quality) vs MusicGen (local, CC-BY-NC)
3. **Memory upgrade path** — When/how to migrate `brain/` from SQLite to Qdrant + KG
4. **SOUL.md adoption** — Extending existing CLAUDE.md/MEMORY.md to full SOUL.md stack
5. **Lightning L402 strategy** — Which services Timmy gates behind micropayments
6. **Sub-agent naming and contracts** — Formalizing Oracle/Sentinel/Scout/Scribe/Ledger/Weaver


@@ -164,7 +164,3 @@ directory = "htmlcov"
 [tool.coverage.xml]
 output = "coverage.xml"
-[tool.mypy]
-ignore_missing_imports = true
-no_error_summary = true

src/__init__.py (normal file, 0 lines changed)


@@ -6,8 +6,6 @@ import sqlite3
 from contextlib import closing
 from pathlib import Path
-from typing import Any
 from fastapi import APIRouter, Request
 from fastapi.responses import HTMLResponse, JSONResponse
@@ -38,9 +36,9 @@ def _discover_databases() -> list[dict]:
     return dbs
-def _query_database(db_path: str) -> dict[str, Any]:
+def _query_database(db_path: str) -> dict:
     """Open a database read-only and return all tables with their rows."""
-    result: dict[str, Any] = {"tables": {}, "error": None}
+    result = {"tables": {}, "error": None}
     try:
         with closing(sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)) as conn:
             conn.row_factory = sqlite3.Row
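
The hunk above stops before the body of `_query_database`. For readers unfamiliar with the read-only URI form, here is a minimal, self-contained sketch of what such a function might do. It is an assumption about the rest of the body, not the project's code: `query_database` is a hypothetical name, and reading every row of every table is only sensible for small databases.

```python
import sqlite3
from contextlib import closing
from typing import Any


def query_database(db_path: str) -> dict[str, Any]:
    """Open a SQLite file read-only and return every table with its rows."""
    result: dict[str, Any] = {"tables": {}, "error": None}
    try:
        # mode=ro makes SQLite refuse writes and fail if the file is missing.
        with closing(sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)) as conn:
            conn.row_factory = sqlite3.Row
            names = [
                row["name"]
                for row in conn.execute(
                    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
                )
            ]
            for name in names:
                # Table names cannot be bound as parameters, so quote them instead.
                quoted = '"' + name.replace('"', '""') + '"'
                rows = conn.execute(f"SELECT * FROM {quoted}").fetchall()
                result["tables"][name] = [dict(row) for row in rows]
    except sqlite3.Error as exc:
        result["error"] = str(exc)
    return result
```

Returning the error as data rather than raising keeps a dashboard route simple: the template can render the message next to the database name.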


@@ -186,24 +186,6 @@
     <p class="chat-history-placeholder">Loading sovereignty metrics...</p>
 {% endcall %}
-<!-- Agent Scorecards -->
-<div class="card mc-card-spaced" id="mc-scorecards-card">
-    <div class="card-header">
-        <h2 class="card-title">Agent Scorecards</h2>
-        <div class="d-flex align-items-center gap-2">
-            <select id="mc-scorecard-period" class="form-select form-select-sm" style="width: auto;"
-                    onchange="loadMcScorecards()">
-                <option value="daily" selected>Daily</option>
-                <option value="weekly">Weekly</option>
-            </select>
-            <a href="/scorecards" class="btn btn-sm btn-outline-secondary">Full View</a>
-        </div>
-    </div>
-    <div id="mc-scorecards-content" class="p-2">
-        <p class="chat-history-placeholder">Loading scorecards...</p>
-    </div>
-</div>
 <!-- Chat History -->
 <div class="card mc-card-spaced">
     <div class="card-header">
@@ -520,20 +502,6 @@ async function loadSparkStatus() {
     }
 }
-// Load agent scorecards
-async function loadMcScorecards() {
-    var period = document.getElementById('mc-scorecard-period').value;
-    var container = document.getElementById('mc-scorecards-content');
-    container.innerHTML = '<p class="chat-history-placeholder">Loading scorecards...</p>';
-    try {
-        var response = await fetch('/scorecards/all/panels?period=' + period);
-        var html = await response.text();
-        container.innerHTML = html;
-    } catch (error) {
-        container.innerHTML = '<p class="chat-history-placeholder">Scorecards unavailable</p>';
-    }
-}
 // Initial load
 loadSparkStatus();
 loadSovereignty();
@@ -542,7 +510,6 @@ loadSwarmStats();
 loadLightningStats();
 loadGrokStats();
 loadChatHistory();
-loadMcScorecards();
 // Periodic updates
 setInterval(loadSovereignty, 30000);
@@ -551,6 +518,5 @@ setInterval(loadSwarmStats, 5000);
 setInterval(updateHeartbeat, 5000);
 setInterval(loadGrokStats, 10000);
 setInterval(loadSparkStatus, 15000);
-setInterval(loadMcScorecards, 300000);
 </script>
 {% endblock %}


@@ -137,7 +137,7 @@ class HermesMonitor:
                         message=f"Check error: {r}",
                     )
                 )
-            elif isinstance(r, CheckResult):
+            else:
                 checks.append(r)
         # Compute overall level
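
The hunk above swaps an explicit `isinstance(r, CheckResult)` branch for a bare `else`. The difference only matters if the results list can hold something that is neither an exception nor a `CheckResult`. The sketch below illustrates the explicit form under assumptions: the surrounding code is guessed to use `asyncio.gather(..., return_exceptions=True)`, and this `CheckResult` dataclass and `run_checks` helper are hypothetical stand-ins, not the project's types.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class CheckResult:
    name: str
    ok: bool
    message: str = ""


async def run_checks(checks: List[Callable[[], Awaitable[CheckResult]]]) -> List[CheckResult]:
    """Run all checks concurrently and normalize failures into CheckResult rows."""
    results = await asyncio.gather(*(check() for check in checks), return_exceptions=True)
    collected: List[CheckResult] = []
    for r in results:
        if isinstance(r, Exception):
            # A check that raised becomes a failed result instead of aborting the run.
            collected.append(CheckResult("unknown", False, f"Check error: {r}"))
        elif isinstance(r, CheckResult):
            collected.append(r)
        # Anything else is dropped; a bare `else` would append it unchecked.
    return collected


async def disk_ok() -> CheckResult:
    return CheckResult("disk", True)


async def broken() -> CheckResult:
    raise RuntimeError("boom")


print(asyncio.run(run_checks([disk_ok, broken])))
```

With well-typed checks both forms behave the same; the explicit branch mainly helps a type checker narrow `r`.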


@@ -203,7 +203,7 @@ async def reload_config(
 @router.get("/history")
 async def get_history(
     hours: int = 24,
-    store: Annotated[HealthHistoryStore | None, Depends(get_history_store)] = None,
+    store: Annotated[HealthHistoryStore, Depends(get_history_store)] = None,
 ) -> list[dict[str, Any]]:
     """Get provider health history for the last N hours."""
     if store is None:


@@ -744,20 +744,19 @@ class CascadeRouter:
         self,
         provider: Provider,
         messages: list[dict],
-        model: str | None,
+        model: str,
         temperature: float,
         max_tokens: int | None,
         content_type: ContentType = ContentType.TEXT,
     ) -> dict:
         """Try a single provider request."""
         start_time = time.time()
-        effective_model: str = model or provider.get_default_model() or ""
         if provider.type == "ollama":
             result = await self._call_ollama(
                 provider=provider,
                 messages=messages,
-                model=effective_model,
+                model=model or provider.get_default_model(),
                 temperature=temperature,
                 max_tokens=max_tokens,
                 content_type=content_type,
@@ -766,7 +765,7 @@ class CascadeRouter:
             result = await self._call_openai(
                 provider=provider,
                 messages=messages,
-                model=effective_model,
+                model=model or provider.get_default_model(),
                 temperature=temperature,
                 max_tokens=max_tokens,
             )
@@ -774,7 +773,7 @@ class CascadeRouter:
             result = await self._call_anthropic(
                 provider=provider,
                 messages=messages,
-                model=effective_model,
+                model=model or provider.get_default_model(),
                 temperature=temperature,
                 max_tokens=max_tokens,
             )
@@ -782,7 +781,7 @@ class CascadeRouter:
             result = await self._call_grok(
                 provider=provider,
                 messages=messages,
-                model=effective_model,
+                model=model or provider.get_default_model(),
                 temperature=temperature,
                 max_tokens=max_tokens,
             )
@@ -790,7 +789,7 @@ class CascadeRouter:
             result = await self._call_vllm_mlx(
                 provider=provider,
                 messages=messages,
-                model=effective_model,
+                model=model or provider.get_default_model(),
                 temperature=temperature,
                 max_tokens=max_tokens,
             )


@@ -474,7 +474,7 @@ class DiscordVendor(ChatPlatform):
     async def _run_client(self, token: str) -> None:
         """Run the discord.py client (blocking call in a task)."""
         try:
-            await self._client.start(token)  # type: ignore[union-attr]
+            await self._client.start(token)
         except Exception as exc:
             logger.error("Discord client error: %s", exc)
             self._state = PlatformState.ERROR
@@ -482,32 +482,32 @@ class DiscordVendor(ChatPlatform):
     def _register_handlers(self) -> None:
         """Register Discord event handlers on the client."""
-        @self._client.event  # type: ignore[union-attr]
+        @self._client.event
         async def on_ready():
-            self._guild_count = len(self._client.guilds)  # type: ignore[union-attr]
+            self._guild_count = len(self._client.guilds)
             self._state = PlatformState.CONNECTED
             logger.info(
                 "Discord ready: %s in %d guild(s)",
-                self._client.user,  # type: ignore[union-attr]
+                self._client.user,
                 self._guild_count,
             )
-        @self._client.event  # type: ignore[union-attr]
+        @self._client.event
         async def on_message(message):
             # Ignore our own messages
-            if message.author == self._client.user:  # type: ignore[union-attr]
+            if message.author == self._client.user:
                 return
             # Only respond to mentions or DMs
             is_dm = not hasattr(message.channel, "guild") or message.channel.guild is None
-            is_mention = self._client.user in message.mentions  # type: ignore[union-attr]
+            is_mention = self._client.user in message.mentions
             if not is_dm and not is_mention:
                 return
             await self._handle_message(message)
-        @self._client.event  # type: ignore[union-attr]
+        @self._client.event
         async def on_disconnect():
             if self._state != PlatformState.DISCONNECTED:
                 self._state = PlatformState.CONNECTING
@@ -535,8 +535,8 @@ class DiscordVendor(ChatPlatform):
     def _extract_content(self, message) -> str:
         """Strip the bot mention and return clean message text."""
         content = message.content
-        if self._client.user:  # type: ignore[union-attr]
-            content = content.replace(f"<@{self._client.user.id}>", "").strip()  # type: ignore[union-attr]
+        if self._client.user:
+            content = content.replace(f"<@{self._client.user.id}>", "").strip()
         return content
     async def _invoke_agent(self, content: str, session_id: str, target):


@@ -102,14 +102,14 @@ class TelegramBot:
             self._token = tok
             self._app = Application.builder().token(tok).build()
-            self._app.add_handler(CommandHandler("start", self._cmd_start))  # type: ignore[union-attr]
-            self._app.add_handler(  # type: ignore[union-attr]
+            self._app.add_handler(CommandHandler("start", self._cmd_start))
+            self._app.add_handler(
                 MessageHandler(filters.TEXT & ~filters.COMMAND, self._handle_message)
             )
-            await self._app.initialize()  # type: ignore[union-attr]
-            await self._app.start()  # type: ignore[union-attr]
-            await self._app.updater.start_polling(allowed_updates=Update.ALL_TYPES)  # type: ignore[union-attr]
+            await self._app.initialize()
+            await self._app.start()
+            await self._app.updater.start_polling(allowed_updates=Update.ALL_TYPES)
             self._running = True
             logger.info("Telegram bot started.")


@@ -245,7 +245,6 @@ class VoiceLoop:
     def _transcribe(self, audio: np.ndarray) -> str:
         """Transcribe audio using local Whisper model."""
         self._load_whisper()
-        assert self._whisper_model is not None, "Whisper model failed to load"
         sys.stdout.write(" 🧠 Transcribing...\r")
         sys.stdout.flush()


@@ -1,270 +0,0 @@
"""Tests for Daily Run orchestrator — health snapshot integration.
Verifies that the orchestrator runs a pre-flight health snapshot before
any coding work begins, and aborts on red status unless --force is passed.
Refs: #923
"""
from __future__ import annotations
import argparse
import json
import sys
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
# Add timmy_automations to path for imports
_TA_PATH = Path(__file__).resolve().parent.parent.parent / "timmy_automations" / "daily_run"
if str(_TA_PATH) not in sys.path:
    sys.path.insert(0, str(_TA_PATH))
# Also add utils path
_TA_UTILS = Path(__file__).resolve().parent.parent.parent / "timmy_automations"
if str(_TA_UTILS) not in sys.path:
    sys.path.insert(0, str(_TA_UTILS))
import health_snapshot as hs
import orchestrator as orch


def _make_snapshot(overall_status: str) -> hs.HealthSnapshot:
    """Build a minimal HealthSnapshot for testing."""
    return hs.HealthSnapshot(
        timestamp="2026-01-01T00:00:00+00:00",
        overall_status=overall_status,
        ci=hs.CISignal(status="pass", message="CI passing"),
        issues=hs.IssueSignal(count=0, p0_count=0, p1_count=0),
        flakiness=hs.FlakinessSignal(
            status="healthy",
            recent_failures=0,
            recent_cycles=10,
            failure_rate=0.0,
            message="All good",
        ),
        tokens=hs.TokenEconomySignal(status="balanced", message="Balanced"),
    )


def _make_red_snapshot() -> hs.HealthSnapshot:
    return hs.HealthSnapshot(
        timestamp="2026-01-01T00:00:00+00:00",
        overall_status="red",
        ci=hs.CISignal(status="fail", message="CI failed"),
        issues=hs.IssueSignal(count=1, p0_count=1, p1_count=0),
        flakiness=hs.FlakinessSignal(
            status="critical",
            recent_failures=8,
            recent_cycles=10,
            failure_rate=0.8,
            message="High flakiness",
        ),
        tokens=hs.TokenEconomySignal(status="unknown", message="No data"),
    )


def _default_args(**overrides) -> argparse.Namespace:
    """Build an argparse Namespace with defaults matching the orchestrator flags."""
    defaults = {
        "review": False,
        "json": False,
        "max_items": None,
        "skip_health_check": False,
        "force": False,
    }
    defaults.update(overrides)
    return argparse.Namespace(**defaults)


class TestRunHealthSnapshot:
    """Test run_health_snapshot() — the pre-flight check called by main()."""

    def test_green_returns_zero(self, capsys):
        """Green snapshot returns 0 (proceed)."""
        args = _default_args()
        with patch.object(orch, "_generate_health_snapshot", return_value=_make_snapshot("green")):
            rc = orch.run_health_snapshot(args)
        assert rc == 0

    def test_yellow_returns_zero(self, capsys):
        """Yellow snapshot returns 0 (proceed with caution)."""
        args = _default_args()
        with patch.object(orch, "_generate_health_snapshot", return_value=_make_snapshot("yellow")):
            rc = orch.run_health_snapshot(args)
        assert rc == 0

    def test_red_returns_one(self, capsys):
        """Red snapshot returns 1 (abort)."""
        args = _default_args()
        with patch.object(orch, "_generate_health_snapshot", return_value=_make_red_snapshot()):
            rc = orch.run_health_snapshot(args)
        assert rc == 1

    def test_red_with_force_returns_zero(self, capsys):
        """Red snapshot with --force returns 0 (proceed anyway)."""
        args = _default_args(force=True)
        with patch.object(orch, "_generate_health_snapshot", return_value=_make_red_snapshot()):
            rc = orch.run_health_snapshot(args)
        assert rc == 0

    def test_snapshot_exception_is_skipped(self, capsys):
        """If health snapshot raises, it degrades gracefully and returns 0."""
        args = _default_args()
        with patch.object(orch, "_generate_health_snapshot", side_effect=RuntimeError("boom")):
            rc = orch.run_health_snapshot(args)
        assert rc == 0
        captured = capsys.readouterr()
        assert "warning" in captured.err.lower() or "skipping" in captured.err.lower()

    def test_snapshot_prints_summary(self, capsys):
        """Health snapshot prints a pre-flight summary block."""
        args = _default_args()
        with patch.object(orch, "_generate_health_snapshot", return_value=_make_snapshot("green")):
            orch.run_health_snapshot(args)
        captured = capsys.readouterr()
        assert "PRE-FLIGHT HEALTH CHECK" in captured.out
        assert "CI" in captured.out

    def test_red_prints_abort_message(self, capsys):
        """Red snapshot prints an abort message to stderr."""
        args = _default_args()
        with patch.object(orch, "_generate_health_snapshot", return_value=_make_red_snapshot()):
            orch.run_health_snapshot(args)
        captured = capsys.readouterr()
        assert "RED" in captured.err or "aborting" in captured.err.lower()

    def test_p0_issues_shown_in_output(self, capsys):
        """P0 issue count is shown in the pre-flight output."""
        args = _default_args()
        snapshot = hs.HealthSnapshot(
            timestamp="2026-01-01T00:00:00+00:00",
            overall_status="red",
            ci=hs.CISignal(status="pass", message="CI passing"),
            issues=hs.IssueSignal(count=2, p0_count=2, p1_count=0),
            flakiness=hs.FlakinessSignal(
                status="healthy",
                recent_failures=0,
                recent_cycles=10,
                failure_rate=0.0,
                message="All good",
            ),
            tokens=hs.TokenEconomySignal(status="balanced", message="Balanced"),
        )
        with patch.object(orch, "_generate_health_snapshot", return_value=snapshot):
            orch.run_health_snapshot(args)
        captured = capsys.readouterr()
        assert "P0" in captured.out

class TestMainHealthCheckIntegration:
    """Test that main() runs health snapshot before any coding work."""

    def _patch_gitea_unavailable(self):
        return patch.object(orch.GiteaClient, "is_available", return_value=False)

    def test_main_runs_health_check_before_gitea(self):
        """Health snapshot is called before Gitea client work."""
        call_order = []

        def fake_snapshot(*_a, **_kw):
            call_order.append("health")
            return _make_snapshot("green")

        def fake_gitea_available(self):
            call_order.append("gitea")
            return False

        args = _default_args()
        with (
            patch.object(orch, "_generate_health_snapshot", side_effect=fake_snapshot),
            patch.object(orch.GiteaClient, "is_available", fake_gitea_available),
            patch("sys.argv", ["orchestrator"]),
        ):
            orch.main()
        assert call_order.index("health") < call_order.index("gitea")

    def test_main_aborts_on_red_before_gitea(self):
        """main() aborts with non-zero exit code when health is red."""
        gitea_called = []

        def fake_gitea_available(self):
            gitea_called.append(True)
            return True

        with (
            patch.object(orch, "_generate_health_snapshot", return_value=_make_red_snapshot()),
            patch.object(orch.GiteaClient, "is_available", fake_gitea_available),
            patch("sys.argv", ["orchestrator"]),
        ):
            rc = orch.main()
        assert rc != 0
        assert not gitea_called, "Gitea should NOT be called when health is red"

    def test_main_skips_health_check_with_flag(self):
        """--skip-health-check bypasses the pre-flight snapshot."""
        health_called = []

        def fake_snapshot(*_a, **_kw):
            health_called.append(True)
            return _make_snapshot("green")

        with (
            patch.object(orch, "_generate_health_snapshot", side_effect=fake_snapshot),
            patch.object(orch.GiteaClient, "is_available", return_value=False),
            patch("sys.argv", ["orchestrator", "--skip-health-check"]),
        ):
            orch.main()
        assert not health_called, "Health snapshot should be skipped"

    def test_main_force_flag_continues_despite_red(self):
        """--force allows Daily Run to continue even when health is red."""
        gitea_called = []

        def fake_gitea_available(self):
            gitea_called.append(True)
            return False  # Gitea unavailable → exits early but after health check

        with (
            patch.object(orch, "_generate_health_snapshot", return_value=_make_red_snapshot()),
            patch.object(orch.GiteaClient, "is_available", fake_gitea_available),
            patch("sys.argv", ["orchestrator", "--force"]),
        ):
            orch.main()
        # Gitea was reached despite red status because --force was passed
        assert gitea_called

    def test_main_json_output_on_red_includes_error(self, capsys):
        """JSON output includes error key when health is red."""
        with (
            patch.object(orch, "_generate_health_snapshot", return_value=_make_red_snapshot()),
            patch.object(orch.GiteaClient, "is_available", return_value=True),
            patch("sys.argv", ["orchestrator", "--json"]),
        ):
            rc = orch.main()
        assert rc != 0
        captured = capsys.readouterr()
        data = json.loads(captured.out)
        assert "error" in data
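
The ordering tests above rely on one pattern: replace each collaborator with a fake that appends a marker to a shared list, then assert on the recorded order. Below is a minimal, self-contained sketch of that pattern. `Pipeline`, `preflight`, `service_available`, and `record_call_order` are hypothetical stand-ins invented for illustration; they are not part of the orchestrator module.

```python
from __future__ import annotations

from unittest.mock import patch


class Pipeline:
    """Hypothetical stand-in for the orchestrator entry point."""

    def preflight(self) -> str:
        # Stand-in for the health snapshot; returns an overall status.
        return "green"

    def service_available(self) -> bool:
        # Stand-in for a client availability probe.
        return True

    def run(self) -> int:
        # The pre-flight check runs first; a red status aborts before any service call.
        if self.preflight() == "red":
            return 1
        self.service_available()
        return 0


def record_call_order(status: str) -> tuple[int, list[str]]:
    """Run Pipeline.run() with both collaborators faked, recording call order."""
    calls: list[str] = []

    def fake_preflight(self) -> str:
        calls.append("health")
        return status

    def fake_available(self) -> bool:
        calls.append("service")
        return False

    with (
        patch.object(Pipeline, "preflight", fake_preflight),
        patch.object(Pipeline, "service_available", fake_available),
    ):
        rc = Pipeline().run()
    return rc, calls
```

Passing a plain function to `patch.object` on the class (rather than `return_value=`) keeps the `self` binding, which is why the fakes take a `self` argument. The parenthesized `with (...)` form needs Python 3.10 or later.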


@@ -4,13 +4,10 @@
 Connects to local Gitea, fetches candidate issues, and produces a concise agenda
 plus a day summary (review mode).

-The Daily Run begins with a Quick Health Snapshot (#710) to ensure mandatory
-systems are green before burning cycles on work that cannot land.
-
 Run: python3 timmy_automations/daily_run/orchestrator.py [--review]
 Env: See timmy_automations/config/daily_run.json for configuration
-Refs: #703, #923
+Refs: #703
 """

 from __future__ import annotations

@@ -33,11 +30,6 @@ sys.path.insert(
 )
 from utils.token_rules import TokenRules, compute_token_reward

-# Health snapshot lives in the same package
-from health_snapshot import generate_snapshot as _generate_health_snapshot
-from health_snapshot import get_token as _hs_get_token
-from health_snapshot import load_config as _hs_load_config
-
 # ── Configuration ─────────────────────────────────────────────────────────

 REPO_ROOT = Path(__file__).resolve().parent.parent.parent
@@ -503,16 +495,6 @@ def parse_args() -> argparse.Namespace:
         default=None,
         help="Override max agenda items",
     )
-    p.add_argument(
-        "--skip-health-check",
-        action="store_true",
-        help="Skip the pre-flight health snapshot (not recommended)",
-    )
-    p.add_argument(
-        "--force",
-        action="store_true",
-        help="Continue even if health snapshot is red (overrides abort-on-red)",
-    )
     return p.parse_args()


@@ -553,76 +535,6 @@ def compute_daily_run_tokens(success: bool = True) -> dict[str, Any]:
     }


-def run_health_snapshot(args: argparse.Namespace) -> int:
-    """Run pre-flight health snapshot and return 0 (ok) or 1 (abort).
-
-    Prints a concise summary of CI, issues, flakiness, and token economy.
-    Returns 1 if the overall status is red AND --force was not passed.
-    Returns 0 for green/yellow or when --force is active.
-    On any import/runtime error the check is skipped with a warning.
-    """
-    try:
-        hs_config = _hs_load_config()
-        hs_token = _hs_get_token(hs_config)
-        snapshot = _generate_health_snapshot(hs_config, hs_token)
-    except Exception as exc:  # noqa: BLE001
-        print(f"[health] Warning: health snapshot failed ({exc}) — skipping", file=sys.stderr)
-        return 0
-
-    # Print concise pre-flight header
-    status_emoji = {"green": "🟢", "yellow": "🟡", "red": "🔴"}.get(
-        snapshot.overall_status, ""
-    )
-    print("" * 60)
-    print(f"PRE-FLIGHT HEALTH CHECK {status_emoji} {snapshot.overall_status.upper()}")
-    print("" * 60)
-
-    ci_emoji = {"pass": "", "fail": "", "unknown": "⚠️", "unavailable": ""}.get(
-        snapshot.ci.status, ""
-    )
-    print(f" {ci_emoji} CI: {snapshot.ci.message}")
-
-    if snapshot.issues.p0_count > 0:
-        issue_emoji = "🔴"
-    elif snapshot.issues.p1_count > 0:
-        issue_emoji = "🟡"
-    else:
-        issue_emoji = ""
-    critical_str = f"{snapshot.issues.count} critical"
-    if snapshot.issues.p0_count:
-        critical_str += f" (P0: {snapshot.issues.p0_count})"
-    if snapshot.issues.p1_count:
-        critical_str += f" (P1: {snapshot.issues.p1_count})"
-    print(f" {issue_emoji} Issues: {critical_str}")
-
-    flak_emoji = {"healthy": "", "degraded": "🟡", "critical": "🔴", "unknown": ""}.get(
-        snapshot.flakiness.status, ""
-    )
-    print(f" {flak_emoji} Flakiness: {snapshot.flakiness.message}")
-
-    token_emoji = {"balanced": "", "inflationary": "🟡", "deflationary": "🔵", "unknown": ""}.get(
-        snapshot.tokens.status, ""
-    )
-    print(f" {token_emoji} Tokens: {snapshot.tokens.message}")
-    print()
-
-    if snapshot.overall_status == "red" and not args.force:
-        print(
-            "🛑 Health status is RED — aborting Daily Run to avoid burning cycles.",
-            file=sys.stderr,
-        )
-        print(
-            " Fix the issues above or re-run with --force to override.",
-            file=sys.stderr,
-        )
-        return 1
-
-    if snapshot.overall_status == "red":
-        print("⚠️ Health is RED but --force passed — proceeding anyway.", file=sys.stderr)
-    return 0
-
-
 def main() -> int:
     args = parse_args()
     config = load_config()
@@ -630,15 +542,6 @@ def main() -> int:
     if args.max_items:
         config["max_agenda_items"] = args.max_items

-    # ── Step 0: Pre-flight health snapshot ──────────────────────────────────
-    if not args.skip_health_check:
-        health_rc = run_health_snapshot(args)
-        if health_rc != 0:
-            tokens = compute_daily_run_tokens(success=False)
-            if args.json:
-                print(json.dumps({"error": "health_check_failed", "tokens": tokens}))
-            return health_rc
-
     token = get_token(config)
     client = GiteaClient(config, token)
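
For reviewers weighing what this diff removes: the exit-code policy documented in the `run_health_snapshot` docstring comes down to a small decision rule. Green and yellow proceed, red aborts unless forced, and a failing check is skipped with a warning. The sketch below restates that rule in isolation. `preflight_exit_code` and `run_gate` are hypothetical names chosen for illustration, and the `check` callable stands in for the real snapshot generation, which this sketch does not reproduce.

```python
from __future__ import annotations

import sys
from typing import Callable


def preflight_exit_code(overall_status: str, force: bool = False) -> int:
    """Return 1 (abort) only when the status is red and force is not set."""
    if overall_status == "red" and not force:
        return 1
    return 0


def run_gate(check: Callable[[], str], force: bool = False) -> int:
    """Run a status check and map it to an exit code.

    If the check itself raises, degrade gracefully: warn on stderr and
    return 0 so a broken health check does not block the run.
    """
    try:
        status = check()
    except Exception as exc:  # deliberately broad: the gate must never crash the run
        print(f"[health] Warning: check failed ({exc}); skipping", file=sys.stderr)
        return 0
    return preflight_exit_code(status, force)
```

Keeping the decision in a pure function separate from the printing and I/O makes the four cases (green, yellow, red, red with force) trivially testable without mocks.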

tox.ini

@@ -41,10 +41,8 @@ description = Static type checking with mypy
 commands_pre =
 deps =
     mypy>=1.0.0
-    types-PyYAML
-    types-requests
 commands =
-    mypy src
+    mypy src --ignore-missing-imports --no-error-summary

 # ── Test Environments ────────────────────────────────────────────────────────

@@ -132,17 +130,13 @@ commands =

 # ── Pre-push (mirrors CI exactly) ────────────────────────────────────────────
 [testenv:pre-push]
-description = Local gate — lint + typecheck + full CI suite (same as Gitea Actions)
+description = Local gate — lint + full CI suite (same as Gitea Actions)
 deps =
     ruff>=0.8.0
-    mypy>=1.0.0
-    types-PyYAML
-    types-requests
 commands =
     ruff check src/ tests/
     ruff format --check src/ tests/
     bash -c 'files=$(grep -rl "<style" src/dashboard/templates/ --include="*.html" 2>/dev/null); if [ -n "$files" ]; then echo "ERROR: inline <style> blocks found — move CSS to static/css/mission-control.css:"; echo "$files"; exit 1; fi; echo "No inline CSS — OK"'
-    mypy src
     mkdir -p reports
     pytest tests/ \
         --cov=src \