forked from Rockachopa/Timmy-time-dashboard

This repository has been archived on 2026-03-24. You can view files and clone it. You cannot open issues or pull requests or push a commit.

Claude (Opus 4.6) 9eeb49a6f1 [claude] Autonomous research pipeline — orchestrator + SOVEREIGNTY.md (#972 ) (#1274 )

2026-03-24 01:40:53 +00:00

SOVEREIGNTY.md — Research Sovereignty Manifest

"If this spec is implemented correctly, it is the last research document Alexander should need to request from a corporate AI." — Issue #972, March 22 2026

What This Is

A machine-readable declaration of Timmy's research independence: where we are, where we're going, and how to measure progress.

The Problem We're Solving

On March 22, 2026, a single Claude session produced six deep research reports. It consumed ~3 hours of human time and substantial corporate AI inference. Every report was valuable, but the workflow was linear: effort scaled one-to-one with output and nothing carried over, so reproducing the same reports tomorrow would cost exactly the same.

This file tracks the pipeline that crystallizes that workflow into something Timmy can run autonomously.

The Six-Step Pipeline

Step	What Happens	Status
1. Scope	Human describes knowledge gap → Gitea issue with template	✅ Done (`skills/research/`)
2. Query	LLM slot-fills template → 5–15 targeted queries	✅ Done (`research.py`)
3. Search	Execute queries → top result URLs	✅ Done (`research_tools.py`)
4. Fetch	Download + extract full pages (trafilatura)	✅ Done (`tools/system_tools.py`)
5. Synthesize	Compress findings → structured report	✅ Done (`research.py` cascade)
6. Deliver	Store to semantic memory + optional disk persist	✅ Done (`research.py`)
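
The six steps compose into a straight-line pipeline. The sketch below is illustrative only: every function name, the `Report` dataclass, and the stubbed search and fetch results are assumptions made for this example, not the contents of `research.py` or `research_tools.py`. Step 1 (Scope) is represented simply by the `topic` argument.

```python
from dataclasses import dataclass


@dataclass
class Report:
    topic: str
    queries: list[str]
    sources: list[str]
    body: str


def make_queries(topic: str, n: int = 5) -> list[str]:
    """Step 2 (Query): expand a topic into targeted queries (stub: fixed angles)."""
    angles = ["overview", "comparison", "benchmarks", "limitations", "tutorial"]
    return [f"{topic} {angle}" for angle in angles[:n]]


def search(query: str) -> list[str]:
    """Step 3 (Search): return result URLs (stub: one placeholder URL per query)."""
    return [f"https://example.invalid/{query.replace(' ', '-')}"]


def fetch(url: str) -> str:
    """Step 4 (Fetch): download and extract page text (stub: no network access)."""
    return f"extracted text from {url}"


def synthesize(topic: str, pages: list[str]) -> str:
    """Step 5 (Synthesize): compress findings into a report (stub: bullet list)."""
    return f"# {topic}\n" + "\n".join(f"- {page}" for page in pages)


def run_pipeline(topic: str, store: dict[str, Report]) -> Report:
    """Run steps 2-6 in order; `store` stands in for semantic memory (step 6)."""
    queries = make_queries(topic)
    urls = [url for query in queries for url in search(query)]
    pages = [fetch(url) for url in urls]
    report = Report(topic, queries, urls, synthesize(topic, pages))
    store[topic] = report  # Step 6 (Deliver)
    return report


memory: dict[str, Report] = {}
report = run_pipeline("PDF parsing", memory)
print(len(report.queries), len(report.sources))
```

Keeping each step a plain function with typed inputs and outputs means any single stage can be swapped (for example, a real search backend) without touching the others.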

Cascade Tiers (Synthesis Quality vs. Cost)

Tier	Model	Cost	Quality	Status
4	SQLite semantic cache	$0.00 / instant	reuses prior	✅ Active
3	Ollama `qwen3:14b`	$0.00 / local	★★★	✅ Active
2	Claude API (haiku)	~$0.01/report	★★★★	✅ Active (opt-in)
1	Groq `llama-3.3-70b`	$0.00 / rate-limited	★★★★	🔲 Planned (#980)

Set the ANTHROPIC_API_KEY environment variable to enable the opt-in Tier 2 (Claude API) fallback.
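
One way to picture the cascade is as an ordered list of backends tried cheapest first, with the paid tier included only when the key is present. This is a minimal sketch under stated assumptions: `make_cascade`, the backend callables, and the exact-match dictionary cache are hypothetical stand-ins (the table describes a SQLite semantic cache, which this does not reproduce), and the planned Tier 1 is omitted.

```python
import os
from typing import Callable, Mapping, Optional

# Each backend returns a report string, or None when it cannot answer.
Backend = Callable[[str], Optional[str]]


def make_cascade(
    cache: dict[str, str],
    local: Backend,
    api: Backend,
    env: Optional[Mapping[str, str]] = None,
) -> Callable[[str], tuple[str, str]]:
    """Build a synthesize() function that walks the tiers in cost order."""
    env = os.environ if env is None else env

    def synthesize(prompt: str) -> tuple[str, str]:
        """Return (report, backend_name), trying the cheapest tier first."""
        if prompt in cache:                    # Tier 4: reuse a prior report
            return cache[prompt], "cache"
        tiers = [("local", local)]             # Tier 3: local model
        if env.get("ANTHROPIC_API_KEY"):       # Tier 2: opt-in, needs a key
            tiers.append(("api", api))
        for name, backend in tiers:
            report = backend(prompt)
            if report is not None:
                cache[prompt] = report         # later calls hit the cache
                return report, name
        raise RuntimeError("no synthesis backend produced a report")

    return synthesize


# Usage with stub backends: the local model fails, so the API tier answers.
cache: dict[str, str] = {}
synth = make_cascade(
    cache,
    local=lambda prompt: None,
    api=lambda prompt: f"report on {prompt}",
    env={"ANTHROPIC_API_KEY": "test-key"},
)
first = synth("embedding models")
second = synth("embedding models")
print(first[1], second[1])  # prints: api cache
```

Passing `env` explicitly keeps the sketch testable without touching the real process environment.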

Research Templates

Six prompt templates live in skills/research/:

Template	Use Case
`tool_evaluation.md`	Find all shipping tools for `{domain}`
`architecture_spike.md`	How to connect `{system_a}` to `{system_b}`
`game_analysis.md`	Evaluate `{game}` for AI agent play
`integration_guide.md`	Wire `{tool}` into `{stack}` with code
`state_of_art.md`	What exists in `{field}` as of `{date}`
`competitive_scan.md`	How does `{project}` compare to `{alternatives}`
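
Slot filling with `{name}` placeholders maps directly onto Python's built-in string formatting. The sketch below assumes the templates use plain `{slot}` placeholders as shown in the table; the helper names and the one-line template text are hypothetical, and the real template files are presumably longer.

```python
from string import Formatter


def required_slots(template: str) -> set[str]:
    """Names of all {placeholders} that appear in the template text."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def fill_template(template: str, slots: dict[str, str]) -> str:
    """Substitute slot values, failing loudly if any placeholder is unfilled."""
    missing = required_slots(template) - set(slots)
    if missing:
        raise KeyError(f"missing slots: {sorted(missing)}")
    return template.format(**slots)


# Hypothetical one-line template text, modeled on the tool_evaluation row.
template = "Find all shipping tools for {domain}."
prompt = fill_template(template, {"domain": "PDF parsing"})
print(prompt)  # prints: Find all shipping tools for PDF parsing.
```

Checking for missing slots before formatting gives a single clear error that lists every unfilled placeholder, rather than failing on the first one.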

Sovereignty Metrics

Metric	Target (Week 1)	Target (Month 1)	Target (Month 3)	Graduation
Queries answered locally	10%	40%	80%	>90%
API cost per report	<$1.50	<$0.50	<$0.10	<$0.01
Time from question to report	<3 hours	<30 min	<5 min	<1 min
Human involvement	100% (review)	Review only	Approve only	None
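
The first three metrics can be derived from a per-report log. The following is a rough sketch, not the planned dashboard (#981): the `Run` record, the backend labels, and the sample numbers are invented for illustration, and it treats "answered locally" as "synthesized by the cache or the local model", which is an assumption about how the metric is defined.

```python
from dataclasses import dataclass


@dataclass
class Run:
    backend: str      # hypothetical labels: "cache", "local", or "api"
    cost_usd: float   # API spend for this report
    seconds: float    # time from question to report


LOCAL_BACKENDS = {"cache", "local"}


def sovereignty_metrics(runs: list[Run]) -> dict[str, float]:
    """Aggregate per-report logs into the first three metrics in the table."""
    if not runs:
        raise ValueError("need at least one run")
    n = len(runs)
    return {
        "local_fraction": sum(r.backend in LOCAL_BACKENDS for r in runs) / n,
        "cost_per_report": sum(r.cost_usd for r in runs) / n,
        "mean_seconds": sum(r.seconds for r in runs) / n,
    }


# Made-up sample log: three local answers and one paid API answer.
runs = [
    Run("local", 0.0, 240.0),
    Run("cache", 0.0, 1.0),
    Run("api", 0.01, 90.0),
    Run("local", 0.0, 300.0),
]
m = sovereignty_metrics(runs)

# Month 3 targets from the table: 80% local, <$0.10 per report, <5 min.
meets_month_3 = (
    m["local_fraction"] >= 0.80
    and m["cost_per_report"] < 0.10
    and m["mean_seconds"] < 5 * 60
)
print(m["local_fraction"], meets_month_3)  # prints: 0.75 False
```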

How to Use the Pipeline

import asyncio

from timmy.research import run_research


async def main() -> None:
    # Quick research (no template)
    result = await run_research("best local embedding models for 36GB RAM")

    # With a template and slot values
    result = await run_research(
        topic="PDF text extraction libraries for Python",
        template="tool_evaluation",
        slots={"domain": "PDF parsing", "use_case": "RAG pipeline", "focus_criteria": "accuracy"},
        save_to_disk=True,
    )

    print(result.report)
    print(f"Backend: {result.synthesis_backend}, Cached: {result.cached}")


# run_research is a coroutine, so it must be awaited inside an event loop.
asyncio.run(main())

Implementation Status

Component	Issue	Status
`web_fetch` tool (trafilatura)	#973	✅ Done
Research template library (6 templates)	#974	✅ Done
`ResearchOrchestrator` (`research.py`)	#975	✅ Done
Semantic index for outputs	#976	🔲 Planned
Auto-create Gitea issues from findings	#977	🔲 Planned
Paperclip task runner integration	#978	🔲 Planned
Kimi delegation via labels	#979	🔲 Planned
Groq free-tier cascade tier	#980	🔲 Planned
Sovereignty metrics dashboard	#981	🔲 Planned

Governing Spec

See issue #972 for the full spec and rationale.

Research artifacts are committed to docs/research/.
