Commit `81ee0557d6` by Alexander Whitestone
feat: implement autonomous research pipeline (#972)
Closes three P0 items from the governing research sovereignty spec:

- `src/timmy/research.py` — ResearchOrchestrator (6-step pipeline):
  Step 0 semantic cache check (SQLite, instant, $0 cost)
  Step 1 research template loading from skills/research/
  Step 2 query formulation via Ollama slot-fill
  Step 3 web search via SerpAPI (graceful fallback when key absent)
  Step 4 full-page fetch via trafilatura (web_fetch)
  Step 5 synthesis via cascade (Ollama → Claude API fallback)
  Step 6 store to semantic memory + optional disk persist

- `tests/timmy/test_research.py` — 24 unit tests, all passing

- `SOVEREIGNTY.md` — machine-readable research independence manifest
  with pipeline status, cascade tiers, templates, and metrics targets

Refs #972 (governing spec), #973 (web_fetch), #974 (templates), #975 (orchestrator)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 21:39:58 -04:00


# SOVEREIGNTY.md — Research Sovereignty Manifest

> "If this spec is implemented correctly, it is the last research document Alexander should need to request from a corporate AI." — Issue #972, March 22 2026


## What This Is

A machine-readable declaration of Timmy's research independence: where we are, where we're going, and how to measure progress.


## The Problem We're Solving

On March 22, 2026, a single Claude session produced six deep research reports. It consumed ~3 hours of human time and substantial corporate AI inference. Every report was valuable — but the workflow was linear. It would cost exactly the same to reproduce tomorrow.

This file tracks the pipeline that crystallizes that workflow into something Timmy can run autonomously.


## The Six-Step Pipeline

| Step | What Happens | Status |
|------|--------------|--------|
| 1. Scope | Human describes knowledge gap → Gitea issue with template | Done (`skills/research/`) |
| 2. Query | LLM slot-fills template → 5–15 targeted queries | Done (`research.py`) |
| 3. Search | Execute queries → top result URLs | Done (`research_tools.py`) |
| 4. Fetch | Download + extract full pages (trafilatura) | Done (`tools/system_tools.py`) |
| 5. Synthesize | Compress findings → structured report | Done (`research.py` cascade) |
| 6. Deliver | Store to semantic memory + optional disk persist | Done (`research.py`) |
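The steps above can be sketched as one async function. This is a minimal illustration, not the actual `research.py` API: every helper name here (`cache_lookup`, `formulate_queries`, `synthesize`, and friends) is a hypothetical stand-in, stubbed so the flow is visible end to end.

```python
import asyncio
from dataclasses import dataclass

# --- hypothetical stand-ins for the real pipeline stages ---
_CACHE: dict[str, str] = {}

def cache_lookup(topic):                       # Step 0: semantic cache check
    return _CACHE.get(topic)

def load_template(name):                       # Step 1: template loading
    return f"Survey {{topic}} ({name})"

def formulate_queries(template, topic):        # Step 2: LLM slot-fill
    return [template.replace("{topic}", topic)]

def search(query):                             # Step 3: web search
    return [f"https://example.org/{abs(hash(query)) % 100}"]

async def fetch(url):                          # Step 4: full-page fetch
    return f"page text from {url}"

def synthesize(pages, topic):                  # Step 5: cascade synthesis
    return f"Report on {topic} from {len(pages)} page(s)"

@dataclass
class ResearchResult:
    report: str
    synthesis_backend: str
    cached: bool

async def run_pipeline(topic: str) -> ResearchResult:
    if (hit := cache_lookup(topic)) is not None:        # instant, $0
        return ResearchResult(hit, "cache", True)
    template = load_template("state_of_art")
    queries = formulate_queries(template, topic)
    urls = [u for q in queries for u in search(q)]
    pages = await asyncio.gather(*(fetch(u) for u in urls))
    report = synthesize(pages, topic)
    _CACHE[topic] = report                              # Step 6: store for reuse
    return ResearchResult(report, "ollama", False)

first = asyncio.run(run_pipeline("local embeddings"))
second = asyncio.run(run_pipeline("local embeddings"))
```

On the second call for the same topic the cache hit short-circuits everything after Step 0, which is where the "$0 / instant" tier comes from.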

## Cascade Tiers (Synthesis Quality vs. Cost)

| Tier | Model | Cost | Quality | Status |
|------|-------|------|---------|--------|
| 4 | SQLite semantic cache | $0.00 / instant | reuses prior | Active |
| 3 | Ollama qwen3:14b | $0.00 / local | ★★★ | Active |
| 2 | Claude API (haiku) | ~$0.01/report | ★★★★ | Active (opt-in) |
| 1 | Groq llama-3.3-70b | $0.00 / rate-limited | ★★★★ | 🔲 Planned (#980) |

Set `ANTHROPIC_API_KEY` to enable the Tier 2 fallback.
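The cheapest-first walk through the tiers might look like this sketch. `ollama_synthesize` and `claude_synthesize` are hypothetical stand-ins for the real backends; the only load-bearing idea is that the paid tier is attempted only when the opt-in key is set.

```python
import os

def ollama_synthesize(text: str) -> str:
    # Tier 3: local model, $0. May raise if the Ollama server is down.
    return f"[ollama] {text[:60]}"

def claude_synthesize(text: str) -> str:
    # Tier 2: paid API, ~$0.01/report. Only reached on local failure.
    return f"[claude] {text[:60]}"

def cascade_synthesize(text: str) -> tuple[str, str]:
    """Return (report, backend), walking tiers cheapest-first."""
    try:
        return ollama_synthesize(text), "ollama"
    except Exception:
        if os.environ.get("ANTHROPIC_API_KEY"):
            return claude_synthesize(text), "claude-haiku"
        raise RuntimeError("no synthesis backend available")

report, backend = cascade_synthesize("raw findings from fetched pages")
```

The semantic cache (Tier 4) sits in front of this function entirely, so the cascade only runs on cache misses.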


## Research Templates

Six prompt templates live in `skills/research/`:

| Template | Use Case |
|----------|----------|
| `tool_evaluation.md` | Find all shipping tools for `{domain}` |
| `architecture_spike.md` | How to connect `{system_a}` to `{system_b}` |
| `game_analysis.md` | Evaluate `{game}` for AI agent play |
| `integration_guide.md` | Wire `{tool}` into `{stack}` with code |
| `state_of_art.md` | What exists in `{field}` as of `{date}` |
| `competitive_scan.md` | How does `{project}` compare to `{alternatives}` |
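Slot-filling is plain string substitution over the `{braced}` names in the table above. A minimal sketch, with the template reduced to an inline string for illustration (the real templates are full markdown files):

```python
# Hypothetical inline stand-in for skills/research/tool_evaluation.md
TOOL_EVALUATION = (
    "Find all shipping tools for {domain}. "
    "Use case: {use_case}. Rank candidates by {focus_criteria}."
)

def fill_slots(template: str, slots: dict[str, str]) -> str:
    # format_map raises KeyError on a missing slot, which surfaces
    # incomplete issue templates before any search spend.
    return template.format_map(slots)

query = fill_slots(TOOL_EVALUATION, {
    "domain": "PDF parsing",
    "use_case": "RAG pipeline",
    "focus_criteria": "accuracy",
})
```

In the real pipeline the LLM proposes the slot values (Step 2); the substitution itself needs no model at all.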

## Sovereignty Metrics

| Metric | Target (Week 1) | Target (Month 1) | Target (Month 3) | Graduation |
|--------|-----------------|------------------|------------------|------------|
| Queries answered locally | 10% | 40% | 80% | >90% |
| API cost per report | <$1.50 | <$0.50 | <$0.10 | <$0.01 |
| Time from question to report | <3 hours | <30 min | <5 min | <1 min |
| Human involvement | 100% (review) | Review only | Approve only | None |
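One way the first three metrics could be computed from per-report logs. The `ReportLog` shape and backend names are assumptions for illustration, not the schema of the planned dashboard (#981):

```python
from dataclasses import dataclass

@dataclass
class ReportLog:
    backend: str      # e.g. "cache", "ollama", "claude-haiku"
    api_cost: float   # USD spent on corporate API calls
    minutes: float    # question-to-report wall time

LOCAL_BACKENDS = {"cache", "ollama"}  # $0, no corporate inference

def sovereignty_metrics(logs: list[ReportLog]) -> dict[str, float]:
    local = sum(log.backend in LOCAL_BACKENDS for log in logs)
    return {
        "local_rate": local / len(logs),
        "cost_per_report": sum(l.api_cost for l in logs) / len(logs),
        "avg_minutes": sum(l.minutes for l in logs) / len(logs),
    }

logs = [
    ReportLog("ollama", 0.00, 4.0),
    ReportLog("claude-haiku", 0.01, 3.0),
    ReportLog("cache", 0.00, 0.1),
    ReportLog("ollama", 0.00, 6.0),
]
m = sovereignty_metrics(logs)  # local_rate 0.75: past the Week-1 target
```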

## How to Use the Pipeline

```python
from timmy.research import run_research

# run_research is a coroutine: call it from within an async
# context (e.g. via asyncio.run()).

# Quick research (no template)
result = await run_research("best local embedding models for 36GB RAM")

# With a template and slot values
result = await run_research(
    topic="PDF text extraction libraries for Python",
    template="tool_evaluation",
    slots={"domain": "PDF parsing", "use_case": "RAG pipeline", "focus_criteria": "accuracy"},
    save_to_disk=True,
)

print(result.report)
print(f"Backend: {result.synthesis_backend}, Cached: {result.cached}")
```

## Implementation Status

| Component | Issue | Status |
|-----------|-------|--------|
| web_fetch tool (trafilatura) | #973 | Done |
| Research template library (6 templates) | #974 | Done |
| ResearchOrchestrator (`research.py`) | #975 | Done |
| Semantic index for outputs | #976 | 🔲 Planned |
| Auto-create Gitea issues from findings | #977 | 🔲 Planned |
| Paperclip task runner integration | #978 | 🔲 Planned |
| Kimi delegation via labels | #979 | 🔲 Planned |
| Groq free-tier cascade tier | #980 | 🔲 Planned |
| Sovereignty metrics dashboard | #981 | 🔲 Planned |

## Governing Spec

See issue #972 for the full spec and rationale.

Research artifacts are committed to `docs/research/`.