# SOVEREIGNTY.md — Research Sovereignty Manifest

> "If this spec is implemented correctly, it is the last research document
> Alexander should need to request from a corporate AI."
> — Issue #972, March 22 2026

---

## What This Is

A machine-readable declaration of Timmy's research independence:
where we are, where we're going, and how to measure progress.

---

## The Problem We're Solving

On March 22, 2026, a single Claude session produced six deep research reports.
The session consumed ~3 hours of human time and substantial corporate AI inference.
Every report was valuable, but the workflow was **linear**: nothing from the run
is reused, so it would cost exactly the same to reproduce tomorrow.

This file tracks the pipeline that crystallizes that workflow into something
Timmy can run autonomously.

---

## The Six-Step Pipeline

| Step | What Happens | Status |
|------|-------------|--------|
| 1. Scope | Human describes knowledge gap → Gitea issue with template | ✅ Done (`skills/research/`) |
| 2. Query | LLM slot-fills template → 5–15 targeted queries | ✅ Done (`research.py`) |
| 3. Search | Execute queries → top result URLs | ✅ Done (`research_tools.py`) |
| 4. Fetch | Download + extract full pages (trafilatura) | ✅ Done (`tools/system_tools.py`) |
| 5. Synthesize | Compress findings → structured report | ✅ Done (`research.py` cascade) |
| 6. Deliver | Store to semantic memory + optional disk persist | ✅ Done (`research.py`) |
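
The six steps in the table can be read as one function that threads a topic through five callables. The following is a minimal sketch under stated assumptions, not the code in `research.py`: `run_pipeline`, `PipelineResult`, and every stub passed to it are hypothetical names chosen for illustration, and the stubs return canned strings instead of calling an LLM, a search engine, or trafilatura.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    topic: str
    queries: list[str]
    urls: list[str]
    report: str


def run_pipeline(
    topic: str,
    make_queries: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    fetch: Callable[[str], str],
    synthesize: Callable[[str, list[str]], str],
    deliver: Callable[[str, str], None],
) -> PipelineResult:
    # Step 1 (Scope) is the human-written topic passed in by the caller.
    queries = make_queries(topic)              # Step 2: Query
    urls: list[str] = []
    for query in queries:                      # Step 3: Search
        for url in search(query):
            if url not in urls:                # de-duplicate, keep first-seen order
                urls.append(url)
    pages = [fetch(url) for url in urls]       # Step 4: Fetch
    report = synthesize(topic, pages)          # Step 5: Synthesize
    deliver(topic, report)                     # Step 6: Deliver
    return PipelineResult(topic, queries, urls, report)


# Hypothetical stand-ins so the sketch runs without network or model access.
store: dict[str, str] = {}
result = run_pipeline(
    "local embedding models",
    make_queries=lambda t: [f"{t} benchmark", f"{t} comparison"],
    search=lambda q: ["https://example.com/shared", "https://example.com/" + q.split()[-1]],
    fetch=lambda url: f"text of {url}",
    synthesize=lambda t, pages: f"{t}: {len(pages)} sources",
    deliver=store.__setitem__,
)
print(result.report)
```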

---

## Cascade Tiers (Synthesis Quality vs. Cost)

| Tier | Model | Cost | Quality | Status |
|------|-------|------|---------|--------|
| **4** | SQLite semantic cache | $0.00 / instant | reuses prior | ✅ Active |
| **3** | Ollama `qwen3:14b` | $0.00 / local | ★★★ | ✅ Active |
| **2** | Claude API (haiku) | ~$0.01/report | ★★★★ | ✅ Active (opt-in) |
| **1** | Groq `llama-3.3-70b` | $0.00 / rate-limited | ★★★★ | 🔲 Planned (#980) |

Set `ANTHROPIC_API_KEY` to enable Tier 2 fallback.
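
One way to picture the cascade is a list of backends tried in order, where the first one that returns a report wins. The sketch below assumes the active tiers are tried cache first, then the local model, then the opt-in paid API; the source does not spell out that order. `synthesize_with_cascade`, `build_tiers`, and the backend labels are hypothetical, an exact-match dict stands in for the semantic cache, and a plain dict stands in for `os.environ` so the example is deterministic.

```python
from typing import Callable, Mapping, Optional

# A backend takes a prompt and returns a report, or None if it cannot answer.
Backend = Callable[[str], Optional[str]]


def build_tiers(
    cache: dict[str, str],
    local: Backend,
    paid: Backend,
    env: Mapping[str, str],
) -> list[tuple[str, Backend]]:
    """Assemble the active tiers, cheapest first."""
    tiers: list[tuple[str, Backend]] = [("cache", cache.get), ("ollama", local)]
    if env.get("ANTHROPIC_API_KEY"):  # the paid tier is opt-in
        tiers.append(("claude", paid))
    return tiers


def synthesize_with_cascade(prompt: str, tiers: list[tuple[str, Backend]]) -> tuple[str, str]:
    """Return (backend_name, report) from the first tier that answers."""
    for name, backend in tiers:
        try:
            report = backend(prompt)
        except Exception:
            # A failing tier (model not running, network error) falls through.
            continue
        if report:
            return name, report
    raise RuntimeError("no cascade tier produced a report")


def broken_local(prompt: str) -> Optional[str]:
    # Simulates the local model being unavailable.
    raise ConnectionError("local model is not running")


def fake_paid(prompt: str) -> Optional[str]:
    return f"report for {prompt}"


tiers = build_tiers({}, broken_local, fake_paid, {"ANTHROPIC_API_KEY": "set"})
print(synthesize_with_cascade("pdf parsing", tiers))
```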

---

## Research Templates

Six prompt templates live in `skills/research/`:

| Template | Use Case |
|----------|----------|
| `tool_evaluation.md` | Find all shipping tools for `{domain}` |
| `architecture_spike.md` | How to connect `{system_a}` to `{system_b}` |
| `game_analysis.md` | Evaluate `{game}` for AI agent play |
| `integration_guide.md` | Wire `{tool}` into `{stack}` with code |
| `state_of_art.md` | What exists in `{field}` as of `{date}` |
| `competitive_scan.md` | How does `{project}` compare to `{alternatives}` |
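
The `{slot}` placeholders in the table use the same syntax as Python's `str.format`, so slot filling can be sketched with the standard library alone. This is an illustration under assumptions, not the loader in `research.py`: `required_slots` and `fill_template` are hypothetical helpers, and the template strings are the one-line use cases from the table rather than the real file contents.

```python
from string import Formatter


def required_slots(template_text: str) -> set[str]:
    """Names of all {placeholders} found in a template."""
    return {name for _, name, _, _ in Formatter().parse(template_text) if name}


def fill_template(template_text: str, slots: dict[str, str]) -> str:
    """Substitute slot values, failing loudly if any placeholder is unfilled."""
    missing = required_slots(template_text) - set(slots)
    if missing:
        raise KeyError(f"missing slot values: {sorted(missing)}")
    # Extra keys in `slots` are ignored by str.format.
    return template_text.format(**slots)


# Stand-in text taken from the "Use Case" column, not the real template file.
spike = "How to connect {system_a} to {system_b}"
print(required_slots(spike))
print(fill_template("Find all shipping tools for {domain}", {"domain": "PDF parsing"}))
```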

---

## Sovereignty Metrics

| Metric | Target (Week 1) | Target (Month 1) | Target (Month 3) | Graduation |
|--------|-----------------|------------------|------------------|------------|
| Queries answered locally | 10% | 40% | 80% | >90% |
| API cost per report | <$1.50 | <$0.50 | <$0.10 | <$0.01 |
| Time from question to report | <3 hours | <30 min | <5 min | <1 min |
| Human involvement | 100% (review) | Review only | Approve only | None |
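
The first three rows of the table are simple aggregates over a log of pipeline runs. The sketch below shows the arithmetic only and rests on assumptions: `RunRecord`, its fields, and the backend labels are hypothetical, the sample numbers are made up for the demonstration, and "answered locally" is approximated per report (a run counts as local when its synthesis came from the cache or the local model).

```python
from dataclasses import dataclass

# Hypothetical labels for tiers that incur no external API call.
LOCAL_BACKENDS = {"cache", "ollama"}


@dataclass
class RunRecord:
    backend: str         # which tier produced the report
    api_cost_usd: float  # external API spend for this run
    seconds: float       # time from question to report


def sovereignty_metrics(runs: list[RunRecord]) -> dict[str, float]:
    if not runs:
        raise ValueError("no runs recorded")
    n = len(runs)
    local = sum(1 for r in runs if r.backend in LOCAL_BACKENDS)
    return {
        "local_pct": 100.0 * local / n,
        "cost_per_report_usd": sum(r.api_cost_usd for r in runs) / n,
        "mean_seconds": sum(r.seconds for r in runs) / n,
    }


# Made-up sample data, for illustration only.
runs = [
    RunRecord("cache", 0.0, 2.0),
    RunRecord("ollama", 0.0, 240.0),
    RunRecord("claude", 0.01, 60.0),
    RunRecord("ollama", 0.0, 178.0),
]
metrics = sovereignty_metrics(runs)
print(metrics)

# Compare against the Month 1 column of the table.
meets_month1 = (
    metrics["local_pct"] >= 40.0
    and metrics["cost_per_report_usd"] < 0.50
    and metrics["mean_seconds"] < 30 * 60
)
print(meets_month1)
```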

---

## How to Use the Pipeline

```python
import asyncio

from timmy.research import run_research


async def main() -> None:
    # Quick research (no template)
    result = await run_research("best local embedding models for 36GB RAM")

    # With a template and slot values
    result = await run_research(
        topic="PDF text extraction libraries for Python",
        template="tool_evaluation",
        slots={"domain": "PDF parsing", "use_case": "RAG pipeline", "focus_criteria": "accuracy"},
        save_to_disk=True,
    )

    print(result.report)
    print(f"Backend: {result.synthesis_backend}, Cached: {result.cached}")


# run_research is awaited, so it has to be driven from an event loop
asyncio.run(main())
```

---

## Implementation Status

| Component | Issue | Status |
|-----------|-------|--------|
| `web_fetch` tool (trafilatura) | #973 | ✅ Done |
| Research template library (6 templates) | #974 | ✅ Done |
| `ResearchOrchestrator` (`research.py`) | #975 | ✅ Done |
| Semantic index for outputs | #976 | 🔲 Planned |
| Auto-create Gitea issues from findings | #977 | 🔲 Planned |
| Paperclip task runner integration | #978 | 🔲 Planned |
| Kimi delegation via labels | #979 | 🔲 Planned |
| Groq free-tier cascade tier | #980 | 🔲 Planned |
| Sovereignty metrics dashboard | #981 | 🔲 Planned |

---

## Governing Spec

See [issue #972](http://143.198.27.163:3000/Rockachopa/Timmy-time-dashboard/issues/972) for the full spec and rationale.

Research artifacts are committed to `docs/research/`.