compounding-intelligence/templates/harvest-prompt.md
Timmy b65256bf76 feat: build bootstrapper.py - pre-session context assembler
Assembles relevant knowledge from the store into a compact 2k-token
context block for session injection.

Features:
- Filter by repo, agent type, and global scope
- Sort by confidence (pitfalls first, patterns, facts)
- Per-repo and per-agent markdown knowledge files
- Graceful empty-store handling
- JSON output mode for programmatic use
- Token-count-aware truncation at line boundaries

Closes #11
2026-04-14 14:05:30 -04:00


Knowledge Extraction Prompt

System Prompt

You are a knowledge extraction engine. You read session transcripts and output ONLY structured JSON. You never infer. You never assume. You extract only what the transcript explicitly states.

Prompt

TASK: Extract durable knowledge from this session transcript.

RULES:
1. Extract ONLY information explicitly stated in the transcript.
2. Do NOT infer, assume, or hallucinate.
3. Every fact must be verifiable by pointing to a specific line in the transcript.
4. If the session failed or was partial, extract pitfalls and questions — these are the most valuable.
5. Be specific. "Gitea API is slow" is worthless. "Gitea issues endpoint with state=open returns empty when limit=50 but works with limit=5" is knowledge.

CATEGORIES (assign exactly one per item):
- fact: Concrete, verifiable thing learned (paths, formats, counts, configs)
- pitfall: Error hit, wrong assumption, time wasted, thing that didn't work
- pattern: Successful sequence that should be reused (deploy steps, debug flow)
- tool-quirk: Environment-specific behavior (token paths, URL formats, API gotchas)
- question: Something identified but not answered — the NEXT agent should investigate

CONFIDENCE:
- 0.9: Directly observed with error output or explicit verification
- 0.7: Multiple data points confirm, but not explicitly verified
- 0.5: Suggested by context, not tested
- 0.3: Inferred from limited evidence

OUTPUT FORMAT (valid JSON only, no markdown, no explanation):
{
  "knowledge": [
    {
      "fact": "One specific sentence of knowledge",
      "category": "fact|pitfall|pattern|tool-quirk|question",
      "repo": "repo-name or global",
      "confidence": 0.0-1.0,
      "evidence": "Brief quote or reference from transcript that supports this"
    }
  ],
  "meta": {
    "session_outcome": "success|partial|failed",
    "tools_used": ["tool1", "tool2"],
    "repos_touched": ["repo1"],
    "error_count": 0,
    "knowledge_count": 0
  }
}

TRANSCRIPT:
{{transcript}}
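
A response to this prompt can be checked mechanically before anything is stored. The following is a minimal sketch, not part of the repository: `validate_harvest` is a hypothetical helper, and the rule that `knowledge_count` equals the number of items is an assumption (the template only shows a placeholder `0`).

```python
import json

CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
OUTCOMES = {"success", "partial", "failed"}
ITEM_KEYS = {"fact", "category", "repo", "confidence", "evidence"}


def validate_harvest(raw: str) -> list[str]:
    """Return the problems found in a model response; an empty list means it passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Markdown fences or explanatory prose around the JSON end up here.
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top level is not an object"]

    items = data.get("knowledge")
    if not isinstance(items, list):
        return ["'knowledge' is missing or not a list"]

    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        missing = ITEM_KEYS - item.keys()
        if missing:
            problems.append(f"item {i}: missing {sorted(missing)}")
        category = item.get("category")
        if not isinstance(category, str) or category not in CATEGORIES:
            problems.append(f"item {i}: unknown category {category!r}")
        conf = item.get("confidence")
        is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
        if not is_number or not 0.0 <= conf <= 1.0:
            problems.append(f"item {i}: confidence must be a number in [0, 1]")

    meta = data.get("meta")
    if not isinstance(meta, dict):
        problems.append("'meta' is missing or not an object")
    else:
        outcome = meta.get("session_outcome")
        if not isinstance(outcome, str) or outcome not in OUTCOMES:
            problems.append("meta.session_outcome is not success|partial|failed")
        # Assumption: knowledge_count is meant to equal len(knowledge).
        if meta.get("knowledge_count") != len(items):
            problems.append("meta.knowledge_count does not match the number of items")
    return problems
```

A non-empty result is a reason to retry or discard the response rather than to repair it by hand.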

Design Notes

Why this works with mimo-v2-pro

Mimo needs:

  • Explicit format constraints ("valid JSON only, no markdown")
  • Clear category definitions with concrete examples
  • Hard rules before soft guidance
  • The transcript at the END (so it reads all instructions first)

This prompt front-loads all rules and gives the transcript last, which matches each of the needs listed above.
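
The transcript-last ordering can be enforced when the template is filled in. This is a sketch under assumptions: `render_prompt` is a hypothetical helper, and it treats the `{{transcript}}` marker from the template above as a plain string rather than using a real template engine.

```python
MARKER = "{{transcript}}"


def render_prompt(template: str, transcript: str) -> str:
    """Substitute the transcript into the template, keeping it as the final section."""
    if template.count(MARKER) != 1:
        raise ValueError("template must contain exactly one {{transcript}} marker")
    if not template.rstrip().endswith(MARKER):
        # Instructions placed after the transcript would break the ordering
        # the design relies on.
        raise ValueError("the {{transcript}} marker must be the last thing in the template")
    head, _, tail = template.partition(MARKER)
    return head + transcript + tail
```

Splitting on the marker instead of calling a template engine means braces inside the transcript are passed through untouched.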

Handling partial/failed sessions

Failed sessions are the richest source of pitfalls. The prompt explicitly says:

"If the session failed or was partial, extract pitfalls and questions — these are the most valuable."

This reframes failure as valuable output, not noise to discard.

The evidence field

The evidence field was added to the original spec. Every extracted item must cite where in the transcript it came from. This:

  • Discourages hallucination (an invented item has nothing in the transcript to cite)
  • Enables verification (reviewer can check the source)
  • Encourages confidence calibration (the agent must find evidence, not just claim it)
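
The verification step can be partly automated when the evidence is a direct quote. The sketch below uses hypothetical helper names and assumes a case-insensitive, whitespace-insensitive substring match is good enough. Because the prompt also allows a "reference" instead of a quote, an unmatched item is a candidate for human review, not proof of fabrication.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line wrapping does not break matching."""
    return re.sub(r"\s+", " ", text).strip().lower()


def evidence_is_grounded(evidence: str, transcript: str) -> bool:
    """True if the evidence string appears in the transcript after normalization."""
    needle = _normalize(evidence.strip("\"'"))
    return bool(needle) and needle in _normalize(transcript)


def split_by_grounding(items: list[dict], transcript: str) -> tuple[list[dict], list[dict]]:
    """Separate items whose evidence was found verbatim from those that need review."""
    grounded, unverified = [], []
    for item in items:
        evidence = str(item.get("evidence") or "")
        target = grounded if evidence_is_grounded(evidence, transcript) else unverified
        target.append(item)
    return grounded, unverified
```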

Token budget

Target: ~1,000 tokens for the prompt (excluding transcript).

System prompt:        ~50 tokens
Rules:               ~200 tokens
Categories:          ~150 tokens
Confidence:          ~100 tokens
Output format:       ~200 tokens
Design notes:        NOT included in prompt (documentation only)
─────────────────────────────
Total prompt:        ~700 tokens

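Leaves ~300 tokens of headroom below the ~1,000-token target for variable content (text around the transcript insertion point, edge cases). The transcript itself is not counted against this budget.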
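
The arithmetic above can be kept honest with a small check. The per-section figures are copied from the table. `estimate_tokens` is a hypothetical helper using a rough four-characters-per-token heuristic, which is an assumption and not mimo's actual tokenizer.

```python
# Per-section estimates, copied from the budget table.
BUDGET = {
    "System prompt": 50,
    "Rules": 200,
    "Categories": 150,
    "Confidence": 100,
    "Output format": 200,
}
TARGET = 1000  # prompt-only target, transcript excluded

total = sum(BUDGET.values())
headroom = TARGET - total


def estimate_tokens(text: str) -> int:
    """Rough size estimate: about 4 characters per token, rounded up (assumption)."""
    return (len(text) + 3) // 4


def fits_budget(prompt_without_transcript: str) -> bool:
    """Check the rendered prompt, minus the transcript, against the target."""
    return estimate_tokens(prompt_without_transcript) <= TARGET


print(f"total={total} headroom={headroom}")  # total=700 headroom=300
```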

What this replaces

The v1 prompt had:

  • Verbose prose explanations (wasted tokens for mimo)
  • No evidence field (hallucination risk)
  • No meta block (no session-level metadata)
  • No explicit handling of failed sessions
  • An overly long example (~150 tokens of example in a 1k-token prompt)

This v2 is tighter, more structured, and adds the evidence requirement that prevents the #1 failure mode of extraction prompts: generating plausible-sounding facts that aren't in the transcript.