Compare commits

..

8 Commits

22 changed files with 687 additions and 2963 deletions

View File

@@ -1,171 +0,0 @@
# Knowledge File Format Specification
**Version:** 1
**Issue:** #10
**Status:** Draft
---
## Overview
The knowledge system has two layers:
1. **index.json** — Machine-readable fact index. Fast lookups by ID, category, repo, tags.
2. **Knowledge files** (YAML) — Human-readable, editable facts organized by domain.
The harvester writes to both. The bootstrapper reads from index.json. Humans edit the YAML files directly.
---
## index.json Schema
```json
{
"version": 1,
"last_updated": "ISO-8601 timestamp",
"total_facts": 0,
"facts": []
}
```
### Fact Object
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | yes | Unique identifier: `{domain}:{category}:{sequence}` |
| `fact` | string | yes | One-sentence description of the knowledge |
| `category` | enum | yes | One of: `fact`, `pitfall`, `pattern`, `tool-quirk`, `question` |
| `domain` | string | yes | Where this applies: repo name, `global`, or agent name |
| `confidence` | float | yes | 0.01.0. How certain is this knowledge? |
| `tags` | string[] | no | Searchable labels: `["git", "auth", "gitea"]` |
| `source_count` | int | no | How many sessions confirmed this fact |
| `first_seen` | date | no | ISO-8601 date first extracted |
| `last_confirmed` | date | no | ISO-8601 date last seen in a session |
| `expires` | date | no | Optional. After this date, fact is stale |
| `related` | string[] | no | IDs of related facts |
### ID Format
```
{domain}:{category}:{sequence}
```
- `domain` — repo name, `global`, or agent type
- `category` — one of the 5 categories
- `sequence` — zero-padded 3-digit number: `001`, `002`, ...
Examples:
- `the-nexus:pitfall:001`
- `global:tool-quirk:012`
- `hermes-agent:pattern:003`
### Categories
| Category | Definition | Example |
|----------|------------|---------|
| `fact` | Concrete, verifiable information | "Gitea API requires token auth at /api/v1" |
| `pitfall` | Errors, wrong assumptions, time-wasters | "Assumed env var GITEA_TOKEN; actual path is ~/.config/gitea/token" |
| `pattern` | Successful sequences of actions | "To deploy: test → build → push → webhook" |
| `tool-quirk` | Environment-specific behaviors | "URL format requires trailing slash on macOS" |
| `question` | Identified but unanswered | "Need optimal batch size for harvesting" |
### Confidence Scoring
| Range | Meaning |
|-------|---------|
| 0.91.0 | Explicitly stated and verified |
| 0.70.8 | Clearly implied by multiple data points |
| 0.50.6 | Suggested but not fully verified |
| 0.30.4 | Inferred from limited data |
| 0.10.2 | Speculative or uncertain |
---
## Knowledge Files (YAML)
Human-readable files stored in `knowledge/` subdirectories.
### Directory Structure
```
knowledge/
├── index.json # Machine-readable fact index
├── SCHEMA.md # This file
├── global/ # Cross-repo knowledge
│ ├── pitfalls.yaml # Pitfalls that span multiple repos
│ ├── patterns.yaml # Proven workflows
│ └── tool-quirks.yaml # Environment behaviors
├── repos/ # Per-repo knowledge
│ ├── the-nexus.yaml
│ ├── hermes-agent.yaml
│ └── ...
└── agents/ # Agent-type knowledge
├── mimo-sprint.yaml
└── ...
```
### YAML File Format
```yaml
---
domain: global # or repo name or agent name
category: tool-quirk # fact, pitfall, pattern, tool-quirk, question
version: 1
last_updated: "2026-04-13"
---
# Tool Quirks (Global)
Cross-environment behaviors that bite you if you don't know them.
## Authentication
- id: global:tool-quirk:001
fact: "Gitea token stored at ~/.config/gitea/token, not env var"
confidence: 0.95
tags: [git, auth, gitea]
source_count: 23
first_seen: "2026-03-27"
last_confirmed: "2026-04-13"
related: [global:pitfall:003]
- id: global:tool-quirk:002
fact: "Gitea API uses 'token' header format: Authorization: token TOKEN"
confidence: 0.9
tags: [git, api, gitea]
source_count: 8
first_seen: "2026-03-28"
```
### Rules
1. **One file per domain per category.** `repos/the-nexus.yaml` holds all the-nexus facts. Don't mix categories across files.
2. **Markdown sections for humans.** The YAML items live under markdown headers. This makes the files readable in Gitea's UI.
3. **ID is the link.** The `id` field connects YAML facts to index.json entries. Same ID = same fact.
4. **Harvester writes, humans edit.** The harvester appends new facts. Humans can correct confidence, add tags, or mark expired.
---
## Sync Rules
1. **Harvester → YAML:** Appends new facts to the appropriate YAML file.
2. **Harvester → index.json:** Adds/updates fact entries.
3. **Human edits YAML:** Changes propagate to index.json on next harvester run.
4. **Confidence decay:** Facts not confirmed in 30+ sessions get confidence *= 0.9.
5. **Expiration:** Facts with `expires` date past current date are marked `stale` in index.json.
---
## Validation
Facts must pass these checks before entering the index:
1. `id` matches format `{domain}:{category}:{sequence}`
2. `category` is one of the 5 allowed values
3. `confidence` is between 0.0 and 1.0
4. `fact` is non-empty string, max 280 characters
5. `domain` is non-empty string
6. `tags` are lowercase alphanumeric + hyphens
7. No duplicate IDs in index.json
Validation script: `scripts/validate_knowledge.py`

View File

@@ -1,80 +0,0 @@
---
domain: global
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (Global)
Cross-repo traps that waste time across the fleet.
## Git & Forge
- id: global:pitfall:001
fact: "Branch protection requires 1 approval on main — API merges fail with 405 without it"
confidence: 0.95
tags: [git, merge, branch-protection, gitea]
source_count: 12
first_seen: "2026-04-05"
last_confirmed: "2026-04-13"
related: [the-nexus:pitfall:001]
- id: global:pitfall:002
fact: "Never use --no-verify on git commits — it bypasses all hooks including safety checks"
confidence: 0.95
tags: [git, hooks, safety]
source_count: 5
first_seen: "2026-03-28"
last_confirmed: "2026-04-13"
- id: global:pitfall:003
fact: "Gitea PR creation workaround needed on the-nexus — direct API call fails, use alternative endpoint"
confidence: 0.9
tags: [gitea, pr, api, workaround]
source_count: 4
first_seen: "2026-04-06"
last_confirmed: "2026-04-12"
## Agent Operations
- id: global:pitfall:004
fact: "Anthropic is BANNED from fallback chain — if fallback triggers to Anthropic, something is wrong"
confidence: 0.95
tags: [provider, anthropic, fallback]
source_count: 7
first_seen: "2026-03-30"
last_confirmed: "2026-04-13"
- id: global:pitfall:005
fact: "Telegram tokens expired — don't assume Telegram notifications work without checking"
confidence: 0.85
tags: [telegram, notifications, token]
source_count: 3
first_seen: "2026-04-02"
- id: global:pitfall:006
fact: "Multiple gateways = 'cannot schedule futures' error — only one gateway process should run"
confidence: 0.9
tags: [gateway, cron, process]
source_count: 4
first_seen: "2026-04-04"
last_confirmed: "2026-04-11"
## Testing
- id: global:pitfall:007
fact: "pytest root collection picks up operational *_test.py scripts — restrict to tests/ directory"
confidence: 0.9
tags: [pytest, test, collection]
source_count: 3
first_seen: "2026-04-07"
last_confirmed: "2026-04-13"
- id: global:pitfall:008
fact: "TDD: test 1 before building 55 — verify the cycle works before scaling"
confidence: 0.95
tags: [tdd, testing, methodology]
source_count: 8
first_seen: "2026-03-25"
last_confirmed: "2026-04-13"

View File

@@ -1,73 +0,0 @@
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---
# Tool Quirks (Global)
Cross-environment behaviors that bite you if you don't know them.
## Authentication
- id: global:tool-quirk:001
fact: "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN"
confidence: 0.95
tags: [git, auth, gitea, token]
source_count: 23
first_seen: "2026-03-27"
last_confirmed: "2026-04-13"
related: [global:pitfall:001]
- id: global:tool-quirk:002
fact: "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer"
confidence: 0.9
tags: [git, api, gitea]
source_count: 8
first_seen: "2026-03-28"
last_confirmed: "2026-04-12"
- id: global:tool-quirk:003
fact: "Gitea Issues API type=issues param does NOT filter PRs — use truthiness check on pull_request field"
confidence: 0.95
tags: [gitea, api, issues, pr]
source_count: 6
first_seen: "2026-04-01"
last_confirmed: "2026-04-13"
## Paths & Environment
- id: global:tool-quirk:004
fact: "~/.hermes is the default hermes home — check get_hermes_home() not the path literal"
confidence: 0.9
tags: [paths, hermes, env]
source_count: 10
first_seen: "2026-03-30"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:005]
- id: global:tool-quirk:005
fact: "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin — standard ansible-vault fails"
confidence: 0.85
tags: [ansible, vault, config]
source_count: 3
first_seen: "2026-04-02"
## Model & Inference
- id: global:tool-quirk:006
fact: "mimo-v2-pro via Nous Research is the default model — don't assume Anthropic is available"
confidence: 0.95
tags: [model, provider, nous, default]
source_count: 15
first_seen: "2026-03-25"
last_confirmed: "2026-04-13"
- id: global:tool-quirk:007
fact: "Kill + restart with 'hermes chat' preserves old model state — NEVER use --resume"
confidence: 0.95
tags: [hermes, model, restart, session]
source_count: 8
first_seen: "2026-03-29"
last_confirmed: "2026-04-12"

View File

@@ -1,489 +1,6 @@
{
"version": 1,
"last_updated": "2026-04-14T18:07:27.448168Z",
"total_facts": 55,
"migration": {
"migrated_from": [
"mempalace",
"fact_store",
"skills"
],
"migrated_at": "2026-04-14T18:07:27.448362Z",
"sources": {
"mempalace": 11,
"fact_store": 29,
"skills": 15
}
},
"facts": [
{
"fact": "Timmy Foundation: 17 repos, 282 open issues, 63.0% closure rate",
"category": "fact",
"repo": "global",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Timmy_Foundation/timmy-home: 227 open issues",
"category": "fact",
"repo": "timmy-home",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Timmy_Foundation/timmy-config: 133 open issues",
"category": "fact",
"repo": "timmy-config",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Timmy_Foundation/the-nexus: 72 open issues",
"category": "fact",
"repo": "the-nexus",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Timmy_Foundation/fleet-ops: 47 open issues",
"category": "fact",
"repo": "fleet-ops",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Timmy_Foundation/the-beacon: 12 open issues",
"category": "fact",
"repo": "the-beacon",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Assignment coverage: 99.6% (281 assigned, 1 unassigned)",
"category": "fact",
"repo": "global",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "Priority: 4 P0, 8 P1, 11 epics",
"category": "fact",
"repo": "global",
"confidence": 0.95,
"source": "mempalace",
"source_file": "forge.json"
},
{
"fact": "CRITICAL timmy-home#580: Harden SOUL.md against Claude identity hijacking - Security: Protects the core inscription of Timmy's values on-chain",
"category": "pitfall",
"repo": "timmy-home",
"confidence": 0.9,
"source": "mempalace",
"source_file": "forge-palace-summary.json"
},
{
"fact": "CRITICAL timmy-home#579: [RCA] Ezra and Bezalel do not respond to Gitea @mention tags - DevOps: Two VPS wizard houses are not receiving critical notifications",
"category": "pitfall",
"repo": "timmy-home",
"confidence": 0.9,
"source": "mempalace",
"source_file": "forge-palace-summary.json"
},
{
"fact": "CRITICAL the-nexus#1125: [COMPUTER_USE] Add Desktop Automation Primitives to Hermes - Feature: Unlocks computer-use capability in agent toolkit",
"category": "pitfall",
"repo": "the-nexus",
"confidence": 0.9,
"source": "mempalace",
"source_file": "forge-palace-summary.json"
},
{
"fact": "Alexander prefers rate-limited stretches over underutilization. 'I would rather get rate limited and have it stretch out a bit than underutilize.'",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 1,
"source_tags": "preference"
},
{
"fact": "Alexander's frustration: reading source code instead of testing the actual command first. Validate with CLI first, code second.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 2,
"source_tags": "preference"
},
{
"fact": "KEYMAXXING: ~/.hermes/keymaxxing/. inbox/ for drops, detect_provider.py, watcher.sh (60s poll). First key = Nous/OpenRouter (391 models, 25 free).",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 3,
"source_tags": "keymaxxing,nous"
},
{
"fact": "Automation philosophy: aggressive utilization but outcome-focused. No duplicate PRs, no noise. Quality gates mandatory. 'Build things to be a masterwork.'",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 4,
"source_tags": "automation,philosophy"
},
{
"fact": "Mnemosyne = priority project. Tag [Mnemosyne] issues for swarm priority. WebSocket bridge for live memory (issue #1164).",
"category": "fact",
"repo": "the-nexus",
"confidence": 0.5,
"source": "fact_store",
"source_id": 5,
"source_tags": "mnemosyne,nexus"
},
{
"fact": "Bitcoin inscription #90707: Sermon on the Mount (Matthew 5-7 ESV), block 776549, Feb 14 2023. Gospel immutable on-chain.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 6,
"source_tags": "bitcoin,faith"
},
{
"fact": "Gitea API gotcha: 'labels' field requires integer IDs, not string names. GET /labels first to resolve. String names return HTTP 422.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 8,
"source_tags": "gitea,api"
},
{
"fact": "CRON WORKER: Jobs needing files MUST use real scripts, not LLM prompts. LLM cron can't ls/cat - needs execute_code. Test one cycle before scaling. (55 cron jobs, 0 PRs for 37 minutes.)",
"category": "tool-quirk",
"repo": "hermes-agent",
"confidence": 0.5,
"source": "fact_store",
"source_id": 11,
"source_tags": "cron,worker"
},
{
"fact": "HERMES CLI: hermes chat -q 'prompt' --provider nous -m xiaomi/mimo-v2-pro. -q=query text, -p=profile name. Mixing causes silent failure.",
"category": "tool-quirk",
"repo": "hermes-agent",
"confidence": 0.5,
"source": "fact_store",
"source_id": 12,
"source_tags": "hermes,cli"
},
{
"fact": "Core preferences: 'Don't be precious.' Parallel over sequential. Test before scaling. Direct communicator. Satoshi/Hal engineering philosophy.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 13,
"source_tags": "preferences"
},
{
"fact": "Gitea API: /orgs/TimmyFoundation/repos returns 404. Use /user/repos?limit=50 instead for all repos across all orgs.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 15,
"source_tags": "gitea,api"
},
{
"fact": "CRITICAL: tool_use_enforcement must be 'true' in config.yaml. Without it, mimo-v2-pro generates text descriptions instead of executing tools. 36 PRs/day with this fix ($0).",
"category": "tool-quirk",
"repo": "hermes-agent",
"confidence": 0.5,
"source": "fact_store",
"source_id": 19,
"source_tags": "hermes,config,critical"
},
{
"fact": "forge.alexanderwhitestone.com clone: depth 50 times out. Use --depth 5 --single-branch instead.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 20,
"source_tags": "git,forge"
},
{
"fact": "Telegram: Bot token ~/.config/telegram/special_bot. Alexander chat ID: 7635059073. API: POST /bot{token}/sendMessage.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 23,
"source_tags": "telegram"
},
{
"fact": "Communication: Gitea for reports/deliverables. Telegram for urgent only. Wants proactive monitoring, visual confirmation, action-oriented.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 24,
"source_tags": "communication"
},
{
"fact": "Kimi API: model ID is 'kimi-for-coding' not 'kimi-k2.5'. Key prefix sk-kimi- routes to api.kimi.com/coding/v1. One model only.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 26,
"source_tags": "kimi,api"
},
{
"fact": "model-watchdog.py restarts panes without -p flag, losing profile. Falls through to hermes3 (8K context). Fix: preserve -p flag.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 28,
"source_tags": "watchdog,bug,critical"
},
{
"fact": "INCIDENT: Modified hermes profile configs without permission. Corrupted fenrir/bezalel. NEVER modify configs. Report issues, wait. Anthropic BANNED.",
"category": "fact",
"repo": "timmy-config",
"confidence": 0.5,
"source": "fact_store",
"source_id": 30,
"source_tags": "incident,config"
},
{
"fact": "CRON GOTCHAS: .tick.lock blocks jobs on crash. tool_choice='required' crashes AIAgent. save_jobs needs fcntl.flock. Error jobs stay error. Tick backlog: 56 jobs + 6 workers = 9min.",
"category": "tool-quirk",
"repo": "hermes-agent",
"confidence": 0.5,
"source": "fact_store",
"source_id": 31,
"source_tags": "cron,gotchas"
},
{
"fact": "OpenRouter = FREE MODELS ONLY. All fallbacks must use :free suffix. Never paid models.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 32,
"source_tags": "openrouter,rules"
},
{
"fact": "Config is infra-as-code. Canonical: Rockachopa/hermes-config on forge. Local config = source of truth for live system.",
"category": "fact",
"repo": "timmy-config",
"confidence": 0.5,
"source": "fact_store",
"source_id": 33,
"source_tags": "config"
},
{
"fact": "Accountability: Check edit history before claims. Never modify configs without instruction. When broken: create issue, stop using it.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 34,
"source_tags": "accountability"
},
{
"fact": "OpenAI tool_calls: two argument formats - tc.arguments or tc.function.arguments. hermes-agent uses function format. Check both.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 36,
"source_tags": "openai,api"
},
{
"fact": "TMUX RULE: Alexander creates windows/splits. Timmy NEVER creates layouts - only send-keys to existing panes.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 39,
"source_tags": "tmux,rule"
},
{
"fact": "Long-running agents > fresh one-shots. Context compounds. Optimize persistent lanes, not disposable workers. Never rotate panes across repos.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 40,
"source_tags": "dispatch,lanes"
},
{
"fact": "Dispatch style: 'Go. repo #issue. Description. Clone, implement, branch NAME, commit push PR.' Zero questions, immediate execution, results-only.",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 42,
"source_tags": "dispatch,workflow"
},
{
"fact": "Protocol: Done/stuck \u2192 commit, push, PR, next issue. File new issues to Gitea via API. Multiple agents can work same issue (different branches).",
"category": "fact",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 43,
"source_tags": "protocol"
},
{
"fact": "cron/__init__.py imports ModelContextError/CRON_MIN_CONTEXT_TOKENS from scheduler - neither exists. ImportError. Fix: remove imports. Issue #541.",
"category": "fact",
"repo": "hermes-agent",
"confidence": 0.5,
"source": "fact_store",
"source_id": 46,
"source_tags": "cron,bug"
},
{
"fact": "Gitea tokens: main (~/.config/gitea/token) = Rockachopa admin. timmy-token = Timmy bot. Contents API works with main token. PR creation with either.",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.5,
"source": "fact_store",
"source_id": 48,
"source_tags": "gitea,auth"
},
{
"fact": "Skill: gitea-burn-cycle - Automated burn cycles on Gitea repos",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/gitea-burn-cycle/SKILL.md"
},
{
"fact": "Skill: hermes-agent - Complete Hermes Agent guide - CLI, gateway, cron, profiles",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/autonomous-ai-agents/hermes-agent/SKILL.md"
},
{
"fact": "Skill: cron-infra-as-code - Source-control cron jobs as YAML",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/cron-infra-as-code/SKILL.md"
},
{
"fact": "Skill: burn-loop-health-monitoring - Detect silent burn loop failures",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/burn-loop-health-monitoring/SKILL.md"
},
{
"fact": "Skill: fleet-config-deploy - Deploy config across VPS fleet with canary",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/fleet-config-deploy/SKILL.md"
},
{
"fact": "Skill: mimo-swarm - Coordinated mimo-v2-pro swarm: claim-work-release",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/mimo-swarm/SKILL.md"
},
{
"fact": "Skill: session-signal-extraction-pitfalls - Pitfalls extracting behavioral signals from sessions",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/session-signal-extraction-pitfalls/SKILL.md"
},
{
"fact": "Skill: json-repair-for-tool-calls - Fix JSON parse failures in tool calls - 14 patterns",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/json-repair-for-tool-calls/SKILL.md"
},
{
"fact": "Skill: poka-yoke-guards - Mistake-proofing guards for weak model agents",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/poka-yoke-guards/SKILL.md"
},
{
"fact": "Skill: tmux-supervisor - Monitor tmux panes - drift detection",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/tmux-supervisor/SKILL.md"
},
{
"fact": "Skill: approval-threat-model-extension - Threat model: LLM jailbreaks, accidents, supply chain",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/security/approval-threat-model-extension/SKILL.md"
},
{
"fact": "Skill: deploy-crons-fix - Fix deploy-crons.py model/provider dropping",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/deploy-crons-fix/SKILL.md"
},
{
"fact": "Skill: sovereign-heart-architecture - State-based compassion interface pattern",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/autonomous-ai-agents/sovereign-heart-architecture/SKILL.md"
},
{
"fact": "Skill: burn-night-operations - Max-throughput burn night scheduling",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/burn-night-operations/SKILL.md"
},
{
"fact": "Skill: cron-scaling-patterns - Cron scaling: workers, tick intervals, checkpoints",
"category": "pattern",
"repo": "global",
"confidence": 0.85,
"source": "skill",
"source_path": "~/.hermes/skills/devops/cron-scaling-patterns/SKILL.md"
}
]
"last_updated": "2026-04-13T20:00:00Z",
"total_facts": 0,
"facts": []
}

View File

@@ -1,82 +0,0 @@
---
domain: hermes-agent
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (hermes-agent)
Things that go wrong in this repo if you don't know the traps.
## Cron & Deployment
- id: hermes-agent:pitfall:001
fact: "deploy-crons.py leaves jobs in mixed model format — some have provider/model, some just model"
confidence: 0.95
tags: [cron, deploy, model, config]
source_count: 5
first_seen: "2026-04-08"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:002, hermes-agent:pitfall:003]
- id: hermes-agent:pitfall:002
fact: "deploy-crons.py --deploy doesn't set legacy skill field from skills list, breaking older jobs"
confidence: 0.9
tags: [cron, deploy, skills]
source_count: 3
first_seen: "2026-04-09"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:001]
- id: hermes-agent:pitfall:003
fact: "Cron jobs with blank fallback_model fields trigger spurious gateway warnings"
confidence: 0.9
tags: [cron, model, fallback]
source_count: 4
first_seen: "2026-04-07"
last_confirmed: "2026-04-12"
related: [hermes-agent:pitfall:001]
- id: hermes-agent:pitfall:004
fact: "model-watchdog.py checks first provider line, not model.provider — causes false drift alarms"
confidence: 0.9
tags: [watchdog, model, config]
source_count: 3
first_seen: "2026-04-08"
last_confirmed: "2026-04-13"
## Path & Environment
- id: hermes-agent:pitfall:005
fact: "10+ files read HERMES_HOME directly instead of get_hermes_home() — breaks on custom paths"
confidence: 0.85
tags: [paths, env, hermes-home]
source_count: 6
first_seen: "2026-04-06"
last_confirmed: "2026-04-12"
related: [global:pitfall:002]
- id: hermes-agent:pitfall:006
fact: "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set"
confidence: 0.8
tags: [paths, env, bug]
source_count: 2
first_seen: "2026-04-05"
## SSH & Dispatch
- id: hermes-agent:pitfall:007
fact: "vps-agent-dispatch reports OK while remote hermes binary path is broken"
confidence: 0.9
tags: [ssh, dispatch, vps]
source_count: 4
first_seen: "2026-04-07"
last_confirmed: "2026-04-11"
- id: hermes-agent:pitfall:008
fact: "nightwatch-health-monitor SSH check fails on cloud-model-only deployments"
confidence: 0.85
tags: [ssh, health, cloud]
source_count: 2
first_seen: "2026-04-10"

View File

@@ -1,68 +0,0 @@
---
domain: the-nexus
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (the-nexus)
Things that go wrong in this repo if you don't know the traps.
## Git & Merging
- id: the-nexus:pitfall:001
fact: "Merges fail with HTTP 405 due to branch protection — must use merge API with 1 approval"
confidence: 0.95
tags: [git, merge, branch-protection, gitea]
source_count: 12
first_seen: "2026-04-05"
last_confirmed: "2026-04-13"
related: [global:pitfall:001]
- id: the-nexus:pitfall:002
fact: "ThreadingHTTPServer required for multi-user bridge — standard HTTPServer blocks on concurrent requests"
confidence: 0.95
tags: [server, concurrency, bridge]
source_count: 5
first_seen: "2026-04-10"
last_confirmed: "2026-04-13"
related: [the-nexus:pattern:001]
- id: the-nexus:pitfall:003
fact: "ChatLog.log() crashes on message persistence when index.html has orphaned button tags"
confidence: 0.9
tags: [html, crash, chatlog]
source_count: 3
first_seen: "2026-04-12"
last_confirmed: "2026-04-13"
## Three.js & Performance
- id: the-nexus:pitfall:004
fact: "Three.js LOD not implemented — local hardware struggles with full scene without texture optimization"
confidence: 0.85
tags: [threejs, performance, lod]
source_count: 4
first_seen: "2026-04-09"
last_confirmed: "2026-04-13"
related: [the-nexus:pattern:002]
- id: the-nexus:pitfall:005
fact: "Duplicate content blocks appear in index.html when PR merges conflict silently"
confidence: 0.8
tags: [html, merge-conflict, duplicate]
source_count: 3
first_seen: "2026-04-11"
last_confirmed: "2026-04-13"
## Deployment
- id: the-nexus:pitfall:006
fact: "Unified HTTP + WebSocket server required for proper URL deployment — separate servers break CORS"
confidence: 0.9
tags: [deploy, websocket, http, cors]
source_count: 4
first_seen: "2026-04-10"
last_confirmed: "2026-04-13"
related: [the-nexus:pattern:001]

View File

@@ -1,61 +0,0 @@
# Compounding Intelligence Metrics
**Generated:** 2026-04-14T18:07:26.169469+00:00
## knowledge_velocity
New facts extracted per day. Higher = compounding loop working.
**Value:** 1.61 | **7d trend:** N/A — (unknown)
- total_facts: 44
- period_days: 18
- new_facts: 29
## knowledge_coverage
Percentage of domains/repos with 10+ facts. Measures breadth.
**Value:** 0.333 | **7d trend:** N/A — (unknown)
- covered_domains: 2
- total_domains: 6
## hit_rate
Percentage of sessions referencing bootstrapped knowledge.
**Value:** 0.677 | **7d trend:** N/A — (unknown)
- hit_sessions: 8064
- total_sessions: 11919
## error_recurrence
Ratio of recurring errors. Lower = fleet learning from mistakes.
**Value:** 0.169 | **7d trend:** N/A — (unknown)
- unique_errors: 53556
- recurring_errors: 9075
## task_completion
Percentage of sessions ending with successful completion.
**Value:** 0.452 | **7d trend:** N/A — (unknown)
- normal_end_rate: 0.56
- completed: 5385
- total: 11919
## first_try_success
Percentage of sessions completed without backtracking.
**Value:** 0.818 | **7d trend:** N/A — (unknown)
- avg_tool_msg_ratio: 0.391
- sampled: 5921
## knowledge_age
Freshness of knowledge store. 1.0 = all fresh, 0.0 = all stale.
**Value:** 0.973 | **7d trend:** N/A — (unknown)
- avg_age_days: 2.4
- stale_facts: 0
- total_facts: 44

View File

@@ -1,130 +0,0 @@
{
"generated_at": "2026-04-14T18:07:26.169469+00:00",
"knowledge_velocity": {
"value": 1.61,
"total_facts": 44,
"period_days": 18,
"new_facts": 29
},
"knowledge_coverage": {
"value": 0.333,
"covered_domains": 2,
"total_domains": 6,
"domain_details": {
"global": 15,
"unknown": 15,
"hermes-agent": 8,
"pitfalls": 8,
"tool-quirks": 7,
"the-nexus": 6
}
},
"hit_rate": {
"value": 0.677,
"hit_sessions": 8064,
"total_sessions": 11919
},
"error_recurrence": {
"value": 0.169,
"unique_errors": 53556,
"recurring_errors": 9075,
"top_errors": [
{
"error": "s, report the error details.",
"sessions": 1185
},
{
"error": "\": \"traceback (most recent call last):\\n file \\\"/private/var/folders/9k/v07xkpp",
"sessions": 694
},
{
"error": "\", \"output\": \"\\n--- stderr ---\\ntraceback (most recent call last):\\n file \\\"/pr",
"sessions": 684
},
{
"error": "ures \u2192 file an issue with the traceback and tag [bug]",
"sessions": 320
},
{
"error": "s you encounter \u2192 file an issue with reproduction steps",
"sessions": 320
},
{
"error": "s, file a [bug] issue first.",
"sessions": 320
},
{
"error": ", fix the code.",
"sessions": 314
},
{
"error": "fix it before doing anything else.",
"sessions": 313
},
{
"error": "ures \u2192 add a review comment explaining what's wrong",
"sessions": 303
},
{
"error": "ures \u2014 they're your roadmap for guardrails",
"sessions": 303
}
]
},
"task_completion": {
"value": 0.452,
"normal_end_rate": 0.56,
"completed": 5385,
"total": 11919,
"breakdown": {
"cron_complete": 5354,
"unknown": 5245,
"compression": 1092,
"cli_close": 197,
"session_reset": 31
}
},
"first_try_success": {
"value": 0.818,
"avg_tool_msg_ratio": 0.391,
"sampled": 5921,
"interpretation": "Higher value = fewer backtracks = better first-try success"
},
"knowledge_age": {
"value": 0.973,
"avg_age_days": 2.4,
"stale_facts": 0,
"total_facts": 44,
"interpretation": "1.0 = all facts fresh. 0.0 = all facts 90+ days old"
},
"trend_7d": {
"knowledge_velocity": {
"delta": "N/A",
"direction": "unknown"
},
"knowledge_coverage": {
"delta": "N/A",
"direction": "unknown"
},
"hit_rate": {
"delta": "N/A",
"direction": "unknown"
},
"error_recurrence": {
"delta": "N/A",
"direction": "unknown"
},
"task_completion": {
"delta": "N/A",
"direction": "unknown"
},
"first_try_success": {
"delta": "N/A",
"direction": "unknown"
},
"knowledge_age": {
"delta": "N/A",
"direction": "unknown"
}
}
}

View File

@@ -1,359 +0,0 @@
#!/usr/bin/env python3
"""
Bootstrapper — assemble pre-session context from knowledge store.
Reads the knowledge store and produces a compact context block (2k tokens max)
that can be injected into a new session so it starts with situational awareness.
Usage:
python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
python3 bootstrapper.py --repo timmy-home --global
python3 bootstrapper.py --global
python3 bootstrapper.py --repo the-nexus --max-tokens 1000
"""
import argparse
import json
import sys
from pathlib import Path
from typing import Optional
# Resolve knowledge root relative to this script's parent
SCRIPT_DIR = Path(__file__).resolve().parent
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"
INDEX_PATH = KNOWLEDGE_DIR / "index.json"
# Approximate token count: ~4 chars per token for English text
CHARS_PER_TOKEN = 4
# Category sort priority (lower = shown first)
CATEGORY_PRIORITY = {
"pitfall": 0,
"tool-quirk": 1,
"pattern": 2,
"fact": 3,
"question": 4,
}
def load_index(index_path: Path = INDEX_PATH) -> dict:
"""Load and validate the knowledge index."""
if not index_path.exists():
return {"version": 1, "total_facts": 0, "facts": []}
with open(index_path) as f:
data = json.load(f)
if "facts" not in data:
print(f"WARNING: index.json missing 'facts' key", file=sys.stderr)
return {"version": 1, "total_facts": 0, "facts": []}
return data
def filter_facts(
facts: list[dict],
repo: Optional[str] = None,
agent: Optional[str] = None,
include_global: bool = True,
) -> list[dict]:
"""Filter facts by repo, agent, and global scope."""
filtered = []
for fact in facts:
fact_repo = fact.get("repo", "global")
fact_agent = fact.get("agent", "")
# Match by repo (regardless of agent)
if repo and fact_repo == repo:
filtered.append(fact)
continue
# Match by exact agent type
if agent and fact_agent == agent:
filtered.append(fact)
continue
# Include global facts without agent restriction (universal facts)
if include_global and fact_repo == "global" and not fact_agent:
filtered.append(fact)
return filtered
def sort_facts(facts: list[dict]) -> list[dict]:
"""
Sort facts by: confidence (desc), then category priority, then fact text.
Most reliable and most dangerous facts come first.
"""
def sort_key(f):
confidence = f.get("confidence", 0.5)
category = f.get("category", "fact")
cat_priority = CATEGORY_PRIORITY.get(category, 5)
return (-confidence, cat_priority, f.get("fact", ""))
return sorted(facts, key=sort_key)
def load_repo_knowledge(repo: str) -> Optional[str]:
"""Load per-repo knowledge markdown if it exists."""
repo_path = KNOWLEDGE_DIR / "repos" / f"{repo}.md"
if repo_path.exists():
return repo_path.read_text().strip()
return None
def load_agent_knowledge(agent: str) -> Optional[str]:
"""Load per-agent knowledge markdown if it exists."""
agent_path = KNOWLEDGE_DIR / "agents" / f"{agent}.md"
if agent_path.exists():
return agent_path.read_text().strip()
return None
def load_global_knowledge() -> list[str]:
"""Load all global knowledge markdown files."""
global_dir = KNOWLEDGE_DIR / "global"
if not global_dir.exists():
return []
chunks = []
for md_file in sorted(global_dir.glob("*.md")):
content = md_file.read_text().strip()
if content:
chunks.append(content)
return chunks
def render_facts_section(facts: list[dict], category: str, label: str) -> str:
"""Render a section of facts for a single category."""
cat_facts = [f for f in facts if f.get("category") == category]
if not cat_facts:
return ""
lines = [f"### {label}\n"]
for f in cat_facts:
conf = f.get("confidence", 0.5)
fact_text = f.get("fact", "")
repo_tag = f.get("repo", "")
if repo_tag and repo_tag != "global":
lines.append(f"- [{conf:.0%}] ({repo_tag}) {fact_text}")
else:
lines.append(f"- [{conf:.0%}] {fact_text}")
return "\n".join(lines) + "\n"
def estimate_tokens(text: str) -> int:
"""Rough token estimate."""
return len(text) // CHARS_PER_TOKEN
def truncate_to_tokens(text: str, max_tokens: int) -> str:
"""Truncate text to approximately max_tokens, cutting at line boundaries."""
max_chars = max_tokens * CHARS_PER_TOKEN
if len(text) <= max_chars:
return text
# Cut at last newline before the limit
truncated = text[:max_chars]
last_newline = truncated.rfind("\n")
if last_newline > 0:
truncated = truncated[:last_newline]
return truncated + "\n\n[... truncated to fit context window ...]"
def build_bootstrap_context(
repo: Optional[str] = None,
agent: Optional[str] = None,
include_global: bool = True,
max_tokens: int = 2000,
index_path: Path = INDEX_PATH,
) -> str:
"""
Build the full bootstrap context block.
Returns a markdown string suitable for injection into a session prompt.
"""
index = load_index(index_path)
facts = index.get("facts", [])
# Filter
filtered = filter_facts(facts, repo=repo, agent=agent, include_global=include_global)
# Sort
sorted_facts = sort_facts(filtered)
# Build sections
sections = ["## What You Know (bootstrapped)\n"]
# Per-repo markdown knowledge
if repo:
repo_md = load_repo_knowledge(repo)
if repo_md:
sections.append(f"### Repo Notes: {repo}\n")
sections.append(repo_md + "\n")
# Structured facts by category
if sorted_facts:
# Group by source
repo_facts = [f for f in sorted_facts if f.get("repo") == repo] if repo else []
global_facts = [f for f in sorted_facts if f.get("repo") == "global"]
agent_facts = [f for f in sorted_facts if f.get("agent") == agent] if agent else []
if repo_facts:
sections.append(f"### Repo: {repo}\n")
for cat, label in [
("pitfall", "PITFALLS"),
("tool-quirk", "QUIRKS"),
("pattern", "PATTERNS"),
("fact", "FACTS"),
("question", "OPEN QUESTIONS"),
]:
section = render_facts_section(repo_facts, cat, label)
if section:
sections.append(section)
if global_facts:
sections.append("### Global\n")
for cat, label in [
("pitfall", "PITFALLS"),
("tool-quirk", "QUIRKS"),
("pattern", "PATTERNS"),
("fact", "FACTS"),
]:
section = render_facts_section(global_facts, cat, label)
if section:
sections.append(section)
if agent_facts:
sections.append(f"### Agent Notes ({agent})\n")
for cat, label in [
("pitfall", "PITFALLS"),
("tool-quirk", "QUIRKS"),
("pattern", "PATTERNS"),
("fact", "FACTS"),
]:
section = render_facts_section(agent_facts, cat, label)
if section:
sections.append(section)
# Per-agent markdown knowledge
if agent:
agent_md = load_agent_knowledge(agent)
if agent_md:
sections.append(f"### Agent Profile: {agent}\n")
sections.append(agent_md + "\n")
# Global markdown knowledge
global_chunks = load_global_knowledge()
if global_chunks:
sections.append("### Global Notes\n")
sections.extend(chunk + "\n" for chunk in global_chunks)
# If nothing was found
if len(sections) == 1:
sections.append("_No relevant knowledge found. Starting fresh._\n")
if not facts:
sections.append(
"_Knowledge store is empty. Run the harvester to populate it._\n"
)
# Join and truncate
context = "\n".join(sections)
context = truncate_to_tokens(context, max_tokens)
return context
def main():
parser = argparse.ArgumentParser(
description="Assemble pre-session context from knowledge store"
)
parser.add_argument(
"--repo",
type=str,
default=None,
help="Repository name to filter facts by",
)
parser.add_argument(
"--agent",
type=str,
default=None,
help="Agent type to filter facts by (e.g., mimo-sprint, groq-fast)",
)
parser.add_argument(
"--global",
dest="include_global",
action="store_true",
default=True,
help="Include global facts (default: true)",
)
parser.add_argument(
"--no-global",
dest="include_global",
action="store_false",
help="Exclude global facts",
)
parser.add_argument(
"--max-tokens",
type=int,
default=2000,
help="Maximum token count for output (default: 2000)",
)
parser.add_argument(
"--index",
type=str,
default=None,
help="Path to index.json (default: knowledge/index.json)",
)
parser.add_argument(
"--json",
dest="output_json",
action="store_true",
help="Output raw JSON instead of markdown",
)
args = parser.parse_args()
index_path = Path(args.index) if args.index else INDEX_PATH
if args.output_json:
# JSON mode: return the filtered, sorted facts
index = load_index(index_path)
facts = index.get("facts", [])
filtered = filter_facts(
facts,
repo=args.repo,
agent=args.agent,
include_global=args.include_global,
)
sorted_facts = sort_facts(filtered)
output = {
"repo": args.repo,
"agent": args.agent,
"include_global": args.include_global,
"total_indexed": len(facts),
"matched": len(sorted_facts),
"facts": sorted_facts,
}
print(json.dumps(output, indent=2))
else:
# Markdown mode: full bootstrap context
context = build_bootstrap_context(
repo=args.repo,
agent=args.agent,
include_global=args.include_global,
max_tokens=args.max_tokens,
index_path=index_path,
)
print(context)
return 0
if __name__ == "__main__":
sys.exit(main())

View File

@@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""
Gitea Issue Body Parser
Extracts structured data from Gitea issue markdown bodies:
- Title
- Context section
- Acceptance criteria (checkboxes)
- Labels
- Epic/parent references
Usage:
python3 scripts/gitea_issue_parser.py <issue_body.txt
python3 scripts/gitea_issue_parser.py --url https://forge.../api/v1/repos/.../issues/123
echo "issue body" | python3 scripts/gitea_issue_parser.py --stdin
Output: JSON with {title, context, criteria[], labels[], epic_ref}
"""
import argparse
import json
import re
import sys
from typing import Optional
def parse_issue_body(body: str, title: str = "", labels: list = None) -> dict:
"""Parse a Gitea issue body into structured JSON."""
result = {
"title": title,
"context": "",
"criteria": [],
"labels": labels or [],
"epic_ref": None,
"sections": {},
}
if not body:
return result
# Extract epic/parent reference from title or body
epic_pattern = r"#(\d+)"
title_refs = re.findall(epic_pattern, title)
body_refs = re.findall(epic_pattern, body[:200]) # Check early body refs
# Look for "Closes #N" or "Part of #N" or "Epic: #N"
close_match = re.search(r"(?:Closes?|Fixes?|Resolves?)\s+#(\d+)", body, re.IGNORECASE)
part_match = re.search(r"(?:Part of|Epic|Parent|Blocks?)\s+#(\d+)", body, re.IGNORECASE)
if close_match:
result["epic_ref"] = f"#{close_match.group(1)}"
elif part_match:
result["epic_ref"] = f"#{part_match.group(1)}"
elif title_refs:
result["epic_ref"] = f"#{title_refs[0]}"
elif body_refs:
result["epic_ref"] = f"#{body_refs[0]}"
# Split into sections by ## headers
section_pattern = r"^##\s+(.+)$"
lines = body.split("\n")
current_section = None
current_content = []
for line in lines:
header_match = re.match(section_pattern, line)
if header_match:
# Save previous section
if current_section:
result["sections"][current_section] = "\n".join(current_content).strip()
current_section = header_match.group(1).strip().lower()
current_content = []
else:
current_content.append(line)
# Save last section
if current_section:
result["sections"][current_section] = "\n".join(current_content).strip()
# Extract context
for key in ["context", "background", "description", "problem"]:
if key in result["sections"]:
result["context"] = result["sections"][key]
break
# Extract acceptance criteria (checkboxes)
criteria_section = None
for key in ["acceptance criteria", "acceptance_criteria", "criteria", "requirements", "definition of done"]:
if key in result["sections"]:
criteria_section = result["sections"][key]
break
if criteria_section:
checkbox_pattern = r"-\s*\[[ xX]?\]\s*(.+)"
for match in re.finditer(checkbox_pattern, criteria_section):
result["criteria"].append(match.group(1).strip())
# Also try plain numbered/bulleted lists if no checkboxes found
if not result["criteria"]:
list_pattern = r"^\s*(?:\d+\.|-|\*)\s+(.+)"
for match in re.finditer(list_pattern, criteria_section, re.MULTILINE):
result["criteria"].append(match.group(1).strip())
# If no sectioned criteria found, scan whole body for checkboxes
if not result["criteria"]:
for match in re.finditer(r"-\s*\[[ xX]?\]\s*(.+)", body):
result["criteria"].append(match.group(1).strip())
return result
def parse_from_url(api_url: str, token: str = None) -> dict:
"""Parse an issue from a Gitea API URL."""
import urllib.request
headers = {}
if token:
headers["Authorization"] = f"token {token}"
req = urllib.request.Request(api_url, headers=headers)
resp = json.loads(urllib.request.urlopen(req, timeout=30).read())
title = resp.get("title", "")
body = resp.get("body", "")
labels = [l["name"] for l in resp.get("labels", [])]
return parse_issue_body(body, title, labels)
def main():
parser = argparse.ArgumentParser(description="Parse Gitea issue body into structured JSON")
parser.add_argument("input", nargs="?", help="Issue body file (or - for stdin)")
parser.add_argument("--url", help="Gitea API URL for the issue")
parser.add_argument("--stdin", action="store_true", help="Read from stdin")
parser.add_argument("--token", help="Gitea API token (or set GITEA_TOKEN env var)")
parser.add_argument("--title", default="", help="Issue title (for epic ref extraction)")
parser.add_argument("--labels", nargs="*", default=[], help="Issue labels")
parser.add_argument("--pretty", action="store_true", help="Pretty-print JSON output")
args = parser.parse_args()
import os
token = args.token or os.environ.get("GITEA_TOKEN")
if args.url:
result = parse_from_url(args.url, token)
elif args.stdin or (args.input and args.input == "-"):
body = sys.stdin.read()
result = parse_issue_body(body, args.title, args.labels)
elif args.input:
with open(args.input) as f:
body = f.read()
result = parse_issue_body(body, args.title, args.labels)
else:
parser.print_help()
sys.exit(1)
indent = 2 if args.pretty else None
print(json.dumps(result, indent=indent))
if __name__ == "__main__":
main()

View File

@@ -1,607 +0,0 @@
#!/usr/bin/env python3
"""
Compounding Intelligence Metrics Engine.
Computes 7 metrics that prove whether the knowledge compounding loop is working:
1. Knowledge velocity — new facts per day
2. Knowledge coverage — % of domains with >10 facts
3. Hit rate — % of sessions referencing bootstrap knowledge
4. Error recurrence — same errors across sessions (should decrease)
5. Task completion — % of sessions ending successfully
6. First-try success — actions without backtracking
7. Knowledge age — staleness of facts
Usage:
python3 measurer.py # All metrics, all time
python3 measurer.py --since 2026-04-01 # Time range
python3 measurer.py --repo the-nexus # Per-repo metrics
python3 measurer.py --format json # JSON output (default)
python3 measurer.py --format markdown # Human-readable
python3 measurer.py --knowledge-dir ./knowledge # Custom knowledge path
python3 measurer.py --db ~/.hermes/state.db # Custom DB path
Data sources:
- knowledge/index.json — fact index
- knowledge/ — YAML fact files for coverage
- ~/.hermes/state.db — session/message metadata
"""
import argparse
import json
import os
import re
import sqlite3
import sys
from collections import Counter, defaultdict
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any
# ─── Defaults ───────────────────────────────────────────────────────────────────
DEFAULT_KNOWLEDGE_DIR = Path(__file__).parent.parent / "knowledge"
DEFAULT_DB_PATH = Path.home() / ".hermes" / "state.db"
SEVEN_DAYS = timedelta(days=7)
# ─── Knowledge Store ────────────────────────────────────────────────────────────
def load_facts(knowledge_dir: Path) -> list[dict]:
"""Load all facts from index.json."""
index_path = knowledge_dir / "index.json"
if not index_path.exists():
return []
with open(index_path) as f:
data = json.load(f)
return data.get("facts", [])
def count_yaml_facts(knowledge_dir: Path) -> dict[str, int]:
"""Count facts per domain from YAML files (coverage source)."""
domain_counts: dict[str, int] = {}
# Walk repos/, global/, agents/ subdirs
for subdir in ["repos", "global", "agents"]:
dirpath = knowledge_dir / subdir
if not dirpath.exists():
continue
for yaml_file in dirpath.glob("*.yaml"):
# Count lines that start with "- id:" — each is a fact
count = 0
try:
content = yaml_file.read_text()
count = len(re.findall(r"^\s*-\s*id:", content, re.MULTILINE))
except Exception:
pass
domain = yaml_file.stem
domain_counts[domain] = domain_counts.get(domain, 0) + count
return domain_counts
# ─── Session Database ───────────────────────────────────────────────────────────
def open_db(db_path: Path) -> sqlite3.Connection:
"""Open session database."""
if not db_path.exists():
print(f"WARNING: Database not found at {db_path}", file=sys.stderr)
return None
conn = sqlite3.connect(str(db_path))
conn.row_factory = sqlite3.Row
return conn
def query_sessions(conn: sqlite3.Connection, since: str = None, repo: str = None) -> list[dict]:
"""Query sessions with optional filters."""
if conn is None:
return []
query = """
SELECT id, started_at, ended_at, end_reason, message_count,
tool_call_count, model
FROM sessions
WHERE 1=1
"""
params = []
if since:
since_ts = datetime.fromisoformat(since).replace(tzinfo=timezone.utc).timestamp()
query += " AND started_at >= ?"
params.append(since_ts)
query += " ORDER BY started_at ASC"
cur = conn.execute(query, params)
return [dict(row) for row in cur.fetchall()]
def query_messages(conn: sqlite3.Connection, session_ids: list[str] = None,
since_ts: float = None) -> list[dict]:
"""Query messages with optional session filter."""
if conn is None:
return []
query = """
SELECT m.session_id, m.role, m.content, m.tool_name, m.timestamp
FROM messages m
WHERE 1=1
"""
params = []
if since_ts:
query += " AND m.timestamp >= ?"
params.append(since_ts)
if session_ids:
placeholders = ",".join("?" for _ in session_ids)
query += f" AND m.session_id IN ({placeholders})"
params.extend(session_ids)
cur = conn.execute(query, params)
return [dict(row) for row in cur.fetchall()]
# ─── Metric Computations ───────────────────────────────────────────────────────
def compute_knowledge_velocity(facts: list[dict], since: str = None) -> dict:
"""Metric 1: New facts per day. Higher = compounding working."""
if not facts:
return {"value": 0.0, "total_facts": 0, "period_days": 0, "new_facts": 0}
dates = []
for f in facts:
d = f.get("first_seen") or f.get("created")
if d:
try:
dt = datetime.fromisoformat(d.replace("Z", "+00:00"))
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
dates.append(dt)
except (ValueError, AttributeError):
pass
if not dates:
return {"value": 0.0, "total_facts": len(facts), "period_days": 0, "new_facts": 0}
if since:
cutoff = datetime.fromisoformat(since).replace(tzinfo=timezone.utc)
dates = [d for d in dates if d >= cutoff]
if not dates:
return {"value": 0.0, "total_facts": len(facts), "period_days": 0, "new_facts": 0}
earliest = min(dates)
latest = max(dates)
period_days = max((latest - earliest).days, 1)
return {
"value": round(len(dates) / period_days, 2),
"total_facts": len(facts),
"period_days": period_days,
"new_facts": len(dates),
}
def compute_knowledge_coverage(facts: list[dict], yaml_counts: dict[str, int]) -> dict:
"""Metric 2: % of domains with >10 facts. Breadth indicator."""
domain_fact_counts: dict[str, int] = defaultdict(int)
for f in facts:
domain = f.get("domain", "unknown")
domain_fact_counts[domain] += 1
# Merge YAML counts (may have facts not yet indexed)
for domain, count in yaml_counts.items():
domain_fact_counts[domain] = max(domain_fact_counts[domain], count)
total_domains = len(domain_fact_counts)
if total_domains == 0:
return {"value": 0.0, "covered_domains": 0, "total_domains": 0, "domain_details": {}}
covered = sum(1 for c in domain_fact_counts.values() if c >= 10)
return {
"value": round(covered / total_domains, 3),
"covered_domains": covered,
"total_domains": total_domains,
"domain_details": dict(sorted(domain_fact_counts.items(), key=lambda x: -x[1])[:20]),
}
def compute_hit_rate(sessions: list[dict], messages: list[dict],
facts: list[dict]) -> dict:
"""Metric 3: % of sessions that reference bootstrap knowledge.
Looks for message content matching known fact text.
"""
if not sessions or not facts:
return {"value": 0.0, "hit_sessions": 0, "total_sessions": len(sessions)}
# Build a set of searchable fact fragments (lowercased, 4+ word phrases)
fact_fragments: set[str] = set()
for f in facts:
text = f.get("fact", "").lower().strip()
# Add full fact
if len(text) > 10:
fact_fragments.add(text)
# Add significant words
words = re.findall(r'\w{4,}', text)
for w in words:
fact_fragments.add(w)
if not fact_fragments:
return {"value": 0.0, "hit_sessions": 0, "total_sessions": len(sessions)}
# Group messages by session
session_messages: dict[str, list[str]] = defaultdict(list)
for m in messages:
content = (m.get("content") or "").lower()
if content:
session_messages[m["session_id"]].append(content)
# Check each session for fact references
hit_sessions = 0
for session in sessions:
sid = session["id"]
all_content = " ".join(session_messages.get(sid, []))
if any(frag in all_content for frag in fact_fragments):
hit_sessions += 1
return {
"value": round(hit_sessions / len(sessions), 3) if sessions else 0.0,
"hit_sessions": hit_sessions,
"total_sessions": len(sessions),
}
def compute_error_recurrence(messages: list[dict]) -> dict:
"""Metric 4: Same errors appearing across sessions. Should decrease.
Extracts error signatures and counts how many sessions each appears in.
"""
if not messages:
return {"value": 0.0, "unique_errors": 0, "recurring_errors": 0, "top_errors": []}
# Extract error patterns from assistant/tool messages
error_pattern = re.compile(
r'(?:error|Error|ERROR|failed|FAIL|exception|Exception)[:\s]*(.{10,80})',
re.IGNORECASE
)
error_to_sessions: dict[str, set[str]] = defaultdict(set)
for m in messages:
content = m.get("content") or ""
if not content:
continue
for match in error_pattern.finditer(content):
sig = match.group(1).strip().lower()
# Normalize whitespace
sig = re.sub(r'\s+', ' ', sig)
if len(sig) > 5:
error_to_sessions[sig].add(m["session_id"])
if not error_to_sessions:
return {"value": 0.0, "unique_errors": 0, "recurring_errors": 0, "top_errors": []}
recurring = {e: s for e, s in error_to_sessions.items() if len(s) > 1}
total_errors = len(error_to_sessions)
recurring_count = len(recurring)
# Top recurring errors
top = sorted(recurring.items(), key=lambda x: -len(x[1]))[:10]
return {
"value": round(recurring_count / total_errors, 3) if total_errors else 0.0,
"unique_errors": total_errors,
"recurring_errors": recurring_count,
"top_errors": [{"error": e, "sessions": len(s)} for e, s in top],
}
def compute_task_completion(sessions: list[dict]) -> dict:
"""Metric 5: % of sessions ending with successful status."""
if not sessions:
return {"value": 0.0, "completed": 0, "total": 0, "breakdown": {}}
breakdown: Counter = Counter()
for s in sessions:
reason = s.get("end_reason") or "unknown"
breakdown[reason] += 1
completed = breakdown.get("cron_complete", 0) + breakdown.get("session_reset", 0)
# "cli_close" and "compression" are also normal endings
normal_endings = completed + breakdown.get("cli_close", 0) + breakdown.get("compression", 0)
return {
"value": round(completed / len(sessions), 3) if sessions else 0.0,
"normal_end_rate": round(normal_endings / len(sessions), 3) if sessions else 0.0,
"completed": completed,
"total": len(sessions),
"breakdown": dict(breakdown.most_common()),
}
def compute_first_try_success(sessions: list[dict]) -> dict:
"""Metric 6: Sessions completed without excessive backtracking.
Proxy: ratio of tool_call_count to message_count.
Low ratio = fewer retries = more first-try success.
We invert this: high tool/msg ratio means more backtracking (bad).
"""
if not sessions:
return {"value": 0.0, "avg_tool_msg_ratio": 0.0, "sampled": 0}
ratios = []
for s in sessions:
msgs = s.get("message_count", 0) or 0
tools = s.get("tool_call_count", 0) or 0
if msgs > 2: # Skip trivial sessions
ratios.append(tools / msgs if msgs > 0 else 0)
if not ratios:
return {"value": 0.0, "avg_tool_msg_ratio": 0.0, "sampled": 0}
avg_ratio = sum(ratios) / len(ratios)
# First-try success: sessions with tool_msg_ratio < 0.5 (few tools per message)
first_try = sum(1 for r in ratios if r < 0.5)
return {
"value": round(first_try / len(ratios), 3),
"avg_tool_msg_ratio": round(avg_ratio, 3),
"sampled": len(ratios),
"interpretation": "Higher value = fewer backtracks = better first-try success",
}
def compute_knowledge_age(facts: list[dict]) -> dict:
"""Metric 7: Days since facts were last confirmed. Staleness indicator."""
if not facts:
return {"value": 0.0, "avg_age_days": 0, "stale_facts": 0, "total_facts": 0}
now = datetime.now(timezone.utc)
ages = []
stale_count = 0 # Facts not confirmed in 30+ days
for f in facts:
confirmed = f.get("last_confirmed") or f.get("first_seen")
if confirmed:
try:
dt = datetime.fromisoformat(confirmed.replace("Z", "+00:00"))
if dt.tzinfo is None:
dt = dt.replace(tzinfo=timezone.utc)
age = (now - dt).days
ages.append(age)
if age > 30:
stale_count += 1
except (ValueError, AttributeError):
pass
if not ages:
return {"value": 0.0, "avg_age_days": 0, "stale_facts": 0, "total_facts": len(facts)}
avg_age = sum(ages) / len(ages)
# Lower avg age = fresher = better. Invert for a 0-1 score.
freshness = max(0.0, 1.0 - (avg_age / 90)) # 90 days = 0 freshness
return {
"value": round(freshness, 3),
"avg_age_days": round(avg_age, 1),
"stale_facts": stale_count,
"total_facts": len(facts),
"interpretation": "1.0 = all facts fresh. 0.0 = all facts 90+ days old",
}
# ─── Trend Computation ─────────────────────────────────────────────────────────
def compute_trend(current: dict, previous: dict, metric_key: str = "value") -> dict:
"""Compute 7-day trend between two metric snapshots."""
if not previous:
return {"delta": "N/A", "direction": "unknown"}
curr_val = current.get(metric_key, 0)
prev_val = previous.get(metric_key, 0)
if prev_val == 0:
return {"delta": "N/A (no baseline)", "direction": "unknown"}
pct = ((curr_val - prev_val) / abs(prev_val)) * 100
direction = "up" if pct > 0 else "down" if pct < 0 else "flat"
# For error_recurrence, down is good
if metric_key == "error_recurrence" or metric_key == "knowledge_age":
direction_label = "good" if pct < 0 else "bad" if pct > 0 else "neutral"
else:
direction_label = "good" if pct > 0 else "bad" if pct < 0 else "neutral"
return {
"delta": f"{'+' if pct > 0 else ''}{pct:.1f}%",
"direction": direction,
"assessment": direction_label,
}
# ─── Output Formatters ─────────────────────────────────────────────────────────
def format_json(metrics: dict) -> str:
"""Format metrics as JSON."""
return json.dumps(metrics, indent=2)
def format_markdown(metrics: dict) -> str:
"""Format metrics as human-readable markdown."""
lines = [
"# Compounding Intelligence Metrics",
f"**Generated:** {metrics.get('generated_at', 'unknown')}",
"",
]
trend = metrics.get("trend_7d", {})
def metric_block(name: str, data: dict, desc: str, good_direction: str = "up"):
val = data.get("value", 0)
t = trend.get(name, {})
delta = t.get("delta", "N/A")
assessment = t.get("assessment", "unknown")
arrow = "" if assessment == "good" else "" if assessment == "bad" else ""
lines.extend([
f"## {name}",
f"{desc}",
"",
f"**Value:** {val} | **7d trend:** {delta} {arrow} ({assessment})",
"",
])
# Add key details
for k, v in data.items():
if k != "value" and k != "interpretation":
if isinstance(v, (int, float, str)):
lines.append(f"- {k}: {v}")
lines.append("")
metric_block(
"knowledge_velocity",
metrics.get("knowledge_velocity", {}),
"New facts extracted per day. Higher = compounding loop working.",
)
metric_block(
"knowledge_coverage",
metrics.get("knowledge_coverage", {}),
"Percentage of domains/repos with 10+ facts. Measures breadth.",
)
metric_block(
"hit_rate",
metrics.get("hit_rate", {}),
"Percentage of sessions referencing bootstrapped knowledge.",
)
metric_block(
"error_recurrence",
metrics.get("error_recurrence", {}),
"Ratio of recurring errors. Lower = fleet learning from mistakes.",
good_direction="down",
)
metric_block(
"task_completion",
metrics.get("task_completion", {}),
"Percentage of sessions ending with successful completion.",
)
metric_block(
"first_try_success",
metrics.get("first_try_success", {}),
"Percentage of sessions completed without backtracking.",
)
metric_block(
"knowledge_age",
metrics.get("knowledge_age", {}),
"Freshness of knowledge store. 1.0 = all fresh, 0.0 = all stale.",
good_direction="up",
)
return "\n".join(lines)
# ─── Snapshot Persistence ───────────────────────────────────────────────────────
def load_snapshot(metrics_dir: Path) -> dict:
"""Load most recent metrics snapshot for trend computation."""
snapshot_path = metrics_dir / "latest_snapshot.json"
if snapshot_path.exists():
with open(snapshot_path) as f:
return json.load(f)
return {}
def save_snapshot(metrics_dir: Path, metrics: dict):
"""Save current metrics as latest snapshot."""
metrics_dir.mkdir(parents=True, exist_ok=True)
snapshot_path = metrics_dir / "latest_snapshot.json"
with open(snapshot_path, "w") as f:
json.dump(metrics, f, indent=2)
# ─── Main ───────────────────────────────────────────────────────────────────────
def main():
parser = argparse.ArgumentParser(description="Compounding Intelligence Metrics")
parser.add_argument("--since", help="Start date (YYYY-MM-DD)")
parser.add_argument("--repo", help="Filter by repo/domain")
parser.add_argument("--format", choices=["json", "markdown"], default="json")
parser.add_argument("--knowledge-dir", type=Path, default=DEFAULT_KNOWLEDGE_DIR)
parser.add_argument("--db", type=Path, default=DEFAULT_DB_PATH)
parser.add_argument("--save-snapshot", action="store_true",
help="Save current metrics as snapshot for trend tracking")
parser.add_argument("--metrics-dir", type=Path,
default=Path(__file__).parent.parent / "metrics",
help="Directory for snapshots and dashboard")
args = parser.parse_args()
# ── Load data ───────────────────────────────────────────────────────────
facts = load_facts(args.knowledge_dir)
yaml_counts = count_yaml_facts(args.knowledge_dir)
if args.repo:
facts = [f for f in facts if f.get("domain") == args.repo]
conn = open_db(args.db)
sessions = query_sessions(conn, since=args.since)
messages = query_messages(conn) if conn else []
if conn:
conn.close()
# ── Compute metrics ─────────────────────────────────────────────────────
velocity = compute_knowledge_velocity(facts, since=args.since)
coverage = compute_knowledge_coverage(facts, yaml_counts)
hit_rate = compute_hit_rate(sessions, messages, facts)
error_recurrence = compute_error_recurrence(messages)
task_completion = compute_task_completion(sessions)
first_try = compute_first_try_success(sessions)
age = compute_knowledge_age(facts)
# ── Compute trends ──────────────────────────────────────────────────────
previous = load_snapshot(args.metrics_dir)
trend = {
"knowledge_velocity": compute_trend(velocity, previous.get("knowledge_velocity", {})),
"knowledge_coverage": compute_trend(coverage, previous.get("knowledge_coverage", {})),
"hit_rate": compute_trend(hit_rate, previous.get("hit_rate", {})),
"error_recurrence": compute_trend(error_recurrence, previous.get("error_recurrence", {}),
"value"),
"task_completion": compute_trend(task_completion, previous.get("task_completion", {})),
"first_try_success": compute_trend(first_try, previous.get("first_try_success", {})),
"knowledge_age": compute_trend(age, previous.get("knowledge_age", {})),
}
# ── Assemble output ─────────────────────────────────────────────────────
now = datetime.now(timezone.utc).isoformat()
metrics = {
"generated_at": now,
"knowledge_velocity": velocity,
"knowledge_coverage": coverage,
"hit_rate": hit_rate,
"error_recurrence": error_recurrence,
"task_completion": task_completion,
"first_try_success": first_try,
"knowledge_age": age,
"trend_7d": trend,
}
if args.since:
metrics["since"] = args.since
# ── Save snapshot if requested ──────────────────────────────────────────
if args.save_snapshot:
save_snapshot(args.metrics_dir, metrics)
# Also write dashboard
dashboard_path = args.metrics_dir / "dashboard.md"
with open(dashboard_path, "w") as f:
f.write(format_markdown(metrics))
# ── Output ──────────────────────────────────────────────────────────────
if args.format == "json":
print(format_json(metrics))
else:
print(format_markdown(metrics))
if __name__ == "__main__":
main()

View File

@@ -1,239 +0,0 @@
#!/usr/bin/env python3
"""
Tests for bootstrapper.py — context assembly from knowledge store.
"""
import json
import sys
import tempfile
from pathlib import Path
# Add scripts dir to path for import
sys.path.insert(0, str(Path(__file__).resolve().parent))
from bootstrapper import (
build_bootstrap_context,
estimate_tokens,
filter_facts,
load_index,
sort_facts,
truncate_to_tokens,
)
def make_index(facts: list[dict], tmp_dir: Path) -> Path:
"""Create a temporary index.json with given facts."""
index = {
"version": 1,
"last_updated": "2026-04-13T20:00:00Z",
"total_facts": len(facts),
"facts": facts,
}
path = tmp_dir / "index.json"
with open(path, "w") as f:
json.dump(index, f)
return path
def test_empty_index():
"""Empty knowledge store produces graceful output."""
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
index_path = make_index([], tmp_dir)
# Create empty knowledge dirs
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
context = build_bootstrap_context(
repo="the-nexus", index_path=index_path
)
assert "No relevant knowledge found" in context
assert "Starting fresh" in context
print("PASS: empty_index")
def test_filter_by_repo():
"""Filter facts by repository."""
facts = [
{"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
{"fact": "B", "category": "fact", "repo": "fleet-ops", "confidence": 0.8},
{"fact": "C", "category": "fact", "repo": "global", "confidence": 0.7},
]
filtered = filter_facts(facts, repo="the-nexus", include_global=True)
texts = [f["fact"] for f in filtered]
assert "A" in texts
assert "B" not in texts
assert "C" in texts
print("PASS: filter_by_repo")
def test_filter_by_agent():
"""Filter facts by agent type."""
facts = [
{"fact": "A", "category": "pattern", "repo": "global", "agent": "mimo-sprint", "confidence": 0.8},
{"fact": "B", "category": "pattern", "repo": "global", "agent": "groq-fast", "confidence": 0.7},
{"fact": "C", "category": "fact", "repo": "global", "confidence": 0.9},
]
filtered = filter_facts(facts, agent="mimo-sprint", include_global=True)
texts = [f["fact"] for f in filtered]
assert "A" in texts
assert "B" not in texts
assert "C" in texts # global, no agent restriction
print("PASS: filter_by_agent")
def test_no_global_flag():
"""Excluding global facts works."""
facts = [
{"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
{"fact": "B", "category": "fact", "repo": "global", "confidence": 0.8},
]
filtered = filter_facts(facts, repo="the-nexus", include_global=False)
texts = [f["fact"] for f in filtered]
assert "A" in texts
assert "B" not in texts
print("PASS: no_global_flag")
def test_sort_by_confidence():
"""Facts sort by confidence descending."""
facts = [
{"fact": "low", "category": "fact", "repo": "global", "confidence": 0.3},
{"fact": "high", "category": "fact", "repo": "global", "confidence": 0.95},
{"fact": "mid", "category": "fact", "repo": "global", "confidence": 0.7},
]
sorted_f = sort_facts(facts)
assert sorted_f[0]["fact"] == "high"
assert sorted_f[1]["fact"] == "mid"
assert sorted_f[2]["fact"] == "low"
print("PASS: sort_by_confidence")
def test_sort_pitfalls_first():
"""Pitfalls sort before facts at same confidence."""
facts = [
{"fact": "regular fact", "category": "fact", "repo": "global", "confidence": 0.8},
{"fact": "danger pitfall", "category": "pitfall", "repo": "global", "confidence": 0.8},
]
sorted_f = sort_facts(facts)
assert sorted_f[0]["category"] == "pitfall"
print("PASS: sort_pitfalls_first")
def test_truncate_to_tokens():
"""Truncation cuts at line boundary."""
text = "line1\nline2\nline3\nline4\nline5\n"
truncated = truncate_to_tokens(text, max_tokens=2) # ~8 chars
assert "line1" in truncated
assert "truncated" in truncated.lower()
print("PASS: truncate_to_tokens")
def test_estimate_tokens():
"""Token estimation is reasonable."""
text = "a" * 400
tokens = estimate_tokens(text)
assert 90 <= tokens <= 110 # ~100 tokens
print("PASS: estimate_tokens")
def test_build_full_context():
"""Full context with facts renders correctly."""
facts = [
{"fact": "API merges fail with 405", "category": "pitfall", "repo": "the-nexus", "confidence": 0.95},
{"fact": "Has 50+ open PRs", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
{"fact": "Token at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9},
{"fact": "Check git remote -v first", "category": "pattern", "repo": "global", "confidence": 0.8},
]
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
index_path = make_index(facts, tmp_dir)
# Create knowledge dirs
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
context = build_bootstrap_context(
repo="the-nexus",
agent="mimo-sprint",
include_global=True,
index_path=index_path,
)
assert "What You Know" in context
assert "PITFALLS" in context
assert "API merges fail with 405" in context
assert "the-nexus" in context
assert "Token at" in context # global fact included
print("PASS: build_full_context")
def test_max_tokens_respected():
"""Output respects max_tokens limit."""
# Generate lots of facts
facts = [
{"fact": f"Fact number {i} with some detail about things", "category": "fact", "repo": "global", "confidence": 0.8}
for i in range(100)
]
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
index_path = make_index(facts, tmp_dir)
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
context = build_bootstrap_context(
repo=None,
max_tokens=500,
index_path=index_path,
)
actual_tokens = estimate_tokens(context)
# Allow 10% overshoot since we cut at line boundaries
assert actual_tokens <= 550, f"Expected ~500 tokens, got {actual_tokens}"
print(f"PASS: max_tokens_respected (got {actual_tokens} tokens)")
def test_missing_index_graceful():
"""Missing index.json doesn't crash."""
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
# Don't create index.json
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
fake_index = tmp_dir / "nonexistent.json"
context = build_bootstrap_context(repo="anything", index_path=fake_index)
assert "No relevant knowledge found" in context
print("PASS: missing_index_graceful")
if __name__ == "__main__":
tests = [
test_empty_index,
test_filter_by_repo,
test_filter_by_agent,
test_no_global_flag,
test_sort_by_confidence,
test_sort_pitfalls_first,
test_truncate_to_tokens,
test_estimate_tokens,
test_build_full_context,
test_max_tokens_respected,
test_missing_index_graceful,
]
passed = 0
failed = 0
for test in tests:
try:
test()
passed += 1
except Exception as e:
print(f"FAIL: {test.__name__}{e}")
failed += 1
print(f"\n{passed} passed, {failed} failed")
sys.exit(0 if failed == 0 else 1)

View File

@@ -0,0 +1,111 @@
#!/usr/bin/env python3
"""Tests for gitea_issue_parser."""
import json
import sys
import os
sys.path.insert(0, os.path.dirname(__file__))
from gitea_issue_parser import parse_issue_body
def test_basic_structure():
body = """## Context
This is the background.
## Acceptance Criteria
- [ ] First criterion
- [x] Second criterion (already done)
- [ ] Third criterion
## Labels
`pipeline`, `extraction`
"""
result = parse_issue_body(body, "Test Issue", ["pipeline", "extraction"])
assert result["title"] == "Test Issue"
assert "background" in result["context"].lower()
assert len(result["criteria"]) == 3
assert "First criterion" in result["criteria"]
assert result["labels"] == ["pipeline", "extraction"]
print("PASS: test_basic_structure")
def test_epic_ref():
body = "Closes #645\n\nSome description."
result = parse_issue_body(body, "feat: thing (#688)")
assert result["epic_ref"] == "#645"
print("PASS: test_epic_ref")
def test_epic_ref_from_title():
body = "Some description without close ref."
result = parse_issue_body(body, "feat: scene descriptions (#645)")
assert result["epic_ref"] == "#645"
print("PASS: test_epic_ref_from_title")
def test_no_checkboxes():
body = """## Requirements
1. First thing
2. Second thing
3. Third thing
"""
result = parse_issue_body(body)
assert len(result["criteria"]) == 3
print("PASS: test_no_checkboxes")
def test_empty_body():
result = parse_issue_body("", "Empty Issue")
assert result["title"] == "Empty Issue"
assert result["criteria"] == []
assert result["context"] == ""
print("PASS: test_empty_body")
def test_real_issue_format():
body = """Closes #681
## Changes
Add `#!/usr/bin/env python3` shebang to 6 Python scripts.
## Verification
All 6 files confirmed missing shebangs before fix.
## Impact
Scripts can now be executed directly.
"""
result = parse_issue_body(body, "fix: add python3 shebangs (#685)")
assert result["epic_ref"] == "#681"
assert "shebang" in result["context"].lower()
print("PASS: test_real_issue_format")
def test_all_sections_captured():
body = """## Context
Background info.
## Acceptance Criteria
- [ ] Do thing
## Labels
`test`
"""
result = parse_issue_body(body)
assert "context" in result["sections"]
assert "acceptance criteria" in result["sections"]
print("PASS: test_all_sections_captured")
if __name__ == "__main__":
test_basic_structure()
test_epic_ref()
test_epic_ref_from_title()
test_no_checkboxes()
test_empty_body()
test_real_issue_format()
test_all_sections_captured()
print("\nAll tests passed.")

View File

@@ -1,129 +1,41 @@
#!/usr/bin/env python3
"""
Test harness for knowledge extraction prompt.
Validates output structure, content quality, and hallucination resistance.
Usage:
python3 scripts/test_harvest_prompt.py # Run all tests
python3 scripts/test_harvest_prompt.py --transcript FILE # Test against a real transcript
python3 scripts/test_harvest_prompt.py --validate FILE # Validate an existing extraction JSON
Test script for knowledge extraction prompt.
Validates that the prompt produces consistent, structured output.
"""
import json
import sys
import argparse
from pathlib import Path
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED_FIELDS = {"fact", "category", "repo", "confidence", "evidence"}
REQUIRED_META = {"session_outcome", "tools_used", "repos_touched", "error_count", "knowledge_count"}
def validate_knowledge_item(item, idx):
"""Validate a single knowledge item. Returns list of errors."""
errors = []
if not isinstance(item, dict):
return [f"Item {idx}: not a dict"]
for field in REQUIRED_FIELDS:
def validate_knowledge_item(item):
"""Validate a single knowledge item."""
required_fields = ["fact", "category", "repo", "confidence"]
for field in required_fields:
if field not in item:
errors.append(f"Item {idx}: missing field '{field}'")
if not isinstance(item.get("fact", ""), str) or len(item.get("fact", "").strip()) == 0:
errors.append(f"Item {idx}: fact must be a non-empty string")
if item.get("category") not in VALID_CATEGORIES:
errors.append(f"Item {idx}: invalid category '{item.get('category')}'")
if not isinstance(item.get("repo", ""), str) or len(item.get("repo", "").strip()) == 0:
errors.append(f"Item {idx}: repo must be a non-empty string")
conf = item.get("confidence")
if not isinstance(conf, (int, float)) or not (0.0 <= conf <= 1.0):
errors.append(f"Item {idx}: confidence must be a number 0.0-1.0, got {conf}")
if not isinstance(item.get("evidence", ""), str) or len(item.get("evidence", "").strip()) == 0:
errors.append(f"Item {idx}: evidence must be a non-empty string (hallucination check)")
return errors
return False, f"Missing field: {field}"
if not isinstance(item["fact"], str) or len(item["fact"].strip()) == 0:
return False, "Fact must be a non-empty string"
valid_categories = ["fact", "pitfall", "pattern", "tool-quirk", "question"]
if item["category"] not in valid_categories:
return False, f"Invalid category: {item['category']}"
if not isinstance(item["repo"], str):
return False, "Repo must be a string"
if not isinstance(item["confidence"], (int, float)):
return False, "Confidence must be a number"
if not (0.0 <= item["confidence"] <= 1.0):
return False, "Confidence must be between 0.0 and 1.0"
return True, "Valid"
def validate_extraction(data):
"""Validate a full extraction result. Returns (is_valid, errors, warnings)."""
errors = []
warnings = []
if not isinstance(data, dict):
return False, ["Root is not a JSON object"], []
if "knowledge" not in data:
return False, ["Missing 'knowledge' array"], []
if not isinstance(data["knowledge"], list):
return False, ["'knowledge' is not an array"], []
for i, item in enumerate(data["knowledge"]):
errors.extend(validate_knowledge_item(item, i))
# Meta block validation
if "meta" not in data:
warnings.append("Missing 'meta' block (session_outcome, tools_used, etc.)")
else:
meta = data["meta"]
for field in REQUIRED_META:
if field not in meta:
warnings.append(f"Meta missing field '{field}'")
# Quality checks
facts = data["knowledge"]
if len(facts) == 0:
warnings.append("No knowledge extracted (empty session or extraction failure)")
# Check for near-duplicate facts
seen_facts = set()
for item in facts:
normalized = item.get("fact", "").lower().strip()[:80]
if normalized in seen_facts:
warnings.append(f"Duplicate fact detected: '{normalized[:50]}...'")
seen_facts.add(normalized)
# Check confidence distribution
confidences = [item.get("confidence", 0) for item in facts]
if confidences:
avg_conf = sum(confidences) / len(confidences)
if avg_conf > 0.9:
warnings.append(f"Average confidence {avg_conf:.2f} is suspiciously high (possible hallucination)")
if avg_conf < 0.4:
warnings.append(f"Average confidence {avg_conf:.2f} is very low (extraction may be too cautious)")
return len(errors) == 0, errors, warnings
def validate_transcript_coverage(data, transcript):
"""
Check that extracted facts are actually supported by the transcript.
This is a heuristic — checks that key terms from facts appear in transcript.
Returns list of potential hallucinations.
"""
hallucinations = []
transcript_lower = transcript.lower()
for item in data.get("knowledge", []):
fact = item.get("fact", "")
evidence = item.get("evidence", "")
# Check if evidence string appears in transcript
if evidence and evidence.lower() not in transcript_lower:
# Partial match — check if key terms are present
evidence_words = set(evidence.lower().split())
transcript_words = set(transcript_lower.split())
overlap = evidence_words & transcript_words
if len(overlap) < min(3, len(evidence_words) * 0.3):
hallucinations.append({
"fact": fact[:80],
"evidence": evidence[:80],
"issue": "Evidence not found in transcript — possible hallucination"
})
return hallucinations
# --- Test transcripts ---
SAMPLE_TRANSCRIPT_SUCCESS = """
def test_sample_transcript():
"""Test with a sample transcript."""
sample_transcript = """
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
Agent: Trying git clone... Error: 401 Unauthorized
Agent: Checking for token... Found in ~/.config/gitea/token
@@ -132,253 +44,30 @@ Agent: Clone successful with token authentication
Agent: Now examining the repo structure...
Agent: Found knowledge/index.json with 0 facts
Agent: Creating harvest-prompt.md in templates/
Agent: Done. Files created: templates/harvest-prompt.md, scripts/test_harvest_prompt.py
"""
SAMPLE_TRANSCRIPT_FAILURE = """
User: Deploy the cron jobs to production
Agent: Connecting to VPS... SSH timeout after 30s
Agent: Retrying with verbose output... Connection refused on port 22
Agent: Checking firewall rules... iptables shows port 22 is open
Agent: Checking sshd config... sshd is not running
Agent: Attempting to start sshd... systemctl start sshd failed: unit not found
Agent: ERROR: OpenSSH server not installed on this VPS
Agent: Cannot proceed without SSH access. Deployment blocked.
"""
SAMPLE_TRANSCRIPT_API_QUIRK = """
User: List all open issues in hermes-agent
Agent: Querying Gitea API... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?state=open&limit=50
Agent: Response: 0 items. That's odd.
Agent: Trying without state filter... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?limit=50
Agent: Response: 50 items returned, but all have pull_request field. These are PRs, not issues.
Agent: Gitea issues API returns PRs alongside issues. Must filter by presence of pull_request key.
Agent: Also discovered: state=open parameter returns empty when used with limit=50, but works with limit=5.
Agent: This appears to be a Gitea API quirk — pagination interacts with state filter unexpectedly.
Agent: Workaround: fetch without state filter, then filter client-side for state=='open' and 'pull_request' not in item.
"""
def run_tests():
"""Run the built-in test suite."""
tests_passed = 0
tests_failed = 0
print("=" * 60)
print("KNOWLEDGE EXTRACTION PROMPT — TEST SUITE")
print("=" * 60)
# Test 1: Prompt file exists and is under 2k tokens (~8k chars)
print("\n[Test 1] Prompt file size constraint")
prompt_path = Path("templates/harvest-prompt.md")
if not prompt_path.exists():
print(" FAIL: harvest-prompt.md not found")
tests_failed += 1
else:
size = prompt_path.stat().st_size
# Rough token estimate: ~4 chars per token
est_tokens = size / 4
print(f" Prompt size: {size} bytes (~{est_tokens:.0f} tokens)")
if est_tokens > 2000:
print(f" WARN: Prompt exceeds ~1500 tokens (target: ~1000)")
else:
print(f" PASS: Within token budget")
tests_passed += 1
# Test 2: Validate a well-formed extraction
print("\n[Test 2] Valid extraction passes validation")
valid_extraction = {
"knowledge": [
{
"fact": "Gitea auth token is at ~/.config/gitea/token",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.9,
"evidence": "Found in ~/.config/gitea/token"
},
{
"fact": "Clone fails with 401 when no token is provided",
"category": "pitfall",
"repo": "compounding-intelligence",
"confidence": 0.9,
"evidence": "Error: 401 Unauthorized"
}
],
"meta": {
"session_outcome": "success",
"tools_used": ["git"],
"repos_touched": ["compounding-intelligence"],
"error_count": 1,
"knowledge_count": 2
}
}
is_valid, errors, warnings = validate_extraction(valid_extraction)
if is_valid:
print(f" PASS: Valid extraction accepted ({len(warnings)} warnings)")
tests_passed += 1
else:
print(f" FAIL: Valid extraction rejected: {errors}")
tests_failed += 1
# Test 3: Reject missing fields
print("\n[Test 3] Missing fields are rejected")
bad_extraction = {
"knowledge": [
{"fact": "Something learned", "category": "fact"} # Missing repo, confidence, evidence
]
}
is_valid, errors, warnings = validate_extraction(bad_extraction)
if not is_valid:
print(f" PASS: Rejected with {len(errors)} errors")
tests_passed += 1
else:
print(f" FAIL: Should have rejected missing fields")
tests_failed += 1
# Test 4: Reject invalid category
print("\n[Test 4] Invalid category is rejected")
bad_cat = {
"knowledge": [
{"fact": "Test", "category": "discovery", "repo": "x", "confidence": 0.8, "evidence": "test"}
]
}
is_valid, errors, warnings = validate_extraction(bad_cat)
if not is_valid and any("category" in e for e in errors):
print(f" PASS: Invalid category 'discovery' rejected")
tests_passed += 1
else:
print(f" FAIL: Should have rejected invalid category")
tests_failed += 1
# Test 5: Detect near-duplicates
print("\n[Test 5] Duplicate detection")
dup_extraction = {
"knowledge": [
{"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "a"},
{"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "b"}
],
"meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 2}
}
is_valid, errors, warnings = validate_extraction(dup_extraction)
if any("Duplicate" in w for w in warnings):
print(f" PASS: Duplicate detected")
tests_passed += 1
else:
print(f" FAIL: Should have detected duplicate")
tests_failed += 1
# Test 6: Hallucination check against transcript
print("\n[Test 6] Hallucination detection")
hallucinated = {
"knowledge": [
{
"fact": "Database port is 5433",
"category": "fact",
"repo": "x",
"confidence": 0.9,
"evidence": "PostgreSQL listening on port 5433"
}
],
"meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 1}
}
hallucinations = validate_transcript_coverage(hallucinated, SAMPLE_TRANSCRIPT_SUCCESS)
if hallucinations:
print(f" PASS: Hallucination detected ({len(hallucinations)} items)")
tests_passed += 1
else:
print(f" FAIL: Should have detected hallucinated evidence")
tests_failed += 1
# Test 7: Failed session should extract pitfalls
print("\n[Test 7] Failed session extraction shape")
failed_extraction = {
"knowledge": [
{
"fact": "SSH server not installed on target VPS",
"category": "pitfall",
"repo": "global",
"confidence": 0.9,
"evidence": "ERROR: OpenSSH server not installed on this VPS"
},
{
"fact": "VPS blocks deployment without SSH access",
"category": "question",
"repo": "global",
"confidence": 0.7,
"evidence": "Cannot proceed without SSH access. Deployment blocked."
}
],
"meta": {
"session_outcome": "failed",
"tools_used": ["ssh", "systemctl"],
"repos_touched": [],
"error_count": 3,
"knowledge_count": 2
}
}
is_valid, errors, warnings = validate_extraction(failed_extraction)
if is_valid:
categories = [item["category"] for item in failed_extraction["knowledge"]]
if "pitfall" in categories:
print(f" PASS: Failed session extracted {len(categories)} items including pitfalls")
tests_passed += 1
else:
print(f" FAIL: Failed session should extract pitfalls")
tests_failed += 1
else:
print(f" FAIL: {errors}")
tests_failed += 1
# Test 8: Empty extraction is warned
print("\n[Test 8] Empty extraction warning")
empty = {"knowledge": [], "meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 0}}
is_valid, errors, warnings = validate_extraction(empty)
if any("No knowledge" in w for w in warnings):
print(f" PASS: Empty extraction warned")
tests_passed += 1
else:
print(f" FAIL: Should warn on empty extraction")
tests_failed += 1
# Summary
print(f"\n{'=' * 60}")
print(f"Results: {tests_passed} passed, {tests_failed} failed")
print(f"{'=' * 60}")
return tests_failed == 0
def validate_file(filepath):
"""Validate an existing extraction JSON file."""
path = Path(filepath)
if not path.exists():
print(f"ERROR: {filepath} not found")
return False
data = json.loads(path.read_text())
is_valid, errors, warnings = validate_extraction(data)
print(f"Validation of {filepath}:")
print(f" Knowledge items: {len(data.get('knowledge', []))}")
print(f" Errors: {len(errors)}")
print(f" Warnings: {len(warnings)}")
for e in errors:
print(f" ERROR: {e}")
for w in warnings:
print(f" WARN: {w}")
return is_valid
# This would be replaced with actual prompt execution
print("Sample transcript processed")
print("Expected categories: fact, pitfall, pattern, tool-quirk, question")
return True
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Test knowledge extraction prompt")
parser.add_argument("--validate", help="Validate an existing extraction JSON file")
parser.add_argument("--transcript", help="Test against a real transcript file (informational)")
args = parser.parse_args()
if args.validate:
success = validate_file(args.validate)
sys.exit(0 if success else 1)
else:
success = run_tests()
sys.exit(0 if success else 1)
print("Testing knowledge extraction prompt...")
# Test 1: Validate prompt file exists
prompt_path = Path("templates/harvest-prompt.md")
if not prompt_path.exists():
print("ERROR: harvest-prompt.md not found")
sys.exit(1)
print(f"OK: Prompt file exists: {prompt_path}")
# Test 2: Check prompt size
prompt_size = prompt_path.stat().st_size
print(f"OK: Prompt size: {prompt_size} bytes")
# Test 3: Test sample transcript processing
if test_sample_transcript():
print("OK: Sample transcript test passed")
print("\nAll tests passed!")

View File

@@ -0,0 +1,212 @@
#!/usr/bin/env python3
"""
Comprehensive test script for knowledge extraction prompt.
Validates prompt structure, requirements, and consistency.
"""
import json
import re
from pathlib import Path
def test_prompt_structure():
"""Test that the prompt has the required structure."""
prompt_path = Path("templates/harvest-prompt.md")
if not prompt_path.exists():
return False, "harvest-prompt.md not found"
content = prompt_path.read_text()
# Check for required sections
required_sections = [
"System Prompt",
"Instructions",
"Categories",
"Output Format",
"Confidence Scoring",
"Constraints",
"Example"
]
for section in required_sections:
if section.lower() not in content.lower():
return False, f"Missing required section: {section}"
# Check for required categories
required_categories = ["fact", "pitfall", "pattern", "tool-quirk", "question"]
for category in required_categories:
if category not in content:
return False, f"Missing required category: {category}"
# Check for required output fields
required_fields = ["fact", "category", "repo", "confidence"]
for field in required_fields:
if field not in content:
return False, f"Missing required output field: {field}"
# Check prompt size (should be ~1k tokens, roughly 4k chars)
if len(content) > 5000:
return False, f"Prompt too large: {len(content)} chars (max ~5000)"
if len(content) < 1000:
return False, f"Prompt too small: {len(content)} chars (min ~1000)"
return True, "Prompt structure is valid"
def test_confidence_scoring():
"""Test that confidence scoring is properly defined."""
prompt_path = Path("templates/harvest-prompt.md")
content = prompt_path.read_text()
# Check for confidence scale definitions
confidence_levels = [
("0.9-1.0", "explicitly stated"),
("0.7-0.8", "clearly implied"),
("0.5-0.6", "suggested"),
("0.3-0.4", "inferred"),
("0.1-0.2", "speculative")
]
for level, description in confidence_levels:
if level not in content:
return False, f"Missing confidence level: {level}"
if description.lower() not in content.lower():
return False, f"Missing confidence description: {description}"
return True, "Confidence scoring is properly defined"
def test_example_quality():
"""Test that examples are clear and complete."""
prompt_path = Path("templates/harvest-prompt.md")
content = prompt_path.read_text()
# Check for example input/output
if "example" not in content.lower():
return False, "No examples provided"
# Check that example includes all categories
example_section = content[content.lower().find("example"):]
# Look for JSON example
json_match = re.search(r'\{[\s\S]*"knowledge"[\s\S]*\}', example_section)
if not json_match:
return False, "No JSON example found"
example_json = json_match.group(0)
# Check for all categories in example
for category in ["fact", "pitfall", "pattern", "tool-quirk", "question"]:
if category not in example_json:
return False, f"Example missing category: {category}"
return True, "Examples are clear and complete"
def test_constraint_coverage():
"""Test that constraints cover all requirements."""
prompt_path = Path("templates/harvest-prompt.md")
content = prompt_path.read_text()
required_constraints = [
"No hallucination",
"only extract",
"explicitly",
"partial",
"failed sessions",
"1k tokens"
]
for constraint in required_constraints:
if constraint.lower() not in content.lower():
return False, f"Missing constraint: {constraint}"
return True, "Constraints cover all requirements"
def test_test_sessions():
"""Test that test sessions exist and are valid."""
test_sessions_dir = Path("test_sessions")
if not test_sessions_dir.exists():
return False, "test_sessions directory not found"
session_files = list(test_sessions_dir.glob("*.jsonl"))
if len(session_files) < 5:
return False, f"Only {len(session_files)} test sessions found, need 5"
# Check each session file
for session_file in session_files:
content = session_file.read_text()
lines = content.strip().split("\n")
# Check that each line is valid JSON
for i, line in enumerate(lines, 1):
try:
json.loads(line)
except json.JSONDecodeError as e:
return False, f"Invalid JSON in {session_file.name}, line {i}: {e}"
return True, f"Found {len(session_files)} valid test sessions"
def run_all_tests():
"""Run all tests and return results."""
tests = [
("Prompt Structure", test_prompt_structure),
("Confidence Scoring", test_confidence_scoring),
("Example Quality", test_example_quality),
("Constraint Coverage", test_constraint_coverage),
("Test Sessions", test_test_sessions)
]
results = []
all_passed = True
for test_name, test_func in tests:
try:
passed, message = test_func()
results.append({
"test": test_name,
"passed": passed,
"message": message
})
if not passed:
all_passed = False
except Exception as e:
results.append({
"test": test_name,
"passed": False,
"message": f"Error: {str(e)}"
})
all_passed = False
# Print results
print("=" * 60)
print("HARVEST PROMPT TEST RESULTS")
print("=" * 60)
for result in results:
status = "✓ PASS" if result["passed"] else "✗ FAIL"
print(f"{status}: {result['test']}")
print(f" {result['message']}")
print()
print("=" * 60)
if all_passed:
print("ALL TESTS PASSED!")
else:
print("SOME TESTS FAILED!")
print("=" * 60)
return all_passed, results
if __name__ == "__main__":
all_passed, results = run_all_tests()
# Save results to file
with open("test_results.json", "w") as f:
json.dump({
"all_passed": all_passed,
"results": results,
"timestamp": "2026-04-14T19:05:00Z"
}, f, indent=2)
print(f"Results saved to test_results.json")
# Exit with appropriate code
exit(0 if all_passed else 1)

View File

@@ -1,155 +0,0 @@
#!/usr/bin/env python3
"""
Validate knowledge files and index.json against the schema.
Usage:
python scripts/validate_knowledge.py [--fix]
Without --fix: reports errors and exits non-zero if any found.
With --fix: auto-generates missing IDs and updates index.json.
"""
import json
import sys
import os
from pathlib import Path
from datetime import datetime
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED_FACT_FIELDS = {"id", "fact", "category", "domain", "confidence"}
MAX_FACT_LENGTH = 280
def validate_fact(fact: dict, source: str = "") -> list[str]:
"""Validate a single fact dict. Returns list of errors."""
errors = []
for field in REQUIRED_FACT_FIELDS:
if field not in fact:
errors.append(f"{source}: missing required field '{field}'")
if "fact" in fact:
if not isinstance(fact["fact"], str) or len(fact["fact"].strip()) == 0:
errors.append(f"{source}: 'fact' must be non-empty string")
elif len(fact["fact"]) > MAX_FACT_LENGTH:
errors.append(f"{source}: 'fact' exceeds {MAX_FACT_LENGTH} chars ({len(fact['fact'])})")
if "category" in fact and fact["category"] not in VALID_CATEGORIES:
errors.append(f"{source}: invalid category '{fact['category']}' — must be one of {VALID_CATEGORIES}")
if "confidence" in fact:
if not isinstance(fact["confidence"], (int, float)):
errors.append(f"{source}: 'confidence' must be a number")
elif not (0.0 <= fact["confidence"] <= 1.0):
errors.append(f"{source}: 'confidence' must be 0.01.0, got {fact['confidence']}")
if "id" in fact:
parts = fact["id"].split(":")
if len(parts) != 3:
errors.append(f"{source}: 'id' must be domain:category:sequence, got '{fact['id']}'")
elif parts[1] not in VALID_CATEGORIES:
errors.append(f"{source}: id category '{parts[1]}' not in {VALID_CATEGORIES}")
if "tags" in fact:
if not isinstance(fact["tags"], list):
errors.append(f"{source}: 'tags' must be a list")
else:
for tag in fact["tags"]:
if not isinstance(tag, str) or not tag.replace("-", "").replace("_", "").isalnum():
errors.append(f"{source}: tag '{tag}' must be lowercase alphanumeric+hyphens")
return errors
def validate_index(index_path: Path) -> list[str]:
"""Validate index.json."""
errors = []
if not index_path.exists():
return [f"index.json not found at {index_path}"]
try:
with open(index_path) as f:
data = json.load(f)
except json.JSONDecodeError as e:
return [f"index.json: invalid JSON — {e}"]
if "version" not in data:
errors.append("index.json: missing 'version' field")
if "facts" not in data:
errors.append("index.json: missing 'facts' field")
elif not isinstance(data["facts"], list):
errors.append("index.json: 'facts' must be a list")
seen_ids = set()
for i, fact in enumerate(data.get("facts", [])):
fact_errors = validate_fact(fact, source=f"index.json facts[{i}]")
errors.extend(fact_errors)
if "id" in fact:
if fact["id"] in seen_ids:
errors.append(f"index.json: duplicate id '{fact['id']}'")
seen_ids.add(fact["id"])
return errors
def validate_yaml_facts(facts: list[dict], source: str) -> list[str]:
"""Validate facts extracted from a YAML file."""
errors = []
seen_ids = set()
for i, fact in enumerate(facts):
fact_errors = validate_fact(fact, source=f"{source}[{i}]")
errors.extend(fact_errors)
if "id" in fact:
if fact["id"] in seen_ids:
errors.append(f"{source}: duplicate id '{fact['id']}'")
seen_ids.add(fact["id"])
return errors
def main():
fix_mode = "--fix" in sys.argv
repo_root = Path(__file__).parent.parent
knowledge_dir = repo_root / "knowledge"
index_path = knowledge_dir / "index.json"
all_errors = []
# Validate index.json
index_errors = validate_index(index_path)
all_errors.extend(index_errors)
# Validate YAML files (basic existence check — full YAML parsing requires pyyaml)
yaml_dirs = ["global", "repos", "agents"]
for dir_name in yaml_dirs:
dir_path = knowledge_dir / dir_name
if not dir_path.exists():
all_errors.append(f"knowledge/{dir_name}/ directory not found")
# Report
if all_errors:
print(f"VALIDATION FAILED — {len(all_errors)} error(s):\n")
for err in all_errors:
print(f"{err}")
sys.exit(1)
else:
# Count facts
try:
with open(index_path) as f:
data = json.load(f)
fact_count = len(data.get("facts", []))
except:
fact_count = 0
print(f"VALIDATION PASSED")
print(f" index.json: {fact_count} facts")
print(f" schema: v1")
sys.exit(0)
if __name__ == "__main__":
main()

View File

@@ -2,107 +2,98 @@
## System Prompt
You are a knowledge extraction engine. You read session transcripts and output ONLY structured JSON. You never infer. You never assume. You extract only what the transcript explicitly states.
You are a knowledge extraction engine. Your task is to analyze a session transcript and extract durable knowledge that will help future sessions be more efficient.
## Prompt
## Instructions
Read the session transcript carefully. Extract ONLY information that is explicitly stated in the transcript. Do NOT infer, assume, or hallucinate information.
### Categories
Extract knowledge into these categories:
1. **fact**: Concrete, verifiable information learned (e.g., "Repository X has 5 files", "API returns JSON with field Y")
2. **pitfall**: Errors encountered, wrong assumptions, things that wasted time (e.g., "Assumed API token was in env var GITEA_TOKEN, but it's in ~/.config/gitea/token")
3. **pattern**: Successful sequences of actions (e.g., "To deploy: 1. Run tests 2. Build 3. Push to Gitea 4. Trigger webhook")
4. **tool-quirk**: Environment-specific behaviors (e.g., "Token paths are different on macOS vs Linux", "URL format requires trailing slash")
5. **question**: Things identified but not answered (e.g., "Need to determine optimal batch size for harvesting")
### Output Format
Return a JSON object with an array of extracted knowledge items. Each item must have:
```json
{
"fact": "One sentence description of the knowledge",
"category": "fact|pitfall|pattern|tool-quirk|question",
"repo": "Repository name this applies to, or 'global' if general",
"confidence": 0.0-1.0
}
```
TASK: Extract durable knowledge from this session transcript.
RULES:
1. Extract ONLY information explicitly stated in the transcript.
2. Do NOT infer, assume, or hallucinate.
3. Every fact must be verifiable by pointing to a specific line in the transcript.
4. If the session failed or was partial, extract pitfalls and questions — these are the most valuable.
5. Be specific. "Gitea API is slow" is worthless. "Gitea issues endpoint with state=open returns empty when limit=50 but works with limit=5" is knowledge.
### Confidence Scoring
CATEGORIES (assign exactly one per item):
- fact: Concrete, verifiable thing learned (paths, formats, counts, configs)
- pitfall: Error hit, wrong assumption, time wasted, thing that didn't work
- pattern: Successful sequence that should be reused (deploy steps, debug flow)
- tool-quirk: Environment-specific behavior (token paths, URL formats, API gotchas)
- question: Something identified but not answered — the NEXT agent should investigate
- 0.9-1.0: Explicitly stated with verification (e.g., "Error message shows X")
- 0.7-0.8: Clearly implied by multiple data points
- 0.5-0.6: Suggested but not fully verified
- 0.3-0.4: Inferred from limited data
- 0.1-0.2: Speculative or uncertain
CONFIDENCE:
- 0.9: Directly observed with error output or explicit verification
- 0.7: Multiple data points confirm, but not explicitly verified
- 0.5: Suggested by context, not tested
- 0.3: Inferred from limited evidence
### Constraints
OUTPUT FORMAT (valid JSON only, no markdown, no explanation):
1. **No hallucination**: Only extract what's explicitly in the transcript
2. **Specificity**: Each fact must be specific and actionable
3. **Relevance**: Only extract knowledge that would help future sessions
4. **Brevity**: One sentence per fact
5. **Partial sessions**: Even failed or incomplete sessions may contain valuable pitfalls
### Example Input/Output
**Input Transcript (excerpt):**
```
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
Agent: Trying git clone... Error: 401 Unauthorized
Agent: Checking for token... Found in ~/.config/gitea/token
Agent: Token is gitea_token format, not OAuth
Agent: Clone successful with token authentication
```
**Output:**
```json
{
"knowledge": [
{
"fact": "One specific sentence of knowledge",
"category": "fact|pitfall|pattern|tool-quirk|question",
"repo": "repo-name or global",
"confidence": 0.0-1.0,
"evidence": "Brief quote or reference from transcript that supports this"
"fact": "Gitea repo at forge.alexanderwhitestone.com requires authentication for cloning",
"category": "fact",
"repo": "compounding-intelligence",
"confidence": 0.9
},
{
"fact": "Gitea authentication token is stored at ~/.config/gitea/token",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.9
},
{
"fact": "Gitea uses gitea_token format, not OAuth for API access",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.8
},
{
"fact": "Clone fails with 401 when no token is provided",
"category": "pitfall",
"repo": "compounding-intelligence",
"confidence": 0.9
}
],
"meta": {
"session_outcome": "success|partial|failed",
"tools_used": ["tool1", "tool2"],
"repos_touched": ["repo1"],
"error_count": 0,
"knowledge_count": 0
}
]
}
TRANSCRIPT:
{{transcript}}
```
## Design Notes
## Final Notes
### Why this works with mimo-v2-pro
Mimo needs:
- Explicit format constraints ("valid JSON only, no markdown")
- Clear category definitions with concrete examples
- Hard rules before soft guidance
- The transcript at the END (so it reads all instructions first)
This prompt front-loads all rules, then gives the transcript last. Mimo follows the pattern.
### Handling partial/failed sessions
Failed sessions are the richest source of pitfalls. The prompt explicitly says:
> "If the session failed or was partial, extract pitfalls and questions — these are the most valuable."
This reframes failure as valuable output, not noise to discard.
### The `evidence` field
Added to the original spec. Every extracted item must cite where in the transcript it came from. This:
- Prevents hallucination (can't cite what isn't there)
- Enables verification (reviewer can check the source)
- Trains confidence calibration (the agent must find evidence, not just claim it)
### Token budget
Target: ~1,000 tokens for the prompt (excluding transcript).
```
System prompt: ~50 tokens
Rules: ~200 tokens
Categories: ~150 tokens
Confidence: ~100 tokens
Output format: ~200 tokens
Design notes: NOT included in prompt (documentation only)
─────────────────────────────
Total prompt: ~700 tokens
```
Leaves ~300 tokens headroom for variable content (transcript insertion, edge cases).
### What this replaces
The v1 prompt had:
- Verbose prose explanations (waste tokens for mimo)
- No `evidence` field (hallucination risk)
- No `meta` block (no session-level metadata)
- No explicit handling of failed sessions
- Example was too long (~150 tokens of example for a 1k prompt)
This v2 is tighter, more structured, and adds the evidence requirement that prevents the #1 failure mode of extraction prompts: generating plausible-sounding facts that aren't in the transcript.
- Process the entire transcript, not just the beginning
- Pay special attention to errors and corrections
- Note any environment-specific details
- Track tool-specific behaviors and quirks
- If the session failed, focus on pitfalls and questions

View File

@@ -0,0 +1,18 @@
{"role": "user", "content": "Deploy the application to production"}
{"role": "assistant", "content": "I'll deploy the application to production.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@prod-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "ssh: connect to host prod-server port 22: Connection refused"}
{"role": "assistant", "content": "Connection refused. Let me check the server status.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ping -c 3 prod-server"}}]}
{"role": "tool", "content": "PING prod-server (192.168.1.100): 56 data bytes\nRequest timeout for icmp_seq 0\nRequest timeout for icmp_seq 1\nRequest timeout for icmp_seq 2\n--- prod-server ping statistics ---\n3 packets transmitted, 0 packets received, 100.0% packet loss"}
{"role": "assistant", "content": "The server appears to be down. Let me check if there's an alternative deployment method."}
{"role": "user", "content": "Try using the VPS instead"}
{"role": "assistant", "content": "I'll try deploying to the VPS.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@vps-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "Permission denied (publickey)."}
{"role": "assistant", "content": "Permission denied. The SSH key might not be set up correctly. Let me check the SSH configuration.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat ~/.ssh/config"}}]}
{"role": "tool", "content": "Host vps-server\n HostName 167.99.126.228\n User deploy\n IdentityFile ~/.ssh/id_rsa_vps"}
{"role": "assistant", "content": "The SSH config looks correct. Let me check if the key exists.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la ~/.ssh/id_rsa_vps"}}]}
{"role": "tool", "content": "ls: /Users/user/.ssh/id_rsa_vps: No such file or directory"}
{"role": "assistant", "content": "The SSH key doesn't exist. This is a pitfall - we need to generate the SSH key first."}
{"role": "user", "content": "Generate a new SSH key for the VPS"}
{"role": "assistant", "content": "I'll generate a new SSH key.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_vps -N ''"}}]}
{"role": "tool", "content": "Generating public/private rsa key pair.\nYour identification has been saved in /Users/user/.ssh/id_rsa_vps\nYour public key has been saved in /Users/user/.ssh/id_rsa_vps.pub\nThe key fingerprint is:\nSHA256:abc123... user@machine"}
{"role": "assistant", "content": "SSH key generated. Now I need to add it to the VPS."}

View File

@@ -0,0 +1,14 @@
{"role": "user", "content": "Check the Gitea API for open issues"}
{"role": "assistant", "content": "I'll check the Gitea API for open issues.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://forge.alexanderwhitestone.com/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (6) Could not resolve host: forge.alexanderwhitestone.com"}
{"role": "assistant", "content": "DNS resolution failed. Let me check the network connectivity.", "tool_calls": [{"function": "terminal", "arguments": {"command": "nslookup forge.alexanderwhitestone.com"}}]}
{"role": "tool", "content": "Server: 8.8.8.8\nAddress: 8.8.8.8#53\n\n** server can't find forge.alexanderwhitestone.com: NXDOMAIN"}
{"role": "assistant", "content": "The domain doesn't resolve. This might be a DNS issue. Let me try using the IP address directly.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (60) SSL certificate problem: self signed certificate"}
{"role": "assistant", "content": "SSL certificate issue. The server is using a self-signed certificate. Let me try with -k flag to skip SSL verification.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "[]"}
{"role": "assistant", "content": "No open issues found. The API returned an empty array."}
{"role": "user", "content": "Check if there are any pull requests"}
{"role": "assistant", "content": "Let me check for pull requests.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/pulls?state=open"}}]}
{"role": "tool", "content": "[{"id": 20, "title": "feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)", "state": "open", "user": {"login": "Rockachopa"}, "created_at": "2026-04-14T17:28:20Z"}]"}
{"role": "assistant", "content": "Found 1 open pull request: PR #20 - "feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)" by Rockachopa."}

View File

@@ -0,0 +1,10 @@
{"role": "user", "content": "How do I deploy a new version of the application?"}
{"role": "assistant", "content": "Here's the deployment process:\n1. Run tests: pytest\n2. Build Docker image: docker build -t app:latest .\n3. Push to registry: docker push registry.example.com/app:latest\n4. Update production: ssh prod 'cd /app && docker-compose pull && docker-compose up -d'\n5. Verify: curl https://app.example.com/health"}
{"role": "user", "content": "What if tests fail?"}
{"role": "assistant", "content": "If tests fail:\n1. Check the error logs: pytest -v\n2. Fix the failing tests\n3. Re-run tests: pytest\n4. If still failing, check recent changes: git log --oneline -10\n5. Consider reverting: git revert HEAD"}
{"role": "user", "content": "What if the Docker build fails?"}
{"role": "assistant", "content": "If Docker build fails:\n1. Check Dockerfile syntax: docker build --no-cache -t app:latest .\n2. Check for missing dependencies: cat requirements.txt\n3. Check disk space: df -h\n4. Clean up Docker: docker system prune -f\n5. Try building with verbose output: docker build --progress=plain -t app:latest ."}
{"role": "user", "content": "What if the push to registry fails?"}
{"role": "assistant", "content": "If push to registry fails:\n1. Check registry credentials: docker login registry.example.com\n2. Check registry space: curl https://registry.example.com/v2/_catalog\n3. Check image size: docker images app:latest\n4. Try tagging with version: docker tag app:latest registry.example.com/app:v1.0.0\n5. Push with version tag: docker push registry.example.com/app:v1.0.0"}
{"role": "user", "content": "What if the production update fails?"}
{"role": "assistant", "content": "If production update fails:\n1. Check SSH connection: ssh prod 'echo connected'\n2. Check disk space on prod: ssh prod 'df -h'\n3. Check Docker status: ssh prod 'docker ps -a'\n4. Check logs: ssh prod 'docker-compose logs --tail=100'\n5. Rollback: ssh prod 'cd /app && git checkout HEAD~1 && docker-compose up -d'"}

View File

@@ -0,0 +1,12 @@
{"role": "user", "content": "I need to implement a new feature for the application"}
{"role": "assistant", "content": "I can help you implement a new feature. What feature would you like to add?"}
{"role": "user", "content": "I want to add user authentication"}
{"role": "assistant", "content": "User authentication is a good feature. I have a few questions:\n1. What authentication method do you prefer? (JWT, OAuth, Session-based)\n2. Do you need social login? (Google, GitHub, etc.)\n3. What user data do you need to store?\n4. Do you need role-based access control?\n5. What's your database? (PostgreSQL, MySQL, SQLite)"}
{"role": "user", "content": "I prefer JWT, no social login, just email and password, no roles, PostgreSQL"}
{"role": "assistant", "content": "Great choices. Here's the implementation plan:\n1. Create user model with email and password hash\n2. Create JWT token generation and validation\n3. Create login endpoint: POST /api/auth/login\n4. Create register endpoint: POST /api/auth/register\n5. Create middleware for protected routes\n6. Add password hashing with bcrypt\n7. Add rate limiting for login attempts\n8. Add password reset functionality\n9. Add email verification\n10. Test all endpoints"}
{"role": "user", "content": "What about password reset?"}
{"role": "assistant", "content": "For password reset:\n1. Create reset token table\n2. Create forgot password endpoint: POST /api/auth/forgot-password\n3. Send reset email with token\n4. Create reset password endpoint: POST /api/auth/reset-password\n5. Validate reset token and expiration\n6. Update password and delete token\n7. Send confirmation email"}
{"role": "user", "content": "What about email verification?"}
{"role": "assistant", "content": "For email verification:\n1. Add email_verified field to user model\n2. Create verification token table\n3. Send verification email on registration\n4. Create verify email endpoint: GET /api/auth/verify-email?token=...\n5. Validate token and update email_verified\n6. Add middleware to check email_verified for certain routes\n7. Add resend verification email endpoint"}
{"role": "user", "content": "What about rate limiting?"}
{"role": "assistant", "content": "For rate limiting:\n1. Use Redis for rate limit storage\n2. Create rate limit middleware\n3. Set limits: 5 login attempts per minute, 3 password resets per hour\n4. Return 429 Too Many Requests when limit exceeded\n5. Add X-RateLimit headers to responses\n6. Log rate limit violations\n7. Add whitelist for trusted IPs"}

View File

@@ -0,0 +1,13 @@
{"role": "user", "content": "Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence"}
{"role": "assistant", "content": "I'll clone the repository for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "git clone https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence.git"}}]}
{"role": "tool", "content": "Cloning into 'compounding-intelligence'...\nremote: Enumerating objects: 15, done.\nremote: Counting objects: 100% (15/15), done.\nremote: Compressing objects: 100% (12/12), done.\nremote: Total 15 (delta 2), reused 0 (delta 0), pack-reused 0\nReceiving objects: 100% (15/15), done.\nResolving deltas: 100% (2/2), done."}
{"role": "assistant", "content": "Repository cloned successfully. Let me check the structure.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/"}}]}
{"role": "tool", "content": "total 8\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 .\ndrwxr-xr-x 12 user staff 384 Apr 14 17:21 ..\ndrwxr-xr-x 13 user staff 416 Apr 14 17:21 .git\n-rw-r--r-- 1 user staff 1234 Apr 14 17:21 README.md\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 knowledge\n-rw-r--r-- 1 user staff 456 Apr 14 17:21 metrics\ndrwxr-xr-x 4 user staff 128 Apr 14 17:21 scripts\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 templates"}
{"role": "user", "content": "What files are in the templates directory?"}
{"role": "assistant", "content": "Let me check the templates directory.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/templates/"}}]}
{"role": "tool", "content": "total 0\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 .\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 ..\n-rw-r--r-- 1 user staff 0 Apr 14 17:21 .gitkeep"}
{"role": "assistant", "content": "The templates directory currently only contains a .gitkeep file. No templates have been added yet."}
{"role": "user", "content": "Create a harvest-prompt.md template in the templates directory."}
{"role": "assistant", "content": "I'll create the harvest-prompt.md template for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat > compounding-intelligence/templates/harvest-prompt.md << 'EOF'\n# Knowledge Extraction Prompt\n\nExtract knowledge from session transcripts.\nEOF"}}]}
{"role": "tool", "content": ""}
{"role": "assistant", "content": "Template created successfully. The harvest-prompt.md file is now in the templates directory."}