
1 Commits

3 changed files with 640 additions and 0 deletions

184
WIKI.md Normal file

@@ -0,0 +1,184 @@
# LLM Wiki Layer — Documentation
**Status:** Implemented (2026-04-27)
**Issue:** Timmy_Foundation/compounding-intelligence#231
**Parent:** Timmy_Foundation/hermes-agent#984 ([ATLAS] Steal Atlas ecosystem patterns)
---
## Overview
The **LLM Wiki layer** is a sovereign knowledge interface built on top of the `knowledge/` fact store. It provides:
| Capability | Command | Description |
|------------|---------|-------------|
| **Ingest** | `wiki ingest --session <file>` | Harvest facts from session transcripts via LLM extraction |
| **Crystallize** | `wiki crystal --session <file>` | Alias for ingest — session distillation into durable pages |
| **Query** | `wiki query "<question>"` | RAG-style retrieval + LLM synthesis with citations |
| **Lint** | `wiki lint` | Detect staleness, duplicates, and potential contradictions |
Location: `scripts/wiki.py` (entry point)
---
## How It Differs From…
### RAG (Retrieval-Augmented Generation)
**RAG** retrieves raw chunks (e.g., code snippets, paragraph strings) and feeds them to an LLM. Chunks are unnormalized, unscored, and carry no provenance beyond the source file path.
**LLM Wiki** retrieves *normalized facts* from `knowledge/index.json` — each fact has:
- A unique ID (`domain:category:seq`)
- A confidence score (0.0 to 1.0)
- Provenance (`source_session`, `source_count`, `first_seen`, `last_confirmed`)
- Explicit category (`fact` | `pitfall` | `pattern` | `tool-quirk` | `question`)
- Tags for cross-domain linking
The query path formats facts with their IDs and asks the LLM to cite `[N]` indices, preserving traceability.
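
To make the fact shape concrete, here is a minimal sketch of one record plus an ID check. The field names come from the list above; the `Fact` dataclass, the `parse_fact_id` helper, and the sample confidence and tags are hypothetical and are not taken from `SCHEMA.md`.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Categories listed in this document.
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}


@dataclass
class Fact:
    """Hypothetical in-memory view of one entry in knowledge/index.json."""
    id: str
    fact: str
    category: str
    confidence: float
    tags: List[str] = field(default_factory=list)
    source_session: str = ""
    source_count: int = 1
    first_seen: str = ""
    last_confirmed: str = ""


def parse_fact_id(fact_id: str) -> Tuple[str, str, int]:
    """Split a `domain:category:seq` ID into its three parts."""
    domain, category, seq = fact_id.rsplit(":", 2)
    if category not in CATEGORIES:
        raise ValueError(f"unknown category in ID: {category!r}")
    return domain, category, int(seq)


example = Fact(
    id="hermes-agent:pitfall:001",
    fact="deploy-crons.py leaves jobs in mixed model format",
    category="pitfall",
    confidence=0.9,  # illustrative value
    tags=["cron", "deploy"],  # illustrative tags
)
print(parse_fact_id(example.id))  # ('hermes-agent', 'pitfall', 1)
```

Splitting from the right keeps domains that contain hyphens or other punctuation intact, since only the last two segments carry fixed meaning.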
### Transcript Search
**Transcript search** is keyword grep over raw session JSONL files. It shows exactly what was said and when, but you must extract the insight manually.
**LLM Wiki** is *distilled insight* — the harvester already extracted durable knowledge from sessions (via LLM extraction prompt). The wiki layer queries that distilled store, not the noisy raw transcripts.
---
## Architecture
```
┌─────────────────┐
│  Session JSONL  │   ← raw session transcripts
└────────┬────────┘
         │ harvester.py (ingest)
         ▼
┌─────────────────┐
│ knowledge/index.json   ← canonical fact index (machine-readable)
│ knowledge/*.md         ← human-editable pages (durable wiki pages)
└────────┬────────┘
         │ wiki.py (query)
         ▼
  retrieve_facts()         format_facts_as_context()
         │                             │
         └────────────┬────────────────┘
                      ▼
        LLM synthesis with citations
                      │
                      ▼
                answer string
```
- **Ingest path:** `harvester.py` → `write_knowledge()` updates `index.json` and appends to `knowledge/{global,repos}/*.md`
- **Query path:** `wiki query` → `retrieve_facts()` (BM25-ish keyword + tag + confidence + recency) → `call_llm_synthesize()` → cited answer
- **Lint path:** `wiki lint` → `freshness.py` (source-hash staleness) + duplicate detection + contradiction heuristic
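
The ranking on the query path can be worked through by hand. The sketch below restates the weighting used by `score_fact_for_query()` in `scripts/wiki.py` from this change (one point per query term found in the text, 3.0 per matching tag scaled by confidence, half-weighted recency). The `score` helper and its `recency` parameter are simplifications for illustration, not the module's API.

```python
import re


def score(fact: dict, query: str, recency: float = 0.0) -> float:
    """Illustrative restatement of the wiki scoring formula."""
    query_lower = query.lower()
    # Query terms: alphanumeric tokens longer than two characters.
    terms = {t for t in re.split(r"\W+", query_lower) if len(t) > 2}
    text = fact.get("fact", "").lower()
    tf = sum(1 for t in terms if t in text)  # substring hits, one per term
    tag_boost = sum(3.0 for tag in fact.get("tags", []) if tag.lower() in query_lower)
    confidence = fact.get("confidence", 0.5)
    return tf + tag_boost * confidence + 0.5 * recency


fact = {
    "fact": "Gitea token is stored at ~/.config/gitea/token",
    "tags": ["token", "gitea", "auth"],
    "confidence": 0.95,
}
# Terms: where, gitea, token, stored -> 3 of them occur in the text.
# Tags "token" and "gitea" occur in the query -> 6.0 * 0.95 = 5.7.
print(round(score(fact, "where is gitea token stored?"), 2))  # 8.7
```

Note that tag matches dominate: two matching tags on a high-confidence fact outweigh three term hits, which is why well-tagged facts tend to rank first.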
---
## Usage Examples
### Query the wiki
```bash
# Ask a question (uses HARVESTER_API_KEY / OPENROUTER_API_KEY)
python3 scripts/wiki.py query "How do I fix deploy-crons mixed model format?"
# Retrieve-only (dry-run) to inspect context
python3 scripts/wiki.py query "gitea token location" --dry-run --top 5
# With custom search depth
python3 scripts/wiki.py query "cron job pitfalls" --top 20
```
Sample output:
```
→ Retrieved 3 facts:
[1] hermes-agent:pitfall:001: deploy-crons.py leaves jobs in mixed model format
[2] hermes-agent:pitfall:002: deploy-crons.py --deploy doesn't set legacy skill field
[3] hermes-agent:pitfall:003: Cron jobs with blank fallback_model trigger warnings
← Answer: The mixed model format bug in deploy-crons.py (pitfall #001) leaves jobs unparsed;
ensure all cron jobs specify a single model provider. (#002) Verify fallback_model is never blank (#003). [1][2][3]
```
### Ingest from a session
```bash
# Harvest knowledge from a finished session
python3 scripts/wiki.py ingest --session ~/.hermes/sessions/session_20260427.jsonl
# Dry-run preview (no writes)
python3 scripts/wiki.py ingest --session session.jsonl --dry-run
```
This invokes `harvester.py` under the hood, which:
1. Reads the transcript via `session_reader.py`
2. Calls the LLM extraction prompt (templates/harvest-prompt.md)
3. Validates + deduplicates + writes to `knowledge/`
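
Step 3 is where malformed extractions would be filtered out. The following is a hedged sketch of what such validation could look like; `validate_fact` and its rules are assumptions based on the fields described in this document, not the actual checks in `harvester.py`.

```python
from typing import List

ALLOWED_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}


def validate_fact(candidate: dict) -> List[str]:
    """Return a list of problems; an empty list means the candidate passes."""
    problems = []
    text = candidate.get("fact", "")
    if not isinstance(text, str) or not text.strip():
        problems.append("missing fact text")
    if candidate.get("category") not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category: {candidate.get('category')!r}")
    confidence = candidate.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be a number in [0.0, 1.0]")
    return problems


good = {"fact": "Cron jobs require model field", "category": "pitfall", "confidence": 0.8}
bad = {"fact": "  ", "category": "rumor", "confidence": 1.7}
print(validate_fact(good))      # []
print(len(validate_fact(bad)))  # 3
```

Returning a list of problems rather than raising on the first one makes a `--dry-run` preview more useful, since every issue in a candidate can be reported at once.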
### Lint the knowledge base
```bash
# Run all checks: staleness (freshness.py), duplicates, contradictions
python3 scripts/wiki.py lint
```
Output:
```
WARNINGS (6):
⚠ Potential contradiction in hermes-agent/pitfall: hermes-agent:pitfall:001 vs hermes-agent:pitfall:002
⚠ Duplicate fact text: 'Token is at ~/.config/gitea/token'... IDs: global:tool-quirk:001, global:tool-quirk:005
... (remaining warnings omitted)
```
> **Note:** Contradiction detection is heuristic (word-overlap based). Human review required.
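
Because the check counts shared words rather than comparing meaning, paraphrases are flagged just like real conflicts. A small sketch of the overlap measure, assuming the same tokenization as `detect_contradictions()` in `scripts/wiki.py` (the `word_overlap` helper is illustrative only):

```python
import re
from typing import Tuple


def word_overlap(a: str, b: str) -> Tuple[int, float]:
    """Return (shared word count, shared / size of the larger word set)."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    shared = len(words_a & words_b)
    return shared, shared / max(len(words_a), len(words_b), 1)


# These two sentences agree with each other, yet they share five of their
# six distinct words, so a threshold of three shared words reports the pair.
shared, ratio = word_overlap("Deploy uses port 22 for SSH", "Deploy uses SSH on port 22")
print(shared, round(ratio, 2))  # 5 0.83
```

In practice this means a flagged pair is a prompt to compare two facts, not evidence that they disagree.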
### Crystallize a session
```bash
# Alias for ingest — explicit "session distillation" terminology
python3 scripts/wiki.py crystal --session ~/.hermes/sessions/recent.jsonl
```
---
## Configuration
| Env Var | Default | Purpose |
|----------|---------|---------|
| `HARVESTER_API_KEY` | — | LLM API key (Nous/OpenRouter) |
| `OPENROUTER_API_KEY` | — | Alternative key location |
| `HARVESTER_API_BASE` | `https://api.nousresearch.com/v1` | LLM base URL |
| `HARVESTER_MODEL` | `xiaomi/mimo-v2-pro` | Model for synthesis |
API keys are also read from `~/.config/nous/key`, `~/.hermes/keymaxxing/active/minimax.key`, or `~/.config/openrouter/key` if env vars are unset.
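
The lookup order described above (environment variables first, key files as fallback) can be sketched as follows. The `resolve_api_key` function and its parameters are hypothetical names introduced for illustration; only the variable names and file paths come from this document.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional

ENV_VARS = ("HARVESTER_API_KEY", "OPENROUTER_API_KEY")
KEY_FILES = (
    Path.home() / ".config/nous/key",
    Path.home() / ".hermes/keymaxxing/active/minimax.key",
    Path.home() / ".config/openrouter/key",
)


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    key_files: Iterable[Path] = KEY_FILES,
) -> str:
    """Return the first non-empty key: env vars first, then key files."""
    env = os.environ if env is None else env
    for name in ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    for path in key_files:
        if path.is_file():
            value = path.read_text().strip()
            if value:
                return value
    return ""


# Passing an explicit mapping and no key files keeps the example hermetic.
print(bool(resolve_api_key({"HARVESTER_API_KEY": "example-key"}, key_files=())))  # True
```

Accepting the environment and file list as parameters makes the precedence easy to test without touching real credentials.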
---
## Acceptance Criteria (for #231)
| Criterion | Status | Evidence |
|-----------|--------|----------|
| Concrete wiki path & schema exist | ✓ | `knowledge/` directory, `SCHEMA.md`, `index.json` |
| Ingest updates durable wiki pages | ✓ | `wiki ingest` + `harvester.py` writes markdown to `knowledge/repos/*.md` |
| Queries answer with citations | ✓ | `wiki query` retrieves facts, calls LLM with `[N]` citation format |
| Lint surfaces contradictions/staleness/broken links | ✓ (partial) | Staleness via `freshness.py`; contradiction heuristic; broken links TBD |
| Session crystallization flow | ✓ | `wiki crystal` / `ingest` runs the harvester, which distills sessions into `knowledge/` |
| Documented as distinct from RAG/transcript search | ✓ | This document explicitly distinguishes them |
---
## Implementation Notes
- **Retrieval:** Simple BM25-ish keyword + tag + confidence + recency scoring. No embedding DB needed; the fact store is small (hundreds to thousands of entries). Works locally without vector databases.
- **Synthesis:** Single LLM call with structured prompt. Temperature=0.1 for determinism.
- **Idempotency:** Harvester deduplicates by content hash before writing — repeated ingestion of the same session is safe.
- **Extensibility:** Add new retrieval strategies (embedding similarity) by replacing `retrieve_facts()`.
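
The idempotency note can be illustrated with a small sketch. The hashing scheme used by `harvester.py` is not shown in this change, so SHA-256 over whitespace-normalized, lowercased text is an assumption, and `content_hash` and `merge_new_facts` are hypothetical helpers.

```python
import hashlib
from typing import Dict, List


def content_hash(text: str) -> str:
    """Hash the fact text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def merge_new_facts(existing: List[dict], incoming: List[dict]) -> List[dict]:
    """Append only facts whose content hash is not already present."""
    seen: Dict[str, dict] = {content_hash(f["fact"]): f for f in existing}
    merged = list(existing)
    for fact in incoming:
        digest = content_hash(fact["fact"])
        if digest not in seen:
            seen[digest] = fact
            merged.append(fact)
    return merged


store = [{"fact": "Token is at ~/.config/gitea/token"}]
batch = [
    {"fact": "token is at  ~/.config/gitea/token"},  # same text, different spacing/case
    {"fact": "Cron jobs require model field"},
]
once = merge_new_facts(store, batch)
twice = merge_new_facts(once, batch)  # re-ingesting the same batch adds nothing
print(len(once), len(twice))  # 2 2
```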
---
## Future Work
- [ ] Embedding-based retrieval (cosine similarity over fact embeddings)
- [ ] Broken link detection (scan markdown files in `knowledge/` for dead URLs)
- [ ] Tag drift detection (growth of orphan/unused tags)
- [ ] Quality-gated auto-pruning of low-confidence stale facts
- [ ] Web UI for interactive wiki browsing
- [ ] Knowledge graph linking (via `related` field in index)
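
As a rough idea of how the embedding-based item could slot in behind `retrieve_facts()`, the sketch below ranks facts by cosine similarity. The `embedding` field, the toy vectors, and `rank_by_embedding` are hypothetical; nothing in the current index stores embeddings.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_embedding(query_vec: Sequence[float], facts: List[dict], limit: int = 10) -> List[dict]:
    """Return up to `limit` facts with positive similarity, best match first."""
    scored = [(cosine(query_vec, f["embedding"]), f) for f in facts]
    scored.sort(key=lambda pair: -pair[0])
    return [f for s, f in scored[:limit] if s > 0]


facts = [
    {"id": "a:1", "embedding": [1.0, 0.0]},
    {"id": "b:2", "embedding": [0.6, 0.8]},
    {"id": "c:3", "embedding": [0.0, 0.0]},  # zero vector never matches
]
print([f["id"] for f in rank_by_embedding([1.0, 0.0], facts)])  # ['a:1', 'b:2']
```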

103
scripts/test_wiki.py Normal file

@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""Smoke tests for scripts/wiki.py — retrieval and lint basics."""
import json
import os
import sys
import tempfile
from pathlib import Path
SCRIPT_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, str(SCRIPT_DIR))
import wiki


def test_retrieve_facts():
    """Test fact retrieval ranking."""
    with tempfile.TemporaryDirectory() as tmpdir:
        kdir = Path(tmpdir) / "knowledge"
        kdir.mkdir()
        index = {
            "version": 1,
            "total_facts": 3,
            "facts": [
                {
                    "id": "test:fact:001",
                    "fact": "Gitea token is stored at ~/.config/gitea/token",
                    "category": "tool-quirk",
                    "domain": "global",
                    "confidence": 0.95,
                    "tags": ["token", "gitea", "auth"],
                    "last_confirmed": "2026-04-01"
                },
                {
                    "id": "test:fact:002",
                    "fact": "Use gitea-api-first-burn worker for large repos",
                    "category": "pattern",
                    "domain": "timmy-config",
                    "confidence": 0.9,
                    "tags": ["gitea", "burn", "api"],
                },
                {
                    "id": "test:fact:003",
                    "fact": "Hermes gateway restarts required after Telegram config changes",
                    "category": "pitfall",
                    "domain": "hermes-agent",
                    "confidence": 0.85,
                    "tags": ["telegram", "gateway"],
                }
            ]
        }
        index_path = kdir / "index.json"
        with open(index_path, 'w') as f:
            json.dump(index, f)
        # Point the module at the temporary index for the duration of the test.
        original_index = wiki.INDEX_PATH
        wiki.INDEX_PATH = index_path
        try:
            results = wiki.retrieve_facts("where is gitea token stored?", limit=5)
            assert len(results) >= 1, f"Expected at least 1 result, got {len(results)}"
            assert results[0]['id'] == 'test:fact:001', f"Expected fact 001 first, got {results[0]['id']}"
            print(" [PASS] retrieve_facts ranks correctly")
            results2 = wiki.retrieve_facts("gitea burn large repos", limit=5)
            assert len(results2) >= 1
            assert results2[0]['id'] == 'test:fact:002'
            print(" [PASS] tag-based retrieval works")
        finally:
            wiki.INDEX_PATH = original_index


def test_format_context():
    """Test context formatting for LLM."""
    facts = [
        {"id": "a:1", "fact": "Test fact A", "category": "fact", "confidence": 0.9},
        {"id": "b:2", "fact": "Test fact B", "category": "pitfall", "confidence": 0.8},
    ]
    ctx = wiki.format_facts_as_context(facts)
    assert "[1]" in ctx and "a:1" in ctx
    assert "Test fact A" in ctx
    assert "Test fact B" in ctx
    print(" [PASS] format_facts_as_context includes IDs and facts")


def test_detect_contradictions():
    """Test contradiction detection."""
    index = {
        "facts": [
            {"id": "x:1", "fact": "Deploy uses port 22 for SSH", "category": "fact", "domain": "deploy"},
            {"id": "x:2", "fact": "Deploy uses SSH on port 22", "category": "fact", "domain": "deploy"},
            {"id": "x:3", "fact": "Cron jobs require model field", "category": "pitfall", "domain": "hermes-agent"},
        ]
    }
    contradictions = wiki.detect_contradictions(index)
    assert len(contradictions) >= 1, "Expected at least one potential contradiction"
    found = any('x:1' in c.get('fact_a', '') or 'x:1' in c.get('fact_b', '') for c in contradictions)
    assert found, "Should detect similarity between x:1 and x:2"
    print(" [PASS] detect_contradictions flags similar facts")


if __name__ == "__main__":
    print("Running wiki module smoke tests...")
    test_retrieve_facts()
    test_format_context()
    test_detect_contradictions()
    print("\nAll wiki tests passed.")

353
scripts/wiki.py Normal file

@@ -0,0 +1,353 @@
#!/usr/bin/env python3
"""
LLM Wiki layer — ingest, query, lint, and session crystallization for compounding-intelligence.
This is the sovereign knowledge interface: a compiled, queryable, lintable
knowledge base that survives beyond sessions and cites its sources.
Distinct from:
- RAG: Raw chunk retrieval without synthesis or quality gating
- Transcript search: Keyword match over raw session logs without distillation
The Wiki layer sits on top of the knowledge/ index (facts with provenance).
It provides:
ingest — Harvest knowledge from sessions or raw sources
query — Retrieve + synthesize answers with citations
lint — Detect staleness, contradictions, broken links
crystal — (via harvester) session distillation already integrated
Usage:
python3 scripts/wiki.py ingest --session ~/.hermes/sessions/xxx.jsonl
python3 scripts/wiki.py query "How do I fix cron timeouts?"
python3 scripts/wiki.py lint
"""
import argparse
import json
import os
import re
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, List, Dict, Any
SCRIPT_DIR = Path(__file__).resolve().parent
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"
INDEX_PATH = KNOWLEDGE_DIR / "index.json"

# ---------- Utilities ----------


def load_index() -> dict:
    if not INDEX_PATH.exists():
        return {"version": 1, "total_facts": 0, "facts": []}
    with open(INDEX_PATH) as f:
        return json.load(f)


def score_fact_for_query(fact: dict, query_terms: set, query_lower: str) -> float:
    """Simple BM25-like relevance scoring for fact retrieval."""
    fact_text = fact.get('fact', '').lower()
    fact_tags = [t.lower() for t in fact.get('tags', [])]
    # Term frequency in fact text
    tf = sum(1 for term in query_terms if term in fact_text)
    # Tag boost: exact tag match gives strong signal
    tag_boost = sum(3.0 for tag in fact_tags if tag in query_lower)
    # Confidence boost
    confidence = fact.get('confidence', 0.5)
    # Recency boost: newer facts get slight preference
    last_confirmed = fact.get('last_confirmed', '')
    recency_boost = 0.0
    if last_confirmed:
        try:
            dt = datetime.fromisoformat(last_confirmed.rstrip('Z'))
            # Treat timestamps without an offset as UTC. Subtracting a naive
            # datetime from an aware one raises TypeError, which the except
            # below would swallow, silently disabling the recency boost.
            if dt.tzinfo is None:
                dt = dt.replace(tzinfo=timezone.utc)
            days_old = (datetime.now(timezone.utc) - dt).days
            recency_boost = max(0, 1.0 - days_old / 365)
        except Exception:
            pass
    score = (tf * 1.0) + (tag_boost * confidence) + (recency_boost * 0.5)
    return score


def retrieve_facts(query: str, limit: int = 10) -> List[dict]:
    """Retrieve the most relevant facts for a query from index.json."""
    index = load_index()
    facts = index.get('facts', [])
    query_lower = query.lower()
    # Keep only tokens longer than two characters as query terms.
    query_terms = {t for t in re.split(r'\W+', query_lower) if len(t) > 2}
    scored = []
    for fact in facts:
        score = score_fact_for_query(fact, query_terms, query_lower)
        if score > 0:
            scored.append((score, fact))
    scored.sort(key=lambda x: -x[0])
    return [f for _, f in scored[:limit]]


def format_facts_as_context(facts: List[dict]) -> str:
    """Format retrieved facts into a context block for LLM synthesis."""
    lines = []
    for i, fact in enumerate(facts, 1):
        fid = fact.get('id', 'unknown')
        fact_text = fact.get('fact', '')
        confidence = fact.get('confidence', 0.5)
        category = fact.get('category', 'fact')
        lines.append(f"[{i}] ID:{fid} | {category} (conf={confidence:.2f}): {fact_text}")
    return "\n".join(lines)


def find_api_key() -> str:
    # Environment variables take precedence; key files are the fallback,
    # matching the order documented in WIKI.md.
    env_key = os.environ.get("HARVESTER_API_KEY") or os.environ.get("OPENROUTER_API_KEY")
    if env_key:
        return env_key
    for p in [
        Path.home() / ".config/nous/key",
        Path.home() / ".hermes/keymaxxing/active/minimax.key",
        Path.home() / ".config/openrouter/key",
    ]:
        if p.exists():
            return p.read_text().strip()
    return ""


def call_llm_synthesize(query: str, context: str, api_base: str, api_key: str, model: str) -> str:
    """Call LLM to synthesize answer from retrieved facts."""
    import urllib.request
    prompt = f"""You are the LLM Wiki answering from the sovereign knowledge base.
Knowledge facts (with citations):
{context}
Question: {query}
Instructions:
- Answer ONLY from the provided facts. Do not use outside knowledge.
- Cite facts using their [N] index number(s) in brackets.
- If the facts don't contain the answer, say "I don't know from the current knowledge base."
- Be concise (2-3 sentences maximum)."""
    messages = [
        {"role": "system", "content": "You are a precise knowledge assistant."},
        {"role": "user", "content": prompt}
    ]
    payload = json.dumps({
        "model": model,
        "messages": messages,
        "temperature": 0.1,
        "max_tokens": 512
    }).encode('utf-8')
    req = urllib.request.Request(
        f"{api_base}/chat/completions",
        data=payload,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            result = json.loads(resp.read().decode('utf-8'))
            return result["choices"][0]["message"]["content"].strip()
    except Exception as e:
        # Errors are returned as text so the CLI can print them in place of an answer.
        return f"[ERROR: LLM call failed: {e}]"


def detect_contradictions(index: dict) -> List[dict]:
    """Detect potentially contradictory facts in the same domain/category."""
    contradictions = []
    facts = index.get('facts', [])
    from collections import defaultdict
    # Only facts sharing both domain and category are compared with each other.
    grouped = defaultdict(list)
    for f in facts:
        key = (f.get('domain', 'global'), f.get('category', 'fact'))
        grouped[key].append(f)
    for key, group in grouped.items():
        if len(group) < 2:
            continue
        for i in range(len(group)):
            for j in range(i+1, len(group)):
                f1, f2 = group[i], group[j]
                text1 = f1.get('fact', '').lower()
                text2 = f2.get('fact', '').lower()
                words1 = set(re.findall(r'\w+', text1))
                words2 = set(re.findall(r'\w+', text2))
                # Heuristic: three or more shared words marks the pair for review.
                if len(words1 & words2) >= 3:
                    contradictions.append({
                        "type": "potential_contradiction",
                        "domain": key[0],
                        "category": key[1],
                        "fact_a": f1.get('id'),
                        "fact_b": f2.get('id'),
                        "similarity": len(words1 & words2) / max(len(words1), len(words2))
                    })
    return contradictions


def lint_knowledge() -> dict:
    """Run all lint checks: freshness, duplicates, contradictions."""
    results = {"errors": [], "warnings": [], "suggestions": []}
    index = load_index()
    facts = index.get('facts', [])
    # 1. Freshness check via freshness.py
    try:
        freshness_script = SCRIPT_DIR / "freshness.py"
        if freshness_script.exists():
            proc = subprocess.run(
                [sys.executable, str(freshness_script), "--knowledge-dir", str(KNOWLEDGE_DIR)],
                capture_output=True, text=True, timeout=30
            )
            if proc.returncode != 0:
                results["errors"].append(f"freshness.py failed: {proc.stderr[:200]}")
    except Exception as e:
        results["errors"].append(f"Could not run freshness check: {e}")
    # 2. Duplicate fact text
    seen = {}
    for f in facts:
        txt = f.get('fact', '').strip().lower()
        if txt in seen:
            results["warnings"].append(f"Duplicate fact text: {txt[:80]}... IDs: {seen[txt]}, {f.get('id')}")
        else:
            seen[txt] = f.get('id')
    # 3. Contradictions
    contradictions = detect_contradictions(index)
    for c in contradictions:
        results["warnings"].append(
            f"Potential contradiction in {c['domain']}/{c['category']}: "
            f"{c['fact_a']} vs {c['fact_b']} (similarity={c['similarity']:.2f})"
        )
    return results


# ---------- Subcommands ----------


def cmd_query(args):
    """Query the wiki: retrieve + synthesize."""
    if not INDEX_PATH.exists():
        print("ERROR: knowledge/index.json not found. Run ingest first.", file=sys.stderr)
        return 1
    query = args.query
    top_k = args.top or 10
    facts = retrieve_facts(query, limit=top_k)
    if not facts:
        print("No relevant facts found in knowledge base.")
        return 0
    print(f"→ Retrieved {len(facts)} facts:")
    for i, f in enumerate(facts, 1):
        fid = f.get('id', '?')
        print(f" [{i}] {fid}: {f.get('fact', '')[:90]}")
    if args.dry_run:
        print("\n[dry-run] Skipping LLM synthesis.")
        return 0
    api_key = find_api_key()
    if not api_key:
        print("ERROR: No API key. Set HARVESTER_API_KEY or OPENROUTER_API_KEY.", file=sys.stderr)
        return 1
    api_base = os.environ.get("HARVESTER_API_BASE", "https://api.nousresearch.com/v1")
    model = os.environ.get("HARVESTER_MODEL", "xiaomi/mimo-v2-pro")
    context = format_facts_as_context(facts)
    answer = call_llm_synthesize(query, context, api_base, api_key, model)
    print(f"\n← Answer: {answer}")
    return 0


def cmd_ingest(args):
    """Ingest knowledge from a session transcript."""
    session = args.session
    if not os.path.exists(session):
        print(f"ERROR: Session file not found: {session}", file=sys.stderr)
        return 1
    harvester = SCRIPT_DIR / "harvester.py"
    if not harvester.exists():
        print("ERROR: harvester.py not found", file=sys.stderr)
        return 1
    # Delegate to harvester.py as a subprocess and pass its exit code through.
    cmd = [sys.executable, str(harvester), "--session", session, "--output", str(KNOWLEDGE_DIR)]
    if args.dry_run:
        cmd.append("--dry-run")
    env = os.environ.copy()
    env["PYTHONPATH"] = str(REPO_ROOT)
    result = subprocess.run(cmd, env=env)
    return result.returncode


def cmd_lint(args):
    """Lint the knowledge base for quality issues."""
    results = lint_knowledge()
    if results["errors"]:
        print("ERRORS:")
        for e in results["errors"]:
            print(f"{e}")
        return 1
    if results["warnings"]:
        print(f"WARNINGS ({len(results['warnings'])}):")
        for w in results["warnings"]:
            print(f"{w}")
    else:
        print("✓ No lint issues found. Knowledge base is clean.")
    return 0 if not results["errors"] else 1


def cmd_crystallize(args):
    """Alias for ingest — session crystallization."""
    return cmd_ingest(args)


def main():
    parser = argparse.ArgumentParser(
        description="LLM Wiki layer — ingest, query, lint, crystallize",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  python3 scripts/wiki.py query "How do I fix cron timeouts?"
  python3 scripts/wiki.py ingest --session ~/.hermes/sessions/abc.jsonl
  python3 scripts/wiki.py lint
  python3 scripts/wiki.py crystal --session session.jsonl
"""
    )
    sub = parser.add_subparsers(dest="command", help="Wiki command")
    qp = sub.add_parser("query", help="Ask the wiki a question (RAG + synthesis)")
    qp.add_argument("query", help="Natural language question")
    qp.add_argument("--top", type=int, default=10, help="Number of facts to retrieve")
    qp.add_argument("--dry-run", action="store_true", help="Show retrieval but skip LLM")
    qp.set_defaults(func=cmd_query)
    ip = sub.add_parser("ingest", help="Ingest a session transcript into knowledge")
    ip.add_argument("--session", required=True, help="Path to session JSONL file")
    ip.add_argument("--dry-run", action="store_true", help="Preview without writing")
    ip.set_defaults(func=cmd_ingest)
    lp = sub.add_parser("lint", help="Check knowledge base for issues")
    lp.set_defaults(func=cmd_lint)
    cp = sub.add_parser("crystal", help="Crystallize a session into durable pages")
    cp.add_argument("--session", required=True, help="Path to session JSONL file")
    cp.add_argument("--dry-run", action="store_true", help="Preview without writing")
    cp.set_defaults(func=cmd_crystallize)
    args = parser.parse_args()
    if not args.command:
        parser.print_help()
        return 1
    return args.func(args)


if __name__ == "__main__":
    sys.exit(main())