1 Commits

5 changed files with 696 additions and 155 deletions

184
WIKI.md Normal file

@@ -0,0 +1,184 @@
# LLM Wiki Layer — Documentation
**Status:** Implemented (2026-04-27)
**Issue:** Timmy_Foundation/compounding-intelligence#231
**Parent:** Timmy_Foundation/hermes-agent#984 ([ATLAS] Steal Atlas ecosystem patterns)
---
## Overview
The **LLM Wiki layer** is a sovereign knowledge interface built on top of the `knowledge/` fact store. It provides:
| Capability | Command | Description |
|------------|---------|-------------|
| **Ingest** | `wiki ingest --session <file>` | Harvest facts from session transcripts via LLM extraction |
| **Crystallize** | `wiki crystal --session <file>` | Alias for ingest — session distillation into durable pages |
| **Query** | `wiki query "<question>"` | RAG-style retrieval + LLM synthesis with citations |
| **Lint** | `wiki lint` | Detect staleness, duplicates, and potential contradictions |
Location: `scripts/wiki.py` (entry point)
---
## How It Differs From…
### RAG (Retrieval-Augmented Generation)
**RAG** retrieves raw chunks (e.g., code snippets, paragraph strings) and feeds them to an LLM. Chunks are unnormalized, unscored, and carry no provenance beyond the source file path.
**LLM Wiki** retrieves *normalized facts* from `knowledge/index.json` — each fact has:
- A unique ID (`domain:category:seq`)
- A confidence score (0.0–1.0)
- Provenance (`source_session`, `source_count`, `first_seen`, `last_confirmed`)
- Explicit category (`fact` | `pitfall` | `pattern` | `tool-quirk` | `question`)
- Tags for cross-domain linking
The query path formats facts with their IDs and asks the LLM to cite `[N]` indices, preserving traceability.
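
To make the shape of a normalized fact concrete, here is a minimal sketch of one record and the numbered context line built from it. The field names follow the list above; the `Fact` dataclass, the `format_for_citation` helper, and the example values are illustrative assumptions, not the project's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import List

# Categories named in the list above.
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}


@dataclass
class Fact:
    """Hypothetical in-memory view of one index.json entry."""
    id: str                      # domain:category:seq
    fact: str                    # the normalized statement itself
    category: str
    confidence: float            # 0.0 to 1.0
    tags: List[str] = field(default_factory=list)
    source_session: str = ""
    source_count: int = 1
    first_seen: str = ""
    last_confirmed: str = ""

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0.0, 1.0]")


def format_for_citation(n: int, fact: Fact) -> str:
    """Render one fact as a numbered context line the LLM can cite as [n]."""
    return f"[{n}] ID:{fact.id} | {fact.category} (conf={fact.confidence:.2f}): {fact.fact}"


example = Fact(
    id="global:tool-quirk:001",
    fact="Token is at ~/.config/gitea/token",
    category="tool-quirk",
    confidence=0.95,
    tags=["gitea", "token"],
)
print(format_for_citation(1, example))
```

Because every context line carries both the `[N]` index and the fact ID, a cited answer can be traced back to a specific entry in the index.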
### Transcript Search
**Transcript search** is keyword grep over raw session JSONL files. It shows exactly what was said and when, but you still have to extract the insight yourself.
**LLM Wiki** is *distilled insight* — the harvester already extracted durable knowledge from sessions (via LLM extraction prompt). The wiki layer queries that distilled store, not the noisy raw transcripts.
---
## Architecture
```
┌──────────────────────────┐
│      Session JSONL       │  ← raw session transcripts
└────────────┬─────────────┘
             │  harvester.py (ingest)
             ▼
┌──────────────────────────┐
│  knowledge/index.json    │  ← canonical fact index (machine-readable)
│  knowledge/*.md          │  ← human-editable pages (durable wiki pages)
└────────────┬─────────────┘
             │  wiki.py (query)
             ▼
retrieve_facts()    format_facts_as_context()
       │                        │
       └────────────┬───────────┘
                    ▼
      LLM synthesis with citations
                    ▼
              answer string
```
- **Ingest path:** `harvester.py` → `write_knowledge()` updates `index.json` and appends to `knowledge/{global,repos}/*.md`
- **Query path:** `wiki query` → `retrieve_facts()` (BM25-ish keyword + tag + confidence + recency) → `call_llm_synthesize()` → cited answer
- **Lint path:** `wiki lint` → `freshness.py` (source-hash staleness) + duplicate detection + contradiction heuristic
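
The ranking used on the query path can be written out explicitly. Reading the weights off `score_fact_for_query()` in `scripts/wiki.py` from this changeset, the score of a fact $f$ for a query $q$ is a plain weighted sum. The symbols $h$, $m$, $c_f$, and $d_f$ are notation introduced here for illustration, not names from the code.

```latex
\mathrm{score}(f, q) \;=\; h(f, q) \;+\; 3\, m(f, q)\, c_f \;+\; 0.5 \cdot \max\!\left(0,\; 1 - \frac{d_f}{365}\right)
```

Here $h(f, q)$ is the number of query terms longer than two characters that occur in the fact text, $m(f, q)$ is the number of the fact's tags that occur in the query string, $c_f$ is the fact's confidence (0.5 when missing), and $d_f$ is the age of `last_confirmed` in days (the last term is 0 when the field is absent). There is no inverse document frequency or length normalization, so "BM25-ish" should be read loosely.

As a worked example with the fixture from `scripts/test_wiki.py`: for the query "where is gitea token stored?" and the fact "Gitea token is stored at ~/.config/gitea/token" with tags `token`, `gitea`, `auth` and confidence 0.95, three terms match (`gitea`, `token`, `stored`) and two tags match, so the score is $3 + 3 \cdot 2 \cdot 0.95 = 8.7$ plus the recency term.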
---
## Usage Examples
### Query the wiki
```bash
# Ask a question (uses HARVESTER_API_KEY / OPENROUTER_API_KEY)
python3 scripts/wiki.py query "How do I fix deploy-crons mixed model format?"
# Retrieve-only (dry-run) to inspect context
python3 scripts/wiki.py query "gitea token location" --dry-run --top 5
# With custom search depth
python3 scripts/wiki.py query "cron job pitfalls" --top 20
```
Sample output:
```
→ Retrieved 3 facts:
[1] hermes-agent:pitfall:001: deploy-crons.py leaves jobs in mixed model format
[2] hermes-agent:pitfall:002: deploy-crons.py --deploy doesn't set legacy skill field
[3] hermes-agent:pitfall:003: Cron jobs with blank fallback_model trigger warnings
← Answer: The mixed model format bug in deploy-crons.py (pitfall #001) leaves jobs unparsed;
ensure all cron jobs specify a single model provider. (#002) Verify fallback_model is never blank (#003). [1][2][3]
```
### Ingest from a session
```bash
# Harvest knowledge from a finished session
python3 scripts/wiki.py ingest --session ~/.hermes/sessions/session_20260427.jsonl
# Dry-run preview (no writes)
python3 scripts/wiki.py ingest --session session.jsonl --dry-run
```
This invokes `harvester.py` under the hood, which:
1. Reads the transcript via `session_reader.py`
2. Calls the LLM extraction prompt (templates/harvest-prompt.md)
3. Validates + deduplicates + writes to `knowledge/`
### Lint the knowledge base
```bash
# Run all checks: staleness (freshness.py), duplicates, contradictions
python3 scripts/wiki.py lint
```
Output (excerpt):
```
WARNINGS (6):
⚠ Potential contradiction in hermes-agent/pitfall: hermes-agent:pitfall:001 vs hermes-agent:pitfall:002
⚠ Duplicate fact text: 'Token is at ~/.config/gitea/token'... IDs: global:tool-quirk:001, global:tool-quirk:005
```
> **Note:** Contradiction detection is heuristic (word-overlap based). Human review required.
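
To see why human review is required, consider a simplified sketch of a word-overlap check. It follows the idea described in the note (shared distinct words between two fact strings); the function name `word_overlap` and the `min_shared` parameter are assumptions made for this illustration.

```python
import re


def word_overlap(text_a: str, text_b: str, min_shared: int = 3):
    """Return (flagged, similarity) for two fact strings.

    flagged is True when the texts share at least `min_shared` distinct words.
    similarity is the shared-word count divided by the larger vocabulary.
    """
    words_a = set(re.findall(r"\w+", text_a.lower()))
    words_b = set(re.findall(r"\w+", text_b.lower()))
    shared = words_a & words_b
    largest = max(len(words_a), len(words_b))
    similarity = len(shared) / largest if largest else 0.0
    return len(shared) >= min_shared, similarity


# Two paraphrases that agree with each other are still flagged,
# because the heuristic measures overlap, not disagreement.
flagged, similarity = word_overlap(
    "Deploy uses port 22 for SSH",
    "Deploy uses SSH on port 22",
)
print(flagged, round(similarity, 2))
```

The pair above shares five of six words and is flagged even though the statements are consistent, so a flag is best treated as "look at these two facts together" rather than as proof of a conflict.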
### Crystallize a session
```bash
# Alias for ingest — explicit "session distillation" terminology
python3 scripts/wiki.py crystal --session ~/.hermes/sessions/recent.jsonl
```
---
## Configuration
| Env Var | Default | Purpose |
|----------|---------|---------|
| `HARVESTER_API_KEY` | — | LLM API key (Nous/OpenRouter) |
| `OPENROUTER_API_KEY` | — | Alternative key location |
| `HARVESTER_API_BASE` | `https://api.nousresearch.com/v1` | LLM base URL |
| `HARVESTER_MODEL` | `xiaomi/mimo-v2-pro` | Model for synthesis |
API keys are also read from `~/.config/nous/key`, `~/.hermes/keymaxxing/active/minimax.key`, or `~/.config/openrouter/key` if env vars are unset.
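
The lookup order described above (environment variables first, then key files) can be sketched as follows. This is a minimal illustration under that assumption; `resolve_api_key`, `ENV_VARS`, and `KEY_FILES` are hypothetical names, and the `env` and `key_files` parameters exist only to make the order easy to test.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional

# Names and paths taken from the configuration table and sentence above.
ENV_VARS = ("HARVESTER_API_KEY", "OPENROUTER_API_KEY")
KEY_FILES = (
    Path.home() / ".config/nous/key",
    Path.home() / ".hermes/keymaxxing/active/minimax.key",
    Path.home() / ".config/openrouter/key",
)


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    key_files: Iterable[Path] = KEY_FILES,
) -> str:
    """Return the first non-empty key: env vars first, then key files."""
    env = os.environ if env is None else env
    for name in ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    for path in key_files:
        if path.is_file():
            value = path.read_text().strip()
            if value:
                return value
    return ""  # caller decides how to report a missing key


# An explicit environment mapping wins over any key file.
print(resolve_api_key(env={"OPENROUTER_API_KEY": "sk-example"}, key_files=()))
```

Returning an empty string instead of raising keeps the decision about error reporting with the caller, which matches how a CLI typically prints its own "no API key" message.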
---
## Acceptance Criteria (for #231)
| Criterion | Status | Evidence |
|-----------|--------|----------|
| Concrete wiki path & schema exist | ✓ | `knowledge/` directory, `SCHEMA.md`, `index.json` |
| Ingest updates durable wiki pages | ✓ | `wiki ingest` + `harvester.py` writes markdown to `knowledge/repos/*.md` |
| Queries answer with citations | ✓ | `wiki query` retrieves facts, calls LLM with `[N]` citation format |
| Lint surfaces contradictions/staleness/broken links | ✓ (partial) | Staleness via `freshness.py`; contradiction heuristic; broken links TBD |
| Session crystallization flow | ✓ | `wiki crystal` / `ingest` runs the harvester, which distills sessions into `knowledge/` |
| Documented as distinct from RAG/transcript search | ✓ | This document explicitly distinguishes them |
---
## Implementation Notes
- **Retrieval:** Simple BM25-ish keyword + tag + confidence + recency scoring. No embedding DB needed; the fact store is small (~100s to 1000s of entries). Works locally without vector databases.
- **Synthesis:** Single LLM call with a structured prompt. Temperature is set to 0.1 to keep answers as repeatable as possible.
- **Idempotency:** Harvester deduplicates by content hash before writing — repeated ingestion of the same session is safe.
- **Extensibility:** Add new retrieval strategies (embedding similarity) by replacing `retrieve_facts()`.
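
The idempotency note can be illustrated with a small sketch of hash-based deduplication. It assumes facts are compared by a hash of their normalized text; `content_hash` and `merge_facts` are hypothetical helpers written for this example, and the harvester's real normalization and hashing may differ.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def merge_facts(existing: list, incoming: list) -> list:
    """Append only those incoming facts whose hash is not already stored."""
    seen = {content_hash(f["fact"]) for f in existing}
    merged = list(existing)
    for fact in incoming:
        digest = content_hash(fact["fact"])
        if digest in seen:
            continue  # already known: ingesting it again changes nothing
        seen.add(digest)
        merged.append(fact)
    return merged


store = merge_facts([], [{"fact": "Token is at ~/.config/gitea/token"}])
# Re-ingesting the same fact (modulo case and spacing) is a no-op.
again = merge_facts(store, [{"fact": "token is at  ~/.config/gitea/token"}])
print(len(store), len(again))
```

With this kind of check in front of the write step, running ingest twice on the same session leaves the store unchanged the second time.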
---
## Future Work
- [ ] Embedding-based retrieval (cosine similarity over fact embeddings)
- [ ] Broken link detection (scan markdown files in `knowledge/` for dead URLs)
- [ ] Tag drift detection (growth of orphan/unused tags)
- [ ] Quality-gated auto-pruning of low-confidence stale facts
- [ ] Web UI for interactive wiki browsing
- [ ] Knowledge graph linking (via `related` field in index)


@@ -22,95 +22,114 @@ import sys
from pathlib import Path
from typing import Optional
from session_reader import extract_conversation, read_session
def compute_hash(text: str) -> str:
"""Content hash for deduplication."""
return hashlib.sha256(text.encode()).hexdigest()[:16]
def extract_pairs_from_conversation(conversation: list, session_id: str, model: str,
min_ratio: float = 1.5,
def extract_pairs_from_session(session_data: dict, min_ratio: float = 1.5,
min_response_words: int = 20) -> list:
"""Extract terse→rich pairs from a normalized conversation."""
"""Extract terse→rich pairs from a single session object."""
pairs = []
conversations = session_data.get("conversations", [])
session_id = session_data.get("id", "unknown")
model = session_data.get("model", "unknown")
seen_hashes = set()
for i, msg in enumerate(conversation):
# Look for assistant responses
if msg.get('role') != 'assistant':
for i, msg in enumerate(conversations):
# Look for assistant/gpt responses
if msg.get("from") not in ("gpt", "assistant"):
continue
response_text = msg.get('content', '')
response_text = msg.get("value", "")
if not response_text or len(response_text.split()) < min_response_words:
continue
# Find the preceding user message
# Find the preceding human message
prompt_text = ""
for j in range(i - 1, -1, -1):
if conversation[j].get('role') == 'user':
prompt_text = conversation[j].get('content', '')
if conversations[j].get("from") == "human":
prompt_text = conversations[j].get("value", "")
break
if not prompt_text:
continue
# Filter: skip tool results, system messages embedded as human
if prompt_text.startswith('{') and 'output' in prompt_text[:100]:
continue
if prompt_text.startswith('# SOUL.md') or prompt_text.startswith('You are'):
continue
if prompt_text.startswith("{") and "output" in prompt_text[:100]:
continue # likely a tool result
if prompt_text.startswith("# SOUL.md") or prompt_text.startswith("You are"):
continue # system prompt leak
# Quality filters
prompt_words = len(prompt_text.split())
response_words = len(response_text.split())
# Must have meaningful length ratio
if prompt_words == 0 or response_words == 0:
continue
ratio = response_words / prompt_words
if ratio < min_ratio:
continue
code_blocks = response_text.count('```')
if code_blocks >= 4 and len(response_text.replace('```', '').strip()) < 50:
# Skip responses that are mostly code
code_blocks = response_text.count("```")
if code_blocks >= 4 and len(response_text.replace("```", "").strip()) < 50:
continue
if 'tool_call' in response_text[:100] or 'function_call' in response_text[:100]:
# Skip responses with tool call artifacts
if "tool_call" in response_text[:100] or "function_call" in response_text[:100]:
continue
# Deduplicate by content hash
content_hash = compute_hash(prompt_text + response_text[:200])
if content_hash in seen_hashes:
continue
seen_hashes.add(content_hash)
# Clean up response: remove markdown headers if too many
clean_response = response_text
pairs.append({
'terse': prompt_text.strip(),
'rich': clean_response.strip(),
'source': session_id,
'model': model,
'prompt_words': prompt_words,
'response_words': response_words,
'ratio': round(ratio, 2),
"terse": prompt_text.strip(),
"rich": clean_response.strip(),
"source": session_id,
"model": model,
"prompt_words": prompt_words,
"response_words": response_words,
"ratio": round(ratio, 2),
})
return pairs
def extract_from_jsonl_file(filepath: str, **kwargs) -> list:
"""Extract pairs from a session JSONL file."""
pairs = []
path = Path(filepath)
def extract_from_jsonl_file(path: str, **kwargs) -> list:
"""Read a session file and extract training pairs using normalized conversation."""
session_messages = read_session(path)
if not session_messages:
return []
conversation = extract_conversation(session_messages)
# Derive session_id and model from first real message metadata
first_msg = next((m for m in session_messages if m.get('role') or m.get('from')), {})
session_id = first_msg.get('meta_session_id', Path(path).name)
model = first_msg.get('model', 'unknown')
return extract_pairs_from_conversation(conversation, session_id, model, **kwargs)
if not path.exists():
print(f"Warning: {filepath} not found", file=sys.stderr)
return pairs
content = path.read_text()
lines = content.strip().split("\n")
for line in lines:
line = line.strip()
if not line:
continue
try:
session = json.loads(line)
except json.JSONDecodeError:
continue
session_pairs = extract_pairs_from_session(session, **kwargs)
pairs.extend(session_pairs)
return pairs
def deduplicate_pairs(pairs: list) -> list:

103
scripts/test_wiki.py Normal file

@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""Smoke tests for scripts/wiki.py — retrieval and lint basics."""
import json
import os
import sys
import tempfile
from pathlib import Path
SCRIPT_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, str(SCRIPT_DIR))
import wiki
def test_retrieve_facts():
    """Test fact retrieval ranking."""
    with tempfile.TemporaryDirectory() as tmpdir:
        kdir = Path(tmpdir) / "knowledge"
        kdir.mkdir()
        index = {
            "version": 1,
            "total_facts": 3,
            "facts": [
                {
                    "id": "test:fact:001",
                    "fact": "Gitea token is stored at ~/.config/gitea/token",
                    "category": "tool-quirk",
                    "domain": "global",
                    "confidence": 0.95,
                    "tags": ["token", "gitea", "auth"],
                    "last_confirmed": "2026-04-01"
                },
                {
                    "id": "test:fact:002",
                    "fact": "Use gitea-api-first-burn worker for large repos",
                    "category": "pattern",
                    "domain": "timmy-config",
                    "confidence": 0.9,
                    "tags": ["gitea", "burn", "api"],
                },
                {
                    "id": "test:fact:003",
                    "fact": "Hermes gateway restarts required after Telegram config changes",
                    "category": "pitfall",
                    "domain": "hermes-agent",
                    "confidence": 0.85,
                    "tags": ["telegram", "gateway"],
                }
            ]
        }
        index_path = kdir / "index.json"
        with open(index_path, 'w') as f:
            json.dump(index, f)
        # Point the module at the temporary index for the duration of the test
        original_index = wiki.INDEX_PATH
        wiki.INDEX_PATH = index_path
        try:
            results = wiki.retrieve_facts("where is gitea token stored?", limit=5)
            assert len(results) >= 1, f"Expected at least 1 result, got {len(results)}"
            assert results[0]['id'] == 'test:fact:001', f"Expected fact 001 first, got {results[0]['id']}"
            print(" [PASS] retrieve_facts ranks correctly")
            results2 = wiki.retrieve_facts("gitea burn large repos", limit=5)
            assert len(results2) >= 1
            assert results2[0]['id'] == 'test:fact:002'
            print(" [PASS] tag-based retrieval works")
        finally:
            wiki.INDEX_PATH = original_index
def test_format_context():
    """Test context formatting for LLM."""
    facts = [
        {"id": "a:1", "fact": "Test fact A", "category": "fact", "confidence": 0.9},
        {"id": "b:2", "fact": "Test fact B", "category": "pitfall", "confidence": 0.8},
    ]
    ctx = wiki.format_facts_as_context(facts)
    assert "[1]" in ctx and "a:1" in ctx
    assert "Test fact A" in ctx
    assert "Test fact B" in ctx
    print(" [PASS] format_facts_as_context includes IDs and facts")
def test_detect_contradictions():
    """Test contradiction detection."""
    index = {
        "facts": [
            {"id": "x:1", "fact": "Deploy uses port 22 for SSH", "category": "fact", "domain": "deploy"},
            {"id": "x:2", "fact": "Deploy uses SSH on port 22", "category": "fact", "domain": "deploy"},
            {"id": "x:3", "fact": "Cron jobs require model field", "category": "pitfall", "domain": "hermes-agent"},
        ]
    }
    contradictions = wiki.detect_contradictions(index)
    assert len(contradictions) >= 1, "Expected at least one potential contradiction"
    found = any('x:1' in c.get('fact_a', '') or 'x:1' in c.get('fact_b', '') for c in contradictions)
    assert found, "Should detect similarity between x:1 and x:2"
    print(" [PASS] detect_contradictions flags similar facts")
if __name__ == "__main__":
    print("Running wiki module smoke tests...")
    test_retrieve_facts()
    test_format_context()
    test_detect_contradictions()
    print("\nAll wiki tests passed.")

353
scripts/wiki.py Normal file

@@ -0,0 +1,353 @@
#!/usr/bin/env python3
"""
LLM Wiki layer — ingest, query, lint, and session crystallization for compounding-intelligence.
This is the sovereign knowledge interface: a compiled, queryable, lintable
knowledge base that survives beyond sessions and cites its sources.
Distinct from:
- RAG: Raw chunk retrieval without synthesis or quality gating
- Transcript search: Keyword match over raw session logs without distillation
The Wiki layer sits on top of the knowledge/ index (facts with provenance).
It provides:
ingest — Harvest knowledge from sessions or raw sources
query — Retrieve + synthesize answers with citations
lint — Detect staleness, contradictions, broken links
crystal — (via harvester) session distillation already integrated
Usage:
python3 scripts/wiki.py ingest --session ~/.hermes/sessions/xxx.jsonl
python3 scripts/wiki.py query "How do I fix cron timeouts?"
python3 scripts/wiki.py lint
"""
import argparse
import json
import os
import re
import subprocess
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional, List, Dict, Any
SCRIPT_DIR = Path(__file__).resolve().parent
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"
INDEX_PATH = KNOWLEDGE_DIR / "index.json"
# ---------- Utilities ----------
def load_index() -> dict:
    """Load knowledge/index.json, or return an empty index if it is missing."""
    if not INDEX_PATH.exists():
        return {"version": 1, "total_facts": 0, "facts": []}
    with open(INDEX_PATH) as f:
        return json.load(f)
def score_fact_for_query(fact: dict, query_terms: set, query_lower: str) -> float:
    """Simple BM25-like relevance scoring for fact retrieval."""
    fact_text = fact.get('fact', '').lower()
    fact_tags = [t.lower() for t in fact.get('tags', [])]
    # Term hits: number of query terms that occur in the fact text
    tf = sum(1 for term in query_terms if term in fact_text)
    # Tag boost: a tag that appears in the query is a strong signal
    tag_boost = sum(3.0 for tag in fact_tags if tag in query_lower)
    # A fact with no lexical match is irrelevant, however recent it is
    if tf == 0 and tag_boost == 0:
        return 0.0
    # Confidence scales the tag boost
    confidence = fact.get('confidence', 0.5)
    # Recency boost: newer facts get slight preference
    last_confirmed = fact.get('last_confirmed', '')
    recency_boost = 0.0
    if last_confirmed:
        try:
            dt = datetime.fromisoformat(last_confirmed.rstrip('Z'))
            # Treat naive timestamps as UTC; subtracting a naive datetime
            # from an aware one raises TypeError
            if dt.tzinfo is None:
                dt = dt.replace(tzinfo=timezone.utc)
            days_old = (datetime.now(timezone.utc) - dt).days
            recency_boost = max(0.0, 1.0 - days_old / 365)
        except ValueError:
            pass  # unparseable date: no recency boost
    score = (tf * 1.0) + (tag_boost * confidence) + (recency_boost * 0.5)
    return score
def retrieve_facts(query: str, limit: int = 10) -> List[dict]:
    """Retrieve the most relevant facts for a query from index.json."""
    index = load_index()
    facts = index.get('facts', [])
    query_lower = query.lower()
    # Ignore very short tokens (one or two characters)
    query_terms = {t for t in re.split(r'\W+', query_lower) if len(t) > 2}
    scored = []
    for fact in facts:
        score = score_fact_for_query(fact, query_terms, query_lower)
        if score > 0:
            scored.append((score, fact))
    # Highest score first
    scored.sort(key=lambda x: -x[0])
    return [f for _, f in scored[:limit]]
def format_facts_as_context(facts: List[dict]) -> str:
    """Format retrieved facts into a context block for LLM synthesis."""
    lines = []
    for i, fact in enumerate(facts, 1):
        fid = fact.get('id', 'unknown')
        fact_text = fact.get('fact', '')
        confidence = fact.get('confidence', 0.5)
        category = fact.get('category', 'fact')
        lines.append(f"[{i}] ID:{fid} | {category} (conf={confidence:.2f}): {fact_text}")
    return "\n".join(lines)
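def find_api_key() -> str:
    """Return the API key: environment variables first, then key files."""
    env_key = os.environ.get("HARVESTER_API_KEY") or os.environ.get("OPENROUTER_API_KEY")
    if env_key:
        return env_key
    # Fall back to key files only when no env var is set
    for p in [
        Path.home() / ".config/nous/key",
        Path.home() / ".hermes/keymaxxing/active/minimax.key",
        Path.home() / ".config/openrouter/key",
    ]:
        if p.exists():
            return p.read_text().strip()
    return ""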
def call_llm_synthesize(query: str, context: str, api_base: str, api_key: str, model: str) -> str:
    """Call LLM to synthesize answer from retrieved facts."""
    import urllib.request
    prompt = f"""You are the LLM Wiki answering from the sovereign knowledge base.
Knowledge facts (with citations):
{context}
Question: {query}
Instructions:
- Answer ONLY from the provided facts. Do not use outside knowledge.
- Cite facts using their [N] index number(s) in brackets.
- If the facts don't contain the answer, say "I don't know from the current knowledge base."
- Be concise (2-3 sentences maximum)."""
    messages = [
        {"role": "system", "content": "You are a precise knowledge assistant."},
        {"role": "user", "content": prompt}
    ]
    payload = json.dumps({
        "model": model,
        "messages": messages,
        "temperature": 0.1,
        "max_tokens": 512
    }).encode('utf-8')
    req = urllib.request.Request(
        f"{api_base}/chat/completions",
        data=payload,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST"
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            result = json.loads(resp.read().decode('utf-8'))
            return result["choices"][0]["message"]["content"].strip()
    except Exception as e:
        # Report the failure in-band so the CLI still prints something useful
        return f"[ERROR: LLM call failed: {e}]"
def detect_contradictions(index: dict) -> List[dict]:
    """Detect potentially contradictory facts in the same domain/category."""
    contradictions = []
    facts = index.get('facts', [])
    from collections import defaultdict
    # Only compare facts that share a (domain, category) bucket
    grouped = defaultdict(list)
    for f in facts:
        key = (f.get('domain', 'global'), f.get('category', 'fact'))
        grouped[key].append(f)
    for key, group in grouped.items():
        if len(group) < 2:
            continue
        for i in range(len(group)):
            for j in range(i+1, len(group)):
                f1, f2 = group[i], group[j]
                text1 = f1.get('fact', '').lower()
                text2 = f2.get('fact', '').lower()
                words1 = set(re.findall(r'\w+', text1))
                words2 = set(re.findall(r'\w+', text2))
                # Heuristic: three or more shared words marks a candidate pair
                if len(words1 & words2) >= 3:
                    contradictions.append({
                        "type": "potential_contradiction",
                        "domain": key[0],
                        "category": key[1],
                        "fact_a": f1.get('id'),
                        "fact_b": f2.get('id'),
                        "similarity": len(words1 & words2) / max(len(words1), len(words2))
                    })
    return contradictions
def lint_knowledge() -> dict:
    """Run all lint checks: freshness, duplicates, contradictions."""
    results = {"errors": [], "warnings": [], "suggestions": []}
    index = load_index()
    facts = index.get('facts', [])
    # 1. Freshness check via freshness.py
    try:
        freshness_script = SCRIPT_DIR / "freshness.py"
        if freshness_script.exists():
            proc = subprocess.run(
                [sys.executable, str(freshness_script), "--knowledge-dir", str(KNOWLEDGE_DIR)],
                capture_output=True, text=True, timeout=30
            )
            if proc.returncode != 0:
                results["errors"].append(f"freshness.py failed: {proc.stderr[:200]}")
    except Exception as e:
        results["errors"].append(f"Could not run freshness check: {e}")
    # 2. Duplicate fact text
    seen = {}
    for f in facts:
        txt = f.get('fact', '').strip().lower()
        if txt in seen:
            results["warnings"].append(f"Duplicate fact text: {txt[:80]}... IDs: {seen[txt]}, {f.get('id')}")
        else:
            seen[txt] = f.get('id')
    # 3. Contradictions
    contradictions = detect_contradictions(index)
    for c in contradictions:
        results["warnings"].append(
            f"Potential contradiction in {c['domain']}/{c['category']}: "
            f"{c['fact_a']} vs {c['fact_b']} (similarity={c['similarity']:.2f})"
        )
    return results
# ---------- Subcommands ----------
def cmd_query(args):
    """Query the wiki: retrieve + synthesize."""
    if not INDEX_PATH.exists():
        print("ERROR: knowledge/index.json not found. Run ingest first.", file=sys.stderr)
        return 1
    query = args.query
    top_k = args.top or 10
    facts = retrieve_facts(query, limit=top_k)
    if not facts:
        print("No relevant facts found in knowledge base.")
        return 0
    print(f"→ Retrieved {len(facts)} facts:")
    for i, f in enumerate(facts, 1):
        fid = f.get('id', '?')
        print(f" [{i}] {fid}: {f.get('fact', '')[:90]}")
    if args.dry_run:
        print("\n[dry-run] Skipping LLM synthesis.")
        return 0
    api_key = find_api_key()
    if not api_key:
        print("ERROR: No API key. Set HARVESTER_API_KEY or OPENROUTER_API_KEY.", file=sys.stderr)
        return 1
    api_base = os.environ.get("HARVESTER_API_BASE", "https://api.nousresearch.com/v1")
    model = os.environ.get("HARVESTER_MODEL", "xiaomi/mimo-v2-pro")
    context = format_facts_as_context(facts)
    answer = call_llm_synthesize(query, context, api_base, api_key, model)
    print(f"\n← Answer: {answer}")
    return 0
def cmd_ingest(args):
    """Ingest knowledge from a session transcript."""
    session = args.session
    if not os.path.exists(session):
        print(f"ERROR: Session file not found: {session}", file=sys.stderr)
        return 1
    harvester = SCRIPT_DIR / "harvester.py"
    if not harvester.exists():
        print("ERROR: harvester.py not found", file=sys.stderr)
        return 1
    # Delegate to harvester.py in a subprocess and propagate its exit code
    cmd = [sys.executable, str(harvester), "--session", session, "--output", str(KNOWLEDGE_DIR)]
    if args.dry_run:
        cmd.append("--dry-run")
    env = os.environ.copy()
    env["PYTHONPATH"] = str(REPO_ROOT)
    result = subprocess.run(cmd, env=env)
    return result.returncode
def cmd_lint(args):
    """Lint the knowledge base for quality issues."""
    results = lint_knowledge()
    if results["errors"]:
        print("ERRORS:")
        for e in results["errors"]:
            print(f"{e}")
        return 1
    if results["warnings"]:
        print(f"WARNINGS ({len(results['warnings'])}):")
        for w in results["warnings"]:
            print(f"{w}")
    else:
        print("✓ No lint issues found. Knowledge base is clean.")
    return 0 if not results["errors"] else 1
def cmd_crystallize(args):
    """Alias for ingest — session crystallization."""
    return cmd_ingest(args)
def main():
    parser = argparse.ArgumentParser(
        description="LLM Wiki layer — ingest, query, lint, crystallize",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  python3 scripts/wiki.py query "How do I fix cron timeouts?"
  python3 scripts/wiki.py ingest --session ~/.hermes/sessions/abc.jsonl
  python3 scripts/wiki.py lint
  python3 scripts/wiki.py crystal --session session.jsonl
"""
    )
    sub = parser.add_subparsers(dest="command", help="Wiki command")
    qp = sub.add_parser("query", help="Ask the wiki a question (RAG + synthesis)")
    qp.add_argument("query", help="Natural language question")
    qp.add_argument("--top", type=int, default=10, help="Number of facts to retrieve")
    qp.add_argument("--dry-run", action="store_true", help="Show retrieval but skip LLM")
    qp.set_defaults(func=cmd_query)
    ip = sub.add_parser("ingest", help="Ingest a session transcript into knowledge")
    ip.add_argument("--session", required=True, help="Path to session JSONL file")
    ip.add_argument("--dry-run", action="store_true", help="Preview without writing")
    ip.set_defaults(func=cmd_ingest)
    lp = sub.add_parser("lint", help="Check knowledge base for issues")
    lp.set_defaults(func=cmd_lint)
    cp = sub.add_parser("crystal", help="Crystallize a session into durable pages")
    cp.add_argument("--session", required=True, help="Path to session JSONL file")
    cp.add_argument("--dry-run", action="store_true", help="Preview without writing")
    cp.set_defaults(func=cmd_crystallize)
    args = parser.parse_args()
    if not args.command:
        parser.print_help()
        return 1
    return args.func(args)
if __name__ == "__main__":
    sys.exit(main())


@@ -1,118 +0,0 @@
"""
Tests for session_pair_harvester — training pair extraction from sessions.
"""
import json
import tempfile
import unittest
from pathlib import Path
import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))
from session_pair_harvester import (
extract_pairs_from_conversation,
extract_from_jsonl_file,
deduplicate_pairs,
compute_hash,
)
class TestSessionPairHarvester(unittest.TestCase):
def test_compute_hash_consistent(self):
h1 = compute_hash("hello world")
h2 = compute_hash("hello world")
self.assertEqual(h1, h2)
self.assertEqual(len(h1), 16)
def test_extract_simple_qa_pair(self):
"""A simple user→assistant exchange produces one pair."""
conversation = [
{"role": "user", "content": "What is the capital of France?"},
{"role": "assistant", "content": "The capital of France is Paris. It is a major European city renowned for its art, fashion, gastronomy, cultural heritage, and historical significance. The city attracts millions of tourists annually."},
]
pairs = extract_pairs_from_conversation(conversation, "test_session", "test-model")
self.assertEqual(len(pairs), 1)
self.assertEqual(pairs[0]["terse"], "What is the capital of France?")
self.assertIn("Paris", pairs[0]["rich"])
self.assertEqual(pairs[0]["source"], "test_session")
def test_min_ratio_filter(self):
"""Very short responses are filtered out."""
conversation = [
{"role": "user", "content": "Yes"},
{"role": "assistant", "content": "No."},
]
# Default min_ratio = 1.5, min_words = 20 for response
pairs = extract_pairs_from_conversation(conversation, "s", "m", min_response_words=3)
self.assertEqual(len(pairs), 0)
def test_min_words_filter(self):
"""Assistant responses below min word count are skipped."""
conversation = [
{"role": "user", "content": "Explain the project architecture in detail"},
{"role": "assistant", "content": "OK."},
]
pairs = extract_pairs_from_conversation(conversation, "s", "m", min_response_words=5)
self.assertEqual(len(pairs), 0)
def test_skip_non_assistant_messages(self):
"""System and tool messages are ignored."""
conversation = [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi there! How can I help you today?"},
]
pairs = extract_pairs_from_conversation(conversation, "s", "m", min_response_words=3)
self.assertEqual(len(pairs), 1)
self.assertEqual(pairs[0]["terse"], "Hello")
def test_multiple_pairs_from_one_session(self):
"""A conversation with several Q&A turns yields multiple pairs."""
conversation = [
{"role": "user", "content": "First question?"},
{"role": "assistant", "content": "Here is a detailed and comprehensive answer that thoroughly explores multiple aspects of the subject. It provides background context and practical implications for the reader."},
{"role": "user", "content": "Second?"},
{"role": "assistant", "content": "Another comprehensive response with detailed examples. This includes practical code blocks and thorough explanations to ensure deep understanding of the topic at hand."},
]
pairs = extract_pairs_from_conversation(conversation, "s", "m", min_ratio=1.0)
self.assertEqual(len(pairs), 2)
def test_deduplication_removes_duplicates(self):
"""Identical pairs across sessions are deduplicated."""
pairs = [
{"terse": "q1", "rich": "a1", "source": "s1", "model": "m"},
{"terse": "q1", "rich": "a1", "source": "s2", "model": "m"},
{"terse": "q2", "rich": "a2", "source": "s1", "model": "m"},
]
unique = deduplicate_pairs(pairs)
self.assertEqual(len(unique), 2)
sources = {p["source"] for p in unique}
# First unique pair can be from either s1 or s2
self.assertIn("s1", sources)
def test_integration_with_test_sessions(self):
"""Harvester finds pairs in real test session files."""
repo_root = Path(__file__).parent.parent
test_sessions_dir = repo_root / "test_sessions"
if not test_sessions_dir.exists():
self.skipTest("test_sessions not found")
pairs = []
for jsonl_file in sorted(test_sessions_dir.glob("*.jsonl")):
pairs.extend(extract_from_jsonl_file(str(jsonl_file)))
self.assertGreater(len(pairs), 0, "Should extract at least one pair from test_sessions")
for p in pairs:
self.assertIn("terse", p)
self.assertIn("rich", p)
self.assertIn("source", p)
self.assertIn("model", p)
# Verify content exists
self.assertGreater(len(p["terse"]), 0)
self.assertGreater(len(p["rich"]), 0)
if __name__ == "__main__":
unittest.main()