Compare commits
2 Commits
fix/676 ... feat/200-k

| Author | SHA1 | Date |
|--------|------|------|
| | baa2c84c3f | |
| | 6dd354385f | |

412  GENOME.md
@@ -1,16 +1,16 @@

# GENOME.md — compounding-intelligence

**Generated:** 2026-04-17

**Repo:** Timmy_Foundation/compounding-intelligence

**Description:** Turn 1B+ daily agent tokens into durable, compounding fleet intelligence.

*Auto-generated codebase genome. Addresses timmy-home#676.*

---

## Project Overview

**What:** A system that turns 1B+ daily agent tokens into durable, compounding fleet intelligence.

**Why:** Every agent session starts at zero. The same mistakes get made repeatedly — the same HTTP 405 is rediscovered as a branch protection issue, the same token path is searched for from scratch. Intelligence evaporates when the session ends.

**How:** Three pipelines form a compounding loop:

```
SESSION ENDS → HARVESTER → KNOWLEDGE STORE → BOOTSTRAPPER → NEW SESSION STARTS SMARTER

MEASURER → Prove it's working
```

@@ -18,234 +18,222 @@ SESSION ENDS → HARVESTER → KNOWLEDGE STORE → BOOTSTRAPPER → NEW SESSION
**Status:** Active development. Core pipelines implemented. 20+ scripts, 14 test files, knowledge store populated with real data.

**Status:** Early stage. Template and test scaffolding exist. Core pipeline scripts (harvester.py, bootstrapper.py, measurer.py, session_reader.py) are planned but not yet implemented. The knowledge extraction prompt is complete and validated.

---

## Architecture
```mermaid
graph TD
    TRANS[Session Transcripts<br/>~/.hermes/sessions/*.jsonl] --> READER[session_reader.py]
    READER --> HARVESTER[harvester.py]
    HARVESTER -->|LLM extraction| PROMPT[harvest-prompt.md]
    HARVESTER --> DEDUP["deduplicate()"]
    DEDUP --> INDEX[knowledge/index.json]
    DEDUP --> GLOBAL[knowledge/global/*.yaml]
    DEDUP --> REPO[knowledge/repos/*.yaml]

    INDEX --> BOOTSTRAPPER[bootstrapper.py]
    BOOTSTRAPPER -->|filter + rank + truncate| CONTEXT[Bootstrap Context<br/>2k token injection]
    CONTEXT --> SESSION[New Session starts smarter]

    INDEX --> VALIDATOR[validate_knowledge.py]
    INDEX --> STALENESS[knowledge_staleness_check.py]
    INDEX --> GAPS[knowledge_gap_identifier.py]

    TRANS --> SAMPLER[sampler.py]
    SAMPLER -->|score + rank| BEST[High-value sessions]
    BEST --> HARVESTER

    TRANS --> METADATA[session_metadata.py]
    METADATA --> SUMMARY[SessionSummary objects]

    KNOWLEDGE --> DIFF[diff_analyzer.py]
    DIFF --> PROPOSALS[improvement_proposals.py]
    PROPOSALS --> PRIORITIES[priority_rebalancer.py]

    A[Session Transcript<br/>.jsonl] --> B[Harvester]
    B --> C{Extract Knowledge}
    C --> D[knowledge/index.json]
    C --> E[knowledge/global/*.md]
    C --> F[knowledge/repos/{repo}.md]
    C --> G[knowledge/agents/{agent}.md]
    D --> H[Bootstrapper]
    H --> I[Bootstrap Context<br/>2k token injection]
    I --> J[New Session<br/>starts smarter]
    J --> A
    D --> K[Measurer]
    K --> L[metrics/dashboard.md]
    K --> M[Velocity / Hit Rate<br/>Error Reduction]
```
## Entry Points

### Pipeline 1: Harvester

**Status:** Prompt designed. Script not implemented.
Reads finished session transcripts (JSONL). Uses `templates/harvest-prompt.md` to extract durable knowledge into five categories:

| Category | Description | Example |
|----------|-------------|---------|
| `fact` | Concrete, verifiable information | "Repository X has 5 files" |
| `pitfall` | Errors encountered, wrong assumptions | "Token is at ~/.config/gitea/token, not env var" |
| `pattern` | Successful action sequences | "Deploy: test → build → push → webhook" |
| `tool-quirk` | Environment-specific behaviors | "URL format requires trailing slash" |
| `question` | Identified but unanswered | "Need optimal batch size for harvesting" |
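The `validate_fact()` step listed under `harvester.py` in the Core Pipelines table below presumably enforces this taxonomy. A minimal sketch of such a check, assuming the required fields from the Knowledge Item schema later in this document; the exact rules are not shown here:

```python
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_fact(item: dict) -> bool:
    """Check a harvested item against the knowledge-item schema."""
    if not isinstance(item.get("fact"), str) or not item["fact"].strip():
        return False
    if item.get("category") not in VALID_CATEGORIES:
        return False
    if not isinstance(item.get("repo"), str) or not item["repo"]:
        return False
    confidence = item.get("confidence")
    return isinstance(confidence, (int, float)) and 0.0 <= confidence <= 1.0
```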
### Core Pipelines

| Script | Purpose | Key Functions |
|--------|---------|---------------|
| `harvester.py` | Extract knowledge from session transcripts | `harvest_session()`, `call_llm()`, `deduplicate()`, `validate_fact()` |
| `bootstrapper.py` | Build pre-session context from knowledge store | `build_bootstrap_context()`, `filter_facts()`, `sort_facts()`, `truncate_to_tokens()` |
| `session_reader.py` | Parse JSONL session transcripts | `read_session()`, `extract_conversation()`, `messages_to_text()` |
| `sampler.py` | Score and rank sessions for harvesting value | `scan_session_fast()`, `score_session()` |
| `session_metadata.py` | Extract structured metadata from sessions | `extract_session_metadata()`, `SessionSummary` |

### Analysis & Quality

| Script | Purpose |
|--------|---------|
| `validate_knowledge.py` | Validate knowledge index schema compliance |
| `knowledge_staleness_check.py` | Detect stale knowledge (source changed since extraction) |
| `knowledge_gap_identifier.py` | Find untested functions, undocumented APIs, missing tests |
| `diff_analyzer.py` | Analyze code diffs for improvement signals |
| `improvement_proposals.py` | Generate ranked improvement proposals |
| `priority_rebalancer.py` | Rebalance priorities across proposals |
| `automation_opportunity_finder.py` | Find manual steps that can be automated |
| `dead_code_detector.py` | Detect unused code |
| `dependency_graph.py` | Map dependency relationships |
| `perf_bottleneck_finder.py` | Find performance bottlenecks |
| `refactoring_opportunity_finder.py` | Identify refactoring targets |
| `gitea_issue_parser.py` | Parse Gitea issues for knowledge extraction |
### Automation

| Script | Purpose |
|--------|---------|
| `session_pair_harvester.py` | Extract training pairs from sessions |
## Data Flow

```
1. Session ends → .jsonl written to ~/.hermes/sessions/
2. sampler.py scores sessions by age, recency, repo coverage
3. harvester.py reads top sessions, calls LLM with harvest-prompt.md
4. LLM extracts facts/pitfalls/patterns/quirks/questions
5. deduplicate() checks against existing index via fact_fingerprint()
6. validate_fact() checks schema compliance
7. write_knowledge() appends to knowledge/index.json + per-repo YAML
8. On next session start, bootstrapper.py:
   a. Loads knowledge/index.json
   b. Filters by session's repo and agent type
   c. Sorts by confidence (high first), then recency
   d. Truncates to 2k token budget
   e. Injects as pre-context
9. Agent starts with full situational awareness instead of zero
```
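Steps 5 and 6 turn on `fact_fingerprint()`. A minimal sketch of fingerprint-based dedup, assuming the fingerprint is simply a hash of the normalized fact text; the real normalization in `harvester.py` may be more robust:

```python
import hashlib
import re

def fact_fingerprint(fact_text):
    """Normalize whitespace and case, then hash, so trivially reworded duplicates collide."""
    normalized = re.sub(r"\s+", " ", fact_text.strip().lower())
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]

def deduplicate(new_items, index):
    """Drop items whose fingerprint already exists in the index."""
    seen = {fact_fingerprint(item["fact"]) for item in index}
    unique = []
    for item in new_items:
        fp = fact_fingerprint(item["fact"])
        if fp not in seen:
            seen.add(fp)
            unique.append(item)
    return unique
```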
## Key Abstractions

### Knowledge Item (fact/pitfall/pattern/quirk/question)

Output schema per knowledge item:

```json
{
  "fact": "Gitea token is at ~/.config/gitea/token",
  "category": "tool-quirk",
  "repo": "global",
  "confidence": 0.9,
  "evidence": "Found during clone attempt",
  "source_session": "2026-04-13_abc123",
  "extracted_at": "2026-04-13T20:00:00Z"
}
```

`fact` is a one-sentence description, `category` is one of `fact|pitfall|pattern|tool-quirk|question`, `repo` is a repo name or `'global'`, and `confidence` ranges 0.0–1.0.
### SessionSummary (session_metadata.py)

Extracted metadata per session: duration, token count, tools used, repos touched, error count, outcome.
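A sketch of the shape such an object might take; the field names below are assumptions derived from the list above, not the actual definition in `session_metadata.py`:

```python
from dataclasses import dataclass, field

@dataclass
class SessionSummary:
    session_id: str
    duration_seconds: float
    token_count: int
    tools_used: list = field(default_factory=list)
    repos_touched: list = field(default_factory=list)
    error_count: int = 0
    outcome: str = "unknown"  # e.g. success / failure / partial
```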
### Pipeline 2: Bootstrapper

Queries knowledge store before session start. Assembles a compact 2k-token context from relevant facts. Injects into session startup so the agent begins with full situational awareness.

**Status:** Not implemented.

### Gap / GapReport (knowledge_gap_identifier.py)

Structured gap analysis: untested functions, undocumented APIs, missing tests. Severity: critical/high/medium/low.

### Knowledge Index (knowledge/index.json)

Machine-readable fact store. 12KB, populated with real data. Categories: fact, pitfall, pattern, tool-quirk, question.
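For the Bootstrapper's filter-and-rank step (Data Flow steps 8b and 8c), a minimal sketch assuming the `filter_facts()`/`sort_facts()` names from the Core Pipelines table; the bodies are illustrative only:

```python
def filter_facts(facts, repo, agent=None):
    """Keep facts scoped to this repo (or global), and to this agent type if tagged."""
    return [
        f for f in facts
        if f.get("repo") in (repo, "global")
        and (agent is None or f.get("agent", agent) == agent)
    ]

def sort_facts(facts):
    """Highest confidence first; break ties by most recent extraction."""
    return sorted(
        facts,
        key=lambda f: (f.get("confidence", 0.0), f.get("extracted_at", "")),
        reverse=True,
    )
```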
## Knowledge Store

```
knowledge/
├── index.json             # Master fact store (12KB, populated)
├── SCHEMA.md              # Schema documentation
├── global/
│   ├── pitfalls.yaml      # Cross-repo pitfalls (2KB)
│   └── tool-quirks.yaml   # Tool-specific quirks (2KB)
├── repos/
│   ├── hermes-agent.yaml  # hermes-agent knowledge (2KB)
│   └── the-nexus.yaml     # the-nexus knowledge (2KB)
└── agents/                # Per-agent knowledge (empty)
```

### Pipeline 3: Measurer

Tracks compounding metrics: knowledge velocity (facts/day), error reduction (%), hit rate (knowledge used / knowledge available), task completion improvement.

**Status:** Not implemented.
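Even though `measurer.py` does not exist yet, the headline metrics are simple ratios. A sketch with assumed input shapes:

```python
def knowledge_velocity(fact_count, days):
    """Facts harvested per day over the measurement window."""
    return fact_count / days if days > 0 else 0.0

def hit_rate(used, available):
    """Knowledge used / knowledge available, per session or per window."""
    return len(used & available) / len(available) if available else 0.0

def error_reduction(errors_before, errors_after):
    """Percentage drop in repeated errors between two windows."""
    return (errors_before - errors_after) / errors_before * 100.0 if errors_before else 0.0
```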
## API Surface

### LLM API (consumed)

| Provider | Endpoint | Usage |
|----------|----------|-------|
| Nous Research | `https://inference-api.nousresearch.com/v1` | Knowledge extraction |
| Ollama | `http://localhost:11434/v1` | Local fallback |
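Both endpoints are `/v1` roots, which suggests the OpenAI-compatible chat-completions protocol. A hedged sketch of a `call_llm()` with local fallback; the protocol assumption, the single shared `model` name, and the fallback policy are guesses rather than confirmed behavior:

```python
import requests  # assumes the requests package is available
from typing import Optional

PRIMARY = "https://inference-api.nousresearch.com/v1"
FALLBACK = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint

def call_llm(prompt: str, model: str, api_key: Optional[str] = None,
             timeout: int = 60) -> str:
    """Try the hosted endpoint first; fall back to local Ollama on any failure."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    for base, key in ((PRIMARY, api_key), (FALLBACK, None)):
        try:
            headers = {"Authorization": f"Bearer {key}"} if key else {}
            resp = requests.post(f"{base}/chat/completions", json=payload,
                                 headers=headers, timeout=timeout)
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]
        except requests.RequestException:
            continue  # try the next endpoint
    raise RuntimeError("all LLM endpoints failed")
```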
### File API (consumed/produced)

| Path | Format | Direction |
|------|--------|-----------|
| `~/.hermes/sessions/*.jsonl` | JSONL | Input (session transcripts) |
| `knowledge/index.json` | JSON | Output (master fact store) |
| `knowledge/global/*.yaml` | YAML | Output (cross-repo knowledge) |
| `knowledge/repos/*.yaml` | YAML | Output (per-repo knowledge) |
| `templates/harvest-prompt.md` | Markdown | Config (extraction prompt) |
## Test Coverage

**14 test files** covering core pipelines:

| Test File | Covers |
|-----------|--------|
| `test_harvest_prompt.py` | Prompt validation, hallucination detection |
| `test_harvest_prompt_comprehensive.py` | Extended prompt testing |
| `test_harvester_pipeline.py` | Harvester extraction + dedup |
| `test_bootstrapper.py` | Context building, filtering, truncation |
| `test_session_pair_harvester.py` | Training pair extraction |
| `test_improvement_proposals.py` | Proposal generation |
| `test_priority_rebalancer.py` | Priority scoring |
| `test_knowledge_staleness.py` | Staleness detection |
| `test_automation_opportunity_finder.py` | Automation detection |
| `test_diff_analyzer.py` | Diff analysis |
| `test_gitea_issue_parser.py` | Issue parsing |
| `test_refactoring_opportunity_finder.py` | Refactoring signals |
| `test_knowledge_gap_identifier.py` | Gap analysis |
| `test_perf_bottleneck_finder.py` | Perf bottleneck detection |

### Coverage Gaps

1. **session_reader.py** — No dedicated test file (tested indirectly)
2. **sampler.py** — No test file (scoring logic untested)
3. **session_metadata.py** — No test file
4. **validate_knowledge.py** — No test file
5. **knowledge_staleness_check.py** — Tested but limited
## Security Considerations

### API Key Handling

- `harvester.py` reads API key from `~/.hermes/auth.json` or env vars
- Key passed to LLM API in request headers only
- No key logging
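A sketch of that lookup order; the env var name and the `auth.json` field are assumptions:

```python
import json
import os
from pathlib import Path
from typing import Optional

def load_api_key() -> Optional[str]:
    """Env var wins; otherwise fall back to ~/.hermes/auth.json.
    The value is returned, never printed or logged."""
    key = os.environ.get("LLM_API_KEY")  # hypothetical env var name
    if key:
        return key
    auth_path = Path.home() / ".hermes" / "auth.json"
    try:
        return json.loads(auth_path.read_text()).get("api_key")  # hypothetical field
    except (FileNotFoundError, json.JSONDecodeError):
        return None
```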
### Knowledge Integrity

- `validate_fact()` checks schema before writing
- `deduplicate()` prevents duplicate entries via fingerprint
- `knowledge_staleness_check.py` detects when source code changed but knowledge didn't
- Confidence scores prevent low-quality knowledge from polluting the store

### File Safety

- Knowledge writes are append-only (never deletes)
- Bootstrap context is truncated to budget (no prompt injection via knowledge)
- Session reader handles malformed JSONL gracefully
## File Index

```
scripts/
  harvester.py (473 lines)                      — Core knowledge extraction
  bootstrapper.py (302 lines)                   — Pre-session context builder
  session_reader.py (137 lines)                 — JSONL session parser
  sampler.py (363 lines)                        — Session scoring + ranking
  session_metadata.py (271 lines)               — Session metadata extraction
  validate_knowledge.py (44 lines)              — Index validation
  knowledge_staleness_check.py (125 lines)      — Staleness detection
  knowledge_gap_identifier.py (291 lines)       — Gap analysis engine
  diff_analyzer.py (203 lines)                  — Diff analysis
  improvement_proposals.py (518 lines)          — Proposal generation
  priority_rebalancer.py (745 lines)            — Priority scoring
  automation_opportunity_finder.py (600 lines)  — Automation detection
  dead_code_detector.py (270 lines)             — Dead code detection
  dependency_graph.py (220 lines)               — Dependency mapping
  perf_bottleneck_finder.py (635 lines)         — Perf analysis
  refactoring_opportunity_finder.py (46 lines)  — Refactoring signals
  gitea_issue_parser.py (140 lines)             — Gitea issue parsing
  session_pair_harvester.py (224 lines)         — Training pair extraction
knowledge/
  index.json (12KB)                             — Master fact store
  SCHEMA.md (3KB)                               — Schema docs
  global/pitfalls.yaml (2KB)                    — Cross-repo pitfalls
  global/tool-quirks.yaml (2KB)                 — Tool quirks
  repos/hermes-agent.yaml (2KB)                 — Repo-specific knowledge
  repos/the-nexus.yaml (2KB)                    — Repo-specific knowledge
templates/
  harvest-prompt.md (4KB)                       — Extraction prompt
test_sessions/ (5 files)                        — Sample transcripts
tests/ + scripts/test_* (14 files)              — Test suite
```

**Total:** ~6,500 lines of code across 18 scripts + 14 test files.
---

*Generated by Codebase Genome pipeline — Issue #676*
## Directory Structure

```
compounding-intelligence/
├── README.md                  # Project overview and architecture
├── GENOME.md                  # This file (codebase genome)
├── knowledge/                 # [PLANNED] Knowledge store
│   ├── index.json             # Machine-readable fact index
│   ├── global/                # Cross-repo knowledge
│   ├── repos/{repo}.md        # Per-repo knowledge
│   └── agents/{agent}.md      # Agent-type notes
├── scripts/
│   ├── test_harvest_prompt.py                # Basic prompt validation (2.5KB)
│   └── test_harvest_prompt_comprehensive.py  # Full prompt structure test (6.8KB)
├── templates/
│   └── harvest-prompt.md      # Knowledge extraction prompt (3.5KB)
├── test_sessions/
│   ├── session_success.jsonl    # Happy path test data
│   ├── session_failure.jsonl    # Failure path test data
│   ├── session_partial.jsonl    # Incomplete session test data
│   ├── session_patterns.jsonl   # Pattern extraction test data
│   └── session_questions.jsonl  # Question identification test data
└── metrics/                   # [PLANNED] Compounding metrics
    └── dashboard.md
```
---

## Entry Points and Data Flow

### Entry Point 1: Knowledge Extraction (Harvester)

```
Input: Session transcript (JSONL)
    ↓
templates/harvest-prompt.md (LLM prompt)
    ↓
Knowledge items (JSON array)
    ↓
Output: knowledge/index.json + per-repo/per-agent markdown files
```

### Entry Point 2: Session Bootstrap (Bootstrapper)

```
Input: Session context (repo, agent type, task type)
    ↓
knowledge/index.json (query relevant facts)
    ↓
2k-token bootstrap context
    ↓
Output: Injected into session startup
```

### Entry Point 3: Measurement (Measurer)

```
Input: knowledge/index.json + session history
    ↓
Velocity, hit rate, error reduction calculations
    ↓
Output: metrics/dashboard.md
```
---

## Key Abstractions

### Knowledge Item

The atomic unit. One sentence, one category, one confidence score. Designed to be small enough that the most relevant items fit in a 2k-token bootstrap context.

### Knowledge Store

A directory structure that mirrors the fleet's mental model (a routing sketch follows the list):

- `global/` — knowledge that applies everywhere (tool quirks, environment facts)
- `repos/` — knowledge specific to each repo
- `agents/` — knowledge specific to each agent type
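A sketch of routing an item to the right file under that layout; the `agent` field and the category-based filename are assumptions:

```python
from pathlib import Path

def knowledge_path(root: Path, item: dict) -> Path:
    """Pick the store file an item belongs in, mirroring global/repos/agents."""
    if item.get("agent"):  # hypothetical field for per-agent knowledge
        return root / "agents" / f"{item['agent']}.yaml"
    if item.get("repo", "global") == "global":
        # e.g. category "pitfall" -> global/pitfalls.yaml
        return root / "global" / f"{item.get('category', 'fact')}s.yaml"
    return root / "repos" / f"{item['repo']}.yaml"
```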
### Confidence Score

0.0–1.0 scale. Defines how certain the harvester is about each extracted fact:

- 0.9–1.0: Explicitly stated with verification
- 0.7–0.8: Clearly implied by multiple data points
- 0.5–0.6: Suggested but not fully verified
- 0.3–0.4: Inferred from limited data
- 0.1–0.2: Speculative or uncertain
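For illustration, the rubric as a lookup helper (a hypothetical function; the thresholds mirror the bands above):

```python
def confidence_band(score: float) -> str:
    """Map a 0.0-1.0 confidence score onto the rubric above."""
    if score >= 0.9:
        return "explicitly stated with verification"
    if score >= 0.7:
        return "clearly implied by multiple data points"
    if score >= 0.5:
        return "suggested but not fully verified"
    if score >= 0.3:
        return "inferred from limited data"
    return "speculative or uncertain"
```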
### Bootstrap Context

The 2k-token injection that a new session receives. Assembled from the most relevant knowledge items for the current task, filtered by confidence > 0.7, deduplicated, and compressed.
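A minimal sketch of that assembly step, assuming the common rough estimate of 4 characters per token; the real `bootstrapper.py` may count tokens properly:

```python
def truncate_to_tokens(ranked_facts, budget_tokens=2000, min_confidence=0.7):
    """Keep high-confidence facts, in ranked order, until the budget is spent."""
    lines, used = [], 0
    for fact in ranked_facts:
        if fact.get("confidence", 0.0) <= min_confidence:
            continue
        line = f"- [{fact.get('category', 'fact')}] {fact['fact']}"
        cost = max(1, len(line) // 4)  # rough chars-to-tokens estimate
        if used + cost > budget_tokens:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```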
---

## API Surface

### Internal (scripts not yet implemented)

| Script | Input | Output | Status |
|--------|-------|--------|--------|
| `harvester.py` | Session JSONL path | Knowledge items JSON | PLANNED |
| `bootstrapper.py` | Repo + agent type | 2k-token context string | PLANNED |
| `measurer.py` | Knowledge store path | Metrics JSON | PLANNED |
| `session_reader.py` | Session JSONL path | Parsed transcript | PLANNED |

### Prompt (templates/harvest-prompt.md)

The extraction prompt is the core "API." It takes a session transcript and returns structured JSON. It defines:

- Five extraction categories
- Output format (JSON array of knowledge items; see the parsing sketch below)
- Confidence scoring rubric
- Constraints (no hallucination, specificity, relevance, brevity)
- Example input/output pair
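LLM replies do not always arrive as bare JSON, so a tolerant parser helps. A sketch, with the recovery strategy (grabbing the first bracketed span) as an assumption:

```python
import json
import re

def parse_knowledge_items(raw: str) -> list:
    """Parse an LLM reply expected to contain a JSON array of knowledge items."""
    try:
        data = json.loads(raw)
        return data if isinstance(data, list) else []
    except json.JSONDecodeError:
        pass
    # Fall back to the first [...] span, e.g. when wrapped in ```json fences or prose.
    match = re.search(r"\[.*\]", raw, re.DOTALL)
    if match:
        try:
            data = json.loads(match.group(0))
            return data if isinstance(data, list) else []
        except json.JSONDecodeError:
            pass
    return []
```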
---

## Test Coverage

### What Exists

| File | Tests | Coverage |
|------|-------|----------|
| `scripts/test_harvest_prompt.py` | 2 tests | Prompt file existence, sample transcript |
| `scripts/test_harvest_prompt_comprehensive.py` | 5 tests | Prompt structure, categories, fields, confidence scoring, size limits |
| `test_sessions/*.jsonl` | 5 sessions | Success, failure, partial, patterns, questions |

### What's Missing

1. **Harvester integration test** — Does the prompt actually extract correct knowledge from real transcripts?
2. **Bootstrapper test** — Does it assemble relevant context correctly?
3. **Knowledge store test** — Does the index.json maintain consistency?
4. **Confidence calibration test** — Do high-confidence facts actually prove true in later sessions?
5. **Deduplication test** — Are duplicate facts across sessions handled?
6. **Staleness test** — How does the system handle outdated knowledge?
---

## Security Considerations

1. **No secrets in knowledge store** — The harvester must filter out API keys, tokens, and credentials from extracted facts. The prompt constraints mention this, but there is no automated guard.
2. **Knowledge poisoning** — A malicious or corrupted session could inject false facts. Confidence scoring partially mitigates this, but there is no verification step.
3. **Access control** — The knowledge store has no access control. Any process that can read the directory can read all facts. In a multi-tenant setup, this is a concern.
4. **Transcript privacy** — Session transcripts may contain user data. The harvester must not extract personally identifiable information into the knowledge store.

---
## The 100x Path (from README)

```
Month 1: 15,000 facts, sessions 20% faster
Month 2: 45,000 facts, sessions 40% faster, first-try success up 30%
Month 3: 90,000 facts, fleet measurably smarter per token
```

Each new session is better than the last. The intelligence compounds.

---

*Generated by codebase-genome pipeline. Ref: timmy-home#676.*
387  scripts/freshness.py (new file)

@@ -0,0 +1,387 @@
#!/usr/bin/env python3
"""
Knowledge Freshness Cron — Detect stale entries from code changes (Issue #200)

Automatically detects when knowledge entries become stale due to code changes.

Detection Method:
1. Track source file hash alongside knowledge entry
2. Compare current file hashes vs stored
3. Mismatch → flag entry as potentially stale
4. Report stale entries and optionally re-extract

Usage:
    python3 scripts/freshness.py --knowledge-dir knowledge/
    python3 scripts/freshness.py --knowledge-dir knowledge/ --json
    python3 scripts/freshness.py --knowledge-dir knowledge/ --repo /path/to/repo
    python3 scripts/freshness.py --knowledge-dir knowledge/ --auto-reextract
"""

import argparse
import hashlib
import json
import os
import subprocess
import sys
import yaml
from datetime import datetime, timezone
from typing import Dict, List, Any, Optional


def compute_file_hash(filepath: str) -> Optional[str]:
    """Compute SHA-256 hash of a file. Returns None if file doesn't exist."""
    try:
        with open(filepath, "rb") as f:
            return "sha256:" + hashlib.sha256(f.read()).hexdigest()
    except (FileNotFoundError, IsADirectoryError, PermissionError):
        return None


def get_git_file_changes(repo_path: str, days: int = 1) -> Dict[str, List[str]]:
    """
    Get files changed in git in the last N days.

    Returns dict with 'modified', 'added', 'deleted' lists of file paths.
    """
    changes = {"modified": [], "added": [], "deleted": []}

    try:
        # Get commits from last N days
        cmd = [
            "git", "-C", repo_path, "log",
            f"--since={days} days ago",
            "--name-status",
            "--pretty=format:",
            "--diff-filter=MAD"
        ]
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)

        if result.returncode != 0:
            return changes

        for line in result.stdout.splitlines():
            line = line.strip()
            if not line:
                continue

            parts = line.split('\t', 1)
            if len(parts) != 2:
                continue

            status, filepath = parts
            if status == 'M':
                changes["modified"].append(filepath)
            elif status == 'A':
                changes["added"].append(filepath)
            elif status == 'D':
                changes["deleted"].append(filepath)

    except (subprocess.TimeoutExpired, FileNotFoundError):
        pass

    # Deduplicate
    for key in changes:
        changes[key] = list(set(changes[key]))

    return changes


def load_knowledge_entries(knowledge_dir: str) -> List[Dict[str, Any]]:
    """
    Load knowledge entries from YAML files in the knowledge directory.

    Supports:
    - knowledge/index.json (legacy format)
    - knowledge/global/*.yaml
    - knowledge/repos/*.yaml
    - knowledge/agents/*.yaml
    """
    entries = []

    # Load from index.json if exists
    index_path = os.path.join(knowledge_dir, "index.json")
    if os.path.exists(index_path):
        try:
            with open(index_path) as f:
                data = json.load(f)
            for fact in data.get("facts", []):
                entries.append({
                    "source": "index.json",
                    "fact": fact.get("fact", ""),
                    "source_file": fact.get("source_file"),
                    "source_hash": fact.get("source_hash"),
                    "category": fact.get("category", "unknown"),
                    "confidence": fact.get("confidence", 0.5)
                })
        except (json.JSONDecodeError, KeyError):
            pass

    # Load from YAML files
    for subdir in ["global", "repos", "agents"]:
        subdir_path = os.path.join(knowledge_dir, subdir)
        if not os.path.isdir(subdir_path):
            continue

        for filename in os.listdir(subdir_path):
            if not filename.endswith((".yaml", ".yml")):
                continue

            filepath = os.path.join(subdir_path, filename)
            try:
                with open(filepath) as f:
                    data = yaml.safe_load(f)

                if not data or not isinstance(data, dict):
                    continue

                # Extract entries from YAML structure
                for key, value in data.items():
                    if isinstance(value, list):
                        for item in value:
                            if isinstance(item, dict):
                                entries.append({
                                    "source": f"{subdir}/{filename}",
                                    "fact": item.get("description", item.get("fact", "")),
                                    "source_file": item.get("source_file"),
                                    "source_hash": item.get("source_hash"),
                                    "category": item.get("category", "unknown"),
                                    "confidence": item.get("confidence", 0.5)
                                })
                    elif isinstance(value, dict):
                        entries.append({
                            "source": f"{subdir}/{filename}",
                            "fact": value.get("description", value.get("fact", "")),
                            "source_file": value.get("source_file"),
                            "source_hash": value.get("source_hash"),
                            "category": value.get("category", "unknown"),
                            "confidence": value.get("confidence", 0.5)
                        })
            except (yaml.YAMLError, IOError):
                pass

    return entries


def check_freshness(knowledge_dir: str, repo_root: str = ".",
                    days: int = 1) -> Dict[str, Any]:
    """
    Check freshness of knowledge entries against recent code changes.

    Returns:
        {
            "timestamp": ISO timestamp,
            "total_entries": int,
            "stale_entries": [...],
            "fresh_entries": [...],
            "git_changes": {...},
            "summary": {...}
        }
    """
    entries = load_knowledge_entries(knowledge_dir)
    git_changes = get_git_file_changes(repo_root, days)

    stale_entries = []
    fresh_entries = []

    for entry in entries:
        source_file = entry.get("source_file")
        if not source_file:
            # Entry without source file reference
            fresh_entries.append({**entry, "status": "no_source"})
            continue

        # Check if source file was recently modified
        is_stale = False
        reason = ""

        if source_file in git_changes["modified"]:
            is_stale = True
            reason = "source_modified"
        elif source_file in git_changes["deleted"]:
            is_stale = True
            reason = "source_deleted"
        elif source_file in git_changes["added"]:
            is_stale = True
            reason = "source_added"

        # Also check hash if available
        stored_hash = entry.get("source_hash")
        if stored_hash:
            full_path = os.path.join(repo_root, source_file)
            current_hash = compute_file_hash(full_path)

            if current_hash is None:
                is_stale = True
                reason = "source_missing"
            elif current_hash != stored_hash:
                is_stale = True
                reason = "hash_mismatch"

        if is_stale:
            stale_entries.append({
                **entry,
                "status": "stale",
                "reason": reason
            })
        else:
            fresh_entries.append({**entry, "status": "fresh"})

    # Compute summary
    total = len(entries)
    stale_count = len(stale_entries)
    fresh_count = len(fresh_entries)

    # Group stale entries by reason
    stale_by_reason = {}
    for entry in stale_entries:
        reason = entry.get("reason", "unknown")
        if reason not in stale_by_reason:
            stale_by_reason[reason] = 0
        stale_by_reason[reason] += 1

    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "total_entries": total,
        "stale_entries": stale_entries,
        "fresh_entries": fresh_entries,
        "git_changes": git_changes,
        "summary": {
            "total": total,
            "stale": stale_count,
            "fresh": fresh_count,
            "stale_percentage": round(stale_count / total * 100, 1) if total > 0 else 0,
            "stale_by_reason": stale_by_reason,
            "git_changes_summary": {
                "modified": len(git_changes["modified"]),
                "added": len(git_changes["added"]),
                "deleted": len(git_changes["deleted"])
            }
        }
    }


def update_stale_hashes(knowledge_dir: str, repo_root: str = ".") -> int:
    """
    Update hashes for stale entries. Returns count of updated entries.
    """
    entries = load_knowledge_entries(knowledge_dir)
    updated = 0

    # This is a simplified version - in practice, you'd need to
    # write back to the specific YAML files
    for entry in entries:
        source_file = entry.get("source_file")
        if not source_file:
            continue

        full_path = os.path.join(repo_root, source_file)
        current_hash = compute_file_hash(full_path)

        if current_hash and entry.get("source_hash") != current_hash:
            # Mark for update (in practice, you'd write back to the file)
            updated += 1

    return updated


def format_report(result: Dict[str, Any], max_items: int = 20) -> str:
    """Format freshness check results as a human-readable report."""
    timestamp = result["timestamp"]
    summary = result["summary"]
    stale_entries = result["stale_entries"]
    git_changes = result["git_changes"]

    lines = [
        "Knowledge Freshness Report",
        "=" * 50,
        f"Generated: {timestamp}",
        f"Total entries: {summary['total']}",
        f"Stale entries: {summary['stale']} ({summary['stale_percentage']}%)",
        f"Fresh entries: {summary['fresh']}",
        ""
    ]

    # Git changes summary
    lines.extend([
        "Git Changes (last 24h):",
        f"  Modified: {len(git_changes['modified'])} files",
        f"  Added: {len(git_changes['added'])} files",
        f"  Deleted: {len(git_changes['deleted'])} files",
        ""
    ])

    # Stale entries by reason
    if summary.get("stale_by_reason"):
        lines.extend([
            "Stale Entries by Reason:",
            ""
        ])
        for reason, count in summary["stale_by_reason"].items():
            lines.append(f"  {reason}: {count}")
        lines.append("")

    # List stale entries
    if stale_entries:
        lines.extend([
            "Stale Entries:",
            ""
        ])
        for i, entry in enumerate(stale_entries[:max_items], 1):
            source = entry.get("source_file", "?")
            reason = entry.get("reason", "unknown")
            fact = entry.get("fact", "")[:60]
            lines.append(f"{i:2d}. [{reason}] {source}")
            if fact:
                lines.append(f"    {fact}")

        if len(stale_entries) > max_items:
            lines.append(f"\n... and {len(stale_entries) - max_items} more")
    else:
        lines.append("No stale entries found. All knowledge is fresh!")

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Knowledge Freshness Cron — detect stale entries from code changes")
    parser.add_argument("--knowledge-dir", required=True,
                        help="Path to knowledge directory")
    parser.add_argument("--repo", default=".",
                        help="Path to repository for git change detection")
    parser.add_argument("--days", type=int, default=1,
                        help="Number of days to check for git changes (default: 1)")
    parser.add_argument("--json", action="store_true",
                        help="Output as JSON instead of human-readable")
    parser.add_argument("--max", type=int, default=20,
                        help="Maximum stale entries to show (default: 20)")
    parser.add_argument("--auto-reextract", action="store_true",
                        help="Auto-re-extract knowledge for stale entries")

    args = parser.parse_args()

    if not os.path.isdir(args.knowledge_dir):
        print(f"Error: {args.knowledge_dir} is not a directory", file=sys.stderr)
        sys.exit(1)

    if not os.path.isdir(args.repo):
        print(f"Error: {args.repo} is not a directory", file=sys.stderr)
        sys.exit(1)

    result = check_freshness(args.knowledge_dir, args.repo, args.days)

    if args.json:
        print(json.dumps(result, indent=2))
    else:
        print(format_report(result, args.max))

    # Auto-re-extract if requested
    if args.auto_reextract and result["stale_entries"]:
        print(f"\nAuto-re-extracting {len(result['stale_entries'])} stale entries...")
        # In a real implementation, this would call the harvester
        print("(Auto-re-extraction not yet implemented)")


if __name__ == "__main__":
    main()
227  tests/test_freshness.py (new file)

@@ -0,0 +1,227 @@
#!/usr/bin/env python3
"""Tests for scripts/freshness.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.join(os.path.dirname(__file__) or ".", ".."))
import importlib.util
spec = importlib.util.spec_from_file_location(
    "freshness", os.path.join(os.path.dirname(__file__) or ".", "..", "scripts", "freshness.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

compute_file_hash = mod.compute_file_hash
check_freshness = mod.check_freshness
load_knowledge_entries = mod.load_knowledge_entries


def test_compute_file_hash():
    """File hash should be computed correctly."""
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
        f.write("test content")
        f.flush()
        h = compute_file_hash(f.name)
    assert h is not None
    assert h.startswith("sha256:")
    os.unlink(f.name)
    print("PASS: test_compute_file_hash")


def test_compute_file_hash_nonexistent():
    """Nonexistent file should return None."""
    h = compute_file_hash("/nonexistent/file.txt")
    assert h is None
    print("PASS: test_compute_file_hash_nonexistent")


def test_load_knowledge_entries_empty():
    """Empty knowledge dir should return empty list."""
    with tempfile.TemporaryDirectory() as tmpdir:
        entries = load_knowledge_entries(tmpdir)
        assert entries == []
    print("PASS: test_load_knowledge_entries_empty")


def test_load_knowledge_entries_from_index():
    """Should load entries from index.json."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create index.json
        index_path = os.path.join(tmpdir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "test.py",
                        "source_hash": "sha256:abc123",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        entries = load_knowledge_entries(tmpdir)
        assert len(entries) == 1
        assert entries[0]["fact"] == "Test fact"
        assert entries[0]["source_file"] == "test.py"
    print("PASS: test_load_knowledge_entries_from_index")


def test_load_knowledge_entries_from_yaml():
    """Should load entries from YAML files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create global directory
        global_dir = os.path.join(tmpdir, "global")
        os.makedirs(global_dir)

        # Create YAML file
        yaml_path = os.path.join(global_dir, "test.yaml")
        with open(yaml_path, "w") as f:
            f.write("""
pitfalls:
  - description: "Test pitfall"
    source_file: "test.py"
    source_hash: "sha256:def456"
    category: "pitfall"
    confidence: 0.8
""")

        entries = load_knowledge_entries(tmpdir)
        assert len(entries) == 1
        assert entries[0]["fact"] == "Test pitfall"
        assert entries[0]["category"] == "pitfall"
    print("PASS: test_load_knowledge_entries_from_yaml")


def test_check_freshness_no_changes():
    """With no source file reference, entries should be counted correctly."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        # Create index.json with entry that has no source_file
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "General knowledge",
                        "category": "fact",
                        "confidence": 0.9
                        # No source_file or source_hash
                    }
                ]
            }, f)

        result = check_freshness(knowledge_dir, repo_dir, days=1)

        # Entry without source_file should be counted as "fresh" (no_source status)
        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 0
        assert result["summary"]["fresh"] == 1
        assert result["fresh_entries"][0]["status"] == "no_source"
    print("PASS: test_check_freshness_no_changes")


def test_check_freshness_with_hash_mismatch():
    """Hash mismatch should mark entry as stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir with a file
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        test_file = os.path.join(repo_dir, "test.py")
        with open(test_file, "w") as f:
            f.write("print('hello')")

        # Create index.json with wrong hash
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "test.py",
                        "source_hash": "sha256:wronghash",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        # Initialize git repo
        os.system(f"cd {repo_dir} && git init && git add . && git commit -m 'init' 2>/dev/null")

        result = check_freshness(knowledge_dir, repo_dir, days=1)

        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 1
        assert result["summary"]["fresh"] == 0
        assert result["stale_entries"][0]["reason"] == "hash_mismatch"
    print("PASS: test_check_freshness_with_hash_mismatch")


def test_check_freshness_missing_source():
    """Missing source file should mark entry as stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir (without the referenced file)
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        # Create index.json referencing nonexistent file
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "nonexistent.py",
                        "source_hash": "sha256:abc123",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        # Initialize git repo
        os.system(f"cd {repo_dir} && git init && git add . && git commit -m 'init' 2>/dev/null")

        result = check_freshness(knowledge_dir, repo_dir, days=1)

        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 1
        assert result["summary"]["fresh"] == 0
        assert result["stale_entries"][0]["reason"] == "source_missing"
    print("PASS: test_check_freshness_missing_source")


def run_all():
    test_compute_file_hash()
    test_compute_file_hash_nonexistent()
    test_load_knowledge_entries_empty()
    test_load_knowledge_entries_from_index()
    test_load_knowledge_entries_from_yaml()
    test_check_freshness_no_changes()
    test_check_freshness_with_hash_mismatch()
    test_check_freshness_missing_source()
    print("\nAll 8 tests passed!")


if __name__ == "__main__":
    run_all()