Compare commits

2 Commits

55 changed files with 134 additions and 10866 deletions

@@ -1,22 +0,0 @@
name: Test
on:
  pull_request:
  push:
    branches: [main]
jobs:
  pytest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install test dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run test suite
        run: |
          make test

.gitignore

@@ -1,2 +0,0 @@
__pycache__/
*.pyc

GENOME.md

@@ -1,251 +0,0 @@
# GENOME.md — compounding-intelligence
**Generated:** 2026-04-17
**Repo:** Timmy_Foundation/compounding-intelligence
**Description:** Turn 1B+ daily agent tokens into durable, compounding fleet intelligence.
---
## Project Overview
Every agent session starts at zero. The same HTTP 405 gets rediscovered as a branch protection issue. The same token path gets searched from scratch. Intelligence evaporates when the session ends.
Compounding-intelligence solves this with three pipelines forming a loop:
```
SESSION ENDS → HARVESTER → KNOWLEDGE STORE → BOOTSTRAPPER → NEW SESSION STARTS SMARTER
MEASURER → Prove it's working
```
**Status:** Active development. Core pipelines implemented. 20+ scripts, 14 test files, knowledge store populated with real data.
## Architecture
```mermaid
graph TD
TRANS[Session Transcripts<br/>~/.hermes/sessions/*.jsonl] --> READER[session_reader.py]
READER --> HARVESTER[harvester.py]
HARVESTER -->|LLM extraction| PROMPT[harvest-prompt.md]
HARVESTER --> DEDUP["deduplicate()"]
DEDUP --> INDEX[knowledge/index.json]
DEDUP --> GLOBAL[knowledge/global/*.yaml]
DEDUP --> REPO[knowledge/repos/*.yaml]
INDEX --> BOOTSTRAPPER[bootstrapper.py]
BOOTSTRAPPER -->|filter + rank + truncate| CONTEXT[Bootstrap Context<br/>2k token injection]
CONTEXT --> SESSION[New Session starts smarter]
INDEX --> VALIDATOR[validate_knowledge.py]
INDEX --> STALENESS[knowledge_staleness_check.py]
INDEX --> GAPS[knowledge_gap_identifier.py]
TRANS --> SAMPLER[sampler.py]
SAMPLER -->|score + rank| BEST[High-value sessions]
BEST --> HARVESTER
TRANS --> METADATA[session_metadata.py]
METADATA --> SUMMARY[SessionSummary objects]
KNOWLEDGE --> DIFF[diff_analyzer.py]
DIFF --> PROPOSALS[improvement_proposals.py]
PROPOSALS --> PRIORITIES[priority_rebalancer.py]
```
## Entry Points
### Core Pipelines
| Script | Purpose | Key Functions |
|--------|---------|---------------|
| `harvester.py` | Extract knowledge from session transcripts | `harvest_session()`, `call_llm()`, `deduplicate()`, `validate_fact()` |
| `bootstrapper.py` | Build pre-session context from knowledge store | `build_bootstrap_context()`, `filter_facts()`, `sort_facts()`, `truncate_to_tokens()` |
| `session_reader.py` | Parse JSONL session transcripts | `read_session()`, `extract_conversation()`, `messages_to_text()` |
| `sampler.py` | Score and rank sessions for harvesting value | `scan_session_fast()`, `score_session()` |
| `session_metadata.py` | Extract structured metadata from sessions | `extract_session_metadata()`, `SessionSummary` |
### Analysis & Quality
| Script | Purpose |
|--------|---------|
| `validate_knowledge.py` | Validate knowledge index schema compliance |
| `knowledge_staleness_check.py` | Detect stale knowledge (source changed since extraction) |
| `knowledge_gap_identifier.py` | Find untested functions, undocumented APIs, missing tests |
| `diff_analyzer.py` | Analyze code diffs for improvement signals |
| `improvement_proposals.py` | Generate ranked improvement proposals |
| `priority_rebalancer.py` | Rebalance priorities across proposals |
| `automation_opportunity_finder.py` | Find manual steps that can be automated |
| `dead_code_detector.py` | Detect unused code |
| `dependency_graph.py` | Map dependency relationships |
| `perf_bottleneck_finder.py` | Find performance bottlenecks |
| `refactoring_opportunity_finder.py` | Identify refactoring targets |
| `gitea_issue_parser.py` | Parse Gitea issues for knowledge extraction |
### Automation
| Script | Purpose |
|--------|---------|
| `session_pair_harvester.py` | Extract training pairs from sessions |
## Data Flow
```
1. Session ends → .jsonl written to ~/.hermes/sessions/
2. sampler.py scores sessions by age, recency, repo coverage
3. harvester.py reads top sessions, calls LLM with harvest-prompt.md
4. LLM extracts facts/pitfalls/patterns/quirks/questions
5. deduplicate() checks against existing index via fact_fingerprint()
6. validate_fact() checks schema compliance
7. write_knowledge() appends to knowledge/index.json + per-repo YAML
8. On next session start, bootstrapper.py:
   a. Loads knowledge/index.json
   b. Filters by session's repo and agent type
   c. Sorts by confidence (high first), then recency
   d. Truncates to 2k token budget
   e. Injects as pre-context
9. Agent starts with full situational awareness instead of zero
```
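Step 8 (load, filter, sort, truncate, inject) can be sketched as below. This is a minimal illustration and not the code in `bootstrapper.py`: the function names come from the Entry Points table, but the signatures, the `domain` field used for filtering (taken from the index schema), and the 4-characters-per-token estimate are all assumptions.

```python
from typing import Dict, List, Optional


def filter_facts(facts: List[Dict], repo: str, agent: Optional[str] = None) -> List[Dict]:
    """Keep facts that apply globally, to this repo, or to this agent type."""
    wanted = {"global", repo}
    if agent:
        wanted.add(agent)
    return [f for f in facts if f.get("domain") in wanted]


def sort_facts(facts: List[Dict]) -> List[Dict]:
    """Order by confidence (high first), then by most recent date.

    ISO-8601 date strings compare correctly as plain strings.
    """
    def key(f: Dict):
        recency = f.get("last_confirmed") or f.get("first_seen") or ""
        return (f.get("confidence", 0.0), recency)

    return sorted(facts, key=key, reverse=True)


def truncate_to_tokens(lines: List[str], budget: int = 2000) -> List[str]:
    """Keep whole lines until the budget is spent.

    Assumption: roughly 4 characters per token; a real tokenizer would differ.
    """
    kept, used = [], 0
    for line in lines:
        cost = max(1, len(line) // 4)
        if used + cost > budget:
            break
        kept.append(line)
        used += cost
    return kept


def build_bootstrap_context(index: Dict, repo: str,
                            agent: Optional[str] = None,
                            budget: int = 2000) -> str:
    """Turn the knowledge index into a short pre-context string."""
    facts = sort_facts(filter_facts(index.get("facts", []), repo, agent))
    lines = [f"- [{f['category']}] {f['fact']}" for f in facts]
    return "\n".join(truncate_to_tokens(lines, budget))
```

Truncating on whole lines, rather than mid-fact, keeps every injected entry readable at the cost of leaving a little of the budget unused.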
## Key Abstractions
### Knowledge Item (fact/pitfall/pattern/quirk/question)
```json
{
"fact": "Gitea token is at ~/.config/gitea/token",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.9,
"evidence": "Found during clone attempt",
"source_session": "2026-04-13_abc123",
"extracted_at": "2026-04-13T20:00:00Z"
}
```
### SessionSummary (session_metadata.py)
Extracted metadata per session: duration, token count, tools used, repos touched, error count, outcome.
### Gap / GapReport (knowledge_gap_identifier.py)
Structured gap analysis: untested functions, undocumented APIs, missing tests. Severity: critical/high/medium/low.
### Knowledge Index (knowledge/index.json)
Machine-readable fact store. 12KB, populated with real data. Categories: fact, pitfall, pattern, tool-quirk, question.
## Knowledge Store
```
knowledge/
├── index.json # Master fact store (12KB, populated)
├── SCHEMA.md # Schema documentation
├── global/
│ ├── pitfalls.yaml # Cross-repo pitfalls (2KB)
│ └── tool-quirks.yaml # Tool-specific quirks (2KB)
├── repos/
│ ├── hermes-agent.yaml # hermes-agent knowledge (2KB)
│ └── the-nexus.yaml # the-nexus knowledge (2KB)
└── agents/ # Per-agent knowledge (empty)
```
## API Surface
### LLM API (consumed)
| Provider | Endpoint | Usage |
|----------|----------|-------|
| Nous Research | `https://inference-api.nousresearch.com/v1` | Knowledge extraction |
| Ollama | `http://localhost:11434/v1` | Local fallback |
### File API (consumed/produced)
| Path | Format | Direction |
|------|--------|-----------|
| `~/.hermes/sessions/*.jsonl` | JSONL | Input (session transcripts) |
| `knowledge/index.json` | JSON | Output (master fact store) |
| `knowledge/global/*.yaml` | YAML | Output (cross-repo knowledge) |
| `knowledge/repos/*.yaml` | YAML | Output (per-repo knowledge) |
| `templates/harvest-prompt.md` | Markdown | Config (extraction prompt) |
## Test Coverage
**14 test files** covering core pipelines:
| Test File | Covers |
|-----------|--------|
| `test_harvest_prompt.py` | Prompt validation, hallucination detection |
| `test_harvest_prompt_comprehensive.py` | Extended prompt testing |
| `test_harvester_pipeline.py` | Harvester extraction + dedup |
| `test_bootstrapper.py` | Context building, filtering, truncation |
| `test_session_pair_harvester.py` | Training pair extraction |
| `test_improvement_proposals.py` | Proposal generation |
| `test_priority_rebalancer.py` | Priority scoring |
| `test_knowledge_staleness.py` | Staleness detection |
| `test_automation_opportunity_finder.py` | Automation detection |
| `test_diff_analyzer.py` | Diff analysis |
| `test_gitea_issue_parser.py` | Issue parsing |
| `test_refactoring_opportunity_finder.py` | Refactoring signals |
| `test_knowledge_gap_identifier.py` | Gap analysis |
| `test_perf_bottleneck_finder.py` | Perf bottleneck detection |
### Coverage Gaps
1. **session_reader.py** — No dedicated test file (tested indirectly)
2. **sampler.py** — No test file (scoring logic untested)
3. **session_metadata.py** — No test file
4. **validate_knowledge.py** — No test file
5. **knowledge_staleness_check.py** — Tested but limited
## Security Considerations
### API Key Handling
- `harvester.py` reads API key from `~/.hermes/auth.json` or env vars
- Key passed to LLM API in request headers only
- No key logging
### Knowledge Integrity
- `validate_fact()` checks schema before writing
- `deduplicate()` prevents duplicate entries via fingerprint
- `knowledge_staleness_check.py` detects when source code changed but knowledge didn't
- Confidence scores prevent low-quality knowledge from polluting the store
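
The fingerprint-based deduplication named above might look roughly like this. It is a sketch only: `fact_fingerprint()` and `deduplicate()` are named in this document, but hashing the whitespace-collapsed, lowercased `fact` text with SHA-256 is an assumption about how the fingerprint is built.

```python
import hashlib
import re
from typing import Dict, Iterable, List


def fact_fingerprint(fact: Dict) -> str:
    """Hash the normalized fact text (assumed normalization: collapse
    whitespace, strip, lowercase)."""
    text = re.sub(r"\s+", " ", fact.get("fact", "")).strip().lower()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def deduplicate(new_facts: Iterable[Dict], existing: Iterable[Dict]) -> List[Dict]:
    """Return only the new facts whose fingerprint is not already known."""
    seen = {fact_fingerprint(f) for f in existing}
    unique = []
    for fact in new_facts:
        fingerprint = fact_fingerprint(fact)
        if fingerprint not in seen:
            seen.add(fingerprint)  # also catches duplicates within new_facts
            unique.append(fact)
    return unique
```

An exact-text fingerprint only catches near-identical wording; two differently phrased versions of the same fact would both be kept.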
### File Safety
- Knowledge writes are append-only (never deletes)
- Bootstrap context is truncated to budget (no prompt injection via knowledge)
- Session reader handles malformed JSONL gracefully
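
One way to handle malformed JSONL gracefully is to skip bad lines and count them instead of aborting. The helper below is hypothetical (`read_jsonl_lenient` is not a function from `session_reader.py`), and returning a skip count alongside the records is an assumption made for illustration.

```python
import json
from typing import Dict, List, Tuple


def read_jsonl_lenient(path: str) -> Tuple[List[Dict], int]:
    """Read a JSONL file, skipping lines that are not JSON objects.

    Returns (records, skipped) so callers can log how much was dropped.
    """
    records: List[Dict] = []
    skipped = 0
    # errors="replace" keeps a stray invalid byte from aborting the whole read
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # blank lines are not counted as errors
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                skipped += 1
                continue
            if isinstance(record, dict):
                records.append(record)
            else:
                skipped += 1  # valid JSON, but not a message object
    return records, skipped
```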
## File Index
```
scripts/
harvester.py (473 lines) — Core knowledge extraction
bootstrapper.py (302 lines) — Pre-session context builder
session_reader.py (137 lines) — JSONL session parser
sampler.py (363 lines) — Session scoring + ranking
session_metadata.py (271 lines) — Session metadata extraction
validate_knowledge.py (44 lines) — Index validation
knowledge_staleness_check.py (125 lines) — Staleness detection
knowledge_gap_identifier.py (291 lines) — Gap analysis engine
diff_analyzer.py (203 lines) — Diff analysis
improvement_proposals.py (518 lines) — Proposal generation
priority_rebalancer.py (745 lines) — Priority scoring
automation_opportunity_finder.py (600 lines) — Automation detection
dead_code_detector.py (270 lines) — Dead code detection
dependency_graph.py (220 lines) — Dependency mapping
perf_bottleneck_finder.py (635 lines) — Perf analysis
refactoring_opportunity_finder.py (46 lines) — Refactoring signals
gitea_issue_parser.py (140 lines) — Gitea issue parsing
session_pair_harvester.py (224 lines) — Training pair extraction
knowledge/
index.json (12KB) — Master fact store
SCHEMA.md (3KB) — Schema docs
global/pitfalls.yaml (2KB) — Cross-repo pitfalls
global/tool-quirks.yaml (2KB) — Tool quirks
repos/hermes-agent.yaml (2KB) — Repo-specific knowledge
repos/the-nexus.yaml (2KB) — Repo-specific knowledge
templates/
harvest-prompt.md (4KB) — Extraction prompt
test_sessions/ (5 files) — Sample transcripts
tests/ + scripts/test_* (14 files)— Test suite
```
**Total:** ~6,500 lines of code across 18 scripts + 14 test files.
---
*Generated by Codebase Genome pipeline — Issue #676*

@@ -1,4 +0,0 @@
.PHONY: test
test:
	python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v

@@ -1,114 +0,0 @@
# Knowledge File Format Specification
**Version:** 1
**Issue:** #10
**Status:** Draft
---
## Overview
The knowledge system has two layers:
1. **index.json** — Machine-readable fact index. Fast lookups by ID, category, repo, tags.
2. **Knowledge files** (YAML) — Human-readable, editable facts organized by domain.
The harvester writes to both. The bootstrapper reads from index.json. Humans edit the YAML files directly.
---
## index.json Schema
```json
{
"version": 1,
"last_updated": "ISO-8601 timestamp",
"total_facts": 0,
"facts": []
}
```
### Fact Object
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | yes | Unique identifier: `{domain}:{category}:{sequence}` |
| `fact` | string | yes | One-sentence description of the knowledge |
| `category` | enum | yes | One of: `fact`, `pitfall`, `pattern`, `tool-quirk`, `question` |
| `domain` | string | yes | Where this applies: repo name, `global`, or agent name |
| `confidence` | float | yes | 0.0-1.0. How certain is this knowledge? |
| `tags` | string[] | no | Searchable labels |
| `source_count` | int | no | How many sessions confirmed this fact |
| `first_seen` | date | no | ISO-8601 date first extracted |
| `last_confirmed` | date | no | ISO-8601 date last seen in a session |
| `expires` | date | no | Optional. After this date, fact is stale |
| `related` | string[] | no | IDs of related facts |
### ID Format: `{domain}:{category}:{sequence}`
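A small parser makes the ID format concrete. This is a sketch under two assumptions that the specification does not state: the sequence is purely numeric (as in examples such as `global:tool-quirk:001`), and neither domain nor category contains a colon. `parse_fact_id` and `FactId` are hypothetical names.

```python
import re
from typing import NamedTuple

# Categories as listed in this specification.
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

# Assumption: numeric sequence, no colons inside domain or category.
ID_PATTERN = re.compile(r"^(?P<domain>[^:]+):(?P<category>[^:]+):(?P<sequence>\d+)$")


class FactId(NamedTuple):
    domain: str
    category: str
    sequence: int


def parse_fact_id(raw: str) -> FactId:
    """Split '{domain}:{category}:{sequence}' and validate the category."""
    match = ID_PATTERN.match(raw)
    if match is None:
        raise ValueError(f"malformed fact id: {raw!r}")
    if match["category"] not in CATEGORIES:
        raise ValueError(f"unknown category in fact id: {raw!r}")
    return FactId(match["domain"], match["category"], int(match["sequence"]))
```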
### Categories
| Category | Definition |
|----------|------------|
| `fact` | Concrete, verifiable information |
| `pitfall` | Errors, wrong assumptions, time-wasters |
| `pattern` | Successful sequences of actions |
| `tool-quirk` | Environment-specific behaviors |
| `question` | Identified but unanswered |
### Confidence Scoring
| Range | Meaning |
|-------|---------|
| 0.9-1.0 | Explicitly stated and verified |
| 0.7-0.8 | Clearly implied by multiple data points |
| 0.5-0.6 | Suggested but not fully verified |
| 0.3-0.4 | Inferred from limited data |
| 0.1-0.2 | Speculative or uncertain |
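
The table can be read as a lookup from score to meaning. The sketch below assumes each row's lower bound acts as a threshold, so that in-between values such as 0.85 fall into the band beneath them, and that values under 0.1 count as speculative. The specification does not define either case, and `confidence_band` is a hypothetical helper.

```python
def confidence_band(confidence: float) -> str:
    """Map a confidence score to the meaning given in the scoring table.

    Assumption: lower bounds are thresholds; values below 0.1 are treated
    as speculative.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    bands = [
        (0.9, "Explicitly stated and verified"),
        (0.7, "Clearly implied by multiple data points"),
        (0.5, "Suggested but not fully verified"),
        (0.3, "Inferred from limited data"),
    ]
    for lower_bound, meaning in bands:
        if confidence >= lower_bound:
            return meaning
    return "Speculative or uncertain"
```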
---
## Directory Structure
```
knowledge/
├── index.json # Machine-readable fact index
├── SCHEMA.md # This file
├── global/ # Cross-repo knowledge
│ ├── pitfalls.yaml
│ ├── patterns.yaml
│ └── tool-quirks.yaml
├── repos/ # Per-repo knowledge
│ ├── {repo-name}.yaml
│ └── ...
└── agents/ # Agent-type knowledge
└── {agent-type}.yaml
```
## YAML File Format
YAML files use frontmatter for metadata, then markdown sections with fact entries:
```yaml
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---
# Title
## Section
- id: global:tool-quirk:001
  fact: "Description"
  confidence: 0.95
  tags: [tag1, tag2]
  source_count: 5
  first_seen: "2026-03-27"
```
## Validation
Run `python scripts/validate_knowledge.py` to validate index.json.

@@ -1,80 +0,0 @@
---
domain: global
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (Global)
Cross-repo traps that waste time across the fleet.
## Git & Forge
- id: global:pitfall:001
fact: "Branch protection requires 1 approval on main - API merges fail with 405 without it"
confidence: 0.95
tags: [git, merge, branch-protection, gitea]
source_count: 12
first_seen: "2026-04-05"
last_confirmed: "2026-04-13"
related: [the-nexus:pitfall:001]
- id: global:pitfall:002
fact: "Never use --no-verify on git commits - it bypasses all hooks including safety checks"
confidence: 0.95
tags: [git, hooks, safety]
source_count: 5
first_seen: "2026-03-28"
last_confirmed: "2026-04-13"
- id: global:pitfall:003
fact: "Gitea PR creation workaround needed on the-nexus - direct API call fails, use alternative endpoint"
confidence: 0.9
tags: [gitea, pr, api, workaround]
source_count: 4
first_seen: "2026-04-06"
last_confirmed: "2026-04-12"
## Agent Operations
- id: global:pitfall:004
fact: "Anthropic is BANNED from fallback chain - if fallback triggers to Anthropic, something is wrong"
confidence: 0.95
tags: [provider, anthropic, fallback]
source_count: 7
first_seen: "2026-03-30"
last_confirmed: "2026-04-13"
- id: global:pitfall:005
fact: "Telegram tokens expired - don't assume Telegram notifications work without checking"
confidence: 0.85
tags: [telegram, notifications, token]
source_count: 3
first_seen: "2026-04-02"
- id: global:pitfall:006
fact: "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run"
confidence: 0.9
tags: [gateway, cron, process]
source_count: 4
first_seen: "2026-04-04"
last_confirmed: "2026-04-11"
## Testing
- id: global:pitfall:007
fact: "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory"
confidence: 0.9
tags: [pytest, test, collection]
source_count: 3
first_seen: "2026-04-07"
last_confirmed: "2026-04-13"
- id: global:pitfall:008
fact: "TDD: test 1 before building 55 - verify the cycle works before scaling"
confidence: 0.95
tags: [tdd, testing, methodology]
source_count: 8
first_seen: "2026-03-25"
last_confirmed: "2026-04-13"

@@ -1,71 +0,0 @@
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---
# Tool Quirks (Global)
## Authentication
- id: global:tool-quirk:001
fact: "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN"
confidence: 0.95
tags: [git, auth, gitea, token]
source_count: 23
first_seen: "2026-03-27"
last_confirmed: "2026-04-13"
related: [global:pitfall:001]
- id: global:tool-quirk:002
fact: "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer"
confidence: 0.9
tags: [git, api, gitea]
source_count: 8
first_seen: "2026-03-28"
last_confirmed: "2026-04-12"
- id: global:tool-quirk:003
fact: "Gitea Issues API type=issues param does NOT filter PRs - use truthiness check on pull_request field"
confidence: 0.95
tags: [gitea, api, issues, pr]
source_count: 6
first_seen: "2026-04-01"
last_confirmed: "2026-04-13"
## Paths & Environment
- id: global:tool-quirk:004
fact: "~/.hermes is the default hermes home - check get_hermes_home() not the path literal"
confidence: 0.9
tags: [paths, hermes, env]
source_count: 10
first_seen: "2026-03-30"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:005]
- id: global:tool-quirk:005
fact: "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin"
confidence: 0.85
tags: [ansible, vault, config]
source_count: 3
first_seen: "2026-04-02"
## Model & Inference
- id: global:tool-quirk:006
fact: "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available"
confidence: 0.95
tags: [model, provider, nous, default]
source_count: 15
first_seen: "2026-03-25"
last_confirmed: "2026-04-13"
- id: global:tool-quirk:007
fact: "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume"
confidence: 0.95
tags: [hermes, model, restart, session]
source_count: 8
first_seen: "2026-03-29"
last_confirmed: "2026-04-12"

@@ -1,472 +1,6 @@
{
"version": 1,
"last_updated": "2026-04-13T20:00:00Z",
"total_facts": 29,
"facts": [
{
"id": "hermes-agent:pitfall:001",
"fact": "deploy-crons.py leaves jobs in mixed model format",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.95,
"tags": [
"cron",
"deploy",
"model",
"config"
],
"source_count": 5,
"first_seen": "2026-04-08",
"last_confirmed": "2026-04-13",
"related": [
"hermes-agent:pitfall:002",
"hermes-agent:pitfall:003"
]
},
{
"id": "hermes-agent:pitfall:002",
"fact": "deploy-crons.py --deploy doesn't set legacy skill field from skills list",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"cron",
"deploy",
"skills"
],
"source_count": 3,
"first_seen": "2026-04-09",
"last_confirmed": "2026-04-13",
"related": [
"hermes-agent:pitfall:001"
]
},
{
"id": "hermes-agent:pitfall:003",
"fact": "Cron jobs with blank fallback_model fields trigger spurious gateway warnings",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"cron",
"model",
"fallback"
],
"source_count": 4,
"first_seen": "2026-04-07",
"last_confirmed": "2026-04-12",
"related": [
"hermes-agent:pitfall:001"
]
},
{
"id": "hermes-agent:pitfall:004",
"fact": "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"watchdog",
"model",
"config"
],
"source_count": 3,
"first_seen": "2026-04-08",
"last_confirmed": "2026-04-13"
},
{
"id": "hermes-agent:pitfall:005",
"fact": "10+ files read HERMES_HOME directly instead of get_hermes_home()",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.85,
"tags": [
"paths",
"env",
"hermes-home"
],
"source_count": 6,
"first_seen": "2026-04-06",
"last_confirmed": "2026-04-12",
"related": [
"global:pitfall:002"
]
},
{
"id": "hermes-agent:pitfall:006",
"fact": "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.8,
"tags": [
"paths",
"env",
"bug"
],
"source_count": 2,
"first_seen": "2026-04-05"
},
{
"id": "hermes-agent:pitfall:007",
"fact": "vps-agent-dispatch reports OK while remote hermes binary path is broken",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"ssh",
"dispatch",
"vps"
],
"source_count": 4,
"first_seen": "2026-04-07",
"last_confirmed": "2026-04-11"
},
{
"id": "hermes-agent:pitfall:008",
"fact": "nightwatch-health-monitor SSH check fails on cloud-model-only deployments",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.85,
"tags": [
"ssh",
"health",
"cloud"
],
"source_count": 2,
"first_seen": "2026-04-10"
},
{
"id": "the-nexus:pitfall:001",
"fact": "Merges fail with HTTP 405 due to branch protection",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.95,
"tags": [
"git",
"merge",
"branch-protection",
"gitea"
],
"source_count": 12,
"first_seen": "2026-04-05",
"last_confirmed": "2026-04-13",
"related": [
"global:pitfall:001"
]
},
{
"id": "the-nexus:pitfall:002",
"fact": "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.95,
"tags": [
"server",
"concurrency",
"bridge"
],
"source_count": 5,
"first_seen": "2026-04-10",
"last_confirmed": "2026-04-13"
},
{
"id": "the-nexus:pitfall:003",
"fact": "ChatLog.log() crashes on message persistence when index.html has orphaned button tags",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.9,
"tags": [
"html",
"crash",
"chatlog"
],
"source_count": 3,
"first_seen": "2026-04-12",
"last_confirmed": "2026-04-13"
},
{
"id": "the-nexus:pitfall:004",
"fact": "Three.js LOD not implemented - local hardware struggles with full scene",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.85,
"tags": [
"threejs",
"performance",
"lod"
],
"source_count": 4,
"first_seen": "2026-04-09",
"last_confirmed": "2026-04-13"
},
{
"id": "the-nexus:pitfall:005",
"fact": "Duplicate content blocks appear in index.html when PR merges conflict silently",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.8,
"tags": [
"html",
"merge-conflict",
"duplicate"
],
"source_count": 3,
"first_seen": "2026-04-11",
"last_confirmed": "2026-04-13"
},
{
"id": "the-nexus:pitfall:006",
"fact": "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.9,
"tags": [
"deploy",
"websocket",
"http",
"cors"
],
"source_count": 4,
"first_seen": "2026-04-10",
"last_confirmed": "2026-04-13"
},
{
"id": "global:tool-quirk:001",
"fact": "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"git",
"auth",
"gitea",
"token"
],
"source_count": 23,
"first_seen": "2026-03-27",
"last_confirmed": "2026-04-13",
"related": [
"global:pitfall:001"
]
},
{
"id": "global:tool-quirk:002",
"fact": "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.9,
"tags": [
"git",
"api",
"gitea"
],
"source_count": 8,
"first_seen": "2026-03-28",
"last_confirmed": "2026-04-12"
},
{
"id": "global:tool-quirk:003",
"fact": "Gitea Issues API type=issues param does NOT filter PRs",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"gitea",
"api",
"issues",
"pr"
],
"source_count": 6,
"first_seen": "2026-04-01",
"last_confirmed": "2026-04-13"
},
{
"id": "global:tool-quirk:004",
"fact": "~/.hermes is the default hermes home - check get_hermes_home() not the path literal",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.9,
"tags": [
"paths",
"hermes",
"env"
],
"source_count": 10,
"first_seen": "2026-03-30",
"last_confirmed": "2026-04-13",
"related": [
"hermes-agent:pitfall:005"
]
},
{
"id": "global:tool-quirk:005",
"fact": "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.85,
"tags": [
"ansible",
"vault",
"config"
],
"source_count": 3,
"first_seen": "2026-04-02"
},
{
"id": "global:tool-quirk:006",
"fact": "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"model",
"provider",
"nous",
"default"
],
"source_count": 15,
"first_seen": "2026-03-25",
"last_confirmed": "2026-04-13"
},
{
"id": "global:tool-quirk:007",
"fact": "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"hermes",
"model",
"restart",
"session"
],
"source_count": 8,
"first_seen": "2026-03-29",
"last_confirmed": "2026-04-12"
},
{
"id": "global:pitfall:001",
"fact": "Branch protection requires 1 approval on main - API merges fail with 405 without it",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"git",
"merge",
"branch-protection",
"gitea"
],
"source_count": 12,
"first_seen": "2026-04-05",
"last_confirmed": "2026-04-13",
"related": [
"the-nexus:pitfall:001"
]
},
{
"id": "global:pitfall:002",
"fact": "Never use --no-verify on git commits",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"git",
"hooks",
"safety"
],
"source_count": 5,
"first_seen": "2026-03-28",
"last_confirmed": "2026-04-13"
},
{
"id": "global:pitfall:003",
"fact": "Gitea PR creation workaround needed on the-nexus - direct API call fails",
"category": "pitfall",
"domain": "global",
"confidence": 0.9,
"tags": [
"gitea",
"pr",
"api",
"workaround"
],
"source_count": 4,
"first_seen": "2026-04-06",
"last_confirmed": "2026-04-12"
},
{
"id": "global:pitfall:004",
"fact": "Anthropic is BANNED from fallback chain",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"provider",
"anthropic",
"fallback"
],
"source_count": 7,
"first_seen": "2026-03-30",
"last_confirmed": "2026-04-13"
},
{
"id": "global:pitfall:005",
"fact": "Telegram tokens expired - don't assume Telegram notifications work",
"category": "pitfall",
"domain": "global",
"confidence": 0.85,
"tags": [
"telegram",
"notifications",
"token"
],
"source_count": 3,
"first_seen": "2026-04-02"
},
{
"id": "global:pitfall:006",
"fact": "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run",
"category": "pitfall",
"domain": "global",
"confidence": 0.9,
"tags": [
"gateway",
"cron",
"process"
],
"source_count": 4,
"first_seen": "2026-04-04",
"last_confirmed": "2026-04-11"
},
{
"id": "global:pitfall:007",
"fact": "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory",
"category": "pitfall",
"domain": "global",
"confidence": 0.9,
"tags": [
"pytest",
"test",
"collection"
],
"source_count": 3,
"first_seen": "2026-04-07",
"last_confirmed": "2026-04-13"
},
{
"id": "global:pitfall:008",
"fact": "TDD: test 1 before building 55",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"tdd",
"testing",
"methodology"
],
"source_count": 8,
"first_seen": "2026-03-25",
"last_confirmed": "2026-04-13"
}
]
"total_facts": 0,
"facts": []
}

@@ -1,80 +0,0 @@
---
domain: hermes-agent
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (hermes-agent)
## Cron & Deployment
- id: hermes-agent:pitfall:001
fact: "deploy-crons.py leaves jobs in mixed model format - some have provider/model, some just model"
confidence: 0.95
tags: [cron, deploy, model, config]
source_count: 5
first_seen: "2026-04-08"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:002, hermes-agent:pitfall:003]
- id: hermes-agent:pitfall:002
fact: "deploy-crons.py --deploy doesn't set legacy skill field from skills list"
confidence: 0.9
tags: [cron, deploy, skills]
source_count: 3
first_seen: "2026-04-09"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:001]
- id: hermes-agent:pitfall:003
fact: "Cron jobs with blank fallback_model fields trigger spurious gateway warnings"
confidence: 0.9
tags: [cron, model, fallback]
source_count: 4
first_seen: "2026-04-07"
last_confirmed: "2026-04-12"
related: [hermes-agent:pitfall:001]
- id: hermes-agent:pitfall:004
fact: "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms"
confidence: 0.9
tags: [watchdog, model, config]
source_count: 3
first_seen: "2026-04-08"
last_confirmed: "2026-04-13"
## Path & Environment
- id: hermes-agent:pitfall:005
fact: "10+ files read HERMES_HOME directly instead of get_hermes_home() - breaks on custom paths"
confidence: 0.85
tags: [paths, env, hermes-home]
source_count: 6
first_seen: "2026-04-06"
last_confirmed: "2026-04-12"
related: [global:pitfall:002]
- id: hermes-agent:pitfall:006
fact: "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set"
confidence: 0.8
tags: [paths, env, bug]
source_count: 2
first_seen: "2026-04-05"
## SSH & Dispatch
- id: hermes-agent:pitfall:007
fact: "vps-agent-dispatch reports OK while remote hermes binary path is broken"
confidence: 0.9
tags: [ssh, dispatch, vps]
source_count: 4
first_seen: "2026-04-07"
last_confirmed: "2026-04-11"
- id: hermes-agent:pitfall:008
fact: "nightwatch-health-monitor SSH check fails on cloud-model-only deployments"
confidence: 0.85
tags: [ssh, health, cloud]
source_count: 2
first_seen: "2026-04-10"

@@ -1,63 +0,0 @@
---
domain: the-nexus
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (the-nexus)
## Git & Merging
- id: the-nexus:pitfall:001
fact: "Merges fail with HTTP 405 due to branch protection - must use merge API with 1 approval"
confidence: 0.95
tags: [git, merge, branch-protection, gitea]
source_count: 12
first_seen: "2026-04-05"
last_confirmed: "2026-04-13"
related: [global:pitfall:001]
- id: the-nexus:pitfall:002
fact: "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests"
confidence: 0.95
tags: [server, concurrency, bridge]
source_count: 5
first_seen: "2026-04-10"
last_confirmed: "2026-04-13"
- id: the-nexus:pitfall:003
fact: "ChatLog.log() crashes on message persistence when index.html has orphaned button tags"
confidence: 0.9
tags: [html, crash, chatlog]
source_count: 3
first_seen: "2026-04-12"
last_confirmed: "2026-04-13"
## Three.js & Performance
- id: the-nexus:pitfall:004
fact: "Three.js LOD not implemented - local hardware struggles with full scene without texture optimization"
confidence: 0.85
tags: [threejs, performance, lod]
source_count: 4
first_seen: "2026-04-09"
last_confirmed: "2026-04-13"
- id: the-nexus:pitfall:005
fact: "Duplicate content blocks appear in index.html when PR merges conflict silently"
confidence: 0.8
tags: [html, merge-conflict, duplicate]
source_count: 3
first_seen: "2026-04-11"
last_confirmed: "2026-04-13"
## Deployment
- id: the-nexus:pitfall:006
fact: "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS"
confidence: 0.9
tags: [deploy, websocket, http, cors]
source_count: 4
first_seen: "2026-04-10"
last_confirmed: "2026-04-13"

@@ -1,297 +0,0 @@
#!/usr/bin/env python3
"""
quality_gate.py — Score and filter knowledge entries.
Scores each entry on 4 dimensions:
- Specificity: concrete examples vs vague generalities
- Actionability: can this be used to do something?
- Freshness: is this still accurate?
- Source quality: was the model/provider reliable?
Usage:
from quality_gate import score_entry, filter_entries, quality_report
score = score_entry(entry)
filtered = filter_entries(entries, threshold=0.5)
report = quality_report(entries)
"""
import json
import math
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Any, Optional
# Source quality scores (higher = more reliable)
SOURCE_QUALITY = {
    "claude-sonnet": 0.9,
    "claude-opus": 0.95,
    "gpt-4": 0.85,
    "gpt-4-turbo": 0.85,
    "gpt-5": 0.9,
    "mimo-v2-pro": 0.8,
    "gemini-pro": 0.8,
    "llama-3-70b": 0.75,
    "llama-3-8b": 0.7,
    "ollama": 0.6,
    "unknown": 0.5,
}
DEFAULT_SOURCE_QUALITY = 0.5

# Specificity indicators
SPECIFIC_INDICATORS = [
    r"\b\d+\.\d+",  # decimal numbers
    r"\b\d{4}-\d{2}-\d{2}",  # dates
    r"\b[A-Z][a-z]+\s[A-Z][a-z]+",  # proper nouns
    r"`[^`]+`",  # code/commands
    r"https?://",  # URLs
    r"\b(example|instance|specifically|concretely)\b",
    r"\b(step \d|first|second|third)\b",
    r"\b(exactly|precisely|measured|counted)\b",
]

# Vagueness indicators (penalty)
VAGUE_INDICATORS = [
    r"\b(generally|usually|often|sometimes|might|could|perhaps)\b",
    r"\b(various|several|many|some|few)\b",
    r"\b(it depends|varies|differs)\b",
    r"\b(basically|essentially|fundamentally)\b",
    r"\b(everyone knows|it's obvious|clearly)\b",
]

# Actionability indicators
ACTIONABLE_INDICATORS = [
    r"\b(run|execute|install|deploy|configure|set up)\b",
    r"\b(use|apply|implement|create|build)\b",
    r"\b(check|verify|test|validate|confirm)\b",
    r"\b(fix|resolve|solve|debug|troubleshoot)\b",
    r"\b(if .+ then|when .+ do|to .+ use)\b",
    r"```[a-z]*\n",  # code blocks
    r"\$\s",  # shell commands
    r"\b\d+\.\s",  # numbered steps
]
def score_specificity(content: str) -> float:
"""Score specificity: 0=vague, 1=very specific."""
content_lower = content.lower()
score = 0.5 # baseline
# Check for specific indicators
specific_count = sum(
len(re.findall(p, content, re.IGNORECASE))
for p in SPECIFIC_INDICATORS
)
# Check for vague indicators
vague_count = sum(
len(re.findall(p, content_lower))
for p in VAGUE_INDICATORS
)
# Adjust score
score += min(specific_count * 0.05, 0.4)
score -= min(vague_count * 0.08, 0.3)
# Length bonus (longer = more detail, up to a point)
word_count = len(content.split())
if word_count > 50:
score += min((word_count - 50) * 0.001, 0.1)
return max(0.0, min(1.0, score))
def score_actionability(content: str) -> float:
"""Score actionability: 0=abstract, 1=highly actionable."""
content_lower = content.lower()
score = 0.3 # baseline (most knowledge is informational)
# Check for actionable indicators
actionable_count = sum(
len(re.findall(p, content_lower))
for p in ACTIONABLE_INDICATORS
)
score += min(actionable_count * 0.1, 0.6)
# Code blocks are highly actionable
if "```" in content:
score += 0.2
# Numbered steps are actionable
if re.search(r"\d+\.\s+\w", content):
score += 0.1
return max(0.0, min(1.0, score))
def score_freshness(timestamp: Optional[str]) -> float:
    """Score freshness: 1=new, decays over time."""
    if not timestamp:
        return 0.5
    try:
        if isinstance(timestamp, str):
            ts = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        else:
            ts = timestamp
        # Treat naive timestamps (e.g. date-only strings) as UTC so the
        # subtraction below does not raise TypeError.
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)
        now = datetime.now(timezone.utc)
        age_days = (now - ts).days
        # Exponential decay with a 180-day time constant: 1.0 at day 0,
        # ~0.37 at 180 days, ~0.13 at 365 days, floored at 0.1.
        score = math.exp(-age_days / 180)
        return max(0.1, min(1.0, score))
    except (ValueError, TypeError, AttributeError):
        return 0.5
def score_source_quality(model: Optional[str]) -> float:
"""Score source quality based on model/provider."""
if not model:
return DEFAULT_SOURCE_QUALITY
# Normalize model name
model_lower = model.lower()
for key, score in SOURCE_QUALITY.items():
if key in model_lower:
return score
return DEFAULT_SOURCE_QUALITY
def score_entry(entry: dict) -> float:
"""
Score a knowledge entry on quality (0.0-1.0).
Weights:
- specificity: 0.3
- actionability: 0.3
- freshness: 0.2
- source_quality: 0.2
"""
content = entry.get("content", entry.get("text", entry.get("response", "")))
model = entry.get("model", entry.get("provenance", {}).get("model"))
timestamp = entry.get("timestamp", entry.get("provenance", {}).get("timestamp"))
specificity = score_specificity(content)
actionability = score_actionability(content)
freshness = score_freshness(timestamp)
source = score_source_quality(model)
return round(
0.3 * specificity +
0.3 * actionability +
0.2 * freshness +
0.2 * source,
4
)
def score_entry_detailed(entry: dict) -> dict:
"""Score with breakdown."""
content = entry.get("content", entry.get("text", entry.get("response", "")))
model = entry.get("model", entry.get("provenance", {}).get("model"))
timestamp = entry.get("timestamp", entry.get("provenance", {}).get("timestamp"))
specificity = score_specificity(content)
actionability = score_actionability(content)
freshness = score_freshness(timestamp)
source = score_source_quality(model)
return {
"score": round(0.3 * specificity + 0.3 * actionability + 0.2 * freshness + 0.2 * source, 4),
"specificity": round(specificity, 4),
"actionability": round(actionability, 4),
"freshness": round(freshness, 4),
"source_quality": round(source, 4),
}
def filter_entries(entries: List[dict], threshold: float = 0.5) -> List[dict]:
"""Keep only entries whose score is at or above the quality threshold."""
filtered = []
for entry in entries:
if score_entry(entry) >= threshold:
filtered.append(entry)
return filtered
def quality_report(entries: List[dict]) -> str:
"""Generate quality distribution report."""
if not entries:
return "No entries to analyze."
scores = [score_entry(e) for e in entries]
avg = sum(scores) / len(scores)
min_score = min(scores)
max_score = max(scores)
# Distribution buckets
buckets = {"high": 0, "medium": 0, "low": 0, "rejected": 0}
for s in scores:
if s >= 0.7:
buckets["high"] += 1
elif s >= 0.5:
buckets["medium"] += 1
elif s >= 0.3:
buckets["low"] += 1
else:
buckets["rejected"] += 1
lines = [
"=" * 50,
" QUALITY GATE REPORT",
"=" * 50,
f" Total entries: {len(entries)}",
f" Average score: {avg:.3f}",
f" Min: {min_score:.3f}",
f" Max: {max_score:.3f}",
"",
" Distribution:",
]
for bucket, count in buckets.items():
pct = count / len(entries) * 100
bar = "█" * int(pct / 5)
lines.append(f" {bucket:<12} {count:>5} ({pct:>5.1f}%) {bar}")
passed = buckets["high"] + buckets["medium"]
lines.append(f"\n Pass rate (>= 0.5): {passed}/{len(entries)} ({passed/len(entries)*100:.1f}%)")
lines.append("=" * 50)
return "\n".join(lines)
def main():
import argparse
parser = argparse.ArgumentParser(description="Knowledge quality gate")
parser.add_argument("files", nargs="+", help="JSONL files to score")
parser.add_argument("--threshold", type=float, default=0.5, help="Quality threshold")
parser.add_argument("--json", action="store_true", help="JSON output")
parser.add_argument("--filter", action="store_true", help="Report how many entries pass the threshold")
args = parser.parse_args()
all_entries = []
for filepath in args.files:
with open(filepath) as f:
for line in f:
if line.strip():
all_entries.append(json.loads(line))
if args.json:
results = [{"entry": e, **score_entry_detailed(e)} for e in all_entries]
print(json.dumps(results, indent=2))
elif args.filter:
filtered = filter_entries(all_entries, args.threshold)
print(f"Kept {len(filtered)}/{len(all_entries)} entries (threshold: {args.threshold})")
else:
print(quality_report(all_entries))
if __name__ == "__main__":
main()
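
The scoring in `quality_gate.py` combines four dimension scores with fixed weights, and freshness decays as `exp(-age_days / 180)` with a floor of 0.1. The sketch below reproduces just those two pieces so the numbers can be inspected in isolation. The helper names `freshness` and `combine` and the example dimension scores are assumptions for illustration; only the weights, the 180-day constant, and the clamping come from the file above.

```python
import math

# Weights as documented in score_entry's docstring.
WEIGHTS = {
    "specificity": 0.3,
    "actionability": 0.3,
    "freshness": 0.2,
    "source_quality": 0.2,
}


def freshness(age_days: float, scale_days: float = 180.0, floor: float = 0.1) -> float:
    """exp(-age/scale), clamped to [floor, 1.0], mirroring score_freshness."""
    return max(floor, min(1.0, math.exp(-age_days / scale_days)))


def combine(scores: dict) -> float:
    """Weighted sum of the four dimension scores, rounded to 4 places."""
    return round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS), 4)


# Print the decay curve at a few ages to see where the floor takes over.
for age in (0, 125, 180, 365, 730):
    print(f"{age:>4} days -> freshness {freshness(age):.3f}")

# Hypothetical dimension scores for a 30-day-old entry.
example = {
    "specificity": 0.7,
    "actionability": 0.5,
    "freshness": freshness(30),
    "source_quality": 0.9,
}
print("combined:", combine(example))
```

With a 180-day time constant the half-life is 180 · ln 2, or about 125 days, and the score at 180 days is e⁻¹, roughly 0.37. A curve that reached 0.5 at 180 days would need a time constant of about 260 days instead.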


@@ -1 +0,0 @@
pytest>=8,<9


@@ -1,554 +0,0 @@
#!/usr/bin/env python3
"""
Automation Opportunity Finder — Scan fleet for manual processes that could be automated.
Analyzes:
1. Cron jobs — finds manual steps between scheduled tasks
2. Documentation — extracts TODO/FIXME/manual-step patterns
3. Scripts — detects repeated command sequences
4. Session transcripts — finds repeated tool-call patterns
Usage:
python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes
python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes --json
python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes --output proposals.json
"""
import argparse
import json
import os
import re
import sys
from collections import Counter, defaultdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
# ---------------------------------------------------------------------------
# Patterns that signal manual work
# ---------------------------------------------------------------------------
MANUAL_STEP_PATTERNS = [
# Explicit manual markers
(r"(?i)\bTODO[:\s]", "todo"),
(r"(?i)\bFIXME[:\s]", "fixme"),
(r"(?i)\bMANUAL[:\s]", "manual_step"),
(r"(?i)\bHACK[:\s]", "hack"),
(r"(?i)\bWORKAROUND[:\s]", "workaround"),
# Step-by-step instructions in docs
(r"(?i)^(\d+)[.\)]\s+(run|execute|ssh|scp|curl|cd|make|docker|ansible|git)", "sequential_step"),
# Explicitly manual operations
(r"(?i)\bmanually\b", "manual_keyword"),
(r"(?i)\bby hand\b", "manual_keyword"),
(r"(?i)\bdon\'?t forget to\b", "manual_keyword"),
(r"(?i)\bremember to\b", "manual_keyword"),
(r"(?i)\bmake sure to\b", "manual_keyword"),
]
# Shell commands that appear frequently in runbooks — signal automatable workflows
SHELL_COMMAND_PATTERNS = [
"ssh ", "scp ", "rsync ", "curl ", "wget ",
"docker ", "docker-compose ", "kubectl ",
"ansible-playbook ", "terraform ", "systemctl ",
"systemctl restart", "systemctl status",
"git push", "git pull", "git merge", "git checkout",
"pip install", "npm install", "cargo build",
]
# Directories to skip during scans — large/uninteresting trees
EXCLUDE_DIRS = frozenset({
"node_modules", "venv", ".venv", "__pycache__", ".git",
"site-packages", "dist", "build", ".tox", ".mypy_cache",
".pytest_cache", "coverage", ".next", "vendor",
"skills", # hermes skills dir is huge
"audio_cache", "skins", "profiles",
})
# Session tool calls that appear repeatedly — candidates for workflow automation
TOOL_SEQUENCE_MIN_OCCURRENCES = 3
# ---------------------------------------------------------------------------
# Analyzers
# ---------------------------------------------------------------------------
def analyze_cron_jobs(hermes_home: str) -> List[Dict[str, Any]]:
"""Analyze cron job definitions for automation gaps."""
proposals = []
cron_dir = Path(hermes_home) / "cron"
jobs_file = cron_dir / "jobs.json"
if not jobs_file.exists():
# Try YAML format
for ext in (".yaml", ".yml"):
alt = cron_dir / f"jobs{ext}"
if alt.exists():
jobs_file = alt
break
if not jobs_file.exists():
return proposals
try:
if jobs_file.suffix == ".json":
with open(jobs_file) as f:
jobs = json.load(f)
else:
try:
import yaml
with open(jobs_file) as f:
jobs = yaml.safe_load(f)
except ImportError:
return proposals
except Exception: # covers json.JSONDecodeError, YAML parse errors, and I/O errors
return proposals
if not isinstance(jobs, list):
return proposals
# Look for disabled jobs (someone turned them off — might need a different approach)
disabled = [j for j in jobs if not j.get("enabled", True)]
if disabled:
names = [j.get("name", j.get("id", "?")) for j in disabled[:5]]
proposals.append({
"category": "cron_disabled",
"title": f"{len(disabled)} disabled cron job(s) may need automation rework",
"description": f"These jobs were disabled: {', '.join(names)}. Investigate why and whether a different automation approach is needed.",
"confidence": 0.7,
"impact": "medium",
"sources": [str(jobs_file)],
})
# Look for jobs whose most recent run ended in an error
error_jobs = [j for j in jobs if j.get("last_status") == "error"]
if error_jobs:
names = [j.get("name", j.get("id", "?")) for j in error_jobs[:5]]
proposals.append({
"category": "cron_errors",
"title": f"{len(error_jobs)} cron job(s) failing — may need automation rework",
"description": f"Jobs with errors: {', '.join(names)}. Failure patterns suggest missing dependencies or fragile automation.",
"confidence": 0.8,
"impact": "high",
"sources": [str(jobs_file)],
})
# Look for jobs with delivery errors (platform issues)
delivery_errors = [j for j in jobs if j.get("last_delivery_error")]
if delivery_errors:
proposals.append({
"category": "cron_delivery",
"title": f"{len(delivery_errors)} cron job(s) have delivery failures",
"description": "Delivery failures suggest missing retry logic or platform integration gaps.",
"confidence": 0.75,
"impact": "medium",
"sources": [str(jobs_file)],
})
# Look for jobs on short intervals that could be event-driven
for job in jobs:
schedule = job.get("schedule", "")
# Check for very frequent schedules (minute field */1 through */5)
if isinstance(schedule, str) and re.match(r"^\*\/([1-5])\s", schedule):
proposals.append({
"category": "cron_frequency",
"title": f"Job '{job.get('name', job.get('id', '?'))}' runs every {schedule.split()[0]} — consider event-driven",
"description": f"High-frequency cron ({schedule}) may be better as event-driven or daemon.",
"confidence": 0.6,
"impact": "low",
"sources": [str(jobs_file)],
})
return proposals
def analyze_documents(root_dirs: List[str]) -> List[Dict[str, Any]]:
"""Scan documentation for manual step patterns."""
proposals = []
doc_extensions = {".md", ".txt", ".rst", ".adoc"}
findings_by_category = defaultdict(list)
for root_dir in root_dirs:
root = Path(root_dir)
if not root.exists():
continue
for path in root.rglob("*"):
if path.is_dir():
continue
if path.suffix not in doc_extensions:
continue
# Skip excluded dirs and hidden dirs
parts = path.relative_to(root).parts if root in path.parents or root == path.parent else path.parts
if any(p.startswith(".") or p in EXCLUDE_DIRS for p in parts):
continue
if len(parts) > 8:
continue
try:
content = path.read_text(errors="replace")
except (PermissionError, OSError):
continue
lines = content.split("\n")
for i, line in enumerate(lines):
stripped = line.strip()
if not stripped:
continue
for pattern, category in MANUAL_STEP_PATTERNS:
if re.search(pattern, stripped):
findings_by_category[category].append({
"file": str(path),
"line": i + 1,
"text": stripped[:200],
})
# Generate proposals from findings
for category, findings in findings_by_category.items():
if len(findings) < 2:
continue
file_count = len(set(f["file"] for f in findings))
proposals.append({
"category": f"manual_{category}",
"title": f"{len(findings)} '{category}' markers across {file_count} doc(s)",
"description": f"Found in: {', '.join(set(Path(f['file']).name for f in findings[:5]))}",
"confidence": 0.65,
"impact": "medium",
"sources": list(set(f["file"] for f in findings[:10])),
"details": findings[:5], # sample
})
return proposals
def analyze_scripts(root_dirs: List[str]) -> List[Dict[str, Any]]:
"""Detect repeated command sequences in scripts."""
proposals = []
script_extensions = {".py", ".sh", ".bash", ".zsh"}
command_counter = Counter()
command_locations = defaultdict(list)
for root_dir in root_dirs:
root = Path(root_dir)
if not root.exists():
continue
for path in root.rglob("*"):
if path.is_dir():
continue
if path.suffix not in script_extensions:
continue
parts = path.relative_to(root).parts if root in path.parents or root == path.parent else path.parts
if any(p.startswith(".") or p in EXCLUDE_DIRS for p in parts):
continue
if len(parts) > 8:
continue
try:
content = path.read_text(errors="replace")
except (PermissionError, OSError):
continue
lines = content.split("\n")
for i, line in enumerate(lines):
stripped = line.strip()
if not stripped or stripped.startswith("#"):
continue
                for cmd_prefix in SHELL_COMMAND_PATTERNS:
                    if cmd_prefix in stripped:
                        # Normalize the command
                        normalized = re.sub(r"\s+", " ", stripped)[:120]
                        command_counter[normalized] += 1
                        command_locations[normalized].append(f"{path}:{i+1}")
                        # Count each line once, even if several prefixes match
                        # (e.g. "systemctl " and "systemctl restart").
                        break
    # Proposals for commands appearing 3+ times
    for cmd, count in command_counter.most_common(20):
        if count < 3:
            break
        locs = command_locations[cmd]
        # rsplit keeps paths that themselves contain ":" (e.g. "C:\...") intact
        files = set(loc.rsplit(":", 1)[0] for loc in locs)
        file_count = len(files)
        proposals.append({
            "category": "repeated_command",
            "title": f"Command repeated {count}x across {file_count} file(s): {cmd[:80]}",
            "description": f"Locations: {', '.join(locs[:3])}",
            "confidence": min(0.5 + (count * 0.1), 0.95),
            "impact": "medium",
            "sources": list(files),
        })
    return proposals
return proposals
def analyze_session_transcripts(session_dirs: List[str]) -> List[Dict[str, Any]]:
"""Find repeated tool-call patterns in session transcripts."""
proposals = []
tool_sequence_counter = Counter()
tool_sequence_examples = {}
for session_dir in session_dirs:
session_path = Path(session_dir)
if not session_path.exists():
continue
for path in session_path.rglob("*.jsonl"):
try:
content = path.read_text(errors="replace")
except (PermissionError, OSError):
continue
# Extract tool calls in sequence
tool_sequence = []
for line in content.split("\n"):
line = line.strip()
if not line:
continue
try:
msg = json.loads(line)
except json.JSONDecodeError:
continue
# Look for tool calls in assistant messages
if msg.get("role") == "assistant" and msg.get("tool_calls"):
for tc in msg["tool_calls"]:
func_name = tc.get("function", {}).get("name", "?")
tool_sequence.append(func_name)
# Find 2-call sequences
for i in range(len(tool_sequence) - 1):
seq = (tool_sequence[i], tool_sequence[i + 1])
tool_sequence_counter[seq] += 1
if seq not in tool_sequence_examples:
tool_sequence_examples[seq] = str(path.name)
# Find 3-call sequences
for i in range(len(tool_sequence) - 2):
seq = (tool_sequence[i], tool_sequence[i + 1], tool_sequence[i + 2])
tool_sequence_counter[seq] += 1
if seq not in tool_sequence_examples:
tool_sequence_examples[seq] = str(path.name)
# Generate proposals for frequently repeated sequences
for seq, count in tool_sequence_counter.most_common(20):
if count < TOOL_SEQUENCE_MIN_OCCURRENCES:
break
seq_str = " -> ".join(seq)
proposals.append({
"category": "tool_sequence",
"title": f"Tool sequence '{seq_str}' repeated {count} times",
"description": "Consider creating a workflow/skill that automates this sequence.",
"confidence": min(0.5 + (count * 0.05), 0.9),
"impact": "medium",
"sources": [tool_sequence_examples.get(seq, "unknown")],
})
return proposals
def analyze_shell_history(root_dirs: List[str]) -> List[Dict[str, Any]]:
"""Find repeated shell commands from history files."""
proposals = []
command_counter = Counter()
    for root_dir in root_dirs:
        root = Path(root_dir)
        # Use a set so a file found by both lookups below is read only once.
        history_files = set()
        # Look for shell history files
        for name in (".bash_history", ".zsh_history", ".python_history"):
            p = root / name
            if p.exists():
                history_files.add(p)
        # Also check in hermes home
        for p in root.glob("**/*history*"):
            if p.is_file() and p.suffix in ("", ".txt", ".log"):
                history_files.add(p)
for hf in history_files:
try:
content = hf.read_text(errors="replace")
except (PermissionError, OSError):
continue
for line in content.split("\n"):
stripped = line.strip()
# ZSH history format: ": 1234567890:0;command"
stripped = re.sub(r"^:\s*\d+:\d+;", "", stripped)
if not stripped or len(stripped) < 5:
continue
# Skip trivial commands
if stripped in ("ls", "cd", "pwd", "clear", "exit"):
continue
command_counter[stripped] += 1
for cmd, count in command_counter.most_common(10):
if count < 5:
break
proposals.append({
"category": "shell_repetition",
"title": f"Shell command run {count}+ times: {cmd[:80]}",
"description": "Frequently repeated shell command — candidate for alias, function, or script.",
"confidence": min(0.4 + (count * 0.05), 0.85),
"impact": "low",
"sources": ["shell_history"],
})
return proposals
# ---------------------------------------------------------------------------
# Proposal output
# ---------------------------------------------------------------------------
def deduplicate_proposals(proposals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Remove duplicate proposals based on title similarity."""
seen_titles = set()
unique = []
for p in proposals:
# Normalize title for dedup
key = re.sub(r"\d+", "N", p["title"]).lower()
if key not in seen_titles:
seen_titles.add(key)
unique.append(p)
return unique
def rank_proposals(proposals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
"""Sort proposals by impact * confidence (highest first)."""
impact_weight = {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}
return sorted(
proposals,
key=lambda p: impact_weight.get(p.get("impact", "low"), 0.2) * p.get("confidence", 0.5),
reverse=True,
)
def format_text_report(proposals: List[Dict[str, Any]]) -> str:
"""Format proposals as human-readable text."""
if not proposals:
return "No automation opportunities found."
lines = [
"=" * 70,
" AUTOMATION OPPORTUNITY REPORT",
f" Generated: {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}",
f" Proposals: {len(proposals)}",
"=" * 70,
"",
]
for i, p in enumerate(proposals, 1):
score = p.get("confidence", 0.5) * {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}.get(p.get("impact", "low"), 0.2)
lines.append(f"[{i}] {p['title']}")
lines.append(f" Category: {p['category']} | Impact: {p.get('impact','?')} | Confidence: {p.get('confidence',0):.0%} | Score: {score:.2f}")
lines.append(f" {p['description']}")
if p.get("sources"):
lines.append(f" Sources: {', '.join(p['sources'][:3])}")
lines.append("")
# Summary by category
cat_counts = Counter(p["category"] for p in proposals)
lines.append("-" * 70)
lines.append("Summary by category:")
for cat, count in cat_counts.most_common():
lines.append(f" {cat}: {count}")
return "\n".join(lines)
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(description="Find automation opportunities across the fleet")
parser.add_argument("--hermes-home", default=os.path.expanduser("~/.hermes"),
help="Path to hermes home directory (default: ~/.hermes)")
parser.add_argument("--scan-dirs", nargs="*",
help="Additional directories to scan (default: hermes-home + cwd)")
parser.add_argument("--session-dirs", nargs="*",
help="Session transcript directories (default: hermes-home/sessions)")
parser.add_argument("--json", action="store_true", help="Output as JSON")
parser.add_argument("--output", "-o", help="Write proposals to file")
parser.add_argument("--min-confidence", type=float, default=0.3,
help="Minimum confidence threshold (default: 0.3)")
parser.add_argument("--categories", nargs="*",
help="Only include these categories (cron, docs, scripts, sessions, shell)")
args = parser.parse_args()
hermes_home = os.path.expanduser(args.hermes_home)
# Default scan directories
scan_dirs = [hermes_home, "."]
if args.scan_dirs:
scan_dirs.extend(args.scan_dirs)
session_dirs = [os.path.join(hermes_home, "sessions")]
if args.session_dirs:
session_dirs.extend(args.session_dirs)
# Also check common session locations
for subdir in ("transcripts", "session-db"):
p = os.path.join(hermes_home, subdir)
if os.path.isdir(p):
session_dirs.append(p)
categories = set(args.categories) if args.categories else {"cron", "docs", "scripts", "sessions", "shell"}
# Run analyzers
all_proposals = []
if "cron" in categories:
all_proposals.extend(analyze_cron_jobs(hermes_home))
if "docs" in categories:
all_proposals.extend(analyze_documents(scan_dirs))
if "scripts" in categories:
all_proposals.extend(analyze_scripts(scan_dirs))
if "sessions" in categories:
all_proposals.extend(analyze_session_transcripts(session_dirs))
if "shell" in categories:
all_proposals.extend(analyze_shell_history(scan_dirs))
# Deduplicate and rank
all_proposals = deduplicate_proposals(all_proposals)
all_proposals = rank_proposals(all_proposals)
# Filter by confidence
all_proposals = [p for p in all_proposals if p.get("confidence", 0) >= args.min_confidence]
# Output
output = {
"generated_at": datetime.now(timezone.utc).isoformat(),
"hermes_home": hermes_home,
"scan_dirs": scan_dirs,
"total_proposals": len(all_proposals),
"proposals": all_proposals,
}
if args.json:
result = json.dumps(output, indent=2)
else:
result = format_text_report(all_proposals)
if args.output:
with open(args.output, "w") as f:
if args.json:
json.dump(output, f, indent=2)
else:
f.write(result)
print(f"Written to {args.output}", file=sys.stderr)
else:
print(result)
return 0 if all_proposals else 1
if __name__ == "__main__":
sys.exit(main())
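
Two ideas carry most of `automation_opportunity_finder.py`: counting contiguous 2-call and 3-call tool sequences, and ranking proposals by impact weight times confidence. The sketch below shows both on toy data. The function names `count_ngrams` and `rank_score` and the tool names in `calls` are hypothetical; only the sequence lengths and the impact weights are taken from the script above.

```python
from collections import Counter

# Impact weights as used by rank_proposals.
IMPACT_WEIGHT = {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}


def count_ngrams(calls: list[str], sizes=(2, 3)) -> Counter:
    """Count every contiguous run of n tool names, for each n in sizes."""
    counts = Counter()
    for n in sizes:
        for i in range(len(calls) - n + 1):
            counts[tuple(calls[i:i + n])] += 1
    return counts


def rank_score(proposal: dict) -> float:
    """Impact weight times confidence, the key used to order proposals."""
    weight = IMPACT_WEIGHT.get(proposal.get("impact", "low"), 0.2)
    return weight * proposal.get("confidence", 0.5)


# Hypothetical tool names, for illustration only.
calls = [
    "read_file", "patch", "run_tests",
    "read_file", "patch", "run_tests",
    "read_file", "patch",
]
counts = count_ngrams(calls)
print(counts[("read_file", "patch")])
print(counts[("read_file", "patch", "run_tests")])

proposals = [
    {"title": "a", "impact": "low", "confidence": 0.9},
    {"title": "b", "impact": "high", "confidence": 0.6},
]
print([p["title"] for p in sorted(proposals, key=rank_score, reverse=True)])
```

Because a 3-call sequence also contributes two 2-call sequences, the shorter patterns tend to dominate a single shared counter. A consumer of these counts may want to rank each length separately.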


@@ -1,359 +0,0 @@
#!/usr/bin/env python3
"""
Bootstrapper — assemble pre-session context from knowledge store.
Reads the knowledge store and produces a compact context block (2k tokens max)
that can be injected into a new session so it starts with situational awareness.
Usage:
python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
python3 bootstrapper.py --repo timmy-home --global
python3 bootstrapper.py --global
python3 bootstrapper.py --repo the-nexus --max-tokens 1000
"""
import argparse
import json
import sys
from pathlib import Path
from typing import Optional
# Resolve knowledge root relative to this script's parent
SCRIPT_DIR = Path(__file__).resolve().parent
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"
INDEX_PATH = KNOWLEDGE_DIR / "index.json"
# Approximate token count: ~4 chars per token for English text
CHARS_PER_TOKEN = 4
# Category sort priority (lower = shown first)
CATEGORY_PRIORITY = {
"pitfall": 0,
"tool-quirk": 1,
"pattern": 2,
"fact": 3,
"question": 4,
}
def load_index(index_path: Path = INDEX_PATH) -> dict:
"""Load and validate the knowledge index."""
if not index_path.exists():
return {"version": 1, "total_facts": 0, "facts": []}
with open(index_path) as f:
data = json.load(f)
if "facts" not in data:
print("WARNING: index.json missing 'facts' key", file=sys.stderr)
return {"version": 1, "total_facts": 0, "facts": []}
return data
def filter_facts(
facts: list[dict],
repo: Optional[str] = None,
agent: Optional[str] = None,
include_global: bool = True,
) -> list[dict]:
"""Filter facts by repo, agent, and global scope."""
filtered = []
for fact in facts:
fact_repo = fact.get("repo", "global")
fact_agent = fact.get("agent", "")
# Match by repo (regardless of agent)
if repo and fact_repo == repo:
filtered.append(fact)
continue
# Match by exact agent type
if agent and fact_agent == agent:
filtered.append(fact)
continue
# Include global facts without agent restriction (universal facts)
if include_global and fact_repo == "global" and not fact_agent:
filtered.append(fact)
return filtered
def sort_facts(facts: list[dict]) -> list[dict]:
"""
Sort facts by: confidence (desc), then category priority, then fact text.
Most reliable and most dangerous facts come first.
"""
def sort_key(f):
confidence = f.get("confidence", 0.5)
category = f.get("category", "fact")
cat_priority = CATEGORY_PRIORITY.get(category, 5)
return (-confidence, cat_priority, f.get("fact", ""))
return sorted(facts, key=sort_key)
def load_repo_knowledge(repo: str) -> Optional[str]:
"""Load per-repo knowledge markdown if it exists."""
repo_path = KNOWLEDGE_DIR / "repos" / f"{repo}.md"
if repo_path.exists():
return repo_path.read_text().strip()
return None
def load_agent_knowledge(agent: str) -> Optional[str]:
"""Load per-agent knowledge markdown if it exists."""
agent_path = KNOWLEDGE_DIR / "agents" / f"{agent}.md"
if agent_path.exists():
return agent_path.read_text().strip()
return None
def load_global_knowledge() -> list[str]:
"""Load all global knowledge markdown files."""
global_dir = KNOWLEDGE_DIR / "global"
if not global_dir.exists():
return []
chunks = []
for md_file in sorted(global_dir.glob("*.md")):
content = md_file.read_text().strip()
if content:
chunks.append(content)
return chunks
def render_facts_section(facts: list[dict], category: str, label: str) -> str:
"""Render a section of facts for a single category."""
cat_facts = [f for f in facts if f.get("category") == category]
if not cat_facts:
return ""
lines = [f"### {label}\n"]
for f in cat_facts:
conf = f.get("confidence", 0.5)
fact_text = f.get("fact", "")
repo_tag = f.get("repo", "")
if repo_tag and repo_tag != "global":
lines.append(f"- [{conf:.0%}] ({repo_tag}) {fact_text}")
else:
lines.append(f"- [{conf:.0%}] {fact_text}")
return "\n".join(lines) + "\n"
def estimate_tokens(text: str) -> int:
"""Rough token estimate."""
return len(text) // CHARS_PER_TOKEN
def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Truncate text to approximately max_tokens, cutting at line boundaries."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    marker = "\n\n[... truncated to fit context window ...]"
    # Reserve room for the marker so the result stays within the budget.
    budget = max(max_chars - len(marker), 0)
    # Cut at last newline before the limit
    truncated = text[:budget]
    last_newline = truncated.rfind("\n")
    if last_newline > 0:
        truncated = truncated[:last_newline]
    return truncated + marker
def build_bootstrap_context(
repo: Optional[str] = None,
agent: Optional[str] = None,
include_global: bool = True,
max_tokens: int = 2000,
index_path: Path = INDEX_PATH,
) -> str:
"""
Build the full bootstrap context block.
Returns a markdown string suitable for injection into a session prompt.
"""
index = load_index(index_path)
facts = index.get("facts", [])
# Filter
filtered = filter_facts(facts, repo=repo, agent=agent, include_global=include_global)
# Sort
sorted_facts = sort_facts(filtered)
# Build sections
sections = ["## What You Know (bootstrapped)\n"]
# Per-repo markdown knowledge
if repo:
repo_md = load_repo_knowledge(repo)
if repo_md:
sections.append(f"### Repo Notes: {repo}\n")
sections.append(repo_md + "\n")
# Structured facts by category
if sorted_facts:
# Group by source
repo_facts = [f for f in sorted_facts if f.get("repo") == repo] if repo else []
global_facts = [f for f in sorted_facts if f.get("repo") == "global"]
agent_facts = [f for f in sorted_facts if f.get("agent") == agent] if agent else []
if repo_facts:
sections.append(f"### Repo: {repo}\n")
for cat, label in [
("pitfall", "PITFALLS"),
("tool-quirk", "QUIRKS"),
("pattern", "PATTERNS"),
("fact", "FACTS"),
("question", "OPEN QUESTIONS"),
]:
section = render_facts_section(repo_facts, cat, label)
if section:
sections.append(section)
if global_facts:
sections.append("### Global\n")
for cat, label in [
("pitfall", "PITFALLS"),
("tool-quirk", "QUIRKS"),
("pattern", "PATTERNS"),
("fact", "FACTS"),
]:
section = render_facts_section(global_facts, cat, label)
if section:
sections.append(section)
if agent_facts:
sections.append(f"### Agent Notes ({agent})\n")
for cat, label in [
("pitfall", "PITFALLS"),
("tool-quirk", "QUIRKS"),
("pattern", "PATTERNS"),
("fact", "FACTS"),
]:
section = render_facts_section(agent_facts, cat, label)
if section:
sections.append(section)
# Per-agent markdown knowledge
if agent:
agent_md = load_agent_knowledge(agent)
if agent_md:
sections.append(f"### Agent Profile: {agent}\n")
sections.append(agent_md + "\n")
# Global markdown knowledge
global_chunks = load_global_knowledge()
if global_chunks:
sections.append("### Global Notes\n")
sections.extend(chunk + "\n" for chunk in global_chunks)
# If nothing was found
if len(sections) == 1:
sections.append("_No relevant knowledge found. Starting fresh._\n")
if not facts:
sections.append(
"_Knowledge store is empty. Run the harvester to populate it._\n"
)
# Join and truncate
context = "\n".join(sections)
context = truncate_to_tokens(context, max_tokens)
return context
def main():
parser = argparse.ArgumentParser(
description="Assemble pre-session context from knowledge store"
)
parser.add_argument(
"--repo",
type=str,
default=None,
help="Repository name to filter facts by",
)
parser.add_argument(
"--agent",
type=str,
default=None,
help="Agent type to filter facts by (e.g., mimo-sprint, groq-fast)",
)
parser.add_argument(
"--global",
dest="include_global",
action="store_true",
default=True,
help="Include global facts (default: true)",
)
parser.add_argument(
"--no-global",
dest="include_global",
action="store_false",
help="Exclude global facts",
)
parser.add_argument(
"--max-tokens",
type=int,
default=2000,
help="Maximum token count for output (default: 2000)",
)
parser.add_argument(
"--index",
type=str,
default=None,
help="Path to index.json (default: knowledge/index.json)",
)
parser.add_argument(
"--json",
dest="output_json",
action="store_true",
help="Output raw JSON instead of markdown",
)
args = parser.parse_args()
index_path = Path(args.index) if args.index else INDEX_PATH
if args.output_json:
# JSON mode: return the filtered, sorted facts
index = load_index(index_path)
facts = index.get("facts", [])
filtered = filter_facts(
facts,
repo=args.repo,
agent=args.agent,
include_global=args.include_global,
)
sorted_facts = sort_facts(filtered)
output = {
"repo": args.repo,
"agent": args.agent,
"include_global": args.include_global,
"total_indexed": len(facts),
"matched": len(sorted_facts),
"facts": sorted_facts,
}
print(json.dumps(output, indent=2))
else:
# Markdown mode: full bootstrap context
context = build_bootstrap_context(
repo=args.repo,
agent=args.agent,
include_global=args.include_global,
max_tokens=args.max_tokens,
index_path=index_path,
)
print(context)
return 0
if __name__ == "__main__":
sys.exit(main())
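
The bootstrapper orders facts by confidence (descending), then category priority, then fact text, and fits the result into a token budget estimated at about 4 characters per token. The sketch below illustrates that ordering together with a line-by-line budget. `take_within_budget` is a hypothetical alternative to the script's character-slice truncation, shown only to make the budgeting arithmetic visible; the sample facts are invented.

```python
CHARS_PER_TOKEN = 4  # same rough heuristic the script uses
CATEGORY_PRIORITY = {"pitfall": 0, "tool-quirk": 1, "pattern": 2, "fact": 3, "question": 4}


def sort_key(fact: dict) -> tuple:
    """Confidence descending, then category priority, then fact text."""
    return (
        -fact.get("confidence", 0.5),
        CATEGORY_PRIORITY.get(fact.get("category", "fact"), 5),
        fact.get("fact", ""),
    )


def take_within_budget(lines: list[str], max_tokens: int) -> list[str]:
    """Keep whole lines, in order, until the character budget is spent."""
    budget = max_tokens * CHARS_PER_TOKEN
    kept, used = [], 0
    for line in lines:
        cost = len(line) + 1  # +1 for the newline that joins lines
        if used + cost > budget:
            break
        kept.append(line)
        used += cost
    return kept


# Invented sample facts, for illustration only.
facts = [
    {"fact": "B fact", "category": "fact", "confidence": 0.9},
    {"fact": "A pitfall", "category": "pitfall", "confidence": 0.9},
    {"fact": "C pattern", "category": "pattern", "confidence": 0.95},
]
ordered = sorted(facts, key=sort_key)
lines = [f"- [{f['confidence']:.0%}] {f['fact']}" for f in ordered]
print("\n".join(take_within_budget(lines, max_tokens=8)))
```

Since confidence is the primary key, category priority only breaks ties: a 0.95-confidence pattern is listed before a 0.9-confidence pitfall.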


@@ -1,282 +0,0 @@
#!/usr/bin/env python3
"""
Dead Code Detector for Python Codebases
AST-based analysis to find defined but never-called functions and classes.
Excludes entry points, plugin hooks, __init__ exports.
Usage:
python3 scripts/dead_code_detector.py /path/to/repo/
python3 scripts/dead_code_detector.py hermes-agent/ --format json
python3 scripts/dead_code_detector.py . --exclude tests/,venv/
Output: file:line, function/class name, last git author (if available)
"""
import argparse
import ast
import json
import os
import subprocess
import sys
from collections import defaultdict
from pathlib import Path
from typing import Optional
# Names that are expected to be unused (entry points, protocol methods, etc.)
SAFE_UNUSED_PATTERNS = {
# Python dunders
"__init__", "__str__", "__repr__", "__eq__", "__hash__", "__len__",
"__getitem__", "__setitem__", "__contains__", "__iter__", "__next__",
"__enter__", "__exit__", "__call__", "__bool__", "__del__",
"__post_init__", "__class_getitem__",
# Common entry points
"main", "app", "handler", "setup", "teardown", "fixture",
# pytest
"conftest", "test_", "pytest_", # prefix patterns
# Protocols / abstract
"abstractmethod", "abc_",
}
def is_safe_unused(name: str, filepath: str) -> bool:
    """Check if an unused name is expected to be unused."""
    # Test files are exempt. This is a substring match, so any path that
    # contains "test" (for example "latest/") is exempt as well.
    if "test" in filepath.lower():
        return True
    # Known patterns. Every entry is matched as a prefix, so "app" also
    # exempts names such as "apply_patch".
    if any(name.startswith(pattern) for pattern in SAFE_UNUSED_PATTERNS):
        return True
    # __init__.py exports are often unused internally
    if filepath.endswith("__init__.py"):
        return True
    return False
def get_git_blame(filepath: str, lineno: int) -> Optional[str]:
    """Get last author of a line via git blame."""
    try:
        result = subprocess.run(
            ["git", "blame", "-L", f"{lineno},{lineno}", "--porcelain", filepath],
            capture_output=True, text=True, timeout=5
        )
        for line in result.stdout.split("\n"):
            if line.startswith("author "):
                return line[7:]
    except (OSError, subprocess.SubprocessError):
        # git is missing or the call timed out; blame info is optional
        pass
    return None
class DefinitionCollector(ast.NodeVisitor):
"""Collect all function and class definitions."""
def __init__(self):
self.definitions = [] # (name, type, lineno)
def visit_FunctionDef(self, node):
self.definitions.append((node.name, "function", node.lineno))
self.generic_visit(node)
def visit_AsyncFunctionDef(self, node):
self.definitions.append((node.name, "async_function", node.lineno))
self.generic_visit(node)
def visit_ClassDef(self, node):
self.definitions.append((node.name, "class", node.lineno))
self.generic_visit(node)
class NameUsageCollector(ast.NodeVisitor):
"""Collect all name references (calls, imports, attribute access)."""
def __init__(self):
self.names = set()
self.calls = set()
self.imports = set()
def visit_Name(self, node):
self.names.add(node.id)
self.generic_visit(node)
def visit_Attribute(self, node):
if isinstance(node.value, ast.Name):
self.names.add(node.value.id)
self.generic_visit(node)
def visit_Call(self, node):
if isinstance(node.func, ast.Name):
self.calls.add(node.func.id)
elif isinstance(node.func, ast.Attribute):
if isinstance(node.func.value, ast.Name):
self.names.add(node.func.value.id)
self.calls.add(node.func.attr)
self.generic_visit(node)
def visit_Import(self, node):
for alias in node.names:
self.imports.add(alias.asname or alias.name)
self.generic_visit(node)
def visit_ImportFrom(self, node):
for alias in node.names:
self.imports.add(alias.asname or alias.name)
self.generic_visit(node)
def analyze_file(filepath: str) -> dict:
"""Analyze a single Python file for dead code."""
path = Path(filepath)
try:
content = path.read_text()
tree = ast.parse(content, filename=str(filepath))
except (SyntaxError, UnicodeDecodeError):
return {"error": f"Could not parse {filepath}"}
# Collect definitions
def_collector = DefinitionCollector()
def_collector.visit(tree)
definitions = def_collector.definitions
# Collect usage
usage_collector = NameUsageCollector()
usage_collector.visit(tree)
used_names = usage_collector.names | usage_collector.calls | usage_collector.imports
# Also scan the entire repo for references to this file's definitions
# (this is done at the repo level, not file level)
dead = []
for name, def_type, lineno in definitions:
# Note: private (single-underscore) names get no special treatment here.
if name not in used_names:
if not is_safe_unused(name, filepath):
dead.append({
"name": name,
"type": def_type,
"file": filepath,
"line": lineno,
})
return {"definitions": len(definitions), "dead": dead}
def scan_repo(repo_path: str, exclude_patterns: Optional[list] = None) -> dict:
"""Scan an entire repo for dead code."""
path = Path(repo_path)
exclude = exclude_patterns or ["venv", ".venv", "node_modules", "__pycache__",
".git", "dist", "build", ".tox", "vendor"]
all_definitions = {} # name -> [{file, line, type}]
all_files = []
dead_code = []
# First pass: collect all definitions across repo
for fpath in path.rglob("*.py"):
parts = fpath.parts
if any(ex in parts for ex in exclude):
continue
if fpath.name.startswith("."):
continue
try:
content = fpath.read_text(errors="ignore")
tree = ast.parse(content, filename=str(fpath))
except Exception: # unreadable or unparsable file; skip it (Ctrl-C still propagates)
continue
all_files.append(str(fpath))
collector = DefinitionCollector()
collector.visit(tree)
for name, def_type, lineno in collector.definitions:
rel_path = str(fpath.relative_to(path))
if name not in all_definitions:
all_definitions[name] = []
all_definitions[name].append({
"file": rel_path,
"line": lineno,
"type": def_type,
})
# Second pass: check each name for usage across entire repo
all_used_names = set()
for fpath_str in all_files:
try:
content = Path(fpath_str).read_text(errors="ignore")
tree = ast.parse(content)
except Exception: # unreadable or unparsable file; skip it (Ctrl-C still propagates)
continue
usage = NameUsageCollector()
usage.visit(tree)
all_used_names.update(usage.names)
all_used_names.update(usage.calls)
all_used_names.update(usage.imports)
# Find dead code
for name, locations in all_definitions.items():
if name not in all_used_names:
for loc in locations:
if not is_safe_unused(name, loc["file"]):
dead_code.append({
"name": name,
"type": loc["type"],
"file": loc["file"],
"line": loc["line"],
})
return {
"repo": path.name,
"files_scanned": len(all_files),
"total_definitions": sum(len(v) for v in all_definitions.values()),
"dead_code_count": len(dead_code),
"dead_code": sorted(dead_code, key=lambda x: (x["file"], x["line"])),
}
def main():
parser = argparse.ArgumentParser(description="Find dead code in Python codebases")
parser.add_argument("repo", help="Repository path to scan")
parser.add_argument("--format", choices=["text", "json"], default="text")
parser.add_argument("--exclude", help="Comma-separated patterns to exclude")
parser.add_argument("--git-blame", action="store_true", help="Include git blame info")
args = parser.parse_args()
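# Strip whitespace and trailing slashes so "tests/" matches the path part "tests"
exclude = [p.strip().strip("/") for p in args.exclude.split(",") if p.strip()] if args.exclude else None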
result = scan_repo(args.repo, exclude)
if args.format == "json":
print(json.dumps(result, indent=2))
else:
print(f"Dead Code Report: {result['repo']}")
print(f"Files scanned: {result['files_scanned']}")
print(f"Total definitions: {result['total_definitions']}")
print(f"Dead code found: {result['dead_code_count']}")
print()
if result["dead_code"]:
print(f"{'File':<45} {'Line':>4} {'Type':<10} {'Name'}")
print("-" * 85)
for item in result["dead_code"]:
author = ""
if args.git_blame:
author = get_git_blame(
os.path.join(args.repo, item["file"]),
item["line"]
) or ""
author = f" ({author})" if author else ""
print(f"{item['file']:<45} {item['line']:>4} {item['type']:<10} {item['name']}{author}")
else:
print("No dead code detected!")
if __name__ == "__main__":
main()
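The detector's core idea (collect definitions, collect references, report the difference) can be sketched in a few lines. This is a simplified illustration that uses `ast.walk` instead of the `NodeVisitor` classes above; `SOURCE` and the variable names are made up for the example.

```python
import ast

# A tiny module with one function that is called and one that is not.
SOURCE = '''
def used():
    return 1

def unused():
    return 2

print(used())
'''

tree = ast.parse(SOURCE)

# Every function or class defined anywhere in the module.
defined = {
    node.name
    for node in ast.walk(tree)
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
}

# Every bare name and attribute name that is referenced anywhere.
referenced = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
referenced |= {
    node.attr for node in ast.walk(tree) if isinstance(node, ast.Attribute)
}

# Defined but never referenced: candidates for dead code.
candidates = sorted(defined - referenced)
print(candidates)  # ['unused']
```

Matching on names alone is a heuristic: dynamic lookups such as `getattr(obj, "unused")` or names exported for other packages will show up as false positives, which is why the script keeps an allowlist of expected-unused names.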


@@ -1,317 +0,0 @@
#!/usr/bin/env python3
"""
dedup.py — Knowledge deduplication: content hash + semantic similarity.
Deduplicates harvested knowledge entries to avoid training on duplicates.
Uses content hashing for exact matches and token overlap for near-duplicates.
Usage:
python3 dedup.py --input knowledge/index.json --output knowledge/index_deduped.json
python3 dedup.py --input knowledge/index.json --dry-run
python3 dedup.py --test # Run built-in dedup test
"""
import argparse
import hashlib
import json
import re
import sys
from pathlib import Path
from typing import List, Dict, Optional, Tuple
def normalize_text(text: str) -> str:
"""Normalize text for hashing: lowercase, collapse whitespace, strip."""
text = text.lower().strip()
text = re.sub(r'\s+', ' ', text)
return text
def content_hash(text: str) -> str:
"""SHA256 hash of normalized text for exact dedup."""
normalized = normalize_text(text)
return hashlib.sha256(normalized.encode('utf-8')).hexdigest()
def tokenize(text: str) -> set:
"""Simple tokenizer: lowercase words, 3+ chars."""
words = re.findall(r'[a-z0-9_]{3,}', text.lower())
return set(words)
def token_similarity(a: str, b: str) -> float:
"""Token-based Jaccard similarity (0.0-1.0).
Fast local alternative to embedding similarity.
Good enough for near-duplicate detection.
"""
tokens_a = tokenize(a)
tokens_b = tokenize(b)
if not tokens_a or not tokens_b:
return 0.0
intersection = tokens_a & tokens_b
union = tokens_a | tokens_b
return len(intersection) / len(union)
def quality_score(fact: dict) -> float:
"""Compute quality score for merge ranking.
Higher is better. Factors:
- confidence (0-1)
- source_count (more confirmations = better)
- has tags (richer metadata)
"""
confidence = fact.get('confidence', 0.5)
source_count = fact.get('source_count', 1)
has_tags = 1.0 if fact.get('tags') else 0.0
has_related = 1.0 if fact.get('related') else 0.0
# Weighted composite
score = (
confidence * 0.5 +
min(source_count / 10, 1.0) * 0.3 +
has_tags * 0.1 +
has_related * 0.1
)
return round(score, 4)
def merge_facts(keep: dict, drop: dict) -> dict:
"""Merge two near-duplicate facts, keeping higher-quality fields.
The 'keep' fact is enriched with metadata from 'drop'.
"""
# Merge tags (union)
keep_tags = set(keep.get('tags', []))
drop_tags = set(drop.get('tags', []))
keep['tags'] = sorted(keep_tags | drop_tags)
# Merge related (union)
keep_related = set(keep.get('related', []))
drop_related = set(drop.get('related', []))
keep['related'] = sorted(keep_related | drop_related)
# Update source_count (sum)
keep['source_count'] = keep.get('source_count', 1) + drop.get('source_count', 1)
# Update confidence (max — we've now seen it from multiple sources)
keep['confidence'] = max(keep.get('confidence', 0), drop.get('confidence', 0))
# Track that we merged
if '_merged_from' not in keep:
keep['_merged_from'] = []
keep['_merged_from'].append(drop.get('id', 'unknown'))
return keep
def dedup_facts(
    facts: List[dict],
    exact_threshold: float = 1.0,
    near_threshold: float = 0.95,
    dry_run: bool = False,
) -> Tuple[List[dict], dict]:
    """Deduplicate a list of knowledge facts.

    Args:
        facts: List of fact dicts (from index.json)
        exact_threshold: Unused. Exact duplicates are detected by content
            hash; the parameter is kept for backward compatibility.
        near_threshold: Token similarity at or above this = near-duplicate
        dry_run: If True, skip metadata merging. Duplicates are still left
            out of the returned list so that the stats stay accurate.

    Returns:
        (deduped_facts, stats_dict)
    """
    if not facts:
        return [], {"total": 0, "exact_dupes": 0, "near_dupes": 0,
                    "unique": 0, "removed": 0}

    # Phase 1: Exact dedup by content hash
    hash_seen = {}  # hash -> index in deduped list
    exact_dupes = 0
    deduped = []
    for fact in facts:
        text = fact.get('fact', '')
        h = content_hash(text)
        if h in hash_seen:
            # Exact duplicate: merge metadata into the existing entry
            existing_idx = hash_seen[h]
            if not dry_run:
                deduped[existing_idx] = merge_facts(deduped[existing_idx], fact)
            exact_dupes += 1
        else:
            hash_seen[h] = len(deduped)
            deduped.append(fact)

    # Phase 2: Near-dup by token similarity
    near_dupes = 0
    i = 0
    while i < len(deduped):
        removed_i = False
        j = i + 1
        while j < len(deduped):
            sim = token_similarity(deduped[i].get('fact', ''), deduped[j].get('fact', ''))
            if sim < near_threshold:
                j += 1
                continue
            # Near-duplicate: count it, then keep the higher-quality entry
            near_dupes += 1
            if quality_score(deduped[i]) >= quality_score(deduped[j]):
                if not dry_run:
                    deduped[i] = merge_facts(deduped[i], deduped[j])
                deduped.pop(j)
            else:
                # j is higher quality: merge i into j, then remove i
                if not dry_run:
                    deduped[j] = merge_facts(deduped[j], deduped[i])
                deduped.pop(i)
                removed_i = True
                break
        # If entry i was removed, index i now holds the next entry, so
        # compare it before advancing.
        if not removed_i:
            i += 1

    stats = {
        "total": len(facts),
        "exact_dupes": exact_dupes,
        "near_dupes": near_dupes,
        "unique": len(deduped),
        "removed": len(facts) - len(deduped),
    }
    return deduped, stats
def dedup_index_file(
input_path: str,
output_path: Optional[str] = None,
near_threshold: float = 0.95,
dry_run: bool = False,
) -> dict:
"""Deduplicate an index.json file.
Args:
input_path: Path to index.json
output_path: Where to write deduped file (default: overwrite input)
near_threshold: Token similarity threshold for near-dupes
dry_run: Report only, don't write
Returns stats dict.
"""
path = Path(input_path)
if not path.exists():
raise FileNotFoundError(f"Index file not found: {input_path}")
with open(path) as f:
data = json.load(f)
facts = data.get('facts', [])
deduped, stats = dedup_facts(facts, near_threshold=near_threshold, dry_run=dry_run)
if not dry_run:
data['facts'] = deduped
data['total_facts'] = len(deduped)
from datetime import datetime, timezone # local import: only needed when writing
data['last_dedup'] = datetime.now(timezone.utc).isoformat()
out_path = Path(output_path) if output_path else path
with open(out_path, 'w') as f:
json.dump(data, f, indent=2, ensure_ascii=False)
return stats
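def generate_test_duplicates(n: int = 20) -> List[dict]:
"""Generate test facts with intentional duplicates for testing.
Creates n base facts plus n/4 exact dupes and n/4 rephrased variants.
Base facts are drawn from only 5 topics x 5 tips = 25 combinations, so
their text may repeat. Each rephrased variant shares too few tokens with
its original to reach the default 0.95 Jaccard threshold; pass a lower
--threshold to exercise near-dup merging.
"""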
import random
random.seed(42)
unique_facts = []
for i in range(n):
topic = random.choice(["git", "python", "docker", "rust", "nginx"])
tip = random.choice(["use verbose flags", "check logs first", "restart service", "clear cache", "update config"])
unique_facts.append({
"id": f"test:fact:{i:03d}",
"fact": f"When working with {topic}, always {tip} before deploying.",
"category": "fact",
"domain": "test",
"confidence": round(random.uniform(0.5, 1.0), 2),
"source_count": random.randint(1, 5),
"tags": [topic, "test"],
})
# Add exact duplicates (same text, different IDs)
duped = list(unique_facts)
for i in range(n // 4):
original = unique_facts[i]
dupe = dict(original)
dupe["id"] = f"test:fact:dup{i:03d}"
dupe["confidence"] = round(random.uniform(0.3, 0.8), 2)
duped.append(dupe)
# Add near-duplicates (slightly different phrasing)
for i in range(n // 4):
original = unique_facts[i]
near = dict(original)
near["id"] = f"test:fact:near{i:03d}"
near["fact"] = original["fact"].replace("always", "should").replace("before deploying", "prior to deployment")
near["confidence"] = round(random.uniform(0.4, 0.9), 2)
duped.append(near)
return duped
def main():
parser = argparse.ArgumentParser(description="Knowledge deduplication")
parser.add_argument("--input", help="Path to index.json")
parser.add_argument("--output", help="Output path (default: overwrite input)")
parser.add_argument("--threshold", type=float, default=0.95,
help="Near-dup similarity threshold (default: 0.95)")
parser.add_argument("--dry-run", action="store_true", help="Report only, don't write")
parser.add_argument("--test", action="store_true", help="Run built-in dedup test")
parser.add_argument("--json", action="store_true", help="JSON output")
args = parser.parse_args()
if args.test:
test_facts = generate_test_duplicates(20)
print(f"Generated {len(test_facts)} test facts (20 unique + dupes)")
deduped, stats = dedup_facts(test_facts, near_threshold=args.threshold)
print(f"\nDedup results:")
print(f" Total input: {stats['total']}")
print(f" Exact dupes: {stats['exact_dupes']}")
print(f" Near dupes: {stats['near_dupes']}")
print(f" Unique output: {stats['unique']}")
print(f" Removed: {stats['removed']}")
# Verify: should have ~20 unique (some merged)
assert stats['unique'] <= 20, f"Too many unique: {stats['unique']} > 20"
assert stats['unique'] >= 15, f"Too few unique: {stats['unique']} < 15"
assert stats['removed'] > 0, "No duplicates removed"
print("\nOK: Dedup test passed")
return
if not args.input:
print("ERROR: Provide --input or --test")
sys.exit(1)
stats = dedup_index_file(args.input, args.output, args.threshold, args.dry_run)
if args.json:
print(json.dumps(stats, indent=2))
else:
print(f"Dedup results:")
print(f" Total input: {stats['total']}")
print(f" Exact dupes: {stats['exact_dupes']}")
print(f" Near dupes: {stats['near_dupes']}")
print(f" Unique output: {stats['unique']}")
print(f" Removed: {stats['removed']}")
if args.dry_run:
print(" (dry run — no changes written)")
if __name__ == "__main__":
main()
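To make the near-duplicate threshold concrete, here is a worked example of the token-based Jaccard similarity on one sentence and its rephrasing, using the same substitutions as `generate_test_duplicates`. The `tokenize` and `jaccard` helpers below restate the script's logic in standalone form; the sample sentences are illustrative.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase word tokens of 3+ characters, as in the script above."""
    return set(re.findall(r"[a-z0-9_]{3,}", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Size of the token intersection divided by size of the union."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


original = "When working with git, always check logs first before deploying."
rephrased = "When working with git, should check logs first prior to deployment."

# Shared tokens: when, working, with, git, check, logs, first (7).
# Union: 10 + 10 - 7 = 13 tokens ("to" is dropped as too short).
similarity = jaccard(original, rephrased)
print(round(similarity, 3))  # 0.538
```

At roughly 0.54, this pair sits well below the default `--threshold` of 0.95, so a light rephrasing like this is treated as a distinct fact unless the threshold is lowered.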


@@ -1,249 +0,0 @@
#!/usr/bin/env python3
"""
Cross-Repo Dependency Graph Builder
Scans repos for import/require/reference patterns and builds a directed
dependency graph. Detects circular dependencies. Outputs DOT and Mermaid.
Usage:
python3 scripts/dependency_graph.py /path/to/repos/
python3 scripts/dependency_graph.py --repos repo1,repo2,repo3 --format mermaid
python3 scripts/dependency_graph.py --repos-dir /path/to/ --format dot --output deps.dot
Patterns detected:
- Python: import X, from X import Y
- JavaScript: require("X"), import ... from "X"
- Go: import "X"
- Ansible: include_role, import_role
- Docker/Compose: image: X, depends_on
- Config references: repo-name in YAML/TOML/JSON
"""
import argparse
import json
import os
import re
import sys
from collections import defaultdict
from pathlib import Path
# Known repo names for matching
KNOWN_REPOS = [
"hermes-agent", "timmy-config", "timmy-home", "the-nexus", "the-door",
"the-beacon", "fleet-ops", "burn-fleet", "timmy-dispatch", "turboquant",
"compounding-intelligence", "the-playground", "second-son-of-timmy",
"ai-safety-review", "the-echo-pattern", "timmy-academy", "wolf",
"the-testament",
]
def normalize_repo_name(name: str) -> str:
"""Normalize a repo name for comparison."""
return name.lower().replace("_", "-").replace(".git", "").strip()
def scan_file_for_deps(filepath: str, content: str, own_repo: str) -> set:
"""Scan a file's content for references to other repos."""
deps = set()
own_norm = normalize_repo_name(own_repo)
for repo in KNOWN_REPOS:
repo_norm = normalize_repo_name(repo)
if repo_norm == own_norm:
continue
# Direct name references
patterns = [
repo, # exact name
repo.replace("-", "_"), # underscore variant
repo.replace("-", ""), # no separator
f"/{repo}/", # path reference
f'"{repo}"', # quoted
f"'{repo}'", # single quoted
f"Timmy_Foundation/{repo}", # full Gitea path
f"Timmy_Foundation.{repo}", # Python module path
]
for pattern in patterns:
if pattern in content:
deps.add(repo)
break
return deps
def scan_repo(repo_path: str, repo_name: str = None) -> dict:
"""Scan a repo directory for dependencies."""
path = Path(repo_path)
if not path.is_dir():
return {"error": f"Not a directory: {repo_path}"}
if not repo_name:
repo_name = path.name
deps = set()
files_scanned = 0
exts = {".py", ".js", ".ts", ".go", ".yaml", ".yml", ".toml", ".json",
".md", ".sh", ".bash", ".Dockerfile", ".tf", ".hcl"}
for fpath in path.rglob("*"):
if not fpath.is_file():
continue
if fpath.suffix not in exts:
continue
# Skip common non-source dirs
parts = fpath.parts
if any(p in (".git", "node_modules", "__pycache__", ".venv", "venv",
"vendor", "dist", "build", ".tox") for p in parts):
continue
try:
content = fpath.read_text(errors="ignore")
except OSError: # unreadable file; skip it
continue
file_deps = scan_file_for_deps(str(fpath), content, repo_name)
deps.update(file_deps)
files_scanned += 1
return {
"repo": repo_name,
"dependencies": sorted(deps),
"files_scanned": files_scanned,
}
def detect_cycles(graph: dict) -> list:
    """Detect circular dependencies using DFS.

    Reports at most one cycle per DFS root.
    """
    cycles = []
    visited = set()
    rec_stack = set()

    def dfs(node, path):
        visited.add(node)
        rec_stack.add(node)
        try:
            for neighbor in graph.get(node, {}).get("dependencies", []):
                if neighbor not in visited:
                    result = dfs(neighbor, path + [neighbor])
                    if result:
                        return result
                elif neighbor in rec_stack:
                    cycle_start = path.index(neighbor)
                    return path[cycle_start:] + [neighbor]
            return None
        finally:
            # Always unwind, including on early return. Stale entries would
            # make a later root treat a finished node as "on the stack" and
            # path.index() would raise ValueError.
            rec_stack.discard(node)

    for node in graph:
        if node not in visited:
            cycle = dfs(node, [node])
            if cycle:
                cycles.append(cycle)
    return cycles
def to_dot(graph: dict) -> str:
"""Generate DOT format output."""
lines = ["digraph dependencies {"]
lines.append(" rankdir=LR;")
lines.append(' node [shape=box, style=filled, fillcolor="#1a1a2e", fontcolor="#e6edf3"];')
lines.append(' edge [color="#4a4a6a"];')
lines.append("")
for repo, data in sorted(graph.items()):
dep_count = len(data.get("dependencies", []))
fill = "#2d1b69" if dep_count > 2 else "#16213e"
lines.append(f' "{repo}" [fillcolor="{fill}"];')
for dep in data.get("dependencies", []):
lines.append(f' "{repo}" -> "{dep}";')
lines.append("}")
return "\n".join(lines)
def to_mermaid(graph: dict) -> str:
"""Generate Mermaid format output."""
lines = ["graph LR"]
for repo, data in sorted(graph.items()):
for dep in data.get("dependencies", []):
lines.append(f" {repo.replace('-','_')} --> {dep.replace('-','_')}")
# Add node labels
lines.append("")
for repo in sorted(graph.keys()):
lines.append(f" {repo.replace('-','_')}[{repo}]")
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(description="Build cross-repo dependency graph")
parser.add_argument("repos_dir", nargs="?", help="Directory containing repos")
parser.add_argument("--repos", help="Comma-separated list of repo paths")
parser.add_argument("--format", choices=["dot", "mermaid", "json"], default="json")
parser.add_argument("--output", "-o", help="Output file (default: stdout)")
parser.add_argument("--cycles-only", action="store_true", help="Only report cycles")
args = parser.parse_args()
results = {}
repo_paths = []
if args.repos:
repo_paths = [p.strip() for p in args.repos.split(",")]
elif args.repos_dir:
base = Path(args.repos_dir)
repo_paths = [str(p) for p in base.iterdir() if p.is_dir() and not p.name.startswith(".")]
else:
parser.print_help()
sys.exit(1)
for rpath in repo_paths:
name = Path(rpath).name
print(f"Scanning {name}...", file=sys.stderr)
result = scan_repo(rpath, name)
if "error" not in result:
results[name] = result
# Detect cycles
cycles = detect_cycles(results)
if args.cycles_only:
if cycles:
print("CIRCULAR DEPENDENCIES DETECTED:")
for cycle in cycles:
print(f" {' -> '.join(cycle)}")
sys.exit(1)
else:
print("No circular dependencies found.")
sys.exit(0)
# Output
output = {}
if args.format == "dot":
output = to_dot(results)
elif args.format == "mermaid":
output = to_mermaid(results)
else:
output = json.dumps({
"repos": results,
"cycles": cycles,
"summary": {
"total_repos": len(results),
"total_deps": sum(len(r["dependencies"]) for r in results.values()),
"cycles_found": len(cycles),
}
}, indent=2)
if args.output:
Path(args.output).write_text(output)
print(f"Written to {args.output}", file=sys.stderr)
else:
print(output)
if __name__ == "__main__":
main()
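Cycle detection with a recursion stack is easy to get subtly wrong, so a compact reference version may help. The sketch below uses the common white/gray/black colouring on a plain adjacency dict; it is not the script's implementation, and `find_cycles` and the sample graph are hypothetical.

```python
from typing import Dict, List

WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_cycles(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return each cycle found via a back edge to a node on the current path."""
    color = {node: WHITE for node in graph}
    path: List[str] = []
    cycles: List[List[str]] = []

    def dfs(node: str) -> None:
        color[node] = GRAY
        path.append(node)
        for neighbor in graph.get(node, []):
            state = color.get(neighbor, WHITE)
            if state == GRAY:
                # Back edge: the cycle is the path slice from neighbor onward.
                cycles.append(path[path.index(neighbor):] + [neighbor])
            elif state == WHITE:
                dfs(neighbor)
        path.pop()
        color[node] = BLACK  # finished nodes are never treated as on-path

    for node in graph:
        if color[node] == WHITE:
            dfs(node)
    return cycles


# "c" depends on a repo that is part of an already reported cycle.
sample = {"a": ["b"], "b": ["a"], "c": ["a"]}
found = find_cycles(sample)
print(found)  # [['a', 'b', 'a']]
```

Marking a node black only after its path entry is popped keeps the "on the current path" state and the path list in sync, which is the invariant the cycle slice relies on.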


@@ -1,216 +0,0 @@
#!/usr/bin/env python3
"""
Diff Analyzer — Parse unified diffs and categorize every change.
Pipeline 6.1 for Compounding Intelligence.
"""
import re
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import List, Dict, Any, Optional
class ChangeCategory(Enum):
ADDED = "added"
DELETED = "deleted"
MODIFIED = "modified"
MOVED = "moved"
CONTEXT = "context"
@dataclass
class Hunk:
"""A single diff hunk with header, line ranges, and category."""
header: str
old_start: int
old_count: int
new_start: int
new_count: int
lines: List[str] = field(default_factory=list)
category: ChangeCategory = ChangeCategory.CONTEXT
def to_dict(self) -> Dict[str, Any]:
d = asdict(self)
d["category"] = self.category.value
return d
@dataclass
class FileChange:
"""A single file's changes."""
path: str
old_path: Optional[str] = None # For renames
hunks: List[Hunk] = field(default_factory=list)
added_lines: int = 0
deleted_lines: int = 0
is_new: bool = False
is_deleted: bool = False
is_renamed: bool = False
is_binary: bool = False
def to_dict(self) -> Dict[str, Any]:
return {
"path": self.path,
"old_path": self.old_path,
"hunks": [h.to_dict() for h in self.hunks],
"added_lines": self.added_lines,
"deleted_lines": self.deleted_lines,
"is_new": self.is_new,
"is_deleted": self.is_deleted,
"is_renamed": self.is_renamed,
"is_binary": self.is_binary,
}
@dataclass
class ChangeSummary:
"""Aggregate stats + per-file breakdown."""
files: List[FileChange] = field(default_factory=list)
total_added: int = 0
total_deleted: int = 0
total_files_changed: int = 0
total_hunks: int = 0
new_files: int = 0
deleted_files: int = 0
renamed_files: int = 0
binary_files: int = 0
def to_dict(self) -> Dict[str, Any]:
return {
"total_files_changed": self.total_files_changed,
"total_added": self.total_added,
"total_deleted": self.total_deleted,
"total_hunks": self.total_hunks,
"new_files": self.new_files,
"deleted_files": self.deleted_files,
"renamed_files": self.renamed_files,
"binary_files": self.binary_files,
"files": [f.to_dict() for f in self.files],
}
class DiffAnalyzer:
"""Parses unified diff format and produces structured ChangeSummary."""
HUNK_HEADER_RE = re.compile(r"^@@\s+-(\d+)(?:,(\d+))?\s+\+(\d+)(?:,(\d+))?\s+@@(.*)$")
DIFF_FILE_RE = re.compile(r"^diff --git a/(.*) b/(.*)")
RENAME_RE = re.compile(r"^rename from (.+)$")
RENAME_TO_RE = re.compile(r"^rename to (.+)$")
NEW_FILE_RE = re.compile(r"^new file mode")
DELETED_FILE_RE = re.compile(r"^deleted file mode")
BINARY_RE = re.compile(r"^Binary files .* differ")
def analyze(self, diff_text: str) -> ChangeSummary:
"""Parse a unified diff and return a ChangeSummary."""
summary = ChangeSummary()
if not diff_text or not diff_text.strip():
return summary
# Split diff into per-file sections
file_diffs = self._split_files(diff_text)
for file_diff in file_diffs:
fc = self._parse_file_diff(file_diff)
summary.files.append(fc)
summary.total_added += fc.added_lines
summary.total_deleted += fc.deleted_lines
summary.total_hunks += len(fc.hunks)
if fc.is_new:
summary.new_files += 1
if fc.is_deleted:
summary.deleted_files += 1
if fc.is_renamed:
summary.renamed_files += 1
if fc.is_binary:
summary.binary_files += 1
summary.total_files_changed = len(summary.files)
return summary
def _split_files(self, diff_text: str) -> List[str]:
"""Split a multi-file diff into individual file diffs."""
lines = diff_text.split("\n")
chunks = []
current = []
for line in lines:
if line.startswith("diff --git ") and current:
chunks.append("\n".join(current))
current = [line]
else:
current.append(line)
if current:
chunks.append("\n".join(current))
return chunks
def _parse_file_diff(self, diff_text: str) -> FileChange:
"""Parse a single file's diff section."""
lines = diff_text.split("\n")
fc = FileChange(path="")
# Extract file paths
for line in lines:
m = self.DIFF_FILE_RE.match(line)
if m:
fc.path = m.group(2)
break
# Check for special states
for line in lines:
if self.NEW_FILE_RE.match(line):
fc.is_new = True
elif self.DELETED_FILE_RE.match(line):
fc.is_deleted = True
elif (rename := self.RENAME_RE.match(line)):
fc.old_path = rename.group(1)
fc.is_renamed = True
elif self.BINARY_RE.match(line):
fc.is_binary = True
return fc # No hunks for binary
# Rename TO
for line in lines:
m = self.RENAME_TO_RE.match(line)
if m and fc.is_renamed:
fc.path = m.group(1)
# Parse hunks
current_hunk = None
for line in lines:
m = self.HUNK_HEADER_RE.match(line)
if m:
if current_hunk:
self._classify_hunk(current_hunk, fc)
fc.hunks.append(current_hunk)
current_hunk = Hunk(
header=m.group(5).strip(),
old_start=int(m.group(1)),
old_count=int(m.group(2) or 1),
new_start=int(m.group(3)),
new_count=int(m.group(4) or 1),
)
elif current_hunk and (line.startswith("+") or line.startswith("-") or line.startswith(" ")):
current_hunk.lines.append(line)
if current_hunk:
self._classify_hunk(current_hunk, fc)
fc.hunks.append(current_hunk)
return fc
def _classify_hunk(self, hunk: Hunk, fc: FileChange):
"""Classify a hunk and count lines."""
added = sum(1 for line in hunk.lines if line.startswith("+"))
deleted = sum(1 for line in hunk.lines if line.startswith("-"))
fc.added_lines += added
fc.deleted_lines += deleted
if added > 0 and deleted == 0:
hunk.category = ChangeCategory.ADDED
elif deleted > 0 and added == 0:
hunk.category = ChangeCategory.DELETED
elif added > 0 and deleted > 0:
hunk.category = ChangeCategory.MODIFIED
else:
hunk.category = ChangeCategory.CONTEXT
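The hunk header pattern is the part of the analyzer most worth checking by hand. The following standalone sketch applies the same regular expression to two headers; `parse_hunk_header` is a hypothetical helper written for this example, and it treats a missing count as 1, which is the unified diff convention.

```python
import re
from typing import Optional, Tuple

HUNK_HEADER_RE = re.compile(
    r"^@@\s+-(\d+)(?:,(\d+))?\s+\+(\d+)(?:,(\d+))?\s+@@(.*)$"
)


def parse_hunk_header(line: str) -> Optional[Tuple[int, int, int, int]]:
    """Return (old_start, old_count, new_start, new_count), or None."""
    match = HUNK_HEADER_RE.match(line)
    if match is None:
        return None
    old_start, old_count, new_start, new_count, _section = match.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


# A whole-file deletion: 282 old lines, zero new lines.
deleted_file = parse_hunk_header("@@ -1,282 +0,0 @@")
# Counts omitted: both sides default to a single line.
single_line = parse_hunk_header("@@ -5 +5 @@ def f():")
print(deleted_file)  # (1, 282, 0, 0)
print(single_line)   # (5, 1, 5, 1)
```

Checking the count against `None` rather than relying on truthiness keeps an explicit count of `0` distinct from an omitted count.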


@@ -1,387 +0,0 @@
#!/usr/bin/env python3
"""
Knowledge Freshness Cron — Detect stale entries from code changes (Issue #200)
Automatically detects when knowledge entries become stale due to code changes.
Detection Method:
1. Track source file hash alongside knowledge entry
2. Compare current file hashes vs stored
3. Mismatch → flag entry as potentially stale
4. Report stale entries and optionally re-extract
Usage:
python3 scripts/freshness.py --knowledge-dir knowledge/
python3 scripts/freshness.py --knowledge-dir knowledge/ --json
python3 scripts/freshness.py --knowledge-dir knowledge/ --repo /path/to/repo
python3 scripts/freshness.py --knowledge-dir knowledge/ --auto-reextract
"""
import argparse
import hashlib
import json
import os
import subprocess
import sys
import yaml
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
def compute_file_hash(filepath: str) -> Optional[str]:
"""Compute SHA-256 hash of a file. Returns None if file doesn't exist."""
try:
with open(filepath, "rb") as f:
return "sha256:" + hashlib.sha256(f.read()).hexdigest()
except (FileNotFoundError, IsADirectoryError, PermissionError):
return None
def get_git_file_changes(repo_path: str, days: int = 1) -> Dict[str, List[str]]:
"""
Get files changed in git in the last N days.
Returns dict with 'modified', 'added', 'deleted' lists of file paths.
"""
changes = {"modified": [], "added": [], "deleted": []}
try:
# Get commits from last N days
cmd = [
"git", "-C", repo_path, "log",
f"--since={days} days ago",
"--name-status",
"--pretty=format:",
"--diff-filter=MAD"
]
result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
if result.returncode != 0:
return changes
for line in result.stdout.splitlines():
line = line.strip()
if not line:
continue
parts = line.split('\t', 1)
if len(parts) != 2:
continue
status, filepath = parts
if status == 'M':
changes["modified"].append(filepath)
elif status == 'A':
changes["added"].append(filepath)
elif status == 'D':
changes["deleted"].append(filepath)
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
# Deduplicate (sorted so the report is stable across runs)
for key in changes:
changes[key] = sorted(set(changes[key]))
return changes
def load_knowledge_entries(knowledge_dir: str) -> List[Dict[str, Any]]:
"""
Load knowledge entries from index.json and the YAML files in the knowledge directory.
Supports:
- knowledge/index.json (legacy format)
- knowledge/global/*.yaml
- knowledge/repos/*.yaml
- knowledge/agents/*.yaml
"""
entries = []
# Load from index.json if exists
index_path = os.path.join(knowledge_dir, "index.json")
if os.path.exists(index_path):
try:
with open(index_path) as f:
data = json.load(f)
for fact in data.get("facts", []):
entries.append({
"source": "index.json",
"fact": fact.get("fact", ""),
"source_file": fact.get("source_file"),
"source_hash": fact.get("source_hash"),
"category": fact.get("category", "unknown"),
"confidence": fact.get("confidence", 0.5)
})
except (json.JSONDecodeError, KeyError):
pass
# Load from YAML files
for subdir in ["global", "repos", "agents"]:
subdir_path = os.path.join(knowledge_dir, subdir)
if not os.path.isdir(subdir_path):
continue
for filename in os.listdir(subdir_path):
if not filename.endswith((".yaml", ".yml")):
continue
filepath = os.path.join(subdir_path, filename)
try:
with open(filepath) as f:
data = yaml.safe_load(f)
if not data or not isinstance(data, dict):
continue
# Extract entries from YAML structure
for key, value in data.items():
if isinstance(value, list):
for item in value:
if isinstance(item, dict):
entries.append({
"source": f"{subdir}/{filename}",
"fact": item.get("description", item.get("fact", "")),
"source_file": item.get("source_file"),
"source_hash": item.get("source_hash"),
"category": item.get("category", "unknown"),
"confidence": item.get("confidence", 0.5)
})
elif isinstance(value, dict):
entries.append({
"source": f"{subdir}/{filename}",
"fact": value.get("description", value.get("fact", "")),
"source_file": value.get("source_file"),
"source_hash": value.get("source_hash"),
"category": value.get("category", "unknown"),
"confidence": value.get("confidence", 0.5)
})
except (yaml.YAMLError, IOError):
pass
return entries
def check_freshness(knowledge_dir: str, repo_root: str = ".",
days: int = 1) -> Dict[str, Any]:
"""
Check freshness of knowledge entries against recent code changes.
Returns:
{
"timestamp": ISO timestamp,
"total_entries": int,
"stale_entries": [...],
"fresh_entries": [...],
"git_changes": {...},
"summary": {...}
}
"""
entries = load_knowledge_entries(knowledge_dir)
git_changes = get_git_file_changes(repo_root, days)
stale_entries = []
fresh_entries = []
for entry in entries:
source_file = entry.get("source_file")
if not source_file:
# Entry without source file reference
fresh_entries.append({**entry, "status": "no_source"})
continue
# Check if source file was recently modified
is_stale = False
reason = ""
if source_file in git_changes["modified"]:
is_stale = True
reason = "source_modified"
elif source_file in git_changes["deleted"]:
is_stale = True
reason = "source_deleted"
elif source_file in git_changes["added"]:
is_stale = True
reason = "source_added"
# Also check hash if available
stored_hash = entry.get("source_hash")
if stored_hash:
full_path = os.path.join(repo_root, source_file)
current_hash = compute_file_hash(full_path)
if current_hash is None:
is_stale = True
reason = "source_missing"
elif current_hash != stored_hash:
is_stale = True
reason = "hash_mismatch"
if is_stale:
stale_entries.append({
**entry,
"status": "stale",
"reason": reason
})
else:
fresh_entries.append({**entry, "status": "fresh"})
# Compute summary
total = len(entries)
stale_count = len(stale_entries)
fresh_count = len(fresh_entries)
# Group stale entries by reason
stale_by_reason = {}
for entry in stale_entries:
reason = entry.get("reason", "unknown")
if reason not in stale_by_reason:
stale_by_reason[reason] = 0
stale_by_reason[reason] += 1
return {
"timestamp": datetime.now(timezone.utc).isoformat(),
"total_entries": total,
"stale_entries": stale_entries,
"fresh_entries": fresh_entries,
"git_changes": git_changes,
"summary": {
"total": total,
"stale": stale_count,
"fresh": fresh_count,
"stale_percentage": round(stale_count / total * 100, 1) if total > 0 else 0,
"stale_by_reason": stale_by_reason,
"git_changes_summary": {
"modified": len(git_changes["modified"]),
"added": len(git_changes["added"]),
"deleted": len(git_changes["deleted"])
}
}
}
def update_stale_hashes(knowledge_dir: str, repo_root: str = ".") -> int:
    """
    Count entries whose stored hash no longer matches the source file.

    Nothing is written back yet: this simplified version only returns the
    number of entries that would need a hash update.
    """
    entries = load_knowledge_entries(knowledge_dir)
    updated = 0
    # This is a simplified version - in practice, you'd need to
    # write back to the specific YAML files
    for entry in entries:
        source_file = entry.get("source_file")
        if not source_file:
            continue
        full_path = os.path.join(repo_root, source_file)
        current_hash = compute_file_hash(full_path)
        if current_hash and entry.get("source_hash") != current_hash:
            # Mark for update (in practice, you'd write back to the file)
            updated += 1
    return updated
def format_report(result: Dict[str, Any], max_items: int = 20) -> str:
"""Format freshness check results as a human-readable report."""
timestamp = result["timestamp"]
summary = result["summary"]
stale_entries = result["stale_entries"]
git_changes = result["git_changes"]
lines = [
"Knowledge Freshness Report",
"=" * 50,
f"Generated: {timestamp}",
f"Total entries: {summary['total']}",
f"Stale entries: {summary['stale']} ({summary['stale_percentage']}%)",
f"Fresh entries: {summary['fresh']}",
""
]
# Git changes summary
lines.extend([
"Git Changes (within the --days window):",
f" Modified: {len(git_changes['modified'])} files",
f" Added: {len(git_changes['added'])} files",
f" Deleted: {len(git_changes['deleted'])} files",
""
])
# Stale entries by reason
if summary.get("stale_by_reason"):
lines.extend([
"Stale Entries by Reason:",
""
])
for reason, count in summary["stale_by_reason"].items():
lines.append(f" {reason}: {count}")
lines.append("")
# List stale entries
if stale_entries:
lines.extend([
"Stale Entries:",
""
])
for i, entry in enumerate(stale_entries[:max_items], 1):
source = entry.get("source_file", "?")
reason = entry.get("reason", "unknown")
fact = entry.get("fact", "")[:60]
lines.append(f"{i:2d}. [{reason}] {source}")
if fact:
lines.append(f" {fact}")
if len(stale_entries) > max_items:
lines.append(f"\n... and {len(stale_entries) - max_items} more")
else:
lines.append("No stale entries found. All knowledge is fresh!")
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(
description="Knowledge Freshness Cron — detect stale entries from code changes")
parser.add_argument("--knowledge-dir", required=True,
help="Path to knowledge directory")
parser.add_argument("--repo", default=".",
help="Path to repository for git change detection")
parser.add_argument("--days", type=int, default=1,
help="Number of days to check for git changes (default: 1)")
parser.add_argument("--json", action="store_true",
help="Output as JSON instead of human-readable")
parser.add_argument("--max", type=int, default=20,
help="Maximum stale entries to show (default: 20)")
parser.add_argument("--auto-reextract", action="store_true",
help="Request re-extraction for stale entries (not yet implemented; prints a notice)")
args = parser.parse_args()
if not os.path.isdir(args.knowledge_dir):
print(f"Error: {args.knowledge_dir} is not a directory", file=sys.stderr)
sys.exit(1)
if not os.path.isdir(args.repo):
print(f"Error: {args.repo} is not a directory", file=sys.stderr)
sys.exit(1)
result = check_freshness(args.knowledge_dir, args.repo, args.days)
if args.json:
print(json.dumps(result, indent=2))
else:
print(format_report(result, args.max))
# Auto-re-extract if requested
if args.auto_reextract and result["stale_entries"]:
print(f"\nAuto-re-extracting {len(result['stale_entries'])} stale entries...")
# In a real implementation, this would call the harvester
print("(Auto-re-extraction not yet implemented)")
if __name__ == "__main__":
main()
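
The hash comparison at the core of the freshness check can be sketched on its own. This is a minimal illustration rather than the script's actual code path: `file_hash`, `staleness_reason`, and `demo` are hypothetical names, and only the `sha256:` prefix convention and the two reason strings are taken from the code above.

```python
import hashlib
import os
import tempfile
from typing import List, Optional


def file_hash(path: str) -> Optional[str]:
    """Return 'sha256:<hex>' for a file, or None if it cannot be read."""
    try:
        with open(path, "rb") as f:
            return "sha256:" + hashlib.sha256(f.read()).hexdigest()
    except OSError:
        return None


def staleness_reason(stored_hash: str, path: str) -> Optional[str]:
    """Return a reason string if the entry is stale, else None."""
    current = file_hash(path)
    if current is None:
        return "source_missing"
    if current != stored_hash:
        return "hash_mismatch"
    return None


def demo() -> List[Optional[str]]:
    """Walk one file through unchanged -> modified -> deleted."""
    reasons = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "module.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write("x = 1\n")
        stored = file_hash(path)  # hash recorded when the fact was harvested
        reasons.append(staleness_reason(stored, path))
        with open(path, "w", encoding="utf-8") as f:
            f.write("x = 2\n")
        reasons.append(staleness_reason(stored, path))
        os.remove(path)
        reasons.append(staleness_reason(stored, path))
    return reasons
```

Calling `demo()` exercises the three states in order. A hash check of this kind catches changes that fall outside the git `--days` window, which is presumably why the script runs both checks.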


@@ -1,131 +0,0 @@
#!/usr/bin/env python3
"""
Gitea Issue Body Parser — Extract structured data from markdown issue bodies.
Usage:
cat issue_body.txt | python3 scripts/gitea_issue_parser.py --stdin --pretty
python3 scripts/gitea_issue_parser.py --url https://forge.../api/v1/repos/.../issues/123 --pretty
python3 scripts/gitea_issue_parser.py body.txt --title "Fix thing (#42)" --labels pipeline extraction
"""
import argparse
import json
import re
import sys
from typing import Dict, List, Any, Optional
def parse_issue_body(body: str, title: str = "", labels: Optional[List[str]] = None) -> Dict[str, Any]:
"""Parse a Gitea issue markdown body into structured JSON.
Extracted fields:
- title: Issue title
- context: Background/description section
- criteria[]: Acceptance criteria (checkboxes or numbered lists)
- labels[]: Issue labels
- epic_ref: Parent/epic issue reference (from "Closes #N" or title)
- sections{}: All ## sections as key-value pairs
"""
result = {
"title": title,
"context": "",
"criteria": [],
"labels": labels or [],
"epic_ref": None,
"sections": {},
}
if not body:
return result
# Extract epic reference from title or body
epic_patterns = [
r"(?:closes|fixes|addresses|refs?)\s+#(\d+)",
r"#(\d+)",
]
for pattern in epic_patterns:
match = re.search(pattern, (title + " " + body).lower())
if match:
result["epic_ref"] = int(match.group(1))
break
# Parse ## sections
section_pattern = r"^##\s+(.+?)$\n((?:^(?!##\s).*$\n?)*)"
for match in re.finditer(section_pattern, body, re.MULTILINE):
section_name = match.group(1).strip().lower().replace(" ", "_")
section_content = match.group(2).strip()
result["sections"][section_name] = section_content
# Extract acceptance criteria (checkboxes)
checkbox_pattern = r"^\s*-\s*\[([ xX])\]\s*(.+)$"
for match in re.finditer(checkbox_pattern, body, re.MULTILINE):
checked = match.group(1).lower() == "x"
text = match.group(2).strip()
result["criteria"].append({"text": text, "checked": checked})
# If no checkboxes, try numbered lists in "Acceptance Criteria" or "Criteria" section
if not result["criteria"]:
for section_name in ["acceptance_criteria", "criteria"]:
if section_name in result["sections"]:
numbered = r"^\s*\d+\.\s*(.+)$"
for match in re.finditer(numbered, result["sections"][section_name], re.MULTILINE):
result["criteria"].append({"text": match.group(1).strip(), "checked": False})
break
# Extract context (first section or first paragraph before any ## heading)
first_heading = body.find("## ")
if first_heading > 0:
context_text = body[:first_heading].strip()
else:
context_text = body.split("\n\n")[0].strip()
# Clean up: remove a leading heading line such as "## Context" or "## Problem Statement"
context_text = re.sub(r"^#+[^\n]*\n?", "", context_text).strip()
result["context"] = context_text[:500] # Cap at 500 chars
return result
def fetch_issue_from_url(url: str) -> Dict[str, Any]:
    """Fetch an issue from a Gitea API URL and parse it."""
    import urllib.request
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return parse_issue_body(
        # Guard against null values in the API response
        body=data.get("body") or "",
        title=data.get("title") or "",
        labels=[label["name"] for label in data.get("labels") or []],
    )
def main():
parser = argparse.ArgumentParser(description="Parse Gitea issue markdown into structured JSON")
parser.add_argument("file", nargs="?", help="Issue body file (or use --stdin)")
parser.add_argument("--stdin", action="store_true", help="Read from stdin")
parser.add_argument("--url", help="Gitea API URL to fetch issue from")
parser.add_argument("--title", default="", help="Issue title")
parser.add_argument("--labels", nargs="*", default=[], help="Issue labels")
parser.add_argument("--pretty", action="store_true", help="Pretty-print JSON output")
args = parser.parse_args()
if args.url:
result = fetch_issue_from_url(args.url)
elif args.stdin:
body = sys.stdin.read()
result = parse_issue_body(body, args.title, args.labels)
elif args.file:
with open(args.file) as f:
body = f.read()
result = parse_issue_body(body, args.title, args.labels)
else:
parser.print_help()
sys.exit(1)
indent = 2 if args.pretty else None
print(json.dumps(result, indent=indent))
if __name__ == "__main__":
main()
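
To see what the parser's regular expressions pick up, the sketch below runs the same three patterns against a small made-up issue body. `SAMPLE_BODY` and the helper names are illustrative assumptions; the patterns themselves are copied from `parse_issue_body`.

```python
import re
from typing import Dict, List, Optional

# Hypothetical issue body used only for this illustration
SAMPLE_BODY = """Harvester drops facts. Closes #42

## Acceptance Criteria
- [x] Reproduce the failure
- [ ] Add a regression test

## Notes
Seen on main only.
"""


def extract_epic_ref(text: str) -> Optional[int]:
    """Return the issue number following a closing keyword, if any."""
    match = re.search(r"(?:closes|fixes|addresses|refs?)\s+#(\d+)", text.lower())
    return int(match.group(1)) if match else None


def extract_criteria(body: str) -> List[dict]:
    """Collect markdown checkboxes as {text, checked} records."""
    pattern = r"^\s*-\s*\[([ xX])\]\s*(.+)$"
    return [
        {"text": m.group(2).strip(), "checked": m.group(1).lower() == "x"}
        for m in re.finditer(pattern, body, re.MULTILINE)
    ]


def extract_sections(body: str) -> Dict[str, str]:
    """Map each '## Heading' to the text that follows it."""
    pattern = r"^##\s+(.+?)$\n((?:^(?!##\s).*$\n?)*)"
    return {
        m.group(1).strip().lower().replace(" ", "_"): m.group(2).strip()
        for m in re.finditer(pattern, body, re.MULTILINE)
    }
```

Section names are normalized to snake_case, so a heading such as "Acceptance Criteria" is stored under `acceptance_criteria`.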


@@ -1,451 +0,0 @@
#!/usr/bin/env python3
"""
Improvement Proposal Generator for compounding-intelligence.
Analyzes fleet session data to identify waste patterns and generates
concrete improvement proposals with ROI estimates.
Input: Session analytics JSON (from fleet metrics or session database)
Output: Markdown proposal document + JSON proposals
Usage:
python3 scripts/improvement_proposals.py --input analytics.json
python3 scripts/improvement_proposals.py --input analytics.json --format json
python3 scripts/improvement_proposals.py --input analytics.json --output proposals.md
python3 scripts/improvement_proposals.py --input analytics.json --threshold 5.0
"""
import argparse
import json
import os
import sys
from collections import defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional
# ── Data types ─────────────────────────────────────────────────
@dataclass
class WastePattern:
"""A detected waste pattern in fleet sessions."""
pattern_type: str # "repeated_error", "manual_process", "slow_tool", "failed_retry"
description: str
occurrences: int
total_time_hours: float
affected_repos: List[str] = field(default_factory=list)
example_sessions: List[str] = field(default_factory=list)
@dataclass
class ImprovementProposal:
"""A concrete improvement proposal with ROI estimate."""
title: str
problem: str
proposed_solution: str
estimated_weekly_hours_saved: float
estimated_monthly_hours_saved: float
implementation_hours: float
roi_weeks: float # weeks to break even
priority: str # "critical", "high", "medium", "low"
affected_area: str
supporting_evidence: List[str] = field(default_factory=list)
# ── Session analysis ──────────────────────────────────────────
def analyze_sessions(sessions: List[dict]) -> List[WastePattern]:
"""
Analyze session data to find waste patterns.
Looks for:
- Repeated errors (same error across sessions)
- Manual processes (long sequences of similar actions)
- Slow tools (tools with high latency)
- Failed retries (multiple attempts at same task)
"""
patterns = []
# Track error frequency across sessions
error_counts: Dict[str, List[dict]] = defaultdict(list)
tool_latencies: Dict[str, List[float]] = defaultdict(list)
retry_counts: Dict[str, int] = defaultdict(int)
manual_sequences: List[dict] = []
for session in sessions:
session_id = session.get("session_id", "unknown")
repo = session.get("repo", "global")
messages = session.get("messages", [])
errors = session.get("errors", [])
tool_calls = session.get("tool_calls", [])
duration_min = session.get("duration_minutes", 0)
# 1. Repeated errors
for err in errors:
err_key = _normalize_error(err.get("message", ""))
if err_key:
error_counts[err_key].append({
"session_id": session_id,
"repo": repo,
})
# 2. Tool latency tracking
for tc in tool_calls:
tool_name = tc.get("tool", "unknown")
latency = tc.get("latency_ms", 0)
if latency > 0:
tool_latencies[tool_name].append(latency)
# 3. Failed retries (same tool called 3+ times in sequence)
prev_tool = None
streak = 0
for tc in tool_calls:
tool_name = tc.get("tool", "unknown")
if tool_name == prev_tool:
streak += 1
else:
if streak >= 3:
retry_counts[prev_tool] += 1
streak = 1
prev_tool = tool_name
if streak >= 3:
retry_counts[prev_tool] += 1
# 4. Manual processes (more than 10 tool calls using at most 3 distinct tools)
if len(tool_calls) > 10:
tool_sequence = [tc.get("tool") for tc in tool_calls]
unique_tools = set(tool_sequence)
if len(unique_tools) <= 3:
manual_sequences.append({
"session_id": session_id,
"repo": repo,
"tool_count": len(tool_calls),
"unique_tools": list(unique_tools),
"duration_min": duration_min,
})
# Generate patterns from collected data
# Repeated errors (appearing in 3+ sessions)
for err_key, occurrences in error_counts.items():
if len(occurrences) >= 3:
repos = list(set(o["repo"] for o in occurrences))
sessions_list = [o["session_id"] for o in occurrences[:5]]
# Estimate time wasted: 5 min per error occurrence
hours = len(occurrences) * 5 / 60
patterns.append(WastePattern(
pattern_type="repeated_error",
description=f"Error: {err_key[:100]}",
occurrences=len(occurrences),
total_time_hours=round(hours, 1),
affected_repos=repos,
example_sessions=sessions_list,
))
# Slow tools (avg latency > 5000ms across 5+ calls)
for tool, latencies in tool_latencies.items():
if len(latencies) >= 5:
avg_ms = sum(latencies) / len(latencies)
if avg_ms > 5000:
hours = sum(latencies) / 1000 / 3600
patterns.append(WastePattern(
pattern_type="slow_tool",
description=f"Tool '{tool}' averages {avg_ms:.0f}ms latency",
occurrences=len(latencies),
total_time_hours=round(hours, 1),
affected_repos=["global"],
example_sessions=[],
))
# Failed retries
for tool, count in retry_counts.items():
if count >= 2:
hours = count * 10 / 60 # ~10 min per failed retry sequence
patterns.append(WastePattern(
pattern_type="failed_retry",
description=f"Tool '{tool}' had {count} retry sequences (3+ consecutive calls)",
occurrences=count,
total_time_hours=round(hours, 1),
affected_repos=["global"],
example_sessions=[],
))
# Manual processes
for seq in manual_sequences:
hours = seq["duration_min"] / 60
patterns.append(WastePattern(
pattern_type="manual_process",
description=f"Session {seq['session_id'][:12]}: {seq['tool_count']} tool calls with only {len(seq['unique_tools'])} unique tools",
occurrences=1,
total_time_hours=round(hours, 1),
affected_repos=[seq["repo"]],
example_sessions=[seq["session_id"]],
))
return sorted(patterns, key=lambda p: p.total_time_hours, reverse=True)
def _normalize_error(message: str) -> str:
    """Normalize error message to a common key."""
    if not message:
        return ""
    msg = message.lower().strip()
    # Remove variable parts (paths, IDs, timestamps)
    import re
    msg = re.sub(r'/\S+', '/PATH', msg)
    msg = re.sub(r'\b[0-9a-f]{8,}\b', 'HASH', msg)
    msg = re.sub(r'\d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}', 'TIME', msg)
    return msg[:150]
# ── Proposal generation ───────────────────────────────────────
def generate_proposals(
patterns: List[WastePattern],
hourly_rate: float = 50.0,
implementation_overhead: float = 1.5,
) -> List[ImprovementProposal]:
"""
Generate improvement proposals from waste patterns.
Args:
patterns: Detected waste patterns
hourly_rate: Developer hourly rate (currently unused; ROI is computed in hours only)
implementation_overhead: Multiplier for implementation time estimate
"""
proposals = []
# Group patterns by type
by_type: Dict[str, List[WastePattern]] = defaultdict(list)
for p in patterns:
by_type[p.pattern_type].append(p)
# 1. Repeated errors → Create fix/skill
for p in by_type.get("repeated_error", []):
weekly_hours = p.total_time_hours / 4 # monthly → weekly
impl_hours = max(1.0, p.occurrences * 0.25) * implementation_overhead
roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')
proposals.append(ImprovementProposal(
title=f"Fix repeated error: {p.description[:60]}",
problem=f"This error occurred {p.occurrences} times across {len(p.affected_repos)} repos, wasting ~{p.total_time_hours:.1f} hours.",
proposed_solution="Root-cause the error and create a permanent fix or mitigation skill.",
estimated_weekly_hours_saved=round(weekly_hours, 1),
estimated_monthly_hours_saved=round(p.total_time_hours, 1),
implementation_hours=round(impl_hours, 1),
roi_weeks=round(roi_weeks, 1),
priority=_priority_from_roi(roi_weeks),
affected_area="reliability",
supporting_evidence=[f"{p.occurrences} occurrences in sessions: {', '.join(p.example_sessions[:3])}"],
))
# 2. Slow tools → Optimize or replace
for p in by_type.get("slow_tool", []):
weekly_hours = p.total_time_hours / 4
impl_hours = 3.0 * implementation_overhead # optimization task
roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')
proposals.append(ImprovementProposal(
title=f"Optimize slow tool: {p.description[:60]}",
problem=f"Tool has {p.occurrences} calls with high latency, wasting ~{p.total_time_hours:.1f} hours total.",
proposed_solution="Profile the tool, add caching, or replace with a faster alternative.",
estimated_weekly_hours_saved=round(weekly_hours, 1),
estimated_monthly_hours_saved=round(p.total_time_hours, 1),
implementation_hours=round(impl_hours, 1),
roi_weeks=round(roi_weeks, 1),
priority=_priority_from_roi(roi_weeks),
affected_area="performance",
supporting_evidence=[f"{p.occurrences} slow calls detected"],
))
# 3. Failed retries → Add retry logic or validation
for p in by_type.get("failed_retry", []):
weekly_hours = p.total_time_hours / 4
impl_hours = 2.0 * implementation_overhead
roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')
proposals.append(ImprovementProposal(
title=f"Reduce retries: {p.description[:60]}",
problem=f"Tool had {p.occurrences} retry sequences, wasting ~{p.total_time_hours:.1f} hours.",
proposed_solution="Add input validation, pre-flight checks, or automatic retry with backoff.",
estimated_weekly_hours_saved=round(weekly_hours, 1),
estimated_monthly_hours_saved=round(p.total_time_hours, 1),
implementation_hours=round(impl_hours, 1),
roi_weeks=round(roi_weeks, 1),
priority=_priority_from_roi(roi_weeks),
affected_area="reliability",
supporting_evidence=[f"{p.occurrences} retry sequences detected"],
))
# 4. Manual processes → Automate
total_manual_hours = sum(p.total_time_hours for p in by_type.get("manual_process", []))
manual_patterns = by_type.get("manual_process", [])
if manual_patterns:
weekly_hours = total_manual_hours / 4
impl_hours = len(manual_patterns) * 2.0 * implementation_overhead
roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')
proposals.append(ImprovementProposal(
title=f"Automate {len(manual_patterns)} manual processes",
problem=f"{len(manual_patterns)} sessions had long manual tool sequences, wasting ~{total_manual_hours:.1f} hours.",
proposed_solution="Create composite skills or scripts that combine the repeated tool sequences into single operations.",
estimated_weekly_hours_saved=round(weekly_hours, 1),
estimated_monthly_hours_saved=round(total_manual_hours, 1),
implementation_hours=round(impl_hours, 1),
roi_weeks=round(roi_weeks, 1),
priority=_priority_from_roi(roi_weeks),
affected_area="automation",
supporting_evidence=[f"{len(manual_patterns)} manual sessions detected"],
))
return sorted(proposals, key=lambda p: p.estimated_monthly_hours_saved, reverse=True)
def _priority_from_roi(roi_weeks: float) -> str:
    """Determine priority from ROI break-even time."""
    if roi_weeks <= 1:
        return "critical"
    elif roi_weeks <= 4:
        return "high"
    elif roi_weeks <= 12:
        return "medium"
    return "low"
# ── Output formatting ─────────────────────────────────────────
def format_proposals_markdown(
proposals: List[ImprovementProposal],
patterns: List[WastePattern],
generated_at: str,
) -> str:
"""Format proposals as a markdown document."""
lines = [
"# Improvement Proposals",
"",
f"Generated: {generated_at}",
f"Based on analysis of {sum(p.occurrences for p in patterns)} waste events across {len(set(r for p in patterns for r in p.affected_repos))} repos.",
"",
"---",
"",
"## Summary",
"",
f"| Metric | Value |",
f"|--------|-------|",
f"| Total proposals | {len(proposals)} |",
f"| Critical priority | {sum(1 for p in proposals if p.priority == 'critical')} |",
f"| Total monthly hours wasted | {sum(p.estimated_monthly_hours_saved for p in proposals):.1f}h |",
f"| Total weekly hours recoverable | {sum(p.estimated_weekly_hours_saved for p in proposals):.1f}h |",
f"| Implementation investment | {sum(p.implementation_hours for p in proposals):.1f}h |",
"",
"---",
"",
]
for i, prop in enumerate(proposals, 1):
lines.extend([
f"## {i}. {prop.title}",
"",
f"**Priority:** {prop.priority.upper()} ",
f"**Area:** {prop.affected_area} ",
f"**ROI break-even:** {prop.roi_weeks:.1f} weeks",
"",
"### Problem",
"",
prop.problem,
"",
"### Proposed Solution",
"",
prop.proposed_solution,
"",
"### ROI Estimate",
"",
f"- Weekly hours saved: **{prop.estimated_weekly_hours_saved}h**",
f"- Monthly hours saved: **{prop.estimated_monthly_hours_saved}h**",
f"- Implementation time: **{prop.implementation_hours}h**",
f"- Break-even: **{prop.roi_weeks:.1f} weeks**",
"",
])
if prop.supporting_evidence:
lines.extend([
"### Evidence",
"",
])
for ev in prop.supporting_evidence:
lines.append(f"- {ev}")
lines.append("")
lines.extend(["---", ""])
# Waste pattern appendix
lines.extend([
"## Appendix: Detected Waste Patterns",
"",
"| Type | Description | Occurrences | Hours |",
"|------|-------------|-------------|-------|",
])
for p in patterns[:20]:
lines.append(
f"| {p.pattern_type} | {p.description[:60]} | {p.occurrences} | {p.total_time_hours}h |"
)
lines.append("")
return "\n".join(lines)
def format_proposals_json(proposals: List[ImprovementProposal]) -> str:
    """Format proposals as JSON."""
    return json.dumps(
        [asdict(p) for p in proposals],
        indent=2,
        default=str,
    )
# ── Main ──────────────────────────────────────────────────────
def main():
parser = argparse.ArgumentParser(description="Generate improvement proposals from session analytics")
parser.add_argument("--input", required=True, help="Path to session analytics JSON file")
parser.add_argument("--output", help="Output file path (default: stdout)")
parser.add_argument("--format", choices=["markdown", "json"], default="markdown", help="Output format")
parser.add_argument("--hourly-rate", type=float, default=50.0, help="Developer hourly rate for ROI calc")
parser.add_argument("--threshold", type=float, default=2.0, help="Min monthly hours to include proposal")
args = parser.parse_args()
with open(args.input) as f:
data = json.load(f)
sessions = data if isinstance(data, list) else data.get("sessions", [])
if not sessions:
print("No sessions found in input file.", file=sys.stderr)
sys.exit(1)
# Analyze
patterns = analyze_sessions(sessions)
proposals = generate_proposals(patterns, hourly_rate=args.hourly_rate)
# Filter by threshold
proposals = [p for p in proposals if p.estimated_monthly_hours_saved >= args.threshold]
generated_at = datetime.now(timezone.utc).isoformat()
if args.format == "markdown":
output = format_proposals_markdown(proposals, patterns, generated_at)
else:
output = format_proposals_json(proposals)
if args.output:
with open(args.output, "w") as f:
f.write(output)
print(f"Wrote {len(proposals)} proposals to {args.output}", file=sys.stderr)
else:
print(output)
if __name__ == "__main__":
main()
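
The break-even arithmetic used for repeated-error proposals can be worked through in isolation. The sketch below assumes the same constants as the script (5 minutes per error, 0.25 implementation hours per occurrence, a 1.5 overhead multiplier, and a month treated as 4 weeks); `repeated_error_roi` and its parameter names are hypothetical.

```python
def priority_from_roi(roi_weeks: float) -> str:
    """Map break-even time in weeks to a priority label."""
    if roi_weeks <= 1:
        return "critical"
    if roi_weeks <= 4:
        return "high"
    if roi_weeks <= 12:
        return "medium"
    return "low"


def repeated_error_roi(occurrences: int,
                       minutes_per_error: float = 5.0,
                       overhead: float = 1.5) -> dict:
    """Estimate break-even time for fixing an error seen `occurrences` times."""
    # Time lost over the analysed period, rounded as in analyze_sessions
    monthly_hours = round(occurrences * minutes_per_error / 60, 1)
    weekly_hours = monthly_hours / 4
    # Implementation cost grows with occurrence count, with a 1-hour floor
    impl_hours = max(1.0, occurrences * 0.25) * overhead
    roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float("inf")
    return {
        "monthly_hours": monthly_hours,
        "weekly_hours": weekly_hours,
        "implementation_hours": impl_hours,
        "roi_weeks": roi_weeks,
        "priority": priority_from_roi(roi_weeks),
    }
```

With these constants, 24 occurrences give 2.0 monthly hours, 9.0 implementation hours, and an 18-week break-even, which falls in the "low" bucket. Before rounding, the ratio works out to 18 weeks for any count of 4 or more (0.375n implementation hours against n/48 weekly hours), so it may be worth checking whether the repeated-error constants produce the intended spread of priorities.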


@@ -1,275 +0,0 @@
"""
Knowledge Gap Identifier — Pipeline 10.7
Cross-references code, docs, and tests to find gaps:
- Undocumented functions/classes
- Untested code paths
- Documented but missing implementations
- Test files without corresponding source
Produces a gap report with severity and suggestions.
"""
from __future__ import annotations
import ast
import os
import re
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Dict, List, Optional, Set
class GapSeverity(Enum):
INFO = "info"
WARNING = "warning"
ERROR = "error"
class GapType(Enum):
UNDOCUMENTED = "undocumented"
UNTESTED = "untested"
MISSING_IMPLEMENTATION = "missing_implementation"
ORPHAN_TEST = "orphan_test"
STALE_DOC = "stale_doc"
@dataclass
class Gap:
"""A single knowledge gap."""
gap_type: GapType
severity: GapSeverity
file: str
line: Optional[int]
name: str
description: str
suggestion: str
@dataclass
class GapReport:
"""Full gap analysis report."""
repo_path: str
gaps: List[Gap] = field(default_factory=list)
stats: Dict[str, int] = field(default_factory=dict)
def summary(self) -> str:
lines = [f"Gap Report for {self.repo_path}", "=" * 40]
by_type = {}
for g in self.gaps:
by_type.setdefault(g.gap_type.value, []).append(g)
for gtype, items in sorted(by_type.items()):
lines.append(f"\n{gtype.upper()} ({len(items)}):")
for g in items:
loc = f"{g.file}:{g.line}" if g.line else g.file
lines.append(f" [{g.severity.value}] {g.name} @ {loc}")
lines.append(f" {g.description}")
lines.append(f"\nTotal gaps: {len(self.gaps)}")
self.stats = {k: len(v) for k, v in by_type.items()}
return "\n".join(lines)
def to_dict(self) -> dict:
return {
"repo_path": self.repo_path,
"total_gaps": len(self.gaps),
"stats": {k: len(v) for k, v in
{gt: [g for g in self.gaps if g.gap_type == gt]
for gt in GapType}.items() if v},
"gaps": [
{
"type": g.gap_type.value,
"severity": g.severity.value,
"file": g.file,
"line": g.line,
"name": g.name,
"description": g.description,
"suggestion": g.suggestion,
}
for g in self.gaps
],
}
def _collect_python_files(root: Path) -> List[Path]:
    """Collect .py files, excluding venv/node_modules/.git."""
    skip = {".git", "venv", "env", ".venv", "node_modules", "__pycache__", ".tox", ".mypy_cache"}
    files = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them
        dirnames[:] = [d for d in dirnames if d not in skip]
        for f in filenames:
            if f.endswith(".py"):
                files.append(Path(dirpath) / f)
    return files
def _extract_python_symbols(filepath: Path) -> Set[str]:
    """Extract top-level function and class names from a Python file."""
    symbols = set()
    try:
        source = filepath.read_text(encoding="utf-8", errors="replace")
        tree = ast.parse(source, filename=str(filepath))
    except (SyntaxError, ValueError, OSError):
        # ValueError: e.g. null bytes on some Python versions; OSError: unreadable file
        return symbols
    for node in ast.iter_child_nodes(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.add(node.name)
    return symbols
def _extract_doc_symbols(filepath: Path) -> Set[str]:
"""Extract function/class names mentioned in markdown docs."""
symbols = set()
try:
text = filepath.read_text(encoding="utf-8", errors="replace")
except (UnicodeDecodeError, OSError):
return symbols
# Match backtick-quoted identifiers: `ClassName`, `func_name`, `func()`
for m in re.finditer(r"`([A-Za-z_]\w+)(?:\(\))?`", text):
symbols.add(m.group(1))
# Match ## ClassName or ### func_name headings
for m in re.finditer(r"^#{1,4}\s+(\w+)", text, re.MULTILINE):
symbols.add(m.group(1))
return symbols
def _collect_test_files(root: Path) -> Dict[str, Path]:
"""Map test module names to their file paths."""
test_map = {}
for dirpath, dirnames, filenames in os.walk(root):
dirnames[:] = [d for d in dirnames if d not in {".git", "venv", "node_modules"}]
for f in filenames:
if f.startswith("test_") and f.endswith(".py"):
# test_foo.py -> foo
module_name = f[5:-3]
test_map[module_name] = Path(dirpath) / f
return test_map
class KnowledgeGapIdentifier:
"""Analyzes a repo for knowledge gaps between code, docs, and tests."""
def analyze(self, repo_path: str) -> GapReport:
root = Path(repo_path).resolve()
report = GapReport(repo_path=str(root))
if not root.is_dir():
report.gaps.append(Gap(
gap_type=GapType.UNDOCUMENTED,
severity=GapSeverity.ERROR,
file=str(root),
line=None,
name="repo",
description="Path is not a directory",
suggestion="Provide a valid repo directory",
))
return report
# Collect artifacts
py_files = _collect_python_files(root)
doc_files = list(root.glob("docs/**/*.md")) + list(root.glob("*.md"))
test_map = _collect_test_files(root / "tests") if (root / "tests").is_dir() else {}
# Extract symbols from each source file
source_symbols: Dict[str, Set[str]] = {} # relative_path -> symbols
all_source_symbols: Set[str] = set()
for pf in py_files:
rel = str(pf.relative_to(root))
# Skip test files (in any directory) and setup/config
if "/tests/" in rel or rel.startswith("tests/") or pf.name.startswith("test_"):
continue
if pf.name in ("setup.py", "conftest.py", "conf.py"):
continue
syms = _extract_python_symbols(pf)
if syms:
source_symbols[rel] = syms
all_source_symbols.update(syms)
# Extract documented symbols
doc_symbols: Set[str] = set()
for df in doc_files:
doc_symbols.update(_extract_doc_symbols(df))
# Extract test-covered symbols
tested_modules: Set[str] = set(test_map.keys())
# --- Find gaps ---
# 1. Undocumented: source symbols not in any doc
for rel_path, syms in source_symbols.items():
for sym in sorted(syms):
if sym.startswith("_") and not sym.startswith("__"):
continue # Skip private
if sym not in doc_symbols:
report.gaps.append(Gap(
gap_type=GapType.UNDOCUMENTED,
severity=GapSeverity.WARNING,
file=rel_path,
line=None,
name=sym,
description=f"{sym} defined in {rel_path} but not referenced in any docs",
suggestion=f"Add documentation for {sym} in a .md file",
))
# 2. Untested: source modules without a corresponding test file
for rel_path in source_symbols:
module_name = Path(rel_path).stem
if module_name not in tested_modules and module_name not in ("__init__", "main", "config"):
report.gaps.append(Gap(
gap_type=GapType.UNTESTED,
severity=GapSeverity.ERROR,
file=rel_path,
line=None,
name=module_name,
description=f"No test file found for {rel_path}",
suggestion=f"Create tests/test_{module_name}.py",
))
# 3. Missing implementation: doc references symbol not in any source
referenced_but_missing = doc_symbols - all_source_symbols
for sym in sorted(referenced_but_missing):
# Filter out common non-code terms
if sym.lower() in {"todo", "fixme", "note", "example", "usage", "api",
"install", "setup", "config", "license", "contributing",
"changelog", "readme", "python", "bash", "json", "yaml",
"http", "url", "cli", "gui", "ui", "api", "rest"}:
continue
if len(sym) < 3:
continue
report.gaps.append(Gap(
gap_type=GapType.MISSING_IMPLEMENTATION,
severity=GapSeverity.INFO,
file="(docs)",
line=None,
name=sym,
description=f"{sym} referenced in docs but not found in source code",
suggestion=f"Verify if {sym} should be implemented or update docs",
))
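# 4. Orphan tests: test files without matching source
# Compare against source module names; test_map's own keys would always match
source_stems = {Path(f).stem for f in source_symbols}
for test_mod, test_path in test_map.items():
if test_mod not in source_stems and not any(
test_mod in stem for stem in source_stems
):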
# Check if any source file partially matches
matches_source = any(test_mod.replace("_", "-") in f or test_mod.replace("_", "") in Path(f).stem
for f in source_symbols)
if not matches_source:
rel = str(test_path.relative_to(root))
report.gaps.append(Gap(
gap_type=GapType.ORPHAN_TEST,
severity=GapSeverity.WARNING,
file=rel,
line=None,
name=test_mod,
description=f"Test file {rel} exists but no matching source module found",
suggestion=f"Verify if the source was renamed or removed",
))
return report
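
The gap analysis boils down to set differences between symbols found in code and symbols mentioned in docs. The sketch below is a simplified illustration under stated assumptions: `SOURCE`, `DOCS`, and the function names are made up, and it treats every underscore-prefixed name as private, which is slightly stricter than the class above.

```python
import ast
import re
from typing import Dict, List, Set

# Hypothetical inputs used only for this illustration
SOURCE = '''
def harvest(): ...
class Bootstrapper: ...
def _private(): ...
'''
DOCS = "Call `harvest()` first. See also `measure()`."


def top_level_symbols(source: str) -> Set[str]:
    """Names of top-level functions and classes in a source string."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.iter_child_nodes(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def documented_symbols(text: str) -> Set[str]:
    """Identifiers quoted in backticks, with an optional trailing ()."""
    return {m.group(1) for m in re.finditer(r"`([A-Za-z_]\w+)(?:\(\))?`", text)}


def find_gaps(source: str, docs: str) -> Dict[str, List[str]]:
    """Compare code symbols with documented symbols in both directions."""
    all_symbols = top_level_symbols(source)
    public = {s for s in all_symbols if not s.startswith("_")}
    documented = documented_symbols(docs)
    return {
        "undocumented": sorted(public - documented),
        "missing_implementation": sorted(documented - all_symbols),
    }
```

Here `find_gaps(SOURCE, DOCS)` reports `Bootstrapper` as undocumented and `measure` as referenced in docs but absent from the code. Because matching is by bare name, two modules defining the same symbol are indistinguishable to this approach.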


@@ -1,131 +0,0 @@
#!/usr/bin/env python3
"""
Knowledge Store Staleness Detector — Detect stale knowledge entries by comparing source file hashes.
Usage:
python3 scripts/knowledge_staleness_check.py --index knowledge/index.json
python3 scripts/knowledge_staleness_check.py --index knowledge/index.json --json
python3 scripts/knowledge_staleness_check.py --index knowledge/index.json --fix
"""
import argparse
import hashlib
import json
import os
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Any, Optional
def compute_file_hash(filepath: str) -> Optional[str]:
"""Compute SHA-256 hash of a file. Returns None if file doesn't exist."""
try:
with open(filepath, "rb") as f:
return "sha256:" + hashlib.sha256(f.read()).hexdigest()
except (FileNotFoundError, IsADirectoryError, PermissionError):
return None
def check_staleness(index_path: str, repo_root: str = ".") -> List[Dict[str, Any]]:
"""Check all entries in knowledge index for staleness.
Returns list of entries with staleness info:
- status: "fresh" | "stale" | "missing_source" | "no_hash" | "no_source"
- current_hash: computed hash (if source exists)
- stored_hash: hash from index
"""
with open(index_path) as f:
data = json.load(f)
facts = data.get("facts", [])
results = []
for entry in facts:
source_file = entry.get("source_file")
stored_hash = entry.get("source_hash")
if not source_file:
results.append({**entry, "status": "no_source", "current_hash": None})
continue
full_path = os.path.join(repo_root, source_file)
current_hash = compute_file_hash(full_path)
if current_hash is None:
results.append({**entry, "status": "missing_source", "current_hash": None})
elif not stored_hash:
results.append({**entry, "status": "no_hash", "current_hash": current_hash})
elif current_hash != stored_hash:
results.append({**entry, "status": "stale", "current_hash": current_hash})
else:
results.append({**entry, "status": "fresh", "current_hash": current_hash})
return results
def fix_hashes(index_path: str, repo_root: str = ".") -> int:
"""Add hashes to entries missing them. Returns count of fixed entries."""
with open(index_path) as f:
data = json.load(f)
fixed = 0
for entry in data.get("facts", []):
if entry.get("source_hash"):
continue
source_file = entry.get("source_file")
if not source_file:
continue
full_path = os.path.join(repo_root, source_file)
h = compute_file_hash(full_path)
if h:
entry["source_hash"] = h
fixed += 1
with open(index_path, "w") as f:
json.dump(data, f, indent=2)
return fixed
def main():
parser = argparse.ArgumentParser(description="Check knowledge store staleness")
parser.add_argument("--index", required=True, help="Path to knowledge/index.json")
parser.add_argument("--repo", default=".", help="Repo root for source file resolution")
parser.add_argument("--json", action="store_true", help="Output as JSON")
parser.add_argument("--fix", action="store_true", help="Add hashes to entries missing them")
args = parser.parse_args()
if args.fix:
fixed = fix_hashes(args.index, args.repo)
print(f"Fixed {fixed} entries with missing hashes.")
return
results = check_staleness(args.index, args.repo)
if args.json:
print(json.dumps(results, indent=2))
else:
stale = [r for r in results if r["status"] != "fresh"]
fresh = [r for r in results if r["status"] == "fresh"]
print(f"Knowledge Store Staleness Check")
print(f" Total entries: {len(results)}")
print(f" Fresh: {len(fresh)}")
print(f" Stale/Issues: {len(stale)}")
print()
if stale:
print("Issues found:")
for r in stale:
status = r["status"]
fact = r.get("fact", "?")[:60]
source = r.get("source_file", "?")
print(f" [{status}] {source}: {fact}")
else:
print("All entries are fresh!")
if __name__ == "__main__":
main()
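
The status logic above can be exercised end to end on a throwaway file. This is a minimal sketch, assuming the same `sha256:` prefix and the same check order (missing source, then missing hash, then hash comparison). The helpers `sha256_of` and `classify` and the file name are hypothetical stand-ins, not the script's own functions.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hash the file bytes, using the same "sha256:" prefix as index entries."""
    return "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()


def classify(stored_hash: Optional[str], path: Path) -> str:
    """Mirror the check order above: missing source, no hash, stale, fresh."""
    if not path.is_file():
        return "missing_source"
    if not stored_hash:
        return "no_hash"
    return "fresh" if sha256_of(path) == stored_hash else "stale"


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.py"   # hypothetical source file
    src.write_text("x = 1\n")
    stored = sha256_of(src)          # hash recorded when the fact was stored
    before = classify(stored, src)
    src.write_text("x = 2\n")        # source changes afterwards
    after = classify(stored, src)
    unhashed = classify(None, src)
    src.unlink()
    gone = classify(stored, src)

print(before, after, unhashed, gone)  # fresh stale no_hash missing_source
```

Hashing whole-file bytes means any edit to the source, including whitespace, marks every fact that points at it as stale.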

@@ -1,551 +0,0 @@
#!/usr/bin/env python3
"""
Performance Bottleneck Finder — Identify slow tests, builds, and CI steps.
Analyzes:
1. Pytest output for slow tests
2. Build logs for slow steps
3. CI workflow durations
4. File system for large/slow artifacts
Usage:
python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo
python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo --json
python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo --report metrics/perf_report.md
Weekly cron:
0 9 * * 1 cd /path/to/compounding-intelligence && python3 scripts/perf_bottleneck_finder.py --repo /path/to/target --report metrics/perf_report.md
"""
import argparse
import json
import os
import re
import subprocess
import sys
from collections import defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple
# ── Configuration ──────────────────────────────────────────────────
SLOW_TEST_THRESHOLD_S = 2.0 # Tests slower than this are flagged
SLOW_BUILD_STEP_THRESHOLD_S = 10.0
TOP_N_BOTTLENECKS = 10 # Report top N bottlenecks
PYTEST_DURATIONS_COUNT = 20 # Number of slow tests to collect
LOG_EXTENSIONS = {".log", ".txt"}
@dataclass
class Bottleneck:
"""A single performance bottleneck."""
category: str # "test", "build", "ci", "artifact", "import"
name: str # What's slow
duration_s: float # How long it takes
severity: str # "critical", "warning", "info"
recommendation: str # How to fix
file_path: Optional[str] = None
line_number: Optional[int] = None
@dataclass
class PerfReport:
"""Full performance report."""
timestamp: str
repo_path: str
bottlenecks: List[Bottleneck] = field(default_factory=list)
summary: Dict[str, Any] = field(default_factory=dict)
test_stats: Dict[str, Any] = field(default_factory=dict)
build_stats: Dict[str, Any] = field(default_factory=dict)
ci_stats: Dict[str, Any] = field(default_factory=dict)
def to_dict(self) -> dict:
d = asdict(self)
return d
# ── Test Analysis ──────────────────────────────────────────────────
def find_slow_tests_pytest(repo_path: str) -> List[Bottleneck]:
"""Read slow-test durations from the pytest cache directory, if available."""
bottlenecks = []
# Collection-only run (--co): checks that pytest is usable here but executes no tests
try:
result = subprocess.run(
["python3", "-m", "pytest", "--co", "-q", "--durations=0"],
cwd=repo_path, capture_output=True, text=True, timeout=30
)
# Durations are read from a cache file left by an earlier run, if present (stock pytest may not write this path)
durations_file = os.path.join(repo_path, ".pytest_cache", "v", "cache", "durations")
if os.path.exists(durations_file):
with open(durations_file) as f:
for line in f:
parts = line.strip().split()
if len(parts) >= 2:
try:
duration = float(parts[0])
test_name = " ".join(parts[1:])
if duration > SLOW_TEST_THRESHOLD_S:
severity = "critical" if duration > 10 else "warning"
bottlenecks.append(Bottleneck(
category="test",
name=test_name,
duration_s=duration,
severity=severity,
recommendation=f"Test takes {duration:.1f}s. Consider mocking slow I/O, using fixtures, or marking with @pytest.mark.slow."
))
except ValueError:
continue
except (subprocess.TimeoutExpired, FileNotFoundError):
pass
return bottlenecks
def find_slow_tests_by_scan(repo_path: str) -> List[Bottleneck]:
"""Scan test files for patterns that indicate slow tests."""
bottlenecks = []
test_patterns = [
(r"time\.sleep\((\d+(?:\.\d+)?)\)", "Contains time.sleep() — consider using mock or async wait"),
(r"subprocess\.run\(.*timeout=(\d+)", "Subprocess with timeout — may block test"),
(r"requests\.(get|post|put|delete)\(", "Real HTTP call — mock with responses or httpretty"),
(r"open\([^)]*[\x27\x22]w[\x27\x22]", "File I/O in test — use tmp_path fixture"),
]
for root, dirs, files in os.walk(repo_path):
# Skip hidden and cache dirs
dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__', 'node_modules', '.git'))]
for fname in files:
if not (fname.startswith("test_") or fname.endswith("_test.py")):
continue
if not fname.endswith(".py"):
continue
fpath = os.path.join(root, fname)
rel_path = os.path.relpath(fpath, repo_path)
try:
with open(fpath) as f:
lines = f.readlines()
except (PermissionError, UnicodeDecodeError):
continue
for i, line in enumerate(lines):
for pattern, recommendation in test_patterns:
match = re.search(pattern, line)
if match:
duration = 1.0 # Default estimate
if "sleep" in pattern:
try:
duration = float(match.group(1))
except (ValueError, IndexError):
duration = 1.0
elif "timeout" in pattern:
try:
duration = float(match.group(1))
except (ValueError, IndexError):
duration = 10.0
else:
duration = 2.0 # Estimated
bottlenecks.append(Bottleneck(
category="test",
name=f"{rel_path}:{i+1}",
duration_s=duration,
severity="warning" if duration < 5 else "critical",
recommendation=recommendation,
file_path=rel_path,
line_number=i + 1
))
return bottlenecks
# ── Build Analysis ─────────────────────────────────────────────────
def analyze_build_artifacts(repo_path: str) -> List[Bottleneck]:
"""Find large build artifacts that slow down builds."""
bottlenecks = []
large_dirs = {
"node_modules": "Consider using npm ci --production or yarn --production",
"__pycache__": "Consider .gitignore and cleaning before builds",
".tox": "Consider caching tox environments",
".pytest_cache": "Consider cleaning between CI runs",
"dist": "Check if dist/ artifacts are being rebuilt unnecessarily",
"build": "Check if build/ artifacts are being rebuilt unnecessarily",
".next": "Next.js cache — consider incremental builds",
"venv": "Virtual env in repo — move outside or use Docker",
}
for dirname, recommendation in large_dirs.items():
dirpath = os.path.join(repo_path, dirname)
if os.path.isdir(dirpath):
total_size = 0
file_count = 0
for root, dirs, files in os.walk(dirpath):
for f in files:
try:
fpath = os.path.join(root, f)
total_size += os.path.getsize(fpath)
file_count += 1
except OSError:
pass
if total_size > 10 * 1024 * 1024: # > 10MB
size_mb = total_size / (1024 * 1024)
bottlenecks.append(Bottleneck(
category="build",
name=f"{dirname}/ ({size_mb:.1f}MB, {file_count} files)",
duration_s=size_mb * 0.5, # Rough estimate
severity="critical" if size_mb > 100 else "warning",
recommendation=recommendation
))
return bottlenecks
def analyze_makefile_targets(repo_path: str) -> List[Bottleneck]:
"""Analyze Makefile for potentially slow targets."""
bottlenecks = []
makefiles = []
for root, dirs, files in os.walk(repo_path):
dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__'))]
for f in files:
if f in ("Makefile", "makefile", "GNUmakefile"):
makefiles.append(os.path.join(root, f))
slow_patterns = [
(r"pip install", "pip install without --no-deps or constraints"),
(r"npm install(?!.*--production)", "npm install without --production flag"),
(r"docker build", "Docker build — consider multi-stage and layer caching"),
(r"pytest(?!.*(?:-x|--maxfail))", "pytest without early exit on failure"),
(r"mypy", "Type checking — consider incremental mode"),
]
for mfile in makefiles:
rel_path = os.path.relpath(mfile, repo_path)
try:
with open(mfile) as f:
content = f.read()
except (PermissionError, UnicodeDecodeError):
continue
for pattern, recommendation in slow_patterns:
if re.search(pattern, content):
bottlenecks.append(Bottleneck(
category="build",
name=f"{rel_path}: {pattern}",
duration_s=5.0,
severity="info",
recommendation=recommendation,
file_path=rel_path
))
return bottlenecks
# ── CI Analysis ────────────────────────────────────────────────────
def analyze_github_actions(repo_path: str) -> List[Bottleneck]:
"""Analyze GitHub Actions workflow files for inefficiencies."""
bottlenecks = []
workflow_dir = os.path.join(repo_path, ".github", "workflows")
if not os.path.isdir(workflow_dir):
return bottlenecks
slow_patterns = [
(r"runs-on:\s*ubuntu-latest", 0, "Consider caching dependencies between runs"),
(r"npm install", 2, "Use npm ci instead of npm install for reproducible builds"),
(r"pip install(?!.*--cache-dir)", 2, "Add --cache-dir or use pip cache action"),
(r"docker build(?!.*--cache-from)", 5, "Use Docker layer caching"),
(r"python -m pytest(?!.*(?:-n|--numprocesses))", 3, "Consider pytest-xdist for parallel test execution"),
]
for fname in os.listdir(workflow_dir):
if not fname.endswith(('.yml', '.yaml')):
continue
fpath = os.path.join(workflow_dir, fname)
try:
with open(fpath) as f:
content = f.read()
except (PermissionError, UnicodeDecodeError):
continue
for pattern, est_savings, recommendation in slow_patterns:
if re.search(pattern, content):
bottlenecks.append(Bottleneck(
category="ci",
name=f"{fname}: {pattern}",
duration_s=est_savings,
severity="info",
recommendation=recommendation,
file_path=f".github/workflows/{fname}"
))
return bottlenecks
def analyze_gitea_ci(repo_path: str) -> List[Bottleneck]:
"""Analyze Gitea, Drone, and Woodpecker CI config files."""
bottlenecks = []
ci_files = [".gitea/workflows", ".drone.yml", ".woodpecker.yml"]
for ci_path in ci_files:
full_path = os.path.join(repo_path, ci_path)
if os.path.isfile(full_path):
try:
with open(full_path) as f:
content = f.read()
except (PermissionError, UnicodeDecodeError):
continue
if "pip install" in content and "--cache-dir" not in content:
bottlenecks.append(Bottleneck(
category="ci",
name=f"{ci_path}: pip without cache",
duration_s=5.0,
severity="warning",
recommendation="Add --cache-dir or mount pip cache volume",
file_path=ci_path
))
elif os.path.isdir(full_path):
for fname in os.listdir(full_path):
if not fname.endswith(('.yml', '.yaml')):
continue
fpath = os.path.join(full_path, fname)
try:
with open(fpath) as f:
content = f.read()
except (PermissionError, UnicodeDecodeError):
continue
if "pip install" in content and "--cache-dir" not in content:
bottlenecks.append(Bottleneck(
category="ci",
name=f"{ci_path}/{fname}: pip without cache",
duration_s=5.0,
severity="warning",
recommendation="Add --cache-dir or mount pip cache volume",
file_path=f"{ci_path}/{fname}"
))
return bottlenecks
# ── Import Analysis ────────────────────────────────────────────────
def find_slow_imports(repo_path: str) -> List[Bottleneck]:
"""Find Python files with heavy import chains."""
bottlenecks = []
heavy_imports = {
"pandas": 0.5,
"numpy": 0.3,
"torch": 2.0,
"tensorflow": 3.0,
"scipy": 0.5,
"matplotlib": 0.8,
"sklearn": 0.5,
"transformers": 1.5,
}
for root, dirs, files in os.walk(repo_path):
dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__', 'node_modules'))]
for fname in files:
if not fname.endswith(".py"):
continue
fpath = os.path.join(root, fname)
rel_path = os.path.relpath(fpath, repo_path)
try:
with open(fpath) as f:
lines = f.readlines()
except (PermissionError, UnicodeDecodeError):
continue
for i, line in enumerate(lines):
stripped = line.strip()
if stripped.startswith("import ") or stripped.startswith("from "):
for heavy, est_time in heavy_imports.items():
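# Match the package name as a whole word so e.g. "numpy_utils" is not flagged as numpy
if re.search(rf"\b{re.escape(heavy)}\b", stripped):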
bottlenecks.append(Bottleneck(
category="import",
name=f"{rel_path}:{i+1}: import {heavy}",
duration_s=est_time,
severity="info" if est_time < 1.0 else "warning",
recommendation=f"Heavy import ({heavy} ~{est_time}s). Consider lazy import or conditional import.",
file_path=rel_path,
line_number=i + 1
))
return bottlenecks
# ── Report Generation ──────────────────────────────────────────────
def severity_sort_key(b: Bottleneck) -> Tuple[int, float]:
"""Sort by severity then duration."""
sev_order = {"critical": 0, "warning": 1, "info": 2}
return (sev_order.get(b.severity, 3), -b.duration_s)
def generate_report(repo_path: str) -> PerfReport:
"""Run all analyses and generate a performance report."""
report = PerfReport(
timestamp=datetime.now(timezone.utc).isoformat(),
repo_path=os.path.abspath(repo_path)
)
# Collect all bottlenecks
all_bottlenecks = []
print("Scanning for slow tests (pytest cache)...")
all_bottlenecks.extend(find_slow_tests_pytest(repo_path))
print("Scanning for slow test patterns...")
all_bottlenecks.extend(find_slow_tests_by_scan(repo_path))
print("Analyzing build artifacts...")
all_bottlenecks.extend(analyze_build_artifacts(repo_path))
print("Analyzing Makefiles...")
all_bottlenecks.extend(analyze_makefile_targets(repo_path))
print("Analyzing CI workflows...")
all_bottlenecks.extend(analyze_github_actions(repo_path))
all_bottlenecks.extend(analyze_gitea_ci(repo_path))
print("Scanning for heavy imports...")
all_bottlenecks.extend(find_slow_imports(repo_path))
# Sort by severity and duration
all_bottlenecks.sort(key=severity_sort_key)
report.bottlenecks = all_bottlenecks[:TOP_N_BOTTLENECKS * 2] # Keep more for stats
# Compute summary
by_category = defaultdict(list)
for b in all_bottlenecks:
by_category[b.category].append(b)
report.summary = {
"total_bottlenecks": len(all_bottlenecks),
"critical": sum(1 for b in all_bottlenecks if b.severity == "critical"),
"warning": sum(1 for b in all_bottlenecks if b.severity == "warning"),
"info": sum(1 for b in all_bottlenecks if b.severity == "info"),
"estimated_total_slowdown_s": sum(b.duration_s for b in all_bottlenecks),
"by_category": {cat: len(items) for cat, items in by_category.items()},
}
report.test_stats = {
"slow_tests": len(by_category.get("test", [])),
"total_estimated_s": sum(b.duration_s for b in by_category.get("test", [])),
}
report.build_stats = {
"build_issues": len(by_category.get("build", [])),
"total_estimated_s": sum(b.duration_s for b in by_category.get("build", [])),
}
report.ci_stats = {
"ci_issues": len(by_category.get("ci", [])),
"total_estimated_s": sum(b.duration_s for b in by_category.get("ci", [])),
}
return report
def format_markdown(report: PerfReport) -> str:
"""Format report as markdown."""
lines = []
lines.append(f"# Performance Bottleneck Report")
lines.append(f"")
lines.append(f"Generated: {report.timestamp}")
lines.append(f"Repository: {report.repo_path}")
lines.append(f"")
# Summary
s = report.summary
lines.append(f"## Summary")
lines.append(f"")
lines.append(f"- **Total bottlenecks:** {s['total_bottlenecks']}")
lines.append(f"- **Critical:** {s['critical']} | **Warning:** {s['warning']} | **Info:** {s['info']}")
lines.append(f"- **Estimated total slowdown:** {s['estimated_total_slowdown_s']:.1f}s")
lines.append(f"- **By category:** {', '.join(f'{k}: {v}' for k, v in s['by_category'].items())}")
lines.append(f"")
# Top bottlenecks
lines.append(f"## Top {min(TOP_N_BOTTLENECKS, len(report.bottlenecks))} Bottlenecks")
lines.append(f"")
for i, b in enumerate(report.bottlenecks[:TOP_N_BOTTLENECKS], 1):
icon = {"critical": "🔴", "warning": "🟡", "info": "🔵"}.get(b.severity, "")
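# Omit the line number when it is unknown instead of printing "None"
loc = (f" ({b.file_path}:{b.line_number})" if b.line_number else f" ({b.file_path})") if b.file_path else ""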
lines.append(f"{i}. {icon} **{b.category}** — {b.name}{loc}")
lines.append(f" - Duration: ~{b.duration_s:.1f}s | Severity: {b.severity}")
lines.append(f" - Fix: {b.recommendation}")
lines.append(f"")
# Category breakdowns
for cat in ["test", "build", "ci", "import"]:
items = [b for b in report.bottlenecks if b.category == cat]
if items:
lines.append(f"## {cat.title()} Bottlenecks")
lines.append(f"")
for b in items:
icon = {"critical": "🔴", "warning": "🟡", "info": "🔵"}.get(b.severity, "")
loc = (f" ({b.file_path}:{b.line_number})" if b.line_number else f" ({b.file_path})") if b.file_path else ""
lines.append(f"- {icon} {b.name}{loc} — ~{b.duration_s:.1f}s — {b.recommendation}")
lines.append(f"")
return "\n".join(lines)
# ── Main ───────────────────────────────────────────────────────────
def main():
parser = argparse.ArgumentParser(description="Performance Bottleneck Finder")
parser.add_argument("--repo", default=".", help="Path to repository to analyze")
parser.add_argument("--json", action="store_true", help="Output as JSON")
parser.add_argument("--report", help="Write markdown report to file")
parser.add_argument("--threshold", type=float, default=SLOW_TEST_THRESHOLD_S,
help="Slow test threshold in seconds")
args = parser.parse_args()
# Apply the --threshold override to the module-level value that find_slow_tests_pytest() reads
globals()["SLOW_TEST_THRESHOLD_S"] = args.threshold
if not os.path.isdir(args.repo):
print(f"Error: {args.repo} is not a directory", file=sys.stderr)
sys.exit(1)
report = generate_report(args.repo)
if args.json:
print(json.dumps(report.to_dict(), indent=2))
else:
md = format_markdown(report)
if args.report:
os.makedirs(os.path.dirname(args.report) or ".", exist_ok=True)
with open(args.report, "w") as f:
f.write(md)
print(f"Report written to {args.report}")
else:
print(md)
# Exit code: 1 if critical bottlenecks found
if report.summary.get("critical", 0) > 0:
sys.exit(1)
if __name__ == "__main__":
main()
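
The `slow_patterns` tables above use negative lookaheads to skip commands that already carry a mitigating flag. One pitfall with that approach is that alternation binds more loosely than concatenation, so `(?!.*-x|--maxfail)` and `(?!.*(?:-x|--maxfail))` behave differently. The sketch below illustrates the difference on made-up command lines. The variable names are hypothetical.

```python
import re

# Ungrouped: the lookahead rejects "(anything then -x)" OR
# "(--maxfail immediately after 'pytest')", which rarely occurs.
loose = re.compile(r"pytest(?!.*-x|--maxfail)")
# Grouped: ".*" applies to both flags, so either may appear later on the line.
grouped = re.compile(r"pytest(?!.*(?:-x|--maxfail))")

with_maxfail = "pytest tests/ --maxfail=1"
plain = "pytest tests/"

loose_hit = bool(loose.search(with_maxfail))
grouped_hit = bool(grouped.search(with_maxfail))
plain_hits = (bool(loose.search(plain)), bool(grouped.search(plain)))

print(loose_hit, grouped_hit)  # True False
print(plain_hits)              # (True, True)
```

Only the grouped form treats `--maxfail` later on the line as an early-exit flag. Both forms still flag a bare `pytest` invocation.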

@@ -1,682 +0,0 @@
#!/usr/bin/env python3
"""
Priority Rebalancer — Re-evaluate issue priorities based on accumulated data.
Reads pipeline outputs, knowledge store, and Gitea issues to suggest
priority changes based on what the fleet has learned.
Usage:
python3 scripts/priority_rebalancer.py --org Timmy_Foundation
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --repo compounding-intelligence
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --dry-run
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --apply
Output:
metrics/priority_report.json — full analysis
metrics/priority_suggestions.md — human-readable suggestions
"""
import argparse
import json
import os
import sys
from datetime import datetime, timezone, timedelta
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
from dataclasses import dataclass, field, asdict
from collections import Counter, defaultdict
import urllib.error
import urllib.parse
import urllib.request
# ============================================================
# Data Models
# ============================================================
@dataclass
class IssueScore:
issue_id: int
repo: str
title: str
current_labels: List[str]
current_priority: Optional[str]
suggested_priority: Optional[str]
score: float
reasons: List[str]
age_days: int
comment_count: int
assignee: Optional[str]
dependencies: List[str] = field(default_factory=list)
blocking: List[str] = field(default_factory=list)
@dataclass
class PipelineSignal:
source: str # "knowledge", "metrics", "sessions", "staleness"
signal_type: str # "stale_knowledge", "high_error_rate", "missing_coverage", etc.
weight: float # 0.0 - 1.0
detail: str
affected_repos: List[str] = field(default_factory=list)
affected_issues: List[int] = field(default_factory=list)
# ============================================================
# Gitea API Client
# ============================================================
class GiteaClient:
def __init__(self, base_url: str, token: str):
self.base_url = base_url.rstrip("/")
self.token = token
def _request(self, path: str, params: Dict = None) -> Any:
url = f"{self.base_url}/api/v1{path}"
if params:
# URL-encode values so spaces and reserved characters cannot corrupt the query string
qs = urllib.parse.urlencode({k: v for k, v in params.items() if v is not None})
url += f"?{qs}"
req = urllib.request.Request(url)
req.add_header("Authorization", f"token {self.token}")
req.add_header("Content-Type", "application/json")
try:
with urllib.request.urlopen(req, timeout=30) as resp:
return json.loads(resp.read().decode())
except urllib.error.HTTPError as e:
print(f"API error {e.code} for {path}: {e.read().decode()[:200]}", file=sys.stderr)
return None
def get_org_repos(self, org: str) -> List[Dict]:
repos = []
page = 1
while True:
batch = self._request(f"/orgs/{org}/repos", {"limit": 50, "page": page})
if not batch:
break
repos.extend(batch)
if len(batch) < 50:
break
page += 1
return repos
def get_issues(self, org: str, repo: str, state: str = "open") -> List[Dict]:
issues = []
page = 1
while True:
batch = self._request(f"/repos/{org}/{repo}/issues",
{"state": state, "limit": 50, "page": page, "type": "issues"})
if not batch:
break
issues.extend(batch)
if len(batch) < 50:
break
page += 1
return issues
def add_label_to_issue(self, org: str, repo: str, issue_num: int, label_ids: List[int]) -> bool:
url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/labels"
data = json.dumps({"labels": label_ids}).encode()
req = urllib.request.Request(url, data=data, method="POST")
req.add_header("Authorization", f"token {self.token}")
req.add_header("Content-Type", "application/json")
try:
with urllib.request.urlopen(req, timeout=15) as resp:
return resp.status == 200
except Exception:
return False
def remove_label_from_issue(self, org: str, repo: str, issue_num: int, label_id: int) -> bool:
url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/labels/{label_id}"
req = urllib.request.Request(url, method="DELETE")
req.add_header("Authorization", f"token {self.token}")
try:
with urllib.request.urlopen(req, timeout=15) as resp:
# A successful DELETE typically returns 204 No Content, so accept both codes
return resp.status in (200, 204)
except Exception:
return False
def get_repo_labels(self, org: str, repo: str) -> List[Dict]:
labels = []
page = 1
while True:
batch = self._request(f"/repos/{org}/{repo}/labels", {"limit": 50, "page": page})
if not batch:
break
labels.extend(batch)
if len(batch) < 50:
break
page += 1
return labels
def add_comment(self, org: str, repo: str, issue_num: int, body: str) -> bool:
url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/comments"
data = json.dumps({"body": body}).encode()
req = urllib.request.Request(url, data=data, method="POST")
req.add_header("Authorization", f"token {self.token}")
req.add_header("Content-Type", "application/json")
try:
with urllib.request.urlopen(req, timeout=15) as resp:
return resp.status == 201
except Exception:
return False
# ============================================================
# Pipeline Signal Collectors
# ============================================================
def collect_knowledge_signals(knowledge_dir: str) -> List[PipelineSignal]:
"""Analyze knowledge store for coverage gaps and staleness."""
signals = []
index_path = os.path.join(knowledge_dir, "index.json")
if not os.path.exists(index_path):
signals.append(PipelineSignal(
source="knowledge",
signal_type="missing_index",
weight=0.8,
detail="knowledge/index.json not found — no knowledge base exists"
))
return signals
try:
with open(index_path) as f:
data = json.load(f)
except (json.JSONDecodeError, IOError) as e:
signals.append(PipelineSignal(
source="knowledge",
signal_type="corrupt_index",
weight=0.9,
detail=f"knowledge/index.json is corrupt: {e}"
))
return signals
facts = data.get("facts", [])
total = len(facts)
if total == 0:
signals.append(PipelineSignal(
source="knowledge",
signal_type="empty_knowledge",
weight=0.7,
detail="Knowledge store has 0 facts — harvester not running or not finding sessions"
))
return signals
# Check staleness
stale_count = 0
missing_source = 0
for fact in facts:
status = fact.get("status", "unknown")
if status == "stale":
stale_count += 1
elif status in ("missing_source", "no_source"):
missing_source += 1
if stale_count > 0:
signals.append(PipelineSignal(
source="knowledge",
signal_type="stale_knowledge",
weight=min(1.0, stale_count / max(1, total)),
detail=f"{stale_count}/{total} facts are stale (source files changed)"
))
if missing_source > 0:
signals.append(PipelineSignal(
source="knowledge",
signal_type="missing_sources",
weight=min(1.0, missing_source / max(1, total)),
detail=f"{missing_source}/{total} facts have missing source files"
))
# Coverage by repo
repo_counts = Counter(f.get("repo", "unknown") for f in facts)
if len(repo_counts) < 3:
signals.append(PipelineSignal(
source="knowledge",
signal_type="low_coverage",
weight=0.5,
detail=f"Knowledge covers only {len(repo_counts)} repos — expand harvester scope",
affected_repos=list(repo_counts.keys())
))
return signals
def collect_staleness_signals(scripts_dir: str, knowledge_dir: str) -> List[PipelineSignal]:
"""Run staleness checker if available."""
signals = []
checker = os.path.join(scripts_dir, "knowledge_staleness_check.py")
index_path = os.path.join(knowledge_dir, "index.json")
if not os.path.exists(checker) or not os.path.exists(index_path):
return signals
try:
import subprocess
result = subprocess.run(
["python3", checker, "--index", index_path, "--json"],
capture_output=True, text=True, timeout=30
)
if result.returncode == 0:
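data = json.loads(result.stdout)
# With --json the checker prints a list of per-entry results, so count statuses directly
entries = data if isinstance(data, list) else []
stale = sum(1 for e in entries if e.get("status") == "stale")
total = len(entries)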
if stale > 0:
signals.append(PipelineSignal(
source="staleness",
signal_type="stale_knowledge",
weight=min(1.0, stale / max(1, total)),
detail=f"Staleness checker found {stale}/{total} stale entries"
))
except Exception:
pass
return signals
def collect_metrics_signals(metrics_dir: str) -> List[PipelineSignal]:
"""Analyze metrics directory for pipeline health."""
signals = []
if not os.path.isdir(metrics_dir):
return signals
files = os.listdir(metrics_dir)
if len(files) <= 1: # Only .gitkeep
signals.append(PipelineSignal(
source="metrics",
signal_type="no_metrics",
weight=0.6,
detail="Metrics directory is empty — measurer pipeline not producing output"
))
return signals
# ============================================================
# Priority Scoring Engine
# ============================================================
PRIORITY_LEVELS = ["P0", "P1", "P2", "P3", "none"]
PRIORITY_LABELS = {"p0": "P0", "p1": "P1", "p2": "P2", "p3": "P3",
"priority:critical": "P0", "priority:high": "P1",
"priority:medium": "P2", "priority:low": "P3"}
def extract_priority(labels: List[str]) -> Optional[str]:
"""Extract priority level from issue labels."""
for label in labels:
lower = label.lower().strip()
if lower in PRIORITY_LABELS:
return PRIORITY_LABELS[lower]
return None
def compute_issue_score(
issue: Dict,
repo: str,
signals: List[PipelineSignal],
now: datetime
) -> IssueScore:
"""Compute priority score for a single issue."""
labels = [l.get("name", "") if isinstance(l, dict) else l for l in issue.get("labels", [])]
title = issue.get("title", "")
issue_id = issue.get("number", 0)
current_priority = extract_priority(labels)
# Parse dates
created_str = issue.get("created_at", "")
try:
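created = datetime.fromisoformat(created_str.replace("Z", "+00:00"))
# Normalize both timestamps to timezone-aware UTC so the subtraction cannot mix naive and aware values
created = created if created.tzinfo else created.replace(tzinfo=timezone.utc)
now_utc = now if now.tzinfo else now.replace(tzinfo=timezone.utc)
age_days = (now_utc - created).days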
except (ValueError, AttributeError):
age_days = 0
comment_count = issue.get("comments", 0)
assignee = None
assignees = issue.get("assignees") or []
if assignees:
assignee = assignees[0].get("login") if isinstance(assignees[0], dict) else str(assignees[0])
# Base score
score = 0.0
reasons = []
# Age factor: older issues drift down unless actively discussed
if age_days > 90 and comment_count < 2:
score -= 15
reasons.append(f"Dormant: {age_days} days old with only {comment_count} comments")
elif age_days > 30:
score -= 5
reasons.append(f"Aging: {age_days} days old")
# Activity factor: recent discussion suggests urgency
if comment_count > 5:
score += 10
reasons.append(f"Active discussion: {comment_count} comments")
elif comment_count > 0:
score += 3
# Assignment factor: unassigned issues need triage
if not assignee:
score += 5
reasons.append("Unassigned — needs triage")
# Pipeline signal alignment
for signal in signals:
title_lower = title.lower()
if signal.signal_type == "stale_knowledge" and "stale" in title_lower:
score += signal.weight * 20
reasons.append(f"Matches signal: {signal.detail}")
elif signal.signal_type == "empty_knowledge" and ("harvester" in title_lower or "knowledge" in title_lower):
score += signal.weight * 25
reasons.append(f"Critical gap: {signal.detail}")
elif signal.signal_type == "no_metrics" and "measur" in title_lower:
score += signal.weight * 15
reasons.append(f"Pipeline gap: {signal.detail}")
elif signal.signal_type == "low_coverage" and any(r.lower() in title_lower for r in signal.affected_repos):
score += signal.weight * 10
reasons.append(f"Coverage gap: {signal.detail}")
# Keyword boosts
keyword_scores = {
"broken": 20, "bug": 15, "fix": 10, "error": 12, "fail": 15,
"security": 25, "auth": 20, "data loss": 30, "crash": 25,
"blocker": 20, "urgent": 15, "critical": 15,
"epic": 8, "feature": -3, "nice to have": -10, "someday": -15
}
title_lower = title.lower()
for keyword, boost in keyword_scores.items():
if keyword in title_lower:
score += boost
if boost > 0:
reasons.append(f"Keyword match: '{keyword}' (+{boost})")
# Label-based adjustments
for label in labels:
lower = label.lower()
if lower == "pipeline":
score += 5 # Pipeline issues are infrastructure
elif lower == "bug":
score += 12
elif lower == "enhancement":
score -= 2
elif lower == "documentation":
score -= 5
elif "epic" in lower:
score += 3
# Determine suggested priority
if score >= 40:
suggested = "P0"
elif score >= 25:
suggested = "P1"
elif score >= 10:
suggested = "P2"
elif score >= 0:
suggested = "P3"
else:
suggested = None # Consider closing or deprioritizing
return IssueScore(
issue_id=issue_id,
repo=repo,
title=title,
current_labels=labels,
current_priority=current_priority,
suggested_priority=suggested,
score=round(score, 1),
reasons=reasons if reasons else ["No strong signals"],
age_days=age_days,
comment_count=comment_count,
assignee=assignee
)
# ============================================================
# Report Generation
# ============================================================
def generate_report(
scores: List[IssueScore],
signals: List[PipelineSignal],
org: str,
repos_scanned: List[str]
) -> Dict[str, Any]:
"""Generate the full priority report."""
now = datetime.now(timezone.utc).isoformat()
# Categorize changes
upgrades = [s for s in scores if s.suggested_priority and s.current_priority and
PRIORITY_LEVELS.index(s.suggested_priority) < PRIORITY_LEVELS.index(s.current_priority)]
downgrades = [s for s in scores if s.suggested_priority and s.current_priority and
PRIORITY_LEVELS.index(s.suggested_priority) > PRIORITY_LEVELS.index(s.current_priority)]
new_assignments = [s for s in scores if s.suggested_priority and not s.current_priority]
no_change = [s for s in scores if s.suggested_priority == s.current_priority]
return {
"generated_at": now,
"org": org,
"repos_scanned": repos_scanned,
"total_issues": len(scores),
"signals": [asdict(s) for s in signals],
"summary": {
"suggested_upgrades": len(upgrades),
"suggested_downgrades": len(downgrades),
"suggested_new_priorities": len(new_assignments),
"unchanged": len(no_change)
},
"top_priority": [asdict(s) for s in sorted(scores, key=lambda x: x.score, reverse=True)[:10]],
"upgrades": [asdict(s) for s in upgrades],
"downgrades": [asdict(s) for s in downgrades],
"new_assignments": [asdict(s) for s in new_assignments],
"all_scores": [asdict(s) for s in sorted(scores, key=lambda x: x.score, reverse=True)]
}
def generate_markdown_report(report: Dict[str, Any]) -> str:
"""Generate human-readable markdown report."""
lines = []
lines.append("# Priority Rebalancer Report")
lines.append(f"**Generated:** {report['generated_at']}")
lines.append(f"**Org:** {report['org']}")
lines.append(f"**Repos scanned:** {', '.join(report['repos_scanned'])}")
lines.append(f"**Issues analyzed:** {report['total_issues']}")
lines.append("")
# Signals
if report["signals"]:
lines.append("## Pipeline Signals")
for sig in report["signals"]:
weight_bar = "#" * int(sig["weight"] * 10) + "-" * (10 - int(sig["weight"] * 10))
lines.append(f"- [{weight_bar}] **{sig['source']}/{sig['signal_type']}** — {sig['detail']}")
lines.append("")
# Summary
s = report["summary"]
lines.append("## Summary")
lines.append(f"- Suggested upgrades: **{s['suggested_upgrades']}**")
lines.append(f"- Suggested downgrades: **{s['suggested_downgrades']}**")
lines.append(f"- New priority assignments: **{s['suggested_new_priorities']}**")
lines.append(f"- Unchanged: **{s['unchanged']}**")
lines.append("")
# Top 10
lines.append("## Top 10 by Score")
for i, item in enumerate(report["top_priority"][:10], 1):
cur = item["current_priority"] or "none"
sug = item["suggested_priority"] or "none"
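# PRIORITY_LEVELS.index() raises ValueError for "none", so compare ranks only when both priorities are known
rank = {p: i for i, p in enumerate(PRIORITY_LEVELS)}
arrow = "->" if cur not in rank or sug not in rank else "^" if rank[sug] < rank[cur] else "v" if rank[sug] > rank[cur] else "="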
lines.append(f"{i}. **[{item['repo']}#{item['issue_id']}]** {item['title']}")
lines.append(f" Score: {item['score']} | Current: {cur} {arrow} Suggested: {sug}")
lines.append(f" Reasons: {'; '.join(item['reasons'][:3])}")
lines.append("")
# Upgrades
if report["upgrades"]:
lines.append("## Suggested Upgrades")
for item in report["upgrades"]:
lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
lines.append(f" {item['current_priority']} -> {item['suggested_priority']} (score: {item['score']})")
lines.append(f" {'; '.join(item['reasons'][:2])}")
lines.append("")
# Downgrades
if report["downgrades"]:
lines.append("## Suggested Downgrades")
for item in report["downgrades"]:
lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
lines.append(f" {item['current_priority']} -> {item['suggested_priority']} (score: {item['score']})")
lines.append(f" {'; '.join(item['reasons'][:2])}")
lines.append("")
# New assignments
if report["new_assignments"]:
lines.append("## New Priority Suggestions (currently unlabelled)")
for item in report["new_assignments"][:20]:
lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
lines.append(f" Suggested: {item['suggested_priority']} (score: {item['score']})")
lines.append(f" {'; '.join(item['reasons'][:2])}")
lines.append("")
return "\n".join(lines)
# ============================================================
# Main
# ============================================================
def main():
parser = argparse.ArgumentParser(description="Priority Rebalancer — re-score issues based on pipeline data")
parser.add_argument("--org", default="Timmy_Foundation", help="Gitea org name")
parser.add_argument("--repo", help="Single repo to analyze (default: all)")
parser.add_argument("--base-url", default="https://forge.alexanderwhitestone.com", help="Gitea base URL")
parser.add_argument("--knowledge-dir", default=None, help="Path to knowledge directory")
parser.add_argument("--metrics-dir", default=None, help="Path to metrics directory")
parser.add_argument("--scripts-dir", default=None, help="Path to scripts directory")
parser.add_argument("--output-dir", default=None, help="Path to output directory")
parser.add_argument("--dry-run", action="store_true", help="Show what would change without applying")
parser.add_argument("--apply", action="store_true", help="Apply priority changes via API")
parser.add_argument("--json", action="store_true", help="Output JSON instead of markdown")
args = parser.parse_args()
# Resolve paths relative to script location
script_dir = Path(__file__).parent
repo_root = script_dir.parent
knowledge_dir = args.knowledge_dir or str(repo_root / "knowledge")
metrics_dir = args.metrics_dir or str(repo_root / "metrics")
scripts_dir = args.scripts_dir or str(repo_root / "scripts")
output_dir = args.output_dir or str(repo_root / "metrics")
# Get token
token = os.environ.get("GITEA_TOKEN")
if not token:
token_path = os.path.expanduser("~/.config/gitea/token")
if os.path.exists(token_path):
with open(token_path) as f:
token = f.read().strip()
if not token:
print("Error: No Gitea token found. Set GITEA_TOKEN or create ~/.config/gitea/token", file=sys.stderr)
sys.exit(1)
client = GiteaClient(args.base_url, token)
now = datetime.utcnow()
# Collect pipeline signals
print("Collecting pipeline signals...", file=sys.stderr)
signals = []
signals.extend(collect_knowledge_signals(knowledge_dir))
signals.extend(collect_staleness_signals(scripts_dir, knowledge_dir))
signals.extend(collect_metrics_signals(metrics_dir))
print(f" Found {len(signals)} signals", file=sys.stderr)
# Get repos
if args.repo:
repos = [{"name": args.repo}]
else:
repos = client.get_org_repos(args.org)
repo_names = [r["name"] for r in repos]
print(f"Scanning {len(repo_names)} repos: {', '.join(repo_names[:5])}{'...' if len(repo_names) > 5 else ''}", file=sys.stderr)
# Score all issues
all_scores = []
for repo in repos:
repo_name = repo["name"]
issues = client.get_issues(args.org, repo_name)
print(f" {repo_name}: {len(issues)} open issues", file=sys.stderr)
for issue in issues:
if issue.get("pull_request"):
continue
score = compute_issue_score(issue, repo_name, signals, now)
all_scores.append(score)
# Generate report
report = generate_report(all_scores, signals, args.org, repo_names)
# Output
os.makedirs(output_dir, exist_ok=True)
if args.json:
print(json.dumps(report, indent=2))
else:
md = generate_markdown_report(report)
print(md)
# Write files
report_path = os.path.join(output_dir, "priority_report.json")
with open(report_path, "w") as f:
json.dump(report, f, indent=2)
print(f"\nFull report: {report_path}", file=sys.stderr)
md_path = os.path.join(output_dir, "priority_suggestions.md")
with open(md_path, "w") as f:
f.write(generate_markdown_report(report))
print(f"Suggestions: {md_path}", file=sys.stderr)
# Apply changes if requested
if args.apply:
print("\nApplying priority changes...", file=sys.stderr)
applied = 0
# Get label IDs for priority labels
priority_label_map = {}
for repo_name in repo_names:
labels = client.get_repo_labels(args.org, repo_name)
for label in labels:
name = label.get("name", "").lower()
if name in ("p0", "p1", "p2", "p3"):
priority_label_map[(repo_name, name)] = label["id"]
for score in all_scores:
if score.suggested_priority and score.suggested_priority != score.current_priority:
sug_lower = score.suggested_priority.lower()
label_id = priority_label_map.get((score.repo, sug_lower))
if label_id:
ok = client.add_label_to_issue(args.org, score.repo, score.issue_id, [label_id])
if ok:
applied += 1
print(f" Applied: [{score.repo}#{score.issue_id}] → {score.suggested_priority}", file=sys.stderr)
# Add comment explaining the change
comment = f"**Priority Rebalancer** suggested: **{score.suggested_priority}** (was: {score.current_priority or 'none'})\n\n"
comment += f"Score: {score.score}\n"
comment += f"Reasons:\n"
for r in score.reasons[:5]:
comment += f"- {r}\n"
client.add_comment(args.org, score.repo, score.issue_id, comment)
print(f"Applied {applied} priority changes", file=sys.stderr)
elif args.dry_run:
print(f"\nDry run — {report['summary']['suggested_upgrades'] + report['summary']['suggested_downgrades'] + report['summary']['suggested_new_priorities']} changes would be applied", file=sys.stderr)
if __name__ == "__main__":
main()
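
The threshold mapping and the upgrade/downgrade bucketing used by the rebalancer can be sketched in isolation. This is a minimal illustration, not the script's own code: `suggest_priority` and `classify_change` are hypothetical helper names, and `PRIORITY_LEVELS` is assumed to be ordered from most to least urgent because its definition is not shown in this part of the file.

```python
from typing import Optional

# Assumption: mirrors the script's PRIORITY_LEVELS, ordered most urgent first.
PRIORITY_LEVELS = ["P0", "P1", "P2", "P3"]


def suggest_priority(score: float) -> Optional[str]:
    """Map a numeric score to a priority label using the thresholds 40 / 25 / 10 / 0."""
    if score >= 40:
        return "P0"
    if score >= 25:
        return "P1"
    if score >= 10:
        return "P2"
    if score >= 0:
        return "P3"
    return None  # negative score: candidate for closing or deprioritizing


def classify_change(current: Optional[str], suggested: Optional[str]) -> str:
    """Bucket an issue the way generate_report does, plus an explicit leftover bucket."""
    if suggested == current:
        return "unchanged"  # also covers the case where both are None
    if suggested and current:
        # A lower index means a more urgent priority.
        if PRIORITY_LEVELS.index(suggested) < PRIORITY_LEVELS.index(current):
            return "upgrade"
        return "downgrade"
    if suggested and not current:
        return "new"
    # A current label with no suggestion lands in none of the report's lists.
    return "uncategorized"
```

One consequence worth noting: an issue that already carries a priority label but scores below zero has no suggested priority, so it appears in none of the upgrade, downgrade, new, or unchanged lists.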


@@ -1,290 +0,0 @@
#!/usr/bin/env python3
"""
Finds refactoring opportunities in codebases
Engine ID: 10.4
Usage:
python3 scripts/refactoring_opportunity_finder.py --output proposals/refactoring_opportunity_finder.json
python3 scripts/refactoring_opportunity_finder.py --output proposals/refactoring_opportunity_finder.json --dry-run
"""
import argparse
import ast
import json
import os
import sys
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple
# ── Data Classes ────────────────────────────────────────────────────────
@dataclass
class FileMetrics:
"""Metrics for a single source file."""
path: str
lines: int = 0
complexity: float = 0.0
max_complexity: int = 0
functions: int = 0
classes: int = 0
churn_30d: int = 0
churn_90d: int = 0
test_coverage: Optional[float] = None
refactoring_score: float = 0.0
# ── Complexity Analysis ─────────────────────────────────────────────────
class ComplexityVisitor(ast.NodeVisitor):
"""AST visitor that computes cyclomatic complexity per function."""
def __init__(self):
self.complexities = []
self.function_count = 0
self.class_count = 0
self._current_complexity = 0
self._in_function = False
def visit_FunctionDef(self, node):
self.function_count += 1
old_complexity = self._current_complexity
old_in_function = self._in_function
self._current_complexity = 1 # Base complexity
self._in_function = True
self.generic_visit(node)
self.complexities.append(self._current_complexity)
self._current_complexity = old_complexity
self._in_function = old_in_function
visit_AsyncFunctionDef = visit_FunctionDef
def visit_ClassDef(self, node):
self.class_count += 1
self.generic_visit(node)
def visit_If(self, node):
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
def visit_For(self, node):
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
visit_AsyncFor = visit_For
def visit_While(self, node):
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
def visit_ExceptHandler(self, node):
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
def visit_With(self, node):
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
visit_AsyncWith = visit_With
def visit_Assert(self, node):
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
def visit_BoolOp(self, node):
# Each 'and'/'or' adds a branch
if self._in_function:
self._current_complexity += len(node.values) - 1
self.generic_visit(node)
def visit_IfExp(self, node):
# Ternary expression
if self._in_function:
self._current_complexity += 1
self.generic_visit(node)
def compute_file_complexity(filepath: str) -> Tuple[float, int, int, int, int]:
"""
Compute cyclomatic complexity for a Python file.
Returns:
(avg_complexity, max_complexity, function_count, class_count, line_count)
"""
try:
with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
source = f.read()
except (IOError, OSError):
return 0.0, 0, 0, 0, 0
try:
tree = ast.parse(source, filename=filepath)
except SyntaxError:
return 0.0, 0, 0, 0, 0
visitor = ComplexityVisitor()
visitor.visit(tree)
line_count = len(source.splitlines())
if not visitor.complexities:
# No functions, but might have classes
return 0.0, 0, visitor.function_count, visitor.class_count, line_count
avg = sum(visitor.complexities) / len(visitor.complexities)
max_c = max(visitor.complexities)
return avg, max_c, visitor.function_count, visitor.class_count, line_count
# ── Refactoring Score ───────────────────────────────────────────────────
def calculate_refactoring_score(metrics: FileMetrics) -> float:
"""
Calculate a refactoring priority score (0-100) based on file metrics.
Higher score = higher priority for refactoring.
Components:
- Complexity (0-30 points): higher avg/max complexity = higher score
- Size (0-20 points): larger files = higher score
- Churn (0-30 points): more changes recently = higher score
- Coverage (0-20 points): lower test coverage = higher score
"""
score = 0.0
# Complexity component (0-30)
# e.g. avg=10 together with max=20 reaches the 30-point cap
complexity_score = min(30.0, (metrics.complexity * 2) + (metrics.max_complexity * 0.5))
score += max(0.0, complexity_score)
# Size component (0-20)
# 500+ lines → 20 points
size_score = min(20.0, metrics.lines / 25.0)
score += max(0.0, size_score)
# Churn component (0-30)
# Weighted: recent churn (30d) counts more than older (90d)
churn_score = min(30.0, (metrics.churn_30d * 2) + (metrics.churn_90d * 0.5))
score += max(0.0, churn_score)
# Coverage component (0-20)
# Lower coverage → higher score
if metrics.test_coverage is not None:
# coverage=0 → 20 points, coverage=1 → 0 points
coverage_score = (1.0 - metrics.test_coverage) * 20.0
else:
# No data → assume medium risk (10 points)
coverage_score = 10.0
score += max(0.0, coverage_score)
return min(100.0, max(0.0, score))
# ── Proposal Generation ─────────────────────────────────────────────────
def scan_directory(directory: str, extensions: tuple = ('.py',)) -> list:
"""Scan directory for source files."""
files = []
for root, dirs, filenames in os.walk(directory):
# Skip hidden dirs and common non-source dirs
dirs[:] = [d for d in dirs if not d.startswith('.') and d not in (
'__pycache__', 'node_modules', 'venv', '.venv', 'env',
'build', 'dist', '.git', '.tox'
)]
for fname in filenames:
if any(fname.endswith(ext) for ext in extensions):
files.append(os.path.join(root, fname))
return files
def generate_proposals(directory: str = '.', min_score: float = 30.0) -> list:
"""Generate refactoring proposals by analyzing source files."""
proposals = []
files = scan_directory(directory)
for filepath in files:
avg, max_c, funcs, classes, lines = compute_file_complexity(filepath)
if funcs == 0 and classes == 0:
continue
metrics = FileMetrics(
path=filepath,
lines=lines,
complexity=avg,
max_complexity=max_c,
functions=funcs,
classes=classes
)
score = calculate_refactoring_score(metrics)
metrics.refactoring_score = score
if score >= min_score:
reasons = []
if max_c > 10:
reasons.append(f"high max complexity ({max_c})")
if avg > 5:
reasons.append(f"high avg complexity ({avg:.1f})")
if lines > 300:
reasons.append(f"large file ({lines} lines)")
proposals.append({
"title": f"Refactor {os.path.basename(filepath)} (score: {score:.0f})",
"description": f"{filepath}: {', '.join(reasons) if reasons else 'general improvement candidate'}",
"impact": min(10, int(score / 10)),
"effort": min(10, max(1, int(max_c / 3))),
"category": "refactoring",
"source_engine": "10.4",
"timestamp": datetime.now(timezone.utc).isoformat(),
"metrics": {
"path": filepath,
"score": round(score, 2),
"avg_complexity": round(avg, 2),
"max_complexity": max_c,
"lines": lines,
"functions": funcs,
"classes": classes
}
})
# Sort by score descending
proposals.sort(key=lambda p: p.get('metrics', {}).get('score', 0), reverse=True)
return proposals
# ── CLI ─────────────────────────────────────────────────────────────────
def main():
parser = argparse.ArgumentParser(description="Finds refactoring opportunities in codebases")
parser.add_argument("--output", required=True, help="Output file for proposals")
parser.add_argument("--dry-run", action="store_true", help="Don't write output file")
parser.add_argument("--directory", default=".", help="Directory to scan")
parser.add_argument("--min-score", type=float, default=30.0, help="Minimum score threshold")
args = parser.parse_args()
proposals = generate_proposals(args.directory, args.min_score)
if not args.dry_run:
os.makedirs(os.path.dirname(args.output) or '.', exist_ok=True)
with open(args.output, "w") as f:
json.dump({"proposals": proposals}, f, indent=2)
print(f"Generated {len(proposals)} proposals -> {args.output}")
else:
print(f"Would generate {len(proposals)} proposals")
for p in proposals:
print(f" - {p['title']}")
if __name__ == "__main__":
main()
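
A worked example of the scoring formula in `calculate_refactoring_score` may help when choosing `--min-score`. The sketch below restates the four components as a standalone function; `refactoring_score` is a hypothetical name used only for illustration, and the input numbers are made up.

```python
from typing import Optional


def refactoring_score(
    avg_complexity: float,
    max_complexity: int,
    lines: int,
    churn_30d: int = 0,
    churn_90d: int = 0,
    test_coverage: Optional[float] = None,
) -> float:
    """Restate the 0-100 refactoring score as a sum of four capped components."""
    # Complexity: capped at 30 points.
    complexity = max(0.0, min(30.0, avg_complexity * 2 + max_complexity * 0.5))
    # Size: one point per 25 lines, capped at 20 points.
    size = max(0.0, min(20.0, lines / 25.0))
    # Churn: recent changes weigh four times as much as older ones, capped at 30.
    churn = max(0.0, min(30.0, churn_30d * 2 + churn_90d * 0.5))
    # Coverage: missing data counts as medium risk (10 points).
    if test_coverage is None:
        coverage = 10.0
    else:
        coverage = max(0.0, (1.0 - test_coverage) * 20.0)
    return min(100.0, max(0.0, complexity + size + churn + coverage))


# A 400-line file with average complexity 4.0 and a worst function of 12:
# complexity 14 + size 16 + churn 0 + coverage 10 (no data).
print(refactoring_score(4.0, 12, 400))

# The same file with churn and coverage data supplied:
# complexity 14 + size 16 + churn 16 + coverage 15.
print(refactoring_score(4.0, 12, 400, churn_30d=5, churn_90d=12, test_coverage=0.25))
```

As written, `generate_proposals` fills in only the complexity and size fields of `FileMetrics`, so churn stays at 0 and coverage stays unknown. Under that reading, scores produced by the CLI top out at 30 + 20 + 10 = 60, which is worth keeping in mind when setting the threshold.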


@@ -1,353 +0,0 @@
#!/usr/bin/env python3
"""
sampler.py — Score and rank sessions by harvest value.
With 20k+ sessions on disk, we can't harvest all at once. This script
scores each session by how likely it is to contain valuable knowledge,
so the harvester processes the best ones first.
Scoring strategy:
- Recency: last 7d=3pts, last 30d=2pts, older=1pt
- Length: >50 messages=3pts, >20=2pts, otherwise=1pt
- Repo uniqueness: first session mentioning a repo=5pts, already-seen repo=1pt, no repo detected=0pts
- Outcome: failure=3pts (most to learn), success=2pts, unknown=1pt
- Tool calls: >10 tool invocations=2pts (complex sessions)
Usage:
python3 sampler.py --count 100 # Top 100 sessions
python3 sampler.py --repo the-nexus --count 20 # Top 20 for a repo
python3 sampler.py --since 2026-04-01 # All sessions since date
python3 sampler.py --count 50 --min-score 8 # Only high-value sessions
python3 sampler.py --count 100 --output sample.json # Save to file
"""
import argparse
import json
import os
import sys
import time
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional
# --- Fast session scanning (no full parse) ---
def scan_session_fast(path: str) -> dict:
"""Extract scoring metadata from a session without parsing the full JSONL.
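Reads only the first line and the tail of the file: the last 8 KB, of which
the last 30 non-empty lines are parsed. Files of 8 KB or less are read in full.
For larger files the message count is estimated from the file size, and tool
calls are counted in the parsed tail only. This processes 20k sessions in
seconds instead of minutes.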
"""
meta = {
'path': path,
'message_count': 0,
'has_tool_calls': False,
'tool_call_count': 0,
'first_timestamp': '',
'last_timestamp': '',
'is_failure': False,
'repos_mentioned': set(),
'first_role': '',
'last_content_preview': '',
}
try:
file_size = os.path.getsize(path)
if file_size == 0:
return meta
with open(path, 'r', encoding='utf-8', errors='replace') as f:
# Read first line for timestamp + role
first_line = f.readline().strip()
if first_line:
try:
first_msg = json.loads(first_line)
meta['first_timestamp'] = first_msg.get('timestamp', '')
meta['first_role'] = first_msg.get('role', '')
except json.JSONDecodeError:
pass
# Fast line count + collect tail lines
# For the tail, seek to near end of file
tail_lines = []
line_count = 1 # already read first
if file_size > 8192:
# Seek to last 8KB for tail sampling
f.seek(max(0, file_size - 8192))
f.readline() # skip partial line
for line in f:
line = line.strip()
if line:
tail_lines.append(line)
line_count += 1
# We lost the exact count for big files — estimate from file size
# Average JSONL line is ~500 bytes
if line_count < 100:
line_count = max(line_count, file_size // 500)
else:
# Small file — read all
for line in f:
line = line.strip()
if line:
tail_lines.append(line)
line_count += 1
meta['message_count'] = line_count
# Parse tail lines for outcome, tool calls, repos
for line in tail_lines[-30:]: # last 30 non-empty lines
try:
msg = json.loads(line)
# Track last timestamp
ts = msg.get('timestamp', '')
if ts:
meta['last_timestamp'] = ts
# Count tool calls
if msg.get('tool_calls'):
meta['has_tool_calls'] = True
meta['tool_call_count'] += len(msg['tool_calls'])
# Detect failure signals in content
content = ''
if isinstance(msg.get('content'), str):
content = msg['content'].lower()
elif isinstance(msg.get('content'), list):
for part in msg['content']:
if isinstance(part, dict) and part.get('type') == 'text':
content += part.get('text', '').lower()
if content:
meta['last_content_preview'] = content[:200]
failure_signals = ['error', 'failed', 'cannot', 'unable',
'exception', 'traceback', 'rejected', 'denied']
if any(sig in content for sig in failure_signals):
meta['is_failure'] = True
# Extract repo references from tool call arguments
if msg.get('tool_calls'):
for tc in msg['tool_calls']:
args = tc.get('function', {}).get('arguments', '')
if isinstance(args, str):
# Look for repo patterns
for pattern in ['Timmy_Foundation/', 'Rockachopa/', 'compounding-intelligence', 'the-nexus', 'timmy-home', 'hermes-agent', 'the-beacon', 'the-door']:
if pattern in args:
repo = pattern.rstrip('/')
meta['repos_mentioned'].add(repo)
except json.JSONDecodeError:
continue
except (IOError, OSError):
pass
meta['repos_mentioned'] = list(meta['repos_mentioned'])
return meta
# --- Filename timestamp parsing ---
def parse_session_timestamp(filename: str) -> Optional[datetime]:
"""Parse timestamp from session filename.
Common formats:
session_20260413_123456_hash.jsonl
20260413_123456_hash.jsonl
"""
stem = Path(filename).stem
parts = stem.split('_')
# Try session_YYYYMMDD_HHMMSS format
for i, part in enumerate(parts):
if len(part) == 8 and part.isdigit():
date_part = part
time_part = parts[i + 1] if i + 1 < len(parts) and len(parts[i + 1]) == 6 else '000000'
try:
return datetime.strptime(f"{date_part}_{time_part}", '%Y%m%d_%H%M%S').replace(tzinfo=timezone.utc)
except ValueError:
continue
# Fallback: use file modification time
return None
# --- Scoring ---
def score_session(meta: dict, now: datetime, seen_repos: set) -> tuple[int, dict]:
"""Score a session for harvest value. Returns (score, breakdown)."""
score = 0
breakdown = {}
# 1. Recency
ts = parse_session_timestamp(os.path.basename(meta['path']))
if ts is None:
# Fallback to mtime
try:
ts = datetime.fromtimestamp(os.path.getmtime(meta['path']), tz=timezone.utc)
except OSError:
ts = now - timedelta(days=365)
age_days = (now - ts).days
if age_days <= 7:
recency = 3
elif age_days <= 30:
recency = 2
else:
recency = 1
score += recency
breakdown['recency'] = recency
# 2. Length
count = meta['message_count']
if count > 50:
length = 3
elif count > 20:
length = 2
else:
length = 1
score += length
breakdown['length'] = length
# 3. Repo uniqueness (first session mentioning a repo gets bonus)
repo_score = 0
for repo in meta.get('repos_mentioned', []):
if repo not in seen_repos:
seen_repos.add(repo)
repo_score = max(repo_score, 5)
else:
repo_score = max(repo_score, 1)
score += repo_score
breakdown['repo_unique'] = repo_score
# 4. Outcome
if meta.get('is_failure'):
outcome = 3
elif meta.get('last_content_preview', '').strip():
outcome = 2 # has some content = likely completed
else:
outcome = 1
score += outcome
breakdown['outcome'] = outcome
# 5. Tool calls
if meta.get('tool_call_count', 0) > 10:
tool = 2
else:
tool = 0
score += tool
breakdown['tool_calls'] = tool
return score, breakdown
# --- Main ---
def main():
parser = argparse.ArgumentParser(description="Score and rank sessions for harvesting")
parser.add_argument('--sessions-dir', default=os.path.expanduser('~/.hermes/sessions'),
help='Directory containing session files')
parser.add_argument('--count', type=int, default=100, help='Number of top sessions to return')
parser.add_argument('--repo', default='', help='Filter to sessions mentioning this repo')
parser.add_argument('--since', default='', help='Only score sessions after this date (YYYY-MM-DD)')
parser.add_argument('--min-score', type=int, default=0, help='Minimum score threshold')
parser.add_argument('--output', default='', help='Output file (JSON). Default: stdout')
parser.add_argument('--format', choices=['json', 'paths', 'table'], default='table',
help='Output format: json (full), paths (one per line), table (human)')
parser.add_argument('--top-percent', type=float, default=0, help='Return top N%% instead of --count')
args = parser.parse_args()
sessions_dir = Path(args.sessions_dir)
if not sessions_dir.is_dir():
print(f"ERROR: Sessions directory not found: {sessions_dir}", file=sys.stderr)
sys.exit(1)
# Find all JSONL files
print(f"Scanning {sessions_dir}...", file=sys.stderr)
t0 = time.time()
session_files = list(sessions_dir.glob('*.jsonl'))
total = len(session_files)
print(f"Found {total} session files", file=sys.stderr)
# Parse since date
since_dt = None
if args.since:
since_dt = datetime.strptime(args.since, '%Y-%m-%d').replace(tzinfo=timezone.utc)
# Score all sessions
now = datetime.now(timezone.utc)
seen_repos = set() # Track repos for uniqueness scoring
scored = []
for i, sf in enumerate(session_files):
# Date filter (fast path: check filename first)
if since_dt:
ts = parse_session_timestamp(sf.name)
if ts and ts < since_dt:
continue
meta = scan_session_fast(str(sf))
# Repo filter
if args.repo:
repos = meta.get('repos_mentioned', [])
if args.repo.lower() not in [r.lower() for r in repos]:
# Also check filename
if args.repo.lower() not in sf.name.lower():
continue
score, breakdown = score_session(meta, now, seen_repos)
if score >= args.min_score:
scored.append({
'path': str(sf),
'filename': sf.name,
'score': score,
'breakdown': breakdown,
'message_count': meta['message_count'],
'repos': meta['repos_mentioned'],
'is_failure': meta['is_failure'],
})
if (i + 1) % 5000 == 0:
elapsed = time.time() - t0
print(f" Scanned {i + 1}/{total} ({elapsed:.1f}s)", file=sys.stderr)
elapsed = time.time() - t0
print(f"Scored {len(scored)} sessions in {elapsed:.1f}s", file=sys.stderr)
# Sort by score descending
scored.sort(key=lambda x: x['score'], reverse=True)
# Apply count or percent
if args.top_percent > 0:
count = max(1, int(len(scored) * args.top_percent / 100))
else:
count = args.count
scored = scored[:count]
# Output
if args.output:
with open(args.output, 'w', encoding='utf-8') as f:
json.dump(scored, f, indent=2)
print(f"Wrote {len(scored)} sessions to {args.output}", file=sys.stderr)
elif args.format == 'json':
json.dump(scored, sys.stdout, indent=2)
elif args.format == 'paths':
for s in scored:
print(s['path'])
else: # table
print(f"{'SCORE':>5} {'MSGS':>5} {'REPOS':<25} {'FILE'}")
print(f"{'-'*5} {'-'*5} {'-'*25} {'-'*40}")
for s in scored:
repos = ', '.join(s['repos'][:2]) if s['repos'] else '-'
fail = ' FAIL' if s['is_failure'] else ''
print(f"{s['score']:>5} {s['message_count']:>5} {repos:<25} {s['filename'][:40]}{fail}")
if __name__ == '__main__':
main()
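
`scan_session_fast` seeks to an arbitrary byte offset on a file opened in text mode. The Python documentation describes text-mode seeks to offsets not obtained from `tell()` as undefined behaviour, even though it often works in practice. A minimal sketch of the same tail-sampling idea in binary mode follows; `read_tail_lines` is a hypothetical helper, not part of the script.

```python
import os


def read_tail_lines(path: str, max_bytes: int = 8192) -> list[str]:
    """Return the complete, non-empty lines found in the last max_bytes of a file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:  # binary mode: any byte offset is a valid seek target
        if size > max_bytes:
            f.seek(size - max_bytes)
            f.readline()  # discard the line we landed in; it is probably partial
        data = f.read()
    # Decode after slicing, so a multi-byte character cut by the seek cannot raise.
    text = data.decode("utf-8", errors="replace")
    return [line.strip() for line in text.split("\n") if line.strip()]
```

Because the first line after the seek is always discarded, every returned line is a whole JSONL record, at the cost of occasionally dropping one complete line when the offset happens to fall exactly on a line boundary.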


@@ -1,276 +0,0 @@
#!/usr/bin/env python3
"""
session_metadata.py - Extract structured metadata from Hermes session transcripts.
Works alongside session_reader.py to provide higher-level session analysis.
"""
import json
import re
import sys
from dataclasses import dataclass, asdict
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional, Any
# Import from session_reader (the canonical reader)
from session_reader import read_session
@dataclass
class SessionSummary:
"""Structured summary of a Hermes session transcript."""
session_id: str
model: str
repo: str
outcome: str
message_count: int
tool_calls: int
duration_estimate: str
key_actions: List[str]
errors_encountered: List[str]
start_time: Optional[str] = None
end_time: Optional[str] = None
total_tokens_estimate: int = 0
user_messages: int = 0
assistant_messages: int = 0
tool_outputs: int = 0
def extract_session_metadata(file_path: str) -> SessionSummary:
"""
Extract structured metadata from a Hermes session JSONL transcript.
Uses session_reader.read_session() for file reading.
"""
session_id = Path(file_path).stem
messages = []
model = "unknown"
repo = "unknown"
tool_calls_count = 0
key_actions = []
errors = []
start_time = None
end_time = None
total_tokens = 0
# Common repo patterns to look for
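# In a raw string, \s is the whitespace class; \\s would instead exclude a literal backslash and the letter "s"
repo_patterns = [
r"(?:the-nexus|compounding-intelligence|timmy-config|hermes-agent)",
r"(?:forge\.alexanderwhitestone\.com/([^/]+/[^/\s]+))",
r"(?:github\.com/([^/]+/[^/\s]+))",
r"(?:Timmy_Foundation/([^/\s]+))",
]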
try:
# Use the canonical reader from session_reader.py
messages = read_session(file_path)
except FileNotFoundError:
return SessionSummary(
session_id=session_id,
model="unknown",
repo="unknown",
outcome="failure",
message_count=0,
tool_calls=0,
duration_estimate="0m",
key_actions=[],
errors_encountered=[f"File not found: {file_path}"]
)
# Process messages for metadata
for entry in messages:
# Extract model from assistant messages
if entry.get("role") == "assistant" and entry.get("model"):
model = entry["model"]
# Extract timestamps
if entry.get("timestamp"):
ts = entry["timestamp"]
if start_time is None:
start_time = ts
end_time = ts
# Count tool calls
if entry.get("tool_calls"):
tool_calls_count += len(entry["tool_calls"])
for tc in entry["tool_calls"]:
if tc.get("function", {}).get("name"):
action = f"{tc['function']['name']}"
if action not in key_actions:
key_actions.append(action)
# Estimate tokens from content length
content = entry.get("content", "")
if isinstance(content, str):
total_tokens += len(content.split())
elif isinstance(content, list):
for item in content:
if isinstance(item, dict) and "text" in item:
total_tokens += len(item["text"].split())
# Look for repo mentions in content
if entry.get("content"):
content_str = str(entry["content"])
for pattern in repo_patterns:
match = re.search(pattern, content_str, re.IGNORECASE)
if match:
if match.groups():
repo = match.group(1)
else:
repo = match.group(0)
break
# Look for error messages
if entry.get("role") == "tool" and entry.get("is_error"):
error_msg = entry.get("content", "Unknown error")
if isinstance(error_msg, str):
# Truncate long error messages to 200 characters instead of dropping them
errors.append(error_msg[:200])
# Count message types
user_messages = sum(1 for m in messages if m.get("role") == "user")
assistant_messages = sum(1 for m in messages if m.get("role") == "assistant")
tool_outputs = sum(1 for m in messages if m.get("role") == "tool")
# Calculate duration estimate
duration_estimate = "unknown"
if start_time and end_time:
try:
# Try to parse timestamps
start_dt = None
end_dt = None
# Handle various timestamp formats
for fmt in ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S"]:
try:
if start_dt is None:
start_dt = datetime.strptime(start_time, fmt)
if end_dt is None:
end_dt = datetime.strptime(end_time, fmt)
except ValueError:
continue
if start_dt and end_dt:
duration = end_dt - start_dt
minutes = duration.total_seconds() / 60
duration_estimate = f"{minutes:.0f}m"
except Exception:
pass
# Classify outcome
outcome = "unknown"
if errors:
# Check if any errors are fatal
fatal_errors = any("405" in e or "permission" in e.lower() or "authentication" in e.lower()
for e in errors)
if fatal_errors:
outcome = "failure"
else:
outcome = "partial"
elif messages:
# Check last message for success indicators
last_msg = messages[-1]
if last_msg.get("role") == "assistant":
content = last_msg.get("content", "")
if isinstance(content, str):
success_indicators = ["done", "completed", "success", "merged", "pushed"]
if any(indicator in content.lower() for indicator in success_indicators):
outcome = "success"
else:
outcome = "unknown"
# Deduplicate key actions (keep unique, limit to 10)
unique_actions = []
for action in key_actions:
if action not in unique_actions:
unique_actions.append(action)
if len(unique_actions) >= 10:
break
# Deduplicate errors (keep unique, limit to 5)
unique_errors = []
for error in errors:
if error not in unique_errors:
unique_errors.append(error)
if len(unique_errors) >= 5:
break
return SessionSummary(
session_id=session_id,
model=model,
repo=repo,
outcome=outcome,
message_count=len(messages),
tool_calls=tool_calls_count,
duration_estimate=duration_estimate,
key_actions=unique_actions,
errors_encountered=unique_errors,
start_time=start_time,
end_time=end_time,
total_tokens_estimate=total_tokens,
user_messages=user_messages,
assistant_messages=assistant_messages,
tool_outputs=tool_outputs
)
def process_session_directory(directory_path: str, output_file: Optional[str] = None) -> List[SessionSummary]:
"""
Process all JSONL files in a directory.
"""
directory = Path(directory_path)
if not directory.exists():
print(f"Error: Directory {directory_path} does not exist", file=sys.stderr)
return []
jsonl_files = list(directory.glob("*.jsonl"))
if not jsonl_files:
print(f"Warning: No JSONL files found in {directory_path}", file=sys.stderr)
return []
summaries = []
for jsonl_file in sorted(jsonl_files):
print(f"Processing {jsonl_file.name}...", file=sys.stderr)
summary = extract_session_metadata(str(jsonl_file))
summaries.append(summary)
if output_file:
with open(output_file, 'w', encoding='utf-8') as f:
json.dump([asdict(s) for s in summaries], f, indent=2)
print(f"Wrote {len(summaries)} summaries to {output_file}", file=sys.stderr)
return summaries
def main():
"""CLI entry point."""
import argparse
parser = argparse.ArgumentParser(description="Extract metadata from Hermes session JSONL transcripts")
parser.add_argument("path", help="Path to JSONL file or directory of session files")
parser.add_argument("-o", "--output", help="Output JSON file (default: stdout)")
parser.add_argument("-v", "--verbose", action="store_true", help="Verbose output")
args = parser.parse_args()
path = Path(args.path)
if path.is_file():
summary = extract_session_metadata(str(path))
if args.output:
with open(args.output, 'w', encoding='utf-8') as f:
json.dump(asdict(summary), f, indent=2)
print(f"Wrote summary to {args.output}", file=sys.stderr)
else:
print(json.dumps(asdict(summary), indent=2))
elif path.is_dir():
summaries = process_session_directory(str(path), args.output)
if not args.output:
print(json.dumps([asdict(s) for s in summaries], indent=2))
else:
print(f"Error: {args.path} is not a file or directory", file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
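
The duration estimate above tries several timestamp formats in turn. A minimal sketch of the same idea, parsing each timestamp independently so that start and end may use different formats, could look like this. The helper names `parse_timestamp` and `estimate_duration` are hypothetical and not part of the script; only the format strings are taken from it.

```python
from datetime import datetime
from typing import Optional

# Formats copied from the script's duration-estimate loop.
TIMESTAMP_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%fZ",
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%d %H:%M:%S",
)


def parse_timestamp(value: str) -> Optional[datetime]:
    """Hypothetical helper: return the first successful parse, or None."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None


def estimate_duration(start_time: Optional[str], end_time: Optional[str]) -> str:
    """Hypothetical helper: elapsed time in whole minutes, or "unknown"."""
    if not start_time or not end_time:
        return "unknown"
    start_dt = parse_timestamp(start_time)
    end_dt = parse_timestamp(end_time)
    if start_dt is None or end_dt is None:
        return "unknown"
    minutes = (end_dt - start_dt).total_seconds() / 60
    return f"{minutes:.0f}m"
```

Parsing the two values separately avoids one failed parse blocking the other when a transcript mixes timestamp styles.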


@@ -1,234 +0,0 @@
#!/usr/bin/env python3
"""
Session Transcript → Training Pair Harvester
Scans Hermes session JSONL files for Q&A patterns and extracts
terse→rich training pairs. Outputs JSONL matching the timmy-config
training pairs spec.
Usage:
python3 scripts/session_pair_harvester.py ~/.hermes/sessions/
python3 scripts/session_pair_harvester.py session.jsonl --output pairs.jsonl
python3 scripts/session_pair_harvester.py --dir ~/.hermes/sessions/ --min-ratio 2.0
Output format:
{"terse": "user short prompt", "rich": "ai detailed response", "source": "session_id", "model": "..."}
"""
import argparse
import hashlib
import json
import sys
from pathlib import Path
from typing import Optional
def compute_hash(text: str) -> str:
"""Content hash for deduplication."""
return hashlib.sha256(text.encode()).hexdigest()[:16]
def extract_pairs_from_session(session_data: dict, min_ratio: float = 1.5,
min_response_words: int = 20) -> list:
"""Extract terse→rich pairs from a single session object."""
pairs = []
conversations = session_data.get("conversations", [])
session_id = session_data.get("id", "unknown")
model = session_data.get("model", "unknown")
seen_hashes = set()
for i, msg in enumerate(conversations):
# Look for assistant/gpt responses
if msg.get("from") not in ("gpt", "assistant"):
continue
response_text = msg.get("value", "")
if not response_text or len(response_text.split()) < min_response_words:
continue
# Find the preceding human message
prompt_text = ""
for j in range(i - 1, -1, -1):
if conversations[j].get("from") == "human":
prompt_text = conversations[j].get("value", "")
break
if not prompt_text:
continue
# Filter: skip tool results, system messages embedded as human
if prompt_text.startswith("{") and "output" in prompt_text[:100]:
continue # likely a tool result
if prompt_text.startswith("# SOUL.md") or prompt_text.startswith("You are"):
continue # system prompt leak
# Quality filters
prompt_words = len(prompt_text.split())
response_words = len(response_text.split())
# Must have meaningful length ratio
if prompt_words == 0 or response_words == 0:
continue
ratio = response_words / prompt_words
if ratio < min_ratio:
continue
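# Skip very short responses made up mostly of fenced code (two or more fenced blocks, under 50 characters once the fence markers are removed)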
code_blocks = response_text.count("```")
if code_blocks >= 4 and len(response_text.replace("```", "").strip()) < 50:
continue
# Skip responses with tool call artifacts
if "tool_call" in response_text[:100] or "function_call" in response_text[:100]:
continue
# Deduplicate by content hash
content_hash = compute_hash(prompt_text + response_text[:200])
if content_hash in seen_hashes:
continue
seen_hashes.add(content_hash)
# No cleanup is applied yet; the response is kept as-is
clean_response = response_text
pairs.append({
"terse": prompt_text.strip(),
"rich": clean_response.strip(),
"source": session_id,
"model": model,
"prompt_words": prompt_words,
"response_words": response_words,
"ratio": round(ratio, 2),
})
return pairs
def extract_from_jsonl_file(filepath: str, **kwargs) -> list:
"""Extract pairs from a session JSONL file."""
pairs = []
path = Path(filepath)
if not path.exists():
print(f"Warning: {filepath} not found", file=sys.stderr)
return pairs
content = path.read_text()
lines = content.strip().split("\n")
for line in lines:
line = line.strip()
if not line:
continue
try:
session = json.loads(line)
except json.JSONDecodeError:
continue
session_pairs = extract_pairs_from_session(session, **kwargs)
pairs.extend(session_pairs)
return pairs
def deduplicate_pairs(pairs: list) -> list:
"""Remove duplicate pairs across files."""
seen = set()
unique = []
for pair in pairs:
key = compute_hash(pair["terse"] + pair["rich"][:200])
if key not in seen:
seen.add(key)
unique.append(pair)
return unique
def main():
parser = argparse.ArgumentParser(description="Harvest training pairs from session transcripts")
parser.add_argument("input", nargs="?", help="Session JSONL file or directory")
parser.add_argument("--dir", "-d", help="Directory to scan for session files")
parser.add_argument("--output", "-o", default="harvested_pairs.jsonl", help="Output file")
parser.add_argument("--min-ratio", type=float, default=1.5, help="Min response/prompt word ratio")
parser.add_argument("--min-words", type=int, default=20, help="Min response word count")
parser.add_argument("--dry-run", action="store_true", help="Print stats without writing")
args = parser.parse_args()
all_pairs = []
files_scanned = 0
scan_dir = args.dir or args.input
if not scan_dir:
parser.print_help()
sys.exit(1)
scan_path = Path(scan_dir)
if scan_path.is_dir():
jsonl_files = sorted(scan_path.rglob("*.jsonl"))
print(f"Scanning {len(jsonl_files)} files in {scan_dir}...", file=sys.stderr)
for fpath in jsonl_files:
pairs = extract_from_jsonl_file(
str(fpath),
min_ratio=args.min_ratio,
min_response_words=args.min_words
)
all_pairs.extend(pairs)
files_scanned += 1
else:
pairs = extract_from_jsonl_file(
str(scan_path),
min_ratio=args.min_ratio,
min_response_words=args.min_words
)
all_pairs.extend(pairs)
files_scanned = 1
# Deduplicate
unique_pairs = deduplicate_pairs(all_pairs)
# Stats
if unique_pairs:
avg_prompt = sum(p["prompt_words"] for p in unique_pairs) / len(unique_pairs)
avg_response = sum(p["response_words"] for p in unique_pairs) / len(unique_pairs)
avg_ratio = sum(p["ratio"] for p in unique_pairs) / len(unique_pairs)
else:
avg_prompt = avg_response = avg_ratio = 0
stats = {
"files_scanned": files_scanned,
"raw_pairs": len(all_pairs),
"unique_pairs": len(unique_pairs),
"duplicates_removed": len(all_pairs) - len(unique_pairs),
"avg_prompt_words": round(avg_prompt, 1),
"avg_response_words": round(avg_response, 1),
"avg_ratio": round(avg_ratio, 2),
}
print(json.dumps(stats, indent=2), file=sys.stderr)
if args.dry_run:
# Print sample pairs
for pair in unique_pairs[:3]:
print(f"\n--- Source: {pair['source']} (ratio: {pair['ratio']}) ---", file=sys.stderr)
print(f"TERSE: {pair['terse'][:100]}...", file=sys.stderr)
print(f"RICH: {pair['rich'][:150]}...", file=sys.stderr)
return
# Write output
output_path = Path(args.output)
with open(output_path, "w") as f:
for pair in unique_pairs:
# Strip internal fields for output
output = {
"terse": pair["terse"],
"rich": pair["rich"],
"source": pair["source"],
"model": pair["model"],
}
f.write(json.dumps(output) + "\n")
print(f"\nWrote {len(unique_pairs)} pairs to {output_path}", file=sys.stderr)
if __name__ == "__main__":
main()
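
The harvester's two core filters, the response/prompt word ratio and the content-hash deduplication, can be sketched in isolation as below. `compute_hash` mirrors the script; `keep_pair` and `dedupe` are hypothetical names that condense the script's logic and omit its tool-result and system-prompt filters.

```python
import hashlib


def compute_hash(text: str) -> str:
    """Same content hash as the script: first 16 hex chars of SHA-256."""
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def keep_pair(prompt: str, response: str,
              min_ratio: float = 1.5, min_response_words: int = 20) -> bool:
    """Hypothetical condensed filter: length floor plus word-ratio check."""
    prompt_words = len(prompt.split())
    response_words = len(response.split())
    if prompt_words == 0 or response_words < min_response_words:
        return False
    return response_words / prompt_words >= min_ratio


def dedupe(pairs: list) -> list:
    """Keep the first pair for each hash of prompt + first 200 response chars."""
    seen = set()
    unique = []
    for pair in pairs:
        key = compute_hash(pair["terse"] + pair["rich"][:200])
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique
```

Hashing only the first 200 characters of the response means two long responses that share a prompt and an opening passage are treated as duplicates, which is the same trade-off the script makes.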


@@ -1,147 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/automation_opportunity_finder.py — 8 tests."""
import json
import os
import sys
import tempfile
sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location(
"aof",
os.path.join(os.path.dirname(__file__) or ".", "automation_opportunity_finder.py"),
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
def test_analyze_cron_jobs_no_file():
"""Returns empty list when no cron jobs file exists."""
with tempfile.TemporaryDirectory() as tmpdir:
result = mod.analyze_cron_jobs(tmpdir)
assert result == []
print("PASS: test_analyze_cron_jobs_no_file")
def test_analyze_cron_jobs_disabled():
"""Detects disabled cron jobs."""
with tempfile.TemporaryDirectory() as tmpdir:
cron_dir = os.path.join(tmpdir, "cron")
os.makedirs(cron_dir)
jobs = [
{"id": "j1", "name": "backup", "enabled": False, "schedule": "0 * * * *"},
{"id": "j2", "name": "health", "enabled": True, "schedule": "*/5 * * * *"},
]
with open(os.path.join(cron_dir, "jobs.json"), "w") as f:
json.dump(jobs, f)
result = mod.analyze_cron_jobs(tmpdir)
assert any(p["category"] == "cron_disabled" for p in result)
print("PASS: test_analyze_cron_jobs_disabled")
def test_analyze_cron_jobs_errors():
"""Detects cron jobs with error status."""
with tempfile.TemporaryDirectory() as tmpdir:
cron_dir = os.path.join(tmpdir, "cron")
os.makedirs(cron_dir)
jobs = [
{"id": "j1", "name": "broken", "enabled": True, "last_status": "error", "schedule": "0 * * * *"},
]
with open(os.path.join(cron_dir, "jobs.json"), "w") as f:
json.dump(jobs, f)
result = mod.analyze_cron_jobs(tmpdir)
assert any(p["category"] == "cron_errors" for p in result)
print("PASS: test_analyze_cron_jobs_errors")
def test_analyze_documents_finds_todos():
"""Detects TODO markers in documents."""
with tempfile.TemporaryDirectory() as tmpdir:
docs_dir = os.path.join(tmpdir, "docs")
os.makedirs(docs_dir)
for i in range(3):
with open(os.path.join(docs_dir, f"guide{i}.md"), "w") as f:
f.write(f"# Guide {i}\n\nTODO: Automate this step\n")
result = mod.analyze_documents([tmpdir])
assert any(p["category"] == "manual_todo" for p in result)
todo_proposals = [p for p in result if p["category"] == "manual_todo"]
assert len(todo_proposals[0]["details"]) == 3
print("PASS: test_analyze_documents_finds_todos")
def test_analyze_scripts_repeated_commands():
"""Detects repeated shell commands across scripts."""
with tempfile.TemporaryDirectory() as tmpdir:
scripts_dir = os.path.join(tmpdir, "scripts")
os.makedirs(scripts_dir)
repeated_cmd = "docker restart myapp"
for i in range(4):
with open(os.path.join(scripts_dir, f"deploy{i}.sh"), "w") as f:
f.write(f"#!/bin/bash\n{repeated_cmd}\n")
result = mod.analyze_scripts([tmpdir])
assert any(p["category"] == "repeated_command" for p in result)
print("PASS: test_analyze_scripts_repeated_commands")
def test_analyze_session_transcripts():
"""Detects repeated tool-call sequences."""
with tempfile.TemporaryDirectory() as tmpdir:
sessions_dir = os.path.join(tmpdir, "sessions")
os.makedirs(sessions_dir)
for i in range(4):
with open(os.path.join(sessions_dir, f"session{i}.jsonl"), "w") as f:
f.write(json.dumps({"role": "user", "content": f"task {i}"}) + "\n")
f.write(json.dumps({
"role": "assistant",
"content": "working",
"tool_calls": [
{"function": {"name": "read_file"}},
{"function": {"name": "write_file"}},
]
}) + "\n")
result = mod.analyze_session_transcripts([sessions_dir])
assert any(p["category"] == "tool_sequence" for p in result)
seq_proposals = [p for p in result if p["category"] == "tool_sequence"]
assert any("read_file" in p["title"] and "write_file" in p["title"] for p in seq_proposals)
print("PASS: test_analyze_session_transcripts")
def test_deduplicate_proposals():
"""Deduplicates proposals with similar titles."""
proposals = [
{"title": "TODO found 3 times", "category": "manual_todo", "confidence": 0.7, "impact": "medium", "description": "x", "sources": []},
{"title": "TODO found 3 times", "category": "manual_todo", "confidence": 0.7, "impact": "medium", "description": "x", "sources": []},
{"title": "FIXME found 5 times", "category": "manual_fixme", "confidence": 0.8, "impact": "medium", "description": "y", "sources": []},
]
result = mod.deduplicate_proposals(proposals)
assert len(result) == 2
print("PASS: test_deduplicate_proposals")
def test_rank_proposals():
"""Ranks proposals by impact * confidence."""
proposals = [
{"title": "low", "category": "x", "confidence": 0.9, "impact": "low", "description": "", "sources": []},
{"title": "high", "category": "x", "confidence": 0.8, "impact": "high", "description": "", "sources": []},
{"title": "med", "category": "x", "confidence": 0.7, "impact": "medium", "description": "", "sources": []},
]
result = mod.rank_proposals(proposals)
assert result[0]["title"] == "high"
assert result[-1]["title"] == "low"
print("PASS: test_rank_proposals")
if __name__ == "__main__":
tests = [v for k, v in globals().items() if k.startswith("test_")]
passed = 0
failed = 0
for t in tests:
try:
t()
passed += 1
except Exception as e:
print(f"FAIL: {t.__name__}: {e}")
failed += 1
print(f"\n{passed}/{passed+failed} tests passed")
sys.exit(1 if failed else 0)
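
`test_rank_proposals` above expects ordering by impact multiplied by confidence. The implementation in `automation_opportunity_finder.py` is not shown here, so the following is only a sketch of one scoring scheme that satisfies that test; the numeric impact weights are an assumption.

```python
# Assumed weights; the real script may map impact levels differently.
IMPACT_WEIGHTS = {"high": 3, "medium": 2, "low": 1}


def rank_proposals(proposals: list) -> list:
    """Hypothetical ranking: sort by impact weight * confidence, highest first."""
    def score(proposal: dict) -> float:
        weight = IMPACT_WEIGHTS.get(proposal.get("impact", "low"), 1)
        return weight * proposal.get("confidence", 0.0)

    return sorted(proposals, key=score, reverse=True)
```

With these weights a high-impact proposal at 0.8 confidence outranks a low-impact one at 0.9, which is the behavior the test asserts.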


@@ -1,239 +0,0 @@
#!/usr/bin/env python3
"""
Tests for bootstrapper.py — context assembly from knowledge store.
"""
import json
import sys
import tempfile
from pathlib import Path
# Add scripts dir to path for import
sys.path.insert(0, str(Path(__file__).resolve().parent))
from bootstrapper import (
build_bootstrap_context,
estimate_tokens,
filter_facts,
load_index,
sort_facts,
truncate_to_tokens,
)
def make_index(facts: list[dict], tmp_dir: Path) -> Path:
"""Create a temporary index.json with given facts."""
index = {
"version": 1,
"last_updated": "2026-04-13T20:00:00Z",
"total_facts": len(facts),
"facts": facts,
}
path = tmp_dir / "index.json"
with open(path, "w") as f:
json.dump(index, f)
return path
def test_empty_index():
"""Empty knowledge store produces graceful output."""
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
index_path = make_index([], tmp_dir)
# Create empty knowledge dirs
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
context = build_bootstrap_context(
repo="the-nexus", index_path=index_path
)
assert "No relevant knowledge found" in context
assert "Starting fresh" in context
print("PASS: empty_index")
def test_filter_by_repo():
"""Filter facts by repository."""
facts = [
{"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
{"fact": "B", "category": "fact", "repo": "fleet-ops", "confidence": 0.8},
{"fact": "C", "category": "fact", "repo": "global", "confidence": 0.7},
]
filtered = filter_facts(facts, repo="the-nexus", include_global=True)
texts = [f["fact"] for f in filtered]
assert "A" in texts
assert "B" not in texts
assert "C" in texts
print("PASS: filter_by_repo")
def test_filter_by_agent():
"""Filter facts by agent type."""
facts = [
{"fact": "A", "category": "pattern", "repo": "global", "agent": "mimo-sprint", "confidence": 0.8},
{"fact": "B", "category": "pattern", "repo": "global", "agent": "groq-fast", "confidence": 0.7},
{"fact": "C", "category": "fact", "repo": "global", "confidence": 0.9},
]
filtered = filter_facts(facts, agent="mimo-sprint", include_global=True)
texts = [f["fact"] for f in filtered]
assert "A" in texts
assert "B" not in texts
assert "C" in texts # global, no agent restriction
print("PASS: filter_by_agent")
def test_no_global_flag():
"""Excluding global facts works."""
facts = [
{"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
{"fact": "B", "category": "fact", "repo": "global", "confidence": 0.8},
]
filtered = filter_facts(facts, repo="the-nexus", include_global=False)
texts = [f["fact"] for f in filtered]
assert "A" in texts
assert "B" not in texts
print("PASS: no_global_flag")
def test_sort_by_confidence():
"""Facts sort by confidence descending."""
facts = [
{"fact": "low", "category": "fact", "repo": "global", "confidence": 0.3},
{"fact": "high", "category": "fact", "repo": "global", "confidence": 0.95},
{"fact": "mid", "category": "fact", "repo": "global", "confidence": 0.7},
]
sorted_f = sort_facts(facts)
assert sorted_f[0]["fact"] == "high"
assert sorted_f[1]["fact"] == "mid"
assert sorted_f[2]["fact"] == "low"
print("PASS: sort_by_confidence")
def test_sort_pitfalls_first():
"""Pitfalls sort before facts at same confidence."""
facts = [
{"fact": "regular fact", "category": "fact", "repo": "global", "confidence": 0.8},
{"fact": "danger pitfall", "category": "pitfall", "repo": "global", "confidence": 0.8},
]
sorted_f = sort_facts(facts)
assert sorted_f[0]["category"] == "pitfall"
print("PASS: sort_pitfalls_first")
def test_truncate_to_tokens():
"""Truncation cuts at line boundary."""
text = "line1\nline2\nline3\nline4\nline5\n"
truncated = truncate_to_tokens(text, max_tokens=2) # ~8 chars
assert "line1" in truncated
assert "truncated" in truncated.lower()
print("PASS: truncate_to_tokens")
def test_estimate_tokens():
"""Token estimation is reasonable."""
text = "a" * 400
tokens = estimate_tokens(text)
assert 90 <= tokens <= 110 # ~100 tokens
print("PASS: estimate_tokens")
def test_build_full_context():
"""Full context with facts renders correctly."""
facts = [
{"fact": "API merges fail with 405", "category": "pitfall", "repo": "the-nexus", "confidence": 0.95},
{"fact": "Has 50+ open PRs", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
{"fact": "Token at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9},
{"fact": "Check git remote -v first", "category": "pattern", "repo": "global", "confidence": 0.8},
]
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
index_path = make_index(facts, tmp_dir)
# Create knowledge dirs
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
context = build_bootstrap_context(
repo="the-nexus",
agent="mimo-sprint",
include_global=True,
index_path=index_path,
)
assert "What You Know" in context
assert "PITFALLS" in context
assert "API merges fail with 405" in context
assert "the-nexus" in context
assert "Token at" in context # global fact included
print("PASS: build_full_context")
def test_max_tokens_respected():
"""Output respects max_tokens limit."""
# Generate lots of facts
facts = [
{"fact": f"Fact number {i} with some detail about things", "category": "fact", "repo": "global", "confidence": 0.8}
for i in range(100)
]
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
index_path = make_index(facts, tmp_dir)
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
context = build_bootstrap_context(
repo=None,
max_tokens=500,
index_path=index_path,
)
actual_tokens = estimate_tokens(context)
# Allow 10% overshoot since we cut at line boundaries
assert actual_tokens <= 550, f"Expected ~500 tokens, got {actual_tokens}"
print(f"PASS: max_tokens_respected (got {actual_tokens} tokens)")
def test_missing_index_graceful():
"""Missing index.json doesn't crash."""
with tempfile.TemporaryDirectory() as tmp:
tmp_dir = Path(tmp)
# Don't create index.json
for sub in ["repos", "agents", "global"]:
(tmp_dir / sub).mkdir(exist_ok=True)
fake_index = tmp_dir / "nonexistent.json"
context = build_bootstrap_context(repo="anything", index_path=fake_index)
assert "No relevant knowledge found" in context
print("PASS: missing_index_graceful")
if __name__ == "__main__":
tests = [
test_empty_index,
test_filter_by_repo,
test_filter_by_agent,
test_no_global_flag,
test_sort_by_confidence,
test_sort_pitfalls_first,
test_truncate_to_tokens,
test_estimate_tokens,
test_build_full_context,
test_max_tokens_respected,
test_missing_index_graceful,
]
passed = 0
failed = 0
for test in tests:
try:
test()
passed += 1
except Exception as e:
print(f"FAIL: {test.__name__}: {e}")
failed += 1
print(f"\n{passed} passed, {failed} failed")
sys.exit(0 if failed == 0 else 1)
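
The tests above imply a token estimate of roughly four characters per token and a truncation that cuts at a line boundary and appends a marker. `bootstrapper.py` itself is not shown, so this is a sketch under those assumptions; the marker text and the exact divisor are guesses chosen to satisfy `test_estimate_tokens` and `test_truncate_to_tokens`.

```python
CHARS_PER_TOKEN = 4  # assumption inferred from the test expecting ~100 tokens for 400 chars


def estimate_tokens(text: str) -> int:
    """Rough token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Cut text to the token budget, backing up to the last full line."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    last_newline = cut.rfind("\n")
    if last_newline > 0:
        cut = cut[:last_newline]  # drop the partial trailing line
    return cut + "\n[... truncated ...]"
```

Because the marker is appended after the cut, the result can slightly exceed the budget, which is consistent with the 10% overshoot that `test_max_tokens_respected` allows.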


@@ -1,189 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/diff_analyzer.py — 10 tests."""
import sys
import os
sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location("da", os.path.join(os.path.dirname(__file__) or ".", "diff_analyzer.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
DiffAnalyzer = mod.DiffAnalyzer
ChangeCategory = mod.ChangeCategory
SAMPLE_ADD = """diff --git a/new.py b/new.py
new file mode 100644
--- /dev/null
+++ b/new.py
@@ -0,0 +1,3 @@
+def hello():
+ print("world")
+ return True
"""
SAMPLE_DELETE = """diff --git a/old.py b/old.py
deleted file mode 100644
--- a/old.py
+++ /dev/null
@@ -1,2 +0,0 @@
-def goodbye():
- pass
"""
SAMPLE_MODIFY = """diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,3 +1,4 @@
def main():
- print("old")
+ print("new")
+ print("extra")
return 0
"""
SAMPLE_RENAME = """diff --git a/old_name.py b/new_name.py
rename from old_name.py
rename to new_name.py
--- a/old_name.py
+++ b/new_name.py
@@ -1,1 +1,1 @@
-old content
+new content
"""
SAMPLE_MULTI = """diff --git a/a.py b/a.py
--- a/a.py
+++ b/a.py
@@ -1,1 +1,2 @@
existing
+added line
diff --git a/b.py b/b.py
new file mode 100644
--- /dev/null
+++ b/b.py
@@ -0,0 +1,1 @@
+new file
"""
SAMPLE_BINARY = """diff --git a/img.png b/img.png
Binary files a/img.png and b/img.png differ
"""
def test_empty():
a = DiffAnalyzer()
s = a.analyze("")
assert s.total_files_changed == 0
print("PASS: test_empty")
def test_addition():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_ADD)
assert s.total_files_changed == 1
assert s.total_added == 3
assert s.total_deleted == 0
assert s.new_files == 1
assert s.files[0].hunks[0].category == ChangeCategory.ADDED
print("PASS: test_addition")
def test_deletion():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_DELETE)
assert s.total_deleted == 2
assert s.deleted_files == 1
assert s.files[0].hunks[0].category == ChangeCategory.DELETED
print("PASS: test_deletion")
def test_modification():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_MODIFY)
assert s.total_added == 2
assert s.total_deleted == 1
assert s.files[0].hunks[0].category == ChangeCategory.MODIFIED
print("PASS: test_modification")
def test_rename():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_RENAME)
assert s.renamed_files == 1
assert s.files[0].old_path == "old_name.py"
assert s.files[0].path == "new_name.py"
assert s.files[0].is_renamed is True
print("PASS: test_rename")
def test_multiple_files():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_MULTI)
assert s.total_files_changed == 2
assert s.new_files == 1
print("PASS: test_multiple_files")
def test_binary():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_BINARY)
assert s.binary_files == 1
assert s.files[0].is_binary is True
assert len(s.files[0].hunks) == 0
print("PASS: test_binary")
def test_to_dict():
a = DiffAnalyzer()
s = a.analyze(SAMPLE_MODIFY)
d = s.to_dict()
assert "total_files_changed" in d
assert "files" in d
assert isinstance(d["files"], list)
print("PASS: test_to_dict")
def test_context_only():
diff = """diff --git a/f.py b/f.py
--- a/f.py
+++ b/f.py
@@ -1,3 +1,3 @@
line1
-old
+new
line3
"""
a = DiffAnalyzer()
s = a.analyze(diff)
# Has both added and deleted = MODIFIED
assert s.files[0].hunks[0].category == ChangeCategory.MODIFIED
print("PASS: test_context_only")
def test_multi_hunk():
diff = """diff --git a/f.py b/f.py
--- a/f.py
+++ b/f.py
@@ -1,1 +1,2 @@
existing
+first addition
@@ -10,1 +11,2 @@
more
+second addition
"""
a = DiffAnalyzer()
s = a.analyze(diff)
assert s.total_hunks == 2
assert s.total_added == 2
print("PASS: test_multi_hunk")
def run_all():
test_empty()
test_addition()
test_deletion()
test_modification()
test_rename()
test_multiple_files()
test_binary()
test_to_dict()
test_context_only()
test_multi_hunk()
print("\nAll 10 tests passed!")
if __name__ == "__main__":
run_all()


@@ -1,109 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/gitea_issue_parser.py"""
import sys
import os
sys.path.insert(0, os.path.dirname(__file__) or ".")
# Import from sibling
import importlib.util
spec = importlib.util.spec_from_file_location("parser", os.path.join(os.path.dirname(__file__) or ".", "gitea_issue_parser.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
parse_issue_body = mod.parse_issue_body
def test_basic_parsing():
body = """## Context
This is the background info.
## Acceptance Criteria
- [ ] First criterion
- [x] Second criterion (done)
## What to build
Some description.
"""
result = parse_issue_body(body, title="Test (#42)", labels=["bug"])
assert result["title"] == "Test (#42)"
assert result["labels"] == ["bug"]
assert result["epic_ref"] == 42
assert len(result["criteria"]) == 2
assert result["criteria"][0]["text"] == "First criterion"
assert result["criteria"][0]["checked"] is False
assert result["criteria"][1]["checked"] is True
assert "context" in result["sections"]
print("PASS: test_basic_parsing")
def test_numbered_criteria():
body = """## Acceptance Criteria
1. First item
2. Second item
3. Third item
"""
result = parse_issue_body(body)
assert len(result["criteria"]) == 3
assert result["criteria"][0]["text"] == "First item"
print("PASS: test_numbered_criteria")
def test_epic_ref_from_body():
body = "Closes #123\n\nSome description."
result = parse_issue_body(body)
assert result["epic_ref"] == 123
print("PASS: test_epic_ref_from_body")
def test_empty_body():
result = parse_issue_body("")
assert result["criteria"] == []
assert result["context"] == ""
assert result["sections"] == {}
print("PASS: test_empty_body")
def test_no_sections():
body = "Just a plain issue body with no headings."
result = parse_issue_body(body)
assert result["context"] == "Just a plain issue body with no headings."
print("PASS: test_no_sections")
def test_multiple_sections():
body = """## Problem
Something is broken.
## Fix
Do this instead.
## Notes
Additional info.
"""
result = parse_issue_body(body)
assert "problem" in result["sections"]
assert "fix" in result["sections"]
assert "notes" in result["sections"]
assert "Something is broken" in result["sections"]["problem"]
print("PASS: test_multiple_sections")
def run_all():
test_basic_parsing()
test_numbered_criteria()
test_epic_ref_from_body()
test_empty_body()
test_no_sections()
test_multiple_sections()
print("\nAll 6 tests passed!")
if __name__ == "__main__":
run_all()
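
The criteria shapes these tests check (checkbox items with a `checked` flag, and numbered items) can be matched with two regular expressions. `gitea_issue_parser.py` is not shown, so this is a hypothetical sketch: `parse_criteria` and both patterns are assumptions, and unlike the real parser it scans the whole body rather than only the "Acceptance Criteria" section.

```python
import re

# Assumed patterns: "- [ ] text" / "- [x] text" and "1. text".
CHECKBOX_RE = re.compile(r"^\s*[-*]\s*\[([ xX])\]\s*(.+)$")
NUMBERED_RE = re.compile(r"^\s*\d+\.\s+(.+)$")


def parse_criteria(body: str) -> list[dict]:
    """Hypothetical helper: collect checkbox and numbered list items."""
    criteria = []
    for line in body.splitlines():
        match = CHECKBOX_RE.match(line)
        if match:
            criteria.append({
                "text": match.group(2).strip(),
                "checked": match.group(1).lower() == "x",
            })
            continue
        match = NUMBERED_RE.match(line)
        if match:
            # Numbered items carry no checkbox, so treat them as unchecked.
            criteria.append({"text": match.group(1).strip(), "checked": False})
    return criteria
```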


@@ -1,129 +1,41 @@
#!/usr/bin/env python3
"""
Test harness for knowledge extraction prompt.
Validates output structure, content quality, and hallucination resistance.
Usage:
python3 scripts/test_harvest_prompt.py # Run all tests
python3 scripts/test_harvest_prompt.py --transcript FILE # Test against a real transcript
python3 scripts/test_harvest_prompt.py --validate FILE # Validate an existing extraction JSON
Test script for knowledge extraction prompt.
Validates that the prompt produces consistent, structured output.
"""
import json
import sys
import argparse
from pathlib import Path
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED_FIELDS = {"fact", "category", "repo", "confidence", "evidence"}
REQUIRED_META = {"session_outcome", "tools_used", "repos_touched", "error_count", "knowledge_count"}
def validate_knowledge_item(item, idx):
"""Validate a single knowledge item. Returns list of errors."""
errors = []
if not isinstance(item, dict):
return [f"Item {idx}: not a dict"]
for field in REQUIRED_FIELDS:
def validate_knowledge_item(item):
"""Validate a single knowledge item."""
required_fields = ["fact", "category", "repo", "confidence"]
for field in required_fields:
if field not in item:
errors.append(f"Item {idx}: missing field '{field}'")
if not isinstance(item.get("fact", ""), str) or len(item.get("fact", "").strip()) == 0:
errors.append(f"Item {idx}: fact must be a non-empty string")
if item.get("category") not in VALID_CATEGORIES:
errors.append(f"Item {idx}: invalid category '{item.get('category')}'")
if not isinstance(item.get("repo", ""), str) or len(item.get("repo", "").strip()) == 0:
errors.append(f"Item {idx}: repo must be a non-empty string")
conf = item.get("confidence")
if not isinstance(conf, (int, float)) or not (0.0 <= conf <= 1.0):
errors.append(f"Item {idx}: confidence must be a number 0.0-1.0, got {conf}")
if not isinstance(item.get("evidence", ""), str) or len(item.get("evidence", "").strip()) == 0:
errors.append(f"Item {idx}: evidence must be a non-empty string (hallucination check)")
return errors
return False, f"Missing field: {field}"
if not isinstance(item["fact"], str) or len(item["fact"].strip()) == 0:
return False, "Fact must be a non-empty string"
valid_categories = ["fact", "pitfall", "pattern", "tool-quirk", "question"]
if item["category"] not in valid_categories:
return False, f"Invalid category: {item['category']}"
if not isinstance(item["repo"], str):
return False, "Repo must be a string"
if not isinstance(item["confidence"], (int, float)):
return False, "Confidence must be a number"
if not (0.0 <= item["confidence"] <= 1.0):
return False, "Confidence must be between 0.0 and 1.0"
return True, "Valid"
def validate_extraction(data):
"""Validate a full extraction result. Returns (is_valid, errors, warnings)."""
errors = []
warnings = []
if not isinstance(data, dict):
return False, ["Root is not a JSON object"], []
if "knowledge" not in data:
return False, ["Missing 'knowledge' array"], []
if not isinstance(data["knowledge"], list):
return False, ["'knowledge' is not an array"], []
for i, item in enumerate(data["knowledge"]):
errors.extend(validate_knowledge_item(item, i))
# Meta block validation
if "meta" not in data:
warnings.append("Missing 'meta' block (session_outcome, tools_used, etc.)")
else:
meta = data["meta"]
for field in REQUIRED_META:
if field not in meta:
warnings.append(f"Meta missing field '{field}'")
# Quality checks
facts = data["knowledge"]
if len(facts) == 0:
warnings.append("No knowledge extracted (empty session or extraction failure)")
# Check for near-duplicate facts
seen_facts = set()
for item in facts:
normalized = item.get("fact", "").lower().strip()[:80]
if normalized in seen_facts:
warnings.append(f"Duplicate fact detected: '{normalized[:50]}...'")
seen_facts.add(normalized)
# Check confidence distribution
confidences = [item.get("confidence", 0) for item in facts]
if confidences:
avg_conf = sum(confidences) / len(confidences)
if avg_conf > 0.9:
warnings.append(f"Average confidence {avg_conf:.2f} is suspiciously high (possible hallucination)")
if avg_conf < 0.4:
warnings.append(f"Average confidence {avg_conf:.2f} is very low (extraction may be too cautious)")
return len(errors) == 0, errors, warnings
def validate_transcript_coverage(data, transcript):
"""
Check that extracted facts are actually supported by the transcript.
This is a heuristic — checks that key terms from facts appear in transcript.
Returns list of potential hallucinations.
"""
hallucinations = []
transcript_lower = transcript.lower()
for item in data.get("knowledge", []):
fact = item.get("fact", "")
evidence = item.get("evidence", "")
# Check if evidence string appears in transcript
if evidence and evidence.lower() not in transcript_lower:
# Partial match — check if key terms are present
evidence_words = set(evidence.lower().split())
transcript_words = set(transcript_lower.split())
overlap = evidence_words & transcript_words
if len(overlap) < min(3, len(evidence_words) * 0.3):
hallucinations.append({
"fact": fact[:80],
"evidence": evidence[:80],
"issue": "Evidence not found in transcript — possible hallucination"
})
return hallucinations
# --- Test transcripts ---
SAMPLE_TRANSCRIPT_SUCCESS = """
def test_sample_transcript():
"""Test with a sample transcript."""
sample_transcript = """
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
Agent: Trying git clone... Error: 401 Unauthorized
Agent: Checking for token... Found in ~/.config/gitea/token
@@ -132,253 +44,30 @@ Agent: Clone successful with token authentication
Agent: Now examining the repo structure...
Agent: Found knowledge/index.json with 0 facts
Agent: Creating harvest-prompt.md in templates/
Agent: Done. Files created: templates/harvest-prompt.md, scripts/test_harvest_prompt.py
"""
SAMPLE_TRANSCRIPT_FAILURE = """
User: Deploy the cron jobs to production
Agent: Connecting to VPS... SSH timeout after 30s
Agent: Retrying with verbose output... Connection refused on port 22
Agent: Checking firewall rules... iptables shows port 22 is open
Agent: Checking sshd config... sshd is not running
Agent: Attempting to start sshd... systemctl start sshd failed: unit not found
Agent: ERROR: OpenSSH server not installed on this VPS
Agent: Cannot proceed without SSH access. Deployment blocked.
"""
SAMPLE_TRANSCRIPT_API_QUIRK = """
User: List all open issues in hermes-agent
Agent: Querying Gitea API... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?state=open&limit=50
Agent: Response: 0 items. That's odd.
Agent: Trying without state filter... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?limit=50
Agent: Response: 50 items returned, but all have pull_request field. These are PRs, not issues.
Agent: Gitea issues API returns PRs alongside issues. Must filter by presence of pull_request key.
Agent: Also discovered: state=open parameter returns empty when used with limit=50, but works with limit=5.
Agent: This appears to be a Gitea API quirk — pagination interacts with state filter unexpectedly.
Agent: Workaround: fetch without state filter, then filter client-side for state=='open' and 'pull_request' not in item.
"""
def run_tests():
"""Run the built-in test suite."""
tests_passed = 0
tests_failed = 0
print("=" * 60)
print("KNOWLEDGE EXTRACTION PROMPT — TEST SUITE")
print("=" * 60)
# Test 1: Prompt file exists and is under 2k tokens (~8k chars)
print("\n[Test 1] Prompt file size constraint")
prompt_path = Path("templates/harvest-prompt.md")
if not prompt_path.exists():
print(" FAIL: harvest-prompt.md not found")
tests_failed += 1
else:
size = prompt_path.stat().st_size
# Rough token estimate: ~4 chars per token
est_tokens = size / 4
print(f" Prompt size: {size} bytes (~{est_tokens:.0f} tokens)")
if est_tokens > 2000:
print(f" WARN: Prompt exceeds the ~2000-token budget (~{est_tokens:.0f} estimated)")
else:
print(f" PASS: Within token budget")
tests_passed += 1
# Test 2: Validate a well-formed extraction
print("\n[Test 2] Valid extraction passes validation")
valid_extraction = {
"knowledge": [
{
"fact": "Gitea auth token is at ~/.config/gitea/token",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.9,
"evidence": "Found in ~/.config/gitea/token"
},
{
"fact": "Clone fails with 401 when no token is provided",
"category": "pitfall",
"repo": "compounding-intelligence",
"confidence": 0.9,
"evidence": "Error: 401 Unauthorized"
}
],
"meta": {
"session_outcome": "success",
"tools_used": ["git"],
"repos_touched": ["compounding-intelligence"],
"error_count": 1,
"knowledge_count": 2
}
}
is_valid, errors, warnings = validate_extraction(valid_extraction)
if is_valid:
print(f" PASS: Valid extraction accepted ({len(warnings)} warnings)")
tests_passed += 1
else:
print(f" FAIL: Valid extraction rejected: {errors}")
tests_failed += 1
# Test 3: Reject missing fields
print("\n[Test 3] Missing fields are rejected")
bad_extraction = {
"knowledge": [
{"fact": "Something learned", "category": "fact"} # Missing repo, confidence, evidence
]
}
is_valid, errors, warnings = validate_extraction(bad_extraction)
if not is_valid:
print(f" PASS: Rejected with {len(errors)} errors")
tests_passed += 1
else:
print(f" FAIL: Should have rejected missing fields")
tests_failed += 1
# Test 4: Reject invalid category
print("\n[Test 4] Invalid category is rejected")
bad_cat = {
"knowledge": [
{"fact": "Test", "category": "discovery", "repo": "x", "confidence": 0.8, "evidence": "test"}
]
}
is_valid, errors, warnings = validate_extraction(bad_cat)
if not is_valid and any("category" in e for e in errors):
print(f" PASS: Invalid category 'discovery' rejected")
tests_passed += 1
else:
print(f" FAIL: Should have rejected invalid category")
tests_failed += 1
# Test 5: Detect near-duplicates
print("\n[Test 5] Duplicate detection")
dup_extraction = {
"knowledge": [
{"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "a"},
{"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "b"}
],
"meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 2}
}
is_valid, errors, warnings = validate_extraction(dup_extraction)
if any("Duplicate" in w for w in warnings):
print(f" PASS: Duplicate detected")
tests_passed += 1
else:
print(f" FAIL: Should have detected duplicate")
tests_failed += 1
# Test 6: Hallucination check against transcript
print("\n[Test 6] Hallucination detection")
hallucinated = {
"knowledge": [
{
"fact": "Database port is 5433",
"category": "fact",
"repo": "x",
"confidence": 0.9,
"evidence": "PostgreSQL listening on port 5433"
}
],
"meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 1}
}
hallucinations = validate_transcript_coverage(hallucinated, SAMPLE_TRANSCRIPT_SUCCESS)
if hallucinations:
print(f" PASS: Hallucination detected ({len(hallucinations)} items)")
tests_passed += 1
else:
print(f" FAIL: Should have detected hallucinated evidence")
tests_failed += 1
# Test 7: Failed session should extract pitfalls
print("\n[Test 7] Failed session extraction shape")
failed_extraction = {
"knowledge": [
{
"fact": "SSH server not installed on target VPS",
"category": "pitfall",
"repo": "global",
"confidence": 0.9,
"evidence": "ERROR: OpenSSH server not installed on this VPS"
},
{
"fact": "VPS blocks deployment without SSH access",
"category": "question",
"repo": "global",
"confidence": 0.7,
"evidence": "Cannot proceed without SSH access. Deployment blocked."
}
],
"meta": {
"session_outcome": "failed",
"tools_used": ["ssh", "systemctl"],
"repos_touched": [],
"error_count": 3,
"knowledge_count": 2
}
}
is_valid, errors, warnings = validate_extraction(failed_extraction)
if is_valid:
categories = [item["category"] for item in failed_extraction["knowledge"]]
if "pitfall" in categories:
print(f" PASS: Failed session extracted {len(categories)} items including pitfalls")
tests_passed += 1
else:
print(f" FAIL: Failed session should extract pitfalls")
tests_failed += 1
else:
print(f" FAIL: {errors}")
tests_failed += 1
# Test 8: Empty extraction is warned
print("\n[Test 8] Empty extraction warning")
empty = {"knowledge": [], "meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 0}}
is_valid, errors, warnings = validate_extraction(empty)
if any("No knowledge" in w for w in warnings):
print(f" PASS: Empty extraction warned")
tests_passed += 1
else:
print(f" FAIL: Should warn on empty extraction")
tests_failed += 1
# Summary
print(f"\n{'=' * 60}")
print(f"Results: {tests_passed} passed, {tests_failed} failed")
print(f"{'=' * 60}")
return tests_failed == 0
def validate_file(filepath):
"""Validate an existing extraction JSON file."""
path = Path(filepath)
if not path.exists():
print(f"ERROR: {filepath} not found")
return False
data = json.loads(path.read_text())
is_valid, errors, warnings = validate_extraction(data)
print(f"Validation of {filepath}:")
print(f" Knowledge items: {len(data.get('knowledge', []))}")
print(f" Errors: {len(errors)}")
print(f" Warnings: {len(warnings)}")
for e in errors:
print(f" ERROR: {e}")
for w in warnings:
print(f" WARN: {w}")
return is_valid
# This would be replaced with actual prompt execution
print("Sample transcript processed")
print("Expected categories: fact, pitfall, pattern, tool-quirk, question")
return True
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Test knowledge extraction prompt")
parser.add_argument("--validate", help="Validate an existing extraction JSON file")
parser.add_argument("--transcript", help="Test against a real transcript file (informational)")
args = parser.parse_args()
if args.validate:
success = validate_file(args.validate)
sys.exit(0 if success else 1)
else:
success = run_tests()
sys.exit(0 if success else 1)
print("Testing knowledge extraction prompt...")
# Test 1: Validate prompt file exists
prompt_path = Path("templates/harvest-prompt.md")
if not prompt_path.exists():
print("ERROR: harvest-prompt.md not found")
sys.exit(1)
print(f"OK: Prompt file exists: {prompt_path}")
# Test 2: Check prompt size
prompt_size = prompt_path.stat().st_size
print(f"OK: Prompt size: {prompt_size} bytes")
# Test 3: Test sample transcript processing
if test_sample_transcript():
print("OK: Sample transcript test passed")
print("\nAll tests passed!")
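
Note on the hallucination heuristic in `validate_transcript_coverage` above: it accepts evidence that appears verbatim in the transcript and otherwise falls back to a word-overlap threshold. The sketch below isolates that decision for a single evidence string. The name `evidence_supported` is hypothetical and not part of the repository; the threshold `min(3, 0.3 * word_count)` is copied from the code above.

```python
def evidence_supported(evidence: str, transcript: str) -> bool:
    """Return True when the evidence is plausibly grounded in the transcript.

    Hypothetical helper: mirrors the per-item check in
    validate_transcript_coverage, but for one evidence string.
    """
    evidence_lower = evidence.lower()
    transcript_lower = transcript.lower()

    # Empty evidence is skipped by the original check, and a verbatim
    # substring match is accepted immediately.
    if not evidence_lower or evidence_lower in transcript_lower:
        return True

    # Fallback: compare whitespace-separated word sets.
    evidence_words = set(evidence_lower.split())
    transcript_words = set(transcript_lower.split())
    overlap = evidence_words & transcript_words

    # Same threshold as the original: at least 3 shared words, or 30% of
    # the evidence words when the evidence is short.
    return len(overlap) >= min(3, len(evidence_words) * 0.3)
```

Because the fallback splits on whitespace only, punctuation stays attached to words (`token...` and `token` are different words), so the heuristic is conservative about partial matches.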

@@ -1,72 +0,0 @@
#!/usr/bin/env python3
"""Comprehensive tests for knowledge extraction prompt."""
import json, re
from pathlib import Path
def check_prompt_structure():
p = Path("templates/harvest-prompt.md")
if not p.exists(): return False, "harvest-prompt.md not found"
c = p.read_text()
for s in ["System Prompt","Instructions","Categories","Output Format","Confidence Scoring","Constraints","Example"]:
if s.lower() not in c.lower(): return False, f"Missing section: {s}"
for cat in ["fact","pitfall","pattern","tool-quirk","question"]:
if cat not in c: return False, f"Missing category: {cat}"
if len(c) > 5000: return False, f"Too large: {len(c)}"
if len(c) < 1000: return False, f"Too small: {len(c)}"
return True, "Prompt structure is valid"
def check_confidence_scoring():
c = Path("templates/harvest-prompt.md").read_text()
for l in ["0.9-1.0","0.7-0.8","0.5-0.6","0.3-0.4","0.1-0.2"]:
if l not in c: return False, f"Missing level: {l}"
return True, "Confidence scoring defined"
def check_example_quality():
c = Path("templates/harvest-prompt.md").read_text()
if "example" not in c.lower(): return False, "No examples"
m = re.search(r'"knowledge"', c[c.lower().find("example"):])
if not m: return False, "No JSON example"
return True, "Examples present"
def check_constraint_coverage():
c = Path("templates/harvest-prompt.md").read_text()
for x in ["no hallucination","explicitly","partial","failed sessions"]:
if x not in c.lower(): return False, f"Missing: {x}"
return True, "Constraints covered"
def check_test_sessions():
d = Path("test_sessions")
if not d.exists(): return False, "test_sessions/ not found"
files = list(d.glob("*.jsonl"))
if len(files) < 5: return False, f"Only {len(files)} sessions"
for f in files:
for i, line in enumerate(f.read_text().strip().split("\n"), 1):
try: json.loads(line)
except json.JSONDecodeError as e: return False, f"{f.name}:{i}: {e}"
return True, f"{len(files)} valid sessions"
def test_prompt_structure():
passed, msg = check_prompt_structure()
assert passed, msg
def test_confidence_scoring():
passed, msg = check_confidence_scoring()
assert passed, msg
def test_example_quality():
passed, msg = check_example_quality()
assert passed, msg
def test_constraint_coverage():
passed, msg = check_constraint_coverage()
assert passed, msg
def test_test_sessions():
passed, msg = check_test_sessions()
assert passed, msg
if __name__ == "__main__":
checks = [check_prompt_structure, check_confidence_scoring, check_example_quality, check_constraint_coverage, check_test_sessions]
for fn in checks:
ok, msg = fn()
print(f"{'PASS' if ok else 'FAIL'}: {fn.__name__} -- {msg}")
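
Note on `check_test_sessions` above: it parses each session file line by line and reports the first line that is not valid JSON. A minimal sketch of that per-file check, operating on a string rather than a file, could look like the following. The helper name `first_invalid_jsonl_line` is hypothetical, and skipping blank lines is an assumption that differs from the original, which would report a blank line as an error.

```python
import json
from typing import Optional


def first_invalid_jsonl_line(text: str) -> Optional[int]:
    """Return the 1-based number of the first line that is not valid JSON.

    Returns None when every non-blank line parses. Hypothetical helper,
    not part of the repository.
    """
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # assumption: blank lines are tolerated
        try:
            json.loads(line)
        except json.JSONDecodeError:
            return lineno
    return None
```

Returning the line number rather than a boolean keeps the failure message as specific as the `f"{f.name}:{i}: {e}"` message used above.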

@@ -1,162 +0,0 @@
#!/usr/bin/env python3
"""
Smoke test for harvester pipeline — verifies the full chain:
session_reader -> prompt -> LLM (mocked) -> validate -> deduplicate -> store
Does NOT call the real LLM. Tests plumbing only.
"""
import json
import sys
import tempfile
import os
from pathlib import Path
# Setup path
SCRIPT_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, str(SCRIPT_DIR))
from session_reader import read_session, extract_conversation, truncate_for_context, messages_to_text
from harvester import validate_fact, deduplicate, load_existing_knowledge, fact_fingerprint
def test_session_reader():
"""Test that session_reader parses JSONL correctly."""
with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
f.write('{"role": "user", "content": "Hello", "timestamp": "2026-04-13T10:00:00Z"}\n')
f.write('{"role": "assistant", "content": "Hi there", "timestamp": "2026-04-13T10:00:01Z"}\n')
f.write('{"role": "user", "content": "Clone the repo", "timestamp": "2026-04-13T10:00:02Z"}\n')
f.write('{"role": "assistant", "content": "Cloned successfully", "timestamp": "2026-04-13T10:00:05Z"}\n')
path = f.name
messages = read_session(path)
assert len(messages) == 4, f"Expected 4 messages, got {len(messages)}"
conv = extract_conversation(messages)
assert len(conv) == 4, f"Expected 4 conversation turns, got {len(conv)}"
text = messages_to_text(conv)
assert "USER: Hello" in text
assert "ASSISTANT: Hi there" in text
truncated = truncate_for_context(conv, head=2, tail=2)
assert len(truncated) == 4 # 4 <= head+tail, so no truncation
os.unlink(path)
print(" [PASS] session_reader pipeline works")
def test_validate_fact():
"""Test fact validation."""
good = {"fact": "Gitea token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}
assert validate_fact(good), "Valid fact should pass"
bad_missing = {"fact": "Something", "category": "fact"}
assert not validate_fact(bad_missing), "Missing fields should fail"
bad_category = {"fact": "Something", "category": "nonsense", "repo": "x", "confidence": 0.5}
assert not validate_fact(bad_category), "Bad category should fail"
bad_conf = {"fact": "Something", "category": "fact", "repo": "x", "confidence": 1.5}
assert not validate_fact(bad_conf), "Confidence > 1.0 should fail"
print(" [PASS] fact validation works")
def test_deduplicate():
"""Test deduplication."""
existing = [
{"fact": "Token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}
]
new = [
{"fact": "Token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}, # exact dup
{"fact": "Deploy uses Ansible on port 22", "category": "pattern", "repo": "fleet", "confidence": 0.8}, # unique
]
result = deduplicate(new, existing)
assert len(result) == 1, f"Expected 1 unique, got {len(result)}"
assert result[0]["fact"] == "Deploy uses Ansible on port 22"
print(" [PASS] deduplication works")
def test_knowledge_store_roundtrip():
"""Test loading and writing knowledge index."""
with tempfile.TemporaryDirectory() as tmpdir:
# Load empty index
index = load_existing_knowledge(tmpdir)
assert index["total_facts"] == 0
# Write a fact
new_facts = [{"fact": "Test fact", "category": "fact", "repo": "test", "confidence": 0.9}]
# Use harvester's write function
from harvester import write_knowledge
write_knowledge(index, new_facts, tmpdir, source_session="test.jsonl")
# Reload and verify
index2 = load_existing_knowledge(tmpdir)
assert index2["total_facts"] == 1
assert index2["facts"][0]["fact"] == "Test fact"
assert index2["facts"][0]["source_session"] == "test.jsonl"
# Check markdown was written
md_path = Path(tmpdir) / "repos" / "test.md"
assert md_path.exists(), "Markdown file should be created"
print(" [PASS] knowledge store roundtrip works")
def test_full_chain_no_llm():
"""Test the full pipeline minus the LLM call."""
with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
f.write('{"role": "user", "content": "Clone compounding-intelligence", "timestamp": "2026-04-13T10:00:00Z"}\n')
f.write('{"role": "assistant", "content": "Cloned successfully", "timestamp": "2026-04-13T10:00:05Z"}\n')
session_path = f.name
with tempfile.TemporaryDirectory() as knowledge_dir:
# Step 1: Read
messages = read_session(session_path)
assert len(messages) == 2
# Step 2: Extract conversation
conv = extract_conversation(messages)
assert len(conv) == 2
# Step 3: Truncate
truncated = truncate_for_context(conv, head=50, tail=50)
# Step 4: Convert to text (this goes to the LLM)
transcript = messages_to_text(truncated)
assert "Clone compounding-intelligence" in transcript
# Step 5: LLM call and validation
# The LLM output is simulated here; no real model is called
mock_facts = [
{"fact": "compounding-intelligence repo was cloned", "category": "fact", "repo": "compounding-intelligence", "confidence": 0.9}
]
valid = [f for f in mock_facts if validate_fact(f)]
# Step 6: Deduplicate
index = load_existing_knowledge(knowledge_dir)
new_facts = deduplicate(valid, index.get("facts", []))
assert len(new_facts) == 1
# Step 7: Store
from harvester import write_knowledge
write_knowledge(index, new_facts, knowledge_dir, source_session=session_path)
# Verify
index2 = load_existing_knowledge(knowledge_dir)
assert index2["total_facts"] == 1
os.unlink(session_path)
print(" [PASS] full chain (reader -> validate -> dedup -> store) works")
if __name__ == "__main__":
print("Running harvester pipeline smoke tests...")
test_session_reader()
test_validate_fact()
test_deduplicate()
test_knowledge_store_roundtrip()
test_full_chain_no_llm()
print("\nAll tests passed.")
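
Note on `test_deduplicate` above: the real `deduplicate` and `fact_fingerprint` live in `harvester.py`, which is not shown in this diff, so their implementation is unknown here. The sketch below is one way the behavior exercised by the test could be achieved, assuming a fingerprint built from the whitespace-normalized, lowercased fact text. The names `sketch_fingerprint` and `sketch_deduplicate` are hypothetical.

```python
import hashlib


def sketch_fingerprint(fact: dict) -> str:
    """Hash the normalized fact text (assumed scheme, not the harvester's)."""
    normalized = " ".join(fact.get("fact", "").lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def sketch_deduplicate(new_facts: list, existing_facts: list) -> list:
    """Return the new facts whose fingerprint is not already known."""
    seen = {sketch_fingerprint(f) for f in existing_facts}
    unique = []
    for fact in new_facts:
        fingerprint = sketch_fingerprint(fact)
        if fingerprint not in seen:
            seen.add(fingerprint)  # also drops repeats within new_facts
            unique.append(fact)
    return unique
```

Under this assumed scheme, two facts that differ only in case or spacing collapse to the same fingerprint; whether the real harvester is that lenient cannot be confirmed from the tests alone.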

@@ -1,282 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/improvement_proposals.py — 15 tests."""
import json
import os
import sys
import tempfile
sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location(
"ip", os.path.join(os.path.dirname(__file__) or ".", "improvement_proposals.py")
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
analyze_sessions = mod.analyze_sessions
generate_proposals = mod.generate_proposals
format_proposals_markdown = mod.format_proposals_markdown
format_proposals_json = mod.format_proposals_json
_normalize_error = mod._normalize_error
# ── Helper to build test sessions ─────────────────────────────
def _make_session(session_id, repo="test-repo", errors=None, tool_calls=None, duration=30):
return {
"session_id": session_id,
"repo": repo,
"errors": [{"message": e} for e in (errors or [])],
"tool_calls": tool_calls or [],
"duration_minutes": duration,
"messages": [],
}
def _make_tool_calls(repeats):
"""Create tool call list with repeated tools."""
calls = []
for tool, count in repeats:
for _ in range(count):
calls.append({"tool": tool, "latency_ms": 100})
return calls
# ── Tests ─────────────────────────────────────────────────────
def test_empty_sessions():
patterns = analyze_sessions([])
assert patterns == []
print("PASS: test_empty_sessions")
def test_no_patterns_on_clean_sessions():
sessions = [
_make_session("s1", tool_calls=[{"tool": "read_file", "latency_ms": 50}]),
_make_session("s2", tool_calls=[{"tool": "write_file", "latency_ms": 80}]),
]
patterns = analyze_sessions(sessions)
# No repeated errors, no slow tools, no retries
assert len(patterns) == 0
print("PASS: test_no_patterns_on_clean_sessions")
def test_repeated_error_detection():
"""Same error across 3+ sessions triggers pattern."""
sessions = [
_make_session(f"s{i}", errors=["ModuleNotFoundError: No module named bannerlord_trace"])
for i in range(4)
]
patterns = analyze_sessions(sessions)
repeated = [p for p in patterns if p.pattern_type == "repeated_error"]
assert len(repeated) == 1
assert repeated[0].occurrences == 4
print("PASS: test_repeated_error_detection")
def test_repeated_error_threshold():
"""2 occurrences should NOT trigger (threshold is 3)."""
sessions = [
_make_session("s1", errors=["TimeoutError: connection timed out"]),
_make_session("s2", errors=["TimeoutError: connection timed out"]),
]
patterns = analyze_sessions(sessions)
repeated = [p for p in patterns if p.pattern_type == "repeated_error"]
assert len(repeated) == 0
print("PASS: test_repeated_error_threshold")
def test_slow_tool_detection():
"""Tool with avg latency > 5000ms across 5+ calls."""
calls = [{"tool": "git_push", "latency_ms": 8000} for _ in range(10)]
sessions = [_make_session("s1", tool_calls=calls)]
patterns = analyze_sessions(sessions)
slow = [p for p in patterns if p.pattern_type == "slow_tool"]
assert len(slow) == 1
assert "git_push" in slow[0].description
print("PASS: test_slow_tool_detection")
def test_fast_tool_not_flagged():
"""Tool under 5000ms avg should not trigger."""
calls = [{"tool": "read_file", "latency_ms": 50} for _ in range(10)]
sessions = [_make_session("s1", tool_calls=calls)]
patterns = analyze_sessions(sessions)
slow = [p for p in patterns if p.pattern_type == "slow_tool"]
assert len(slow) == 0
print("PASS: test_fast_tool_not_flagged")
def test_failed_retry_detection():
"""3+ consecutive calls to same tool triggers retry pattern."""
calls = _make_tool_calls([("execute_code", 5)])
sessions = [_make_session("s1", tool_calls=calls)]
sessions.extend([
_make_session(f"s{i}", tool_calls=_make_tool_calls([("execute_code", 4)]))
for i in range(2, 5)
])
patterns = analyze_sessions(sessions)
retries = [p for p in patterns if p.pattern_type == "failed_retry"]
assert len(retries) >= 1
print("PASS: test_failed_retry_detection")
def test_manual_process_detection():
"""10+ tool calls with <= 3 unique tools."""
calls = _make_tool_calls([("terminal", 8), ("read_file", 5)])
sessions = [_make_session("s1", tool_calls=calls, duration=25)]
patterns = analyze_sessions(sessions)
manual = [p for p in patterns if p.pattern_type == "manual_process"]
assert len(manual) == 1
print("PASS: test_manual_process_detection")
def test_generate_proposals_from_patterns():
"""Proposals generated from waste patterns."""
sessions = [
_make_session(f"s{i}", errors=["Error: push timeout"])
for i in range(5)
]
patterns = analyze_sessions(sessions)
proposals = generate_proposals(patterns)
assert len(proposals) >= 1
assert proposals[0].estimated_monthly_hours_saved > 0
assert proposals[0].priority in ("critical", "high", "medium", "low")
print("PASS: test_generate_proposals_from_patterns")
def test_proposal_roi_positive():
"""ROI weeks should be a positive number for recoverable time."""
patterns = [mod.WastePattern(
pattern_type="repeated_error",
description="Test error",
occurrences=10,
total_time_hours=5.0,
affected_repos=["test"],
)]
proposals = generate_proposals(patterns)
assert len(proposals) == 1
assert proposals[0].roi_weeks > 0
assert proposals[0].roi_weeks < 100
print("PASS: test_proposal_roi_positive")
def test_proposals_sorted_by_impact():
"""Proposals should be sorted by monthly hours saved (descending)."""
sessions = [
_make_session("s1", errors=["Minor warning"] * 3, duration=5),
_make_session("s2", errors=["Critical failure: deploy crashed"] * 5, duration=60),
]
# Add more sessions to cross threshold
for i in range(3, 7):
sessions.append(_make_session(f"s{i}", errors=["Critical failure: deploy crashed"]))
patterns = analyze_sessions(sessions)
proposals = generate_proposals(patterns)
if len(proposals) >= 2:
for i in range(len(proposals) - 1):
assert proposals[i].estimated_monthly_hours_saved >= proposals[i + 1].estimated_monthly_hours_saved
print("PASS: test_proposals_sorted_by_impact")
def test_format_markdown():
"""Markdown output should contain expected sections."""
patterns = [mod.WastePattern(
pattern_type="repeated_error", description="Test", occurrences=5,
total_time_hours=2.5, affected_repos=["repo"],
)]
proposals = generate_proposals(patterns)
md = format_proposals_markdown(proposals, patterns, "2026-04-15T00:00:00Z")
assert "# Improvement Proposals" in md
assert "## Summary" in md
assert "### Problem" in md
assert "### ROI Estimate" in md
assert "## Appendix" in md
print("PASS: test_format_markdown")
def test_format_json():
"""JSON output should be valid and parseable."""
patterns = [mod.WastePattern(
pattern_type="slow_tool", description="Slow", occurrences=10,
total_time_hours=3.0, affected_repos=["global"],
)]
proposals = generate_proposals(patterns)
output = format_proposals_json(proposals)
parsed = json.loads(output)
assert isinstance(parsed, list)
assert len(parsed) == len(proposals)
assert "title" in parsed[0]
assert "roi_weeks" in parsed[0]
print("PASS: test_format_json")
def test_normalize_error():
"""Error normalization should remove paths and hashes."""
err1 = _normalize_error("Failed to clone /Users/apayne/repo with token abc123def456")
assert "/PATH" in err1
assert "HASH" in err1
assert "/Users/apayne" not in err1
err2 = _normalize_error("")
assert err2 == ""
err3 = _normalize_error("Simple error message")
assert "simple error" in err3
print("PASS: test_normalize_error")
def test_cli_integration():
"""End-to-end test: write input JSON, run script, check output."""
import subprocess
sessions = [
_make_session(f"s{i}", errors=["Connection refused: port 8080"])
for i in range(5)
]
with tempfile.TemporaryDirectory() as tmpdir:
input_path = os.path.join(tmpdir, "analytics.json")
output_path = os.path.join(tmpdir, "proposals.md")
with open(input_path, "w") as f:
json.dump({"sessions": sessions}, f)
script = os.path.join(os.path.dirname(__file__) or ".", "improvement_proposals.py")
result = subprocess.run(
[sys.executable, script, "--input", input_path, "--output", output_path],
capture_output=True, text=True, timeout=10,
)
assert result.returncode == 0, f"CLI failed: {result.stderr}"
assert os.path.exists(output_path)
with open(output_path) as f:
content = f.read()
assert "# Improvement Proposals" in content
print("PASS: test_cli_integration")
def run_all():
test_empty_sessions()
test_no_patterns_on_clean_sessions()
test_repeated_error_detection()
test_repeated_error_threshold()
test_slow_tool_detection()
test_fast_tool_not_flagged()
test_failed_retry_detection()
test_manual_process_detection()
test_generate_proposals_from_patterns()
test_proposal_roi_positive()
test_proposals_sorted_by_impact()
test_format_markdown()
test_format_json()
test_normalize_error()
test_cli_integration()
print("\nAll 15 tests passed!")
if __name__ == "__main__":
run_all()
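
Note on `test_normalize_error` above: the test only pins down the observable behavior of `_normalize_error` (paths become `/PATH`, hash-like tokens become `HASH`, ordinary text is lowercased, the empty string stays empty). The sketch below satisfies those assertions, but the regular expressions are assumptions, not the patterns used in `improvement_proposals.py`. The name `sketch_normalize_error` is hypothetical.

```python
import re

# Assumed patterns: an absolute path is one or more "/segment" groups, and
# a hash is a standalone run of 8 or more lowercase hex characters.
_PATH_RE = re.compile(r"(?:/[\w.\-]+)+")
_HASH_RE = re.compile(r"\b[0-9a-f]{8,}\b")


def sketch_normalize_error(message: str) -> str:
    """Collapse machine-specific details so similar errors group together."""
    text = message.lower().strip()
    text = _PATH_RE.sub("/PATH", text)  # replace paths before hashes
    text = _HASH_RE.sub("HASH", text)
    return text
```

Lowercasing first and substituting uppercase placeholders afterwards keeps the placeholders from being matched again by the hex pattern.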

@@ -1,129 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/knowledge_staleness_check.py — 8 tests."""
import json
import os
import sys
import tempfile
sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location("ks", os.path.join(os.path.dirname(__file__) or ".", "knowledge_staleness_check.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
check_staleness = mod.check_staleness
fix_hashes = mod.fix_hashes
compute_file_hash = mod.compute_file_hash
def test_fresh_entry():
with tempfile.TemporaryDirectory() as tmpdir:
src = os.path.join(tmpdir, "source.py")
with open(src, "w") as f:
f.write("print('hello')")
h = compute_file_hash(src)
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": [{"fact": "hello", "source_file": "source.py", "source_hash": h}]}, f)
results = check_staleness(idx, tmpdir)
assert results[0]["status"] == "fresh"
print("PASS: test_fresh_entry")
def test_stale_entry():
with tempfile.TemporaryDirectory() as tmpdir:
src = os.path.join(tmpdir, "source.py")
with open(src, "w") as f:
f.write("original content")
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": [{"fact": "old", "source_file": "source.py", "source_hash": "sha256:wrong"}]}, f)
# Now change the source
with open(src, "w") as f:
f.write("modified content")
results = check_staleness(idx, tmpdir)
assert results[0]["status"] == "stale"
print("PASS: test_stale_entry")
def test_missing_source():
with tempfile.TemporaryDirectory() as tmpdir:
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": [{"fact": "gone", "source_file": "nonexistent.py", "source_hash": "sha256:abc"}]}, f)
results = check_staleness(idx, tmpdir)
assert results[0]["status"] == "missing_source"
print("PASS: test_missing_source")
def test_no_hash():
with tempfile.TemporaryDirectory() as tmpdir:
src = os.path.join(tmpdir, "source.py")
with open(src, "w") as f:
f.write("content")
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": [{"fact": "no hash", "source_file": "source.py"}]}, f)
results = check_staleness(idx, tmpdir)
assert results[0]["status"] == "no_hash"
assert results[0]["current_hash"].startswith("sha256:")
print("PASS: test_no_hash")
def test_no_source_field():
with tempfile.TemporaryDirectory() as tmpdir:
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": [{"fact": "orphan"}]}, f)
results = check_staleness(idx, tmpdir)
assert results[0]["status"] == "no_source"
print("PASS: test_no_source_field")
def test_fix_hashes():
with tempfile.TemporaryDirectory() as tmpdir:
src = os.path.join(tmpdir, "source.py")
with open(src, "w") as f:
f.write("content for hashing")
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": [{"fact": "needs hash", "source_file": "source.py"}]}, f)
fixed = fix_hashes(idx, tmpdir)
assert fixed == 1
# Verify hash was added
with open(idx) as f:
data = json.load(f)
assert data["facts"][0]["source_hash"].startswith("sha256:")
print("PASS: test_fix_hashes")
def test_empty_index():
with tempfile.TemporaryDirectory() as tmpdir:
idx = os.path.join(tmpdir, "index.json")
with open(idx, "w") as f:
json.dump({"facts": []}, f)
results = check_staleness(idx, tmpdir)
assert results == []
print("PASS: test_empty_index")
def test_compute_hash_nonexistent():
h = compute_file_hash("/nonexistent/path/file.py")
assert h is None
print("PASS: test_compute_hash_nonexistent")
def run_all():
test_fresh_entry()
test_stale_entry()
test_missing_source()
test_no_hash()
test_no_source_field()
test_fix_hashes()
test_empty_index()
test_compute_hash_nonexistent()
print("\nAll 8 tests passed!")
if __name__ == "__main__":
run_all()
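
Note on the staleness tests above: they distinguish five statuses (`fresh`, `stale`, `missing_source`, `no_hash`, `no_source`) by comparing a stored `sha256:`-prefixed hash against the current content of the source file. The sketch below shows one decision order that reproduces those statuses for a single fact. The names `sketch_file_hash` and `sketch_status` are hypothetical, and the actual `knowledge_staleness_check.py` may order its checks differently.

```python
import hashlib
import os


def sketch_file_hash(path: str):
    """Return 'sha256:<hex>' for the file content, or None if unreadable."""
    try:
        with open(path, "rb") as handle:
            return "sha256:" + hashlib.sha256(handle.read()).hexdigest()
    except OSError:
        return None


def sketch_status(fact: dict, repo_root: str) -> str:
    """Classify one fact using the status names from the tests above."""
    source = fact.get("source_file")
    if not source:
        return "no_source"
    current = sketch_file_hash(os.path.join(repo_root, source))
    if current is None:
        return "missing_source"
    stored = fact.get("source_hash")
    if not stored:
        return "no_hash"
    return "fresh" if stored == current else "stale"
```

Checking for a missing file before a missing hash matches `test_missing_source`, where the fact carries a hash but the file does not exist.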

@@ -1,305 +0,0 @@
#!/usr/bin/env python3
"""
Tests for Priority Rebalancer
"""
import json
import os
import sys
import tempfile
from datetime import datetime, timedelta
from pathlib import Path
# Add script dir to path
sys.path.insert(0, str(Path(__file__).parent))
from priority_rebalancer import (
GiteaClient,
IssueScore,
PipelineSignal,
compute_issue_score,
collect_knowledge_signals,
collect_metrics_signals,
extract_priority,
generate_report,
generate_markdown_report,
PRIORITY_LEVELS,
)
# ============================================================
# Test Helpers
# ============================================================
PASS = 0
FAIL = 0
def test(name):
def decorator(fn):
global PASS, FAIL
try:
fn()
PASS += 1
print(f" [PASS] {name}")
except Exception as e:
FAIL += 1
print(f" [FAIL] {name}: {e}")
return decorator
def assert_eq(a, b, msg=""):
if a != b:
raise AssertionError(f"{msg} expected {b!r}, got {a!r}")
def assert_true(v, msg=""):
if not v:
raise AssertionError(msg or "Expected True")
def assert_false(v, msg=""):
if v:
raise AssertionError(msg or "Expected False")
# ============================================================
# Priority Extraction Tests
# ============================================================
print("=== Priority Rebalancer Tests ===\n")
print("-- Priority Extraction --")


@test("extract P0 from label")
def _():
    assert_eq(extract_priority(["P0", "bug"]), "P0")


@test("extract P1 from priority:high")
def _():
    assert_eq(extract_priority(["priority:high"]), "P1")


@test("extract P2 from priority:medium")
def _():
    assert_eq(extract_priority(["priority:medium"]), "P2")


@test("extract P3 from priority:low")
def _():
    assert_eq(extract_priority(["priority:low"]), "P3")


@test("returns None for no priority")
def _():
    assert_eq(extract_priority(["bug", "enhancement"]), None)


@test("case insensitive")
def _():
    assert_eq(extract_priority(["p1"]), "P1")
    assert_eq(extract_priority(["PRIORITY:CRITICAL"]), "P0")
# ============================================================
# Issue Scoring Tests
# ============================================================
print("\n-- Issue Scoring --")


def make_issue(**kwargs):
    defaults = {
        "number": 1,
        "title": "Test issue",
        "labels": [],
        "created_at": (datetime.utcnow() - timedelta(days=5)).isoformat() + "Z",
        "comments": 0,
        "assignees": None,
    }
    defaults.update(kwargs)
    return defaults


@test("bug gets score boost")
def _():
    issue = make_issue(title="Incorrect output format", labels=["bug"])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score > 0, f"Bug should boost score, got {score.score}")
    # Bug label alone should be P2 or P3 (not P0)
    assert_true(score.suggested_priority in ("P2", "P3"),
                f"Bug label alone should be P2/P3, got {score.suggested_priority}")


@test("security gets high score")
def _():
    issue = make_issue(title="Security: auth bypass", labels=["bug"])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score >= 25, f"Security should score high, got {score.score}")


@test("old dormant issue gets penalized")
def _():
    issue = make_issue(
        title="Some old feature",
        created_at=(datetime.utcnow() - timedelta(days=120)).isoformat() + "Z",
        comments=0
    )
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score < 0, f"Old dormant should be negative, got {score.score}")
    assert_true(any("Dormant" in r for r in score.reasons), "Should mention dormancy")


@test("active discussion boosts score")
def _():
    issue = make_issue(title="Important fix", comments=8)
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score > 5, f"Active discussion should boost, got {score.score}")
    assert_true(any("Active" in r for r in score.reasons))


@test("unassigned gets slight boost")
def _():
    issue = make_issue(title="Fix bug", assignees=None)
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(any("Unassigned" in r for r in score.reasons))


@test("assigned issue notes assignee")
def _():
    issue = make_issue(title="Fix bug", assignees=[{"login": "alice"}])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_eq(score.assignee, "alice")


@test("nice-to-have gets penalized")
def _():
    issue = make_issue(title="Nice to have: fancy animation")
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score < 0, f"Nice-to-have should be negative, got {score.score}")
# ============================================================
# Pipeline Signal Tests
# ============================================================
print("\n-- Pipeline Signals --")


@test("signal alignment boosts matching issues")
def _():
    signals = [PipelineSignal(
        source="knowledge",
        signal_type="stale_knowledge",
        weight=0.8,
        detail="20 stale facts"
    )]
    issue = make_issue(title="Fix stale knowledge entries")
    score = compute_issue_score(issue, "test-repo", signals, datetime.utcnow())
    assert_true(any("Matches signal" in r for r in score.reasons))


@test("empty knowledge boosts harvester issues")
def _():
    signals = [PipelineSignal(
        source="knowledge",
        signal_type="empty_knowledge",
        weight=0.7,
        detail="0 facts"
    )]
    issue = make_issue(title="Implement harvester pipeline")
    score = compute_issue_score(issue, "test-repo", signals, datetime.utcnow())
    assert_true(any("Critical gap" in r for r in score.reasons))
# ============================================================
# Knowledge Signal Collection Tests
# ============================================================
print("\n-- Knowledge Signal Collection --")


@test("missing index generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        signals = collect_knowledge_signals(tmpdir)
        assert_true(len(signals) > 0)
        assert_eq(signals[0].signal_type, "missing_index")


@test("empty knowledge generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": []}, f)
        signals = collect_knowledge_signals(tmpdir)
        assert_true(any(s.signal_type == "empty_knowledge" for s in signals))


@test("corrupt index generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            f.write("not json {{{")
        signals = collect_knowledge_signals(tmpdir)
        assert_true(any(s.signal_type == "corrupt_index" for s in signals))


@test("knowledge with facts passes")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [
                {"id": 1, "repo": "test", "status": "fresh"},
                {"id": 2, "repo": "test", "status": "fresh"},
            ]}, f)
        signals = collect_knowledge_signals(tmpdir)
        # Should not generate missing_index or empty_knowledge
        assert_false(any(s.signal_type in ("missing_index", "empty_knowledge") for s in signals))
# ============================================================
# Metrics Signal Collection Tests
# ============================================================
print("\n-- Metrics Signal Collection --")


@test("empty metrics dir generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        signals = collect_metrics_signals(tmpdir)
        assert_true(any(s.signal_type == "no_metrics" for s in signals))


@test("metrics with files passes")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create files (simulating real metrics dir with .gitkeep + actual files)
        with open(os.path.join(tmpdir, ".gitkeep"), "w") as f:
            f.write("")
        with open(os.path.join(tmpdir, "report.json"), "w") as f:
            f.write("{}")
        signals = collect_metrics_signals(tmpdir)
        assert_false(any(s.signal_type == "no_metrics" for s in signals))
# ============================================================
# Report Generation Tests
# ============================================================
print("\n-- Report Generation --")


@test("report has correct structure")
def _():
    scores = [
        IssueScore(1, "repo1", "Bug fix", ["bug"], None, "P1", 30.0, ["test"], 5, 3, None),
        IssueScore(2, "repo1", "Feature", ["enhancement"], "P3", None, -5.0, ["test"], 60, 0, "alice"),
    ]
    signals = [PipelineSignal("knowledge", "stale_knowledge", 0.5, "10 stale")]
    report = generate_report(scores, signals, "test-org", ["repo1"])
    assert_eq(report["org"], "test-org")
    assert_eq(report["total_issues"], 2)
    assert_true("generated_at" in report)
    assert_true("summary" in report)
    assert_true("top_priority" in report)
    assert_eq(report["summary"]["suggested_new_priorities"], 1)


@test("markdown report is non-empty")
def _():
    scores = [IssueScore(1, "repo1", "Test", ["bug"], None, "P2", 15.0, ["reason"], 5, 0, None)]
    report = generate_report(scores, [], "test-org", ["repo1"])
    md = generate_markdown_report(report)
    assert_true(len(md) > 100)
    assert_true("Priority Rebalancer Report" in md)
    assert_true("Top 10" in md)
# ============================================================
# Summary
# ============================================================
print("\n=== Summary ===")
print(f"Total: {PASS + FAIL} | Passed: {PASS} | Failed: {FAIL}")
if FAIL > 0:
    sys.exit(1)


@@ -1,242 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/refactoring_opportunity_finder.py — 10 tests."""
import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")

import importlib.util

# Load the module under test by file path so the tests run from any directory.
spec = importlib.util.spec_from_file_location(
    "rof", os.path.join(os.path.dirname(__file__) or ".", "refactoring_opportunity_finder.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

compute_file_complexity = mod.compute_file_complexity
calculate_refactoring_score = mod.calculate_refactoring_score
FileMetrics = mod.FileMetrics
def test_complexity_simple_function():
    """Simple function should have low complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def simple():
    return 42
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    assert avg == 1.0, f"Expected 1.0, got {avg}"
    assert max_c == 1, f"Expected 1, got {max_c}"
    assert funcs == 1, f"Expected 1, got {funcs}"
    assert classes == 0, f"Expected 0, got {classes}"
    os.unlink(f.name)
    print("PASS: test_complexity_simple_function")


def test_complexity_with_conditionals():
    """Function with if/else should have higher complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def complex_func(x):
    if x > 0:
        if x > 10:
            return "big"
        else:
            return "small"
    elif x < 0:
        return "negative"
    else:
        return "zero"
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    # Base 1 + 3 if/elif + 1 nested if = 5
    assert max_c >= 4, f"Expected max_c >= 4, got {max_c}"
    assert funcs == 1, f"Expected 1, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_conditionals")


def test_complexity_with_loops():
    """Function with loops should increase complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def loop_func(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item)
    while len(result) > 10:
        result.pop()
    return result
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    # Base 1 + 1 for + 1 if + 1 while = 4
    assert max_c >= 3, f"Expected max_c >= 3, got {max_c}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_loops")


def test_complexity_with_class():
    """Class with methods should count both."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
class MyClass:
    def method1(self):
        if True:
            pass

    def method2(self):
        for i in range(10):
            pass
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    assert classes == 1, f"Expected 1 class, got {classes}"
    assert funcs == 2, f"Expected 2 functions, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_class")


def test_complexity_syntax_error():
    """File with syntax error should return zeros."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("def broken(:\n pass")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    assert avg == 0.0, f"Expected 0.0, got {avg}"
    assert funcs == 0, f"Expected 0, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_syntax_error")
def test_refactoring_score_high_complexity():
    """High complexity should give high score."""
    metrics = FileMetrics(
        path="test.py",
        lines=200,
        complexity=15.0,
        max_complexity=25,
        functions=10,
        classes=2,
        churn_30d=5,
        churn_90d=15,
        test_coverage=0.3,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    assert score > 50, f"Expected score > 50, got {score}"
    print("PASS: test_refactoring_score_high_complexity")


def test_refactoring_score_low_complexity():
    """Low complexity should give lower score."""
    metrics = FileMetrics(
        path="test.py",
        lines=50,
        complexity=2.0,
        max_complexity=3,
        functions=3,
        classes=0,
        churn_30d=0,
        churn_90d=1,
        test_coverage=0.9,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    assert score < 30, f"Expected score < 30, got {score}"
    print("PASS: test_refactoring_score_low_complexity")


def test_refactoring_score_high_churn():
    """High churn should increase score."""
    metrics = FileMetrics(
        path="test.py",
        lines=100,
        complexity=5.0,
        max_complexity=8,
        functions=5,
        classes=0,
        churn_30d=10,
        churn_90d=20,
        test_coverage=0.5,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    # Churn should contribute significantly
    assert score > 40, f"Expected score > 40 for high churn, got {score}"
    print("PASS: test_refactoring_score_high_churn")


def test_refactoring_score_no_coverage():
    """No coverage data should assume medium risk."""
    metrics = FileMetrics(
        path="test.py",
        lines=100,
        complexity=5.0,
        max_complexity=8,
        functions=5,
        classes=0,
        churn_30d=1,
        churn_90d=2,
        test_coverage=None,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    # Should have some score from the 5-point coverage component
    assert score > 0, f"Expected positive score, got {score}"
    print("PASS: test_refactoring_score_no_coverage")


def test_refactoring_score_large_file():
    """Large files should score higher."""
    metrics_small = FileMetrics(
        path="small.py",
        lines=50,
        complexity=5.0,
        max_complexity=8,
        functions=3,
        classes=0,
        churn_30d=1,
        churn_90d=2,
        test_coverage=0.8,
        refactoring_score=0.0
    )
    metrics_large = FileMetrics(
        path="large.py",
        lines=1000,
        complexity=5.0,
        max_complexity=8,
        functions=3,
        classes=0,
        churn_30d=1,
        churn_90d=2,
        test_coverage=0.8,
        refactoring_score=0.0
    )
    score_small = calculate_refactoring_score(metrics_small)
    score_large = calculate_refactoring_score(metrics_large)
    assert score_large > score_small, \
        f"Large file ({score_large}) should score higher than small ({score_small})"
    print("PASS: test_refactoring_score_large_file")
def run_all():
    test_complexity_simple_function()
    test_complexity_with_conditionals()
    test_complexity_with_loops()
    test_complexity_with_class()
    test_complexity_syntax_error()
    test_refactoring_score_high_complexity()
    test_refactoring_score_low_complexity()
    test_refactoring_score_high_churn()
    test_refactoring_score_no_coverage()
    test_refactoring_score_large_file()
    print("\nAll 10 tests passed!")


if __name__ == "__main__":
    run_all()


@@ -1,90 +0,0 @@
#!/usr/bin/env python3
"""Tests for session_pair_harvester."""
import json
import sys
import os
import tempfile

sys.path.insert(0, os.path.dirname(__file__))

from session_pair_harvester import extract_pairs_from_session, deduplicate_pairs, compute_hash


def test_basic_extraction():
    session = {
        "id": "test_001",
        "model": "test-model",
        "conversations": [
            {"from": "system", "value": "You are helpful."},
            {"from": "human", "value": "What is Python?"},
            {"from": "gpt", "value": "Python is a high-level programming language known for its readability and versatility. It supports multiple paradigms including procedural, object-oriented, and functional programming. Python is widely used in web development, data science, machine learning, and automation."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 1
    assert pairs[0]["terse"] == "What is Python?"
    assert "programming language" in pairs[0]["rich"]
    assert pairs[0]["source"] == "test_001"
    print("PASS: test_basic_extraction")


def test_filters_short_responses():
    session = {
        "id": "test_002",
        "model": "test",
        "conversations": [
            {"from": "human", "value": "Hi"},
            {"from": "gpt", "value": "Hello!"},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=20)
    assert len(pairs) == 0
    print("PASS: test_filters_short_responses")


def test_skips_tool_results():
    session = {
        "id": "test_003",
        "model": "test",
        "conversations": [
            {"from": "human", "value": '{"output": "file content", "exit_code": 0}'},
            {"from": "gpt", "value": "The file was read successfully. Now let me analyze the content and provide a detailed summary of what was found in the file system."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 0
    print("PASS: test_skips_tool_results")


def test_deduplication():
    pairs = [
        {"terse": "What is X?", "rich": "X is Y.", "source": "s1", "model": "m"},
        {"terse": "What is X?", "rich": "X is Y.", "source": "s2", "model": "m"},
        {"terse": "What is Z?", "rich": "Z is W.", "source": "s1", "model": "m"},
    ]
    unique = deduplicate_pairs(pairs)
    assert len(unique) == 2
    print("PASS: test_deduplication")


def test_ratio_filter():
    session = {
        "id": "test_005",
        "model": "test",
        "conversations": [
            {"from": "human", "value": "Explain quantum computing in detail with examples and applications"},
            {"from": "gpt", "value": "OK."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 0  # response too short relative to prompt
    print("PASS: test_ratio_filter")


if __name__ == "__main__":
    test_basic_extraction()
    test_filters_short_responses()
    test_skips_tool_results()
    test_deduplication()
    test_ratio_filter()
    print("\nAll tests passed.")


@@ -1,131 +0,0 @@
#!/usr/bin/env python3
"""
Doc Link Validator — Extract and verify all documentation links.
Issue: #103 — 4.8: Doc Link Validator
Acceptance:
Extracts links from docs | HTTP HEAD check | Reports broken links
(Weekly cron/CI integration out of scope for this minimal script)
"""
import argparse
import json  # needed for the --json report below
import re
import sys
from pathlib import Path
from typing import List, Tuple, Optional
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError
from urllib.parse import urlparse

# Markdown link patterns
INLINE_LINK_RE = re.compile(r'\[[^\]]*\]\(([^)\s]+)(?:\s+"[^"]*")?\)')
AUTOLINK_RE = re.compile(r'<([^>]+)>')


def extract_links(content: str) -> List[str]:
    urls = [m.group(1) for m in INLINE_LINK_RE.finditer(content)]
    urls += [m.group(1) for m in AUTOLINK_RE.finditer(content)]
    return urls


def is_ignorable(url: str, ignore_prefixes: List[str]) -> bool:
    p = urlparse(url)
    if p.scheme not in ('http', 'https'):
        return True
    host = p.netloc.split(':')[0]
    if host in ('localhost', '127.0.0.1', '::1'):
        return True
    # Private IPv4 ranges
    if re.match(r'^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)', host):
        return True
    for prefix in ignore_prefixes:
        if url.startswith(prefix):
            return True
    return False


def check_url(url: str, timeout: float = 8.0) -> Tuple[bool, Optional[int], str]:
    try:
        req = Request(url, method='HEAD')
        req.add_header('User-Agent', 'DocLinkValidator/1.0')
        try:
            with urlopen(req, timeout=timeout) as resp:
                return True, resp.getcode(), "OK"
        except HTTPError as e:
            # Some servers reject HEAD; retry with a minimal ranged GET.
            if e.code in (405, 403, 400):
                req2 = Request(url, method='GET')
                req2.add_header('User-Agent', 'DocLinkValidator/1.0')
                req2.add_header('Range', 'bytes=0-1')
                with urlopen(req2, timeout=timeout) as resp2:
                    return True, resp2.getcode(), "OK via GET"
            return False, e.code, e.reason
    except URLError as e:
        return False, None, str(e.reason) if hasattr(e, 'reason') else str(e)
    except Exception as e:
        return False, None, str(e)


def main() -> int:
    p = argparse.ArgumentParser(description="Validate documentation links")
    p.add_argument('--root', default='.', help='Repository root')
    p.add_argument('--fail-on-broken', action='store_true', help='Exit non-zero if broken links found')
    p.add_argument('--json', action='store_true', help='Emit JSON report')
    p.add_argument('--ignore', default='', help='Comma-separated URL prefixes to ignore')
    args = p.parse_args()

    root = Path(args.root).resolve()
    ignore_prefixes = [x.strip() for x in args.ignore.split(',') if x.strip()]
    md_files = list(root.rglob('*.md'))
    if not md_files:
        print("No markdown files found.", file=sys.stderr)
        return 1
    print(f"Scanning {len(md_files)} markdown files")

    all_links: List[Tuple[Path, str]] = []
    for md in md_files:
        content = md.read_text(errors='replace')
        for m in INLINE_LINK_RE.finditer(content):
            all_links.append((md, m.group(1)))
        for m in AUTOLINK_RE.finditer(content):
            all_links.append((md, m.group(1)))
    print(f"Raw link occurrences: {len(all_links)}")

    # De-duplicate by URL, keep first file context
    first_file: dict[str, Path] = {}
    unique_urls: List[str] = []
    for file, url in all_links:
        if url not in first_file:
            first_file[url] = file
            unique_urls.append(url)
    print(f"Unique URLs to check: {len(unique_urls)}")

    broken: List[dict] = []
    ok_count = 0
    for url in unique_urls:
        if is_ignorable(url, ignore_prefixes):
            continue
        ok, code, reason = check_url(url)
        if ok:
            ok_count += 1
        else:
            broken.append({"url": url, "file": str(first_file[url]), "error": reason})
    print(f"OK: {ok_count} Broken: {len(broken)}")

    if broken:
        print("\nBroken links:")
        for b in broken:
            print(f" [{b['file']}] {b['url']}: {b['error']}")
    if args.json:
        print(json.dumps({"scanned": len(unique_urls), "ok": ok_count,
                          "broken": len(broken), "broken_links": broken}, indent=2))
    return 1 if (args.fail_on_broken and broken) else 0


if __name__ == '__main__':
    sys.exit(main())


@@ -1,38 +0,0 @@
#!/usr/bin/env python3
"""Validate knowledge files and index.json against the schema."""
import json
import sys
from pathlib import Path

VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED = {"id", "fact", "category", "domain", "confidence"}


def validate_fact(fact, src=""):
    errs = []
    for f in REQUIRED:
        if f not in fact:
            errs.append(f"{src}: missing '{f}'")
    if "category" in fact and fact["category"] not in VALID_CATEGORIES:
        errs.append(f"{src}: invalid category '{fact['category']}'")
    if "confidence" in fact:
        if not isinstance(fact["confidence"], (int, float)) or not (0 <= fact["confidence"] <= 1):
            errs.append(f"{src}: confidence must be 0.0-1.0")
    if "id" in fact:
        parts = fact["id"].split(":")
        if len(parts) != 3:
            errs.append(f"{src}: id must be domain:category:sequence")
    return errs


def main():
    idx = Path(__file__).parent.parent / "knowledge" / "index.json"
    if not idx.exists():
        print(f"FAILED: {idx} not found")
        sys.exit(1)
    # Use a context manager so the index file is closed after loading.
    with open(idx) as fh:
        data = json.load(fh)
    errs = []
    seen = set()
    for i, f in enumerate(data.get("facts", [])):
        errs.extend(validate_fact(f, f"[{i}]"))
        if "id" in f:
            if f["id"] in seen:
                errs.append(f"duplicate id '{f['id']}'")
            seen.add(f["id"])
    if errs:
        print(f"FAILED - {len(errs)} errors:")
        for e in errs:
            print(f" x {e}")
        sys.exit(1)
    print(f"PASSED - {len(data.get('facts', []))} facts")


if __name__ == "__main__":
    main()


@@ -2,107 +2,98 @@
## System Prompt
You are a knowledge extraction engine. You read session transcripts and output ONLY structured JSON. You never infer. You never assume. You extract only what the transcript explicitly states.
You are a knowledge extraction engine. Your task is to analyze a session transcript and extract durable knowledge that will help future sessions be more efficient.
## Prompt
## Instructions
Read the session transcript carefully. Extract ONLY information that is explicitly stated in the transcript. Do NOT infer, assume, or hallucinate information.
### Categories
Extract knowledge into these categories:
1. **fact**: Concrete, verifiable information learned (e.g., "Repository X has 5 files", "API returns JSON with field Y")
2. **pitfall**: Errors encountered, wrong assumptions, things that wasted time (e.g., "Assumed API token was in env var GITEA_TOKEN, but it's in ~/.config/gitea/token")
3. **pattern**: Successful sequences of actions (e.g., "To deploy: 1. Run tests 2. Build 3. Push to Gitea 4. Trigger webhook")
4. **tool-quirk**: Environment-specific behaviors (e.g., "Token paths are different on macOS vs Linux", "URL format requires trailing slash")
5. **question**: Things identified but not answered (e.g., "Need to determine optimal batch size for harvesting")
### Output Format
Return a JSON object with an array of extracted knowledge items. Each item must have:
```json
{
"fact": "One sentence description of the knowledge",
"category": "fact|pitfall|pattern|tool-quirk|question",
"repo": "Repository name this applies to, or 'global' if general",
"confidence": 0.0-1.0
}
```
TASK: Extract durable knowledge from this session transcript.
RULES:
1. Extract ONLY information explicitly stated in the transcript.
2. Do NOT infer, assume, or hallucinate.
3. Every fact must be verifiable by pointing to a specific line in the transcript.
4. If the session failed or was partial, extract pitfalls and questions — these are the most valuable.
5. Be specific. "Gitea API is slow" is worthless. "Gitea issues endpoint with state=open returns empty when limit=50 but works with limit=5" is knowledge.
### Confidence Scoring
CATEGORIES (assign exactly one per item):
- fact: Concrete, verifiable thing learned (paths, formats, counts, configs)
- pitfall: Error hit, wrong assumption, time wasted, thing that didn't work
- pattern: Successful sequence that should be reused (deploy steps, debug flow)
- tool-quirk: Environment-specific behavior (token paths, URL formats, API gotchas)
- question: Something identified but not answered — the NEXT agent should investigate
- 0.9-1.0: Explicitly stated with verification (e.g., "Error message shows X")
- 0.7-0.8: Clearly implied by multiple data points
- 0.5-0.6: Suggested but not fully verified
- 0.3-0.4: Inferred from limited data
- 0.1-0.2: Speculative or uncertain
CONFIDENCE:
- 0.9: Directly observed with error output or explicit verification
- 0.7: Multiple data points confirm, but not explicitly verified
- 0.5: Suggested by context, not tested
- 0.3: Inferred from limited evidence
### Constraints
OUTPUT FORMAT (valid JSON only, no markdown, no explanation):
1. **No hallucination**: Only extract what's explicitly in the transcript
2. **Specificity**: Each fact must be specific and actionable
3. **Relevance**: Only extract knowledge that would help future sessions
4. **Brevity**: One sentence per fact
5. **Partial sessions**: Even failed or incomplete sessions may contain valuable pitfalls
### Example Input/Output
**Input Transcript (excerpt):**
```
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
Agent: Trying git clone... Error: 401 Unauthorized
Agent: Checking for token... Found in ~/.config/gitea/token
Agent: Token is gitea_token format, not OAuth
Agent: Clone successful with token authentication
```
**Output:**
```json
{
"knowledge": [
{
"fact": "One specific sentence of knowledge",
"category": "fact|pitfall|pattern|tool-quirk|question",
"repo": "repo-name or global",
"confidence": 0.0-1.0,
"evidence": "Brief quote or reference from transcript that supports this"
"fact": "Gitea repo at forge.alexanderwhitestone.com requires authentication for cloning",
"category": "fact",
"repo": "compounding-intelligence",
"confidence": 0.9
},
{
"fact": "Gitea authentication token is stored at ~/.config/gitea/token",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.9
},
{
"fact": "Gitea uses gitea_token format, not OAuth for API access",
"category": "tool-quirk",
"repo": "global",
"confidence": 0.8
},
{
"fact": "Clone fails with 401 when no token is provided",
"category": "pitfall",
"repo": "compounding-intelligence",
"confidence": 0.9
}
],
"meta": {
"session_outcome": "success|partial|failed",
"tools_used": ["tool1", "tool2"],
"repos_touched": ["repo1"],
"error_count": 0,
"knowledge_count": 0
}
]
}
TRANSCRIPT:
{{transcript}}
```
## Design Notes
## Final Notes
### Why this works with mimo-v2-pro
Mimo needs:
- Explicit format constraints ("valid JSON only, no markdown")
- Clear category definitions with concrete examples
- Hard rules before soft guidance
- The transcript at the END (so it reads all instructions first)
This prompt front-loads all rules, then gives the transcript last. Mimo follows the pattern.
### Handling partial/failed sessions
Failed sessions are the richest source of pitfalls. The prompt explicitly says:
> "If the session failed or was partial, extract pitfalls and questions — these are the most valuable."
This reframes failure as valuable output, not noise to discard.
### The `evidence` field
Added to the original spec. Every extracted item must cite where in the transcript it came from. This:
- Prevents hallucination (can't cite what isn't there)
- Enables verification (reviewer can check the source)
- Trains confidence calibration (the agent must find evidence, not just claim it)
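
A minimal sketch of how a reviewer-side check could use the `evidence` field, assuming the evidence is a verbatim quote from the transcript (the prompt also allows a looser "reference", which this check would reject). The names `evidence_is_grounded` and `_normalize` are hypothetical and are not part of the harvester.

```python
import re


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def evidence_is_grounded(item: dict, transcript: str) -> bool:
    """Return True if the item's evidence quote appears in the transcript.

    Assumption: evidence is a verbatim quote, not a paraphrase or line reference.
    """
    evidence = item.get("evidence", "")
    if not evidence.strip():
        return False  # nothing cited, so nothing to verify
    return _normalize(evidence) in _normalize(transcript)


transcript = (
    "Agent: Trying git clone... Error: 401 Unauthorized\n"
    "Agent: Checking for token... Found in ~/.config/gitea/token"
)
grounded = {"fact": "Clone fails with 401 when no token is provided",
            "evidence": "Error: 401 Unauthorized"}
ungrounded = {"fact": "Server returned a 500",
              "evidence": "Error: 500 Internal Server Error"}

print(evidence_is_grounded(grounded, transcript))
print(evidence_is_grounded(ungrounded, transcript))
```

Items that fail a check like this are candidates for manual review or a lowered confidence, rather than automatic deletion.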
### Token budget
Target: ~1,000 tokens for the prompt (excluding transcript).
```
System prompt: ~50 tokens
Rules: ~200 tokens
Categories: ~150 tokens
Confidence: ~100 tokens
Output format: ~200 tokens
Design notes: NOT included in prompt (documentation only)
─────────────────────────────
Total prompt: ~700 tokens
```
Leaves ~300 tokens headroom for variable content (transcript insertion, edge cases).
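
The arithmetic behind the budget can be checked in a few lines. This is only a sketch: the per-section figures are the rough estimates from the table above, not measured token counts, and the variable names are illustrative.

```python
# Estimated tokens per prompt section, copied from the budget table.
budget = {
    "system prompt": 50,
    "rules": 200,
    "categories": 150,
    "confidence": 100,
    "output format": 200,
}
TARGET = 1000  # target size of the prompt, excluding the transcript

total = sum(budget.values())
headroom = TARGET - total
print(f"total={total} headroom={headroom}")
```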
### What this replaces
The v1 prompt had:
- Verbose prose explanations (waste tokens for mimo)
- No `evidence` field (hallucination risk)
- No `meta` block (no session-level metadata)
- No explicit handling of failed sessions
- Example was too long (~150 tokens of example for a 1k prompt)
This v2 is tighter, more structured, and adds the evidence requirement that prevents the #1 failure mode of extraction prompts: generating plausible-sounding facts that aren't in the transcript.
- Process the entire transcript, not just the beginning
- Pay special attention to errors and corrections
- Note any environment-specific details
- Track tool-specific behaviors and quirks
- If the session failed, focus on pitfalls and questions
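
The output format described in this prompt can be checked mechanically before anything is written to the knowledge store. The sketch below is an assumption about how such a check might look, not the harvester's actual code: `validate_extraction` is a hypothetical helper, and it only enforces the four fields common to both versions of the prompt (`fact`, `category`, `repo`, `confidence`).

```python
import json

VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED_FIELDS = ("fact", "category", "repo", "confidence")


def validate_extraction(raw: str) -> list[str]:
    """Return a list of problems found in the model's raw output (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    items = data.get("knowledge")
    if not isinstance(items, list):
        return ["missing 'knowledge' array"]

    errors = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"[{i}]: item must be an object")
            continue
        for field in REQUIRED_FIELDS:
            if field not in item:
                errors.append(f"[{i}]: missing '{field}'")
        if "category" in item:
            category = item["category"]
            if not isinstance(category, str) or category not in VALID_CATEGORIES:
                errors.append(f"[{i}]: invalid category {category!r}")
        if "confidence" in item:
            conf = item["confidence"]
            is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
            if not is_number or not 0 <= conf <= 1:
                errors.append(f"[{i}]: confidence must be a number in 0.0-1.0")
    return errors


sample = json.dumps({"knowledge": [
    {"fact": "Clone fails with 401 when no token is provided",
     "category": "pitfall", "repo": "compounding-intelligence", "confidence": 0.9},
]})
print(validate_extraction(sample))
```

Rejecting malformed output at this stage keeps the downstream index validation focused on schema drift rather than on model formatting errors.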


@@ -1,18 +0,0 @@
{"role": "user", "content": "Deploy the application to production"}
{"role": "assistant", "content": "I'll deploy the application to production.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@prod-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "ssh: connect to host prod-server port 22: Connection refused"}
{"role": "assistant", "content": "Connection refused. Let me check the server status.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ping -c 3 prod-server"}}]}
{"role": "tool", "content": "PING prod-server (192.168.1.100): 56 data bytes\nRequest timeout for icmp_seq 0\nRequest timeout for icmp_seq 1\nRequest timeout for icmp_seq 2\n--- prod-server ping statistics ---\n3 packets transmitted, 0 packets received, 100.0% packet loss"}
{"role": "assistant", "content": "The server appears to be down. Let me check if there's an alternative deployment method."}
{"role": "user", "content": "Try using the VPS instead"}
{"role": "assistant", "content": "I'll try deploying to the VPS.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@vps-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "Permission denied (publickey)."}
{"role": "assistant", "content": "Permission denied. The SSH key might not be set up correctly. Let me check the SSH configuration.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat ~/.ssh/config"}}]}
{"role": "tool", "content": "Host vps-server\n HostName 167.99.126.228\n User deploy\n IdentityFile ~/.ssh/id_rsa_vps"}
{"role": "assistant", "content": "The SSH config looks correct. Let me check if the key exists.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la ~/.ssh/id_rsa_vps"}}]}
{"role": "tool", "content": "ls: /Users/user/.ssh/id_rsa_vps: No such file or directory"}
{"role": "assistant", "content": "The SSH key doesn't exist. This is a pitfall - we need to generate the SSH key first."}
{"role": "user", "content": "Generate a new SSH key for the VPS"}
{"role": "assistant", "content": "I'll generate a new SSH key.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_vps -N ''"}}]}
{"role": "tool", "content": "Generating public/private rsa key pair.\nYour identification has been saved in /Users/user/.ssh/id_rsa_vps\nYour public key has been saved in /Users/user/.ssh/id_rsa_vps.pub\nThe key fingerprint is:\nSHA256:abc123... user@machine"}
{"role": "assistant", "content": "SSH key generated. Now I need to add it to the VPS."}


@@ -1,14 +0,0 @@
{"role": "user", "content": "Check the Gitea API for open issues"}
{"role": "assistant", "content": "I'll check the Gitea API for open issues.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://forge.alexanderwhitestone.com/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (6) Could not resolve host: forge.alexanderwhitestone.com"}
{"role": "assistant", "content": "DNS resolution failed. Let me check the network connectivity.", "tool_calls": [{"function": "terminal", "arguments": {"command": "nslookup forge.alexanderwhitestone.com"}}]}
{"role": "tool", "content": "Server: 8.8.8.8\nAddress: 8.8.8.8#53\n\n** server can't find forge.alexanderwhitestone.com: NXDOMAIN"}
{"role": "assistant", "content": "The domain doesn't resolve. This might be a DNS issue. Let me try using the IP address directly.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (60) SSL certificate problem: self signed certificate"}
{"role": "assistant", "content": "SSL certificate issue. The server is using a self-signed certificate. Let me try with -k flag to skip SSL verification.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "[]"}
{"role": "assistant", "content": "No open issues found. The API returned an empty array."}
{"role": "user", "content": "Check if there are any pull requests"}
{"role": "assistant", "content": "Let me check for pull requests.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/pulls?state=open"}}]}
{"role": "tool", "content": "[{\"id\": 20, \"title\": \"feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)\", \"state\": \"open\", \"user\": {\"login\": \"Rockachopa\"}, \"created_at\": \"2026-04-14T17:28:20Z\"}]"}
{"role": "assistant", "content": "Found 1 open pull request: PR #20 - \"feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)\" by Rockachopa."}
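
Review note: fixtures like this one are one JSON object per line, so a reader has to tolerate lines that fail to parse (for example, a message whose inner quotes are not escaped). The following is a minimal sketch of such a reader, assuming the `role` / `content` / `tool_calls` shape seen in these fixtures. `extract_commands` is a hypothetical name and is not the repository's `session_reader.py`.

```python
import json


def extract_commands(jsonl_text: str) -> tuple[list[str], list[int]]:
    """Collect terminal commands from a JSONL transcript.

    Returns (commands, bad_line_numbers). Unparseable lines are reported
    by 1-based line number instead of raising.
    """
    commands = []
    bad_lines = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            bad_lines.append(lineno)
            continue
        if not isinstance(message, dict):
            bad_lines.append(lineno)
            continue
        for call in message.get("tool_calls", []):
            command = call.get("arguments", {}).get("command")
            if command is not None:
                commands.append(command)
    return commands, bad_lines
```

Reporting bad line numbers rather than failing fast lets a harvesting step skip a damaged message and still use the rest of the session.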


@@ -1,10 +0,0 @@
{"role": "user", "content": "How do I deploy a new version of the application?"}
{"role": "assistant", "content": "Here's the deployment process:\n1. Run tests: pytest\n2. Build Docker image: docker build -t app:latest .\n3. Push to registry: docker push registry.example.com/app:latest\n4. Update production: ssh prod 'cd /app && docker-compose pull && docker-compose up -d'\n5. Verify: curl https://app.example.com/health"}
{"role": "user", "content": "What if tests fail?"}
{"role": "assistant", "content": "If tests fail:\n1. Check the error logs: pytest -v\n2. Fix the failing tests\n3. Re-run tests: pytest\n4. If still failing, check recent changes: git log --oneline -10\n5. Consider reverting: git revert HEAD"}
{"role": "user", "content": "What if the Docker build fails?"}
{"role": "assistant", "content": "If Docker build fails:\n1. Check Dockerfile syntax: docker build --no-cache -t app:latest .\n2. Check for missing dependencies: cat requirements.txt\n3. Check disk space: df -h\n4. Clean up Docker: docker system prune -f\n5. Try building with verbose output: docker build --progress=plain -t app:latest ."}
{"role": "user", "content": "What if the push to registry fails?"}
{"role": "assistant", "content": "If push to registry fails:\n1. Check registry credentials: docker login registry.example.com\n2. Check registry space: curl https://registry.example.com/v2/_catalog\n3. Check image size: docker images app:latest\n4. Try tagging with version: docker tag app:latest registry.example.com/app:v1.0.0\n5. Push with version tag: docker push registry.example.com/app:v1.0.0"}
{"role": "user", "content": "What if the production update fails?"}
{"role": "assistant", "content": "If production update fails:\n1. Check SSH connection: ssh prod 'echo connected'\n2. Check disk space on prod: ssh prod 'df -h'\n3. Check Docker status: ssh prod 'docker ps -a'\n4. Check logs: ssh prod 'docker-compose logs --tail=100'\n5. Rollback: ssh prod 'cd /app && git checkout HEAD~1 && docker-compose up -d'"}


@@ -1,12 +0,0 @@
{"role": "user", "content": "I need to implement a new feature for the application"}
{"role": "assistant", "content": "I can help you implement a new feature. What feature would you like to add?"}
{"role": "user", "content": "I want to add user authentication"}
{"role": "assistant", "content": "User authentication is a good feature. I have a few questions:\n1. What authentication method do you prefer? (JWT, OAuth, Session-based)\n2. Do you need social login? (Google, GitHub, etc.)\n3. What user data do you need to store?\n4. Do you need role-based access control?\n5. What's your database? (PostgreSQL, MySQL, SQLite)"}
{"role": "user", "content": "I prefer JWT, no social login, just email and password, no roles, PostgreSQL"}
{"role": "assistant", "content": "Great choices. Here's the implementation plan:\n1. Create user model with email and password hash\n2. Create JWT token generation and validation\n3. Create login endpoint: POST /api/auth/login\n4. Create register endpoint: POST /api/auth/register\n5. Create middleware for protected routes\n6. Add password hashing with bcrypt\n7. Add rate limiting for login attempts\n8. Add password reset functionality\n9. Add email verification\n10. Test all endpoints"}
{"role": "user", "content": "What about password reset?"}
{"role": "assistant", "content": "For password reset:\n1. Create reset token table\n2. Create forgot password endpoint: POST /api/auth/forgot-password\n3. Send reset email with token\n4. Create reset password endpoint: POST /api/auth/reset-password\n5. Validate reset token and expiration\n6. Update password and delete token\n7. Send confirmation email"}
{"role": "user", "content": "What about email verification?"}
{"role": "assistant", "content": "For email verification:\n1. Add email_verified field to user model\n2. Create verification token table\n3. Send verification email on registration\n4. Create verify email endpoint: GET /api/auth/verify-email?token=...\n5. Validate token and update email_verified\n6. Add middleware to check email_verified for certain routes\n7. Add resend verification email endpoint"}
{"role": "user", "content": "What about rate limiting?"}
{"role": "assistant", "content": "For rate limiting:\n1. Use Redis for rate limit storage\n2. Create rate limit middleware\n3. Set limits: 5 login attempts per minute, 3 password resets per hour\n4. Return 429 Too Many Requests when limit exceeded\n5. Add X-RateLimit headers to responses\n6. Log rate limit violations\n7. Add whitelist for trusted IPs"}
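
Review note: the rate-limiting plan in the last message (5 login attempts per minute, respond with 429 once exceeded) can be sketched as a sliding-window counter. This is an illustration only: the transcript proposes Redis for storage, while the sketch swaps in an in-memory dictionary, and `SlidingWindowLimiter` and its parameters are hypothetical names.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """In-memory sliding-window limiter (assumption: single process, no Redis)."""

    def __init__(self, limit: int, window_s: float):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        """Return True if the request is within the limit, False if it should get a 429."""
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The caller supplies `now` (for instance from `time.monotonic()`), which keeps the limiter deterministic under test.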


@@ -1,13 +0,0 @@
{"role": "user", "content": "Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence"}
{"role": "assistant", "content": "I'll clone the repository for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "git clone https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence.git"}}]}
{"role": "tool", "content": "Cloning into 'compounding-intelligence'...\nremote: Enumerating objects: 15, done.\nremote: Counting objects: 100% (15/15), done.\nremote: Compressing objects: 100% (12/12), done.\nremote: Total 15 (delta 2), reused 0 (delta 0), pack-reused 0\nReceiving objects: 100% (15/15), done.\nResolving deltas: 100% (2/2), done."}
{"role": "assistant", "content": "Repository cloned successfully. Let me check the structure.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/"}}]}
{"role": "tool", "content": "total 8\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 .\ndrwxr-xr-x 12 user staff 384 Apr 14 17:21 ..\ndrwxr-xr-x 13 user staff 416 Apr 14 17:21 .git\n-rw-r--r-- 1 user staff 1234 Apr 14 17:21 README.md\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 knowledge\n-rw-r--r-- 1 user staff 456 Apr 14 17:21 metrics\ndrwxr-xr-x 4 user staff 128 Apr 14 17:21 scripts\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 templates"}
{"role": "user", "content": "What files are in the templates directory?"}
{"role": "assistant", "content": "Let me check the templates directory.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/templates/"}}]}
{"role": "tool", "content": "total 0\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 .\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 ..\n-rw-r--r-- 1 user staff 0 Apr 14 17:21 .gitkeep"}
{"role": "assistant", "content": "The templates directory currently only contains a .gitkeep file. No templates have been added yet."}
{"role": "user", "content": "Create a harvest-prompt.md template in the templates directory."}
{"role": "assistant", "content": "I'll create the harvest-prompt.md template for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat > compounding-intelligence/templates/harvest-prompt.md << 'EOF'\n# Knowledge Extraction Prompt\n\nExtract knowledge from session transcripts.\nEOF"}}]}
{"role": "tool", "content": ""}
{"role": "assistant", "content": "Template created successfully. The harvest-prompt.md file is now in the templates directory."}


@@ -1,19 +0,0 @@
from pathlib import Path


def test_requirements_makefile_and_workflow_exist() -> None:
    assert Path("requirements.txt").exists()
    assert Path("Makefile").exists()
    assert Path(".gitea/workflows/test.yml").exists()


def test_ci_workflow_runs_project_test_command() -> None:
    workflow = Path(".gitea/workflows/test.yml").read_text(encoding="utf-8")
    requirements = Path("requirements.txt").read_text(encoding="utf-8")
    makefile = Path("Makefile").read_text(encoding="utf-8")

    assert "pytest" in requirements
    assert "test:" in makefile
    assert "python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v" in makefile
    assert "pip install -r requirements.txt" in workflow
    assert "make test" in workflow


@@ -1,207 +0,0 @@
"""Tests for knowledge deduplication module (Issue #196)."""
import json
import sys
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))

from dedup import (
    normalize_text,
    content_hash,
    tokenize,
    token_similarity,
    quality_score,
    merge_facts,
    dedup_facts,
    generate_test_duplicates,
)


class TestNormalize:
    def test_lowercases(self):
        assert normalize_text("Hello World") == "hello world"

    def test_collapses_whitespace(self):
        assert normalize_text("  hello    world  ") == "hello world"

    def test_strips(self):
        assert normalize_text(" text ") == "text"


class TestContentHash:
    def test_deterministic(self):
        h1 = content_hash("Hello World")
        h2 = content_hash("hello world")
        h3 = content_hash(" Hello World ")
        assert h1 == h2 == h3

    def test_different_texts(self):
        h1 = content_hash("Hello")
        h2 = content_hash("World")
        assert h1 != h2

    def test_returns_hex(self):
        h = content_hash("test")
        assert len(h) == 64  # SHA256
        assert all(c in '0123456789abcdef' for c in h)


class TestTokenize:
    def test_extracts_words(self):
        tokens = tokenize("Hello World Test")
        assert "hello" in tokens
        assert "world" in tokens
        assert "test" in tokens

    def test_skips_short_words(self):
        tokens = tokenize("a to is the hello")
        assert "a" not in tokens
        assert "to" not in tokens
        assert "hello" in tokens

    def test_returns_set(self):
        tokens = tokenize("hello hello world")
        assert isinstance(tokens, set)
        assert len(tokens) == 2


class TestTokenSimilarity:
    def test_identical(self):
        assert token_similarity("hello world", "hello world") == 1.0

    def test_no_overlap(self):
        assert token_similarity("alpha beta", "gamma delta") == 0.0

    def test_partial_overlap(self):
        sim = token_similarity("hello world test", "hello universe test")
        assert 0.3 < sim < 0.7

    def test_empty(self):
        assert token_similarity("", "hello") == 0.0
        assert token_similarity("hello", "") == 0.0

    def test_symmetric(self):
        a = "hello world test"
        b = "hello universe test"
        assert token_similarity(a, b) == token_similarity(b, a)


class TestQualityScore:
    def test_high_confidence(self):
        fact = {"confidence": 0.95, "source_count": 5, "tags": ["test"], "related": ["x"]}
        score = quality_score(fact)
        assert score > 0.7

    def test_low_confidence(self):
        fact = {"confidence": 0.3, "source_count": 1}
        score = quality_score(fact)
        assert score < 0.5

    def test_defaults(self):
        score = quality_score({})
        assert 0 < score < 1


class TestMergeFacts:
    def test_merges_tags(self):
        keep = {"id": "a", "fact": "test", "tags": ["git"], "confidence": 0.9}
        drop = {"id": "b", "fact": "test", "tags": ["python"], "confidence": 0.8}
        merged = merge_facts(keep, drop)
        assert "git" in merged["tags"]
        assert "python" in merged["tags"]

    def test_merges_source_count(self):
        keep = {"id": "a", "fact": "test", "source_count": 3}
        drop = {"id": "b", "fact": "test", "source_count": 2}
        merged = merge_facts(keep, drop)
        assert merged["source_count"] == 5

    def test_keeps_higher_confidence(self):
        keep = {"id": "a", "fact": "test", "confidence": 0.7}
        drop = {"id": "b", "fact": "test", "confidence": 0.9}
        merged = merge_facts(keep, drop)
        assert merged["confidence"] == 0.9

    def test_tracks_merged_from(self):
        keep = {"id": "a", "fact": "test"}
        drop = {"id": "b", "fact": "test"}
        merged = merge_facts(keep, drop)
        assert "b" in merged["_merged_from"]


class TestDedupFacts:
    def test_removes_exact_dupes(self):
        facts = [
            {"id": "1", "fact": "Always use git rebase"},
            {"id": "2", "fact": "Always use git rebase"},  # exact dupe
            {"id": "3", "fact": "Check logs first"},
        ]
        deduped, stats = dedup_facts(facts)
        assert stats["exact_dupes"] == 1
        assert stats["unique"] == 2

    def test_removes_near_dupes(self):
        facts = [
            {"id": "1", "fact": "Always check logs before deploying to production server"},
            {"id": "2", "fact": "Always check logs before deploying to production environment"},
            {"id": "3", "fact": "Use docker compose for local development environments"},
        ]
        deduped, stats = dedup_facts(facts, near_threshold=0.5)
        assert stats["near_dupes"] >= 1
        assert stats["unique"] == 2

    def test_preserves_unique(self):
        facts = [
            {"id": "1", "fact": "Use git rebase for clean history"},
            {"id": "2", "fact": "Docker containers should be stateless"},
            {"id": "3", "fact": "Always write tests before code"},
        ]
        deduped, stats = dedup_facts(facts)
        assert stats["unique"] == 3
        assert stats["removed"] == 0

    def test_empty_input(self):
        deduped, stats = dedup_facts([])
        assert stats["total"] == 0
        assert stats["unique"] == 0

    def test_keeps_higher_quality_near_dup(self):
        facts = [
            {"id": "1", "fact": "Check logs before deploying to production server", "confidence": 0.5, "source_count": 1},
            {"id": "2", "fact": "Check logs before deploying to production environment", "confidence": 0.9, "source_count": 5, "tags": ["ops"]},
        ]
        deduped, stats = dedup_facts(facts, near_threshold=0.5)
        assert stats["unique"] == 1
        # Higher quality fact should be kept
        assert deduped[0]["confidence"] == 0.9

    def test_dry_run_does_not_modify(self):
        facts = [
            {"id": "1", "fact": "Same text"},
            {"id": "2", "fact": "Same text"},
        ]
        deduped, stats = dedup_facts(facts, dry_run=True)
        assert stats["exact_dupes"] == 1
        # In dry_run, merge_facts is skipped so facts aren't modified
        assert len(deduped) == 1


class TestGenerateTestDuplicates:
    def test_generates_correct_count(self):
        facts = generate_test_duplicates(20)
        assert len(facts) > 20  # 20 unique + duplicates

    def test_has_exact_dupes(self):
        facts = generate_test_duplicates(20)
        hashes = [content_hash(f["fact"]) for f in facts]
        # Should have some duplicate hashes
        assert len(hashes) != len(set(hashes))

    def test_dedup_removes_dupes(self):
        facts = generate_test_duplicates(20)
        deduped, stats = dedup_facts(facts)
        assert stats["unique"] <= 20
        assert stats["removed"] > 0
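
Review note: `scripts/dedup.py` itself is not shown in this part of the diff, so the following is only a sketch of helpers that would satisfy the normalization, hashing, and similarity tests above. It assumes Jaccard similarity over word sets and a minimum token length of 3. Both are guesses inferred from the assertions, and the real module may differ.

```python
import hashlib
import re


def normalize_text(text: str) -> str:
    """Lowercase, strip, and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())


def content_hash(text: str) -> str:
    """SHA-256 hex digest of the normalized text (64 hex characters)."""
    return hashlib.sha256(normalize_text(text).encode("utf-8")).hexdigest()


def tokenize(text: str, min_len: int = 3) -> set[str]:
    """Set of alphanumeric words at least min_len long (min_len is an assumption)."""
    words = re.findall(r"[a-z0-9]+", normalize_text(text))
    return {word for word in words if len(word) >= min_len}


def token_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two token sets; 0.0 if either is empty."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

Under these assumptions, `"hello world test"` and `"hello universe test"` share 2 of 4 distinct tokens, which lands inside the 0.3 to 0.7 band the partial-overlap test expects.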


@@ -1,227 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/freshness.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.join(os.path.dirname(__file__) or ".", ".."))

import importlib.util

spec = importlib.util.spec_from_file_location(
    "freshness", os.path.join(os.path.dirname(__file__) or ".", "..", "scripts", "freshness.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

compute_file_hash = mod.compute_file_hash
check_freshness = mod.check_freshness
load_knowledge_entries = mod.load_knowledge_entries


def test_compute_file_hash():
    """File hash should be computed correctly."""
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
        f.write("test content")
        f.flush()
        h = compute_file_hash(f.name)
        assert h is not None
        assert h.startswith("sha256:")
    os.unlink(f.name)
    print("PASS: test_compute_file_hash")


def test_compute_file_hash_nonexistent():
    """Nonexistent file should return None."""
    h = compute_file_hash("/nonexistent/file.txt")
    assert h is None
    print("PASS: test_compute_file_hash_nonexistent")


def test_load_knowledge_entries_empty():
    """Empty knowledge dir should return empty list."""
    with tempfile.TemporaryDirectory() as tmpdir:
        entries = load_knowledge_entries(tmpdir)
        assert entries == []
    print("PASS: test_load_knowledge_entries_empty")


def test_load_knowledge_entries_from_index():
    """Should load entries from index.json."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create index.json
        index_path = os.path.join(tmpdir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "test.py",
                        "source_hash": "sha256:abc123",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        entries = load_knowledge_entries(tmpdir)
        assert len(entries) == 1
        assert entries[0]["fact"] == "Test fact"
        assert entries[0]["source_file"] == "test.py"
    print("PASS: test_load_knowledge_entries_from_index")


def test_load_knowledge_entries_from_yaml():
    """Should load entries from YAML files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create global directory
        global_dir = os.path.join(tmpdir, "global")
        os.makedirs(global_dir)

        # Create YAML file
        yaml_path = os.path.join(global_dir, "test.yaml")
        with open(yaml_path, "w") as f:
            f.write("""
pitfalls:
  - description: "Test pitfall"
    source_file: "test.py"
    source_hash: "sha256:def456"
    category: "pitfall"
    confidence: 0.8
""")

        entries = load_knowledge_entries(tmpdir)
        assert len(entries) == 1
        assert entries[0]["fact"] == "Test pitfall"
        assert entries[0]["category"] == "pitfall"
    print("PASS: test_load_knowledge_entries_from_yaml")


def test_check_freshness_no_changes():
    """With no source file reference, entries should be counted correctly."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        # Create index.json with entry that has no source_file
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "General knowledge",
                        "category": "fact",
                        "confidence": 0.9
                        # No source_file or source_hash
                    }
                ]
            }, f)

        result = check_freshness(knowledge_dir, repo_dir, days=1)
        # Entry without source_file should be counted as "fresh" (no_source status)
        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 0
        assert result["summary"]["fresh"] == 1
        assert result["fresh_entries"][0]["status"] == "no_source"
    print("PASS: test_check_freshness_no_changes")


def test_check_freshness_with_hash_mismatch():
    """Hash mismatch should mark entry as stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir with a file
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)
        test_file = os.path.join(repo_dir, "test.py")
        with open(test_file, "w") as f:
            f.write("print('hello')")

        # Create index.json with wrong hash
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "test.py",
                        "source_hash": "sha256:wronghash",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        # Initialize git repo
        os.system(f"cd {repo_dir} && git init && git add . && git commit -m 'init' 2>/dev/null")

        result = check_freshness(knowledge_dir, repo_dir, days=1)
        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 1
        assert result["summary"]["fresh"] == 0
        assert result["stale_entries"][0]["reason"] == "hash_mismatch"
    print("PASS: test_check_freshness_with_hash_mismatch")


def test_check_freshness_missing_source():
    """Missing source file should mark entry as stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir (without the referenced file)
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        # Create index.json referencing nonexistent file
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "nonexistent.py",
                        "source_hash": "sha256:abc123",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        # Initialize git repo
        os.system(f"cd {repo_dir} && git init && git add . && git commit -m 'init' 2>/dev/null")

        result = check_freshness(knowledge_dir, repo_dir, days=1)
        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 1
        assert result["summary"]["fresh"] == 0
        assert result["stale_entries"][0]["reason"] == "source_missing"
    print("PASS: test_check_freshness_missing_source")


def run_all():
    test_compute_file_hash()
    test_compute_file_hash_nonexistent()
    test_load_knowledge_entries_empty()
    test_load_knowledge_entries_from_index()
    test_load_knowledge_entries_from_yaml()
    test_check_freshness_no_changes()
    test_check_freshness_with_hash_mismatch()
    test_check_freshness_missing_source()
    print("\nAll 8 tests passed!")


if __name__ == "__main__":
    run_all()
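
Review note: the first two tests pin down the contract of `compute_file_hash`: a `sha256:`-prefixed string for a readable file and `None` for a missing one. The sketch below is one way to meet that contract, not the code in `scripts/freshness.py`. The chunked read and the decision to treat every `OSError` as "no hash" are assumptions.

```python
import hashlib
from typing import Optional


def compute_file_hash(path: str) -> Optional[str]:
    """Return 'sha256:<hex digest>' for the file at path, or None if it cannot be read."""
    digest = hashlib.sha256()
    try:
        with open(path, "rb") as handle:
            # Read in fixed-size chunks so large files are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
    except OSError:
        return None
    return "sha256:" + digest.hexdigest()
```

A stored `source_hash` can then be compared with the current value, and a difference corresponds to the `hash_mismatch` reason asserted in the staleness tests.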


@@ -1,141 +0,0 @@
"""Tests for knowledge_gap_identifier module."""
import sys
import os
import tempfile
import shutil
from pathlib import Path

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'scripts'))

from knowledge_gap_identifier import KnowledgeGapIdentifier, GapType, GapSeverity


def _make_repo(tmpdir, structure):
    """Create a test repo from a dict of {path: content}."""
    for rel_path, content in structure.items():
        p = Path(tmpdir) / rel_path
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content)


def test_undocumented_symbol():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n return a + b\n",
            "README.md": "# Calculator\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented = [g for g in report.gaps if g.gap_type == GapType.UNDOCUMENTED]
        assert any(g.name == "add" for g in undocumented), "add should be undocumented"


def test_documented_symbol_no_gap():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n return a + b\n",
            "README.md": "# Calculator\nUse `add()` to add numbers.\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented = [g for g in report.gaps
                        if g.gap_type == GapType.UNDOCUMENTED and g.name == "add"]
        assert len(undocumented) == 0, "add is documented, should not be flagged"


def test_untested_module():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n return a + b\n",
            "src/helper.py": "def format(x):\n return str(x)\n",
            "tests/test_calculator.py": "from src.calculator import add\nassert add(1,2) == 3\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        untested = [g for g in report.gaps if g.gap_type == GapType.UNTESTED]
        assert any("helper" in g.name for g in untested), "helper should be untested"


def test_tested_module_no_gap():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n return a + b\n",
            "tests/test_calculator.py": "def test_add():\n assert True\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        untested = [g for g in report.gaps
                    if g.gap_type == GapType.UNTESTED and "calculator" in g.name]
        assert len(untested) == 0, "calculator has tests, should not be flagged"


def test_missing_implementation():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def run():\n pass\n",
            "docs/api.md": "# API\nUse `NonExistentClass` to do things.\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        missing = [g for g in report.gaps if g.gap_type == GapType.MISSING_IMPLEMENTATION]
        assert any(g.name == "NonExistentClass" for g in missing)


def test_private_symbols_skipped():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def _internal():\n pass\ndef public():\n pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented_names = [g.name for g in report.gaps if g.gap_type == GapType.UNDOCUMENTED]
        assert "_internal" not in undocumented_names, "Private symbols should be skipped"
        assert "public" in undocumented_names


def test_empty_repo():
    with tempfile.TemporaryDirectory() as tmpdir:
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        assert len(report.gaps) == 0


def test_invalid_path():
    report = KnowledgeGapIdentifier().analyze("/nonexistent/path/xyz")
    assert len(report.gaps) == 1
    assert report.gaps[0].severity == GapSeverity.ERROR


def test_report_summary():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "class MyService:\n def handle(self):\n pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        summary = report.summary()
        assert "UNDOCUMENTED" in summary
        assert "MyService" in summary


def test_report_to_dict():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def hello():\n pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        d = report.to_dict()
        assert "total_gaps" in d
        assert "gaps" in d
        assert isinstance(d["gaps"], list)
        assert d["total_gaps"] > 0


if __name__ == "__main__":
    test_undocumented_symbol()
    test_documented_symbol_no_gap()
    test_untested_module()
    test_tested_module_no_gap()
    test_missing_implementation()
    test_private_symbols_skipped()
    test_empty_repo()
    test_invalid_path()
    test_report_summary()
    test_report_to_dict()
    print("All 10 tests passed.")
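
Review note: the "undocumented symbol" and "private symbols skipped" cases above suggest a simple rule: collect public top-level names from the source and flag those that never appear in the docs. The sketch below illustrates that rule with the standard `ast` module. `undocumented_symbols` is a hypothetical function and is not the `KnowledgeGapIdentifier` implementation, which this diff does not show.

```python
import ast
import re


def undocumented_symbols(source: str, docs: str) -> list[str]:
    """Return public top-level function/class names in source that docs never mention."""
    tree = ast.parse(source)
    public_names = [
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and not node.name.startswith("_")  # leading underscore marks a private symbol
    ]
    # A whole-word match counts as "documented"; this is a deliberately loose heuristic.
    return [
        name for name in public_names
        if not re.search(rf"\b{re.escape(name)}\b", docs)
    ]
```

A whole-word match can under-report for names that are also ordinary words (such as `add`), so a stricter variant might require the name inside backticks.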


@@ -1,319 +0,0 @@
#!/usr/bin/env python3
"""
Tests for Performance Bottleneck Finder.
"""
import json
import os
import tempfile
import textwrap
from pathlib import Path
import pytest
# Add scripts to path
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "scripts"))
from perf_bottleneck_finder import (
Bottleneck,
PerfReport,
find_slow_tests_by_scan,
analyze_build_artifacts,
analyze_makefile_targets,
find_slow_imports,
generate_report,
format_markdown,
severity_sort_key,
)
class TestBottleneck:
"""Test Bottleneck dataclass."""
def test_creation(self):
b = Bottleneck(
category="test",
name="test_foo",
duration_s=5.0,
severity="warning",
recommendation="Mock it"
)
assert b.category == "test"
assert b.name == "test_foo"
assert b.duration_s == 5.0
assert b.severity == "warning"
assert b.recommendation == "Mock it"
assert b.file_path is None
assert b.line_number is None
def test_with_location(self):
b = Bottleneck(
category="test",
name="test_bar",
duration_s=2.0,
severity="info",
recommendation="Consider",
file_path="tests/test_bar.py",
line_number=42
)
assert b.file_path == "tests/test_bar.py"
assert b.line_number == 42
def test_to_dict(self):
b = Bottleneck("test", "x", 1.0, "info", "y")
d = b.__dict__
assert "category" in d
assert "duration_s" in d
class TestPerfReport:
"""Test PerfReport dataclass."""
def test_creation(self):
report = PerfReport(
timestamp="2026-01-01T00:00:00Z",
repo_path="/tmp/repo"
)
assert report.timestamp == "2026-01-01T00:00:00Z"
assert report.bottlenecks == []
assert report.summary == {}
def test_to_dict(self):
report = PerfReport(
timestamp="2026-01-01T00:00:00Z",
repo_path="/tmp/repo",
bottlenecks=[Bottleneck("test", "x", 1.0, "info", "y")]
)
d = report.to_dict()
assert "bottlenecks" in d
assert len(d["bottlenecks"]) == 1
class TestSeveritySort:
"""Test severity sorting."""
def test_critical_first(self):
items = [
Bottleneck("test", "a", 1.0, "info", ""),
Bottleneck("test", "b", 0.5, "critical", ""),
Bottleneck("test", "c", 2.0, "warning", ""),
]
items.sort(key=severity_sort_key)
assert items[0].severity == "critical"
assert items[1].severity == "warning"
assert items[2].severity == "info"
def test_duration_within_severity(self):
items = [
Bottleneck("test", "slow", 10.0, "warning", ""),
Bottleneck("test", "fast", 1.0, "warning", ""),
]
items.sort(key=severity_sort_key)
assert items[0].name == "slow" # Higher duration first within same severity
class TestSlowTestScan:
"""Test slow test pattern scanning."""
def test_finds_sleep(self, tmp_path):
test_file = tmp_path / "test_sleepy.py"
test_file.write_text(textwrap.dedent('''
import time
def test_slow():
time.sleep(5)
assert True
'''))
bottlenecks = find_slow_tests_by_scan(str(tmp_path))
assert len(bottlenecks) >= 1
assert any("sleep" in b.recommendation.lower() for b in bottlenecks)
def test_finds_http_calls(self, tmp_path):
test_file = tmp_path / "test_http.py"
test_file.write_text(textwrap.dedent('''
import requests
def test_api():
resp = requests.get("https://example.com")
assert resp.status_code == 200
'''))
bottlenecks = find_slow_tests_by_scan(str(tmp_path))
assert len(bottlenecks) >= 1
assert any("HTTP" in b.recommendation or "mock" in b.recommendation.lower() for b in bottlenecks)
def test_skips_non_test_files(self, tmp_path):
src_file = tmp_path / "main.py"
src_file.write_text("import time\ntime.sleep(10)\n")
bottlenecks = find_slow_tests_by_scan(str(tmp_path))
assert len(bottlenecks) == 0
def test_handles_missing_dir(self):
bottlenecks = find_slow_tests_by_scan("/nonexistent/path")
assert bottlenecks == []
def test_file_path_populated(self, tmp_path):
test_file = tmp_path / "test_example.py"
test_file.write_text("import time\n\ndef test_it():\n time.sleep(2)\n")
bottlenecks = find_slow_tests_by_scan(str(tmp_path))
assert len(bottlenecks) >= 1
assert bottlenecks[0].file_path is not None
assert bottlenecks[0].line_number is not None
class TestBuildArtifacts:
"""Test build artifact analysis."""
def test_finds_large_node_modules(self, tmp_path):
nm = tmp_path / "node_modules"
nm.mkdir()
# Create a file > 10MB
big_file = nm / "big.txt"
big_file.write_bytes(b"x" * (11 * 1024 * 1024))
bottlenecks = analyze_build_artifacts(str(tmp_path))
assert len(bottlenecks) >= 1
assert any("node_modules" in b.name for b in bottlenecks)
def test_ignores_small_dirs(self, tmp_path):
nm = tmp_path / "node_modules"
nm.mkdir()
small_file = nm / "small.txt"
small_file.write_bytes(b"x" * 100)
bottlenecks = analyze_build_artifacts(str(tmp_path))
assert not any("node_modules" in b.name for b in bottlenecks)
def test_finds_pycache(self, tmp_path):
cache = tmp_path / "__pycache__"
cache.mkdir()
big_file = cache / "big.pyc"
big_file.write_bytes(b"x" * (11 * 1024 * 1024))
bottlenecks = analyze_build_artifacts(str(tmp_path))
assert any("__pycache__" in b.name for b in bottlenecks)
class TestMakefileAnalysis:
"""Test Makefile analysis."""
def test_finds_pip_install(self, tmp_path):
makefile = tmp_path / "Makefile"
makefile.write_text(textwrap.dedent('''
install:
pip install -r requirements.txt
test:
pytest
'''))
bottlenecks = analyze_makefile_targets(str(tmp_path))
assert len(bottlenecks) >= 1
def test_no_makefile(self, tmp_path):
bottlenecks = analyze_makefile_targets(str(tmp_path))
assert bottlenecks == []
class TestImportAnalysis:
"""Test heavy import detection."""
def test_finds_pandas(self, tmp_path):
src = tmp_path / "analysis.py"
src.write_text("import pandas as pd\n")
bottlenecks = find_slow_imports(str(tmp_path))
assert len(bottlenecks) >= 1
assert any("pandas" in b.name for b in bottlenecks)
def test_finds_torch(self, tmp_path):
src = tmp_path / "model.py"
src.write_text("import torch\n")
bottlenecks = find_slow_imports(str(tmp_path))
assert any("torch" in b.name for b in bottlenecks)
def test_skips_light_imports(self, tmp_path):
src = tmp_path / "utils.py"
src.write_text("import json\nimport os\nimport sys\n")
bottlenecks = find_slow_imports(str(tmp_path))
assert len(bottlenecks) == 0
class TestGenerateReport:
    """Test full report generation."""

    def test_empty_repo(self, tmp_path):
        report = generate_report(str(tmp_path))
        assert report.summary["total_bottlenecks"] >= 0
        assert "critical" in report.summary
        assert "warning" in report.summary

    def test_with_findings(self, tmp_path):
        # Create a test file with issues
        test_file = tmp_path / "test_slow.py"
        test_file.write_text(textwrap.dedent('''
            import time
            import requests

            def test_sleepy():
                time.sleep(3)

            def test_http():
                requests.get("https://example.com")
        '''))
        report = generate_report(str(tmp_path))
        assert report.summary["total_bottlenecks"] >= 2
        assert len(report.bottlenecks) > 0

    def test_summary_categories(self, tmp_path):
        report = generate_report(str(tmp_path))
        assert "by_category" in report.summary
class TestMarkdownReport:
    """Test markdown output."""

    def test_format(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            bottlenecks=[
                Bottleneck("test", "slow_test", 5.0, "critical", "Fix it")
            ],
            summary={
                "total_bottlenecks": 1,
                "critical": 1,
                "warning": 0,
                "info": 0,
                "estimated_total_slowdown_s": 5.0,
                "by_category": {"test": 1},
            }
        )
        md = format_markdown(report)
        assert "# Performance Bottleneck Report" in md
        assert "slow_test" in md
        assert "🔴" in md
        assert "Fix it" in md

    def test_empty_report(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            summary={
                "total_bottlenecks": 0,
                "critical": 0,
                "warning": 0,
                "info": 0,
                "estimated_total_slowdown_s": 0,
                "by_category": {},
            }
        )
        md = format_markdown(report)
        assert "Total bottlenecks:** 0" in md

View File

@@ -1,108 +0,0 @@
"""
Tests for quality_gate.py — Knowledge entry quality scoring.
"""
import unittest
from datetime import datetime, timezone, timedelta
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent))

from quality_gate import (
    score_specificity,
    score_actionability,
    score_freshness,
    score_source_quality,
    score_entry,
    filter_entries,
)
class TestScoreSpecificity(unittest.TestCase):
    def test_specific_content_scores_high(self):
        content = "Run `python3 deploy.py --env prod` on 2026-04-15. Example: step 1 configure nginx."
        score = score_specificity(content)
        self.assertGreater(score, 0.6)

    def test_vague_content_scores_low(self):
        content = "It generally depends. Various factors might affect this. Basically, it varies."
        score = score_specificity(content)
        self.assertLess(score, 0.5)

    def test_empty_scores_baseline(self):
        score = score_specificity("")
        self.assertAlmostEqual(score, 0.5, delta=0.1)
class TestScoreActionability(unittest.TestCase):
    def test_actionable_content_scores_high(self):
        content = "1. Run `pip install -r requirements.txt`\n2. Execute `python3 train.py`\n3. Verify with `pytest`"
        score = score_actionability(content)
        self.assertGreater(score, 0.6)

    def test_abstract_content_scores_low(self):
        content = "The concept of intelligence is fascinating and multifaceted."
        score = score_actionability(content)
        self.assertLess(score, 0.5)
class TestScoreFreshness(unittest.TestCase):
    def test_recent_timestamp_scores_high(self):
        recent = datetime.now(timezone.utc).isoformat()
        score = score_freshness(recent)
        self.assertGreater(score, 0.9)

    def test_old_timestamp_scores_low(self):
        old = (datetime.now(timezone.utc) - timedelta(days=365)).isoformat()
        score = score_freshness(old)
        self.assertLess(score, 0.2)

    def test_none_returns_baseline(self):
        score = score_freshness(None)
        self.assertEqual(score, 0.5)
class TestScoreSourceQuality(unittest.TestCase):
    def test_claude_scores_high(self):
        self.assertGreater(score_source_quality("claude-sonnet"), 0.85)

    def test_ollama_scores_lower(self):
        self.assertLess(score_source_quality("ollama"), 0.7)

    def test_unknown_returns_default(self):
        self.assertEqual(score_source_quality("unknown"), 0.5)
class TestScoreEntry(unittest.TestCase):
    def test_good_entry_scores_high(self):
        entry = {
            "content": "To deploy: run `kubectl apply -f deployment.yaml`. Verify with `kubectl get pods`.",
            "model": "claude-sonnet",
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        score = score_entry(entry)
        self.assertGreater(score, 0.6)

    def test_poor_entry_scores_low(self):
        entry = {
            "content": "It depends. Various things might happen.",
            "model": "unknown",
        }
        score = score_entry(entry)
        self.assertLess(score, 0.5)
class TestFilterEntries(unittest.TestCase):
    def test_filters_low_quality(self):
        entries = [
            {"content": "Run `deploy.py` to fix the issue.", "model": "claude"},
            {"content": "It might work sometimes.", "model": "unknown"},
            {"content": "Configure nginx: step 1 edit nginx.conf", "model": "gpt-4"},
        ]
        filtered = filter_entries(entries, threshold=0.5)
        self.assertGreaterEqual(len(filtered), 2)


if __name__ == "__main__":
    unittest.main()
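
The `TestScoreFreshness` cases above constrain a freshness score at three points: a timestamp from "now" scores above 0.9, one from 365 days ago scores below 0.2, and `None` returns exactly 0.5. One way to meet all three is exponential decay with a fixed half-life. The sketch below is an illustration only, not the code in `quality_gate.py`; the function name `freshness_sketch`, the 90-day `HALF_LIFE_DAYS`, and the `now` parameter are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HALF_LIFE_DAYS = 90.0  # assumed half-life; the real scorer's decay rate is not shown
BASELINE = 0.5         # score used when no usable timestamp is available


def freshness_sketch(timestamp: Optional[str], now: Optional[datetime] = None) -> float:
    """Map an ISO 8601 timestamp to a score in (0, 1] that halves every HALF_LIFE_DAYS."""
    if not timestamp:
        return BASELINE
    try:
        ts = datetime.fromisoformat(timestamp)
    except ValueError:
        return BASELINE  # unparseable input falls back to the baseline
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # treat naive timestamps as UTC
    now = now or datetime.now(timezone.utc)
    age_days = max((now - ts).total_seconds() / 86400.0, 0.0)  # clamp future dates to age 0
    return 0.5 ** (age_days / HALF_LIFE_DAYS)
```

With a 90-day half-life, a 365-day-old entry scores roughly 0.06, comfortably under the 0.2 bound the test uses. Passing `now` explicitly keeps such a function deterministic under test.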