Compare commits

60 Commits: burn/8-har ... fix/211-sy

SHA1: 4363864012, fe8a70adc1, a4cbfbd5bf, 10096977bd, 08c4237dca, 661f2730f2, 80cc656fc1, 522fa84151, 69992cf82a, d0a422deef, 4210a4ea9d, bac0e0336b, 64b823c598, 27459271ac, 45cd835bd2, 5e8a201a26, fa43d82f79, c7b7b38510, 908fbbb26c, ea33c331f6, f55d31cd02, 8ca3132990, 363407293d, 4bd0a5d7ac, 93bc3fc18a, 9668034ad6, 49365c64d2, 341abab2a0, e8359cf10a, b3592e14ad, f1175df79d, be805a1b4c, 1d47665dd4, 5eab5e4aac, 71dd801575, e6f1b07f16, 81c02f6709, c2c3c6a3b9, d664119b9c, 764414d4d5, 54f3bef7fc, 4fcd372de4, 77a753f6f2, cbebd93cbb, b36f617d4a, b5466dc938, 55797c8a3e, 7342fc7cb2, 206cfbb498, cdb71adddf, 160dfcf419, 8d716ff03f, 920510996e, 1fafeaf5a4, 36b440f998, 9f3caabf42, a21f3a44e1, b32d316023, b65256bf76, da073ad7cf
.gitea/workflows/test.yml (new file, 22 lines)
@@ -0,0 +1,22 @@

```yaml
name: Test

on:
  pull_request:
  push:
    branches: [main]

jobs:
  pytest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install test dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run test suite
        run: |
          make test
```
GENOME.md (new file, 239 lines)
@@ -0,0 +1,239 @@

# GENOME.md — compounding-intelligence

*Auto-generated codebase genome. Addresses timmy-home#676.*

---

## Project Overview

**What:** A system that turns 1B+ daily agent tokens into durable, compounding fleet intelligence.

**Why:** Every agent session starts at zero. The same mistakes get made repeatedly — the same HTTP 405 is rediscovered as a branch protection issue, the same token path is searched for from scratch. Intelligence evaporates when the session ends.

**How:** Three pipelines form a compounding loop:

```
SESSION ENDS → HARVESTER → KNOWLEDGE STORE → BOOTSTRAPPER → NEW SESSION STARTS SMARTER
                                  ↓
                              MEASURER → Prove it's working
```

**Status:** Early stage. Template and test scaffolding exist. Core pipeline scripts (harvester.py, bootstrapper.py, measurer.py, session_reader.py) are planned but not yet implemented. The knowledge extraction prompt is complete and validated.

---
## Architecture

```mermaid
graph TD
    A[Session Transcript<br/>.jsonl] --> B[Harvester]
    B --> C{Extract Knowledge}
    C --> D[knowledge/index.json]
    C --> E[knowledge/global/*.md]
    C --> F[knowledge/repos/{repo}.md]
    C --> G[knowledge/agents/{agent}.md]
    D --> H[Bootstrapper]
    H --> I[Bootstrap Context<br/>2k token injection]
    I --> J[New Session<br/>starts smarter]
    J --> A
    D --> K[Measurer]
    K --> L[metrics/dashboard.md]
    K --> M[Velocity / Hit Rate<br/>Error Reduction]
```

### Pipeline 1: Harvester

**Status:** Prompt designed. Script not implemented.

Reads finished session transcripts (JSONL). Uses `templates/harvest-prompt.md` to extract durable knowledge into five categories:

| Category | Description | Example |
|----------|-------------|---------|
| `fact` | Concrete, verifiable information | "Repository X has 5 files" |
| `pitfall` | Errors encountered, wrong assumptions | "Token is at ~/.config/gitea/token, not env var" |
| `pattern` | Successful action sequences | "Deploy: test → build → push → webhook" |
| `tool-quirk` | Environment-specific behaviors | "URL format requires trailing slash" |
| `question` | Identified but unanswered | "Need optimal batch size for harvesting" |

Output schema per knowledge item:

```json
{
  "fact": "One sentence description",
  "category": "fact|pitfall|pattern|tool-quirk|question",
  "repo": "repo-name or 'global'",
  "confidence": 0.0-1.0
}
```
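Since harvester.py is not yet implemented, a minimal sketch of validating extracted items against the schema above (`validate_item` is a hypothetical helper, not part of the current scripts):

```python
# Hypothetical validation of harvester output against the knowledge-item schema.
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_item(item: dict) -> list[str]:
    """Return a list of schema violations (empty list means the item is valid)."""
    errors = []
    if not isinstance(item.get("fact"), str) or not item["fact"].strip():
        errors.append("fact must be a non-empty string")
    if item.get("category") not in CATEGORIES:
        errors.append(f"category must be one of {sorted(CATEGORIES)}")
    if not isinstance(item.get("repo"), str):
        errors.append("repo must be a string (repo name or 'global')")
    conf = item.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        errors.append("confidence must be a number in [0.0, 1.0]")
    return errors

item = {"fact": "Token lives at ~/.config/gitea/token",
        "category": "tool-quirk", "repo": "global", "confidence": 0.95}
assert validate_item(item) == []
```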
### Pipeline 2: Bootstrapper

**Status:** Not implemented.

Queries the knowledge store before session start, assembles a compact 2k-token context from relevant facts, and injects it into session startup so the agent begins with full situational awareness.
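A minimal sketch of what that query step could look like, reading index.json as specified in knowledge/SCHEMA.md (the `bootstrap_context` name, signature, and the chars-per-token estimate are assumptions, since the script does not exist yet):

```python
import json
from pathlib import Path

def bootstrap_context(index_path: str, repo: str, budget_tokens: int = 2000) -> str:
    """Assemble a compact bootstrap context from the fact index (illustrative sketch)."""
    index = json.loads(Path(index_path).read_text())
    # Relevant = this repo's domain plus global facts, high confidence first.
    facts = [f for f in index["facts"]
             if f["domain"] in (repo, "global") and f["confidence"] > 0.7]
    facts.sort(key=lambda f: f["confidence"], reverse=True)
    lines, used = [], 0
    for f in facts:
        cost = len(f["fact"]) // 4 + 1   # rough chars-per-token estimate
        if used + cost > budget_tokens:
            break
        lines.append(f"- [{f['category']}] {f['fact']}")
        used += cost
    return "\n".join(lines)
```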
### Pipeline 3: Measurer

**Status:** Not implemented.

Tracks compounding metrics: knowledge velocity (facts/day), error reduction (%), hit rate (knowledge used / knowledge available), and task completion improvement.
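The two simplest metrics fall out directly from the definitions above; a sketch (function names are illustrative, not the planned measurer.py API):

```python
# Illustrative metric calculations following the definitions in the text.
def knowledge_velocity(facts: list[dict], days: int) -> float:
    """Facts harvested per day over the measurement window."""
    return len(facts) / days if days else 0.0

def hit_rate(used_ids: set[str], facts: list[dict]) -> float:
    """Knowledge used / knowledge available."""
    return len(used_ids) / len(facts) if facts else 0.0

facts = [{"id": f"global:fact:{i:03d}"} for i in range(10)]
print(knowledge_velocity(facts, days=5))                         # → 2.0
print(hit_rate({"global:fact:001", "global:fact:002"}, facts))   # → 0.2
```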
---

## Directory Structure

```
compounding-intelligence/
├── README.md                                  # Project overview and architecture
├── GENOME.md                                  # This file (codebase genome)
├── knowledge/                                 # [PLANNED] Knowledge store
│   ├── index.json                             # Machine-readable fact index
│   ├── global/                                # Cross-repo knowledge
│   ├── repos/{repo}.md                        # Per-repo knowledge
│   └── agents/{agent}.md                      # Agent-type notes
├── scripts/
│   ├── test_harvest_prompt.py                 # Basic prompt validation (2.5KB)
│   └── test_harvest_prompt_comprehensive.py   # Full prompt structure test (6.8KB)
├── templates/
│   └── harvest-prompt.md                      # Knowledge extraction prompt (3.5KB)
├── test_sessions/
│   ├── session_success.jsonl                  # Happy path test data
│   ├── session_failure.jsonl                  # Failure path test data
│   ├── session_partial.jsonl                  # Incomplete session test data
│   ├── session_patterns.jsonl                 # Pattern extraction test data
│   └── session_questions.jsonl                # Question identification test data
└── metrics/                                   # [PLANNED] Compounding metrics
    └── dashboard.md
```

---
## Entry Points and Data Flow

### Entry Point 1: Knowledge Extraction (Harvester)

```
Input: Session transcript (JSONL)
        ↓
templates/harvest-prompt.md (LLM prompt)
        ↓
Knowledge items (JSON array)
        ↓
Output: knowledge/index.json + per-repo/per-agent markdown files
```

### Entry Point 2: Session Bootstrap (Bootstrapper)

```
Input: Session context (repo, agent type, task type)
        ↓
knowledge/index.json (query relevant facts)
        ↓
2k-token bootstrap context
        ↓
Output: Injected into session startup
```

### Entry Point 3: Measurement (Measurer)

```
Input: knowledge/index.json + session history
        ↓
Velocity, hit rate, error reduction calculations
        ↓
Output: metrics/dashboard.md
```

---
## Key Abstractions

### Knowledge Item

The atomic unit. One sentence, one category, one confidence score. Designed to be small enough that 1000 items fit in a 2k-token bootstrap context.

### Knowledge Store

A directory structure that mirrors the fleet's mental model:

- `global/` — knowledge that applies everywhere (tool quirks, environment facts)
- `repos/` — knowledge specific to each repo
- `agents/` — knowledge specific to each agent type

### Confidence Score

0.0–1.0 scale. Defines how certain the harvester is about each extracted fact:

- 0.9–1.0: Explicitly stated with verification
- 0.7–0.8: Clearly implied by multiple data points
- 0.5–0.6: Suggested but not fully verified
- 0.3–0.4: Inferred from limited data
- 0.1–0.2: Speculative or uncertain

### Bootstrap Context

The 2k-token injection that a new session receives. Assembled from the most relevant knowledge items for the current task, filtered by confidence > 0.7, deduplicated, and compressed.

---
## API Surface

### Internal (scripts not yet implemented)

| Script | Input | Output | Status |
|--------|-------|--------|--------|
| `harvester.py` | Session JSONL path | Knowledge items JSON | PLANNED |
| `bootstrapper.py` | Repo + agent type | 2k-token context string | PLANNED |
| `measurer.py` | Knowledge store path | Metrics JSON | PLANNED |
| `session_reader.py` | Session JSONL path | Parsed transcript | PLANNED |

### Prompt (templates/harvest-prompt.md)

The extraction prompt is the core "API." It takes a session transcript and returns structured JSON. It defines:

- Five extraction categories
- Output format (JSON array of knowledge items)
- Confidence scoring rubric
- Constraints (no hallucination, specificity, relevance, brevity)
- Example input/output pair

---
## Test Coverage

### What Exists

| File | Tests | Coverage |
|------|-------|----------|
| `scripts/test_harvest_prompt.py` | 2 tests | Prompt file existence, sample transcript |
| `scripts/test_harvest_prompt_comprehensive.py` | 5 tests | Prompt structure, categories, fields, confidence scoring, size limits |
| `test_sessions/*.jsonl` | 5 sessions | Success, failure, partial, patterns, questions |

### What's Missing

1. **Harvester integration test** — Does the prompt actually extract correct knowledge from real transcripts?
2. **Bootstrapper test** — Does it assemble relevant context correctly?
3. **Knowledge store test** — Does index.json stay consistent as facts are added?
4. **Confidence calibration test** — Do high-confidence facts actually prove true in later sessions?
5. **Deduplication test** — Are duplicate facts across sessions handled?
6. **Staleness test** — How does the system handle outdated knowledge?
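As an example of gap 5, a deduplication check might look like the following (hypothetical normalize-and-compare approach; `dedup_facts` does not exist in the repo):

```python
# Hypothetical deduplication helper and test for gap 5 above.
def dedup_facts(facts: list[dict]) -> list[dict]:
    """Keep the first occurrence of each fact, compared case-insensitively."""
    seen, out = set(), []
    for f in facts:
        key = f["fact"].strip().lower()
        if key not in seen:
            seen.add(key)
            out.append(f)
    return out

def test_duplicate_facts_across_sessions_are_merged():
    facts = [{"fact": "Gitea token is at ~/.config/gitea/token"},
             {"fact": "gitea token is at ~/.config/gitea/token"}]
    assert len(dedup_facts(facts)) == 1
```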
## Security Considerations

1. **No secrets in knowledge store** — The harvester must filter API keys, tokens, and credentials out of extracted facts. The prompt constraints mention this, but there is no automated guard.

2. **Knowledge poisoning** — A malicious or corrupted session could inject false facts. Confidence scoring partially mitigates this, but there is no verification step.

3. **Access control** — The knowledge store has no access control: any process that can read the directory can read all facts. In a multi-tenant setup, this is a concern.

4. **Transcript privacy** — Session transcripts may contain user data. The harvester must not extract personally identifiable information into the knowledge store.

---
## The 100x Path (from README)

```
Month 1: 15,000 facts, sessions 20% faster
Month 2: 45,000 facts, sessions 40% faster, first-try success up 30%
Month 3: 90,000 facts, fleet measurably smarter per token
```

Each new session is better than the last. The intelligence compounds.

---

*Generated by codebase-genome pipeline. Ref: timmy-home#676.*
Makefile (new file, 4 lines)
@@ -0,0 +1,4 @@

```makefile
.PHONY: test

test:
	python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v
```
knowledge/SCHEMA.md (new file, 114 lines)
@@ -0,0 +1,114 @@

# Knowledge File Format Specification

**Version:** 1
**Issue:** #10
**Status:** Draft

---

## Overview

The knowledge system has two layers:

1. **index.json** — Machine-readable fact index. Fast lookups by ID, category, repo, tags.
2. **Knowledge files** (YAML) — Human-readable, editable facts organized by domain.

The harvester writes to both. The bootstrapper reads from index.json. Humans edit the YAML files directly.

---
## index.json Schema

```json
{
  "version": 1,
  "last_updated": "ISO-8601 timestamp",
  "total_facts": 0,
  "facts": []
}
```

### Fact Object

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | yes | Unique identifier: `{domain}:{category}:{sequence}` |
| `fact` | string | yes | One-sentence description of the knowledge |
| `category` | enum | yes | One of: `fact`, `pitfall`, `pattern`, `tool-quirk`, `question` |
| `domain` | string | yes | Where this applies: repo name, `global`, or agent name |
| `confidence` | float | yes | 0.0–1.0. How certain is this knowledge? |
| `tags` | string[] | no | Searchable labels |
| `source_count` | int | no | How many sessions confirmed this fact |
| `first_seen` | date | no | ISO-8601 date first extracted |
| `last_confirmed` | date | no | ISO-8601 date last seen in a session |
| `expires` | date | no | Optional. After this date, the fact is stale |
| `related` | string[] | no | IDs of related facts |

### ID Format: `{domain}:{category}:{sequence}`
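Generating the next ID in a domain/category pair follows mechanically from this format; a sketch (`next_fact_id` is a hypothetical helper, not part of the current scripts):

```python
# Hypothetical helper: allocate the next {domain}:{category}:{sequence} ID.
def next_fact_id(existing_ids: list[str], domain: str, category: str) -> str:
    prefix = f"{domain}:{category}:"
    # Collect sequence numbers already used under this domain/category.
    seqs = [int(i.rsplit(":", 1)[1]) for i in existing_ids if i.startswith(prefix)]
    return f"{prefix}{max(seqs, default=0) + 1:03d}"

print(next_fact_id(["global:pitfall:001", "global:pitfall:002"], "global", "pitfall"))
# → global:pitfall:003
```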
### Categories

| Category | Definition |
|----------|------------|
| `fact` | Concrete, verifiable information |
| `pitfall` | Errors, wrong assumptions, time-wasters |
| `pattern` | Successful sequences of actions |
| `tool-quirk` | Environment-specific behaviors |
| `question` | Identified but unanswered |

### Confidence Scoring

| Range | Meaning |
|-------|---------|
| 0.9–1.0 | Explicitly stated and verified |
| 0.7–0.8 | Clearly implied by multiple data points |
| 0.5–0.6 | Suggested but not fully verified |
| 0.3–0.4 | Inferred from limited data |
| 0.1–0.2 | Speculative or uncertain |

---
## Directory Structure

```
knowledge/
├── index.json             # Machine-readable fact index
├── SCHEMA.md              # This file
├── global/                # Cross-repo knowledge
│   ├── pitfalls.yaml
│   ├── patterns.yaml
│   └── tool-quirks.yaml
├── repos/                 # Per-repo knowledge
│   ├── {repo-name}.yaml
│   └── ...
└── agents/                # Agent-type knowledge
    └── {agent-type}.yaml
```
## YAML File Format

YAML files use frontmatter for metadata, then markdown sections with fact entries:

```yaml
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---

# Title

## Section

- id: global:tool-quirk:001
  fact: "Description"
  confidence: 0.95
  tags: [tag1, tag2]
  source_count: 5
  first_seen: "2026-03-27"
```

## Validation

Run `python scripts/validate_knowledge.py` to validate index.json.
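scripts/validate_knowledge.py is referenced but not included in this diff; per the schema above, the checks it would need might sketch as follows (illustrative only, not the actual script):

```python
# Illustrative index.json checks, derived from the Fact Object schema above.
REQUIRED = ("id", "fact", "category", "domain", "confidence")
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_index(index: dict) -> list[str]:
    """Return a list of violations (empty list means the index is valid)."""
    errors = []
    if index.get("total_facts") != len(index.get("facts", [])):
        errors.append("total_facts does not match len(facts)")
    seen = set()
    for f in index.get("facts", []):
        fid = f.get("id", "<missing id>")
        for field in REQUIRED:
            if field not in f:
                errors.append(f"{fid}: missing required field '{field}'")
        if f.get("category") not in CATEGORIES:
            errors.append(f"{fid}: invalid category")
        if fid in seen:
            errors.append(f"{fid}: duplicate id")
        seen.add(fid)
    return errors
```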
knowledge/global/pitfalls.yaml (new file, 80 lines)
@@ -0,0 +1,80 @@

```yaml
---
domain: global
category: pitfall
version: 1
last_updated: "2026-04-13"
---

# Pitfalls (Global)

Cross-repo traps that waste time across the fleet.

## Git & Forge

- id: global:pitfall:001
  fact: "Branch protection requires 1 approval on main - API merges fail with 405 without it"
  confidence: 0.95
  tags: [git, merge, branch-protection, gitea]
  source_count: 12
  first_seen: "2026-04-05"
  last_confirmed: "2026-04-13"
  related: [the-nexus:pitfall:001]

- id: global:pitfall:002
  fact: "Never use --no-verify on git commits - it bypasses all hooks including safety checks"
  confidence: 0.95
  tags: [git, hooks, safety]
  source_count: 5
  first_seen: "2026-03-28"
  last_confirmed: "2026-04-13"

- id: global:pitfall:003
  fact: "Gitea PR creation workaround needed on the-nexus - direct API call fails, use alternative endpoint"
  confidence: 0.9
  tags: [gitea, pr, api, workaround]
  source_count: 4
  first_seen: "2026-04-06"
  last_confirmed: "2026-04-12"

## Agent Operations

- id: global:pitfall:004
  fact: "Anthropic is BANNED from fallback chain - if fallback triggers to Anthropic, something is wrong"
  confidence: 0.95
  tags: [provider, anthropic, fallback]
  source_count: 7
  first_seen: "2026-03-30"
  last_confirmed: "2026-04-13"

- id: global:pitfall:005
  fact: "Telegram tokens expired - don't assume Telegram notifications work without checking"
  confidence: 0.85
  tags: [telegram, notifications, token]
  source_count: 3
  first_seen: "2026-04-02"

- id: global:pitfall:006
  fact: "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run"
  confidence: 0.9
  tags: [gateway, cron, process]
  source_count: 4
  first_seen: "2026-04-04"
  last_confirmed: "2026-04-11"

## Testing

- id: global:pitfall:007
  fact: "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory"
  confidence: 0.9
  tags: [pytest, test, collection]
  source_count: 3
  first_seen: "2026-04-07"
  last_confirmed: "2026-04-13"

- id: global:pitfall:008
  fact: "TDD: test 1 before building 55 - verify the cycle works before scaling"
  confidence: 0.95
  tags: [tdd, testing, methodology]
  source_count: 8
  first_seen: "2026-03-25"
  last_confirmed: "2026-04-13"
```
knowledge/global/tool-quirks.yaml (new file, 71 lines)
@@ -0,0 +1,71 @@

```yaml
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---

# Tool Quirks (Global)

## Authentication

- id: global:tool-quirk:001
  fact: "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN"
  confidence: 0.95
  tags: [git, auth, gitea, token]
  source_count: 23
  first_seen: "2026-03-27"
  last_confirmed: "2026-04-13"
  related: [global:pitfall:001]

- id: global:tool-quirk:002
  fact: "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer"
  confidence: 0.9
  tags: [git, api, gitea]
  source_count: 8
  first_seen: "2026-03-28"
  last_confirmed: "2026-04-12"

- id: global:tool-quirk:003
  fact: "Gitea Issues API type=issues param does NOT filter PRs - use truthiness check on pull_request field"
  confidence: 0.95
  tags: [gitea, api, issues, pr]
  source_count: 6
  first_seen: "2026-04-01"
  last_confirmed: "2026-04-13"

## Paths & Environment

- id: global:tool-quirk:004
  fact: "~/.hermes is the default hermes home - check get_hermes_home() not the path literal"
  confidence: 0.9
  tags: [paths, hermes, env]
  source_count: 10
  first_seen: "2026-03-30"
  last_confirmed: "2026-04-13"
  related: [hermes-agent:pitfall:005]

- id: global:tool-quirk:005
  fact: "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin"
  confidence: 0.85
  tags: [ansible, vault, config]
  source_count: 3
  first_seen: "2026-04-02"

## Model & Inference

- id: global:tool-quirk:006
  fact: "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available"
  confidence: 0.95
  tags: [model, provider, nous, default]
  source_count: 15
  first_seen: "2026-03-25"
  last_confirmed: "2026-04-13"

- id: global:tool-quirk:007
  fact: "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume"
  confidence: 0.95
  tags: [hermes, model, restart, session]
  source_count: 8
  first_seen: "2026-03-29"
  last_confirmed: "2026-04-12"
```
```diff
@@ -1,6 +1,472 @@
 {
   "version": 1,
   "last_updated": "2026-04-13T20:00:00Z",
-  "total_facts": 0,
+  "total_facts": 29,
-  "facts": []
+  "facts": [
+    {"id": "hermes-agent:pitfall:001", "fact": "deploy-crons.py leaves jobs in mixed model format", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.95, "tags": ["cron", "deploy", "model", "config"], "source_count": 5, "first_seen": "2026-04-08", "last_confirmed": "2026-04-13", "related": ["hermes-agent:pitfall:002", "hermes-agent:pitfall:003"]},
+    {"id": "hermes-agent:pitfall:002", "fact": "deploy-crons.py --deploy doesn't set legacy skill field from skills list", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.9, "tags": ["cron", "deploy", "skills"], "source_count": 3, "first_seen": "2026-04-09", "last_confirmed": "2026-04-13", "related": ["hermes-agent:pitfall:001"]},
+    {"id": "hermes-agent:pitfall:003", "fact": "Cron jobs with blank fallback_model fields trigger spurious gateway warnings", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.9, "tags": ["cron", "model", "fallback"], "source_count": 4, "first_seen": "2026-04-07", "last_confirmed": "2026-04-12", "related": ["hermes-agent:pitfall:001"]},
+    {"id": "hermes-agent:pitfall:004", "fact": "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.9, "tags": ["watchdog", "model", "config"], "source_count": 3, "first_seen": "2026-04-08", "last_confirmed": "2026-04-13"},
+    {"id": "hermes-agent:pitfall:005", "fact": "10+ files read HERMES_HOME directly instead of get_hermes_home()", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.85, "tags": ["paths", "env", "hermes-home"], "source_count": 6, "first_seen": "2026-04-06", "last_confirmed": "2026-04-12", "related": ["global:pitfall:002"]},
+    {"id": "hermes-agent:pitfall:006", "fact": "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.8, "tags": ["paths", "env", "bug"], "source_count": 2, "first_seen": "2026-04-05"},
+    {"id": "hermes-agent:pitfall:007", "fact": "vps-agent-dispatch reports OK while remote hermes binary path is broken", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.9, "tags": ["ssh", "dispatch", "vps"], "source_count": 4, "first_seen": "2026-04-07", "last_confirmed": "2026-04-11"},
+    {"id": "hermes-agent:pitfall:008", "fact": "nightwatch-health-monitor SSH check fails on cloud-model-only deployments", "category": "pitfall", "domain": "hermes-agent", "confidence": 0.85, "tags": ["ssh", "health", "cloud"], "source_count": 2, "first_seen": "2026-04-10"},
+    {"id": "the-nexus:pitfall:001", "fact": "Merges fail with HTTP 405 due to branch protection", "category": "pitfall", "domain": "the-nexus", "confidence": 0.95, "tags": ["git", "merge", "branch-protection", "gitea"], "source_count": 12, "first_seen": "2026-04-05", "last_confirmed": "2026-04-13", "related": ["global:pitfall:001"]},
+    {"id": "the-nexus:pitfall:002", "fact": "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests", "category": "pitfall", "domain": "the-nexus", "confidence": 0.95, "tags": ["server", "concurrency", "bridge"], "source_count": 5, "first_seen": "2026-04-10", "last_confirmed": "2026-04-13"},
+    {"id": "the-nexus:pitfall:003", "fact": "ChatLog.log() crashes on message persistence when index.html has orphaned button tags", "category": "pitfall", "domain": "the-nexus", "confidence": 0.9, "tags": ["html", "crash", "chatlog"], "source_count": 3, "first_seen": "2026-04-12", "last_confirmed": "2026-04-13"},
+    {"id": "the-nexus:pitfall:004", "fact": "Three.js LOD not implemented - local hardware struggles with full scene", "category": "pitfall", "domain": "the-nexus", "confidence": 0.85, "tags": ["threejs", "performance", "lod"], "source_count": 4, "first_seen": "2026-04-09", "last_confirmed": "2026-04-13"},
+    {"id": "the-nexus:pitfall:005", "fact": "Duplicate content blocks appear in index.html when PR merges conflict silently", "category": "pitfall", "domain": "the-nexus", "confidence": 0.8, "tags": ["html", "merge-conflict", "duplicate"], "source_count": 3, "first_seen": "2026-04-11", "last_confirmed": "2026-04-13"},
+    {"id": "the-nexus:pitfall:006", "fact": "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS", "category": "pitfall", "domain": "the-nexus", "confidence": 0.9, "tags": ["deploy", "websocket", "http", "cors"], "source_count": 4, "first_seen": "2026-04-10", "last_confirmed": "2026-04-13"},
+    {"id": "global:tool-quirk:001", "fact": "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN", "category": "tool-quirk", "domain": "global", "confidence": 0.95, "tags": ["git", "auth", "gitea", "token"], "source_count": 23, "first_seen": "2026-03-27", "last_confirmed": "2026-04-13", "related": ["global:pitfall:001"]},
+    {"id": "global:tool-quirk:002", "fact": "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer", "category": "tool-quirk", "domain": "global", "confidence": 0.9, "tags": ["git", "api", "gitea"], "source_count": 8, "first_seen": "2026-03-28", "last_confirmed": "2026-04-12"},
+    {"id": "global:tool-quirk:003", "fact": "Gitea Issues API type=issues param does NOT filter PRs", "category": "tool-quirk", "domain": "global", "confidence": 0.95, "tags": ["gitea", "api", "issues", "pr"], "source_count": 6, "first_seen": "2026-04-01", "last_confirmed": "2026-04-13"},
+    {"id": "global:tool-quirk:004", "fact": "~/.hermes is the default hermes home - check get_hermes_home() not the path literal", "category": "tool-quirk", "domain": "global", "confidence": 0.9, "tags": ["paths", "hermes", "env"], "source_count": 10, "first_seen": "2026-03-30", "last_confirmed": "2026-04-13", "related": ["hermes-agent:pitfall:005"]},
+    {"id": "global:tool-quirk:005", "fact": "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin", "category": "tool-quirk", "domain": "global", "confidence": 0.85, "tags": ["ansible", "vault", "config"], "source_count": 3, "first_seen": "2026-04-02"},
+    {"id": "global:tool-quirk:006", "fact": "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available", "category": "tool-quirk", "domain": "global", "confidence": 0.95, "tags": ["model", "provider", "nous", "default"], "source_count": 15, "first_seen": "2026-03-25", "last_confirmed": "2026-04-13"},
+    {"id": "global:tool-quirk:007", "fact": "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume", "category": "tool-quirk", "domain": "global", "confidence": 0.95, "tags": ["hermes", "model", "restart", "session"], "source_count": 8, "first_seen": "2026-03-29", "last_confirmed": "2026-04-12"},
+    {"id": "global:pitfall:001", "fact": "Branch protection requires 1 approval on main - API merges fail with 405 without it", "category": "pitfall",
```
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.95,
|
||||||
|
"tags": [
|
||||||
|
"git",
|
||||||
|
"merge",
|
||||||
|
"branch-protection",
|
||||||
|
"gitea"
|
||||||
|
],
|
||||||
|
"source_count": 12,
|
||||||
|
"first_seen": "2026-04-05",
|
||||||
|
"last_confirmed": "2026-04-13",
|
||||||
|
"related": [
|
||||||
|
"the-nexus:pitfall:001"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:002",
|
||||||
|
"fact": "Never use --no-verify on git commits",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.95,
|
||||||
|
"tags": [
|
||||||
|
"git",
|
||||||
|
"hooks",
|
||||||
|
"safety"
|
||||||
|
],
|
||||||
|
"source_count": 5,
|
||||||
|
"first_seen": "2026-03-28",
|
||||||
|
"last_confirmed": "2026-04-13"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:003",
|
||||||
|
"fact": "Gitea PR creation workaround needed on the-nexus - direct API call fails",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"tags": [
|
||||||
|
"gitea",
|
||||||
|
"pr",
|
||||||
|
"api",
|
||||||
|
"workaround"
|
||||||
|
],
|
||||||
|
"source_count": 4,
|
||||||
|
"first_seen": "2026-04-06",
|
||||||
|
"last_confirmed": "2026-04-12"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:004",
|
||||||
|
"fact": "Anthropic is BANNED from fallback chain",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.95,
|
||||||
|
"tags": [
|
||||||
|
"provider",
|
||||||
|
"anthropic",
|
||||||
|
"fallback"
|
||||||
|
],
|
||||||
|
"source_count": 7,
|
||||||
|
"first_seen": "2026-03-30",
|
||||||
|
"last_confirmed": "2026-04-13"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:005",
|
||||||
|
"fact": "Telegram tokens expired - don't assume Telegram notifications work",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.85,
|
||||||
|
"tags": [
|
||||||
|
"telegram",
|
||||||
|
"notifications",
|
||||||
|
"token"
|
||||||
|
],
|
||||||
|
"source_count": 3,
|
||||||
|
"first_seen": "2026-04-02"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:006",
|
||||||
|
"fact": "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"tags": [
|
||||||
|
"gateway",
|
||||||
|
"cron",
|
||||||
|
"process"
|
||||||
|
],
|
||||||
|
"source_count": 4,
|
||||||
|
"first_seen": "2026-04-04",
|
||||||
|
"last_confirmed": "2026-04-11"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:007",
|
||||||
|
"fact": "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"tags": [
|
||||||
|
"pytest",
|
||||||
|
"test",
|
||||||
|
"collection"
|
||||||
|
],
|
||||||
|
"source_count": 3,
|
||||||
|
"first_seen": "2026-04-07",
|
||||||
|
"last_confirmed": "2026-04-13"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "global:pitfall:008",
|
||||||
|
"fact": "TDD: test 1 before building 55",
|
||||||
|
"category": "pitfall",
|
||||||
|
"domain": "global",
|
||||||
|
"confidence": 0.95,
|
||||||
|
"tags": [
|
||||||
|
"tdd",
|
||||||
|
"testing",
|
||||||
|
"methodology"
|
||||||
|
],
|
||||||
|
"source_count": 8,
|
||||||
|
"first_seen": "2026-03-25",
|
||||||
|
"last_confirmed": "2026-04-13"
|
||||||
|
}
|
||||||
|
]
|
||||||
}
|
}
|
||||||
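The two Gitea auth quirks above (global:tool-quirk:001 and :002) can be sketched as a tiny helper. This is an illustrative sketch, not code from the repo; only the token path and the `token` header scheme come from the entries above:

```python
from pathlib import Path

# Token lives in a file, not in the GITEA_TOKEN env var (global:tool-quirk:001).
TOKEN_PATH = Path.home() / ".config" / "gitea" / "token"

def gitea_auth_header(token: str) -> dict:
    # Gitea's API expects "token <TOKEN>", not "Bearer <TOKEN>" (global:tool-quirk:002).
    return {"Authorization": f"token {token.strip()}"}

# Usage (assumes the token file exists):
# headers = gitea_auth_header(TOKEN_PATH.read_text())
```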
80
knowledge/repos/hermes-agent.yaml
Normal file
@@ -0,0 +1,80 @@
---
domain: hermes-agent
category: pitfall
version: 1
last_updated: "2026-04-13"
---

# Pitfalls (hermes-agent)

## Cron & Deployment

- id: hermes-agent:pitfall:001
  fact: "deploy-crons.py leaves jobs in mixed model format - some have provider/model, some just model"
  confidence: 0.95
  tags: [cron, deploy, model, config]
  source_count: 5
  first_seen: "2026-04-08"
  last_confirmed: "2026-04-13"
  related: [hermes-agent:pitfall:002, hermes-agent:pitfall:003]

- id: hermes-agent:pitfall:002
  fact: "deploy-crons.py --deploy doesn't set legacy skill field from skills list"
  confidence: 0.9
  tags: [cron, deploy, skills]
  source_count: 3
  first_seen: "2026-04-09"
  last_confirmed: "2026-04-13"
  related: [hermes-agent:pitfall:001]

- id: hermes-agent:pitfall:003
  fact: "Cron jobs with blank fallback_model fields trigger spurious gateway warnings"
  confidence: 0.9
  tags: [cron, model, fallback]
  source_count: 4
  first_seen: "2026-04-07"
  last_confirmed: "2026-04-12"
  related: [hermes-agent:pitfall:001]

- id: hermes-agent:pitfall:004
  fact: "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms"
  confidence: 0.9
  tags: [watchdog, model, config]
  source_count: 3
  first_seen: "2026-04-08"
  last_confirmed: "2026-04-13"

## Path & Environment

- id: hermes-agent:pitfall:005
  fact: "10+ files read HERMES_HOME directly instead of get_hermes_home() - breaks on custom paths"
  confidence: 0.85
  tags: [paths, env, hermes-home]
  source_count: 6
  first_seen: "2026-04-06"
  last_confirmed: "2026-04-12"
  related: [global:pitfall:002]

- id: hermes-agent:pitfall:006
  fact: "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set"
  confidence: 0.8
  tags: [paths, env, bug]
  source_count: 2
  first_seen: "2026-04-05"

## SSH & Dispatch

- id: hermes-agent:pitfall:007
  fact: "vps-agent-dispatch reports OK while remote hermes binary path is broken"
  confidence: 0.9
  tags: [ssh, dispatch, vps]
  source_count: 4
  first_seen: "2026-04-07"
  last_confirmed: "2026-04-11"

- id: hermes-agent:pitfall:008
  fact: "nightwatch-health-monitor SSH check fails on cloud-model-only deployments"
  confidence: 0.85
  tags: [ssh, health, cloud]
  source_count: 2
  first_seen: "2026-04-10"
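hermes-agent:pitfall:005 and :006 above describe the shape of the fix: route every lookup through one helper, and expand `~` there. A minimal sketch, assuming the real `get_hermes_home()` in the repo has roughly this signature:

```python
import os

def get_hermes_home() -> str:
    """Resolve the hermes home directory.

    Reads HERMES_HOME if set, falls back to the documented default, and
    expands '~' so HERMES_HOME=~/... works (hermes-agent:pitfall:006).
    """
    raw = os.environ.get("HERMES_HOME", "~/.hermes")
    return os.path.expanduser(raw)
```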
63
knowledge/repos/the-nexus.yaml
Normal file
@@ -0,0 +1,63 @@
---
domain: the-nexus
category: pitfall
version: 1
last_updated: "2026-04-13"
---

# Pitfalls (the-nexus)

## Git & Merging

- id: the-nexus:pitfall:001
  fact: "Merges fail with HTTP 405 due to branch protection - must use merge API with 1 approval"
  confidence: 0.95
  tags: [git, merge, branch-protection, gitea]
  source_count: 12
  first_seen: "2026-04-05"
  last_confirmed: "2026-04-13"
  related: [global:pitfall:001]

- id: the-nexus:pitfall:002
  fact: "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests"
  confidence: 0.95
  tags: [server, concurrency, bridge]
  source_count: 5
  first_seen: "2026-04-10"
  last_confirmed: "2026-04-13"

- id: the-nexus:pitfall:003
  fact: "ChatLog.log() crashes on message persistence when index.html has orphaned button tags"
  confidence: 0.9
  tags: [html, crash, chatlog]
  source_count: 3
  first_seen: "2026-04-12"
  last_confirmed: "2026-04-13"

## Three.js & Performance

- id: the-nexus:pitfall:004
  fact: "Three.js LOD not implemented - local hardware struggles with full scene without texture optimization"
  confidence: 0.85
  tags: [threejs, performance, lod]
  source_count: 4
  first_seen: "2026-04-09"
  last_confirmed: "2026-04-13"

- id: the-nexus:pitfall:005
  fact: "Duplicate content blocks appear in index.html when PR merges conflict silently"
  confidence: 0.8
  tags: [html, merge-conflict, duplicate]
  source_count: 3
  first_seen: "2026-04-11"
  last_confirmed: "2026-04-13"

## Deployment

- id: the-nexus:pitfall:006
  fact: "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS"
  confidence: 0.9
  tags: [deploy, websocket, http, cors]
  source_count: 4
  first_seen: "2026-04-10"
  last_confirmed: "2026-04-13"
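the-nexus:pitfall:002 above is a stdlib behavior worth illustrating: `http.server.HTTPServer` serves one request at a time, so a slow client stalls every other user, while `ThreadingHTTPServer` handles each request on its own thread. A minimal sketch (the handler and port are placeholders, not the bridge's real code):

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

# Plain HTTPServer would block here on concurrent requests;
# ThreadingHTTPServer (Python 3.7+) spawns a thread per request.
server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
```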
1
requirements.txt
Normal file
@@ -0,0 +1 @@
pytest>=8,<9
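global:pitfall:007 above notes that pytest's root collection picks up operational `*_test.py` scripts. The usual fix is to pin collection to the test tree in pytest's config; a minimal sketch (file name and repo layout assumed, not confirmed by this diff):

```ini
# pytest.ini at the repo root
[pytest]
testpaths = tests
```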
554
scripts/automation_opportunity_finder.py
Normal file
@@ -0,0 +1,554 @@
#!/usr/bin/env python3
"""
Automation Opportunity Finder — Scan fleet for manual processes that could be automated.

Analyzes:
1. Cron jobs — finds manual steps between scheduled tasks
2. Documentation — extracts TODO/FIXME/manual-step patterns
3. Scripts — detects repeated command sequences
4. Session transcripts — finds repeated tool-call patterns

Usage:
    python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes
    python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes --json
    python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes --output proposals.json
"""

import argparse
import json
import os
import re
import sys
from collections import Counter, defaultdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional


# ---------------------------------------------------------------------------
# Patterns that signal manual work
# ---------------------------------------------------------------------------

MANUAL_STEP_PATTERNS = [
    # Explicit manual markers
    (r"(?i)\bTODO[:\s]", "todo"),
    (r"(?i)\bFIXME[:\s]", "fixme"),
    (r"(?i)\bMANUAL[:\s]", "manual_step"),
    (r"(?i)\bHACK[:\s]", "hack"),
    (r"(?i)\bWORKAROUND[:\s]", "workaround"),
    # Step-by-step instructions in docs
    (r"(?i)^(\d+)[.\)]\s+(run|execute|ssh|scp|curl|cd|make|docker|ansible|git)", "sequential_step"),
    # Explicitly manual operations
    (r"(?i)\bmanually\b", "manual_keyword"),
    (r"(?i)\bby hand\b", "manual_keyword"),
    (r"(?i)\bdon'?t forget to\b", "manual_keyword"),
    (r"(?i)\bremember to\b", "manual_keyword"),
    (r"(?i)\bmake sure to\b", "manual_keyword"),
]

# Shell commands that appear frequently in runbooks — signal automatable workflows
SHELL_COMMAND_PATTERNS = [
    "ssh ", "scp ", "rsync ", "curl ", "wget ",
    "docker ", "docker-compose ", "kubectl ",
    "ansible-playbook ", "terraform ", "systemctl ",
    "systemctl restart", "systemctl status",
    "git push", "git pull", "git merge", "git checkout",
    "pip install", "npm install", "cargo build",
]

# Directories to skip during scans — large/uninteresting trees
EXCLUDE_DIRS = frozenset({
    "node_modules", "venv", ".venv", "__pycache__", ".git",
    "site-packages", "dist", "build", ".tox", ".mypy_cache",
    ".pytest_cache", "coverage", ".next", "vendor",
    "skills",  # hermes skills dir is huge
    "audio_cache", "skins", "profiles",
})

# Session tool calls that appear repeatedly — candidates for workflow automation
TOOL_SEQUENCE_MIN_OCCURRENCES = 3


# ---------------------------------------------------------------------------
# Analyzers
# ---------------------------------------------------------------------------

def analyze_cron_jobs(hermes_home: str) -> List[Dict[str, Any]]:
    """Analyze cron job definitions for automation gaps."""
    proposals = []
    cron_dir = Path(hermes_home) / "cron"
    jobs_file = cron_dir / "jobs.json"

    if not jobs_file.exists():
        # Try YAML format
        for ext in (".yaml", ".yml"):
            alt = cron_dir / f"jobs{ext}"
            if alt.exists():
                jobs_file = alt
                break

    if not jobs_file.exists():
        return proposals

    try:
        if jobs_file.suffix == ".json":
            with open(jobs_file) as f:
                jobs = json.load(f)
        else:
            try:
                import yaml
                with open(jobs_file) as f:
                    jobs = yaml.safe_load(f)
            except ImportError:
                return proposals
    except Exception:  # malformed JSON/YAML or unreadable file
        return proposals

    if not isinstance(jobs, list):
        return proposals

    # Look for disabled jobs (someone turned them off — might need a different approach)
    disabled = [j for j in jobs if not j.get("enabled", True)]
    if disabled:
        names = [j.get("name", j.get("id", "?")) for j in disabled[:5]]
        proposals.append({
            "category": "cron_disabled",
            "title": f"{len(disabled)} disabled cron job(s) may need automation rework",
            "description": f"These jobs were disabled: {', '.join(names)}. Investigate why and whether a different automation approach is needed.",
            "confidence": 0.7,
            "impact": "medium",
            "sources": [str(jobs_file)],
        })

    # Look for jobs with high error counts
    error_jobs = [j for j in jobs if j.get("last_status") == "error"]
    if error_jobs:
        names = [j.get("name", j.get("id", "?")) for j in error_jobs[:5]]
        proposals.append({
            "category": "cron_errors",
            "title": f"{len(error_jobs)} cron job(s) failing — may need automation rework",
            "description": f"Jobs with errors: {', '.join(names)}. Failure patterns suggest missing dependencies or fragile automation.",
            "confidence": 0.8,
            "impact": "high",
            "sources": [str(jobs_file)],
        })

    # Look for jobs with delivery errors (platform issues)
    delivery_errors = [j for j in jobs if j.get("last_delivery_error")]
    if delivery_errors:
        proposals.append({
            "category": "cron_delivery",
            "title": f"{len(delivery_errors)} cron job(s) have delivery failures",
            "description": "Delivery failures suggest missing retry logic or platform integration gaps.",
            "confidence": 0.75,
            "impact": "medium",
            "sources": [str(jobs_file)],
        })

    # Look for jobs on short intervals that could be event-driven
    for job in jobs:
        schedule = job.get("schedule", "")
        # Check for very frequent schedules (every minute, every 5 min)
        if isinstance(schedule, str) and re.match(r"^\*/([1-5])\s", schedule):
            proposals.append({
                "category": "cron_frequency",
                "title": f"Job '{job.get('name', job.get('id', '?'))}' runs every {schedule.split()[0]} — consider event-driven",
                "description": f"High-frequency cron ({schedule}) may be better as event-driven or daemon.",
                "confidence": 0.6,
                "impact": "low",
                "sources": [str(jobs_file)],
            })

    return proposals


def analyze_documents(root_dirs: List[str]) -> List[Dict[str, Any]]:
    """Scan documentation for manual step patterns."""
    proposals = []
    doc_extensions = {".md", ".txt", ".rst", ".adoc"}
    findings_by_category = defaultdict(list)

    for root_dir in root_dirs:
        root = Path(root_dir)
        if not root.exists():
            continue

        for path in root.rglob("*"):
            if path.is_dir():
                continue
            if path.suffix not in doc_extensions:
                continue
            # Skip excluded dirs and hidden dirs (rglob results are always under root)
            parts = path.relative_to(root).parts
            if any(p.startswith(".") or p in EXCLUDE_DIRS for p in parts):
                continue
            if len(parts) > 8:
                continue

            try:
                content = path.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            lines = content.split("\n")
            for i, line in enumerate(lines):
                stripped = line.strip()
                if not stripped:
                    continue

                for pattern, category in MANUAL_STEP_PATTERNS:
                    if re.search(pattern, stripped):
                        findings_by_category[category].append({
                            "file": str(path),
                            "line": i + 1,
                            "text": stripped[:200],
                        })

    # Generate proposals from findings
    for category, findings in findings_by_category.items():
        if len(findings) < 2:
            continue

        file_count = len(set(f["file"] for f in findings))
        proposals.append({
            "category": f"manual_{category}",
            "title": f"{len(findings)} '{category}' markers across {file_count} doc(s)",
            "description": f"Found in: {', '.join(set(Path(f['file']).name for f in findings[:5]))}",
            "confidence": 0.65,
            "impact": "medium",
            "sources": list(set(f["file"] for f in findings[:10])),
            "details": findings[:5],  # sample
        })

    return proposals


def analyze_scripts(root_dirs: List[str]) -> List[Dict[str, Any]]:
    """Detect repeated command sequences in scripts."""
    proposals = []
    script_extensions = {".py", ".sh", ".bash", ".zsh"}
    command_counter = Counter()
    command_locations = defaultdict(list)

    for root_dir in root_dirs:
        root = Path(root_dir)
        if not root.exists():
            continue

        for path in root.rglob("*"):
            if path.is_dir():
                continue
            if path.suffix not in script_extensions:
                continue
            parts = path.relative_to(root).parts
            if any(p.startswith(".") or p in EXCLUDE_DIRS for p in parts):
                continue
            if len(parts) > 8:
                continue

            try:
                content = path.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            lines = content.split("\n")
            for i, line in enumerate(lines):
                stripped = line.strip()
                if not stripped or stripped.startswith("#"):
                    continue

                for cmd_prefix in SHELL_COMMAND_PATTERNS:
                    if cmd_prefix in stripped:
                        # Normalize the command; stop at the first match so
                        # overlapping patterns ("systemctl " / "systemctl restart")
                        # don't count the same line twice
                        normalized = re.sub(r"\s+", " ", stripped)[:120]
                        command_counter[normalized] += 1
                        command_locations[normalized].append(f"{path}:{i+1}")
                        break

    # Proposals for commands appearing 3+ times
    for cmd, count in command_counter.most_common(20):
        if count < 3:
            break
        locs = command_locations[cmd]
        file_count = len(set(loc.split(":")[0] for loc in locs))
        proposals.append({
            "category": "repeated_command",
            "title": f"Command repeated {count}x across {file_count} file(s): {cmd[:80]}",
            "description": f"Locations: {', '.join(locs[:3])}",
            "confidence": min(0.5 + (count * 0.1), 0.95),
            "impact": "medium",
            "sources": list(set(loc.split(":")[0] for loc in locs)),
        })

    return proposals


def analyze_session_transcripts(session_dirs: List[str]) -> List[Dict[str, Any]]:
    """Find repeated tool-call patterns in session transcripts."""
    proposals = []
    tool_sequence_counter = Counter()
    tool_sequence_examples = {}

    for session_dir in session_dirs:
        session_path = Path(session_dir)
        if not session_path.exists():
            continue

        for path in session_path.rglob("*.jsonl"):
            try:
                content = path.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            # Extract tool calls in sequence
            tool_sequence = []
            for line in content.split("\n"):
                line = line.strip()
                if not line:
                    continue
                try:
                    msg = json.loads(line)
                except json.JSONDecodeError:
                    continue

                # Look for tool calls in assistant messages
                if msg.get("role") == "assistant" and msg.get("tool_calls"):
                    for tc in msg["tool_calls"]:
                        func_name = tc.get("function", {}).get("name", "?")
                        tool_sequence.append(func_name)

            # Find 2-call sequences
            for i in range(len(tool_sequence) - 1):
                seq = (tool_sequence[i], tool_sequence[i + 1])
                tool_sequence_counter[seq] += 1
                if seq not in tool_sequence_examples:
                    tool_sequence_examples[seq] = str(path.name)

            # Find 3-call sequences
            for i in range(len(tool_sequence) - 2):
                seq = (tool_sequence[i], tool_sequence[i + 1], tool_sequence[i + 2])
                tool_sequence_counter[seq] += 1
                if seq not in tool_sequence_examples:
                    tool_sequence_examples[seq] = str(path.name)

    # Generate proposals for frequently repeated sequences
    for seq, count in tool_sequence_counter.most_common(20):
        if count < TOOL_SEQUENCE_MIN_OCCURRENCES:
            break
        seq_str = " -> ".join(seq)
        proposals.append({
            "category": "tool_sequence",
            "title": f"Tool sequence '{seq_str}' repeated {count} times",
            "description": "Consider creating a workflow/skill that automates this sequence.",
            "confidence": min(0.5 + (count * 0.05), 0.9),
            "impact": "medium",
            "sources": [tool_sequence_examples.get(seq, "unknown")],
        })

    return proposals


def analyze_shell_history(root_dirs: List[str]) -> List[Dict[str, Any]]:
    """Find repeated shell commands from history files."""
    proposals = []
    command_counter = Counter()

    for root_dir in root_dirs:
        root = Path(root_dir)
        history_files = []

        # Look for shell history files
        for name in (".bash_history", ".zsh_history", ".python_history"):
            p = root / name
            if p.exists():
                history_files.append(p)

        # Also check in hermes home
        for p in root.glob("**/*history*"):
            if p.is_file() and p.suffix in ("", ".txt", ".log"):
                history_files.append(p)

        # De-duplicate (the glob can re-match the files found above) while keeping order
        for hf in dict.fromkeys(history_files):
            try:
                content = hf.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            for line in content.split("\n"):
                stripped = line.strip()
                # ZSH history format: ": 1234567890:0;command"
                stripped = re.sub(r"^:\s*\d+:\d+;", "", stripped)
                if not stripped or len(stripped) < 5:
                    continue
                # Skip trivial commands
                if stripped in ("ls", "cd", "pwd", "clear", "exit"):
                    continue
                command_counter[stripped] += 1

    for cmd, count in command_counter.most_common(10):
        if count < 5:
            break
        proposals.append({
            "category": "shell_repetition",
            "title": f"Shell command run {count}+ times: {cmd[:80]}",
            "description": "Frequently repeated shell command — candidate for alias, function, or script.",
            "confidence": min(0.4 + (count * 0.05), 0.85),
            "impact": "low",
            "sources": ["shell_history"],
        })

    return proposals


# ---------------------------------------------------------------------------
# Proposal output
# ---------------------------------------------------------------------------

def deduplicate_proposals(proposals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Remove duplicate proposals based on title similarity."""
    seen_titles = set()
    unique = []
    for p in proposals:
        # Normalize title for dedup
        key = re.sub(r"\d+", "N", p["title"]).lower()
        if key not in seen_titles:
            seen_titles.add(key)
            unique.append(p)
    return unique


def rank_proposals(proposals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort proposals by impact * confidence (highest first)."""
    impact_weight = {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}
    return sorted(
        proposals,
        key=lambda p: impact_weight.get(p.get("impact", "low"), 0.2) * p.get("confidence", 0.5),
        reverse=True,
    )


def format_text_report(proposals: List[Dict[str, Any]]) -> str:
    """Format proposals as human-readable text."""
    if not proposals:
        return "No automation opportunities found."

    lines = [
        "=" * 70,
        " AUTOMATION OPPORTUNITY REPORT",
        f" Generated: {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}",
        f" Proposals: {len(proposals)}",
        "=" * 70,
        "",
    ]

    for i, p in enumerate(proposals, 1):
        score = p.get("confidence", 0.5) * {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}.get(p.get("impact", "low"), 0.2)
        lines.append(f"[{i}] {p['title']}")
        lines.append(f"    Category: {p['category']} | Impact: {p.get('impact', '?')} | Confidence: {p.get('confidence', 0):.0%} | Score: {score:.2f}")
        lines.append(f"    {p['description']}")
        if p.get("sources"):
            lines.append(f"    Sources: {', '.join(p['sources'][:3])}")
        lines.append("")

    # Summary by category
    cat_counts = Counter(p["category"] for p in proposals)
    lines.append("-" * 70)
    lines.append("Summary by category:")
    for cat, count in cat_counts.most_common():
        lines.append(f"  {cat}: {count}")

    return "\n".join(lines)


# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(description="Find automation opportunities across the fleet")
    parser.add_argument("--hermes-home", default=os.path.expanduser("~/.hermes"),
                        help="Path to hermes home directory (default: ~/.hermes)")
    parser.add_argument("--scan-dirs", nargs="*",
                        help="Additional directories to scan (default: hermes-home + cwd)")
    parser.add_argument("--session-dirs", nargs="*",
                        help="Session transcript directories (default: hermes-home/sessions)")
    parser.add_argument("--json", action="store_true", help="Output as JSON")
    parser.add_argument("--output", "-o", help="Write proposals to file")
    parser.add_argument("--min-confidence", type=float, default=0.3,
                        help="Minimum confidence threshold (default: 0.3)")
    parser.add_argument("--categories", nargs="*",
                        help="Only include these categories (cron, docs, scripts, sessions, shell)")
    args = parser.parse_args()

    hermes_home = os.path.expanduser(args.hermes_home)

    # Default scan directories
    scan_dirs = [hermes_home, "."]
    if args.scan_dirs:
        scan_dirs.extend(args.scan_dirs)

    session_dirs = [os.path.join(hermes_home, "sessions")]
    if args.session_dirs:
        session_dirs.extend(args.session_dirs)

    # Also check common session locations
    for subdir in ("transcripts", "session-db"):
        p = os.path.join(hermes_home, subdir)
        if os.path.isdir(p):
            session_dirs.append(p)

    categories = set(args.categories) if args.categories else {"cron", "docs", "scripts", "sessions", "shell"}

    # Run analyzers
    all_proposals = []

    if "cron" in categories:
        all_proposals.extend(analyze_cron_jobs(hermes_home))

    if "docs" in categories:
        all_proposals.extend(analyze_documents(scan_dirs))

    if "scripts" in categories:
        all_proposals.extend(analyze_scripts(scan_dirs))

    if "sessions" in categories:
        all_proposals.extend(analyze_session_transcripts(session_dirs))

    if "shell" in categories:
        all_proposals.extend(analyze_shell_history(scan_dirs))

    # Deduplicate and rank
    all_proposals = deduplicate_proposals(all_proposals)
    all_proposals = rank_proposals(all_proposals)

    # Filter by confidence
    all_proposals = [p for p in all_proposals if p.get("confidence", 0) >= args.min_confidence]

    # Output
    output = {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "hermes_home": hermes_home,
        "scan_dirs": scan_dirs,
        "total_proposals": len(all_proposals),
        "proposals": all_proposals,
    }

    if args.json:
        result = json.dumps(output, indent=2)
    else:
        result = format_text_report(all_proposals)

    if args.output:
        with open(args.output, "w") as f:
            if args.json:
                json.dump(output, f, indent=2)
            else:
                f.write(result)
        print(f"Written to {args.output}", file=sys.stderr)
|
||||||
|
else:
|
||||||
|
print(result)
|
||||||
|
|
||||||
|
return 0 if all_proposals else 1
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
sys.exit(main())
|
||||||
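The confidence filter and JSON envelope that `main()` assembles can be exercised in isolation; a minimal sketch, with hypothetical sample proposals standing in for real analyzer output:

```python
import json
from datetime import datetime, timezone

# Hypothetical analyzer output, mimicking the proposal dicts built in main().
proposals = [
    {"category": "cron", "title": "Batch overlapping cron jobs", "confidence": 0.8},
    {"category": "shell", "title": "Alias a rarely-used command", "confidence": 0.2},
]

min_confidence = 0.3
kept = [p for p in proposals if p.get("confidence", 0) >= min_confidence]

output = {
    "generated_at": datetime.now(timezone.utc).isoformat(),
    "total_proposals": len(kept),
    "proposals": kept,
}
print(json.dumps(output, indent=2))  # only the cron proposal survives the filter
```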
359 scripts/bootstrapper.py Normal file
@@ -0,0 +1,359 @@
#!/usr/bin/env python3
"""
Bootstrapper — assemble pre-session context from knowledge store.

Reads the knowledge store and produces a compact context block (2k tokens max)
that can be injected into a new session so it starts with situational awareness.

Usage:
    python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
    python3 bootstrapper.py --repo timmy-home --global
    python3 bootstrapper.py --global
    python3 bootstrapper.py --repo the-nexus --max-tokens 1000
"""

import argparse
import json
import sys
from pathlib import Path
from typing import Optional

# Resolve knowledge root relative to this script's parent
SCRIPT_DIR = Path(__file__).resolve().parent
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"
INDEX_PATH = KNOWLEDGE_DIR / "index.json"

# Approximate token count: ~4 chars per token for English text
CHARS_PER_TOKEN = 4

# Category sort priority (lower = shown first)
CATEGORY_PRIORITY = {
    "pitfall": 0,
    "tool-quirk": 1,
    "pattern": 2,
    "fact": 3,
    "question": 4,
}


def load_index(index_path: Path = INDEX_PATH) -> dict:
    """Load and validate the knowledge index."""
    if not index_path.exists():
        return {"version": 1, "total_facts": 0, "facts": []}

    with open(index_path) as f:
        data = json.load(f)

    if "facts" not in data:
        print("WARNING: index.json missing 'facts' key", file=sys.stderr)
        return {"version": 1, "total_facts": 0, "facts": []}

    return data


def filter_facts(
    facts: list[dict],
    repo: Optional[str] = None,
    agent: Optional[str] = None,
    include_global: bool = True,
) -> list[dict]:
    """Filter facts by repo, agent, and global scope."""
    filtered = []

    for fact in facts:
        fact_repo = fact.get("repo", "global")
        fact_agent = fact.get("agent", "")

        # Match by repo (regardless of agent)
        if repo and fact_repo == repo:
            filtered.append(fact)
            continue

        # Match by exact agent type
        if agent and fact_agent == agent:
            filtered.append(fact)
            continue

        # Include global facts without agent restriction (universal facts)
        if include_global and fact_repo == "global" and not fact_agent:
            filtered.append(fact)

    return filtered


def sort_facts(facts: list[dict]) -> list[dict]:
    """
    Sort facts by: confidence (desc), then category priority, then fact text.
    Most reliable and most dangerous facts come first.
    """

    def sort_key(f):
        confidence = f.get("confidence", 0.5)
        category = f.get("category", "fact")
        cat_priority = CATEGORY_PRIORITY.get(category, 5)
        return (-confidence, cat_priority, f.get("fact", ""))

    return sorted(facts, key=sort_key)


def load_repo_knowledge(repo: str) -> Optional[str]:
    """Load per-repo knowledge markdown if it exists."""
    repo_path = KNOWLEDGE_DIR / "repos" / f"{repo}.md"
    if repo_path.exists():
        return repo_path.read_text().strip()
    return None


def load_agent_knowledge(agent: str) -> Optional[str]:
    """Load per-agent knowledge markdown if it exists."""
    agent_path = KNOWLEDGE_DIR / "agents" / f"{agent}.md"
    if agent_path.exists():
        return agent_path.read_text().strip()
    return None


def load_global_knowledge() -> list[str]:
    """Load all global knowledge markdown files."""
    global_dir = KNOWLEDGE_DIR / "global"
    if not global_dir.exists():
        return []

    chunks = []
    for md_file in sorted(global_dir.glob("*.md")):
        content = md_file.read_text().strip()
        if content:
            chunks.append(content)
    return chunks


def render_facts_section(facts: list[dict], category: str, label: str) -> str:
    """Render a section of facts for a single category."""
    cat_facts = [f for f in facts if f.get("category") == category]
    if not cat_facts:
        return ""

    lines = [f"### {label}\n"]
    for f in cat_facts:
        conf = f.get("confidence", 0.5)
        fact_text = f.get("fact", "")
        repo_tag = f.get("repo", "")
        if repo_tag and repo_tag != "global":
            lines.append(f"- [{conf:.0%}] ({repo_tag}) {fact_text}")
        else:
            lines.append(f"- [{conf:.0%}] {fact_text}")

    return "\n".join(lines) + "\n"


def estimate_tokens(text: str) -> int:
    """Rough token estimate."""
    return len(text) // CHARS_PER_TOKEN


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Truncate text to approximately max_tokens, cutting at line boundaries."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text

    # Cut at last newline before the limit
    truncated = text[:max_chars]
    last_newline = truncated.rfind("\n")
    if last_newline > 0:
        truncated = truncated[:last_newline]

    return truncated + "\n\n[... truncated to fit context window ...]"


def build_bootstrap_context(
    repo: Optional[str] = None,
    agent: Optional[str] = None,
    include_global: bool = True,
    max_tokens: int = 2000,
    index_path: Path = INDEX_PATH,
) -> str:
    """
    Build the full bootstrap context block.

    Returns a markdown string suitable for injection into a session prompt.
    """
    index = load_index(index_path)
    facts = index.get("facts", [])

    # Filter
    filtered = filter_facts(facts, repo=repo, agent=agent, include_global=include_global)

    # Sort
    sorted_facts = sort_facts(filtered)

    # Build sections
    sections = ["## What You Know (bootstrapped)\n"]

    # Per-repo markdown knowledge
    if repo:
        repo_md = load_repo_knowledge(repo)
        if repo_md:
            sections.append(f"### Repo Notes: {repo}\n")
            sections.append(repo_md + "\n")

    # Structured facts by category
    if sorted_facts:
        # Group by source
        repo_facts = [f for f in sorted_facts if f.get("repo") == repo] if repo else []
        global_facts = [f for f in sorted_facts if f.get("repo") == "global"]
        agent_facts = [f for f in sorted_facts if f.get("agent") == agent] if agent else []

        if repo_facts:
            sections.append(f"### Repo: {repo}\n")
            for cat, label in [
                ("pitfall", "PITFALLS"),
                ("tool-quirk", "QUIRKS"),
                ("pattern", "PATTERNS"),
                ("fact", "FACTS"),
                ("question", "OPEN QUESTIONS"),
            ]:
                section = render_facts_section(repo_facts, cat, label)
                if section:
                    sections.append(section)

        if global_facts:
            sections.append("### Global\n")
            for cat, label in [
                ("pitfall", "PITFALLS"),
                ("tool-quirk", "QUIRKS"),
                ("pattern", "PATTERNS"),
                ("fact", "FACTS"),
            ]:
                section = render_facts_section(global_facts, cat, label)
                if section:
                    sections.append(section)

        if agent_facts:
            sections.append(f"### Agent Notes ({agent})\n")
            for cat, label in [
                ("pitfall", "PITFALLS"),
                ("tool-quirk", "QUIRKS"),
                ("pattern", "PATTERNS"),
                ("fact", "FACTS"),
            ]:
                section = render_facts_section(agent_facts, cat, label)
                if section:
                    sections.append(section)

    # Per-agent markdown knowledge
    if agent:
        agent_md = load_agent_knowledge(agent)
        if agent_md:
            sections.append(f"### Agent Profile: {agent}\n")
            sections.append(agent_md + "\n")

    # Global markdown knowledge
    global_chunks = load_global_knowledge()
    if global_chunks:
        sections.append("### Global Notes\n")
        sections.extend(chunk + "\n" for chunk in global_chunks)

    # If nothing was found
    if len(sections) == 1:
        sections.append("_No relevant knowledge found. Starting fresh._\n")
        if not facts:
            sections.append(
                "_Knowledge store is empty. Run the harvester to populate it._\n"
            )

    # Join and truncate
    context = "\n".join(sections)
    context = truncate_to_tokens(context, max_tokens)

    return context


def main():
    parser = argparse.ArgumentParser(
        description="Assemble pre-session context from knowledge store"
    )
    parser.add_argument(
        "--repo",
        type=str,
        default=None,
        help="Repository name to filter facts by",
    )
    parser.add_argument(
        "--agent",
        type=str,
        default=None,
        help="Agent type to filter facts by (e.g., mimo-sprint, groq-fast)",
    )
    parser.add_argument(
        "--global",
        dest="include_global",
        action="store_true",
        default=True,
        help="Include global facts (default: true)",
    )
    parser.add_argument(
        "--no-global",
        dest="include_global",
        action="store_false",
        help="Exclude global facts",
    )
    parser.add_argument(
        "--max-tokens",
        type=int,
        default=2000,
        help="Maximum token count for output (default: 2000)",
    )
    parser.add_argument(
        "--index",
        type=str,
        default=None,
        help="Path to index.json (default: knowledge/index.json)",
    )
    parser.add_argument(
        "--json",
        dest="output_json",
        action="store_true",
        help="Output raw JSON instead of markdown",
    )

    args = parser.parse_args()

    index_path = Path(args.index) if args.index else INDEX_PATH

    if args.output_json:
        # JSON mode: return the filtered, sorted facts
        index = load_index(index_path)
        facts = index.get("facts", [])
        filtered = filter_facts(
            facts,
            repo=args.repo,
            agent=args.agent,
            include_global=args.include_global,
        )
        sorted_facts = sort_facts(filtered)
        output = {
            "repo": args.repo,
            "agent": args.agent,
            "include_global": args.include_global,
            "total_indexed": len(facts),
            "matched": len(sorted_facts),
            "facts": sorted_facts,
        }
        print(json.dumps(output, indent=2))
    else:
        # Markdown mode: full bootstrap context
        context = build_bootstrap_context(
            repo=args.repo,
            agent=args.agent,
            include_global=args.include_global,
            max_tokens=args.max_tokens,
            index_path=index_path,
        )
        print(context)

    return 0


if __name__ == "__main__":
    sys.exit(main())
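The filter-then-sort pipeline at the heart of bootstrapper.py can be demonstrated standalone; this sketch inlines simplified versions of `filter_facts` and the sort key, with hypothetical sample facts (the real script loads them from knowledge/index.json):

```python
# Simplified, self-contained sketch of bootstrapper.py's filter/sort pipeline.
CATEGORY_PRIORITY = {"pitfall": 0, "tool-quirk": 1, "pattern": 2, "fact": 3, "question": 4}

# Hypothetical sample facts.
facts = [
    {"fact": "CI runner needs a PAT", "repo": "the-nexus", "category": "pitfall", "confidence": 0.9},
    {"fact": "Prefer make test", "repo": "global", "category": "pattern", "confidence": 0.9},
    {"fact": "Unrelated detail", "repo": "other-repo", "category": "fact", "confidence": 0.8},
]

def filter_facts(facts, repo=None, include_global=True):
    """Keep facts scoped to the given repo, plus unscoped global facts."""
    kept = []
    for f in facts:
        if repo and f.get("repo") == repo:
            kept.append(f)
        elif include_global and f.get("repo", "global") == "global" and not f.get("agent", ""):
            kept.append(f)
    return kept

def sort_key(f):
    # Highest confidence first, then most dangerous category first.
    return (-f.get("confidence", 0.5),
            CATEGORY_PRIORITY.get(f.get("category", "fact"), 5),
            f.get("fact", ""))

selected = sorted(filter_facts(facts, repo="the-nexus"), key=sort_key)
print([f["fact"] for f in selected])  # → ['CI runner needs a PAT', 'Prefer make test']
```

The "other-repo" fact is dropped by the filter; the two survivors tie on confidence, so the pitfall outranks the pattern.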
282 scripts/dead_code_detector.py Normal file
@@ -0,0 +1,282 @@
#!/usr/bin/env python3
"""
Dead Code Detector for Python Codebases

AST-based analysis to find defined but never-called functions and classes.
Excludes entry points, plugin hooks, __init__ exports.

Usage:
    python3 scripts/dead_code_detector.py /path/to/repo/
    python3 scripts/dead_code_detector.py hermes-agent/ --format json
    python3 scripts/dead_code_detector.py . --exclude tests/,venv/

Output: file:line, function/class name, last git author (if available)
"""

import argparse
import ast
import json
import os
import subprocess
import sys
from collections import defaultdict
from pathlib import Path
from typing import Optional


# Names that are expected to be unused (entry points, protocol methods, etc.)
SAFE_UNUSED_PATTERNS = {
    # Python dunders
    "__init__", "__str__", "__repr__", "__eq__", "__hash__", "__len__",
    "__getitem__", "__setitem__", "__contains__", "__iter__", "__next__",
    "__enter__", "__exit__", "__call__", "__bool__", "__del__",
    "__post_init__", "__class_getitem__",
    # Common entry points
    "main", "app", "handler", "setup", "teardown", "fixture",
    # pytest
    "conftest", "test_", "pytest_",  # prefix patterns
    # Protocols / abstract
    "abstractmethod", "abc_",
}


def is_safe_unused(name: str, filepath: str) -> bool:
    """Check if an unused name is expected to be unused."""
    # Test files are exempt
    if "test" in filepath.lower():
        return True

    # Known patterns
    for pattern in SAFE_UNUSED_PATTERNS:
        if name.startswith(pattern) or name == pattern:
            return True

    # __init__.py exports are often unused internally
    if filepath.endswith("__init__.py"):
        return True

    return False


def get_git_blame(filepath: str, lineno: int) -> Optional[str]:
    """Get last author of a line via git blame."""
    try:
        result = subprocess.run(
            ["git", "blame", "-L", f"{lineno},{lineno}", "--porcelain", filepath],
            capture_output=True, text=True, timeout=5
        )
        for line in result.stdout.split("\n"):
            if line.startswith("author "):
                return line[7:]
    except Exception:
        pass
    return None


class DefinitionCollector(ast.NodeVisitor):
    """Collect all function and class definitions."""

    def __init__(self):
        self.definitions = []  # (name, type, lineno)

    def visit_FunctionDef(self, node):
        self.definitions.append((node.name, "function", node.lineno))
        self.generic_visit(node)

    def visit_AsyncFunctionDef(self, node):
        self.definitions.append((node.name, "async_function", node.lineno))
        self.generic_visit(node)

    def visit_ClassDef(self, node):
        self.definitions.append((node.name, "class", node.lineno))
        self.generic_visit(node)


class NameUsageCollector(ast.NodeVisitor):
    """Collect all name references (calls, imports, attribute access)."""

    def __init__(self):
        self.names = set()
        self.calls = set()
        self.imports = set()

    def visit_Name(self, node):
        self.names.add(node.id)
        self.generic_visit(node)

    def visit_Attribute(self, node):
        if isinstance(node.value, ast.Name):
            self.names.add(node.value.id)
        self.generic_visit(node)

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            self.calls.add(node.func.id)
        elif isinstance(node.func, ast.Attribute):
            if isinstance(node.func.value, ast.Name):
                self.names.add(node.func.value.id)
            self.calls.add(node.func.attr)
        self.generic_visit(node)

    def visit_Import(self, node):
        for alias in node.names:
            self.imports.add(alias.asname or alias.name)
        self.generic_visit(node)

    def visit_ImportFrom(self, node):
        for alias in node.names:
            self.imports.add(alias.asname or alias.name)
        self.generic_visit(node)


def analyze_file(filepath: str) -> dict:
    """Analyze a single Python file for dead code."""
    path = Path(filepath)
    try:
        content = path.read_text()
        tree = ast.parse(content, filename=str(filepath))
    except (SyntaxError, UnicodeDecodeError):
        return {"error": f"Could not parse {filepath}"}

    # Collect definitions
    def_collector = DefinitionCollector()
    def_collector.visit(tree)
    definitions = def_collector.definitions

    # Collect usage
    usage_collector = NameUsageCollector()
    usage_collector.visit(tree)
    used_names = usage_collector.names | usage_collector.calls | usage_collector.imports

    # Also scan the entire repo for references to this file's definitions
    # (this is done at the repo level, not file level)

    dead = []
    for name, def_type, lineno in definitions:
        if name.startswith("_") and not name.startswith("__"):
            # Private functions — might be used externally, less likely dead
            pass

        if name not in used_names:
            if not is_safe_unused(name, filepath):
                dead.append({
                    "name": name,
                    "type": def_type,
                    "file": filepath,
                    "line": lineno,
                })

    return {"definitions": len(definitions), "dead": dead}


def scan_repo(repo_path: str, exclude_patterns: list = None) -> dict:
    """Scan an entire repo for dead code."""
    path = Path(repo_path)
    exclude = exclude_patterns or ["venv", ".venv", "node_modules", "__pycache__",
                                   ".git", "dist", "build", ".tox", "vendor"]

    all_definitions = {}  # name -> [{file, line, type}]
    all_files = []
    dead_code = []

    # First pass: collect all definitions across repo
    for fpath in path.rglob("*.py"):
        parts = fpath.parts
        if any(ex in parts for ex in exclude):
            continue
        if fpath.name.startswith("."):
            continue

        try:
            content = fpath.read_text(errors="ignore")
            tree = ast.parse(content, filename=str(fpath))
        except Exception:
            continue

        all_files.append(str(fpath))
        collector = DefinitionCollector()
        collector.visit(tree)

        for name, def_type, lineno in collector.definitions:
            rel_path = str(fpath.relative_to(path))
            if name not in all_definitions:
                all_definitions[name] = []
            all_definitions[name].append({
                "file": rel_path,
                "line": lineno,
                "type": def_type,
            })

    # Second pass: check each name for usage across entire repo
    all_used_names = set()
    for fpath_str in all_files:
        try:
            content = Path(fpath_str).read_text(errors="ignore")
            tree = ast.parse(content)
        except Exception:
            continue
        usage = NameUsageCollector()
        usage.visit(tree)
        all_used_names.update(usage.names)
        all_used_names.update(usage.calls)
        all_used_names.update(usage.imports)

    # Find dead code
    for name, locations in all_definitions.items():
        if name not in all_used_names:
            for loc in locations:
                if not is_safe_unused(name, loc["file"]):
                    dead_code.append({
                        "name": name,
                        "type": loc["type"],
                        "file": loc["file"],
                        "line": loc["line"],
                    })

    return {
        "repo": path.name,
        "files_scanned": len(all_files),
        "total_definitions": sum(len(v) for v in all_definitions.values()),
        "dead_code_count": len(dead_code),
        "dead_code": sorted(dead_code, key=lambda x: (x["file"], x["line"])),
    }


def main():
    parser = argparse.ArgumentParser(description="Find dead code in Python codebases")
    parser.add_argument("repo", help="Repository path to scan")
    parser.add_argument("--format", choices=["text", "json"], default="text")
    parser.add_argument("--exclude", help="Comma-separated patterns to exclude")
    parser.add_argument("--git-blame", action="store_true", help="Include git blame info")
    args = parser.parse_args()

    exclude = args.exclude.split(",") if args.exclude else None
    result = scan_repo(args.repo, exclude)

    if args.format == "json":
        print(json.dumps(result, indent=2))
    else:
        print(f"Dead Code Report: {result['repo']}")
        print(f"Files scanned: {result['files_scanned']}")
        print(f"Total definitions: {result['total_definitions']}")
        print(f"Dead code found: {result['dead_code_count']}")
        print()

        if result["dead_code"]:
            print(f"{'File':<45} {'Line':>4} {'Type':<10} {'Name'}")
            print("-" * 85)
            for item in result["dead_code"]:
                author = ""
                if args.git_blame:
                    author = get_git_blame(
                        os.path.join(args.repo, item["file"]),
                        item["line"]
                    ) or ""
                    author = f" ({author})" if author else ""
                print(f"{item['file']:<45} {item['line']:>4} {item['type']:<10} {item['name']}{author}")
        else:
            print("No dead code detected!")


if __name__ == "__main__":
    main()
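The detector's two-pass idea (collect definitions, then collect name references, then diff the sets) can be shown on a single source string; this sketch uses `ast.walk` instead of the `NodeVisitor` classes above, which is coarser but captures the core technique:

```python
import ast

src = """
def used():
    return 1

def unused():
    return 2

x = used()
"""

tree = ast.parse(src)

# Pass 1: every function/class definition in the module.
defined = {
    node.name
    for node in ast.walk(tree)
    if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
}

# Pass 2: every name that is read (function calls load the name too).
referenced = {
    node.id
    for node in ast.walk(tree)
    if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
}

print(sorted(defined - referenced))  # → ['unused']
```

Defined-but-never-referenced names are the dead-code candidates; the full script then subtracts the `SAFE_UNUSED_PATTERNS` allowlist from that set.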
249 scripts/dependency_graph.py Normal file
@@ -0,0 +1,249 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Cross-Repo Dependency Graph Builder
|
||||||
|
|
||||||
|
Scans repos for import/require/reference patterns and builds a directed
|
||||||
|
dependency graph. Detects circular dependencies. Outputs DOT and Mermaid.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python3 scripts/dependency_graph.py /path/to/repos/
|
||||||
|
python3 scripts/dependency_graph.py --repos repo1,repo2,repo3 --format mermaid
|
||||||
|
python3 scripts/dependency_graph.py --repos-dir /path/to/ --format dot --output deps.dot
|
||||||
|
|
||||||
|
Patterns detected:
|
||||||
|
- Python: import X, from X import Y
|
||||||
|
- JavaScript: require("X"), import ... from "X"
|
||||||
|
- Go: import "X"
|
||||||
|
- Ansible: include_role, import_role
|
||||||
|
- Docker/Compose: image: X, depends_on
|
||||||
|
- Config references: repo-name in YAML/TOML/JSON
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import json
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import sys
|
||||||
|
from collections import defaultdict
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
|
||||||
|
# Known repo names for matching
|
||||||
|
KNOWN_REPOS = [
|
||||||
|
"hermes-agent", "timmy-config", "timmy-home", "the-nexus", "the-door",
|
||||||
|
"the-beacon", "fleet-ops", "burn-fleet", "timmy-dispatch", "turboquant",
|
||||||
|
"compounding-intelligence", "the-playground", "second-son-of-timmy",
|
||||||
|
"ai-safety-review", "the-echo-pattern", "timmy-academy", "wolf",
|
||||||
|
"the-testament",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def normalize_repo_name(name: str) -> str:
|
||||||
|
"""Normalize a repo name for comparison."""
|
||||||
|
return name.lower().replace("_", "-").replace(".git", "").strip()
|
||||||
|
|
||||||
|
|
||||||
|
def scan_file_for_deps(filepath: str, content: str, own_repo: str) -> set:
|
||||||
|
"""Scan a file's content for references to other repos."""
|
||||||
|
deps = set()
|
||||||
|
own_norm = normalize_repo_name(own_repo)
|
||||||
|
|
||||||
|
for repo in KNOWN_REPOS:
|
||||||
|
repo_norm = normalize_repo_name(repo)
|
||||||
|
if repo_norm == own_norm:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Direct name references
|
||||||
|
patterns = [
|
||||||
|
repo, # exact name
|
||||||
|
repo.replace("-", "_"), # underscore variant
|
||||||
|
repo.replace("-", ""), # no separator
|
||||||
|
f"/{repo}/", # path reference
|
||||||
|
f'"{repo}"', # quoted
|
||||||
|
f"'{repo}'", # single quoted
|
||||||
|
f"Timmy_Foundation/{repo}", # full Gitea path
|
||||||
|
f"Timmy_Foundation.{repo}", # Python module path
|
||||||
|
]
|
||||||
|
|
||||||
|
for pattern in patterns:
|
||||||
|
if pattern in content:
|
||||||
|
deps.add(repo)
|
||||||
|
break
|
||||||
|
|
||||||
|
return deps
|
||||||
|
|
||||||
|
|
||||||
|
def scan_repo(repo_path: str, repo_name: str = None) -> dict:
|
||||||
|
"""Scan a repo directory for dependencies."""
|
||||||
|
path = Path(repo_path)
|
||||||
|
if not path.is_dir():
|
||||||
|
return {"error": f"Not a directory: {repo_path}"}
|
||||||
|
|
||||||
|
if not repo_name:
|
||||||
|
repo_name = path.name
|
||||||
|
|
||||||
|
deps = set()
|
||||||
|
files_scanned = 0
|
||||||
|
exts = {".py", ".js", ".ts", ".go", ".yaml", ".yml", ".toml", ".json",
|
||||||
|
".md", ".sh", ".bash", ".Dockerfile", ".tf", ".hcl"}
|
||||||
|
|
||||||
|
for fpath in path.rglob("*"):
|
||||||
|
if not fpath.is_file():
|
||||||
|
continue
|
||||||
|
if fpath.suffix not in exts:
|
||||||
|
continue
|
||||||
|
# Skip common non-source dirs
|
||||||
|
parts = fpath.parts
|
||||||
|
if any(p in (".git", "node_modules", "__pycache__", ".venv", "venv",
|
||||||
|
"vendor", "dist", "build", ".tox") for p in parts):
|
||||||
|
continue
|
||||||
|
|
||||||
|
try:
|
||||||
|
content = fpath.read_text(errors="ignore")
|
||||||
|
except:
|
||||||
|
continue
|
||||||
|
|
||||||
|
file_deps = scan_file_for_deps(str(fpath), content, repo_name)
|
||||||
|
deps.update(file_deps)
|
||||||
|
files_scanned += 1
|
||||||
|
|
||||||
|
return {
|
||||||
|
"repo": repo_name,
|
||||||
|
"dependencies": sorted(deps),
|
||||||
|
"files_scanned": files_scanned,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def detect_cycles(graph: dict) -> list:
|
||||||
|
"""Detect circular dependencies using DFS."""
|
||||||
|
cycles = []
|
||||||
|
visited = set()
|
||||||
|
rec_stack = set()
|
||||||
|
|
||||||
|
def dfs(node, path):
|
||||||
|
visited.add(node)
|
||||||
|
rec_stack.add(node)
|
||||||
|
|
||||||
|
for neighbor in graph.get(node, {}).get("dependencies", []):
|
||||||
|
if neighbor not in visited:
|
||||||
|
result = dfs(neighbor, path + [neighbor])
|
||||||
|
if result:
|
||||||
|
return result
|
||||||
|
elif neighbor in rec_stack:
|
||||||
|
cycle_start = path.index(neighbor)
|
||||||
|
return path[cycle_start:] + [neighbor]
|
||||||
|
|
||||||
|
rec_stack.remove(node)
|
||||||
|
return None
|
||||||
|
|
||||||
|
for node in graph:
|
||||||
|
if node not in visited:
|
||||||
|
cycle = dfs(node, [node])
|
||||||
|
if cycle:
|
||||||
|
cycles.append(cycle)
|
||||||
|
|
||||||
|
return cycles
|
||||||
|
|
||||||
|
|
||||||
|
def to_dot(graph: dict) -> str:
|
||||||
|
"""Generate DOT format output."""
|
||||||
|
lines = ["digraph dependencies {"]
|
||||||
|
lines.append(" rankdir=LR;")
|
||||||
|
lines.append(" node [shape=box, style=filled, fillcolor="#1a1a2e", fontcolor="#e6edf3"];")
|
||||||
|
lines.append(" edge [color="#4a4a6a"];")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
for repo, data in sorted(graph.items()):
|
||||||
|
dep_count = len(data.get("dependencies", []))
|
||||||
|
fill = "#2d1b69" if dep_count > 2 else "#16213e"
|
||||||
|
lines.append(f' "{repo}" [fillcolor="{fill}"];')
|
||||||
|
for dep in data.get("dependencies", []):
|
||||||
|
lines.append(f' "{repo}" -> "{dep}";')
|
||||||
|
|
||||||
|
lines.append("}")
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
def to_mermaid(graph: dict) -> str:
    """Generate Mermaid format output."""
    lines = ["graph LR"]

    for repo, data in sorted(graph.items()):
        for dep in data.get("dependencies", []):
            lines.append(f"  {repo.replace('-', '_')} --> {dep.replace('-', '_')}")

    # Add node labels
    lines.append("")
    for repo in sorted(graph.keys()):
        lines.append(f"  {repo.replace('-', '_')}[{repo}]")

    return "\n".join(lines)
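The `replace('-', '_')` calls above sanitize node ids (hyphens are awkward in Mermaid identifiers, since `-->` is arrow syntax) while the bracketed label keeps the original repo name. In isolation:

```python
# Mermaid id sanitization as used in to_mermaid; repo names are invented.
repo, dep = "my-core", "my-utils"
edge = f"  {repo.replace('-', '_')} --> {dep.replace('-', '_')}"
label = f"  {repo.replace('-', '_')}[{repo}]"
print(edge)    # sanitized ids in the edge
print(label)   # sanitized id, original name as display label
```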
def main():
    parser = argparse.ArgumentParser(description="Build cross-repo dependency graph")
    parser.add_argument("repos_dir", nargs="?", help="Directory containing repos")
    parser.add_argument("--repos", help="Comma-separated list of repo paths")
    parser.add_argument("--format", choices=["dot", "mermaid", "json"], default="json")
    parser.add_argument("--output", "-o", help="Output file (default: stdout)")
    parser.add_argument("--cycles-only", action="store_true", help="Only report cycles")
    args = parser.parse_args()

    results = {}
    repo_paths = []

    if args.repos:
        repo_paths = [p.strip() for p in args.repos.split(",")]
    elif args.repos_dir:
        base = Path(args.repos_dir)
        repo_paths = [str(p) for p in base.iterdir() if p.is_dir() and not p.name.startswith(".")]
    else:
        parser.print_help()
        sys.exit(1)

    for rpath in repo_paths:
        name = Path(rpath).name
        print(f"Scanning {name}...", file=sys.stderr)
        result = scan_repo(rpath, name)
        if "error" not in result:
            results[name] = result

    # Detect cycles
    cycles = detect_cycles(results)

    if args.cycles_only:
        if cycles:
            print("CIRCULAR DEPENDENCIES DETECTED:")
            for cycle in cycles:
                print(f"  {' -> '.join(cycle)}")
            sys.exit(1)
        else:
            print("No circular dependencies found.")
            sys.exit(0)

    # Output
    output = {}
    if args.format == "dot":
        output = to_dot(results)
    elif args.format == "mermaid":
        output = to_mermaid(results)
    else:
        output = json.dumps({
            "repos": results,
            "cycles": cycles,
            "summary": {
                "total_repos": len(results),
                "total_deps": sum(len(r["dependencies"]) for r in results.values()),
                "cycles_found": len(cycles),
            },
        }, indent=2)

    if args.output:
        Path(args.output).write_text(output)
        print(f"Written to {args.output}", file=sys.stderr)
    else:
        print(output)


if __name__ == "__main__":
    main()
scripts/diff_analyzer.py (new file, 216 lines added)
#!/usr/bin/env python3
"""
Diff Analyzer — Parse unified diffs and categorize every change.

Pipeline 6.1 for Compounding Intelligence.
"""

import re
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import List, Dict, Any, Optional


class ChangeCategory(Enum):
    ADDED = "added"
    DELETED = "deleted"
    MODIFIED = "modified"
    MOVED = "moved"
    CONTEXT = "context"
@dataclass
class Hunk:
    """A single diff hunk with header, line ranges, and category."""
    header: str
    old_start: int
    old_count: int
    new_start: int
    new_count: int
    lines: List[str] = field(default_factory=list)
    category: ChangeCategory = ChangeCategory.CONTEXT

    def to_dict(self) -> Dict[str, Any]:
        d = asdict(self)
        d["category"] = self.category.value
        return d
@dataclass
class FileChange:
    """A single file's changes."""
    path: str
    old_path: Optional[str] = None  # For renames
    hunks: List[Hunk] = field(default_factory=list)
    added_lines: int = 0
    deleted_lines: int = 0
    is_new: bool = False
    is_deleted: bool = False
    is_renamed: bool = False
    is_binary: bool = False

    def to_dict(self) -> Dict[str, Any]:
        return {
            "path": self.path,
            "old_path": self.old_path,
            "hunks": [h.to_dict() for h in self.hunks],
            "added_lines": self.added_lines,
            "deleted_lines": self.deleted_lines,
            "is_new": self.is_new,
            "is_deleted": self.is_deleted,
            "is_renamed": self.is_renamed,
            "is_binary": self.is_binary,
        }
@dataclass
class ChangeSummary:
    """Aggregate stats + per-file breakdown."""
    files: List[FileChange] = field(default_factory=list)
    total_added: int = 0
    total_deleted: int = 0
    total_files_changed: int = 0
    total_hunks: int = 0
    new_files: int = 0
    deleted_files: int = 0
    renamed_files: int = 0
    binary_files: int = 0

    def to_dict(self) -> Dict[str, Any]:
        return {
            "total_files_changed": self.total_files_changed,
            "total_added": self.total_added,
            "total_deleted": self.total_deleted,
            "total_hunks": self.total_hunks,
            "new_files": self.new_files,
            "deleted_files": self.deleted_files,
            "renamed_files": self.renamed_files,
            "binary_files": self.binary_files,
            "files": [f.to_dict() for f in self.files],
        }
class DiffAnalyzer:
    """Parses unified diff format and produces a structured ChangeSummary."""

    HUNK_HEADER_RE = re.compile(r"^@@\s+-(\d+)(?:,(\d+))?\s+\+(\d+)(?:,(\d+))?\s+@@(.*)$")
    DIFF_FILE_RE = re.compile(r"^diff --git a/(.*) b/(.*)")
    RENAME_RE = re.compile(r"^rename from (.+)$")
    RENAME_TO_RE = re.compile(r"^rename to (.+)$")
    NEW_FILE_RE = re.compile(r"^new file mode")
    DELETED_FILE_RE = re.compile(r"^deleted file mode")
    BINARY_RE = re.compile(r"^Binary files .* differ")

    def analyze(self, diff_text: str) -> ChangeSummary:
        """Parse a unified diff and return a ChangeSummary."""
        summary = ChangeSummary()
        if not diff_text or not diff_text.strip():
            return summary

        # Split diff into per-file sections
        file_diffs = self._split_files(diff_text)

        for file_diff in file_diffs:
            fc = self._parse_file_diff(file_diff)
            summary.files.append(fc)
            summary.total_added += fc.added_lines
            summary.total_deleted += fc.deleted_lines
            summary.total_hunks += len(fc.hunks)
            if fc.is_new:
                summary.new_files += 1
            if fc.is_deleted:
                summary.deleted_files += 1
            if fc.is_renamed:
                summary.renamed_files += 1
            if fc.is_binary:
                summary.binary_files += 1

        summary.total_files_changed = len(summary.files)
        return summary

    def _split_files(self, diff_text: str) -> List[str]:
        """Split a multi-file diff into individual file diffs."""
        lines = diff_text.split("\n")
        chunks = []
        current = []
        for line in lines:
            if line.startswith("diff --git ") and current:
                chunks.append("\n".join(current))
                current = [line]
            else:
                current.append(line)
        if current:
            chunks.append("\n".join(current))
        return chunks
    def _parse_file_diff(self, diff_text: str) -> FileChange:
        """Parse a single file's diff section."""
        lines = diff_text.split("\n")
        fc = FileChange(path="")

        # Extract file paths
        for line in lines:
            m = self.DIFF_FILE_RE.match(line)
            if m:
                fc.path = m.group(2)
                break

        # Check for special states
        for line in lines:
            if self.NEW_FILE_RE.match(line):
                fc.is_new = True
            elif self.DELETED_FILE_RE.match(line):
                fc.is_deleted = True
            elif (m := self.RENAME_RE.match(line)):
                fc.old_path = m.group(1)
                fc.is_renamed = True
            elif self.BINARY_RE.match(line):
                fc.is_binary = True
                return fc  # No hunks for binary

        # Rename TO
        for line in lines:
            m = self.RENAME_TO_RE.match(line)
            if m and fc.is_renamed:
                fc.path = m.group(1)

        # Parse hunks
        current_hunk = None
        for line in lines:
            m = self.HUNK_HEADER_RE.match(line)
            if m:
                if current_hunk:
                    self._classify_hunk(current_hunk, fc)
                    fc.hunks.append(current_hunk)
                current_hunk = Hunk(
                    header=m.group(5).strip(),
                    old_start=int(m.group(1)),
                    old_count=int(m.group(2) or 1),
                    new_start=int(m.group(3)),
                    new_count=int(m.group(4) or 1),
                )
            elif current_hunk and (line.startswith("+") or line.startswith("-") or line.startswith(" ")):
                current_hunk.lines.append(line)

        if current_hunk:
            self._classify_hunk(current_hunk, fc)
            fc.hunks.append(current_hunk)

        return fc
    def _classify_hunk(self, hunk: Hunk, fc: FileChange):
        """Classify a hunk and count lines."""
        added = sum(1 for l in hunk.lines if l.startswith("+"))
        deleted = sum(1 for l in hunk.lines if l.startswith("-"))

        fc.added_lines += added
        fc.deleted_lines += deleted

        if added > 0 and deleted == 0:
            hunk.category = ChangeCategory.ADDED
        elif deleted > 0 and added == 0:
            hunk.category = ChangeCategory.DELETED
        elif added > 0 and deleted > 0:
            hunk.category = ChangeCategory.MODIFIED
        else:
            hunk.category = ChangeCategory.CONTEXT
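The `HUNK_HEADER_RE` pattern above does the heavy lifting; here it is applied to a single header line (the header text is invented). Note the `,count` parts are optional in unified diffs, hence the `or 1` defaults:

```python
import re

# Same hunk-header pattern as in DiffAnalyzer, shown on one line.
HUNK_HEADER_RE = re.compile(r"^@@\s+-(\d+)(?:,(\d+))?\s+\+(\d+)(?:,(\d+))?\s+@@(.*)$")

m = HUNK_HEADER_RE.match("@@ -10,4 +12,6 @@ def analyze(self):")
old_start, old_count = int(m.group(1)), int(m.group(2) or 1)
new_start, new_count = int(m.group(3)), int(m.group(4) or 1)
header = m.group(5).strip()  # trailing function context from git
print(old_start, old_count, new_start, new_count, header)
```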
scripts/gitea_issue_parser.py (new file, 131 lines added)
#!/usr/bin/env python3
"""
Gitea Issue Body Parser — Extract structured data from markdown issue bodies.

Usage:
    cat issue_body.txt | python3 scripts/gitea_issue_parser.py --stdin --pretty
    python3 scripts/gitea_issue_parser.py --url https://forge.../api/v1/repos/.../issues/123 --pretty
    python3 scripts/gitea_issue_parser.py body.txt --title "Fix thing (#42)" --labels pipeline extraction
"""

import argparse
import json
import re
import sys
from typing import Dict, List, Any, Optional
def parse_issue_body(body: str, title: str = "", labels: List[str] = None) -> Dict[str, Any]:
    """Parse a Gitea issue markdown body into structured JSON.

    Extracted fields:
    - title: Issue title
    - context: Background/description section
    - criteria[]: Acceptance criteria (checkboxes or numbered lists)
    - labels[]: Issue labels
    - epic_ref: Parent/epic issue reference (from "Closes #N" or title)
    - sections{}: All ## sections as key-value pairs
    """
    result = {
        "title": title,
        "context": "",
        "criteria": [],
        "labels": labels or [],
        "epic_ref": None,
        "sections": {},
    }

    if not body:
        return result

    # Extract epic reference from title or body
    epic_patterns = [
        r"(?:closes|fixes|addresses|refs?)\s+#(\d+)",
        r"#(\d+)",
    ]
    for pattern in epic_patterns:
        match = re.search(pattern, (title + " " + body).lower())
        if match:
            result["epic_ref"] = int(match.group(1))
            break

    # Parse ## sections
    section_pattern = r"^##\s+(.+?)$\n((?:^(?!##\s).*$\n?)*)"
    for match in re.finditer(section_pattern, body, re.MULTILINE):
        section_name = match.group(1).strip().lower().replace(" ", "_")
        section_content = match.group(2).strip()
        result["sections"][section_name] = section_content

    # Extract acceptance criteria (checkboxes)
    checkbox_pattern = r"^\s*-\s*\[([ xX])\]\s*(.+)$"
    for match in re.finditer(checkbox_pattern, body, re.MULTILINE):
        checked = match.group(1).lower() == "x"
        text = match.group(2).strip()
        result["criteria"].append({"text": text, "checked": checked})

    # If no checkboxes, try numbered lists in the "Acceptance Criteria" or "Criteria" section.
    # Section names were normalized to snake_case above, so only the
    # normalized keys can match here.
    if not result["criteria"]:
        for section_name in ["acceptance_criteria", "criteria"]:
            if section_name in result["sections"]:
                numbered = r"^\s*\d+\.\s*(.+)$"
                for match in re.finditer(numbered, result["sections"][section_name], re.MULTILINE):
                    result["criteria"].append({"text": match.group(1).strip(), "checked": False})
                break

    # Extract context (first section or first paragraph before any ## heading)
    first_heading = body.find("## ")
    if first_heading > 0:
        context_text = body[:first_heading].strip()
    else:
        context_text = body.split("\n\n")[0].strip()
    # Clean up: remove "## Context" or "## Problem" header if present
    context_text = re.sub(r"^#+\s*\w+\s*\n?", "", context_text).strip()
    result["context"] = context_text[:500]  # Cap at 500 chars

    return result
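The checkbox extraction above can be seen in isolation; a sketch on a two-item criteria list (the item texts are invented):

```python
import re

# The same checkbox pattern parse_issue_body uses, run over a small body.
checkbox_pattern = r"^\s*-\s*\[([ xX])\]\s*(.+)$"
body = "- [x] parser handles checkboxes\n- [ ] parser handles tables"
criteria = [
    {"text": m.group(2).strip(), "checked": m.group(1).lower() == "x"}
    for m in re.finditer(checkbox_pattern, body, re.MULTILINE)
]
print(criteria)
```

`re.MULTILINE` makes `^`/`$` match per line, so each list item is matched independently, and both `[x]` and `[X]` count as checked.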
def fetch_issue_from_url(url: str) -> Dict[str, Any]:
    """Fetch an issue from a Gitea API URL and parse it."""
    import urllib.request
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())

    return parse_issue_body(
        body=data.get("body", ""),
        title=data.get("title", ""),
        labels=[l["name"] for l in data.get("labels", [])],
    )
def main():
    parser = argparse.ArgumentParser(description="Parse Gitea issue markdown into structured JSON")
    parser.add_argument("file", nargs="?", help="Issue body file (or use --stdin)")
    parser.add_argument("--stdin", action="store_true", help="Read from stdin")
    parser.add_argument("--url", help="Gitea API URL to fetch issue from")
    parser.add_argument("--title", default="", help="Issue title")
    parser.add_argument("--labels", nargs="*", default=[], help="Issue labels")
    parser.add_argument("--pretty", action="store_true", help="Pretty-print JSON output")

    args = parser.parse_args()

    if args.url:
        result = fetch_issue_from_url(args.url)
    elif args.stdin:
        body = sys.stdin.read()
        result = parse_issue_body(body, args.title, args.labels)
    elif args.file:
        with open(args.file) as f:
            body = f.read()
        result = parse_issue_body(body, args.title, args.labels)
    else:
        parser.print_help()
        sys.exit(1)

    indent = 2 if args.pretty else None
    print(json.dumps(result, indent=indent))


if __name__ == "__main__":
    main()
scripts/harvester.py (new file, 447 lines added)
#!/usr/bin/env python3
"""
harvester.py — Extract durable knowledge from Hermes session transcripts.

Combines session_reader + extraction prompt + LLM inference to pull
facts, pitfalls, patterns, and tool quirks from finished sessions.

Usage:
    python3 harvester.py --session ~/.hermes/sessions/session_xxx.jsonl --output knowledge/
    python3 harvester.py --batch --since 2026-04-01 --limit 100
    python3 harvester.py --session session.jsonl --dry-run  # Preview without writing
"""

import argparse
import json
import os
import sys
import time
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Add scripts dir to path for sibling imports
SCRIPT_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, str(SCRIPT_DIR))

from session_reader import read_session, extract_conversation, truncate_for_context, messages_to_text

# --- Configuration ---

DEFAULT_API_BASE = os.environ.get("HARVESTER_API_BASE", "https://api.nousresearch.com/v1")
DEFAULT_API_KEY = os.environ.get("HARVESTER_API_KEY", "")
DEFAULT_MODEL = os.environ.get("HARVESTER_MODEL", "xiaomi/mimo-v2-pro")
KNOWLEDGE_DIR = os.environ.get("HARVESTER_KNOWLEDGE_DIR", "knowledge")
PROMPT_PATH = os.environ.get("HARVESTER_PROMPT_PATH", str(SCRIPT_DIR.parent / "templates" / "harvest-prompt.md"))

# Where to look for API keys if not set via env
API_KEY_PATHS = [
    os.path.expanduser("~/.config/nous/key"),
    os.path.expanduser("~/.hermes/keymaxxing/active/minimax.key"),
    os.path.expanduser("~/.config/openrouter/key"),
]
def find_api_key() -> str:
    """Find API key from common locations."""
    for path in API_KEY_PATHS:
        if os.path.exists(path):
            with open(path) as f:
                key = f.read().strip()
            if key:
                return key
    return ""


def load_extraction_prompt() -> str:
    """Load the extraction prompt template."""
    path = Path(PROMPT_PATH)
    if not path.exists():
        print(f"ERROR: Extraction prompt not found at {path}", file=sys.stderr)
        print("Expected templates/harvest-prompt.md from issue #7", file=sys.stderr)
        sys.exit(1)
    return path.read_text(encoding='utf-8')
def call_llm(prompt: str, transcript: str, api_base: str, api_key: str, model: str) -> Optional[list[dict]]:
    """Call the LLM API to extract knowledge from a transcript."""
    import urllib.request

    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": f"Extract knowledge from this session transcript:\n\n{transcript}"}
    ]

    payload = json.dumps({
        "model": model,
        "messages": messages,
        "temperature": 0.1,  # Low temp for consistent extraction
        "max_tokens": 4096
    }).encode('utf-8')

    req = urllib.request.Request(
        f"{api_base}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        },
        method="POST"
    )

    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            result = json.loads(resp.read().decode('utf-8'))
        content = result["choices"][0]["message"]["content"]
        return parse_extraction_response(content)
    except Exception as e:
        print(f"ERROR: LLM API call failed: {e}", file=sys.stderr)
        return None
def parse_extraction_response(content: str) -> Optional[list[dict]]:
    """Parse the LLM response to extract knowledge items.

    Handles various response formats: raw JSON, markdown-wrapped JSON, etc.
    """
    # Try direct JSON parse first
    try:
        data = json.loads(content)
        if isinstance(data, dict) and 'knowledge' in data:
            return data['knowledge']
        if isinstance(data, list):
            return data
    except json.JSONDecodeError:
        pass

    # Try extracting JSON from markdown code blocks
    import re
    json_match = re.search(r'```(?:json)?\s*({.*?})\s*```', content, re.DOTALL)
    if json_match:
        try:
            data = json.loads(json_match.group(1))
            if isinstance(data, dict) and 'knowledge' in data:
                return data['knowledge']
            if isinstance(data, list):
                return data
        except json.JSONDecodeError:
            pass

    # Try finding any JSON object with a "knowledge" array
    json_match = re.search(r'({[^{}]*"knowledge"[^{}]*\[[\s\S]*?\][^{}]*})', content)
    if json_match:
        try:
            data = json.loads(json_match.group(1))
            return data.get('knowledge', [])
        except json.JSONDecodeError:
            pass

    print("WARNING: Could not parse LLM response as JSON", file=sys.stderr)
    print(f"Response preview: {content[:500]}", file=sys.stderr)
    return None
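The code-fence fallback above can be sketched in isolation: pull a JSON object out of a ```` ```json ```` block and read its `knowledge` array (the fact content is invented):

```python
import re
import json

# Same fenced-JSON extraction as parse_extraction_response's second attempt.
content = 'Here you go:\n```json\n{"knowledge": [{"fact": "x", "confidence": 0.8}]}\n```'
m = re.search(r'```(?:json)?\s*({.*?})\s*```', content, re.DOTALL)
data = json.loads(m.group(1))
print(data["knowledge"][0]["fact"])
```

`re.DOTALL` lets `.` cross newlines inside the fence; the lazy `{.*?}` stops at the first closing brace that is followed by a closing fence.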
def load_existing_knowledge(knowledge_dir: str) -> dict:
    """Load the existing knowledge index."""
    index_path = Path(knowledge_dir) / "index.json"
    if not index_path.exists():
        return {"version": 1, "last_updated": "", "total_facts": 0, "facts": []}

    try:
        with open(index_path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except (json.JSONDecodeError, IOError) as e:
        print(f"WARNING: Could not load knowledge index: {e}", file=sys.stderr)
        return {"version": 1, "last_updated": "", "total_facts": 0, "facts": []}
def fact_fingerprint(fact: dict) -> str:
    """Generate a deduplication fingerprint for a fact.

    Uses the fact text normalized (lowercase, stripped) as the key.
    Similar facts will have similar fingerprints.
    """
    text = fact.get('fact', '').lower().strip()
    # Normalize whitespace
    text = ' '.join(text.split())
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def deduplicate(new_facts: list[dict], existing: list[dict], similarity_threshold: float = 0.8) -> list[dict]:
    """Remove duplicate facts from new_facts that already exist in the knowledge store.

    Uses fingerprint matching for exact dedup and simple overlap check for near-dupes.
    """
    existing_fingerprints = set()
    existing_texts = []
    for f in existing:
        fp = fact_fingerprint(f)
        existing_fingerprints.add(fp)
        existing_texts.append(f.get('fact', '').lower().strip())

    unique = []
    for fact in new_facts:
        fp = fact_fingerprint(fact)
        if fp in existing_fingerprints:
            continue

        # Check for near-duplicates using simple word overlap
        fact_words = set(fact.get('fact', '').lower().split())
        is_dup = False
        for existing_text in existing_texts:
            existing_words = set(existing_text.split())
            if not fact_words or not existing_words:
                continue
            overlap = len(fact_words & existing_words) / max(len(fact_words | existing_words), 1)
            if overlap >= similarity_threshold:
                is_dup = True
                break

        if not is_dup:
            unique.append(fact)
            existing_fingerprints.add(fp)
            existing_texts.append(fact.get('fact', '').lower().strip())

    return unique
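The near-duplicate check in `deduplicate` is a Jaccard similarity over word sets: intersection size divided by union size. A sketch on two paraphrased facts (the sentences are invented):

```python
# Same word-overlap computation as deduplicate's near-dupe check.
a = set("the harvester dedupes facts by word overlap".split())
b = set("harvester dedupes facts by simple word overlap".split())
overlap = len(a & b) / max(len(a | b), 1)  # Jaccard similarity
print(round(overlap, 2))
```

Here six of eight distinct words are shared, giving 0.75, which falls just under the default `similarity_threshold` of 0.8, so these two would both be kept.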
def validate_fact(fact: dict) -> bool:
    """Validate a single knowledge item has required fields."""
    required = ['fact', 'category', 'repo', 'confidence']
    for field in required:
        if field not in fact:
            return False

    if not isinstance(fact['fact'], str) or not fact['fact'].strip():
        return False

    valid_categories = ['fact', 'pitfall', 'pattern', 'tool-quirk', 'question']
    if fact['category'] not in valid_categories:
        return False

    if not isinstance(fact.get('confidence', 0), (int, float)):
        return False

    if not (0.0 <= fact['confidence'] <= 1.0):
        return False

    return True
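The schema `validate_fact` enforces, in miniature: four required keys, a category whitelist, and a confidence in [0, 1]. The sample fact below is invented for illustration:

```python
# Condensed restatement of validate_fact's checks on one well-formed item.
fact = {"fact": "make test runs the pytest suite", "category": "fact",
        "repo": "example-repo", "confidence": 0.9}
required = ["fact", "category", "repo", "confidence"]
valid_categories = ["fact", "pitfall", "pattern", "tool-quirk", "question"]
ok = (all(k in fact for k in required)
      and fact["category"] in valid_categories
      and isinstance(fact["confidence"], (int, float))
      and 0.0 <= fact["confidence"] <= 1.0)
print(ok)
```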
def write_knowledge(index: dict, new_facts: list[dict], knowledge_dir: str, source_session: str = ""):
    """Write new facts to the knowledge store."""
    kdir = Path(knowledge_dir)
    kdir.mkdir(parents=True, exist_ok=True)

    # Add source tracking to each fact
    for fact in new_facts:
        fact['source_session'] = source_session
        fact['harvested_at'] = datetime.now(timezone.utc).isoformat()

    # Update index
    index['facts'].extend(new_facts)
    index['total_facts'] = len(index['facts'])
    index['last_updated'] = datetime.now(timezone.utc).isoformat()

    # Write index
    index_path = kdir / "index.json"
    with open(index_path, 'w', encoding='utf-8') as f:
        json.dump(index, f, indent=2, ensure_ascii=False)

    # Also write per-repo markdown files for human reading
    repos = {}
    for fact in new_facts:
        repo = fact.get('repo', 'global')
        repos.setdefault(repo, []).append(fact)

    for repo, facts in repos.items():
        if repo == 'global':
            md_path = kdir / "global" / "harvested.md"
        else:
            md_path = kdir / "repos" / f"{repo}.md"

        md_path.parent.mkdir(parents=True, exist_ok=True)

        # Append to existing or create new
        mode = 'a' if md_path.exists() else 'w'
        with open(md_path, mode, encoding='utf-8') as f:
            if mode == 'w':
                f.write(f"# Knowledge: {repo}\n\n")
            f.write(f"## Harvested {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M')}\n\n")
            for fact in facts:
                icon = {'fact': '📋', 'pitfall': '⚠️', 'pattern': '🔄', 'tool-quirk': '🔧', 'question': '❓'}.get(fact['category'], '•')
                f.write(f"- {icon} **{fact['category']}** (conf: {fact['confidence']:.1f}): {fact['fact']}\n")
            f.write("\n")
def harvest_session(session_path: str, knowledge_dir: str, api_base: str, api_key: str,
                    model: str, dry_run: bool = False, min_confidence: float = 0.3) -> dict:
    """Harvest knowledge from a single session.

    Returns: dict with stats (facts_found, facts_new, facts_dup, elapsed_seconds, error)
    """
    start_time = time.time()
    stats = {
        'session': session_path,
        'facts_found': 0,
        'facts_new': 0,
        'facts_dup': 0,
        'elapsed_seconds': 0,
        'error': None
    }

    try:
        # 1. Read session
        messages = read_session(session_path)
        if not messages:
            stats['error'] = "Empty session file"
            return stats

        # 2. Extract conversation
        conv = extract_conversation(messages)
        if not conv:
            stats['error'] = "No conversation turns found"
            return stats

        # 3. Truncate for context window
        truncated = truncate_for_context(conv, head=50, tail=50)
        transcript = messages_to_text(truncated)

        # 4. Load extraction prompt
        prompt = load_extraction_prompt()

        # 5. Call LLM
        raw_facts = call_llm(prompt, transcript, api_base, api_key, model)
        if raw_facts is None:
            stats['error'] = "LLM extraction failed"
            return stats

        # 6. Validate
        valid_facts = [f for f in raw_facts if validate_fact(f) and f.get('confidence', 0) >= min_confidence]
        stats['facts_found'] = len(valid_facts)

        # 7. Deduplicate
        existing_index = load_existing_knowledge(knowledge_dir)
        existing_facts = existing_index.get('facts', [])
        new_facts = deduplicate(valid_facts, existing_facts)
        stats['facts_new'] = len(new_facts)
        stats['facts_dup'] = len(valid_facts) - len(new_facts)

        # 8. Write (unless dry run)
        if new_facts and not dry_run:
            write_knowledge(existing_index, new_facts, knowledge_dir, source_session=session_path)

        stats['elapsed_seconds'] = round(time.time() - start_time, 2)
        return stats

    except Exception as e:
        stats['error'] = str(e)
        stats['elapsed_seconds'] = round(time.time() - start_time, 2)
        return stats
def batch_harvest(sessions_dir: str, knowledge_dir: str, api_base: str, api_key: str,
                  model: str, since: str = "", limit: int = 0, dry_run: bool = False) -> list[dict]:
    """Harvest knowledge from multiple sessions in batch."""
    sessions_path = Path(sessions_dir)
    if not sessions_path.is_dir():
        print(f"ERROR: Sessions directory not found: {sessions_dir}", file=sys.stderr)
        return []

    # Find session files
    session_files = sorted(sessions_path.glob("*.jsonl"), reverse=True)  # Newest first

    # Filter by date if --since provided
    if since:
        since_dt = datetime.fromisoformat(since.replace('Z', '+00:00'))
        if since_dt.tzinfo is None:
            # A bare YYYY-MM-DD parses as naive; make it aware so it can be
            # compared against the UTC file timestamps below.
            since_dt = since_dt.replace(tzinfo=timezone.utc)
        filtered = []
        for sf in session_files:
            # Try to parse timestamp from filename (common format: session_YYYYMMDD_HHMMSS_hash.jsonl)
            try:
                parts = sf.stem.split('_')
                if len(parts) >= 3:
                    date_str = parts[1]
                    file_dt = datetime.strptime(date_str, '%Y%m%d').replace(tzinfo=timezone.utc)
                    if file_dt >= since_dt:
                        filtered.append(sf)
            except (ValueError, IndexError):
                # If we can't parse the date, include the file (be permissive)
                filtered.append(sf)
        session_files = filtered

    # Apply limit
    if limit > 0:
        session_files = session_files[:limit]

    print(f"Harvesting {len(session_files)} sessions...")

    results = []
    for i, sf in enumerate(session_files, 1):
        print(f"[{i}/{len(session_files)}] {sf.name}...", end=" ", flush=True)
        stats = harvest_session(str(sf), knowledge_dir, api_base, api_key, model, dry_run)
        if stats['error']:
            print(f"ERROR: {stats['error']}")
        else:
            print(f"{stats['facts_new']} new, {stats['facts_dup']} dup ({stats['elapsed_seconds']}s)")
        results.append(stats)

    return results
def main():
    parser = argparse.ArgumentParser(description="Harvest knowledge from session transcripts")
    parser.add_argument('--session', help='Path to a single session JSONL file')
    parser.add_argument('--batch', action='store_true', help='Batch mode: process multiple sessions')
    parser.add_argument('--sessions-dir', default=os.path.expanduser('~/.hermes/sessions'),
                        help='Directory containing session files (default: ~/.hermes/sessions)')
    parser.add_argument('--output', default='knowledge', help='Output directory for knowledge store')
    parser.add_argument('--since', default='', help='Only process sessions after this date (YYYY-MM-DD)')
    parser.add_argument('--limit', type=int, default=0, help='Max sessions to process (0=unlimited)')
    parser.add_argument('--api-base', default=DEFAULT_API_BASE, help='LLM API base URL')
    parser.add_argument('--api-key', default='', help='LLM API key (or set HARVESTER_API_KEY)')
    parser.add_argument('--model', default=DEFAULT_MODEL, help='Model to use for extraction')
    parser.add_argument('--dry-run', action='store_true', help='Preview without writing to knowledge store')
    parser.add_argument('--min-confidence', type=float, default=0.3, help='Minimum confidence threshold')

    args = parser.parse_args()

    # Resolve API key
    api_key = args.api_key or DEFAULT_API_KEY or find_api_key()
    if not api_key:
        print("ERROR: No API key found. Set HARVESTER_API_KEY or store in one of:", file=sys.stderr)
        for p in API_KEY_PATHS:
            print(f"  {p}", file=sys.stderr)
        sys.exit(1)

    # Resolve knowledge directory
    knowledge_dir = args.output
    if not os.path.isabs(knowledge_dir):
        knowledge_dir = os.path.join(SCRIPT_DIR.parent, knowledge_dir)

    if args.session:
        # Single session mode
        stats = harvest_session(
            args.session, knowledge_dir, args.api_base, api_key, args.model,
            dry_run=args.dry_run, min_confidence=args.min_confidence
        )
        print(json.dumps(stats, indent=2))
        if stats['error']:
            sys.exit(1)
    elif args.batch:
        # Batch mode
        results = batch_harvest(
            args.sessions_dir, knowledge_dir, args.api_base, api_key, args.model,
            since=args.since, limit=args.limit, dry_run=args.dry_run
        )
        total_new = sum(r['facts_new'] for r in results)
        total_dup = sum(r['facts_dup'] for r in results)
        errors = sum(1 for r in results if r['error'])
        print(f"\nDone: {total_new} new facts, {total_dup} duplicates, {errors} errors")
    else:
        parser.print_help()
        sys.exit(1)


if __name__ == '__main__':
    main()
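The `--since` filter above keys off the `session_YYYYMMDD_HHMMSS_hash.jsonl` filename convention. A minimal sketch of that predicate in isolation — `session_file_is_recent` is a hypothetical helper for illustration, not part of the script:

```python
from datetime import datetime, timezone

def session_file_is_recent(stem: str, since_iso: str) -> bool:
    """True when a session filename stem dates on/after since_iso.

    Unparseable stems are kept, mirroring the permissive filter above.
    """
    since_dt = datetime.fromisoformat(since_iso)
    if since_dt.tzinfo is None:
        since_dt = since_dt.replace(tzinfo=timezone.utc)
    parts = stem.split("_")
    if len(parts) >= 3:
        try:
            file_dt = datetime.strptime(parts[1], "%Y%m%d").replace(tzinfo=timezone.utc)
            return file_dt >= since_dt
        except ValueError:
            return True  # date field unparseable: include the file
    return True  # unexpected filename shape: include the file
```

Note that both datetimes are made timezone-aware before comparison; mixing naive and aware datetimes raises `TypeError` in Python.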
451
scripts/improvement_proposals.py
Normal file
@@ -0,0 +1,451 @@
#!/usr/bin/env python3
"""
Improvement Proposal Generator for compounding-intelligence.

Analyzes fleet session data to identify waste patterns and generates
concrete improvement proposals with ROI estimates.

Input: Session analytics JSON (from fleet metrics or session database)
Output: Markdown proposal document + JSON proposals

Usage:
    python3 scripts/improvement_proposals.py --input analytics.json
    python3 scripts/improvement_proposals.py --input analytics.json --format json
    python3 scripts/improvement_proposals.py --input analytics.json --output proposals.md
    python3 scripts/improvement_proposals.py --input analytics.json --threshold 5.0
"""

import argparse
import json
import sys
from collections import defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Dict, List


# ── Data types ─────────────────────────────────────────────────

@dataclass
class WastePattern:
    """A detected waste pattern in fleet sessions."""
    pattern_type: str  # "repeated_error", "manual_process", "slow_tool", "failed_retry"
    description: str
    occurrences: int
    total_time_hours: float
    affected_repos: List[str] = field(default_factory=list)
    example_sessions: List[str] = field(default_factory=list)


@dataclass
class ImprovementProposal:
    """A concrete improvement proposal with ROI estimate."""
    title: str
    problem: str
    proposed_solution: str
    estimated_weekly_hours_saved: float
    estimated_monthly_hours_saved: float
    implementation_hours: float
    roi_weeks: float  # weeks to break even
    priority: str  # "critical", "high", "medium", "low"
    affected_area: str
    supporting_evidence: List[str] = field(default_factory=list)


# ── Session analysis ──────────────────────────────────────────

def analyze_sessions(sessions: List[dict]) -> List[WastePattern]:
    """
    Analyze session data to find waste patterns.

    Looks for:
    - Repeated errors (same error across sessions)
    - Manual processes (long sequences of similar actions)
    - Slow tools (tools with high latency)
    - Failed retries (multiple attempts at same task)
    """
    patterns = []

    # Track error frequency across sessions
    error_counts: Dict[str, List[dict]] = defaultdict(list)
    tool_latencies: Dict[str, List[float]] = defaultdict(list)
    retry_counts: Dict[str, int] = defaultdict(int)
    manual_sequences: List[dict] = []

    for session in sessions:
        session_id = session.get("session_id", "unknown")
        repo = session.get("repo", "global")
        errors = session.get("errors", [])
        tool_calls = session.get("tool_calls", [])
        duration_min = session.get("duration_minutes", 0)

        # 1. Repeated errors
        for err in errors:
            err_key = _normalize_error(err.get("message", ""))
            if err_key:
                error_counts[err_key].append({
                    "session_id": session_id,
                    "repo": repo,
                })

        # 2. Tool latency tracking
        for tc in tool_calls:
            tool_name = tc.get("tool", "unknown")
            latency = tc.get("latency_ms", 0)
            if latency > 0:
                tool_latencies[tool_name].append(latency)

        # 3. Failed retries (same tool called 3+ times in sequence)
        prev_tool = None
        streak = 0
        for tc in tool_calls:
            tool_name = tc.get("tool", "unknown")
            if tool_name == prev_tool:
                streak += 1
            else:
                if streak >= 3:
                    retry_counts[prev_tool] += 1
                streak = 1
                prev_tool = tool_name
        if streak >= 3:
            retry_counts[prev_tool] += 1

        # 4. Manual processes (10+ sequential tool calls drawn from few distinct tools)
        if len(tool_calls) > 10:
            unique_tools = {tc.get("tool") for tc in tool_calls}
            if len(unique_tools) <= 3:
                manual_sequences.append({
                    "session_id": session_id,
                    "repo": repo,
                    "tool_count": len(tool_calls),
                    "unique_tools": list(unique_tools),
                    "duration_min": duration_min,
                })

    # Generate patterns from collected data

    # Repeated errors (appearing in 3+ sessions)
    for err_key, occurrences in error_counts.items():
        if len(occurrences) >= 3:
            repos = list(set(o["repo"] for o in occurrences))
            sessions_list = [o["session_id"] for o in occurrences[:5]]
            # Estimate time wasted: 5 min per error occurrence
            hours = len(occurrences) * 5 / 60
            patterns.append(WastePattern(
                pattern_type="repeated_error",
                description=f"Error: {err_key[:100]}",
                occurrences=len(occurrences),
                total_time_hours=round(hours, 1),
                affected_repos=repos,
                example_sessions=sessions_list,
            ))

    # Slow tools (avg latency > 5000ms across 5+ calls)
    for tool, latencies in tool_latencies.items():
        if len(latencies) >= 5:
            avg_ms = sum(latencies) / len(latencies)
            if avg_ms > 5000:
                hours = sum(latencies) / 1000 / 3600
                patterns.append(WastePattern(
                    pattern_type="slow_tool",
                    description=f"Tool '{tool}' averages {avg_ms:.0f}ms latency",
                    occurrences=len(latencies),
                    total_time_hours=round(hours, 1),
                    affected_repos=["global"],
                    example_sessions=[],
                ))

    # Failed retries
    for tool, count in retry_counts.items():
        if count >= 2:
            hours = count * 10 / 60  # ~10 min per failed retry sequence
            patterns.append(WastePattern(
                pattern_type="failed_retry",
                description=f"Tool '{tool}' had {count} retry sequences (3+ consecutive calls)",
                occurrences=count,
                total_time_hours=round(hours, 1),
                affected_repos=["global"],
                example_sessions=[],
            ))

    # Manual processes
    for seq in manual_sequences:
        hours = seq["duration_min"] / 60
        patterns.append(WastePattern(
            pattern_type="manual_process",
            description=f"Session {seq['session_id'][:12]}: {seq['tool_count']} tool calls with only {len(seq['unique_tools'])} unique tools",
            occurrences=1,
            total_time_hours=round(hours, 1),
            affected_repos=[seq["repo"]],
            example_sessions=[seq["session_id"]],
        ))

    return sorted(patterns, key=lambda p: p.total_time_hours, reverse=True)


def _normalize_error(message: str) -> str:
    """Normalize an error message to a common key."""
    if not message:
        return ""
    msg = message.lower().strip()
    # Remove variable parts (paths, IDs, timestamps)
    import re
    msg = re.sub(r'/\S+', '/PATH', msg)
    msg = re.sub(r'\b[0-9a-f]{8,}\b', 'HASH', msg)
    msg = re.sub(r'\d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}', 'TIME', msg)
    return msg[:150]


# ── Proposal generation ───────────────────────────────────────

def generate_proposals(
    patterns: List[WastePattern],
    hourly_rate: float = 50.0,
    implementation_overhead: float = 1.5,
) -> List[ImprovementProposal]:
    """
    Generate improvement proposals from waste patterns.

    Args:
        patterns: Detected waste patterns
        hourly_rate: Developer hourly rate for ROI calculation
        implementation_overhead: Multiplier for implementation time estimates
    """
    proposals = []

    # Group patterns by type
    by_type: Dict[str, List[WastePattern]] = defaultdict(list)
    for p in patterns:
        by_type[p.pattern_type].append(p)

    # 1. Repeated errors → Create fix/skill
    for p in by_type.get("repeated_error", []):
        weekly_hours = p.total_time_hours / 4  # monthly → weekly
        impl_hours = max(1.0, p.occurrences * 0.25) * implementation_overhead
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Fix repeated error: {p.description[:60]}",
            problem=f"This error occurred {p.occurrences} times across {len(p.affected_repos)} repos, wasting ~{p.total_time_hours:.1f} hours.",
            proposed_solution="Root-cause the error and create a permanent fix or mitigation skill.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(p.total_time_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="reliability",
            supporting_evidence=[f"{p.occurrences} occurrences in sessions: {', '.join(p.example_sessions[:3])}"],
        ))

    # 2. Slow tools → Optimize or replace
    for p in by_type.get("slow_tool", []):
        weekly_hours = p.total_time_hours / 4
        impl_hours = 3.0 * implementation_overhead  # optimization task
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Optimize slow tool: {p.description[:60]}",
            problem=f"Tool has {p.occurrences} calls with high latency, wasting ~{p.total_time_hours:.1f} hours total.",
            proposed_solution="Profile the tool, add caching, or replace it with a faster alternative.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(p.total_time_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="performance",
            supporting_evidence=[f"{p.occurrences} slow calls detected"],
        ))

    # 3. Failed retries → Add retry logic or validation
    for p in by_type.get("failed_retry", []):
        weekly_hours = p.total_time_hours / 4
        impl_hours = 2.0 * implementation_overhead
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Reduce retries for tool '{p.description[:50]}'",
            problem=f"Tool had {p.occurrences} retry sequences, wasting ~{p.total_time_hours:.1f} hours.",
            proposed_solution="Add input validation, pre-flight checks, or automatic retry with backoff.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(p.total_time_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="reliability",
            supporting_evidence=[f"{p.occurrences} retry sequences detected"],
        ))

    # 4. Manual processes → Automate
    manual_patterns = by_type.get("manual_process", [])
    if manual_patterns:
        total_manual_hours = sum(p.total_time_hours for p in manual_patterns)
        weekly_hours = total_manual_hours / 4
        impl_hours = len(manual_patterns) * 2.0 * implementation_overhead
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Automate {len(manual_patterns)} manual processes",
            problem=f"{len(manual_patterns)} sessions had long manual tool sequences, wasting ~{total_manual_hours:.1f} hours.",
            proposed_solution="Create composite skills or scripts that combine the repeated tool sequences into single operations.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(total_manual_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="automation",
            supporting_evidence=[f"{len(manual_patterns)} manual sessions detected"],
        ))

    return sorted(proposals, key=lambda p: p.estimated_monthly_hours_saved, reverse=True)


def _priority_from_roi(roi_weeks: float) -> str:
    """Determine priority from ROI break-even time."""
    if roi_weeks <= 1:
        return "critical"
    elif roi_weeks <= 4:
        return "high"
    elif roi_weeks <= 12:
        return "medium"
    return "low"


# ── Output formatting ─────────────────────────────────────────

def format_proposals_markdown(
    proposals: List[ImprovementProposal],
    patterns: List[WastePattern],
    generated_at: str,
) -> str:
    """Format proposals as a markdown document."""
    lines = [
        "# Improvement Proposals",
        "",
        f"Generated: {generated_at}",
        f"Based on analysis of {sum(p.occurrences for p in patterns)} waste events across {len(set(r for p in patterns for r in p.affected_repos))} repos.",
        "",
        "---",
        "",
        "## Summary",
        "",
        "| Metric | Value |",
        "|--------|-------|",
        f"| Total proposals | {len(proposals)} |",
        f"| Critical priority | {sum(1 for p in proposals if p.priority == 'critical')} |",
        f"| Total monthly hours wasted | {sum(p.estimated_monthly_hours_saved for p in proposals):.1f}h |",
        f"| Total weekly hours recoverable | {sum(p.estimated_weekly_hours_saved for p in proposals):.1f}h |",
        f"| Implementation investment | {sum(p.implementation_hours for p in proposals):.1f}h |",
        "",
        "---",
        "",
    ]

    for i, prop in enumerate(proposals, 1):
        lines.extend([
            f"## {i}. {prop.title}",
            "",
            f"**Priority:** {prop.priority.upper()}  ",
            f"**Area:** {prop.affected_area}  ",
            f"**ROI break-even:** {prop.roi_weeks:.1f} weeks",
            "",
            "### Problem",
            "",
            prop.problem,
            "",
            "### Proposed Solution",
            "",
            prop.proposed_solution,
            "",
            "### ROI Estimate",
            "",
            f"- Weekly hours saved: **{prop.estimated_weekly_hours_saved}h**",
            f"- Monthly hours saved: **{prop.estimated_monthly_hours_saved}h**",
            f"- Implementation time: **{prop.implementation_hours}h**",
            f"- Break-even: **{prop.roi_weeks:.1f} weeks**",
            "",
        ])

        if prop.supporting_evidence:
            lines.extend([
                "### Evidence",
                "",
            ])
            for ev in prop.supporting_evidence:
                lines.append(f"- {ev}")
            lines.append("")

        lines.extend(["---", ""])

    # Waste pattern appendix
    lines.extend([
        "## Appendix: Detected Waste Patterns",
        "",
        "| Type | Description | Occurrences | Hours |",
        "|------|-------------|-------------|-------|",
    ])
    for p in patterns[:20]:
        lines.append(
            f"| {p.pattern_type} | {p.description[:60]} | {p.occurrences} | {p.total_time_hours}h |"
        )
    lines.append("")

    return "\n".join(lines)


def format_proposals_json(proposals: List[ImprovementProposal]) -> str:
    """Format proposals as JSON."""
    return json.dumps(
        [asdict(p) for p in proposals],
        indent=2,
        default=str,
    )


# ── Main ──────────────────────────────────────────────────────

def main():
    parser = argparse.ArgumentParser(description="Generate improvement proposals from session analytics")
    parser.add_argument("--input", required=True, help="Path to session analytics JSON file")
    parser.add_argument("--output", help="Output file path (default: stdout)")
    parser.add_argument("--format", choices=["markdown", "json"], default="markdown", help="Output format")
    parser.add_argument("--hourly-rate", type=float, default=50.0, help="Developer hourly rate for ROI calc")
    parser.add_argument("--threshold", type=float, default=2.0, help="Min monthly hours to include a proposal")

    args = parser.parse_args()

    with open(args.input) as f:
        data = json.load(f)

    sessions = data if isinstance(data, list) else data.get("sessions", [])

    if not sessions:
        print("No sessions found in input file.", file=sys.stderr)
        sys.exit(1)

    # Analyze
    patterns = analyze_sessions(sessions)
    proposals = generate_proposals(patterns, hourly_rate=args.hourly_rate)

    # Filter by threshold
    proposals = [p for p in proposals if p.estimated_monthly_hours_saved >= args.threshold]

    generated_at = datetime.now(timezone.utc).isoformat()

    if args.format == "markdown":
        output = format_proposals_markdown(proposals, patterns, generated_at)
    else:
        output = format_proposals_json(proposals)

    if args.output:
        with open(args.output, "w") as f:
            f.write(output)
        print(f"Wrote {len(proposals)} proposals to {args.output}", file=sys.stderr)
    else:
        print(output)


if __name__ == "__main__":
    main()
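The ROI-to-priority ladder used by `_priority_from_roi` can be exercised in isolation — break-even within a week is critical, within a month high, within a quarter medium, otherwise low. A minimal sketch using the same thresholds:

```python
def priority_from_roi(roi_weeks: float) -> str:
    # Same thresholds as _priority_from_roi in the script above.
    if roi_weeks <= 1:
        return "critical"
    if roi_weeks <= 4:
        return "high"
    if roi_weeks <= 12:
        return "medium"
    return "low"

# A tool costing 1.5h/week with a 3h fix breaks even in 2 weeks.
roi_weeks = 3.0 / 1.5
print(priority_from_roi(roi_weeks))  # prints "high"
```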
275
scripts/knowledge_gap_identifier.py
Normal file
@@ -0,0 +1,275 @@
"""
Knowledge Gap Identifier — Pipeline 10.7

Cross-references code, docs, and tests to find gaps:
- Undocumented functions/classes
- Untested code paths
- Documented but missing implementations
- Test files without corresponding source

Produces a gap report with severity and suggestions.
"""

from __future__ import annotations

import ast
import os
import re
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Dict, List, Optional, Set


class GapSeverity(Enum):
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


class GapType(Enum):
    UNDOCUMENTED = "undocumented"
    UNTESTED = "untested"
    MISSING_IMPLEMENTATION = "missing_implementation"
    ORPHAN_TEST = "orphan_test"
    STALE_DOC = "stale_doc"


@dataclass
class Gap:
    """A single knowledge gap."""
    gap_type: GapType
    severity: GapSeverity
    file: str
    line: Optional[int]
    name: str
    description: str
    suggestion: str


@dataclass
class GapReport:
    """Full gap analysis report."""
    repo_path: str
    gaps: List[Gap] = field(default_factory=list)
    stats: Dict[str, int] = field(default_factory=dict)

    def summary(self) -> str:
        lines = [f"Gap Report for {self.repo_path}", "=" * 40]
        by_type: Dict[str, List[Gap]] = {}
        for g in self.gaps:
            by_type.setdefault(g.gap_type.value, []).append(g)

        for gtype, items in sorted(by_type.items()):
            lines.append(f"\n{gtype.upper()} ({len(items)}):")
            for g in items:
                loc = f"{g.file}:{g.line}" if g.line else g.file
                lines.append(f"  [{g.severity.value}] {g.name} @ {loc}")
                lines.append(f"    {g.description}")

        lines.append(f"\nTotal gaps: {len(self.gaps)}")
        self.stats = {k: len(v) for k, v in by_type.items()}
        return "\n".join(lines)

    def to_dict(self) -> dict:
        # Key stats by the enum's string value so the dict is JSON-serializable
        by_type = {gt.value: [g for g in self.gaps if g.gap_type == gt] for gt in GapType}
        return {
            "repo_path": self.repo_path,
            "total_gaps": len(self.gaps),
            "stats": {k: len(v) for k, v in by_type.items() if v},
            "gaps": [
                {
                    "type": g.gap_type.value,
                    "severity": g.severity.value,
                    "file": g.file,
                    "line": g.line,
                    "name": g.name,
                    "description": g.description,
                    "suggestion": g.suggestion,
                }
                for g in self.gaps
            ],
        }


def _collect_python_files(root: Path) -> List[Path]:
    """Collect .py files, excluding venv/node_modules/.git."""
    skip = {".git", "venv", "env", ".venv", "node_modules", "__pycache__", ".tox", ".mypy_cache"}
    files = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in skip]
        for f in filenames:
            if f.endswith(".py"):
                files.append(Path(dirpath) / f)
    return files


def _extract_python_symbols(filepath: Path) -> Set[str]:
    """Extract top-level function and class names from a Python file."""
    symbols: Set[str] = set()
    try:
        source = filepath.read_text(encoding="utf-8", errors="replace")
        tree = ast.parse(source, filename=str(filepath))
    except (SyntaxError, UnicodeDecodeError):
        return symbols

    for node in ast.iter_child_nodes(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.add(node.name)
    return symbols


def _extract_doc_symbols(filepath: Path) -> Set[str]:
    """Extract function/class names mentioned in markdown docs."""
    symbols: Set[str] = set()
    try:
        text = filepath.read_text(encoding="utf-8", errors="replace")
    except (UnicodeDecodeError, OSError):
        return symbols

    # Match backtick-quoted identifiers: `ClassName`, `func_name`, `func()`
    for m in re.finditer(r"`([A-Za-z_]\w+)(?:\(\))?`", text):
        symbols.add(m.group(1))
    # Match "## ClassName" or "### func_name" headings
    for m in re.finditer(r"^#{1,4}\s+(\w+)", text, re.MULTILINE):
        symbols.add(m.group(1))
    return symbols


def _collect_test_files(root: Path) -> Dict[str, Path]:
    """Map test module names to their file paths."""
    test_map: Dict[str, Path] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in {".git", "venv", "node_modules"}]
        for f in filenames:
            if f.startswith("test_") and f.endswith(".py"):
                # test_foo.py -> foo
                module_name = f[5:-3]
                test_map[module_name] = Path(dirpath) / f
    return test_map


class KnowledgeGapIdentifier:
    """Analyzes a repo for knowledge gaps between code, docs, and tests."""

    def analyze(self, repo_path: str) -> GapReport:
        root = Path(repo_path).resolve()
        report = GapReport(repo_path=str(root))

        if not root.is_dir():
            report.gaps.append(Gap(
                gap_type=GapType.UNDOCUMENTED,
                severity=GapSeverity.ERROR,
                file=str(root),
                line=None,
                name="repo",
                description="Path is not a directory",
                suggestion="Provide a valid repo directory",
            ))
            return report

        # Collect artifacts
        py_files = _collect_python_files(root)
        doc_files = list(root.glob("docs/**/*.md")) + list(root.glob("*.md"))
        test_map = _collect_test_files(root / "tests") if (root / "tests").is_dir() else {}

        # Extract symbols from each source file
        source_symbols: Dict[str, Set[str]] = {}  # relative path -> symbols
        all_source_symbols: Set[str] = set()

        for pf in py_files:
            rel = str(pf.relative_to(root))
            # Skip test files and setup/config
            if "/tests/" in rel or rel.startswith("tests/") or rel.startswith("test_"):
                continue
            if pf.name in ("setup.py", "conftest.py", "conf.py"):
                continue

            syms = _extract_python_symbols(pf)
            if syms:
                source_symbols[rel] = syms
                all_source_symbols.update(syms)

        # Extract documented symbols
        doc_symbols: Set[str] = set()
        for df in doc_files:
            doc_symbols.update(_extract_doc_symbols(df))

        # Extract test-covered module names
        tested_modules: Set[str] = set(test_map.keys())

        # --- Find gaps ---

        # 1. Undocumented: source symbols not referenced in any doc
        for rel_path, syms in source_symbols.items():
            for sym in sorted(syms):
                if sym.startswith("_") and not sym.startswith("__"):
                    continue  # Skip private
                if sym not in doc_symbols:
                    report.gaps.append(Gap(
                        gap_type=GapType.UNDOCUMENTED,
                        severity=GapSeverity.WARNING,
                        file=rel_path,
                        line=None,
                        name=sym,
                        description=f"{sym} defined in {rel_path} but not referenced in any docs",
                        suggestion=f"Add documentation for {sym} in a .md file",
                    ))

        # 2. Untested: source modules without a corresponding test file
        for rel_path in source_symbols:
            module_name = Path(rel_path).stem
            if module_name not in tested_modules and module_name not in ("__init__", "main", "config"):
                report.gaps.append(Gap(
                    gap_type=GapType.UNTESTED,
                    severity=GapSeverity.ERROR,
                    file=rel_path,
                    line=None,
                    name=module_name,
                    description=f"No test file found for {rel_path}",
                    suggestion=f"Create tests/test_{module_name}.py",
                ))

        # 3. Missing implementation: docs reference a symbol not found in any source
        referenced_but_missing = doc_symbols - all_source_symbols
        for sym in sorted(referenced_but_missing):
            # Filter out common non-code terms
            if sym.lower() in {"todo", "fixme", "note", "example", "usage", "api",
                               "install", "setup", "config", "license", "contributing",
                               "changelog", "readme", "python", "bash", "json", "yaml",
                               "http", "url", "cli", "gui", "ui", "rest"}:
                continue
            if len(sym) < 3:
                continue
            report.gaps.append(Gap(
                gap_type=GapType.MISSING_IMPLEMENTATION,
                severity=GapSeverity.INFO,
                file="(docs)",
                line=None,
                name=sym,
                description=f"{sym} referenced in docs but not found in source code",
                suggestion=f"Verify if {sym} should be implemented or update docs",
            ))

# 4. Orphan tests: test files without matching source
|
||||||
|
for test_mod, test_path in test_map.items():
|
||||||
|
if test_mod not in tested_modules and not any(
|
||||||
|
test_mod in Path(f).stem for f in source_symbols
|
||||||
|
):
|
||||||
|
# Check if any source file partially matches
|
||||||
|
matches_source = any(test_mod.replace("_", "-") in f or test_mod.replace("_", "") in Path(f).stem
|
||||||
|
for f in source_symbols)
|
||||||
|
if not matches_source:
|
||||||
|
rel = str(test_path.relative_to(root))
|
||||||
|
report.gaps.append(Gap(
|
||||||
|
gap_type=GapType.ORPHAN_TEST,
|
||||||
|
severity=GapSeverity.WARNING,
|
||||||
|
file=rel,
|
||||||
|
line=None,
|
||||||
|
name=test_mod,
|
||||||
|
description=f"Test file {rel} exists but no matching source module found",
|
||||||
|
suggestion=f"Verify if the source was renamed or removed",
|
||||||
|
))
|
||||||
|
|
||||||
|
return report
|
||||||
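The `Gap`, `GapType`, and `GapSeverity` types used above are defined earlier in the diff and not visible in this hunk. A minimal self-contained sketch of that data model, with field names and enum members inferred purely from the call sites here, could look like:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class GapType(Enum):
    UNDOCUMENTED = "undocumented"
    UNTESTED = "untested"
    MISSING_IMPLEMENTATION = "missing_implementation"
    ORPHAN_TEST = "orphan_test"


class GapSeverity(Enum):
    ERROR = "error"
    WARNING = "warning"
    INFO = "info"


@dataclass
class Gap:
    gap_type: GapType
    severity: GapSeverity
    file: str
    line: Optional[int]
    name: str
    description: str
    suggestion: str


@dataclass
class GapReport:
    gaps: List[Gap]

    def worst_first(self) -> List[Gap]:
        # Stable sort: errors before warnings before info
        order = {GapSeverity.ERROR: 0, GapSeverity.WARNING: 1, GapSeverity.INFO: 2}
        return sorted(self.gaps, key=lambda g: order[g.severity])


report = GapReport(gaps=[
    Gap(GapType.UNDOCUMENTED, GapSeverity.WARNING, "core.py", None, "run",
        "run defined in core.py but not referenced in any docs",
        "Add documentation for run in a .md file"),
    Gap(GapType.UNTESTED, GapSeverity.ERROR, "core.py", None, "core",
        "No test file found for core.py", "Create tests/test_core.py"),
])
print([g.gap_type.value for g in report.worst_first()])
```

The `worst_first` helper is hypothetical; it just illustrates one plausible way a caller might consume `report.gaps`.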
131  scripts/knowledge_staleness_check.py  Normal file
@@ -0,0 +1,131 @@
#!/usr/bin/env python3
"""
Knowledge Store Staleness Detector — detect stale knowledge entries by comparing source file hashes.

Usage:
    python3 scripts/knowledge_staleness_check.py --index knowledge/index.json
    python3 scripts/knowledge_staleness_check.py --index knowledge/index.json --json
    python3 scripts/knowledge_staleness_check.py --index knowledge/index.json --fix
"""

import argparse
import hashlib
import json
import os
from typing import Any, Dict, List, Optional


def compute_file_hash(filepath: str) -> Optional[str]:
    """Compute the SHA-256 hash of a file. Returns None if the file doesn't exist."""
    try:
        with open(filepath, "rb") as f:
            return "sha256:" + hashlib.sha256(f.read()).hexdigest()
    except (FileNotFoundError, IsADirectoryError, PermissionError):
        return None


def check_staleness(index_path: str, repo_root: str = ".") -> List[Dict[str, Any]]:
    """Check all entries in the knowledge index for staleness.

    Returns a list of entries annotated with staleness info:
    - status: "fresh" | "stale" | "missing_source" | "no_hash" | "no_source"
    - current_hash: computed hash (if the source exists)
    - the stored hash remains available under the entry's "source_hash" key
    """
    with open(index_path) as f:
        data = json.load(f)

    facts = data.get("facts", [])
    results = []

    for entry in facts:
        source_file = entry.get("source_file")
        stored_hash = entry.get("source_hash")

        if not source_file:
            results.append({**entry, "status": "no_source", "current_hash": None})
            continue

        full_path = os.path.join(repo_root, source_file)
        current_hash = compute_file_hash(full_path)

        if current_hash is None:
            results.append({**entry, "status": "missing_source", "current_hash": None})
        elif not stored_hash:
            results.append({**entry, "status": "no_hash", "current_hash": current_hash})
        elif current_hash != stored_hash:
            results.append({**entry, "status": "stale", "current_hash": current_hash})
        else:
            results.append({**entry, "status": "fresh", "current_hash": current_hash})

    return results


def fix_hashes(index_path: str, repo_root: str = ".") -> int:
    """Add hashes to entries missing them. Returns the count of fixed entries."""
    with open(index_path) as f:
        data = json.load(f)

    fixed = 0
    for entry in data.get("facts", []):
        if entry.get("source_hash"):
            continue
        source_file = entry.get("source_file")
        if not source_file:
            continue
        full_path = os.path.join(repo_root, source_file)
        h = compute_file_hash(full_path)
        if h:
            entry["source_hash"] = h
            fixed += 1

    with open(index_path, "w") as f:
        json.dump(data, f, indent=2)

    return fixed


def main():
    parser = argparse.ArgumentParser(description="Check knowledge store staleness")
    parser.add_argument("--index", required=True, help="Path to knowledge/index.json")
    parser.add_argument("--repo", default=".", help="Repo root for source file resolution")
    parser.add_argument("--json", action="store_true", help="Output as JSON")
    parser.add_argument("--fix", action="store_true", help="Add hashes to entries missing them")

    args = parser.parse_args()

    if args.fix:
        fixed = fix_hashes(args.index, args.repo)
        print(f"Fixed {fixed} entries with missing hashes.")
        return

    results = check_staleness(args.index, args.repo)

    if args.json:
        print(json.dumps(results, indent=2))
    else:
        stale = [r for r in results if r["status"] != "fresh"]
        fresh = [r for r in results if r["status"] == "fresh"]

        print("Knowledge Store Staleness Check")
        print(f"  Total entries: {len(results)}")
        print(f"  Fresh:         {len(fresh)}")
        print(f"  Stale/Issues:  {len(stale)}")
        print()

        if stale:
            print("Issues found:")
            for r in stale:
                status = r["status"]
                fact = r.get("fact", "?")[:60]
                source = r.get("source_file", "?")
                print(f"  [{status}] {source}: {fact}")
        else:
            print("All entries are fresh!")


if __name__ == "__main__":
    main()
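The core staleness rule is simple: an entry is stale when the current hash of its source file no longer matches the hash stored in the index. That rule can be exercised in isolation; the file name and contents below are illustrative:

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    # Same "sha256:<hex>" format the staleness checker stores in the index
    return "sha256:" + hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "module.py"
    src.write_text("def answer():\n    return 42\n")

    stored_hash = file_hash(src)  # hash recorded when the fact was captured
    src.write_text("def answer():\n    return 43\n")  # source changes later

    status = "stale" if file_hash(src) != stored_hash else "fresh"
    print(status)
```

Because the hash covers the whole file, any byte-level change flags the entry, even a comment edit; that is a deliberately conservative trade-off.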
550  scripts/perf_bottleneck_finder.py  Normal file
@@ -0,0 +1,550 @@
#!/usr/bin/env python3
"""
Performance Bottleneck Finder — identify slow tests, builds, and CI steps.

Analyzes:
1. Pytest output for slow tests
2. Build logs for slow steps
3. CI workflow durations
4. File system for large/slow artifacts

Usage:
    python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo
    python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo --json
    python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo --report metrics/perf_report.md

Weekly cron:
    0 9 * * 1 cd /path/to/compounding-intelligence && python3 scripts/perf_bottleneck_finder.py --repo /path/to/target --report metrics/perf_report.md
"""

import argparse
import json
import os
import re
import subprocess
import sys
from collections import defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple


# ── Configuration ──────────────────────────────────────────────────

SLOW_TEST_THRESHOLD_S = 2.0        # Tests slower than this are flagged
SLOW_BUILD_STEP_THRESHOLD_S = 10.0
TOP_N_BOTTLENECKS = 10             # Report top N bottlenecks
PYTEST_DURATIONS_COUNT = 20        # Number of slow tests to collect
LOG_EXTENSIONS = {".log", ".txt"}


@dataclass
class Bottleneck:
    """A single performance bottleneck."""
    category: str                    # "test", "build", "ci", "artifact", "import"
    name: str                        # What's slow
    duration_s: float                # How long it takes
    severity: str                    # "critical", "warning", "info"
    recommendation: str              # How to fix
    file_path: Optional[str] = None
    line_number: Optional[int] = None


@dataclass
class PerfReport:
    """Full performance report."""
    timestamp: str
    repo_path: str
    bottlenecks: List[Bottleneck] = field(default_factory=list)
    summary: Dict[str, Any] = field(default_factory=dict)
    test_stats: Dict[str, Any] = field(default_factory=dict)
    build_stats: Dict[str, Any] = field(default_factory=dict)
    ci_stats: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)


# ── Test Analysis ──────────────────────────────────────────────────

def find_slow_tests_pytest(repo_path: str) -> List[Bottleneck]:
    """Parse slow tests from pytest's cached durations, if available."""
    bottlenecks = []

    try:
        # Collect-only run to confirm pytest works in this repo
        subprocess.run(
            ["python3", "-m", "pytest", "--co", "-q", "--durations=0"],
            cwd=repo_path, capture_output=True, text=True, timeout=30
        )
        # If tests exist, try to read durations from the last run's cache
        durations_file = os.path.join(repo_path, ".pytest_cache", "v", "cache", "durations")
        if os.path.exists(durations_file):
            with open(durations_file) as f:
                for line in f:
                    parts = line.strip().split()
                    if len(parts) >= 2:
                        try:
                            duration = float(parts[0])
                            test_name = " ".join(parts[1:])
                            if duration > SLOW_TEST_THRESHOLD_S:
                                severity = "critical" if duration > 10 else "warning"
                                bottlenecks.append(Bottleneck(
                                    category="test",
                                    name=test_name,
                                    duration_s=duration,
                                    severity=severity,
                                    recommendation=f"Test takes {duration:.1f}s. Consider mocking slow I/O, using fixtures, or marking with @pytest.mark.slow."
                                ))
                        except ValueError:
                            continue
    except (subprocess.TimeoutExpired, FileNotFoundError):
        pass

    return bottlenecks


def find_slow_tests_by_scan(repo_path: str) -> List[Bottleneck]:
    """Scan test files for patterns that indicate slow tests."""
    bottlenecks = []
    test_patterns = [
        (r"time\.sleep\((\d+(?:\.\d+)?)\)", "Contains time.sleep() — consider using mock or async wait"),
        (r"subprocess\.run\(.*timeout=(\d+)", "Subprocess with timeout — may block test"),
        (r"requests\.(get|post|put|delete)\(", "Real HTTP call — mock with responses or httpretty"),
        (r"open\([^)]*['\"]w['\"]\)", "File I/O in test — use tmp_path fixture"),
    ]

    for root, dirs, files in os.walk(repo_path):
        # Skip hidden and cache dirs
        dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__', 'node_modules', '.git'))]

        for fname in files:
            if not (fname.startswith("test_") or fname.endswith("_test.py")):
                continue
            if not fname.endswith(".py"):
                continue

            fpath = os.path.join(root, fname)
            rel_path = os.path.relpath(fpath, repo_path)

            try:
                with open(fpath) as f:
                    lines = f.readlines()
            except (PermissionError, UnicodeDecodeError):
                continue

            for i, line in enumerate(lines):
                for pattern, recommendation in test_patterns:
                    match = re.search(pattern, line)
                    if match:
                        if "sleep" in pattern:
                            try:
                                duration = float(match.group(1))
                            except (ValueError, IndexError):
                                duration = 1.0
                        elif "timeout" in pattern:
                            try:
                                duration = float(match.group(1))
                            except (ValueError, IndexError):
                                duration = 10.0
                        else:
                            duration = 2.0  # Estimated

                        bottlenecks.append(Bottleneck(
                            category="test",
                            name=f"{rel_path}:{i+1}",
                            duration_s=duration,
                            severity="warning" if duration < 5 else "critical",
                            recommendation=recommendation,
                            file_path=rel_path,
                            line_number=i + 1
                        ))

    return bottlenecks


# ── Build Analysis ─────────────────────────────────────────────────

def analyze_build_artifacts(repo_path: str) -> List[Bottleneck]:
    """Find large build artifacts that slow down builds."""
    bottlenecks = []
    large_dirs = {
        "node_modules": "Consider using npm ci --production or yarn --production",
        "__pycache__": "Consider .gitignore and cleaning before builds",
        ".tox": "Consider caching tox environments",
        ".pytest_cache": "Consider cleaning between CI runs",
        "dist": "Check if dist/ artifacts are being rebuilt unnecessarily",
        "build": "Check if build/ artifacts are being rebuilt unnecessarily",
        ".next": "Next.js cache — consider incremental builds",
        "venv": "Virtual env in repo — move outside or use Docker",
    }

    for dirname, recommendation in large_dirs.items():
        dirpath = os.path.join(repo_path, dirname)
        if os.path.isdir(dirpath):
            total_size = 0
            file_count = 0
            for root, dirs, files in os.walk(dirpath):
                for f in files:
                    try:
                        fpath = os.path.join(root, f)
                        total_size += os.path.getsize(fpath)
                        file_count += 1
                    except OSError:
                        pass

            if total_size > 10 * 1024 * 1024:  # > 10MB
                size_mb = total_size / (1024 * 1024)
                bottlenecks.append(Bottleneck(
                    category="build",
                    name=f"{dirname}/ ({size_mb:.1f}MB, {file_count} files)",
                    duration_s=size_mb * 0.5,  # Rough estimate
                    severity="critical" if size_mb > 100 else "warning",
                    recommendation=recommendation
                ))

    return bottlenecks


def analyze_makefile_targets(repo_path: str) -> List[Bottleneck]:
    """Analyze Makefiles for potentially slow targets."""
    bottlenecks = []
    makefiles = []

    for root, dirs, files in os.walk(repo_path):
        dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__'))]
        for f in files:
            if f in ("Makefile", "makefile", "GNUmakefile"):
                makefiles.append(os.path.join(root, f))

    slow_patterns = [
        (r"pip install", "pip install without --no-deps or constraints"),
        (r"npm install(?!.*--production)", "npm install without --production flag"),
        (r"docker build", "Docker build — consider multi-stage and layer caching"),
        (r"pytest(?!.*(?:-x|--maxfail))", "pytest without early exit on failure"),
        (r"mypy", "Type checking — consider incremental mode"),
    ]

    for mfile in makefiles:
        rel_path = os.path.relpath(mfile, repo_path)
        try:
            with open(mfile) as f:
                content = f.read()
        except (PermissionError, UnicodeDecodeError):
            continue

        for pattern, recommendation in slow_patterns:
            if re.search(pattern, content):
                bottlenecks.append(Bottleneck(
                    category="build",
                    name=f"{rel_path}: {pattern}",
                    duration_s=5.0,
                    severity="info",
                    recommendation=recommendation,
                    file_path=rel_path
                ))

    return bottlenecks


# ── CI Analysis ────────────────────────────────────────────────────

def analyze_github_actions(repo_path: str) -> List[Bottleneck]:
    """Analyze GitHub Actions workflow files for inefficiencies."""
    bottlenecks = []
    workflow_dir = os.path.join(repo_path, ".github", "workflows")

    if not os.path.isdir(workflow_dir):
        return bottlenecks

    slow_patterns = [
        (r"runs-on:\s*ubuntu-latest", 0, "Consider caching dependencies between runs"),
        (r"npm install", 2, "Use npm ci instead of npm install for reproducible builds"),
        (r"pip install(?!.*--cache-dir)", 2, "Add --cache-dir or use pip cache action"),
        (r"docker build(?!.*--cache-from)", 5, "Use Docker layer caching"),
        (r"python -m pytest(?!.*(?:-n|--numprocesses))", 3, "Consider pytest-xdist for parallel test execution"),
    ]

    for fname in os.listdir(workflow_dir):
        if not fname.endswith(('.yml', '.yaml')):
            continue

        fpath = os.path.join(workflow_dir, fname)
        try:
            with open(fpath) as f:
                content = f.read()
        except (PermissionError, UnicodeDecodeError):
            continue

        for pattern, est_savings, recommendation in slow_patterns:
            if re.search(pattern, content):
                bottlenecks.append(Bottleneck(
                    category="ci",
                    name=f"{fname}: {pattern}",
                    duration_s=est_savings,
                    severity="info",
                    recommendation=recommendation,
                    file_path=f".github/workflows/{fname}"
                ))

    return bottlenecks


def analyze_gitea_ci(repo_path: str) -> List[Bottleneck]:
    """Analyze Gitea/Drone CI config files."""
    bottlenecks = []
    ci_files = [".gitea/workflows", ".drone.yml", ".woodpecker.yml"]

    for ci_path in ci_files:
        full_path = os.path.join(repo_path, ci_path)
        if os.path.isfile(full_path):
            try:
                with open(full_path) as f:
                    content = f.read()
            except (PermissionError, UnicodeDecodeError):
                continue

            if "pip install" in content and "--cache-dir" not in content:
                bottlenecks.append(Bottleneck(
                    category="ci",
                    name=f"{ci_path}: pip without cache",
                    duration_s=5.0,
                    severity="warning",
                    recommendation="Add --cache-dir or mount pip cache volume",
                    file_path=ci_path
                ))

        elif os.path.isdir(full_path):
            for fname in os.listdir(full_path):
                if not fname.endswith(('.yml', '.yaml')):
                    continue
                fpath = os.path.join(full_path, fname)
                try:
                    with open(fpath) as f:
                        content = f.read()
                except (PermissionError, UnicodeDecodeError):
                    continue

                if "pip install" in content and "--cache-dir" not in content:
                    bottlenecks.append(Bottleneck(
                        category="ci",
                        name=f"{ci_path}/{fname}: pip without cache",
                        duration_s=5.0,
                        severity="warning",
                        recommendation="Add --cache-dir or mount pip cache volume",
                        file_path=f"{ci_path}/{fname}"
                    ))

    return bottlenecks


# ── Import Analysis ────────────────────────────────────────────────

def find_slow_imports(repo_path: str) -> List[Bottleneck]:
    """Find Python files with heavy import chains."""
    bottlenecks = []
    heavy_imports = {
        "pandas": 0.5,
        "numpy": 0.3,
        "torch": 2.0,
        "tensorflow": 3.0,
        "scipy": 0.5,
        "matplotlib": 0.8,
        "sklearn": 0.5,
        "transformers": 1.5,
    }

    for root, dirs, files in os.walk(repo_path):
        dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__', 'node_modules'))]
        for fname in files:
            if not fname.endswith(".py"):
                continue

            fpath = os.path.join(root, fname)
            rel_path = os.path.relpath(fpath, repo_path)

            try:
                with open(fpath) as f:
                    lines = f.readlines()
            except (PermissionError, UnicodeDecodeError):
                continue

            for i, line in enumerate(lines):
                stripped = line.strip()
                if stripped.startswith("import ") or stripped.startswith("from "):
                    for heavy, est_time in heavy_imports.items():
                        if heavy in stripped:
                            bottlenecks.append(Bottleneck(
                                category="import",
                                name=f"{rel_path}:{i+1}: import {heavy}",
                                duration_s=est_time,
                                severity="info" if est_time < 1.0 else "warning",
                                recommendation=f"Heavy import ({heavy} ~{est_time}s). Consider lazy import or conditional import.",
                                file_path=rel_path,
                                line_number=i + 1
                            ))

    return bottlenecks


# ── Report Generation ──────────────────────────────────────────────

def severity_sort_key(b: Bottleneck) -> Tuple[int, float]:
    """Sort by severity, then by duration descending."""
    sev_order = {"critical": 0, "warning": 1, "info": 2}
    return (sev_order.get(b.severity, 3), -b.duration_s)


def generate_report(repo_path: str) -> PerfReport:
    """Run all analyses and generate a performance report."""
    report = PerfReport(
        timestamp=datetime.now(timezone.utc).isoformat(),
        repo_path=os.path.abspath(repo_path)
    )

    # Collect all bottlenecks
    all_bottlenecks = []

    print("Scanning for slow tests (pytest cache)...")
    all_bottlenecks.extend(find_slow_tests_pytest(repo_path))

    print("Scanning for slow test patterns...")
    all_bottlenecks.extend(find_slow_tests_by_scan(repo_path))

    print("Analyzing build artifacts...")
    all_bottlenecks.extend(analyze_build_artifacts(repo_path))

    print("Analyzing Makefiles...")
    all_bottlenecks.extend(analyze_makefile_targets(repo_path))

    print("Analyzing CI workflows...")
    all_bottlenecks.extend(analyze_github_actions(repo_path))
    all_bottlenecks.extend(analyze_gitea_ci(repo_path))

    print("Scanning for heavy imports...")
    all_bottlenecks.extend(find_slow_imports(repo_path))

    # Sort by severity and duration
    all_bottlenecks.sort(key=severity_sort_key)
    report.bottlenecks = all_bottlenecks[:TOP_N_BOTTLENECKS * 2]  # Keep extras for stats

    # Compute summary
    by_category = defaultdict(list)
    for b in all_bottlenecks:
        by_category[b.category].append(b)

    report.summary = {
        "total_bottlenecks": len(all_bottlenecks),
        "critical": sum(1 for b in all_bottlenecks if b.severity == "critical"),
        "warning": sum(1 for b in all_bottlenecks if b.severity == "warning"),
        "info": sum(1 for b in all_bottlenecks if b.severity == "info"),
        "estimated_total_slowdown_s": sum(b.duration_s for b in all_bottlenecks),
        "by_category": {cat: len(items) for cat, items in by_category.items()},
    }

    report.test_stats = {
        "slow_tests": len(by_category.get("test", [])),
        "total_estimated_s": sum(b.duration_s for b in by_category.get("test", [])),
    }

    report.build_stats = {
        "build_issues": len(by_category.get("build", [])),
        "total_estimated_s": sum(b.duration_s for b in by_category.get("build", [])),
    }

    report.ci_stats = {
        "ci_issues": len(by_category.get("ci", [])),
        "total_estimated_s": sum(b.duration_s for b in by_category.get("ci", [])),
    }

    return report


def format_markdown(report: PerfReport) -> str:
    """Format the report as markdown."""
    lines = []
    lines.append("# Performance Bottleneck Report")
    lines.append("")
    lines.append(f"Generated: {report.timestamp}")
    lines.append(f"Repository: {report.repo_path}")
    lines.append("")

    # Summary
    s = report.summary
    lines.append("## Summary")
    lines.append("")
    lines.append(f"- **Total bottlenecks:** {s['total_bottlenecks']}")
    lines.append(f"- **Critical:** {s['critical']} | **Warning:** {s['warning']} | **Info:** {s['info']}")
    lines.append(f"- **Estimated total slowdown:** {s['estimated_total_slowdown_s']:.1f}s")
    lines.append(f"- **By category:** {', '.join(f'{k}: {v}' for k, v in s['by_category'].items())}")
    lines.append("")

    # Top bottlenecks
    lines.append(f"## Top {min(TOP_N_BOTTLENECKS, len(report.bottlenecks))} Bottlenecks")
    lines.append("")

    for i, b in enumerate(report.bottlenecks[:TOP_N_BOTTLENECKS], 1):
        icon = {"critical": "🔴", "warning": "🟡", "info": "🔵"}.get(b.severity, "⚪")
        loc = f" ({b.file_path}:{b.line_number})" if b.file_path else ""
        lines.append(f"{i}. {icon} **{b.category}** — {b.name}{loc}")
        lines.append(f"   - Duration: ~{b.duration_s:.1f}s | Severity: {b.severity}")
        lines.append(f"   - Fix: {b.recommendation}")
    lines.append("")

    # Category breakdowns
    for cat in ["test", "build", "ci", "import"]:
        items = [b for b in report.bottlenecks if b.category == cat]
        if items:
            lines.append(f"## {cat.title()} Bottlenecks")
            lines.append("")
            for b in items:
                icon = {"critical": "🔴", "warning": "🟡", "info": "🔵"}.get(b.severity, "⚪")
                loc = f" ({b.file_path}:{b.line_number})" if b.file_path else ""
                lines.append(f"- {icon} {b.name}{loc} — ~{b.duration_s:.1f}s — {b.recommendation}")
            lines.append("")

    return "\n".join(lines)


# ── Main ───────────────────────────────────────────────────────────

def main():
    global SLOW_TEST_THRESHOLD_S
    parser = argparse.ArgumentParser(description="Performance Bottleneck Finder")
    parser.add_argument("--repo", default=".", help="Path to repository to analyze")
    parser.add_argument("--json", action="store_true", help="Output as JSON")
    parser.add_argument("--report", help="Write markdown report to file")
    parser.add_argument("--threshold", type=float, default=SLOW_TEST_THRESHOLD_S,
                        help="Slow test threshold in seconds")
    args = parser.parse_args()

    SLOW_TEST_THRESHOLD_S = args.threshold

    if not os.path.isdir(args.repo):
        print(f"Error: {args.repo} is not a directory", file=sys.stderr)
        sys.exit(1)

    report = generate_report(args.repo)

    if args.json:
        print(json.dumps(report.to_dict(), indent=2))
    else:
        md = format_markdown(report)
        if args.report:
            os.makedirs(os.path.dirname(args.report) or ".", exist_ok=True)
            with open(args.report, "w") as f:
                f.write(md)
            print(f"Report written to {args.report}")
        else:
            print(md)

    # Exit code: 1 if critical bottlenecks found
    if report.summary.get("critical", 0) > 0:
        sys.exit(1)


if __name__ == "__main__":
    main()
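The severity-first ordering used by `severity_sort_key` can be checked with a trimmed-down copy of the dataclass and the sort key; the sample bottlenecks below are made up:

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    category: str
    name: str
    duration_s: float
    severity: str


def severity_sort_key(b: Bottleneck):
    # critical before warning before info; within a band, longer durations first
    sev_order = {"critical": 0, "warning": 1, "info": 2}
    return (sev_order.get(b.severity, 3), -b.duration_s)


items = [
    Bottleneck("ci", "pip without cache", 5.0, "warning"),
    Bottleneck("test", "test_slow_io", 12.0, "critical"),
    Bottleneck("import", "import torch", 2.0, "info"),
    Bottleneck("build", "node_modules/", 8.0, "warning"),
]
items.sort(key=severity_sort_key)
print([b.name for b in items])
```

Negating `duration_s` in the key lets a single ascending sort put the longest items first within each severity band, and unknown severities fall to the end via the `.get(..., 3)` default.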
682  scripts/priority_rebalancer.py  Normal file
@@ -0,0 +1,682 @@
#!/usr/bin/env python3
"""
Priority Rebalancer — re-evaluate issue priorities based on accumulated data.

Reads pipeline outputs, the knowledge store, and Gitea issues to suggest
priority changes based on what the fleet has learned.

Usage:
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation --repo compounding-intelligence
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation --dry-run
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation --apply

Output:
    metrics/priority_report.json — full analysis
    metrics/priority_suggestions.md — human-readable suggestions
"""

import argparse
import json
import os
import sys
from datetime import datetime, timezone, timedelta
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
from dataclasses import dataclass, field, asdict
from collections import Counter, defaultdict
import urllib.error
import urllib.parse
import urllib.request


# ============================================================
# Data Models
# ============================================================

@dataclass
class IssueScore:
    issue_id: int
    repo: str
    title: str
    current_labels: List[str]
    current_priority: Optional[str]
    suggested_priority: Optional[str]
    score: float
    reasons: List[str]
    age_days: int
    comment_count: int
    assignee: Optional[str]
    dependencies: List[str] = field(default_factory=list)
    blocking: List[str] = field(default_factory=list)


@dataclass
class PipelineSignal:
    source: str        # "knowledge", "metrics", "sessions", "staleness"
    signal_type: str   # "stale_knowledge", "high_error_rate", "missing_coverage", etc.
    weight: float      # 0.0 - 1.0
    detail: str
    affected_repos: List[str] = field(default_factory=list)
    affected_issues: List[int] = field(default_factory=list)


# ============================================================
# Gitea API Client
# ============================================================

class GiteaClient:
    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def _request(self, path: str, params: Optional[Dict] = None) -> Any:
        url = f"{self.base_url}/api/v1{path}"
        if params:
            qs = urllib.parse.urlencode({k: v for k, v in params.items() if v is not None})
            url += f"?{qs}"

        req = urllib.request.Request(url)
        req.add_header("Authorization", f"token {self.token}")
        req.add_header("Content-Type", "application/json")

        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                return json.loads(resp.read().decode())
        except urllib.error.HTTPError as e:
            print(f"API error {e.code} for {path}: {e.read().decode()[:200]}", file=sys.stderr)
            return None

    def get_org_repos(self, org: str) -> List[Dict]:
        repos = []
        page = 1
        while True:
            batch = self._request(f"/orgs/{org}/repos", {"limit": 50, "page": page})
            if not batch:
                break
            repos.extend(batch)
            if len(batch) < 50:
                break
            page += 1
        return repos

    def get_issues(self, org: str, repo: str, state: str = "open") -> List[Dict]:
        issues = []
        page = 1
        while True:
            batch = self._request(f"/repos/{org}/{repo}/issues",
                                  {"state": state, "limit": 50, "page": page, "type": "issues"})
            if not batch:
                break
            issues.extend(batch)
            if len(batch) < 50:
|
||||||
|
page += 1
|
||||||
|
return issues
|
||||||
|
|
||||||
|
def add_label_to_issue(self, org: str, repo: str, issue_num: int, label_ids: List[int]) -> bool:
|
||||||
|
url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/labels"
|
||||||
|
data = json.dumps({"labels": label_ids}).encode()
|
||||||
|
req = urllib.request.Request(url, data=data, method="POST")
|
||||||
|
req.add_header("Authorization", f"token {self.token}")
|
||||||
|
req.add_header("Content-Type", "application/json")
|
||||||
|
try:
|
||||||
|
with urllib.request.urlopen(req, timeout=15) as resp:
|
||||||
|
return resp.status == 200
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
def remove_label_from_issue(self, org: str, repo: str, issue_num: int, label_id: int) -> bool:
|
||||||
|
url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/labels/{label_id}"
|
||||||
|
req = urllib.request.Request(url, method="DELETE")
|
||||||
|
req.add_header("Authorization", f"token {self.token}")
|
||||||
|
try:
|
||||||
|
with urllib.request.urlopen(req, timeout=15) as resp:
|
||||||
|
return resp.status == 200
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
def get_repo_labels(self, org: str, repo: str) -> List[Dict]:
|
||||||
|
labels = []
|
||||||
|
page = 1
|
||||||
|
while True:
|
||||||
|
batch = self._request(f"/repos/{org}/{repo}/labels", {"limit": 50, "page": page})
|
||||||
|
if not batch:
|
||||||
|
break
|
||||||
|
labels.extend(batch)
|
||||||
|
if len(batch) < 50:
|
||||||
|
break
|
||||||
|
page += 1
|
||||||
|
return labels
|
||||||
|
|
||||||
|
def add_comment(self, org: str, repo: str, issue_num: int, body: str) -> bool:
|
||||||
|
url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/comments"
|
||||||
|
data = json.dumps({"body": body}).encode()
|
||||||
|
req = urllib.request.Request(url, data=data, method="POST")
|
||||||
|
req.add_header("Authorization", f"token {self.token}")
|
||||||
|
req.add_header("Content-Type", "application/json")
|
||||||
|
try:
|
||||||
|
with urllib.request.urlopen(req, timeout=15) as resp:
|
||||||
|
return resp.status == 201
|
||||||
|
except Exception:
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Pipeline Signal Collectors
|
||||||
|
# ============================================================
|
||||||
|
|
||||||
|
def collect_knowledge_signals(knowledge_dir: str) -> List[PipelineSignal]:
|
||||||
|
"""Analyze knowledge store for coverage gaps and staleness."""
|
||||||
|
signals = []
|
||||||
|
index_path = os.path.join(knowledge_dir, "index.json")
|
||||||
|
|
||||||
|
if not os.path.exists(index_path):
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="knowledge",
|
||||||
|
signal_type="missing_index",
|
||||||
|
weight=0.8,
|
||||||
|
detail="knowledge/index.json not found — no knowledge base exists"
|
||||||
|
))
|
||||||
|
return signals
|
||||||
|
|
||||||
|
try:
|
||||||
|
with open(index_path) as f:
|
||||||
|
data = json.load(f)
|
||||||
|
except (json.JSONDecodeError, IOError) as e:
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="knowledge",
|
||||||
|
signal_type="corrupt_index",
|
||||||
|
weight=0.9,
|
||||||
|
detail=f"knowledge/index.json is corrupt: {e}"
|
||||||
|
))
|
||||||
|
return signals
|
||||||
|
|
||||||
|
facts = data.get("facts", [])
|
||||||
|
total = len(facts)
|
||||||
|
|
||||||
|
if total == 0:
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="knowledge",
|
||||||
|
signal_type="empty_knowledge",
|
||||||
|
weight=0.7,
|
||||||
|
detail="Knowledge store has 0 facts — harvester not running or not finding sessions"
|
||||||
|
))
|
||||||
|
return signals
|
||||||
|
|
||||||
|
# Check staleness
|
||||||
|
stale_count = 0
|
||||||
|
missing_source = 0
|
||||||
|
for fact in facts:
|
||||||
|
status = fact.get("status", "unknown")
|
||||||
|
if status == "stale":
|
||||||
|
stale_count += 1
|
||||||
|
elif status in ("missing_source", "no_source"):
|
||||||
|
missing_source += 1
|
||||||
|
|
||||||
|
if stale_count > 0:
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="knowledge",
|
||||||
|
signal_type="stale_knowledge",
|
||||||
|
weight=min(1.0, stale_count / max(1, total)),
|
||||||
|
detail=f"{stale_count}/{total} facts are stale (source files changed)"
|
||||||
|
))
|
||||||
|
|
||||||
|
if missing_source > 0:
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="knowledge",
|
||||||
|
signal_type="missing_sources",
|
||||||
|
weight=min(1.0, missing_source / max(1, total)),
|
||||||
|
detail=f"{missing_source}/{total} facts have missing source files"
|
||||||
|
))
|
||||||
|
|
||||||
|
# Coverage by repo
|
||||||
|
repo_counts = Counter(f.get("repo", "unknown") for f in facts)
|
||||||
|
if len(repo_counts) < 3:
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="knowledge",
|
||||||
|
signal_type="low_coverage",
|
||||||
|
weight=0.5,
|
||||||
|
detail=f"Knowledge covers only {len(repo_counts)} repos — expand harvester scope",
|
||||||
|
affected_repos=list(repo_counts.keys())
|
||||||
|
))
|
||||||
|
|
||||||
|
return signals
|
||||||
|
|
||||||
|
|
||||||
|
def collect_staleness_signals(scripts_dir: str, knowledge_dir: str) -> List[PipelineSignal]:
|
||||||
|
"""Run staleness checker if available."""
|
||||||
|
signals = []
|
||||||
|
checker = os.path.join(scripts_dir, "knowledge_staleness_check.py")
|
||||||
|
index_path = os.path.join(knowledge_dir, "index.json")
|
||||||
|
|
||||||
|
if not os.path.exists(checker) or not os.path.exists(index_path):
|
||||||
|
return signals
|
||||||
|
|
||||||
|
try:
|
||||||
|
import subprocess
|
||||||
|
result = subprocess.run(
|
||||||
|
["python3", checker, "--index", index_path, "--json"],
|
||||||
|
capture_output=True, text=True, timeout=30
|
||||||
|
)
|
||||||
|
if result.returncode == 0:
|
||||||
|
data = json.loads(result.stdout)
|
||||||
|
stale = data.get("stale_count", 0)
|
||||||
|
total = data.get("total", 0)
|
||||||
|
if stale > 0:
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="staleness",
|
||||||
|
signal_type="stale_knowledge",
|
||||||
|
weight=min(1.0, stale / max(1, total)),
|
||||||
|
detail=f"Staleness checker found {stale}/{total} stale entries"
|
||||||
|
))
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
|
||||||
|
return signals
|
||||||
|
|
||||||
|
|
||||||
|
def collect_metrics_signals(metrics_dir: str) -> List[PipelineSignal]:
|
||||||
|
"""Analyze metrics directory for pipeline health."""
|
||||||
|
signals = []
|
||||||
|
|
||||||
|
if not os.path.isdir(metrics_dir):
|
||||||
|
return signals
|
||||||
|
|
||||||
|
files = os.listdir(metrics_dir)
|
||||||
|
if len(files) <= 1: # Only .gitkeep
|
||||||
|
signals.append(PipelineSignal(
|
||||||
|
source="metrics",
|
||||||
|
signal_type="no_metrics",
|
||||||
|
weight=0.6,
|
||||||
|
detail="Metrics directory is empty — measurer pipeline not producing output"
|
||||||
|
))
|
||||||
|
|
||||||
|
return signals
|
||||||
|
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Priority Scoring Engine
|
||||||
|
# ============================================================
|
||||||
|
|
||||||
|
PRIORITY_LEVELS = ["P0", "P1", "P2", "P3", "none"]
|
||||||
|
PRIORITY_LABELS = {"p0": "P0", "p1": "P1", "p2": "P2", "p3": "P3",
|
||||||
|
"priority:critical": "P0", "priority:high": "P1",
|
||||||
|
"priority:medium": "P2", "priority:low": "P3"}
|
||||||
|
|
||||||
|
|
||||||
|
def extract_priority(labels: List[str]) -> Optional[str]:
|
||||||
|
"""Extract priority level from issue labels."""
|
||||||
|
for label in labels:
|
||||||
|
lower = label.lower().strip()
|
||||||
|
if lower in PRIORITY_LABELS:
|
||||||
|
return PRIORITY_LABELS[lower]
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def compute_issue_score(
|
||||||
|
issue: Dict,
|
||||||
|
repo: str,
|
||||||
|
signals: List[PipelineSignal],
|
||||||
|
now: datetime
|
||||||
|
) -> IssueScore:
|
||||||
|
"""Compute priority score for a single issue."""
|
||||||
|
|
||||||
|
labels = [l.get("name", "") if isinstance(l, dict) else l for l in issue.get("labels", [])]
|
||||||
|
title = issue.get("title", "")
|
||||||
|
issue_id = issue.get("number", 0)
|
||||||
|
current_priority = extract_priority(labels)
|
||||||
|
|
||||||
|
# Parse dates
|
||||||
|
created_str = issue.get("created_at", "")
|
||||||
|
try:
|
||||||
|
created = datetime.fromisoformat(created_str.replace("Z", "+00:00"))
|
||||||
|
age_days = (now - created.replace(tzinfo=None)).days
|
||||||
|
except (ValueError, AttributeError):
|
||||||
|
age_days = 0
|
||||||
|
|
||||||
|
comment_count = issue.get("comments", 0)
|
||||||
|
assignee = None
|
||||||
|
assignees = issue.get("assignees") or []
|
||||||
|
if assignees:
|
||||||
|
assignee = assignees[0].get("login") if isinstance(assignees[0], dict) else str(assignees[0])
|
||||||
|
|
||||||
|
# Base score
|
||||||
|
score = 0.0
|
||||||
|
reasons = []
|
||||||
|
|
||||||
|
# Age factor: older issues drift down unless actively discussed
|
||||||
|
if age_days > 90 and comment_count < 2:
|
||||||
|
score -= 15
|
||||||
|
reasons.append(f"Dormant: {age_days} days old with only {comment_count} comments")
|
||||||
|
elif age_days > 30:
|
||||||
|
score -= 5
|
||||||
|
reasons.append(f"Aging: {age_days} days old")
|
||||||
|
|
||||||
|
# Activity factor: recent discussion suggests urgency
|
||||||
|
if comment_count > 5:
|
||||||
|
score += 10
|
||||||
|
reasons.append(f"Active discussion: {comment_count} comments")
|
||||||
|
elif comment_count > 0:
|
||||||
|
score += 3
|
||||||
|
|
||||||
|
# Assignment factor: unassigned issues need triage
|
||||||
|
if not assignee:
|
||||||
|
score += 5
|
||||||
|
reasons.append("Unassigned — needs triage")
|
||||||
|
|
||||||
|
# Pipeline signal alignment
|
||||||
|
for signal in signals:
|
||||||
|
title_lower = title.lower()
|
||||||
|
if signal.signal_type == "stale_knowledge" and "stale" in title_lower:
|
||||||
|
score += signal.weight * 20
|
||||||
|
reasons.append(f"Matches signal: {signal.detail}")
|
||||||
|
elif signal.signal_type == "empty_knowledge" and ("harvester" in title_lower or "knowledge" in title_lower):
|
||||||
|
score += signal.weight * 25
|
||||||
|
reasons.append(f"Critical gap: {signal.detail}")
|
||||||
|
elif signal.signal_type == "no_metrics" and "measur" in title_lower:
|
||||||
|
score += signal.weight * 15
|
||||||
|
reasons.append(f"Pipeline gap: {signal.detail}")
|
||||||
|
elif signal.signal_type == "low_coverage" and any(r.lower() in title_lower for r in signal.affected_repos):
|
||||||
|
score += signal.weight * 10
|
||||||
|
reasons.append(f"Coverage gap: {signal.detail}")
|
||||||
|
|
||||||
|
# Keyword boosts
|
||||||
|
keyword_scores = {
|
||||||
|
"broken": 20, "bug": 15, "fix": 10, "error": 12, "fail": 15,
|
||||||
|
"security": 25, "auth": 20, "data loss": 30, "crash": 25,
|
||||||
|
"blocker": 20, "urgent": 15, "critical": 15,
|
||||||
|
"epic": 8, "feature": -3, "nice to have": -10, "someday": -15
|
||||||
|
}
|
||||||
|
title_lower = title.lower()
|
||||||
|
for keyword, boost in keyword_scores.items():
|
||||||
|
if keyword in title_lower:
|
||||||
|
score += boost
|
||||||
|
if boost > 0:
|
||||||
|
reasons.append(f"Keyword match: '{keyword}' (+{boost})")
|
||||||
|
|
||||||
|
# Label-based adjustments
|
||||||
|
for label in labels:
|
||||||
|
lower = label.lower()
|
||||||
|
if lower == "pipeline":
|
||||||
|
score += 5 # Pipeline issues are infrastructure
|
||||||
|
elif lower == "bug":
|
||||||
|
score += 12
|
||||||
|
elif lower == "enhancement":
|
||||||
|
score -= 2
|
||||||
|
elif lower == "documentation":
|
||||||
|
score -= 5
|
||||||
|
elif "epic" in lower:
|
||||||
|
score += 3
|
||||||
|
|
||||||
|
# Determine suggested priority
|
||||||
|
if score >= 40:
|
||||||
|
suggested = "P0"
|
||||||
|
elif score >= 25:
|
||||||
|
suggested = "P1"
|
||||||
|
elif score >= 10:
|
||||||
|
suggested = "P2"
|
||||||
|
elif score >= 0:
|
||||||
|
suggested = "P3"
|
||||||
|
else:
|
||||||
|
suggested = None # Consider closing or deprioritizing
|
||||||
|
|
||||||
|
return IssueScore(
|
||||||
|
issue_id=issue_id,
|
||||||
|
repo=repo,
|
||||||
|
title=title,
|
||||||
|
current_labels=labels,
|
||||||
|
current_priority=current_priority,
|
||||||
|
suggested_priority=suggested,
|
||||||
|
score=round(score, 1),
|
||||||
|
reasons=reasons if reasons else ["No strong signals"],
|
||||||
|
age_days=age_days,
|
||||||
|
comment_count=comment_count,
|
||||||
|
assignee=assignee
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Report Generation
|
||||||
|
# ============================================================
|
||||||
|
|
||||||
|
def generate_report(
|
||||||
|
scores: List[IssueScore],
|
||||||
|
signals: List[PipelineSignal],
|
||||||
|
org: str,
|
||||||
|
repos_scanned: List[str]
|
||||||
|
) -> Dict[str, Any]:
|
||||||
|
"""Generate the full priority report."""
|
||||||
|
now = datetime.now(timezone.utc).isoformat()
|
||||||
|
|
||||||
|
# Categorize changes
|
||||||
|
upgrades = [s for s in scores if s.suggested_priority and s.current_priority and
|
||||||
|
PRIORITY_LEVELS.index(s.suggested_priority) < PRIORITY_LEVELS.index(s.current_priority)]
|
||||||
|
downgrades = [s for s in scores if s.suggested_priority and s.current_priority and
|
||||||
|
PRIORITY_LEVELS.index(s.suggested_priority) > PRIORITY_LEVELS.index(s.current_priority)]
|
||||||
|
new_assignments = [s for s in scores if s.suggested_priority and not s.current_priority]
|
||||||
|
no_change = [s for s in scores if s.suggested_priority == s.current_priority]
|
||||||
|
|
||||||
|
return {
|
||||||
|
"generated_at": now,
|
||||||
|
"org": org,
|
||||||
|
"repos_scanned": repos_scanned,
|
||||||
|
"total_issues": len(scores),
|
||||||
|
"signals": [asdict(s) for s in signals],
|
||||||
|
"summary": {
|
||||||
|
"suggested_upgrades": len(upgrades),
|
||||||
|
"suggested_downgrades": len(downgrades),
|
||||||
|
"suggested_new_priorities": len(new_assignments),
|
||||||
|
"unchanged": len(no_change)
|
||||||
|
},
|
||||||
|
"top_priority": [asdict(s) for s in sorted(scores, key=lambda x: x.score, reverse=True)[:10]],
|
||||||
|
"upgrades": [asdict(s) for s in upgrades],
|
||||||
|
"downgrades": [asdict(s) for s in downgrades],
|
||||||
|
"new_assignments": [asdict(s) for s in new_assignments],
|
||||||
|
"all_scores": [asdict(s) for s in sorted(scores, key=lambda x: x.score, reverse=True)]
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def generate_markdown_report(report: Dict[str, Any]) -> str:
|
||||||
|
"""Generate human-readable markdown report."""
|
||||||
|
lines = []
|
||||||
|
lines.append("# Priority Rebalancer Report")
|
||||||
|
lines.append(f"**Generated:** {report['generated_at']}")
|
||||||
|
lines.append(f"**Org:** {report['org']}")
|
||||||
|
lines.append(f"**Repos scanned:** {', '.join(report['repos_scanned'])}")
|
||||||
|
lines.append(f"**Issues analyzed:** {report['total_issues']}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Signals
|
||||||
|
if report["signals"]:
|
||||||
|
lines.append("## Pipeline Signals")
|
||||||
|
for sig in report["signals"]:
|
||||||
|
weight_bar = "█" * int(sig["weight"] * 10) + "░" * (10 - int(sig["weight"] * 10))
|
||||||
|
lines.append(f"- [{weight_bar}] **{sig['source']}/{sig['signal_type']}** — {sig['detail']}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
s = report["summary"]
|
||||||
|
lines.append("## Summary")
|
||||||
|
lines.append(f"- Suggested upgrades: **{s['suggested_upgrades']}**")
|
||||||
|
lines.append(f"- Suggested downgrades: **{s['suggested_downgrades']}**")
|
||||||
|
lines.append(f"- New priority assignments: **{s['suggested_new_priorities']}**")
|
||||||
|
lines.append(f"- Unchanged: **{s['unchanged']}**")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Top 10
|
||||||
|
lines.append("## Top 10 by Score")
|
||||||
|
for i, item in enumerate(report["top_priority"][:10], 1):
|
||||||
|
cur = item["current_priority"] or "none"
|
||||||
|
sug = item["suggested_priority"] or "none"
|
||||||
|
arrow = "↑" if PRIORITY_LEVELS.index(sug) < PRIORITY_LEVELS.index(cur) else "↓" if PRIORITY_LEVELS.index(sug) > PRIORITY_LEVELS.index(cur) else "→"
|
||||||
|
lines.append(f"{i}. **[{item['repo']}#{item['issue_id']}]** {item['title']}")
|
||||||
|
lines.append(f" Score: {item['score']} | Current: {cur} {arrow} Suggested: {sug}")
|
||||||
|
lines.append(f" Reasons: {'; '.join(item['reasons'][:3])}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Upgrades
|
||||||
|
if report["upgrades"]:
|
||||||
|
lines.append("## Suggested Upgrades")
|
||||||
|
for item in report["upgrades"]:
|
||||||
|
lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
|
||||||
|
lines.append(f" {item['current_priority']} → {item['suggested_priority']} (score: {item['score']})")
|
||||||
|
lines.append(f" {'; '.join(item['reasons'][:2])}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# Downgrades
|
||||||
|
if report["downgrades"]:
|
||||||
|
lines.append("## Suggested Downgrades")
|
||||||
|
for item in report["downgrades"]:
|
||||||
|
lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
|
||||||
|
lines.append(f" {item['current_priority']} → {item['suggested_priority']} (score: {item['score']})")
|
||||||
|
lines.append(f" {'; '.join(item['reasons'][:2])}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
# New assignments
|
||||||
|
if report["new_assignments"]:
|
||||||
|
lines.append("## New Priority Suggestions (currently unlabelled)")
|
||||||
|
for item in report["new_assignments"][:20]:
|
||||||
|
lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
|
||||||
|
lines.append(f" Suggested: {item['suggested_priority']} (score: {item['score']})")
|
||||||
|
lines.append(f" {'; '.join(item['reasons'][:2])}")
|
||||||
|
lines.append("")
|
||||||
|
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
# ============================================================
|
||||||
|
# Main
|
||||||
|
# ============================================================
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = argparse.ArgumentParser(description="Priority Rebalancer — re-score issues based on pipeline data")
|
||||||
|
parser.add_argument("--org", default="Timmy_Foundation", help="Gitea org name")
|
||||||
|
parser.add_argument("--repo", help="Single repo to analyze (default: all)")
|
||||||
|
parser.add_argument("--base-url", default="https://forge.alexanderwhitestone.com", help="Gitea base URL")
|
||||||
|
parser.add_argument("--knowledge-dir", default=None, help="Path to knowledge directory")
|
||||||
|
parser.add_argument("--metrics-dir", default=None, help="Path to metrics directory")
|
||||||
|
parser.add_argument("--scripts-dir", default=None, help="Path to scripts directory")
|
||||||
|
parser.add_argument("--output-dir", default=None, help="Path to output directory")
|
||||||
|
parser.add_argument("--dry-run", action="store_true", help="Show what would change without applying")
|
||||||
|
parser.add_argument("--apply", action="store_true", help="Apply priority changes via API")
|
||||||
|
parser.add_argument("--json", action="store_true", help="Output JSON instead of markdown")
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
# Resolve paths relative to script location
|
||||||
|
script_dir = Path(__file__).parent
|
||||||
|
repo_root = script_dir.parent
|
||||||
|
|
||||||
|
knowledge_dir = args.knowledge_dir or str(repo_root / "knowledge")
|
||||||
|
metrics_dir = args.metrics_dir or str(repo_root / "metrics")
|
||||||
|
scripts_dir = args.scripts_dir or str(repo_root / "scripts")
|
||||||
|
output_dir = args.output_dir or str(repo_root / "metrics")
|
||||||
|
|
||||||
|
# Get token
|
||||||
|
token = os.environ.get("GITEA_TOKEN")
|
||||||
|
if not token:
|
||||||
|
token_path = os.path.expanduser("~/.config/gitea/token")
|
||||||
|
if os.path.exists(token_path):
|
||||||
|
with open(token_path) as f:
|
||||||
|
token = f.read().strip()
|
||||||
|
|
||||||
|
if not token:
|
||||||
|
print("Error: No Gitea token found. Set GITEA_TOKEN or create ~/.config/gitea/token", file=sys.stderr)
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
client = GiteaClient(args.base_url, token)
|
||||||
|
now = datetime.utcnow()
|
||||||
|
|
||||||
|
# Collect pipeline signals
|
||||||
|
print("Collecting pipeline signals...", file=sys.stderr)
|
||||||
|
signals = []
|
||||||
|
signals.extend(collect_knowledge_signals(knowledge_dir))
|
||||||
|
signals.extend(collect_staleness_signals(scripts_dir, knowledge_dir))
|
||||||
|
signals.extend(collect_metrics_signals(metrics_dir))
|
||||||
|
print(f" Found {len(signals)} signals", file=sys.stderr)
|
||||||
|
|
||||||
|
# Get repos
|
||||||
|
if args.repo:
|
||||||
|
repos = [{"name": args.repo}]
|
||||||
|
else:
|
||||||
|
repos = client.get_org_repos(args.org)
|
||||||
|
|
||||||
|
repo_names = [r["name"] for r in repos]
|
||||||
|
print(f"Scanning {len(repo_names)} repos: {', '.join(repo_names[:5])}{'...' if len(repo_names) > 5 else ''}", file=sys.stderr)
|
||||||
|
|
||||||
|
# Score all issues
|
||||||
|
all_scores = []
|
||||||
|
for repo in repos:
|
||||||
|
repo_name = repo["name"]
|
||||||
|
issues = client.get_issues(args.org, repo_name)
|
||||||
|
print(f" {repo_name}: {len(issues)} open issues", file=sys.stderr)
|
||||||
|
|
||||||
|
for issue in issues:
|
||||||
|
if issue.get("pull_request"):
|
||||||
|
continue
|
||||||
|
score = compute_issue_score(issue, repo_name, signals, now)
|
||||||
|
all_scores.append(score)
|
||||||
|
|
||||||
|
# Generate report
|
||||||
|
report = generate_report(all_scores, signals, args.org, repo_names)
|
||||||
|
|
||||||
|
# Output
|
||||||
|
os.makedirs(output_dir, exist_ok=True)
|
||||||
|
|
||||||
|
if args.json:
|
||||||
|
print(json.dumps(report, indent=2))
|
||||||
|
else:
|
||||||
|
md = generate_markdown_report(report)
|
||||||
|
print(md)
|
||||||
|
|
||||||
|
# Write files
|
||||||
|
report_path = os.path.join(output_dir, "priority_report.json")
|
||||||
|
with open(report_path, "w") as f:
|
||||||
|
json.dump(report, f, indent=2)
|
||||||
|
print(f"\nFull report: {report_path}", file=sys.stderr)
|
||||||
|
|
||||||
|
md_path = os.path.join(output_dir, "priority_suggestions.md")
|
||||||
|
with open(md_path, "w") as f:
|
||||||
|
f.write(generate_markdown_report(report))
|
||||||
|
print(f"Suggestions: {md_path}", file=sys.stderr)
|
||||||
|
|
||||||
|
# Apply changes if requested
|
||||||
|
if args.apply:
|
||||||
|
print("\nApplying priority changes...", file=sys.stderr)
|
||||||
|
applied = 0
|
||||||
|
|
||||||
|
# Get label IDs for priority labels
|
||||||
|
priority_label_map = {}
|
||||||
|
for repo_name in repo_names:
|
||||||
|
labels = client.get_repo_labels(args.org, repo_name)
|
||||||
|
for label in labels:
|
||||||
|
name = label.get("name", "").lower()
|
||||||
|
if name in ("p0", "p1", "p2", "p3"):
|
||||||
|
priority_label_map[(repo_name, name)] = label["id"]
|
||||||
|
|
||||||
|
for score in all_scores:
|
||||||
|
if score.suggested_priority and score.suggested_priority != score.current_priority:
|
||||||
|
sug_lower = score.suggested_priority.lower()
|
||||||
|
label_id = priority_label_map.get((score.repo, sug_lower))
|
||||||
|
if label_id:
|
||||||
|
ok = client.add_label_to_issue(args.org, score.repo, score.issue_id, [label_id])
|
||||||
|
if ok:
|
||||||
|
applied += 1
|
||||||
|
print(f" Applied: [{score.repo}#{score.issue_id}] → {score.suggested_priority}", file=sys.stderr)
|
||||||
|
|
||||||
|
# Add comment explaining the change
|
||||||
|
comment = f"**Priority Rebalancer** suggested: **{score.suggested_priority}** (was: {score.current_priority or 'none'})\n\n"
|
||||||
|
comment += f"Score: {score.score}\n"
|
||||||
|
comment += f"Reasons:\n"
|
||||||
|
for r in score.reasons[:5]:
|
||||||
|
comment += f"- {r}\n"
|
||||||
|
client.add_comment(args.org, score.repo, score.issue_id, comment)
|
||||||
|
|
||||||
|
print(f"Applied {applied} priority changes", file=sys.stderr)
|
||||||
|
|
||||||
|
elif args.dry_run:
|
||||||
|
print(f"\nDry run — {report['summary']['suggested_upgrades'] + report['summary']['suggested_downgrades'] + report['summary']['suggested_new_priorities']} changes would be applied", file=sys.stderr)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
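The score-to-priority mapping in `compute_issue_score` can be sanity-checked in isolation. This is a standalone sketch of the thresholds (40/25/10/0) from the script above, with `suggest_priority` as a hypothetical helper name:

```python
# Sketch of the score -> priority mapping used by compute_issue_score above.
# Thresholds mirror the script; None means "consider closing or deprioritizing".
def suggest_priority(score: float):
    if score >= 40:
        return "P0"
    elif score >= 25:
        return "P1"
    elif score >= 10:
        return "P2"
    elif score >= 0:
        return "P3"
    return None

# e.g. an unassigned issue titled "bug" with a "bug" label scores
# 15 (keyword) + 12 (label) + 5 (unassigned) = 32 -> P1
```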
scripts/refactoring_opportunity_finder.py (new executable file, 54 lines)
@@ -0,0 +1,54 @@
#!/usr/bin/env python3
"""
Finds refactoring opportunities in codebases

Engine ID: 10.4

Usage:
    python3 scripts/refactoring_opportunity_finder.py --output proposals/refactoring_opportunity_finder.json
    python3 scripts/refactoring_opportunity_finder.py --output proposals/refactoring_opportunity_finder.json --dry-run
"""

import argparse
import json
import sys
from datetime import datetime, timezone


def generate_proposals():
    """Generate sample proposals for this engine."""
    # TODO: Implement actual proposal generation logic
    return [
        {
            "title": "Sample improvement from 10.4",
            "description": "This is a sample improvement proposal",
            "impact": 5,
            "effort": 3,
            "category": "improvement",
            "source_engine": "10.4",
            "timestamp": datetime.now(timezone.utc).isoformat()
        }
    ]


def main():
    parser = argparse.ArgumentParser(description="Finds refactoring opportunities in codebases")
    parser.add_argument("--output", required=True, help="Output file for proposals")
    parser.add_argument("--dry-run", action="store_true", help="Don't write output file")

    args = parser.parse_args()

    proposals = generate_proposals()

    if not args.dry_run:
        with open(args.output, "w") as f:
            json.dump({"proposals": proposals}, f, indent=2)
        print(f"Generated {len(proposals)} proposals -> {args.output}")
    else:
        print(f"Would generate {len(proposals)} proposals")
        for p in proposals:
            print(f"  - {p['title']}")


if __name__ == "__main__":
    main()
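The stub writes a `{"proposals": [...]}` object where each proposal carries `impact` and `effort` scores. A downstream consumer could rank them like this; `rank_proposals` is a hypothetical helper, not part of the engine:

```python
import json

def rank_proposals(path: str):
    """Load a proposals file in the stub's {"proposals": [...]} shape and
    rank entries by impact-per-effort (higher first)."""
    with open(path) as f:
        data = json.load(f)
    return sorted(data["proposals"],
                  key=lambda p: p["impact"] / max(1, p["effort"]),
                  reverse=True)
```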
scripts/sampler.py (new file, 353 lines)
@@ -0,0 +1,353 @@
#!/usr/bin/env python3
"""
sampler.py — Score and rank sessions by harvest value.

With 20k+ sessions on disk, we can't harvest all at once. This script
scores each session by how likely it is to contain valuable knowledge,
so the harvester processes the best ones first.

Scoring strategy:
- Recency: last 7d=3pts, last 30d=2pts, older=1pt
- Length: >50 messages=3pts, >20=2pts, <20=1pt
- Repo uniqueness: first session for a repo=5pts, otherwise=1pt
- Outcome: failure=3pts (most to learn), success=2pts, unknown=1pt
- Tool calls: >10 tool invocations=2pts (complex sessions)

Usage:
    python3 sampler.py --count 100                       # Top 100 sessions
    python3 sampler.py --repo the-nexus --count 20       # Top 20 for a repo
    python3 sampler.py --since 2026-04-01                # All sessions since date
    python3 sampler.py --count 50 --min-score 8          # Only high-value sessions
    python3 sampler.py --count 100 --output sample.json  # Save to file
"""
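The scoring strategy in the docstring above can be sketched as a small pure function. This is a sketch built from the stated point values only; `score_session` and its parameters are hypothetical names, and the real sampler may compute these inputs differently:

```python
from datetime import datetime, timedelta, timezone

def score_session(last_ts: datetime, message_count: int, first_for_repo: bool,
                  outcome: str, tool_calls: int) -> int:
    """Sketch of the rubric: recency + length + repo uniqueness + outcome + tool use."""
    age = datetime.now(timezone.utc) - last_ts
    pts = 3 if age.days <= 7 else 2 if age.days <= 30 else 1   # recency
    pts += 3 if message_count > 50 else 2 if message_count > 20 else 1  # length
    pts += 5 if first_for_repo else 1                          # repo uniqueness
    pts += {"failure": 3, "success": 2}.get(outcome, 1)        # outcome
    if tool_calls > 10:                                        # complex sessions
        pts += 2
    return pts
```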
import argparse
import json
import os
import sys
import time
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


# --- Fast session scanning (no full parse) ---

def scan_session_fast(path: str) -> dict:
    """Extract scoring metadata from a session without parsing the full JSONL.

    Reads only: first line, last ~20 lines, and line count. This processes
    20k sessions in seconds instead of minutes.
    """
    meta = {
        'path': path,
        'message_count': 0,
        'has_tool_calls': False,
        'tool_call_count': 0,
        'first_timestamp': '',
        'last_timestamp': '',
        'is_failure': False,
        'repos_mentioned': set(),
        'first_role': '',
        'last_content_preview': '',
    }

    try:
        file_size = os.path.getsize(path)
        if file_size == 0:
            return meta

        with open(path, 'r', encoding='utf-8', errors='replace') as f:
            # Read first line for timestamp + role
            first_line = f.readline().strip()
            if first_line:
                try:
                    first_msg = json.loads(first_line)
                    meta['first_timestamp'] = first_msg.get('timestamp', '')
                    meta['first_role'] = first_msg.get('role', '')
                except json.JSONDecodeError:
                    pass

            # Fast line count + collect tail lines
            # For the tail, seek to near end of file
            tail_lines = []
            line_count = 1  # already read first

            if file_size > 8192:
                # Seek to last 8KB for tail sampling
                f.seek(max(0, file_size - 8192))
                f.readline()  # skip partial line
                for line in f:
                    line = line.strip()
                    if line:
                        tail_lines.append(line)
                        line_count += 1
                # We lost the exact count for big files — estimate from file size
                # Average JSONL line is ~500 bytes
                if line_count < 100:
                    line_count = max(line_count, file_size // 500)
            else:
                # Small file — read all
                for line in f:
                    line = line.strip()
                    if line:
                        tail_lines.append(line)
                        line_count += 1

            meta['message_count'] = line_count

            # Parse tail lines for outcome, tool calls, repos
            for line in tail_lines[-30:]:  # last 30 non-empty lines
                try:
                    msg = json.loads(line)

                    # Track last timestamp
                    ts = msg.get('timestamp', '')
                    if ts:
                        meta['last_timestamp'] = ts

                    # Count tool calls
                    if msg.get('tool_calls'):
                        meta['has_tool_calls'] = True
                        meta['tool_call_count'] += len(msg['tool_calls'])

                    # Detect failure signals in content
                    content = ''
                    if isinstance(msg.get('content'), str):
                        content = msg['content'].lower()
                    elif isinstance(msg.get('content'), list):
                        for part in msg['content']:
                            if isinstance(part, dict) and part.get('type') == 'text':
                                content += part.get('text', '').lower()

                    if content:
                        meta['last_content_preview'] = content[:200]
                        failure_signals = ['error', 'failed', 'cannot', 'unable',
                                           'exception', 'traceback', 'rejected', 'denied']
                        if any(sig in content for sig in failure_signals):
                            meta['is_failure'] = True

                    # Extract repo references from tool call arguments
                    if msg.get('tool_calls'):
                        for tc in msg['tool_calls']:
                            args = tc.get('function', {}).get('arguments', '')
|
||||||
|
if isinstance(args, str):
|
||||||
|
# Look for repo patterns
|
||||||
|
for pattern in ['Timmy_Foundation/', 'Rockachopa/', 'compounding-intelligence', 'the-nexus', 'timmy-home', 'hermes-agent', 'the-beacon', 'the-door']:
|
||||||
|
if pattern in args:
|
||||||
|
repo = pattern.rstrip('/')
|
||||||
|
meta['repos_mentioned'].add(repo)
|
||||||
|
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
continue
|
||||||
|
|
||||||
|
except (IOError, OSError):
|
||||||
|
pass
|
||||||
|
|
||||||
|
meta['repos_mentioned'] = list(meta['repos_mentioned'])
|
||||||
|
return meta
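The tail-seek trick above (read only the last 8 KB, drop the first possibly partial line) can be sketched standalone. The sample file below is illustrative, not a real Hermes session:

```python
import os
import tempfile

# Build a sample JSONL file large enough to trigger the tail path (> 8192 bytes).
with tempfile.NamedTemporaryFile('w', suffix='.jsonl', delete=False) as tmp:
    for i in range(1000):
        tmp.write(f'{{"n": {i}}}\n')
    path = tmp.name

file_size = os.path.getsize(path)
with open(path, 'r', encoding='utf-8') as f:
    if file_size > 8192:
        f.seek(max(0, file_size - 8192))
        f.readline()  # discard the (possibly partial) line at the seek point
    tail_lines = [line.strip() for line in f if line.strip()]

print(tail_lines[-1])  # → {"n": 999}
os.unlink(path)
```

Discarding the first line after the seek is what makes every remaining line a complete record, at the cost of losing at most one line of the sample window.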


# --- Filename timestamp parsing ---

def parse_session_timestamp(filename: str) -> Optional[datetime]:
    """Parse timestamp from session filename.

    Common formats:
        session_20260413_123456_hash.jsonl
        20260413_123456_hash.jsonl
    """
    stem = Path(filename).stem
    parts = stem.split('_')

    # Try session_YYYYMMDD_HHMMSS format
    for i, part in enumerate(parts):
        if len(part) == 8 and part.isdigit():
            date_part = part
            time_part = parts[i + 1] if i + 1 < len(parts) and len(parts[i + 1]) == 6 else '000000'
            try:
                return datetime.strptime(f"{date_part}_{time_part}", '%Y%m%d_%H%M%S').replace(tzinfo=timezone.utc)
            except ValueError:
                continue

    # No parseable timestamp; caller falls back to file modification time
    return None
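A quick standalone check of the filename formats above; the function is restated here (as `parse_ts`) so the snippet runs on its own:

```python
from datetime import datetime, timezone
from pathlib import Path

def parse_ts(filename: str):
    # Same scan as parse_session_timestamp: find an 8-digit date chunk,
    # pair it with a following 6-digit time chunk if one is present.
    parts = Path(filename).stem.split('_')
    for i, part in enumerate(parts):
        if len(part) == 8 and part.isdigit():
            time_part = parts[i + 1] if i + 1 < len(parts) and len(parts[i + 1]) == 6 else '000000'
            try:
                return datetime.strptime(f"{part}_{time_part}", '%Y%m%d_%H%M%S').replace(tzinfo=timezone.utc)
            except ValueError:
                continue
    return None

print(parse_ts('session_20260413_123456_ab12.jsonl'))  # → 2026-04-13 12:34:56+00:00
print(parse_ts('notes.txt'))                           # → None
```

A date chunk with no valid time chunk after it defaults to midnight UTC, so date-only filenames still sort correctly by day.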


# --- Scoring ---

def score_session(meta: dict, now: datetime, seen_repos: set) -> tuple[int, dict]:
    """Score a session for harvest value. Returns (score, breakdown)."""
    score = 0
    breakdown = {}

    # 1. Recency
    ts = parse_session_timestamp(os.path.basename(meta['path']))
    if ts is None:
        # Fallback to mtime
        try:
            ts = datetime.fromtimestamp(os.path.getmtime(meta['path']), tz=timezone.utc)
        except OSError:
            ts = now - timedelta(days=365)

    age_days = (now - ts).days
    if age_days <= 7:
        recency = 3
    elif age_days <= 30:
        recency = 2
    else:
        recency = 1
    score += recency
    breakdown['recency'] = recency

    # 2. Length
    count = meta['message_count']
    if count > 50:
        length = 3
    elif count > 20:
        length = 2
    else:
        length = 1
    score += length
    breakdown['length'] = length

    # 3. Repo uniqueness (first session mentioning a repo gets a bonus)
    repo_score = 0
    for repo in meta.get('repos_mentioned', []):
        if repo not in seen_repos:
            seen_repos.add(repo)
            repo_score = max(repo_score, 5)
        else:
            repo_score = max(repo_score, 1)
    score += repo_score
    breakdown['repo_unique'] = repo_score

    # 4. Outcome
    if meta.get('is_failure'):
        outcome = 3
    elif meta.get('last_content_preview', '').strip():
        outcome = 2  # has some content = likely completed
    else:
        outcome = 1
    score += outcome
    breakdown['outcome'] = outcome

    # 5. Tool calls
    if meta.get('tool_call_count', 0) > 10:
        tool = 2
    else:
        tool = 0
    score += tool
    breakdown['tool_calls'] = tool

    return score, breakdown
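The repo-uniqueness rule above is the one stateful part of scoring: the shared `seen_repos` set carries across sessions, so scores depend on scan order. A minimal illustration of just that rule:

```python
def repo_bonus(repos_mentioned: list, seen_repos: set) -> int:
    # First session to mention a repo gets 5; later mentions get 1.
    bonus = 0
    for repo in repos_mentioned:
        if repo not in seen_repos:
            seen_repos.add(repo)
            bonus = max(bonus, 5)
        else:
            bonus = max(bonus, 1)
    return bonus

seen = set()
print(repo_bonus(['the-nexus'], seen))                  # → 5 (first mention)
print(repo_bonus(['the-nexus'], seen))                  # → 1 (already seen)
print(repo_bonus(['the-nexus', 'hermes-agent'], seen))  # → 5 (one new repo is enough)
```

Whichever session is scanned first claims the bonus for a repo, so the same session can score differently depending on directory iteration order.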


# --- Main ---

def main():
    parser = argparse.ArgumentParser(description="Score and rank sessions for harvesting")
    parser.add_argument('--sessions-dir', default=os.path.expanduser('~/.hermes/sessions'),
                        help='Directory containing session files')
    parser.add_argument('--count', type=int, default=100, help='Number of top sessions to return')
    parser.add_argument('--repo', default='', help='Filter to sessions mentioning this repo')
    parser.add_argument('--since', default='', help='Only score sessions after this date (YYYY-MM-DD)')
    parser.add_argument('--min-score', type=int, default=0, help='Minimum score threshold')
    parser.add_argument('--output', default='', help='Output file (JSON). Default: stdout')
    parser.add_argument('--format', choices=['json', 'paths', 'table'], default='table',
                        help='Output format: json (full), paths (one per line), table (human)')
    parser.add_argument('--top-percent', type=float, default=0, help='Return top N%% instead of --count')

    args = parser.parse_args()

    sessions_dir = Path(args.sessions_dir)
    if not sessions_dir.is_dir():
        print(f"ERROR: Sessions directory not found: {sessions_dir}", file=sys.stderr)
        sys.exit(1)

    # Find all JSONL files
    print(f"Scanning {sessions_dir}...", file=sys.stderr)
    t0 = time.time()

    session_files = list(sessions_dir.glob('*.jsonl'))
    total = len(session_files)
    print(f"Found {total} session files", file=sys.stderr)

    # Parse since date
    since_dt = None
    if args.since:
        since_dt = datetime.strptime(args.since, '%Y-%m-%d').replace(tzinfo=timezone.utc)

    # Score all sessions
    now = datetime.now(timezone.utc)
    seen_repos = set()  # Track repos for uniqueness scoring
    scored = []

    for i, sf in enumerate(session_files):
        # Date filter (fast path: check filename first)
        if since_dt:
            ts = parse_session_timestamp(sf.name)
            if ts and ts < since_dt:
                continue

        meta = scan_session_fast(str(sf))

        # Repo filter
        if args.repo:
            repos = meta.get('repos_mentioned', [])
            if args.repo.lower() not in [r.lower() for r in repos]:
                # Also check filename
                if args.repo.lower() not in sf.name.lower():
                    continue

        score, breakdown = score_session(meta, now, seen_repos)

        if score >= args.min_score:
            scored.append({
                'path': str(sf),
                'filename': sf.name,
                'score': score,
                'breakdown': breakdown,
                'message_count': meta['message_count'],
                'repos': meta['repos_mentioned'],
                'is_failure': meta['is_failure'],
            })

        if (i + 1) % 5000 == 0:
            elapsed = time.time() - t0
            print(f"  Scanned {i + 1}/{total} ({elapsed:.1f}s)", file=sys.stderr)

    elapsed = time.time() - t0
    print(f"Scored {len(scored)} sessions in {elapsed:.1f}s", file=sys.stderr)

    # Sort by score descending
    scored.sort(key=lambda x: x['score'], reverse=True)

    # Apply count or percent
    if args.top_percent > 0:
        count = max(1, int(len(scored) * args.top_percent / 100))
    else:
        count = args.count
    scored = scored[:count]

    # Output
    if args.output:
        with open(args.output, 'w', encoding='utf-8') as f:
            json.dump(scored, f, indent=2)
        print(f"Wrote {len(scored)} sessions to {args.output}", file=sys.stderr)
    elif args.format == 'json':
        json.dump(scored, sys.stdout, indent=2)
    elif args.format == 'paths':
        for s in scored:
            print(s['path'])
    else:  # table
        print(f"{'SCORE':>5} {'MSGS':>5} {'REPOS':<25} {'FILE'}")
        print(f"{'-'*5} {'-'*5} {'-'*25} {'-'*40}")
        for s in scored:
            repos = ', '.join(s['repos'][:2]) if s['repos'] else '-'
            fail = ' FAIL' if s['is_failure'] else ''
            print(f"{s['score']:>5} {s['message_count']:>5} {repos:<25} {s['filename'][:40]}{fail}")


if __name__ == '__main__':
    main()
276 scripts/session_metadata.py Normal file
@@ -0,0 +1,276 @@
#!/usr/bin/env python3
"""
session_metadata.py - Extract structured metadata from Hermes session transcripts.
Works alongside session_reader.py to provide higher-level session analysis.
"""

import json
import re
import sys
from dataclasses import dataclass, asdict
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional, Any

# Import from session_reader (the canonical reader)
from session_reader import read_session


@dataclass
class SessionSummary:
    """Structured summary of a Hermes session transcript."""
    session_id: str
    model: str
    repo: str
    outcome: str
    message_count: int
    tool_calls: int
    duration_estimate: str
    key_actions: List[str]
    errors_encountered: List[str]
    start_time: Optional[str] = None
    end_time: Optional[str] = None
    total_tokens_estimate: int = 0
    user_messages: int = 0
    assistant_messages: int = 0
    tool_outputs: int = 0


def extract_session_metadata(file_path: str) -> SessionSummary:
    """
    Extract structured metadata from a Hermes session JSONL transcript.
    Uses session_reader.read_session() for file reading.
    """
    session_id = Path(file_path).stem
    messages = []
    model = "unknown"
    repo = "unknown"
    tool_calls_count = 0
    key_actions = []
    errors = []
    start_time = None
    end_time = None
    total_tokens = 0

    # Common repo patterns to look for
    repo_patterns = [
        r"(?:the-nexus|compounding-intelligence|timmy-config|hermes-agent)",
        r"(?:forge\.alexanderwhitestone\.com/([^/]+/[^/\s]+))",
        r"(?:github\.com/([^/]+/[^/\s]+))",
        r"(?:Timmy_Foundation/([^/\s]+))",
    ]

    try:
        # Use the canonical reader from session_reader.py
        messages = read_session(file_path)
    except FileNotFoundError:
        return SessionSummary(
            session_id=session_id,
            model="unknown",
            repo="unknown",
            outcome="failure",
            message_count=0,
            tool_calls=0,
            duration_estimate="0m",
            key_actions=[],
            errors_encountered=[f"File not found: {file_path}"]
        )

    # Process messages for metadata
    for entry in messages:
        # Extract model from assistant messages
        if entry.get("role") == "assistant" and entry.get("model"):
            model = entry["model"]

        # Extract timestamps
        if entry.get("timestamp"):
            ts = entry["timestamp"]
            if start_time is None:
                start_time = ts
            end_time = ts

        # Count tool calls
        if entry.get("tool_calls"):
            tool_calls_count += len(entry["tool_calls"])
            for tc in entry["tool_calls"]:
                if tc.get("function", {}).get("name"):
                    action = f"{tc['function']['name']}"
                    if action not in key_actions:
                        key_actions.append(action)

        # Estimate tokens from content length
        content = entry.get("content", "")
        if isinstance(content, str):
            total_tokens += len(content.split())
        elif isinstance(content, list):
            for item in content:
                if isinstance(item, dict) and "text" in item:
                    total_tokens += len(item["text"].split())

        # Look for repo mentions in content
        if entry.get("content"):
            content_str = str(entry["content"])
            for pattern in repo_patterns:
                match = re.search(pattern, content_str, re.IGNORECASE)
                if match:
                    if match.groups():
                        repo = match.group(1)
                    else:
                        repo = match.group(0)
                    break

        # Look for error messages
        if entry.get("role") == "tool" and entry.get("is_error"):
            error_msg = entry.get("content", "Unknown error")
            if isinstance(error_msg, str) and len(error_msg) < 200:
                errors.append(error_msg[:200])

    # Count message types
    user_messages = sum(1 for m in messages if m.get("role") == "user")
    assistant_messages = sum(1 for m in messages if m.get("role") == "assistant")
    tool_outputs = sum(1 for m in messages if m.get("role") == "tool")

    # Calculate duration estimate
    duration_estimate = "unknown"
    if start_time and end_time:
        try:
            # Try to parse timestamps
            start_dt = None
            end_dt = None

            # Handle various timestamp formats
            for fmt in ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S"]:
                try:
                    if start_dt is None:
                        start_dt = datetime.strptime(start_time, fmt)
                    if end_dt is None:
                        end_dt = datetime.strptime(end_time, fmt)
                except ValueError:
                    continue

            if start_dt and end_dt:
                duration = end_dt - start_dt
                minutes = duration.total_seconds() / 60
                duration_estimate = f"{minutes:.0f}m"
        except Exception:
            pass

    # Classify outcome
    outcome = "unknown"
    if errors:
        # Check if any errors are fatal
        fatal_errors = any("405" in e or "permission" in e.lower() or "authentication" in e.lower()
                           for e in errors)
        if fatal_errors:
            outcome = "failure"
        else:
            outcome = "partial"
    elif messages:
        # Check last message for success indicators
        last_msg = messages[-1]
        if last_msg.get("role") == "assistant":
            content = last_msg.get("content", "")
            if isinstance(content, str):
                success_indicators = ["done", "completed", "success", "merged", "pushed"]
                if any(indicator in content.lower() for indicator in success_indicators):
                    outcome = "success"
                else:
                    outcome = "unknown"

    # Deduplicate key actions (keep unique, limit to 10)
    unique_actions = []
    for action in key_actions:
        if action not in unique_actions:
            unique_actions.append(action)
        if len(unique_actions) >= 10:
            break

    # Deduplicate errors (keep unique, limit to 5)
    unique_errors = []
    for error in errors:
        if error not in unique_errors:
            unique_errors.append(error)
        if len(unique_errors) >= 5:
            break

    return SessionSummary(
        session_id=session_id,
        model=model,
        repo=repo,
        outcome=outcome,
        message_count=len(messages),
        tool_calls=tool_calls_count,
        duration_estimate=duration_estimate,
        key_actions=unique_actions,
        errors_encountered=unique_errors,
        start_time=start_time,
        end_time=end_time,
        total_tokens_estimate=total_tokens,
        user_messages=user_messages,
        assistant_messages=assistant_messages,
        tool_outputs=tool_outputs
    )


def process_session_directory(directory_path: str, output_file: Optional[str] = None) -> List[SessionSummary]:
    """
    Process all JSONL files in a directory.
    """
    directory = Path(directory_path)
    if not directory.exists():
        print(f"Error: Directory {directory_path} does not exist", file=sys.stderr)
        return []

    jsonl_files = list(directory.glob("*.jsonl"))
    if not jsonl_files:
        print(f"Warning: No JSONL files found in {directory_path}", file=sys.stderr)
        return []

    summaries = []
    for jsonl_file in sorted(jsonl_files):
        print(f"Processing {jsonl_file.name}...", file=sys.stderr)
        summary = extract_session_metadata(str(jsonl_file))
        summaries.append(summary)

    if output_file:
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump([asdict(s) for s in summaries], f, indent=2)
        print(f"Wrote {len(summaries)} summaries to {output_file}", file=sys.stderr)

    return summaries


def main():
    """CLI entry point."""
    import argparse

    parser = argparse.ArgumentParser(description="Extract metadata from Hermes session JSONL transcripts")
    parser.add_argument("path", help="Path to JSONL file or directory of session files")
    parser.add_argument("-o", "--output", help="Output JSON file (default: stdout)")
    parser.add_argument("-v", "--verbose", action="store_true", help="Verbose output")

    args = parser.parse_args()

    path = Path(args.path)

    if path.is_file():
        summary = extract_session_metadata(str(path))
        if args.output:
            with open(args.output, 'w') as f:
                json.dump(asdict(summary), f, indent=2)
            print(f"Wrote summary to {args.output}", file=sys.stderr)
        else:
            print(json.dumps(asdict(summary), indent=2))

    elif path.is_dir():
        summaries = process_session_directory(str(path), args.output)
        if not args.output:
            print(json.dumps([asdict(s) for s in summaries], indent=2))

    else:
        print(f"Error: {args.path} is not a file or directory", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
234 scripts/session_pair_harvester.py Normal file
@@ -0,0 +1,234 @@
#!/usr/bin/env python3
"""
Session Transcript → Training Pair Harvester

Scans Hermes session JSONL files for Q&A patterns and extracts
terse→rich training pairs. Outputs JSONL matching the timmy-config
training pairs spec.

Usage:
    python3 scripts/session_pair_harvester.py ~/.hermes/sessions/
    python3 scripts/session_pair_harvester.py session.jsonl --output pairs.jsonl
    python3 scripts/session_pair_harvester.py --dir ~/.hermes/sessions/ --min-ratio 2.0

Output format:
    {"terse": "user short prompt", "rich": "ai detailed response", "source": "session_id", "model": "..."}
"""

import argparse
import hashlib
import json
import sys
from pathlib import Path
from typing import Optional


def compute_hash(text: str) -> str:
    """Content hash for deduplication."""
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def extract_pairs_from_session(session_data: dict, min_ratio: float = 1.5,
                               min_response_words: int = 20) -> list:
    """Extract terse→rich pairs from a single session object."""
    pairs = []
    conversations = session_data.get("conversations", [])
    session_id = session_data.get("id", "unknown")
    model = session_data.get("model", "unknown")

    seen_hashes = set()

    for i, msg in enumerate(conversations):
        # Look for assistant/gpt responses
        if msg.get("from") not in ("gpt", "assistant"):
            continue

        response_text = msg.get("value", "")
        if not response_text or len(response_text.split()) < min_response_words:
            continue

        # Find the preceding human message
        prompt_text = ""
        for j in range(i - 1, -1, -1):
            if conversations[j].get("from") == "human":
                prompt_text = conversations[j].get("value", "")
                break

        if not prompt_text:
            continue

        # Filter: skip tool results and system messages embedded as human turns
        if prompt_text.startswith("{") and "output" in prompt_text[:100]:
            continue  # likely a tool result
        if prompt_text.startswith("# SOUL.md") or prompt_text.startswith("You are"):
            continue  # system prompt leak

        # Quality filters
        prompt_words = len(prompt_text.split())
        response_words = len(response_text.split())

        # Must have a meaningful length ratio
        if prompt_words == 0 or response_words == 0:
            continue
        ratio = response_words / prompt_words
        if ratio < min_ratio:
            continue

        # Skip responses that are mostly code
        code_blocks = response_text.count("```")
        if code_blocks >= 4 and len(response_text.replace("```", "").strip()) < 50:
            continue

        # Skip responses with tool call artifacts
        if "tool_call" in response_text[:100] or "function_call" in response_text[:100]:
            continue

        # Deduplicate by content hash
        content_hash = compute_hash(prompt_text + response_text[:200])
        if content_hash in seen_hashes:
            continue
        seen_hashes.add(content_hash)

        # Pass the response through unchanged (markdown-header stripping is not implemented yet)
        clean_response = response_text

        pairs.append({
            "terse": prompt_text.strip(),
            "rich": clean_response.strip(),
            "source": session_id,
            "model": model,
            "prompt_words": prompt_words,
            "response_words": response_words,
            "ratio": round(ratio, 2),
        })

    return pairs


def extract_from_jsonl_file(filepath: str, **kwargs) -> list:
    """Extract pairs from a session JSONL file."""
    pairs = []
    path = Path(filepath)

    if not path.exists():
        print(f"Warning: {filepath} not found", file=sys.stderr)
        return pairs

    content = path.read_text()
    lines = content.strip().split("\n")

    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            session = json.loads(line)
        except json.JSONDecodeError:
            continue

        session_pairs = extract_pairs_from_session(session, **kwargs)
        pairs.extend(session_pairs)

    return pairs


def deduplicate_pairs(pairs: list) -> list:
    """Remove duplicate pairs across files."""
    seen = set()
    unique = []
    for pair in pairs:
        key = compute_hash(pair["terse"] + pair["rich"][:200])
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique
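The hash-based dedup above keys on the prompt plus the first 200 characters of the response; restated as a self-contained run:

```python
import hashlib

def compute_hash(text: str) -> str:
    # Truncated SHA-256 is plenty for dedup keys at this scale.
    return hashlib.sha256(text.encode()).hexdigest()[:16]

def deduplicate_pairs(pairs: list) -> list:
    seen, unique = set(), []
    for pair in pairs:
        key = compute_hash(pair["terse"] + pair["rich"][:200])
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique

pairs = [
    {"terse": "fix the bug", "rich": "First, reproduce it locally..."},
    {"terse": "fix the bug", "rich": "First, reproduce it locally..."},  # exact duplicate
    {"terse": "fix the bug", "rich": "A different answer entirely."},
]
print(len(deduplicate_pairs(pairs)))  # → 2
```

Because only the first 200 response characters feed the key, two pairs with the same prompt whose responses diverge only later would collapse into one; that is a deliberate near-duplicate filter, not an exact-match one.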


def main():
    parser = argparse.ArgumentParser(description="Harvest training pairs from session transcripts")
    parser.add_argument("input", nargs="?", help="Session JSONL file or directory")
    parser.add_argument("--dir", "-d", help="Directory to scan for session files")
    parser.add_argument("--output", "-o", default="harvested_pairs.jsonl", help="Output file")
    parser.add_argument("--min-ratio", type=float, default=1.5, help="Min response/prompt word ratio")
    parser.add_argument("--min-words", type=int, default=20, help="Min response word count")
    parser.add_argument("--dry-run", action="store_true", help="Print stats without writing")
    args = parser.parse_args()

    all_pairs = []
    files_scanned = 0

    scan_dir = args.dir or args.input
    if not scan_dir:
        parser.print_help()
        sys.exit(1)

    scan_path = Path(scan_dir)
    if scan_path.is_dir():
        jsonl_files = sorted(scan_path.rglob("*.jsonl"))
        print(f"Scanning {len(jsonl_files)} files in {scan_dir}...", file=sys.stderr)
        for fpath in jsonl_files:
            pairs = extract_from_jsonl_file(
                str(fpath),
                min_ratio=args.min_ratio,
                min_response_words=args.min_words
            )
            all_pairs.extend(pairs)
            files_scanned += 1
    else:
        pairs = extract_from_jsonl_file(
            str(scan_path),
            min_ratio=args.min_ratio,
            min_response_words=args.min_words
        )
        all_pairs.extend(pairs)
        files_scanned = 1

    # Deduplicate
    unique_pairs = deduplicate_pairs(all_pairs)

    # Stats
    if unique_pairs:
        avg_prompt = sum(p["prompt_words"] for p in unique_pairs) / len(unique_pairs)
        avg_response = sum(p["response_words"] for p in unique_pairs) / len(unique_pairs)
        avg_ratio = sum(p["ratio"] for p in unique_pairs) / len(unique_pairs)
    else:
        avg_prompt = avg_response = avg_ratio = 0

    stats = {
        "files_scanned": files_scanned,
        "raw_pairs": len(all_pairs),
        "unique_pairs": len(unique_pairs),
        "duplicates_removed": len(all_pairs) - len(unique_pairs),
        "avg_prompt_words": round(avg_prompt, 1),
        "avg_response_words": round(avg_response, 1),
        "avg_ratio": round(avg_ratio, 2),
    }

    print(json.dumps(stats, indent=2), file=sys.stderr)

    if args.dry_run:
        # Print sample pairs
        for pair in unique_pairs[:3]:
            print(f"\n--- Source: {pair['source']} (ratio: {pair['ratio']}) ---", file=sys.stderr)
            print(f"TERSE: {pair['terse'][:100]}...", file=sys.stderr)
            print(f"RICH: {pair['rich'][:150]}...", file=sys.stderr)
        return

    # Write output
    output_path = Path(args.output)
    with open(output_path, "w") as f:
        for pair in unique_pairs:
            # Strip internal fields for output
            output = {
                "terse": pair["terse"],
                "rich": pair["rich"],
                "source": pair["source"],
                "model": pair["model"],
            }
            f.write(json.dumps(output) + "\n")

    print(f"\nWrote {len(unique_pairs)} pairs to {output_path}", file=sys.stderr)


if __name__ == "__main__":
    main()
142 scripts/session_reader.py Normal file
@@ -0,0 +1,142 @@
#!/usr/bin/env python3
"""
session_reader.py - Parse Hermes session JSONL transcripts.

Each line in a session file is a JSON object representing a message.
Standard fields: role (user|assistant|system), content (str), timestamp (str).
Tool calls and tool results are also captured.
"""

import json
import sys
from pathlib import Path
from typing import Iterator, Optional


def read_session(path: str) -> list[dict]:
    """Read a session JSONL file and return all messages as a list."""
    messages = []
    with open(path, 'r', encoding='utf-8') as f:
        for line_num, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue
            try:
                msg = json.loads(line)
                messages.append(msg)
            except json.JSONDecodeError as e:
                print(f"WARNING: Skipping malformed JSON at line {line_num}: {e}", file=sys.stderr)
    return messages


def read_session_iter(path: str) -> Iterator[dict]:
    """Iterate over session messages without loading all into memory."""
    with open(path, 'r', encoding='utf-8') as f:
        for line_num, line in enumerate(f, 1):
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError as e:
                print(f"WARNING: Skipping malformed JSON at line {line_num}: {e}", file=sys.stderr)


def extract_conversation(messages: list[dict]) -> list[dict]:
    """Extract user/assistant conversation turns, skipping tool-only messages."""
    conversation = []
    for msg in messages:
        role = msg.get('role', '')
        content = msg.get('content', '')

        # Skip empty messages and pure tool calls
        if role in ('user', 'assistant', 'system'):
            if isinstance(content, str) and content.strip():
                conversation.append({
                    'role': role,
                    'content': content.strip(),
                    'timestamp': msg.get('timestamp', '')
                })
            elif isinstance(content, list):
                # Multimodal content: extract the text parts
                text_parts = []
                for part in content:
                    if isinstance(part, dict) and part.get('type') == 'text':
                        text_parts.append(part.get('text', ''))
                if text_parts:
                    conversation.append({
                        'role': role,
                        'content': '\n'.join(text_parts),
                        'timestamp': msg.get('timestamp', '')
                    })
    return conversation
|
||||||
|
|
||||||
|
|
||||||
|
def truncate_for_context(messages: list[dict], head: int = 50, tail: int = 50) -> list[dict]:
|
||||||
|
"""Truncate long sessions: keep first N + last N messages.
|
||||||
|
|
||||||
|
This preserves session start (initial context) and end (final results),
|
||||||
|
skipping the messy middle of long debugging sessions.
|
||||||
|
"""
|
||||||
|
if len(messages) <= head + tail:
|
||||||
|
return messages
|
||||||
|
|
||||||
|
truncated = messages[:head]
|
||||||
|
truncated.append({
|
||||||
|
'role': 'system',
|
||||||
|
'content': f'[{len(messages) - head - tail} messages truncated]',
|
||||||
|
'timestamp': ''
|
||||||
|
})
|
||||||
|
truncated.extend(messages[-tail:])
|
||||||
|
return truncated
|
||||||
|
|
||||||
|
|
||||||
|
def messages_to_text(messages: list[dict]) -> str:
|
||||||
|
"""Convert message list to plain text for LLM consumption."""
|
||||||
|
lines = []
|
||||||
|
for msg in messages:
|
||||||
|
role = msg.get('role', 'unknown').upper()
|
||||||
|
content = msg.get('content', '')
|
||||||
|
if msg.get('role') == 'system' and 'truncated' in content:
|
||||||
|
lines.append(f'--- {content} ---')
|
||||||
|
else:
|
||||||
|
lines.append(f'{role}: {content}')
|
||||||
|
return '\n\n'.join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
def get_session_metadata(path: str) -> dict:
|
||||||
|
"""Extract metadata from a session file (first message often has config info)."""
|
||||||
|
messages = read_session(path)
|
||||||
|
if not messages:
|
||||||
|
return {'path': path, 'message_count': 0}
|
||||||
|
|
||||||
|
first = messages[0]
|
||||||
|
last = messages[-1]
|
||||||
|
|
||||||
|
return {
|
||||||
|
'path': path,
|
||||||
|
'message_count': len(messages),
|
||||||
|
'first_timestamp': first.get('timestamp', ''),
|
||||||
|
'last_timestamp': last.get('timestamp', ''),
|
||||||
|
'first_role': first.get('role', ''),
|
||||||
|
'has_tool_calls': any(m.get('tool_calls') for m in messages),
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
if len(sys.argv) < 2:
|
||||||
|
print(f"Usage: {sys.argv[0]} <session.jsonl>")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
path = sys.argv[1]
|
||||||
|
meta = get_session_metadata(path)
|
||||||
|
print(json.dumps(meta, indent=2))
|
||||||
|
|
||||||
|
messages = read_session(path)
|
||||||
|
conv = extract_conversation(messages)
|
||||||
|
print(f"\nConversation: {len(conv)} turns")
|
||||||
|
|
||||||
|
truncated = truncate_for_context(conv)
|
||||||
|
print(f"After truncation: {len(truncated)} turns")
|
||||||
|
print(f"\nPreview (first 500 chars):")
|
||||||
|
print(messages_to_text(truncated[:5])[:500])
|
||||||
147
scripts/test_automation_opportunity_finder.py
Normal file
@@ -0,0 +1,147 @@
#!/usr/bin/env python3
"""Tests for scripts/automation_opportunity_finder.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location(
    "aof",
    os.path.join(os.path.dirname(__file__) or ".", "automation_opportunity_finder.py"),
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)


def test_analyze_cron_jobs_no_file():
    """Returns empty list when no cron jobs file exists."""
    with tempfile.TemporaryDirectory() as tmpdir:
        result = mod.analyze_cron_jobs(tmpdir)
        assert result == []
    print("PASS: test_analyze_cron_jobs_no_file")


def test_analyze_cron_jobs_disabled():
    """Detects disabled cron jobs."""
    with tempfile.TemporaryDirectory() as tmpdir:
        cron_dir = os.path.join(tmpdir, "cron")
        os.makedirs(cron_dir)
        jobs = [
            {"id": "j1", "name": "backup", "enabled": False, "schedule": "0 * * * *"},
            {"id": "j2", "name": "health", "enabled": True, "schedule": "*/5 * * * *"},
        ]
        with open(os.path.join(cron_dir, "jobs.json"), "w") as f:
            json.dump(jobs, f)
        result = mod.analyze_cron_jobs(tmpdir)
        assert any(p["category"] == "cron_disabled" for p in result)
    print("PASS: test_analyze_cron_jobs_disabled")


def test_analyze_cron_jobs_errors():
    """Detects cron jobs with error status."""
    with tempfile.TemporaryDirectory() as tmpdir:
        cron_dir = os.path.join(tmpdir, "cron")
        os.makedirs(cron_dir)
        jobs = [
            {"id": "j1", "name": "broken", "enabled": True, "last_status": "error", "schedule": "0 * * * *"},
        ]
        with open(os.path.join(cron_dir, "jobs.json"), "w") as f:
            json.dump(jobs, f)
        result = mod.analyze_cron_jobs(tmpdir)
        assert any(p["category"] == "cron_errors" for p in result)
    print("PASS: test_analyze_cron_jobs_errors")


def test_analyze_documents_finds_todos():
    """Detects TODO markers in documents."""
    with tempfile.TemporaryDirectory() as tmpdir:
        docs_dir = os.path.join(tmpdir, "docs")
        os.makedirs(docs_dir)
        for i in range(3):
            with open(os.path.join(docs_dir, f"guide{i}.md"), "w") as f:
                f.write(f"# Guide {i}\n\nTODO: Automate this step\n")
        result = mod.analyze_documents([tmpdir])
        assert any(p["category"] == "manual_todo" for p in result)
        todo_proposals = [p for p in result if p["category"] == "manual_todo"]
        assert len(todo_proposals[0]["details"]) == 3
    print("PASS: test_analyze_documents_finds_todos")


def test_analyze_scripts_repeated_commands():
    """Detects repeated shell commands across scripts."""
    with tempfile.TemporaryDirectory() as tmpdir:
        scripts_dir = os.path.join(tmpdir, "scripts")
        os.makedirs(scripts_dir)
        repeated_cmd = "docker restart myapp"
        for i in range(4):
            with open(os.path.join(scripts_dir, f"deploy{i}.sh"), "w") as f:
                f.write(f"#!/bin/bash\n{repeated_cmd}\n")
        result = mod.analyze_scripts([tmpdir])
        assert any(p["category"] == "repeated_command" for p in result)
    print("PASS: test_analyze_scripts_repeated_commands")


def test_analyze_session_transcripts():
    """Detects repeated tool-call sequences."""
    with tempfile.TemporaryDirectory() as tmpdir:
        sessions_dir = os.path.join(tmpdir, "sessions")
        os.makedirs(sessions_dir)
        for i in range(4):
            with open(os.path.join(sessions_dir, f"session{i}.jsonl"), "w") as f:
                f.write(json.dumps({"role": "user", "content": f"task {i}"}) + "\n")
                f.write(json.dumps({
                    "role": "assistant",
                    "content": "working",
                    "tool_calls": [
                        {"function": {"name": "read_file"}},
                        {"function": {"name": "write_file"}},
                    ]
                }) + "\n")
        result = mod.analyze_session_transcripts([sessions_dir])
        assert any(p["category"] == "tool_sequence" for p in result)
        seq_proposals = [p for p in result if p["category"] == "tool_sequence"]
        assert any("read_file" in p["title"] and "write_file" in p["title"] for p in seq_proposals)
    print("PASS: test_analyze_session_transcripts")


def test_deduplicate_proposals():
    """Deduplicates proposals with similar titles."""
    proposals = [
        {"title": "TODO found 3 times", "category": "manual_todo", "confidence": 0.7, "impact": "medium", "description": "x", "sources": []},
        {"title": "TODO found 3 times", "category": "manual_todo", "confidence": 0.7, "impact": "medium", "description": "x", "sources": []},
        {"title": "FIXME found 5 times", "category": "manual_fixme", "confidence": 0.8, "impact": "medium", "description": "y", "sources": []},
    ]
    result = mod.deduplicate_proposals(proposals)
    assert len(result) == 2
    print("PASS: test_deduplicate_proposals")


def test_rank_proposals():
    """Ranks proposals by impact * confidence."""
    proposals = [
        {"title": "low", "category": "x", "confidence": 0.9, "impact": "low", "description": "", "sources": []},
        {"title": "high", "category": "x", "confidence": 0.8, "impact": "high", "description": "", "sources": []},
        {"title": "med", "category": "x", "confidence": 0.7, "impact": "medium", "description": "", "sources": []},
    ]
    result = mod.rank_proposals(proposals)
    assert result[0]["title"] == "high"
    assert result[-1]["title"] == "low"
    print("PASS: test_rank_proposals")


if __name__ == "__main__":
    tests = [v for k, v in globals().items() if k.startswith("test_")]
    passed = 0
    failed = 0
    for t in tests:
        try:
            t()
            passed += 1
        except Exception as e:
            print(f"FAIL: {t.__name__}: {e}")
            failed += 1
    print(f"\n{passed}/{passed+failed} tests passed")
    sys.exit(1 if failed else 0)
239
scripts/test_bootstrapper.py
Normal file
@@ -0,0 +1,239 @@
#!/usr/bin/env python3
"""
Tests for bootstrapper.py — context assembly from knowledge store.
"""

import json
import sys
import tempfile
from pathlib import Path

# Add scripts dir to path for import
sys.path.insert(0, str(Path(__file__).resolve().parent))

from bootstrapper import (
    build_bootstrap_context,
    estimate_tokens,
    filter_facts,
    load_index,
    sort_facts,
    truncate_to_tokens,
)


def make_index(facts: list[dict], tmp_dir: Path) -> Path:
    """Create a temporary index.json with given facts."""
    index = {
        "version": 1,
        "last_updated": "2026-04-13T20:00:00Z",
        "total_facts": len(facts),
        "facts": facts,
    }
    path = tmp_dir / "index.json"
    with open(path, "w") as f:
        json.dump(index, f)
    return path


def test_empty_index():
    """Empty knowledge store produces graceful output."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        index_path = make_index([], tmp_dir)

        # Create empty knowledge dirs
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        context = build_bootstrap_context(
            repo="the-nexus", index_path=index_path
        )
        assert "No relevant knowledge found" in context
        assert "Starting fresh" in context
    print("PASS: empty_index")


def test_filter_by_repo():
    """Filter facts by repository."""
    facts = [
        {"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
        {"fact": "B", "category": "fact", "repo": "fleet-ops", "confidence": 0.8},
        {"fact": "C", "category": "fact", "repo": "global", "confidence": 0.7},
    ]
    filtered = filter_facts(facts, repo="the-nexus", include_global=True)
    texts = [f["fact"] for f in filtered]
    assert "A" in texts
    assert "B" not in texts
    assert "C" in texts
    print("PASS: filter_by_repo")


def test_filter_by_agent():
    """Filter facts by agent type."""
    facts = [
        {"fact": "A", "category": "pattern", "repo": "global", "agent": "mimo-sprint", "confidence": 0.8},
        {"fact": "B", "category": "pattern", "repo": "global", "agent": "groq-fast", "confidence": 0.7},
        {"fact": "C", "category": "fact", "repo": "global", "confidence": 0.9},
    ]
    filtered = filter_facts(facts, agent="mimo-sprint", include_global=True)
    texts = [f["fact"] for f in filtered]
    assert "A" in texts
    assert "B" not in texts
    assert "C" in texts  # global, no agent restriction
    print("PASS: filter_by_agent")


def test_no_global_flag():
    """Excluding global facts works."""
    facts = [
        {"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
        {"fact": "B", "category": "fact", "repo": "global", "confidence": 0.8},
    ]
    filtered = filter_facts(facts, repo="the-nexus", include_global=False)
    texts = [f["fact"] for f in filtered]
    assert "A" in texts
    assert "B" not in texts
    print("PASS: no_global_flag")


def test_sort_by_confidence():
    """Facts sort by confidence descending."""
    facts = [
        {"fact": "low", "category": "fact", "repo": "global", "confidence": 0.3},
        {"fact": "high", "category": "fact", "repo": "global", "confidence": 0.95},
        {"fact": "mid", "category": "fact", "repo": "global", "confidence": 0.7},
    ]
    sorted_f = sort_facts(facts)
    assert sorted_f[0]["fact"] == "high"
    assert sorted_f[1]["fact"] == "mid"
    assert sorted_f[2]["fact"] == "low"
    print("PASS: sort_by_confidence")


def test_sort_pitfalls_first():
    """Pitfalls sort before facts at same confidence."""
    facts = [
        {"fact": "regular fact", "category": "fact", "repo": "global", "confidence": 0.8},
        {"fact": "danger pitfall", "category": "pitfall", "repo": "global", "confidence": 0.8},
    ]
    sorted_f = sort_facts(facts)
    assert sorted_f[0]["category"] == "pitfall"
    print("PASS: sort_pitfalls_first")


def test_truncate_to_tokens():
    """Truncation cuts at line boundary."""
    text = "line1\nline2\nline3\nline4\nline5\n"
    truncated = truncate_to_tokens(text, max_tokens=2)  # ~8 chars
    assert "line1" in truncated
    assert "truncated" in truncated.lower()
    print("PASS: truncate_to_tokens")


def test_estimate_tokens():
    """Token estimation is reasonable."""
    text = "a" * 400
    tokens = estimate_tokens(text)
    assert 90 <= tokens <= 110  # ~100 tokens
    print("PASS: estimate_tokens")


def test_build_full_context():
    """Full context with facts renders correctly."""
    facts = [
        {"fact": "API merges fail with 405", "category": "pitfall", "repo": "the-nexus", "confidence": 0.95},
        {"fact": "Has 50+ open PRs", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
        {"fact": "Token at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9},
        {"fact": "Check git remote -v first", "category": "pattern", "repo": "global", "confidence": 0.8},
    ]

    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        index_path = make_index(facts, tmp_dir)

        # Create knowledge dirs
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        context = build_bootstrap_context(
            repo="the-nexus",
            agent="mimo-sprint",
            include_global=True,
            index_path=index_path,
        )

        assert "What You Know" in context
        assert "PITFALLS" in context
        assert "API merges fail with 405" in context
        assert "the-nexus" in context
        assert "Token at" in context  # global fact included
    print("PASS: build_full_context")


def test_max_tokens_respected():
    """Output respects max_tokens limit."""
    # Generate lots of facts
    facts = [
        {"fact": f"Fact number {i} with some detail about things", "category": "fact", "repo": "global", "confidence": 0.8}
        for i in range(100)
    ]

    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        index_path = make_index(facts, tmp_dir)
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        context = build_bootstrap_context(
            repo=None,
            max_tokens=500,
            index_path=index_path,
        )

        actual_tokens = estimate_tokens(context)
        # Allow 10% overshoot since we cut at line boundaries
        assert actual_tokens <= 550, f"Expected ~500 tokens, got {actual_tokens}"
        print(f"PASS: max_tokens_respected (got {actual_tokens} tokens)")


def test_missing_index_graceful():
    """Missing index.json doesn't crash."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        # Don't create index.json
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        fake_index = tmp_dir / "nonexistent.json"
        context = build_bootstrap_context(repo="anything", index_path=fake_index)
        assert "No relevant knowledge found" in context
    print("PASS: missing_index_graceful")


if __name__ == "__main__":
    tests = [
        test_empty_index,
        test_filter_by_repo,
        test_filter_by_agent,
        test_no_global_flag,
        test_sort_by_confidence,
        test_sort_pitfalls_first,
        test_truncate_to_tokens,
        test_estimate_tokens,
        test_build_full_context,
        test_max_tokens_respected,
        test_missing_index_graceful,
    ]

    passed = 0
    failed = 0
    for test in tests:
        try:
            test()
            passed += 1
        except Exception as e:
            print(f"FAIL: {test.__name__} — {e}")
            failed += 1

    print(f"\n{passed} passed, {failed} failed")
    sys.exit(0 if failed == 0 else 1)
189
scripts/test_diff_analyzer.py
Normal file
@@ -0,0 +1,189 @@
#!/usr/bin/env python3
"""Tests for scripts/diff_analyzer.py — 10 tests."""

import sys
import os
sys.path.insert(0, os.path.dirname(__file__) or ".")

import importlib.util
spec = importlib.util.spec_from_file_location("da", os.path.join(os.path.dirname(__file__) or ".", "diff_analyzer.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
DiffAnalyzer = mod.DiffAnalyzer
ChangeCategory = mod.ChangeCategory


SAMPLE_ADD = """diff --git a/new.py b/new.py
new file mode 100644
--- /dev/null
+++ b/new.py
@@ -0,0 +1,3 @@
+def hello():
+    print("world")
+    return True
"""

SAMPLE_DELETE = """diff --git a/old.py b/old.py
deleted file mode 100644
--- a/old.py
+++ /dev/null
@@ -1,2 +0,0 @@
-def goodbye():
-    pass
"""

SAMPLE_MODIFY = """diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,3 +1,4 @@
 def main():
-    print("old")
+    print("new")
+    print("extra")
     return 0
"""

SAMPLE_RENAME = """diff --git a/old_name.py b/new_name.py
rename from old_name.py
rename to new_name.py
--- a/old_name.py
+++ b/new_name.py
@@ -1,1 +1,1 @@
-old content
+new content
"""

SAMPLE_MULTI = """diff --git a/a.py b/a.py
--- a/a.py
+++ b/a.py
@@ -1,1 +1,2 @@
 existing
+added line
diff --git b/b.py b/b.py
new file mode 100644
--- /dev/null
+++ b/b.py
@@ -0,0 +1,1 @@
+new file
"""

SAMPLE_BINARY = """diff --git a/img.png b/img.png
Binary files a/img.png and b/img.png differ
"""


def test_empty():
    a = DiffAnalyzer()
    s = a.analyze("")
    assert s.total_files_changed == 0
    print("PASS: test_empty")


def test_addition():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_ADD)
    assert s.total_files_changed == 1
    assert s.total_added == 3
    assert s.total_deleted == 0
    assert s.new_files == 1
    assert s.files[0].hunks[0].category == ChangeCategory.ADDED
    print("PASS: test_addition")


def test_deletion():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_DELETE)
    assert s.total_deleted == 2
    assert s.deleted_files == 1
    assert s.files[0].hunks[0].category == ChangeCategory.DELETED
    print("PASS: test_deletion")


def test_modification():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_MODIFY)
    assert s.total_added == 2
    assert s.total_deleted == 1
    assert s.files[0].hunks[0].category == ChangeCategory.MODIFIED
    print("PASS: test_modification")


def test_rename():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_RENAME)
    assert s.renamed_files == 1
    assert s.files[0].old_path == "old_name.py"
    assert s.files[0].path == "new_name.py"
    assert s.files[0].is_renamed
    print("PASS: test_rename")


def test_multiple_files():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_MULTI)
    assert s.total_files_changed == 2
    assert s.new_files == 1
    print("PASS: test_multiple_files")


def test_binary():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_BINARY)
    assert s.binary_files == 1
    assert s.files[0].is_binary
    assert len(s.files[0].hunks) == 0
    print("PASS: test_binary")


def test_to_dict():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_MODIFY)
    d = s.to_dict()
    assert "total_files_changed" in d
    assert "files" in d
    assert isinstance(d["files"], list)
    print("PASS: test_to_dict")


def test_context_only():
    diff = """diff --git a/f.py b/f.py
--- a/f.py
+++ b/f.py
@@ -1,3 +1,3 @@
 line1
-old
+new
 line3
"""
    a = DiffAnalyzer()
    s = a.analyze(diff)
    # Has both added and deleted = MODIFIED
    assert s.files[0].hunks[0].category == ChangeCategory.MODIFIED
    print("PASS: test_context_only")


def test_multi_hunk():
    diff = """diff --git a/f.py b/f.py
--- a/f.py
+++ b/f.py
@@ -1,1 +1,2 @@
 existing
+first addition
@@ -10,1 +11,2 @@
 more
+second addition
"""
    a = DiffAnalyzer()
    s = a.analyze(diff)
    assert s.total_hunks == 2
    assert s.total_added == 2
    print("PASS: test_multi_hunk")


def run_all():
    test_empty()
    test_addition()
    test_deletion()
    test_modification()
    test_rename()
    test_multiple_files()
    test_binary()
    test_to_dict()
    test_context_only()
    test_multi_hunk()
    print("\nAll 10 tests passed!")


if __name__ == "__main__":
    run_all()
109
scripts/test_gitea_issue_parser.py
Normal file
@@ -0,0 +1,109 @@
#!/usr/bin/env python3
"""Tests for scripts/gitea_issue_parser.py"""

import sys
import os
sys.path.insert(0, os.path.dirname(__file__) or ".")

# Import from sibling
import importlib.util
spec = importlib.util.spec_from_file_location("parser", os.path.join(os.path.dirname(__file__) or ".", "gitea_issue_parser.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
parse_issue_body = mod.parse_issue_body


def test_basic_parsing():
    body = """## Context

This is the background info.

## Acceptance Criteria

- [ ] First criterion
- [x] Second criterion (done)

## What to build

Some description.
"""
    result = parse_issue_body(body, title="Test (#42)", labels=["bug"])
    assert result["title"] == "Test (#42)"
    assert result["labels"] == ["bug"]
    assert result["epic_ref"] == 42
    assert len(result["criteria"]) == 2
    assert result["criteria"][0]["text"] == "First criterion"
    assert result["criteria"][0]["checked"] is False
    assert result["criteria"][1]["checked"] is True
    assert "context" in result["sections"]
    print("PASS: test_basic_parsing")


def test_numbered_criteria():
    body = """## Acceptance Criteria

1. First item
2. Second item
3. Third item
"""
    result = parse_issue_body(body)
    assert len(result["criteria"]) == 3
    assert result["criteria"][0]["text"] == "First item"
    print("PASS: test_numbered_criteria")


def test_epic_ref_from_body():
    body = "Closes #123\n\nSome description."
    result = parse_issue_body(body)
    assert result["epic_ref"] == 123
    print("PASS: test_epic_ref_from_body")


def test_empty_body():
    result = parse_issue_body("")
    assert result["criteria"] == []
    assert result["context"] == ""
    assert result["sections"] == {}
    print("PASS: test_empty_body")


def test_no_sections():
    body = "Just a plain issue body with no headings."
    result = parse_issue_body(body)
    assert result["context"] == "Just a plain issue body with no headings."
    print("PASS: test_no_sections")


def test_multiple_sections():
    body = """## Problem

Something is broken.

## Fix

Do this instead.

## Notes

Additional info.
"""
    result = parse_issue_body(body)
    assert "problem" in result["sections"]
    assert "fix" in result["sections"]
    assert "notes" in result["sections"]
    assert "Something is broken" in result["sections"]["problem"]
    print("PASS: test_multiple_sections")


def run_all():
    test_basic_parsing()
    test_numbered_criteria()
    test_epic_ref_from_body()
    test_empty_body()
    test_no_sections()
    test_multiple_sections()
    print("\nAll 6 tests passed!")


if __name__ == "__main__":
    run_all()
scripts/test_harvest_prompt.py
@@ -1,41 +1,129 @@
 #!/usr/bin/env python3
 """
-Test script for knowledge extraction prompt.
-Validates that the prompt produces consistent, structured output.
+Test harness for knowledge extraction prompt.
+Validates output structure, content quality, and hallucination resistance.
+
+Usage:
+    python3 scripts/test_harvest_prompt.py                     # Run all tests
+    python3 scripts/test_harvest_prompt.py --transcript FILE   # Test against a real transcript
+    python3 scripts/test_harvest_prompt.py --validate FILE     # Validate an existing extraction JSON
 """
 
 import json
 import sys
+import argparse
 from pathlib import Path
 
-def validate_knowledge_item(item):
-    """Validate a single knowledge item."""
-    required_fields = ["fact", "category", "repo", "confidence"]
-    for field in required_fields:
-        if field not in item:
-            return False, f"Missing field: {field}"
-
-    if not isinstance(item["fact"], str) or len(item["fact"].strip()) == 0:
-        return False, "Fact must be a non-empty string"
-
-    valid_categories = ["fact", "pitfall", "pattern", "tool-quirk", "question"]
-    if item["category"] not in valid_categories:
-        return False, f"Invalid category: {item['category']}"
-
-    if not isinstance(item["repo"], str):
-        return False, "Repo must be a string"
-
-    if not isinstance(item["confidence"], (int, float)):
-        return False, "Confidence must be a number"
-
-    if not (0.0 <= item["confidence"] <= 1.0):
-        return False, "Confidence must be between 0.0 and 1.0"
-
-    return True, "Valid"
-
-
-def test_sample_transcript():
-    """Test with a sample transcript."""
-    sample_transcript = """
+VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
+REQUIRED_FIELDS = {"fact", "category", "repo", "confidence", "evidence"}
+REQUIRED_META = {"session_outcome", "tools_used", "repos_touched", "error_count", "knowledge_count"}
+
+
+def validate_knowledge_item(item, idx):
+    """Validate a single knowledge item. Returns list of errors."""
+    errors = []
+    if not isinstance(item, dict):
+        return [f"Item {idx}: not a dict"]
+    for field in REQUIRED_FIELDS:
+        if field not in item:
+            errors.append(f"Item {idx}: missing field '{field}'")
+    if not isinstance(item.get("fact", ""), str) or len(item.get("fact", "").strip()) == 0:
+        errors.append(f"Item {idx}: fact must be a non-empty string")
+    if item.get("category") not in VALID_CATEGORIES:
+        errors.append(f"Item {idx}: invalid category '{item.get('category')}'")
+    if not isinstance(item.get("repo", ""), str) or len(item.get("repo", "").strip()) == 0:
+        errors.append(f"Item {idx}: repo must be a non-empty string")
+    conf = item.get("confidence")
+    if not isinstance(conf, (int, float)) or not (0.0 <= conf <= 1.0):
+        errors.append(f"Item {idx}: confidence must be a number 0.0-1.0, got {conf}")
+    if not isinstance(item.get("evidence", ""), str) or len(item.get("evidence", "").strip()) == 0:
+        errors.append(f"Item {idx}: evidence must be a non-empty string (hallucination check)")
+    return errors
+
+
+def validate_extraction(data):
+    """Validate a full extraction result. Returns (is_valid, errors, warnings)."""
+    errors = []
+    warnings = []
+
+    if not isinstance(data, dict):
+        return False, ["Root is not a JSON object"], []
+
+    if "knowledge" not in data:
+        return False, ["Missing 'knowledge' array"], []
+
+    if not isinstance(data["knowledge"], list):
+        return False, ["'knowledge' is not an array"], []
+
+    for i, item in enumerate(data["knowledge"]):
|
||||||
|
errors.extend(validate_knowledge_item(item, i))
|
||||||
|
|
||||||
|
# Meta block validation
|
||||||
|
if "meta" not in data:
|
||||||
|
warnings.append("Missing 'meta' block (session_outcome, tools_used, etc.)")
|
||||||
|
else:
|
||||||
|
meta = data["meta"]
|
||||||
|
for field in REQUIRED_META:
|
||||||
|
if field not in meta:
|
||||||
|
warnings.append(f"Meta missing field '{field}'")
|
||||||
|
|
||||||
|
# Quality checks
|
||||||
|
facts = data["knowledge"]
|
||||||
|
if len(facts) == 0:
|
||||||
|
warnings.append("No knowledge extracted (empty session or extraction failure)")
|
||||||
|
|
||||||
|
# Check for near-duplicate facts
|
||||||
|
seen_facts = set()
|
||||||
|
for item in facts:
|
||||||
|
normalized = item.get("fact", "").lower().strip()[:80]
|
||||||
|
if normalized in seen_facts:
|
||||||
|
warnings.append(f"Duplicate fact detected: '{normalized[:50]}...'")
|
||||||
|
seen_facts.add(normalized)
|
||||||
|
|
||||||
|
# Check confidence distribution
|
||||||
|
confidences = [item.get("confidence", 0) for item in facts]
|
||||||
|
if confidences:
|
||||||
|
avg_conf = sum(confidences) / len(confidences)
|
||||||
|
if avg_conf > 0.9:
|
||||||
|
warnings.append(f"Average confidence {avg_conf:.2f} is suspiciously high (possible hallucination)")
|
||||||
|
if avg_conf < 0.4:
|
||||||
|
warnings.append(f"Average confidence {avg_conf:.2f} is very low (extraction may be too cautious)")
|
||||||
|
|
||||||
|
return len(errors) == 0, errors, warnings
|
||||||
|
|
||||||
|
|
||||||
|
def validate_transcript_coverage(data, transcript):
|
||||||
|
"""
|
||||||
|
Check that extracted facts are actually supported by the transcript.
|
||||||
|
This is a heuristic — checks that key terms from facts appear in transcript.
|
||||||
|
Returns list of potential hallucinations.
|
||||||
|
"""
|
||||||
|
hallucinations = []
|
||||||
|
transcript_lower = transcript.lower()
|
||||||
|
|
||||||
|
for item in data.get("knowledge", []):
|
||||||
|
fact = item.get("fact", "")
|
||||||
|
evidence = item.get("evidence", "")
|
||||||
|
|
||||||
|
# Check if evidence string appears in transcript
|
||||||
|
if evidence and evidence.lower() not in transcript_lower:
|
||||||
|
# Partial match — check if key terms are present
|
||||||
|
evidence_words = set(evidence.lower().split())
|
||||||
|
transcript_words = set(transcript_lower.split())
|
||||||
|
overlap = evidence_words & transcript_words
|
||||||
|
if len(overlap) < min(3, len(evidence_words) * 0.3):
|
||||||
|
hallucinations.append({
|
||||||
|
"fact": fact[:80],
|
||||||
|
"evidence": evidence[:80],
|
||||||
|
"issue": "Evidence not found in transcript — possible hallucination"
|
||||||
|
})
|
||||||
|
|
||||||
|
return hallucinations
|
||||||
|
|
||||||
|
|
||||||
|
# --- Test transcripts ---
|
||||||
|
|
||||||
|
SAMPLE_TRANSCRIPT_SUCCESS = """
|
||||||
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
|
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
|
||||||
Agent: Trying git clone... Error: 401 Unauthorized
|
Agent: Trying git clone... Error: 401 Unauthorized
|
||||||
Agent: Checking for token... Found in ~/.config/gitea/token
|
Agent: Checking for token... Found in ~/.config/gitea/token
|
||||||
@@ -44,30 +132,253 @@ Agent: Clone successful with token authentication
|
|||||||
Agent: Now examining the repo structure...
|
Agent: Now examining the repo structure...
|
||||||
Agent: Found knowledge/index.json with 0 facts
|
Agent: Found knowledge/index.json with 0 facts
|
||||||
Agent: Creating harvest-prompt.md in templates/
|
Agent: Creating harvest-prompt.md in templates/
|
||||||
|
Agent: Done. Files created: templates/harvest-prompt.md, scripts/test_harvest_prompt.py
|
||||||
"""
|
"""
|
||||||
|
|
||||||
# This would be replaced with actual prompt execution
|
|
||||||
print("Sample transcript processed")
|
|
||||||
print("Expected categories: fact, pitfall, pattern, tool-quirk, question")
|
|
||||||
return True
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
SAMPLE_TRANSCRIPT_FAILURE = """
|
||||||
print("Testing knowledge extraction prompt...")
|
User: Deploy the cron jobs to production
|
||||||
|
Agent: Connecting to VPS... SSH timeout after 30s
|
||||||
# Test 1: Validate prompt file exists
|
Agent: Retrying with verbose output... Connection refused on port 22
|
||||||
|
Agent: Checking firewall rules... iptables shows port 22 is open
|
||||||
|
Agent: Checking sshd config... sshd is not running
|
||||||
|
Agent: Attempting to start sshd... systemctl start sshd failed: unit not found
|
||||||
|
Agent: ERROR: OpenSSH server not installed on this VPS
|
||||||
|
Agent: Cannot proceed without SSH access. Deployment blocked.
|
||||||
|
"""
|
||||||
|
|
||||||
|
SAMPLE_TRANSCRIPT_API_QUIRK = """
|
||||||
|
User: List all open issues in hermes-agent
|
||||||
|
Agent: Querying Gitea API... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?state=open&limit=50
|
||||||
|
Agent: Response: 0 items. That's odd.
|
||||||
|
Agent: Trying without state filter... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?limit=50
|
||||||
|
Agent: Response: 50 items returned, but all have pull_request field. These are PRs, not issues.
|
||||||
|
Agent: Gitea issues API returns PRs alongside issues. Must filter by presence of pull_request key.
|
||||||
|
Agent: Also discovered: state=open parameter returns empty when used with limit=50, but works with limit=5.
|
||||||
|
Agent: This appears to be a Gitea API quirk — pagination interacts with state filter unexpectedly.
|
||||||
|
Agent: Workaround: fetch without state filter, then filter client-side for state=='open' and 'pull_request' not in item.
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
def run_tests():
|
||||||
|
"""Run the built-in test suite."""
|
||||||
|
tests_passed = 0
|
||||||
|
tests_failed = 0
|
||||||
|
|
||||||
|
print("=" * 60)
|
||||||
|
print("KNOWLEDGE EXTRACTION PROMPT — TEST SUITE")
|
||||||
|
print("=" * 60)
|
||||||
|
|
||||||
|
# Test 1: Prompt file exists and is under 2k tokens (~8k chars)
|
||||||
|
print("\n[Test 1] Prompt file size constraint")
|
||||||
prompt_path = Path("templates/harvest-prompt.md")
|
prompt_path = Path("templates/harvest-prompt.md")
|
||||||
if not prompt_path.exists():
|
if not prompt_path.exists():
|
||||||
print("ERROR: harvest-prompt.md not found")
|
print(" FAIL: harvest-prompt.md not found")
|
||||||
sys.exit(1)
|
tests_failed += 1
|
||||||
|
else:
|
||||||
print(f"OK: Prompt file exists: {prompt_path}")
|
size = prompt_path.stat().st_size
|
||||||
|
# Rough token estimate: ~4 chars per token
|
||||||
# Test 2: Check prompt size
|
est_tokens = size / 4
|
||||||
prompt_size = prompt_path.stat().st_size
|
print(f" Prompt size: {size} bytes (~{est_tokens:.0f} tokens)")
|
||||||
print(f"OK: Prompt size: {prompt_size} bytes")
|
if est_tokens > 2000:
|
||||||
|
print(f" WARN: Prompt exceeds ~1500 tokens (target: ~1000)")
|
||||||
# Test 3: Test sample transcript processing
|
else:
|
||||||
if test_sample_transcript():
|
print(f" PASS: Within token budget")
|
||||||
print("OK: Sample transcript test passed")
|
tests_passed += 1
|
||||||
|
|
||||||
print("\nAll tests passed!")
|
# Test 2: Validate a well-formed extraction
|
||||||
|
print("\n[Test 2] Valid extraction passes validation")
|
||||||
|
valid_extraction = {
|
||||||
|
"knowledge": [
|
||||||
|
{
|
||||||
|
"fact": "Gitea auth token is at ~/.config/gitea/token",
|
||||||
|
"category": "tool-quirk",
|
||||||
|
"repo": "global",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"evidence": "Found in ~/.config/gitea/token"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"fact": "Clone fails with 401 when no token is provided",
|
||||||
|
"category": "pitfall",
|
||||||
|
"repo": "compounding-intelligence",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"evidence": "Error: 401 Unauthorized"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"meta": {
|
||||||
|
"session_outcome": "success",
|
||||||
|
"tools_used": ["git"],
|
||||||
|
"repos_touched": ["compounding-intelligence"],
|
||||||
|
"error_count": 1,
|
||||||
|
"knowledge_count": 2
|
||||||
|
}
|
||||||
|
}
|
||||||
|
is_valid, errors, warnings = validate_extraction(valid_extraction)
|
||||||
|
if is_valid:
|
||||||
|
print(f" PASS: Valid extraction accepted ({len(warnings)} warnings)")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Valid extraction rejected: {errors}")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Test 3: Reject missing fields
|
||||||
|
print("\n[Test 3] Missing fields are rejected")
|
||||||
|
bad_extraction = {
|
||||||
|
"knowledge": [
|
||||||
|
{"fact": "Something learned", "category": "fact"} # Missing repo, confidence, evidence
|
||||||
|
]
|
||||||
|
}
|
||||||
|
is_valid, errors, warnings = validate_extraction(bad_extraction)
|
||||||
|
if not is_valid:
|
||||||
|
print(f" PASS: Rejected with {len(errors)} errors")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Should have rejected missing fields")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Test 4: Reject invalid category
|
||||||
|
print("\n[Test 4] Invalid category is rejected")
|
||||||
|
bad_cat = {
|
||||||
|
"knowledge": [
|
||||||
|
{"fact": "Test", "category": "discovery", "repo": "x", "confidence": 0.8, "evidence": "test"}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
is_valid, errors, warnings = validate_extraction(bad_cat)
|
||||||
|
if not is_valid and any("category" in e for e in errors):
|
||||||
|
print(f" PASS: Invalid category 'discovery' rejected")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Should have rejected invalid category")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Test 5: Detect near-duplicates
|
||||||
|
print("\n[Test 5] Duplicate detection")
|
||||||
|
dup_extraction = {
|
||||||
|
"knowledge": [
|
||||||
|
{"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "a"},
|
||||||
|
{"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "b"}
|
||||||
|
],
|
||||||
|
"meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 2}
|
||||||
|
}
|
||||||
|
is_valid, errors, warnings = validate_extraction(dup_extraction)
|
||||||
|
if any("Duplicate" in w for w in warnings):
|
||||||
|
print(f" PASS: Duplicate detected")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Should have detected duplicate")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Test 6: Hallucination check against transcript
|
||||||
|
print("\n[Test 6] Hallucination detection")
|
||||||
|
hallucinated = {
|
||||||
|
"knowledge": [
|
||||||
|
{
|
||||||
|
"fact": "Database port is 5433",
|
||||||
|
"category": "fact",
|
||||||
|
"repo": "x",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"evidence": "PostgreSQL listening on port 5433"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 1}
|
||||||
|
}
|
||||||
|
hallucinations = validate_transcript_coverage(hallucinated, SAMPLE_TRANSCRIPT_SUCCESS)
|
||||||
|
if hallucinations:
|
||||||
|
print(f" PASS: Hallucination detected ({len(hallucinations)} items)")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Should have detected hallucinated evidence")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Test 7: Failed session should extract pitfalls
|
||||||
|
print("\n[Test 7] Failed session extraction shape")
|
||||||
|
failed_extraction = {
|
||||||
|
"knowledge": [
|
||||||
|
{
|
||||||
|
"fact": "SSH server not installed on target VPS",
|
||||||
|
"category": "pitfall",
|
||||||
|
"repo": "global",
|
||||||
|
"confidence": 0.9,
|
||||||
|
"evidence": "ERROR: OpenSSH server not installed on this VPS"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"fact": "VPS blocks deployment without SSH access",
|
||||||
|
"category": "question",
|
||||||
|
"repo": "global",
|
||||||
|
"confidence": 0.7,
|
||||||
|
"evidence": "Cannot proceed without SSH access. Deployment blocked."
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"meta": {
|
||||||
|
"session_outcome": "failed",
|
||||||
|
"tools_used": ["ssh", "systemctl"],
|
||||||
|
"repos_touched": [],
|
||||||
|
"error_count": 3,
|
||||||
|
"knowledge_count": 2
|
||||||
|
}
|
||||||
|
}
|
||||||
|
is_valid, errors, warnings = validate_extraction(failed_extraction)
|
||||||
|
if is_valid:
|
||||||
|
categories = [item["category"] for item in failed_extraction["knowledge"]]
|
||||||
|
if "pitfall" in categories:
|
||||||
|
print(f" PASS: Failed session extracted {len(categories)} items including pitfalls")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Failed session should extract pitfalls")
|
||||||
|
tests_failed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: {errors}")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Test 8: Empty extraction is warned
|
||||||
|
print("\n[Test 8] Empty extraction warning")
|
||||||
|
empty = {"knowledge": [], "meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 0}}
|
||||||
|
is_valid, errors, warnings = validate_extraction(empty)
|
||||||
|
if any("No knowledge" in w for w in warnings):
|
||||||
|
print(f" PASS: Empty extraction warned")
|
||||||
|
tests_passed += 1
|
||||||
|
else:
|
||||||
|
print(f" FAIL: Should warn on empty extraction")
|
||||||
|
tests_failed += 1
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
print(f"\n{'=' * 60}")
|
||||||
|
print(f"Results: {tests_passed} passed, {tests_failed} failed")
|
||||||
|
print(f"{'=' * 60}")
|
||||||
|
return tests_failed == 0
|
||||||
|
|
||||||
|
|
||||||
|
def validate_file(filepath):
|
||||||
|
"""Validate an existing extraction JSON file."""
|
||||||
|
path = Path(filepath)
|
||||||
|
if not path.exists():
|
||||||
|
print(f"ERROR: {filepath} not found")
|
||||||
|
return False
|
||||||
|
|
||||||
|
data = json.loads(path.read_text())
|
||||||
|
is_valid, errors, warnings = validate_extraction(data)
|
||||||
|
|
||||||
|
print(f"Validation of {filepath}:")
|
||||||
|
print(f" Knowledge items: {len(data.get('knowledge', []))}")
|
||||||
|
print(f" Errors: {len(errors)}")
|
||||||
|
print(f" Warnings: {len(warnings)}")
|
||||||
|
|
||||||
|
for e in errors:
|
||||||
|
print(f" ERROR: {e}")
|
||||||
|
for w in warnings:
|
||||||
|
print(f" WARN: {w}")
|
||||||
|
|
||||||
|
return is_valid
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
parser = argparse.ArgumentParser(description="Test knowledge extraction prompt")
|
||||||
|
parser.add_argument("--validate", help="Validate an existing extraction JSON file")
|
||||||
|
parser.add_argument("--transcript", help="Test against a real transcript file (informational)")
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
if args.validate:
|
||||||
|
success = validate_file(args.validate)
|
||||||
|
sys.exit(0 if success else 1)
|
||||||
|
else:
|
||||||
|
success = run_tests()
|
||||||
|
sys.exit(0 if success else 1)
|
||||||
|
|||||||
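The schema rules enforced by the validator above can be exercised in isolation. The following is a condensed, self-contained sketch of the same checks; the constant names mirror the script's, and the two sample items are invented for illustration:

```python
# Condensed sketch of the per-item validation rules the harness enforces.
# Constant names mirror the test script; sample items are invented.
REQUIRED_FIELDS = {"fact", "category", "repo", "confidence", "evidence"}
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}


def check_item(item):
    errors = []
    # Set difference against dict keys finds every missing field at once.
    for field in REQUIRED_FIELDS - item.keys():
        errors.append(f"missing field '{field}'")
    if item.get("category") not in VALID_CATEGORIES:
        errors.append(f"invalid category '{item.get('category')}'")
    conf = item.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        errors.append("confidence must be a number 0.0-1.0")
    return errors


good = {"fact": "x", "category": "pitfall", "repo": "r", "confidence": 0.8, "evidence": "e"}
bad = {"fact": "x", "category": "discovery"}

print(check_item(good))  # []
print(check_item(bad))   # 3 missing fields + bad category + bad confidence
```

This mirrors the error-list style of the full validator: callers accumulate errors per item instead of failing on the first problem, which makes batch reports over a whole extraction possible.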
212  scripts/test_harvest_prompt_comprehensive.py  Normal file
@@ -0,0 +1,212 @@
#!/usr/bin/env python3
"""
Comprehensive test script for knowledge extraction prompt.
Validates prompt structure, requirements, and consistency.
"""

import json
import re
from pathlib import Path


def test_prompt_structure():
    """Test that the prompt has the required structure."""
    prompt_path = Path("templates/harvest-prompt.md")
    if not prompt_path.exists():
        return False, "harvest-prompt.md not found"

    content = prompt_path.read_text()

    # Check for required sections
    required_sections = [
        "System Prompt",
        "Instructions",
        "Categories",
        "Output Format",
        "Confidence Scoring",
        "Constraints",
        "Example"
    ]
    for section in required_sections:
        if section.lower() not in content.lower():
            return False, f"Missing required section: {section}"

    # Check for required categories
    required_categories = ["fact", "pitfall", "pattern", "tool-quirk", "question"]
    for category in required_categories:
        if category not in content:
            return False, f"Missing required category: {category}"

    # Check for required output fields
    required_fields = ["fact", "category", "repo", "confidence"]
    for field in required_fields:
        if field not in content:
            return False, f"Missing required output field: {field}"

    # Check prompt size (should be ~1k tokens, roughly 4k chars)
    if len(content) > 5000:
        return False, f"Prompt too large: {len(content)} chars (max ~5000)"
    if len(content) < 1000:
        return False, f"Prompt too small: {len(content)} chars (min ~1000)"

    return True, "Prompt structure is valid"


def test_confidence_scoring():
    """Test that confidence scoring is properly defined."""
    prompt_path = Path("templates/harvest-prompt.md")
    content = prompt_path.read_text()

    # Check for confidence scale definitions
    confidence_levels = [
        ("0.9-1.0", "explicitly stated"),
        ("0.7-0.8", "clearly implied"),
        ("0.5-0.6", "suggested"),
        ("0.3-0.4", "inferred"),
        ("0.1-0.2", "speculative")
    ]

    for level, description in confidence_levels:
        if level not in content:
            return False, f"Missing confidence level: {level}"
        if description.lower() not in content.lower():
            return False, f"Missing confidence description: {description}"

    return True, "Confidence scoring is properly defined"


def test_example_quality():
    """Test that examples are clear and complete."""
    prompt_path = Path("templates/harvest-prompt.md")
    content = prompt_path.read_text()

    # Check for example input/output
    if "example" not in content.lower():
        return False, "No examples provided"

    # Check that the example includes all categories
    example_section = content[content.lower().find("example"):]

    # Look for a JSON example
    json_match = re.search(r'\{[\s\S]*"knowledge"[\s\S]*\}', example_section)
    if not json_match:
        return False, "No JSON example found"

    example_json = json_match.group(0)

    # Check for all categories in the example
    for category in ["fact", "pitfall", "pattern", "tool-quirk", "question"]:
        if category not in example_json:
            return False, f"Example missing category: {category}"

    return True, "Examples are clear and complete"


def test_constraint_coverage():
    """Test that constraints cover all requirements."""
    prompt_path = Path("templates/harvest-prompt.md")
    content = prompt_path.read_text()

    required_constraints = [
        "No hallucination",
        "only extract",
        "explicitly",
        "partial",
        "failed sessions",
        "1k tokens"
    ]

    for constraint in required_constraints:
        if constraint.lower() not in content.lower():
            return False, f"Missing constraint: {constraint}"

    return True, "Constraints cover all requirements"


def test_test_sessions():
    """Test that test sessions exist and are valid."""
    test_sessions_dir = Path("test_sessions")
    if not test_sessions_dir.exists():
        return False, "test_sessions directory not found"

    session_files = list(test_sessions_dir.glob("*.jsonl"))
    if len(session_files) < 5:
        return False, f"Only {len(session_files)} test sessions found, need 5"

    # Check each session file
    for session_file in session_files:
        content = session_file.read_text()
        lines = content.strip().split("\n")

        # Check that each line is valid JSON
        for i, line in enumerate(lines, 1):
            try:
                json.loads(line)
            except json.JSONDecodeError as e:
                return False, f"Invalid JSON in {session_file.name}, line {i}: {e}"

    return True, f"Found {len(session_files)} valid test sessions"


def run_all_tests():
    """Run all tests and return results."""
    tests = [
        ("Prompt Structure", test_prompt_structure),
        ("Confidence Scoring", test_confidence_scoring),
        ("Example Quality", test_example_quality),
        ("Constraint Coverage", test_constraint_coverage),
        ("Test Sessions", test_test_sessions)
    ]

    results = []
    all_passed = True

    for test_name, test_func in tests:
        try:
            passed, message = test_func()
            results.append({
                "test": test_name,
                "passed": passed,
                "message": message
            })
            if not passed:
                all_passed = False
        except Exception as e:
            results.append({
                "test": test_name,
                "passed": False,
                "message": f"Error: {str(e)}"
            })
            all_passed = False

    # Print results
    print("=" * 60)
    print("HARVEST PROMPT TEST RESULTS")
    print("=" * 60)

    for result in results:
        status = "✓ PASS" if result["passed"] else "✗ FAIL"
        print(f"{status}: {result['test']}")
        print(f"  {result['message']}")
        print()

    print("=" * 60)
    if all_passed:
        print("ALL TESTS PASSED!")
    else:
        print("SOME TESTS FAILED!")
    print("=" * 60)

    return all_passed, results


if __name__ == "__main__":
    all_passed, results = run_all_tests()

    # Save results to file
    with open("test_results.json", "w") as f:
        json.dump({
            "all_passed": all_passed,
            "results": results,
            "timestamp": "2026-04-14T19:05:00Z"
        }, f, indent=2)

    print("Results saved to test_results.json")

    # Exit with appropriate code
    exit(0 if all_passed else 1)
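The JSON-example probe in test_example_quality matches braces with a greedy regex rather than parsing JSON. This small standalone demo shows what that pattern does and does not guarantee; the `doc` string is an invented stand-in for the prompt file's Example section:

```python
import re

# Invented stand-in for harvest-prompt.md's Example section.
doc = 'Example:\n{"knowledge": [{"fact": "x"}], "meta": {}}\nSee above.'

m = re.search(r'\{[\s\S]*"knowledge"[\s\S]*\}', doc)
# The greedy [\s\S]* runs to the LAST '}' in the text, so the match only
# proves a brace-delimited region mentioning "knowledge" exists; it does
# not prove that region is valid JSON.
print(m.group(0))
```

If stricter checking were ever needed, passing the matched span through `json.loads` would catch malformed examples that the regex accepts, at the cost of requiring the example to be exactly one JSON object.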
162  scripts/test_harvester_pipeline.py  Normal file
@@ -0,0 +1,162 @@
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Smoke test for harvester pipeline — verifies the full chain:
|
||||||
|
session_reader -> prompt -> LLM (mocked) -> validate -> deduplicate -> store
|
||||||
|
|
||||||
|
Does NOT call the real LLM. Tests plumbing only.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import json
|
||||||
|
import sys
|
||||||
|
import tempfile
|
||||||
|
import os
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
# Setup path
|
||||||
|
SCRIPT_DIR = Path(__file__).parent.absolute()
|
||||||
|
sys.path.insert(0, str(SCRIPT_DIR))
|
||||||
|
|
||||||
|
from session_reader import read_session, extract_conversation, truncate_for_context, messages_to_text
|
||||||
|
from harvester import validate_fact, deduplicate, load_existing_knowledge, fact_fingerprint
|
||||||
|
|
||||||
|
|
||||||
|
def test_session_reader():
|
||||||
|
"""Test that session_reader parses JSONL correctly."""
|
||||||
|
with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
|
||||||
|
f.write('{"role": "user", "content": "Hello", "timestamp": "2026-04-13T10:00:00Z"}\n')
|
||||||
|
f.write('{"role": "assistant", "content": "Hi there", "timestamp": "2026-04-13T10:00:01Z"}\n')
|
||||||
|
f.write('{"role": "user", "content": "Clone the repo", "timestamp": "2026-04-13T10:00:02Z"}\n')
|
||||||
|
f.write('{"role": "assistant", "content": "Cloned successfully", "timestamp": "2026-04-13T10:00:05Z"}\n')
|
||||||
|
path = f.name
|
||||||
|
|
||||||
|
messages = read_session(path)
|
||||||
|
assert len(messages) == 4, f"Expected 4 messages, got {len(messages)}"
|
||||||
|
|
||||||
|
conv = extract_conversation(messages)
|
||||||
|
assert len(conv) == 4, f"Expected 4 conversation turns, got {len(conv)}"
|
||||||
|
|
||||||
|
text = messages_to_text(conv)
|
||||||
|
assert "USER: Hello" in text
|
||||||
|
assert "ASSISTANT: Hi there" in text
|
||||||
|
|
||||||
|
truncated = truncate_for_context(conv, head=2, tail=2)
|
||||||
|
assert len(truncated) == 4 # 4 <= head+tail, so no truncation
|
||||||
|
|
||||||
|
os.unlink(path)
|
||||||
|
print(" [PASS] session_reader pipeline works")
|
||||||
|
|
||||||
|
|
||||||
|
def test_validate_fact():
|
||||||
|
"""Test fact validation."""
|
||||||
|
good = {"fact": "Gitea token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}
|
||||||
|
assert validate_fact(good), "Valid fact should pass"
|
||||||
|
|
||||||
|
bad_missing = {"fact": "Something", "category": "fact"}
|
||||||
|
assert not validate_fact(bad_missing), "Missing fields should fail"
|
||||||
|
|
||||||
|
bad_category = {"fact": "Something", "category": "nonsense", "repo": "x", "confidence": 0.5}
|
||||||
|
assert not validate_fact(bad_category), "Bad category should fail"
|
||||||
|
|
||||||
|
bad_conf = {"fact": "Something", "category": "fact", "repo": "x", "confidence": 1.5}
|
||||||
|
assert not validate_fact(bad_conf), "Confidence > 1.0 should fail"
|
||||||
|
|
||||||
|
print(" [PASS] fact validation works")
|
||||||
|
|
||||||
|
|
||||||
|
def test_deduplicate():
|
||||||
|
"""Test deduplication."""
|
||||||
|
existing = [
|
||||||
|
{"fact": "Token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}
|
||||||
|
]
|
||||||
|
new = [
|
||||||
|
{"fact": "Token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}, # exact dup
|
||||||
|
{"fact": "Deploy uses Ansible on port 22", "category": "pattern", "repo": "fleet", "confidence": 0.8}, # unique
|
||||||
|
]
|
||||||
|
result = deduplicate(new, existing)
|
||||||
|
assert len(result) == 1, f"Expected 1 unique, got {len(result)}"
|
||||||
|
assert result[0]["fact"] == "Deploy uses Ansible on port 22"
|
||||||
|
print(" [PASS] deduplication works")
|
||||||
|
|
||||||
|
|
||||||
|
def test_knowledge_store_roundtrip():
|
||||||
|
"""Test loading and writing knowledge index."""
|
||||||
|
with tempfile.TemporaryDirectory() as tmpdir:
|
||||||
|
# Load empty index
|
||||||
|
index = load_existing_knowledge(tmpdir)
|
||||||
|
assert index["total_facts"] == 0
|
||||||
|
|
||||||
|
# Write a fact
|
||||||
|
new_facts = [{"fact": "Test fact", "category": "fact", "repo": "test", "confidence": 0.9}]
|
||||||
|
|
||||||
|
# Use harvester's write function
|
||||||
|
from harvester import write_knowledge
|
||||||
|
write_knowledge(index, new_facts, tmpdir, source_session="test.jsonl")
|
||||||
|
|
||||||
|
# Reload and verify
|
||||||
|
index2 = load_existing_knowledge(tmpdir)
|
||||||
|
assert index2["total_facts"] == 1
|
||||||
|
        assert index2["facts"][0]["fact"] == "Test fact"
        assert index2["facts"][0]["source_session"] == "test.jsonl"

        # Check markdown was written
        md_path = Path(tmpdir) / "repos" / "test.md"
        assert md_path.exists(), "Markdown file should be created"

    print(" [PASS] knowledge store roundtrip works")


def test_full_chain_no_llm():
    """Test the full pipeline minus the LLM call."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
        f.write('{"role": "user", "content": "Clone compounding-intelligence", "timestamp": "2026-04-13T10:00:00Z"}\n')
        f.write('{"role": "assistant", "content": "Cloned successfully", "timestamp": "2026-04-13T10:00:05Z"}\n')
        session_path = f.name

    with tempfile.TemporaryDirectory() as knowledge_dir:
        # Step 1: Read
        messages = read_session(session_path)
        assert len(messages) == 2

        # Step 2: Extract conversation
        conv = extract_conversation(messages)
        assert len(conv) == 2

        # Step 3: Truncate
        truncated = truncate_for_context(conv, head=50, tail=50)

        # Step 4: Convert to text (this goes to the LLM)
        transcript = messages_to_text(truncated)
        assert "Clone compounding-intelligence" in transcript

        # Steps 5-7: Would be LLM call, validate, deduplicate
        # We simulate LLM output here
        mock_facts = [
            {"fact": "compounding-intelligence repo was cloned", "category": "fact", "repo": "compounding-intelligence", "confidence": 0.9}
        ]
        valid = [f for f in mock_facts if validate_fact(f)]

        # Step 6: Deduplicate
        index = load_existing_knowledge(knowledge_dir)
        new_facts = deduplicate(valid, index.get("facts", []))
        assert len(new_facts) == 1

        # Step 7: Store
        from harvester import write_knowledge
        write_knowledge(index, new_facts, knowledge_dir, source_session=session_path)

        # Verify
        index2 = load_existing_knowledge(knowledge_dir)
        assert index2["total_facts"] == 1

    os.unlink(session_path)
    print(" [PASS] full chain (reader -> validate -> dedup -> store) works")


if __name__ == "__main__":
    print("Running harvester pipeline smoke tests...")
    test_session_reader()
    test_validate_fact()
    test_deduplicate()
    test_knowledge_store_roundtrip()
    test_full_chain_no_llm()
    print("\nAll tests passed.")
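The full-chain test above treats `deduplicate` from `harvester` as a black box. A minimal sketch of the contract those assertions assume — exact-match on normalized fact text is an assumption here; the real implementation may use fuzzier matching:

```python
def deduplicate(candidates, existing):
    """Drop candidate facts whose 'fact' text is already known (sketch)."""
    seen = {e["fact"].strip().lower() for e in existing}
    fresh = []
    for c in candidates:
        key = c["fact"].strip().lower()
        if key not in seen:
            seen.add(key)  # also dedupe within the candidate batch itself
            fresh.append(c)
    return fresh

new = deduplicate(
    [{"fact": "Repo was cloned"}, {"fact": "repo was cloned"}],
    [],
)
print(len(new))  # 1 — duplicates differing only in case collapse
```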
282	scripts/test_improvement_proposals.py	Normal file
@@ -0,0 +1,282 @@
#!/usr/bin/env python3
"""Tests for scripts/improvement_proposals.py — 15 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util

spec = importlib.util.spec_from_file_location(
    "ip", os.path.join(os.path.dirname(__file__) or ".", "improvement_proposals.py")
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

analyze_sessions = mod.analyze_sessions
generate_proposals = mod.generate_proposals
format_proposals_markdown = mod.format_proposals_markdown
format_proposals_json = mod.format_proposals_json
_normalize_error = mod._normalize_error


# ── Helper to build test sessions ─────────────────────────────

def _make_session(session_id, repo="test-repo", errors=None, tool_calls=None, duration=30):
    return {
        "session_id": session_id,
        "repo": repo,
        "errors": [{"message": e} for e in (errors or [])],
        "tool_calls": tool_calls or [],
        "duration_minutes": duration,
        "messages": [],
    }


def _make_tool_calls(repeats):
    """Create tool call list with repeated tools."""
    calls = []
    for tool, count in repeats:
        for _ in range(count):
            calls.append({"tool": tool, "latency_ms": 100})
    return calls


# ── Tests ─────────────────────────────────────────────────────

def test_empty_sessions():
    patterns = analyze_sessions([])
    assert patterns == []
    print("PASS: test_empty_sessions")


def test_no_patterns_on_clean_sessions():
    sessions = [
        _make_session("s1", tool_calls=[{"tool": "read_file", "latency_ms": 50}]),
        _make_session("s2", tool_calls=[{"tool": "write_file", "latency_ms": 80}]),
    ]
    patterns = analyze_sessions(sessions)
    # No repeated errors, no slow tools, no retries
    assert len(patterns) == 0
    print("PASS: test_no_patterns_on_clean_sessions")


def test_repeated_error_detection():
    """Same error across 3+ sessions triggers pattern."""
    sessions = [
        _make_session(f"s{i}", errors=["ModuleNotFoundError: No module named bannerlord_trace"])
        for i in range(4)
    ]
    patterns = analyze_sessions(sessions)
    repeated = [p for p in patterns if p.pattern_type == "repeated_error"]
    assert len(repeated) == 1
    assert repeated[0].occurrences == 4
    print("PASS: test_repeated_error_detection")


def test_repeated_error_threshold():
    """2 occurrences should NOT trigger (threshold is 3)."""
    sessions = [
        _make_session("s1", errors=["TimeoutError: connection timed out"]),
        _make_session("s2", errors=["TimeoutError: connection timed out"]),
    ]
    patterns = analyze_sessions(sessions)
    repeated = [p for p in patterns if p.pattern_type == "repeated_error"]
    assert len(repeated) == 0
    print("PASS: test_repeated_error_threshold")


def test_slow_tool_detection():
    """Tool with avg latency > 5000ms across 5+ calls."""
    calls = [{"tool": "git_push", "latency_ms": 8000} for _ in range(10)]
    sessions = [_make_session("s1", tool_calls=calls)]
    patterns = analyze_sessions(sessions)
    slow = [p for p in patterns if p.pattern_type == "slow_tool"]
    assert len(slow) == 1
    assert "git_push" in slow[0].description
    print("PASS: test_slow_tool_detection")


def test_fast_tool_not_flagged():
    """Tool under 5000ms avg should not trigger."""
    calls = [{"tool": "read_file", "latency_ms": 50} for _ in range(10)]
    sessions = [_make_session("s1", tool_calls=calls)]
    patterns = analyze_sessions(sessions)
    slow = [p for p in patterns if p.pattern_type == "slow_tool"]
    assert len(slow) == 0
    print("PASS: test_fast_tool_not_flagged")


def test_failed_retry_detection():
    """3+ consecutive calls to same tool triggers retry pattern."""
    calls = _make_tool_calls([("execute_code", 5)])
    sessions = [_make_session("s1", tool_calls=calls)]
    sessions.extend([
        _make_session(f"s{i}", tool_calls=_make_tool_calls([("execute_code", 4)]))
        for i in range(2, 5)
    ])
    patterns = analyze_sessions(sessions)
    retries = [p for p in patterns if p.pattern_type == "failed_retry"]
    assert len(retries) >= 1
    print("PASS: test_failed_retry_detection")


def test_manual_process_detection():
    """10+ tool calls with <= 3 unique tools."""
    calls = _make_tool_calls([("terminal", 8), ("read_file", 5)])
    sessions = [_make_session("s1", tool_calls=calls, duration=25)]
    patterns = analyze_sessions(sessions)
    manual = [p for p in patterns if p.pattern_type == "manual_process"]
    assert len(manual) == 1
    print("PASS: test_manual_process_detection")


def test_generate_proposals_from_patterns():
    """Proposals generated from waste patterns."""
    sessions = [
        _make_session(f"s{i}", errors=["Error: push timeout"])
        for i in range(5)
    ]
    patterns = analyze_sessions(sessions)
    proposals = generate_proposals(patterns)
    assert len(proposals) >= 1
    assert proposals[0].estimated_monthly_hours_saved > 0
    assert proposals[0].priority in ("critical", "high", "medium", "low")
    print("PASS: test_generate_proposals_from_patterns")


def test_proposal_roi_positive():
    """ROI weeks should be a positive number for recoverable time."""
    patterns = [mod.WastePattern(
        pattern_type="repeated_error",
        description="Test error",
        occurrences=10,
        total_time_hours=5.0,
        affected_repos=["test"],
    )]
    proposals = generate_proposals(patterns)
    assert len(proposals) == 1
    assert proposals[0].roi_weeks > 0
    assert proposals[0].roi_weeks < 100
    print("PASS: test_proposal_roi_positive")


def test_proposals_sorted_by_impact():
    """Proposals should be sorted by monthly hours saved (descending)."""
    sessions = [
        _make_session("s1", errors=["Minor warning"] * 3, duration=5),
        _make_session("s2", errors=["Critical failure: deploy crashed"] * 5, duration=60),
    ]
    # Add more sessions to cross threshold
    for i in range(3, 7):
        sessions.append(_make_session(f"s{i}", errors=["Critical failure: deploy crashed"]))

    patterns = analyze_sessions(sessions)
    proposals = generate_proposals(patterns)
    if len(proposals) >= 2:
        for i in range(len(proposals) - 1):
            assert proposals[i].estimated_monthly_hours_saved >= proposals[i + 1].estimated_monthly_hours_saved
    print("PASS: test_proposals_sorted_by_impact")


def test_format_markdown():
    """Markdown output should contain expected sections."""
    patterns = [mod.WastePattern(
        pattern_type="repeated_error", description="Test", occurrences=5,
        total_time_hours=2.5, affected_repos=["repo"],
    )]
    proposals = generate_proposals(patterns)
    md = format_proposals_markdown(proposals, patterns, "2026-04-15T00:00:00Z")
    assert "# Improvement Proposals" in md
    assert "## Summary" in md
    assert "### Problem" in md
    assert "### ROI Estimate" in md
    assert "## Appendix" in md
    print("PASS: test_format_markdown")


def test_format_json():
    """JSON output should be valid and parseable."""
    patterns = [mod.WastePattern(
        pattern_type="slow_tool", description="Slow", occurrences=10,
        total_time_hours=3.0, affected_repos=["global"],
    )]
    proposals = generate_proposals(patterns)
    output = format_proposals_json(proposals)
    parsed = json.loads(output)
    assert isinstance(parsed, list)
    assert len(parsed) == len(proposals)
    assert "title" in parsed[0]
    assert "roi_weeks" in parsed[0]
    print("PASS: test_format_json")


def test_normalize_error():
    """Error normalization should remove paths and hashes."""
    err1 = _normalize_error("Failed to clone /Users/apayne/repo with token abc123def456")
    assert "/PATH" in err1
    assert "HASH" in err1
    assert "/Users/apayne" not in err1

    err2 = _normalize_error("")
    assert err2 == ""

    err3 = _normalize_error("Simple error message")
    assert "simple error" in err3
    print("PASS: test_normalize_error")


def test_cli_integration():
    """End-to-end test: write input JSON, run script, check output."""
    import subprocess

    sessions = [
        _make_session(f"s{i}", errors=["Connection refused: port 8080"])
        for i in range(5)
    ]

    with tempfile.TemporaryDirectory() as tmpdir:
        input_path = os.path.join(tmpdir, "analytics.json")
        output_path = os.path.join(tmpdir, "proposals.md")

        with open(input_path, "w") as f:
            json.dump({"sessions": sessions}, f)

        script = os.path.join(os.path.dirname(__file__) or ".", "improvement_proposals.py")
        result = subprocess.run(
            [sys.executable, script, "--input", input_path, "--output", output_path],
            capture_output=True, text=True, timeout=10,
        )

        assert result.returncode == 0, f"CLI failed: {result.stderr}"
        assert os.path.exists(output_path)

        with open(output_path) as f:
            content = f.read()
        assert "# Improvement Proposals" in content
    print("PASS: test_cli_integration")


def run_all():
    test_empty_sessions()
    test_no_patterns_on_clean_sessions()
    test_repeated_error_detection()
    test_repeated_error_threshold()
    test_slow_tool_detection()
    test_fast_tool_not_flagged()
    test_failed_retry_detection()
    test_manual_process_detection()
    test_generate_proposals_from_patterns()
    test_proposal_roi_positive()
    test_proposals_sorted_by_impact()
    test_format_markdown()
    test_format_json()
    test_normalize_error()
    test_cli_integration()
    print("\nAll 15 tests passed!")


if __name__ == "__main__":
    run_all()
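These test files load their target script by file path rather than as a package import, so sibling scripts can be tested without turning `scripts/` into a package. The pattern, shown standalone against a throwaway module written to a temp directory (`target.py` and `answer` are hypothetical stand-ins):

```python
import importlib.util
import os
import tempfile

# Write a tiny throwaway module to load by path.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "target.py")
with open(path, "w") as f:
    f.write("def answer():\n    return 42\n")

# spec_from_file_location + exec_module loads a module from an arbitrary
# file, without requiring the directory to be a package or on sys.path.
spec = importlib.util.spec_from_file_location("target", path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

print(mod.answer())  # 42
```

One consequence of this approach: the module is executed top to bottom at load time, which is why the rebalancer tests below can run their checks as a side effect of the import step.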
129	scripts/test_knowledge_staleness.py	Normal file
@@ -0,0 +1,129 @@
#!/usr/bin/env python3
"""Tests for scripts/knowledge_staleness_check.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location("ks", os.path.join(os.path.dirname(__file__) or ".", "knowledge_staleness_check.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
check_staleness = mod.check_staleness
fix_hashes = mod.fix_hashes
compute_file_hash = mod.compute_file_hash


def test_fresh_entry():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("print('hello')")
        h = compute_file_hash(src)
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "hello", "source_file": "source.py", "source_hash": h}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "fresh"
    print("PASS: test_fresh_entry")


def test_stale_entry():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("original content")
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "old", "source_file": "source.py", "source_hash": "sha256:wrong"}]}, f)
        # Now change the source
        with open(src, "w") as f:
            f.write("modified content")
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "stale"
    print("PASS: test_stale_entry")


def test_missing_source():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "gone", "source_file": "nonexistent.py", "source_hash": "sha256:abc"}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "missing_source"
    print("PASS: test_missing_source")


def test_no_hash():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("content")
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "no hash", "source_file": "source.py"}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "no_hash"
        assert results[0]["current_hash"].startswith("sha256:")
    print("PASS: test_no_hash")


def test_no_source_field():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "orphan"}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "no_source"
    print("PASS: test_no_source_field")


def test_fix_hashes():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("content for hashing")
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "needs hash", "source_file": "source.py"}]}, f)
        fixed = fix_hashes(idx, tmpdir)
        assert fixed == 1
        # Verify hash was added
        with open(idx) as f:
            data = json.load(f)
        assert data["facts"][0]["source_hash"].startswith("sha256:")
    print("PASS: test_fix_hashes")


def test_empty_index():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": []}, f)
        results = check_staleness(idx, tmpdir)
        assert results == []
    print("PASS: test_empty_index")


def test_compute_hash_nonexistent():
    h = compute_file_hash("/nonexistent/path/file.py")
    assert h is None
    print("PASS: test_compute_hash_nonexistent")


def run_all():
    test_fresh_entry()
    test_stale_entry()
    test_missing_source()
    test_no_hash()
    test_no_source_field()
    test_fix_hashes()
    test_empty_index()
    test_compute_hash_nonexistent()
    print("\nAll 8 tests passed!")


if __name__ == "__main__":
    run_all()
305	scripts/test_priority_rebalancer.py	Normal file
@@ -0,0 +1,305 @@
#!/usr/bin/env python3
"""
Tests for Priority Rebalancer
"""

import json
import os
import sys
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

# Add script dir to path
sys.path.insert(0, str(Path(__file__).parent))

from priority_rebalancer import (
    GiteaClient,
    IssueScore,
    PipelineSignal,
    compute_issue_score,
    collect_knowledge_signals,
    collect_metrics_signals,
    extract_priority,
    generate_report,
    generate_markdown_report,
    PRIORITY_LEVELS,
)

# ============================================================
# Test Helpers
# ============================================================

PASS = 0
FAIL = 0

def test(name):
    def decorator(fn):
        global PASS, FAIL
        try:
            fn()
            PASS += 1
            print(f" [PASS] {name}")
        except Exception as e:
            FAIL += 1
            print(f" [FAIL] {name}: {e}")
    return decorator

def assert_eq(a, b, msg=""):
    if a != b:
        raise AssertionError(f"{msg} expected {b!r}, got {a!r}")

def assert_true(v, msg=""):
    if not v:
        raise AssertionError(msg or "Expected True")

def assert_false(v, msg=""):
    if v:
        raise AssertionError(msg or "Expected False")


# ============================================================
# Priority Extraction Tests
# ============================================================

print("=== Priority Rebalancer Tests ===\n")

print("-- Priority Extraction --")

@test("extract P0 from label")
def _():
    assert_eq(extract_priority(["P0", "bug"]), "P0")

@test("extract P1 from priority:high")
def _():
    assert_eq(extract_priority(["priority:high"]), "P1")

@test("extract P2 from priority:medium")
def _():
    assert_eq(extract_priority(["priority:medium"]), "P2")

@test("extract P3 from priority:low")
def _():
    assert_eq(extract_priority(["priority:low"]), "P3")

@test("returns None for no priority")
def _():
    assert_eq(extract_priority(["bug", "enhancement"]), None)

@test("case insensitive")
def _():
    assert_eq(extract_priority(["p1"]), "P1")
    assert_eq(extract_priority(["PRIORITY:CRITICAL"]), "P0")


# ============================================================
# Issue Scoring Tests
# ============================================================

print("\n-- Issue Scoring --")

def make_issue(**kwargs):
    defaults = {
        "number": 1,
        "title": "Test issue",
        "labels": [],
        "created_at": (datetime.utcnow() - timedelta(days=5)).isoformat() + "Z",
        "comments": 0,
        "assignees": None,
    }
    defaults.update(kwargs)
    return defaults

@test("bug gets score boost")
def _():
    issue = make_issue(title="Incorrect output format", labels=["bug"])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score > 0, f"Bug should boost score, got {score.score}")
    # Bug label alone should be P2 or P3 (not P0)
    assert_true(score.suggested_priority in ("P2", "P3"),
                f"Bug label alone should be P2/P3, got {score.suggested_priority}")

@test("security gets high score")
def _():
    issue = make_issue(title="Security: auth bypass", labels=["bug"])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score >= 25, f"Security should score high, got {score.score}")

@test("old dormant issue gets penalized")
def _():
    issue = make_issue(
        title="Some old feature",
        created_at=(datetime.utcnow() - timedelta(days=120)).isoformat() + "Z",
        comments=0
    )
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score < 0, f"Old dormant should be negative, got {score.score}")
    assert_true(any("Dormant" in r for r in score.reasons), "Should mention dormancy")

@test("active discussion boosts score")
def _():
    issue = make_issue(title="Important fix", comments=8)
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score > 5, f"Active discussion should boost, got {score.score}")
    assert_true(any("Active" in r for r in score.reasons))

@test("unassigned gets slight boost")
def _():
    issue = make_issue(title="Fix bug", assignees=None)
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(any("Unassigned" in r for r in score.reasons))

@test("assigned issue notes assignee")
def _():
    issue = make_issue(title="Fix bug", assignees=[{"login": "alice"}])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_eq(score.assignee, "alice")

@test("nice-to-have gets penalized")
def _():
    issue = make_issue(title="Nice to have: fancy animation")
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score < 0, f"Nice-to-have should be negative, got {score.score}")


# ============================================================
# Pipeline Signal Tests
# ============================================================

print("\n-- Pipeline Signals --")

@test("signal alignment boosts matching issues")
def _():
    signals = [PipelineSignal(
        source="knowledge",
        signal_type="stale_knowledge",
        weight=0.8,
        detail="20 stale facts"
    )]
    issue = make_issue(title="Fix stale knowledge entries")
    score = compute_issue_score(issue, "test-repo", signals, datetime.utcnow())
    assert_true(any("Matches signal" in r for r in score.reasons))

@test("empty knowledge boosts harvester issues")
def _():
    signals = [PipelineSignal(
        source="knowledge",
        signal_type="empty_knowledge",
        weight=0.7,
        detail="0 facts"
    )]
    issue = make_issue(title="Implement harvester pipeline")
    score = compute_issue_score(issue, "test-repo", signals, datetime.utcnow())
    assert_true(any("Critical gap" in r for r in score.reasons))


# ============================================================
# Knowledge Signal Collection Tests
# ============================================================

print("\n-- Knowledge Signal Collection --")

@test("missing index generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        signals = collect_knowledge_signals(tmpdir)
        assert_true(len(signals) > 0)
        assert_eq(signals[0].signal_type, "missing_index")

@test("empty knowledge generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": []}, f)
        signals = collect_knowledge_signals(tmpdir)
        assert_true(any(s.signal_type == "empty_knowledge" for s in signals))

@test("corrupt index generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            f.write("not json {{{")
        signals = collect_knowledge_signals(tmpdir)
        assert_true(any(s.signal_type == "corrupt_index" for s in signals))

@test("knowledge with facts passes")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [
                {"id": 1, "repo": "test", "status": "fresh"},
                {"id": 2, "repo": "test", "status": "fresh"},
            ]}, f)
        signals = collect_knowledge_signals(tmpdir)
        # Should not generate missing_index or empty_knowledge
        assert_false(any(s.signal_type in ("missing_index", "empty_knowledge") for s in signals))


# ============================================================
# Metrics Signal Collection Tests
# ============================================================

print("\n-- Metrics Signal Collection --")

@test("empty metrics dir generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        signals = collect_metrics_signals(tmpdir)
        assert_true(any(s.signal_type == "no_metrics" for s in signals))

@test("metrics with files passes")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create files (simulating real metrics dir with .gitkeep + actual files)
        with open(os.path.join(tmpdir, ".gitkeep"), "w") as f:
            f.write("")
        with open(os.path.join(tmpdir, "report.json"), "w") as f:
            f.write("{}")
        signals = collect_metrics_signals(tmpdir)
        assert_false(any(s.signal_type == "no_metrics" for s in signals))


# ============================================================
# Report Generation Tests
# ============================================================

print("\n-- Report Generation --")

@test("report has correct structure")
def _():
    scores = [
        IssueScore(1, "repo1", "Bug fix", ["bug"], None, "P1", 30.0, ["test"], 5, 3, None),
        IssueScore(2, "repo1", "Feature", ["enhancement"], "P3", None, -5.0, ["test"], 60, 0, "alice"),
    ]
    signals = [PipelineSignal("knowledge", "stale_knowledge", 0.5, "10 stale")]
    report = generate_report(scores, signals, "test-org", ["repo1"])

    assert_eq(report["org"], "test-org")
    assert_eq(report["total_issues"], 2)
    assert_true("generated_at" in report)
    assert_true("summary" in report)
    assert_true("top_priority" in report)
    assert_eq(report["summary"]["suggested_new_priorities"], 1)

@test("markdown report is non-empty")
def _():
    scores = [IssueScore(1, "repo1", "Test", ["bug"], None, "P2", 15.0, ["reason"], 5, 0, None)]
    report = generate_report(scores, [], "test-org", ["repo1"])
    md = generate_markdown_report(report)
    assert_true(len(md) > 100)
    assert_true("Priority Rebalancer Report" in md)
    assert_true("Top 10" in md)


# ============================================================
# Summary
# ============================================================

print(f"\n=== Summary ===")
print(f"Total: {PASS + FAIL} | Passed: {PASS} | Failed: {FAIL}")

if FAIL > 0:
    sys.exit(1)
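The extraction tests above fully specify `extract_priority`'s observable behavior: `P0`–`P3` and `priority:<level>` labels map to a level, matching is case-insensitive, and anything else yields `None`. One implementation consistent with every tested case (a sketch — label spellings beyond those exercised by the tests are assumptions):

```python
# Lowercase label text -> priority level; 'p0'..'p3' and the
# 'priority:<name>' spellings are the forms the tests exercise.
_LABEL_MAP = {
    "p0": "P0", "priority:critical": "P0",
    "p1": "P1", "priority:high": "P1",
    "p2": "P2", "priority:medium": "P2",
    "p3": "P3", "priority:low": "P3",
}

def extract_priority(labels):
    """Return the most urgent priority named in labels, else None."""
    found = [_LABEL_MAP[l.lower()] for l in labels if l.lower() in _LABEL_MAP]
    # 'P0' < 'P1' < 'P2' < 'P3' lexicographically, so min() is most urgent.
    return min(found) if found else None

print(extract_priority(["priority:high"]))   # P1
print(extract_priority(["bug", "feature"]))  # None
```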
242	scripts/test_refactoring_opportunity_finder.py	Normal file
@@ -0,0 +1,242 @@
#!/usr/bin/env python3
"""Tests for scripts/refactoring_opportunity_finder.py — 10 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location(
    "rof", os.path.join(os.path.dirname(__file__) or ".", "refactoring_opportunity_finder.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

compute_file_complexity = mod.compute_file_complexity
calculate_refactoring_score = mod.calculate_refactoring_score
FileMetrics = mod.FileMetrics


def test_complexity_simple_function():
    """Simple function should have low complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def simple():
    return 42
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
        assert avg == 1.0, f"Expected 1.0, got {avg}"
        assert max_c == 1, f"Expected 1, got {max_c}"
        assert funcs == 1, f"Expected 1, got {funcs}"
        assert classes == 0, f"Expected 0, got {classes}"
    os.unlink(f.name)
    print("PASS: test_complexity_simple_function")


def test_complexity_with_conditionals():
    """Function with if/else should have higher complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def complex_func(x):
    if x > 0:
        if x > 10:
            return "big"
        else:
            return "small"
    elif x < 0:
        return "negative"
    else:
        return "zero"
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
        # Base 1 + 3 if/elif + 1 nested if = 5
        assert max_c >= 4, f"Expected max_c >= 4, got {max_c}"
        assert funcs == 1, f"Expected 1, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_conditionals")


def test_complexity_with_loops():
    """Function with loops should increase complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def loop_func(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item)
    while len(result) > 10:
        result.pop()
    return result
""")
        f.flush()
        avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
        # Base 1 + 1 for + 1 if + 1 while = 4
        assert max_c >= 3, f"Expected max_c >= 3, got {max_c}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_loops")


def test_complexity_with_class():
    """Class with methods should count both."""
|
||||||
|
with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
|
||||||
|
f.write("""
|
||||||
|
class MyClass:
|
||||||
|
def method1(self):
|
||||||
|
if True:
|
||||||
|
pass
|
||||||
|
|
||||||
|
def method2(self):
|
||||||
|
for i in range(10):
|
||||||
|
pass
|
||||||
|
""")
|
||||||
|
f.flush()
|
||||||
|
avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
|
||||||
|
assert classes == 1, f"Expected 1 class, got {classes}"
|
||||||
|
assert funcs == 2, f"Expected 2 functions, got {funcs}"
|
||||||
|
os.unlink(f.name)
|
||||||
|
print("PASS: test_complexity_with_class")
|
||||||
|
|
||||||
|
|
||||||
|
def test_complexity_syntax_error():
|
||||||
|
"""File with syntax error should return zeros."""
|
||||||
|
with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
|
||||||
|
f.write("def broken(:\n pass")
|
||||||
|
f.flush()
|
||||||
|
avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
|
||||||
|
assert avg == 0.0, f"Expected 0.0, got {avg}"
|
||||||
|
assert funcs == 0, f"Expected 0, got {funcs}"
|
||||||
|
os.unlink(f.name)
|
||||||
|
print("PASS: test_complexity_syntax_error")
|
||||||
|
|
||||||
|
|
||||||
|
def test_refactoring_score_high_complexity():
|
||||||
|
"""High complexity should give high score."""
|
||||||
|
metrics = FileMetrics(
|
||||||
|
path="test.py",
|
||||||
|
lines=200,
|
||||||
|
complexity=15.0,
|
||||||
|
max_complexity=25,
|
||||||
|
functions=10,
|
||||||
|
classes=2,
|
||||||
|
churn_30d=5,
|
||||||
|
churn_90d=15,
|
||||||
|
test_coverage=0.3,
|
||||||
|
refactoring_score=0.0
|
||||||
|
)
|
||||||
|
score = calculate_refactoring_score(metrics)
|
||||||
|
assert score > 50, f"Expected score > 50, got {score}"
|
||||||
|
print("PASS: test_refactoring_score_high_complexity")
|
||||||
|
|
||||||
|
|
||||||
|
def test_refactoring_score_low_complexity():
|
||||||
|
"""Low complexity should give lower score."""
|
||||||
|
metrics = FileMetrics(
|
||||||
|
path="test.py",
|
||||||
|
lines=50,
|
||||||
|
complexity=2.0,
|
||||||
|
max_complexity=3,
|
||||||
|
functions=3,
|
||||||
|
classes=0,
|
||||||
|
churn_30d=0,
|
||||||
|
churn_90d=1,
|
||||||
|
test_coverage=0.9,
|
||||||
|
refactoring_score=0.0
|
||||||
|
)
|
||||||
|
score = calculate_refactoring_score(metrics)
|
||||||
|
assert score < 30, f"Expected score < 30, got {score}"
|
||||||
|
print("PASS: test_refactoring_score_low_complexity")
|
||||||
|
|
||||||
|
|
||||||
|
def test_refactoring_score_high_churn():
|
||||||
|
"""High churn should increase score."""
|
||||||
|
metrics = FileMetrics(
|
||||||
|
path="test.py",
|
||||||
|
lines=100,
|
||||||
|
complexity=5.0,
|
||||||
|
max_complexity=8,
|
||||||
|
functions=5,
|
||||||
|
classes=0,
|
||||||
|
churn_30d=10,
|
||||||
|
churn_90d=20,
|
||||||
|
test_coverage=0.5,
|
||||||
|
refactoring_score=0.0
|
||||||
|
)
|
||||||
|
score = calculate_refactoring_score(metrics)
|
||||||
|
# Churn should contribute significantly
|
||||||
|
assert score > 40, f"Expected score > 40 for high churn, got {score}"
|
||||||
|
print("PASS: test_refactoring_score_high_churn")
|
||||||
|
|
||||||
|
|
||||||
|
def test_refactoring_score_no_coverage():
|
||||||
|
"""No coverage data should assume medium risk."""
|
||||||
|
metrics = FileMetrics(
|
||||||
|
path="test.py",
|
||||||
|
lines=100,
|
||||||
|
complexity=5.0,
|
||||||
|
max_complexity=8,
|
||||||
|
functions=5,
|
||||||
|
classes=0,
|
||||||
|
churn_30d=1,
|
||||||
|
churn_90d=2,
|
||||||
|
test_coverage=None,
|
||||||
|
refactoring_score=0.0
|
||||||
|
)
|
||||||
|
score = calculate_refactoring_score(metrics)
|
||||||
|
# Should have some score from the 5-point coverage component
|
||||||
|
assert score > 0, f"Expected positive score, got {score}"
|
||||||
|
print("PASS: test_refactoring_score_no_coverage")
|
||||||
|
|
||||||
|
|
||||||
|
def test_refactoring_score_large_file():
|
||||||
|
"""Large files should score higher."""
|
||||||
|
metrics_small = FileMetrics(
|
||||||
|
path="small.py",
|
||||||
|
lines=50,
|
||||||
|
complexity=5.0,
|
||||||
|
max_complexity=8,
|
||||||
|
functions=3,
|
||||||
|
classes=0,
|
||||||
|
churn_30d=1,
|
||||||
|
churn_90d=2,
|
||||||
|
test_coverage=0.8,
|
||||||
|
refactoring_score=0.0
|
||||||
|
)
|
||||||
|
metrics_large = FileMetrics(
|
||||||
|
path="large.py",
|
||||||
|
lines=1000,
|
||||||
|
complexity=5.0,
|
||||||
|
max_complexity=8,
|
||||||
|
functions=3,
|
||||||
|
classes=0,
|
||||||
|
churn_30d=1,
|
||||||
|
churn_90d=2,
|
||||||
|
test_coverage=0.8,
|
||||||
|
refactoring_score=0.0
|
||||||
|
)
|
||||||
|
score_small = calculate_refactoring_score(metrics_small)
|
||||||
|
score_large = calculate_refactoring_score(metrics_large)
|
||||||
|
assert score_large > score_small, \
|
||||||
|
f"Large file ({score_large}) should score higher than small ({score_small})"
|
||||||
|
print("PASS: test_refactoring_score_large_file")
|
||||||
|
|
||||||
|
|
||||||
|
def run_all():
|
||||||
|
test_complexity_simple_function()
|
||||||
|
test_complexity_with_conditionals()
|
||||||
|
test_complexity_with_loops()
|
||||||
|
test_complexity_with_class()
|
||||||
|
test_complexity_syntax_error()
|
||||||
|
test_refactoring_score_high_complexity()
|
||||||
|
test_refactoring_score_low_complexity()
|
||||||
|
test_refactoring_score_high_churn()
|
||||||
|
test_refactoring_score_no_coverage()
|
||||||
|
test_refactoring_score_large_file()
|
||||||
|
print("\nAll 10 tests passed!")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
run_all()
|
||||||
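The tests above pin down the contract of `compute_file_complexity` (a `(avg, max, functions, classes, lines)` tuple, zeros on syntax errors) without showing the implementation. Below is a minimal sketch that satisfies these assertions, assuming a plain `ast` walk that counts branch points; the real `refactoring_opportunity_finder.py` may weigh nodes differently.

```python
import ast
import os
import tempfile

def compute_file_complexity(path):
    """Return (avg_complexity, max_complexity, functions, classes, lines)."""
    try:
        with open(path) as fh:
            source = fh.read()
        tree = ast.parse(source)
    except SyntaxError:
        return 0.0, 0, 0, 0, 0  # unparseable files score zero

    funcs, classes, complexities = 0, 0, []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            classes += 1
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            funcs += 1
            c = 1  # one base path through the function
            for sub in ast.walk(node):
                if isinstance(sub, (ast.If, ast.For, ast.While,
                                    ast.ExceptHandler, ast.BoolOp, ast.IfExp)):
                    c += 1  # each branch point adds a path
            complexities.append(c)
    avg = round(sum(complexities) / len(complexities), 2) if complexities else 0.0
    return avg, max(complexities, default=0), funcs, classes, len(source.splitlines())

# demo on a throwaway file, mirroring the simple-function fixture above
with tempfile.NamedTemporaryFile(mode="w", suffix=".py", delete=False) as f:
    f.write("def simple():\n    return 42\n")
avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
os.unlink(f.name)
```

Note that `ast.walk` visits methods nested inside classes too, which is why the class fixture expects `funcs == 2` alongside `classes == 1`, and an `elif` is just a nested `ast.If`, matching the `max_c >= 4` expectation for the conditional fixture.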
90  scripts/test_session_pair_harvester.py  Normal file
@@ -0,0 +1,90 @@
#!/usr/bin/env python3
"""Tests for session_pair_harvester."""

import json
import sys
import os
import tempfile

sys.path.insert(0, os.path.dirname(__file__))
from session_pair_harvester import extract_pairs_from_session, deduplicate_pairs, compute_hash


def test_basic_extraction():
    session = {
        "id": "test_001",
        "model": "test-model",
        "conversations": [
            {"from": "system", "value": "You are helpful."},
            {"from": "human", "value": "What is Python?"},
            {"from": "gpt", "value": "Python is a high-level programming language known for its readability and versatility. It supports multiple paradigms including procedural, object-oriented, and functional programming. Python is widely used in web development, data science, machine learning, and automation."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 1
    assert pairs[0]["terse"] == "What is Python?"
    assert "programming language" in pairs[0]["rich"]
    assert pairs[0]["source"] == "test_001"
    print("PASS: test_basic_extraction")


def test_filters_short_responses():
    session = {
        "id": "test_002",
        "model": "test",
        "conversations": [
            {"from": "human", "value": "Hi"},
            {"from": "gpt", "value": "Hello!"},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=20)
    assert len(pairs) == 0
    print("PASS: test_filters_short_responses")


def test_skips_tool_results():
    session = {
        "id": "test_003",
        "model": "test",
        "conversations": [
            {"from": "human", "value": '{"output": "file content", "exit_code": 0}'},
            {"from": "gpt", "value": "The file was read successfully. Now let me analyze the content and provide a detailed summary of what was found in the file system."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 0
    print("PASS: test_skips_tool_results")


def test_deduplication():
    pairs = [
        {"terse": "What is X?", "rich": "X is Y.", "source": "s1", "model": "m"},
        {"terse": "What is X?", "rich": "X is Y.", "source": "s2", "model": "m"},
        {"terse": "What is Z?", "rich": "Z is W.", "source": "s1", "model": "m"},
    ]
    unique = deduplicate_pairs(pairs)
    assert len(unique) == 2
    print("PASS: test_deduplication")


def test_ratio_filter():
    session = {
        "id": "test_005",
        "model": "test",
        "conversations": [
            {"from": "human", "value": "Explain quantum computing in detail with examples and applications"},
            {"from": "gpt", "value": "OK."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 0  # response too short relative to prompt
    print("PASS: test_ratio_filter")


if __name__ == "__main__":
    test_basic_extraction()
    test_filters_short_responses()
    test_skips_tool_results()
    test_deduplication()
    test_ratio_filter()
    print("\nAll tests passed.")
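`test_deduplication` expects content-identical pairs from different sessions to collapse to one. A minimal sketch of the `compute_hash`/`deduplicate_pairs` pair these tests assume, hashing only the prompt/response text so `source` and `model` metadata do not affect identity (hypothetical; the real module may normalize whitespace or case first):

```python
import hashlib

def compute_hash(pair):
    """Stable content hash over the terse/rich text only."""
    key = pair["terse"] + "\x00" + pair["rich"]  # separator avoids boundary collisions
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

def deduplicate_pairs(pairs):
    """Keep the first occurrence of each content-identical pair."""
    seen, unique = set(), []
    for pair in pairs:
        h = compute_hash(pair)
        if h not in seen:
            seen.add(h)
            unique.append(pair)
    return unique

pairs = [
    {"terse": "What is X?", "rich": "X is Y.", "source": "s1", "model": "m"},
    {"terse": "What is X?", "rich": "X is Y.", "source": "s2", "model": "m"},
    {"terse": "What is Z?", "rich": "Z is W.", "source": "s1", "model": "m"},
]
unique = deduplicate_pairs(pairs)
```

Keeping the first occurrence means the earliest-harvested session wins ties, which keeps dedup deterministic across runs.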
38  scripts/validate_knowledge.py  Normal file
@@ -0,0 +1,38 @@
#!/usr/bin/env python3
"""Validate knowledge files and index.json against the schema."""
import json, sys
from pathlib import Path

VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED = {"id", "fact", "category", "domain", "confidence"}

def validate_fact(fact, src=""):
    errs = []
    for f in REQUIRED:
        if f not in fact: errs.append(f"{src}: missing '{f}'")
    if "category" in fact and fact["category"] not in VALID_CATEGORIES:
        errs.append(f"{src}: invalid category '{fact['category']}'")
    if "confidence" in fact:
        if not isinstance(fact["confidence"], (int, float)) or not (0 <= fact["confidence"] <= 1):
            errs.append(f"{src}: confidence must be 0.0-1.0")
    if "id" in fact:
        parts = fact["id"].split(":")
        if len(parts) != 3: errs.append(f"{src}: id must be domain:category:sequence")
    return errs

def main():
    idx = Path(__file__).parent.parent / "knowledge" / "index.json"
    if not idx.exists(): print(f"FAILED: {idx} not found"); sys.exit(1)
    data = json.load(open(idx))
    errs = []
    seen = set()
    for i, f in enumerate(data.get("facts", [])):
        errs.extend(validate_fact(f, f"[{i}]"))
        if "id" in f:
            if f["id"] in seen: errs.append(f"duplicate id '{f['id']}'")
            seen.add(f["id"])
    if errs:
        print(f"FAILED - {len(errs)} errors:"); [print(f"  x {e}") for e in errs]; sys.exit(1)
    print(f"PASSED - {len(data.get('facts', []))} facts")

if __name__ == "__main__": main()
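To make the accepted shape concrete, here is a usage sketch of the validator (logic copied from the script above so the snippet is self-contained) against one valid and one malformed fact; the `gitea:tool-quirk:001` id is an invented example of the `domain:category:sequence` format:

```python
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED = {"id", "fact", "category", "domain", "confidence"}

def validate_fact(fact, src=""):
    # same checks as scripts/validate_knowledge.py
    errs = []
    for f in REQUIRED:
        if f not in fact:
            errs.append(f"{src}: missing '{f}'")
    if "category" in fact and fact["category"] not in VALID_CATEGORIES:
        errs.append(f"{src}: invalid category '{fact['category']}'")
    if "confidence" in fact:
        if not isinstance(fact["confidence"], (int, float)) or not (0 <= fact["confidence"] <= 1):
            errs.append(f"{src}: confidence must be 0.0-1.0")
    if "id" in fact and len(fact["id"].split(":")) != 3:
        errs.append(f"{src}: id must be domain:category:sequence")
    return errs

# all required fields, known category, in-range confidence, 3-part id -> no errors
good = {
    "id": "gitea:tool-quirk:001",
    "fact": "Gitea token lives at ~/.config/gitea/token",
    "category": "tool-quirk",
    "domain": "gitea",
    "confidence": 0.9,
}
# missing 'fact' and 'domain', unknown category, confidence out of range, 1-part id
bad = {"id": "oops", "category": "opinion", "confidence": 2}
errors = validate_fact(bad, "bad")
```

Each violation is reported independently, so a single malformed entry can surface several errors at once.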
@@ -2,98 +2,107 @@
 ## System Prompt
 
-You are a knowledge extraction engine. Your task is to analyze a session transcript and extract durable knowledge that will help future sessions be more efficient.
+You are a knowledge extraction engine. You read session transcripts and output ONLY structured JSON. You never infer. You never assume. You extract only what the transcript explicitly states.
 
-## Instructions
+## Prompt
 
-Read the session transcript carefully. Extract ONLY information that is explicitly stated in the transcript. Do NOT infer, assume, or hallucinate information.
-
-### Categories
-
-Extract knowledge into these categories:
-
-1. **fact**: Concrete, verifiable information learned (e.g., "Repository X has 5 files", "API returns JSON with field Y")
-2. **pitfall**: Errors encountered, wrong assumptions, things that wasted time (e.g., "Assumed API token was in env var GITEA_TOKEN, but it's in ~/.config/gitea/token")
-3. **pattern**: Successful sequences of actions (e.g., "To deploy: 1. Run tests 2. Build 3. Push to Gitea 4. Trigger webhook")
-4. **tool-quirk**: Environment-specific behaviors (e.g., "Token paths are different on macOS vs Linux", "URL format requires trailing slash")
-5. **question**: Things identified but not answered (e.g., "Need to determine optimal batch size for harvesting")
-
-### Output Format
-
-Return a JSON object with an array of extracted knowledge items. Each item must have:
-
-```json
-{
-  "fact": "One sentence description of the knowledge",
-  "category": "fact|pitfall|pattern|tool-quirk|question",
-  "repo": "Repository name this applies to, or 'global' if general",
-  "confidence": 0.0-1.0
-}
 ```
+TASK: Extract durable knowledge from this session transcript.
 
-### Confidence Scoring
+RULES:
+1. Extract ONLY information explicitly stated in the transcript.
+2. Do NOT infer, assume, or hallucinate.
+3. Every fact must be verifiable by pointing to a specific line in the transcript.
+4. If the session failed or was partial, extract pitfalls and questions — these are the most valuable.
+5. Be specific. "Gitea API is slow" is worthless. "Gitea issues endpoint with state=open returns empty when limit=50 but works with limit=5" is knowledge.
 
-- 0.9-1.0: Explicitly stated with verification (e.g., "Error message shows X")
-- 0.7-0.8: Clearly implied by multiple data points
-- 0.5-0.6: Suggested but not fully verified
-- 0.3-0.4: Inferred from limited data
-- 0.1-0.2: Speculative or uncertain
+CATEGORIES (assign exactly one per item):
+- fact: Concrete, verifiable thing learned (paths, formats, counts, configs)
+- pitfall: Error hit, wrong assumption, time wasted, thing that didn't work
+- pattern: Successful sequence that should be reused (deploy steps, debug flow)
+- tool-quirk: Environment-specific behavior (token paths, URL formats, API gotchas)
+- question: Something identified but not answered — the NEXT agent should investigate
 
-### Constraints
+CONFIDENCE:
+- 0.9: Directly observed with error output or explicit verification
+- 0.7: Multiple data points confirm, but not explicitly verified
+- 0.5: Suggested by context, not tested
+- 0.3: Inferred from limited evidence
 
-1. **No hallucination**: Only extract what's explicitly in the transcript
-2. **Specificity**: Each fact must be specific and actionable
-3. **Relevance**: Only extract knowledge that would help future sessions
-4. **Brevity**: One sentence per fact
-5. **Partial sessions**: Even failed or incomplete sessions may contain valuable pitfalls
-
-### Example Input/Output
-
-**Input Transcript (excerpt):**
-```
-User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
-Agent: Trying git clone... Error: 401 Unauthorized
-Agent: Checking for token... Found in ~/.config/gitea/token
-Agent: Token is gitea_token format, not OAuth
-Agent: Clone successful with token authentication
-```
-
-**Output:**
-```json
+OUTPUT FORMAT (valid JSON only, no markdown, no explanation):
 {
   "knowledge": [
     {
-      "fact": "Gitea repo at forge.alexanderwhitestone.com requires authentication for cloning",
-      "category": "fact",
-      "repo": "compounding-intelligence",
-      "confidence": 0.9
-    },
-    {
-      "fact": "Gitea authentication token is stored at ~/.config/gitea/token",
-      "category": "tool-quirk",
-      "repo": "global",
-      "confidence": 0.9
-    },
-    {
-      "fact": "Gitea uses gitea_token format, not OAuth for API access",
-      "category": "tool-quirk",
-      "repo": "global",
-      "confidence": 0.8
-    },
-    {
-      "fact": "Clone fails with 401 when no token is provided",
-      "category": "pitfall",
-      "repo": "compounding-intelligence",
-      "confidence": 0.9
+      "fact": "One specific sentence of knowledge",
+      "category": "fact|pitfall|pattern|tool-quirk|question",
+      "repo": "repo-name or global",
+      "confidence": 0.0-1.0,
+      "evidence": "Brief quote or reference from transcript that supports this"
     }
-  ]
+  ],
+  "meta": {
+    "session_outcome": "success|partial|failed",
+    "tools_used": ["tool1", "tool2"],
+    "repos_touched": ["repo1"],
+    "error_count": 0,
+    "knowledge_count": 0
+  }
 }
+
+TRANSCRIPT:
+{{transcript}}
 ```
 
-## Final Notes
+## Design Notes
 
-- Process the entire transcript, not just the beginning
-- Pay special attention to errors and corrections
-- Note any environment-specific details
-- Track tool-specific behaviors and quirks
-- If the session failed, focus on pitfalls and questions
+### Why this works with mimo-v2-pro
+
+Mimo needs:
+- Explicit format constraints ("valid JSON only, no markdown")
+- Clear category definitions with concrete examples
+- Hard rules before soft guidance
+- The transcript at the END (so it reads all instructions first)
+
+This prompt front-loads all rules, then gives the transcript last. Mimo follows the pattern.
+
+### Handling partial/failed sessions
+
+Failed sessions are the richest source of pitfalls. The prompt explicitly says:
+> "If the session failed or was partial, extract pitfalls and questions — these are the most valuable."
+
+This reframes failure as valuable output, not noise to discard.
+
+### The `evidence` field
+
+Added to the original spec. Every extracted item must cite where in the transcript it came from. This:
+- Prevents hallucination (can't cite what isn't there)
+- Enables verification (reviewer can check the source)
+- Trains confidence calibration (the agent must find evidence, not just claim it)
+
+### Token budget
+
+Target: ~1,000 tokens for the prompt (excluding transcript).
+
+```
+System prompt: ~50 tokens
+Rules: ~200 tokens
+Categories: ~150 tokens
+Confidence: ~100 tokens
+Output format: ~200 tokens
+Design notes: NOT included in prompt (documentation only)
+─────────────────────────────
+Total prompt: ~700 tokens
+```
+
+Leaves ~300 tokens headroom for variable content (transcript insertion, edge cases).
+
+### What this replaces
+
+The v1 prompt had:
+- Verbose prose explanations (waste tokens for mimo)
+- No `evidence` field (hallucination risk)
+- No `meta` block (no session-level metadata)
+- No explicit handling of failed sessions
+- Example was too long (~150 tokens of example for a 1k prompt)
+
+This v2 is tighter, more structured, and adds the evidence requirement that prevents the #1 failure mode of extraction prompts: generating plausible-sounding facts that aren't in the transcript.
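The `evidence` field only prevents hallucination if something actually checks it. A minimal sketch of a post-extraction verifier that rejects items whose evidence quote never appears in the transcript (a hypothetical helper, not part of the committed diff; `verify_evidence` is an invented name):

```python
def verify_evidence(knowledge, transcript):
    """Split extracted items into (grounded, ungrounded) by whether
    their evidence quote literally occurs in the transcript."""
    grounded, ungrounded = [], []
    for item in knowledge:
        quote = item.get("evidence", "").strip()
        if quote and quote in transcript:
            grounded.append(item)
        else:
            ungrounded.append(item)  # can't cite what isn't there
    return grounded, ungrounded

transcript = "Agent: Checking for token... Found in ~/.config/gitea/token"
items = [
    {"fact": "Token path is ~/.config/gitea/token",
     "evidence": "Found in ~/.config/gitea/token"},
    {"fact": "Server runs Debian", "evidence": "uname says Debian"},
]
good, bad = verify_evidence(items, transcript)
```

A substring check is deliberately strict: it forces the extractor to quote the transcript verbatim rather than paraphrase, at the cost of rejecting legitimately reworded evidence.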
18  test_sessions/session_failure.jsonl  Normal file
@@ -0,0 +1,18 @@
{"role": "user", "content": "Deploy the application to production"}
{"role": "assistant", "content": "I'll deploy the application to production.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@prod-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "ssh: connect to host prod-server port 22: Connection refused"}
{"role": "assistant", "content": "Connection refused. Let me check the server status.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ping -c 3 prod-server"}}]}
{"role": "tool", "content": "PING prod-server (192.168.1.100): 56 data bytes\nRequest timeout for icmp_seq 0\nRequest timeout for icmp_seq 1\nRequest timeout for icmp_seq 2\n--- prod-server ping statistics ---\n3 packets transmitted, 0 packets received, 100.0% packet loss"}
{"role": "assistant", "content": "The server appears to be down. Let me check if there's an alternative deployment method."}
{"role": "user", "content": "Try using the VPS instead"}
{"role": "assistant", "content": "I'll try deploying to the VPS.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@vps-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "Permission denied (publickey)."}
{"role": "assistant", "content": "Permission denied. The SSH key might not be set up correctly. Let me check the SSH configuration.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat ~/.ssh/config"}}]}
{"role": "tool", "content": "Host vps-server\n  HostName 167.99.126.228\n  User deploy\n  IdentityFile ~/.ssh/id_rsa_vps"}
{"role": "assistant", "content": "The SSH config looks correct. Let me check if the key exists.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la ~/.ssh/id_rsa_vps"}}]}
{"role": "tool", "content": "ls: /Users/user/.ssh/id_rsa_vps: No such file or directory"}
{"role": "assistant", "content": "The SSH key doesn't exist. This is a pitfall - we need to generate the SSH key first."}
{"role": "user", "content": "Generate a new SSH key for the VPS"}
{"role": "assistant", "content": "I'll generate a new SSH key.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_vps -N ''"}}]}
{"role": "tool", "content": "Generating public/private rsa key pair.\nYour identification has been saved in /Users/user/.ssh/id_rsa_vps\nYour public key has been saved in /Users/user/.ssh/id_rsa_vps.pub\nThe key fingerprint is:\nSHA256:abc123... user@machine"}
{"role": "assistant", "content": "SSH key generated. Now I need to add it to the VPS."}
14  test_sessions/session_partial.jsonl  Normal file
@@ -0,0 +1,14 @@
{"role": "user", "content": "Check the Gitea API for open issues"}
{"role": "assistant", "content": "I'll check the Gitea API for open issues.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://forge.alexanderwhitestone.com/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (6) Could not resolve host: forge.alexanderwhitestone.com"}
{"role": "assistant", "content": "DNS resolution failed. Let me check the network connectivity.", "tool_calls": [{"function": "terminal", "arguments": {"command": "nslookup forge.alexanderwhitestone.com"}}]}
{"role": "tool", "content": "Server: 8.8.8.8\nAddress: 8.8.8.8#53\n\n** server can't find forge.alexanderwhitestone.com: NXDOMAIN"}
{"role": "assistant", "content": "The domain doesn't resolve. This might be a DNS issue. Let me try using the IP address directly.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (60) SSL certificate problem: self signed certificate"}
{"role": "assistant", "content": "SSL certificate issue. The server is using a self-signed certificate. Let me try with -k flag to skip SSL verification.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "[]"}
{"role": "assistant", "content": "No open issues found. The API returned an empty array."}
{"role": "user", "content": "Check if there are any pull requests"}
{"role": "assistant", "content": "Let me check for pull requests.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/pulls?state=open"}}]}
{"role": "tool", "content": "[{\"id\": 20, \"title\": \"feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)\", \"state\": \"open\", \"user\": {\"login\": \"Rockachopa\"}, \"created_at\": \"2026-04-14T17:28:20Z\"}]"}
{"role": "assistant", "content": "Found 1 open pull request: PR #20 - \"feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)\" by Rockachopa."}
10  test_sessions/session_patterns.jsonl  Normal file
@@ -0,0 +1,10 @@
{"role": "user", "content": "How do I deploy a new version of the application?"}
{"role": "assistant", "content": "Here's the deployment process:\n1. Run tests: pytest\n2. Build Docker image: docker build -t app:latest .\n3. Push to registry: docker push registry.example.com/app:latest\n4. Update production: ssh prod 'cd /app && docker-compose pull && docker-compose up -d'\n5. Verify: curl https://app.example.com/health"}
{"role": "user", "content": "What if tests fail?"}
{"role": "assistant", "content": "If tests fail:\n1. Check the error logs: pytest -v\n2. Fix the failing tests\n3. Re-run tests: pytest\n4. If still failing, check recent changes: git log --oneline -10\n5. Consider reverting: git revert HEAD"}
{"role": "user", "content": "What if the Docker build fails?"}
{"role": "assistant", "content": "If Docker build fails:\n1. Check Dockerfile syntax: docker build --no-cache -t app:latest .\n2. Check for missing dependencies: cat requirements.txt\n3. Check disk space: df -h\n4. Clean up Docker: docker system prune -f\n5. Try building with verbose output: docker build --progress=plain -t app:latest ."}
{"role": "user", "content": "What if the push to registry fails?"}
{"role": "assistant", "content": "If push to registry fails:\n1. Check registry credentials: docker login registry.example.com\n2. Check registry space: curl https://registry.example.com/v2/_catalog\n3. Check image size: docker images app:latest\n4. Try tagging with version: docker tag app:latest registry.example.com/app:v1.0.0\n5. Push with version tag: docker push registry.example.com/app:v1.0.0"}
{"role": "user", "content": "What if the production update fails?"}
{"role": "assistant", "content": "If production update fails:\n1. Check SSH connection: ssh prod 'echo connected'\n2. Check disk space on prod: ssh prod 'df -h'\n3. Check Docker status: ssh prod 'docker ps -a'\n4. Check logs: ssh prod 'docker-compose logs --tail=100'\n5. Rollback: ssh prod 'cd /app && git checkout HEAD~1 && docker-compose up -d'"}
12
test_sessions/session_questions.jsonl
Normal file
12
test_sessions/session_questions.jsonl
Normal file
@@ -0,0 +1,12 @@
{"role": "user", "content": "I need to implement a new feature for the application"}
{"role": "assistant", "content": "I can help you implement a new feature. What feature would you like to add?"}
{"role": "user", "content": "I want to add user authentication"}
{"role": "assistant", "content": "User authentication is a good feature. I have a few questions:\n1. What authentication method do you prefer? (JWT, OAuth, Session-based)\n2. Do you need social login? (Google, GitHub, etc.)\n3. What user data do you need to store?\n4. Do you need role-based access control?\n5. What's your database? (PostgreSQL, MySQL, SQLite)"}
{"role": "user", "content": "I prefer JWT, no social login, just email and password, no roles, PostgreSQL"}
{"role": "assistant", "content": "Great choices. Here's the implementation plan:\n1. Create user model with email and password hash\n2. Create JWT token generation and validation\n3. Create login endpoint: POST /api/auth/login\n4. Create register endpoint: POST /api/auth/register\n5. Create middleware for protected routes\n6. Add password hashing with bcrypt\n7. Add rate limiting for login attempts\n8. Add password reset functionality\n9. Add email verification\n10. Test all endpoints"}
{"role": "user", "content": "What about password reset?"}
{"role": "assistant", "content": "For password reset:\n1. Create reset token table\n2. Create forgot password endpoint: POST /api/auth/forgot-password\n3. Send reset email with token\n4. Create reset password endpoint: POST /api/auth/reset-password\n5. Validate reset token and expiration\n6. Update password and delete token\n7. Send confirmation email"}
{"role": "user", "content": "What about email verification?"}
{"role": "assistant", "content": "For email verification:\n1. Add email_verified field to user model\n2. Create verification token table\n3. Send verification email on registration\n4. Create verify email endpoint: GET /api/auth/verify-email?token=...\n5. Validate token and update email_verified\n6. Add middleware to check email_verified for certain routes\n7. Add resend verification email endpoint"}
{"role": "user", "content": "What about rate limiting?"}
{"role": "assistant", "content": "For rate limiting:\n1. Use Redis for rate limit storage\n2. Create rate limit middleware\n3. Set limits: 5 login attempts per minute, 3 password resets per hour\n4. Return 429 Too Many Requests when limit exceeded\n5. Add X-RateLimit headers to responses\n6. Log rate limit violations\n7. Add whitelist for trusted IPs"}
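The rate-limiting plan in the transcript above (Redis-backed storage, 5 login attempts per minute, 429 on excess) can be sketched with an in-memory fixed-window counter. The class name and the dict-based storage here are illustrative stand-ins for the Redis design the assistant describes, not code from this repository:

```python
import time


class FixedWindowRateLimiter:
    """In-memory stand-in for the Redis-backed limiter described above."""

    def __init__(self, limit=5, window_s=60):
        self.limit = limit          # e.g. 5 login attempts
        self.window_s = window_s    # per 60-second window
        self._counts = {}           # key -> (window_start, count)

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        window_start, count = self._counts.get(key, (now, 0))
        if now - window_start >= self.window_s:
            window_start, count = now, 0  # new window: reset the counter
        if count >= self.limit:
            return False  # caller should answer 429 Too Many Requests
        self._counts[key] = (window_start, count + 1)
        return True


limiter = FixedWindowRateLimiter(limit=5, window_s=60)
print([limiter.allow("1.2.3.4", now=0.0) for _ in range(6)])
# [True, True, True, True, True, False]
```

A production version would keep the per-key counters in Redis (so all app instances share them) and attach the remaining quota to `X-RateLimit-*` response headers, as the plan outlines.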
13 test_sessions/session_success.jsonl Normal file
@@ -0,0 +1,13 @@
{"role": "user", "content": "Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence"}
{"role": "assistant", "content": "I'll clone the repository for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "git clone https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence.git"}}]}
{"role": "tool", "content": "Cloning into 'compounding-intelligence'...\nremote: Enumerating objects: 15, done.\nremote: Counting objects: 100% (15/15), done.\nremote: Compressing objects: 100% (12/12), done.\nremote: Total 15 (delta 2), reused 0 (delta 0), pack-reused 0\nReceiving objects: 100% (15/15), done.\nResolving deltas: 100% (2/2), done."}
{"role": "assistant", "content": "Repository cloned successfully. Let me check the structure.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/"}}]}
{"role": "tool", "content": "total 8\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 .\ndrwxr-xr-x 12 user staff 384 Apr 14 17:21 ..\ndrwxr-xr-x 13 user staff 416 Apr 14 17:21 .git\n-rw-r--r-- 1 user staff 1234 Apr 14 17:21 README.md\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 knowledge\n-rw-r--r-- 1 user staff 456 Apr 14 17:21 metrics\ndrwxr-xr-x 4 user staff 128 Apr 14 17:21 scripts\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 templates"}
{"role": "user", "content": "What files are in the templates directory?"}
{"role": "assistant", "content": "Let me check the templates directory.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/templates/"}}]}
{"role": "tool", "content": "total 0\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 .\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 ..\n-rw-r--r-- 1 user staff 0 Apr 14 17:21 .gitkeep"}
{"role": "assistant", "content": "The templates directory currently only contains a .gitkeep file. No templates have been added yet."}
{"role": "user", "content": "Create a harvest-prompt.md template in the templates directory."}
{"role": "assistant", "content": "I'll create the harvest-prompt.md template for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat > compounding-intelligence/templates/harvest-prompt.md << 'EOF'\n# Knowledge Extraction Prompt\n\nExtract knowledge from session transcripts.\nEOF"}}]}
{"role": "tool", "content": ""}
{"role": "assistant", "content": "Template created successfully. The harvest-prompt.md file is now in the templates directory."}
19 tests/test_ci_config.py Normal file
@@ -0,0 +1,19 @@
from pathlib import Path


def test_requirements_makefile_and_workflow_exist() -> None:
    assert Path("requirements.txt").exists()
    assert Path("Makefile").exists()
    assert Path(".gitea/workflows/test.yml").exists()


def test_ci_workflow_runs_project_test_command() -> None:
    workflow = Path(".gitea/workflows/test.yml").read_text(encoding="utf-8")
    requirements = Path("requirements.txt").read_text(encoding="utf-8")
    makefile = Path("Makefile").read_text(encoding="utf-8")

    assert "pytest" in requirements
    assert "test:" in makefile
    assert "python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v" in makefile
    assert "pip install -r requirements.txt" in workflow
    assert "make test" in workflow
141 tests/test_knowledge_gap_identifier.py Normal file
@@ -0,0 +1,141 @@
"""Tests for knowledge_gap_identifier module."""

import sys
import os
import tempfile
import shutil
from pathlib import Path

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'scripts'))

from knowledge_gap_identifier import KnowledgeGapIdentifier, GapType, GapSeverity


def _make_repo(tmpdir, structure):
    """Create a test repo from a dict of {path: content}."""
    for rel_path, content in structure.items():
        p = Path(tmpdir) / rel_path
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content)


def test_undocumented_symbol():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "README.md": "# Calculator\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented = [g for g in report.gaps if g.gap_type == GapType.UNDOCUMENTED]
        assert any(g.name == "add" for g in undocumented), "add should be undocumented"


def test_documented_symbol_no_gap():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "README.md": "# Calculator\nUse `add()` to add numbers.\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented = [g for g in report.gaps
                        if g.gap_type == GapType.UNDOCUMENTED and g.name == "add"]
        assert len(undocumented) == 0, "add is documented, should not be flagged"


def test_untested_module():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "src/helper.py": "def format(x):\n    return str(x)\n",
            "tests/test_calculator.py": "from src.calculator import add\nassert add(1,2) == 3\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        untested = [g for g in report.gaps if g.gap_type == GapType.UNTESTED]
        assert any("helper" in g.name for g in untested), "helper should be untested"


def test_tested_module_no_gap():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "tests/test_calculator.py": "def test_add():\n    assert True\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        untested = [g for g in report.gaps
                    if g.gap_type == GapType.UNTESTED and "calculator" in g.name]
        assert len(untested) == 0, "calculator has tests, should not be flagged"


def test_missing_implementation():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def run():\n    pass\n",
            "docs/api.md": "# API\nUse `NonExistentClass` to do things.\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        missing = [g for g in report.gaps if g.gap_type == GapType.MISSING_IMPLEMENTATION]
        assert any(g.name == "NonExistentClass" for g in missing)


def test_private_symbols_skipped():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def _internal():\n    pass\ndef public():\n    pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented_names = [g.name for g in report.gaps if g.gap_type == GapType.UNDOCUMENTED]
        assert "_internal" not in undocumented_names, "Private symbols should be skipped"
        assert "public" in undocumented_names


def test_empty_repo():
    with tempfile.TemporaryDirectory() as tmpdir:
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        assert len(report.gaps) == 0


def test_invalid_path():
    report = KnowledgeGapIdentifier().analyze("/nonexistent/path/xyz")
    assert len(report.gaps) == 1
    assert report.gaps[0].severity == GapSeverity.ERROR


def test_report_summary():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "class MyService:\n    def handle(self):\n        pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        summary = report.summary()
        assert "UNDOCUMENTED" in summary
        assert "MyService" in summary


def test_report_to_dict():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def hello():\n    pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        d = report.to_dict()
        assert "total_gaps" in d
        assert "gaps" in d
        assert isinstance(d["gaps"], list)
        assert d["total_gaps"] > 0


if __name__ == "__main__":
    test_undocumented_symbol()
    test_documented_symbol_no_gap()
    test_untested_module()
    test_tested_module_no_gap()
    test_missing_implementation()
    test_private_symbols_skipped()
    test_empty_repo()
    test_invalid_path()
    test_report_summary()
    test_report_to_dict()
    print("All 10 tests passed.")
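The implementation of `KnowledgeGapIdentifier` itself is not part of this diff; as a rough illustration of what the undocumented-symbol check these tests exercise could look like, here is a stdlib-only sketch. The function name and the substring heuristic are assumptions for illustration, not the module's actual API:

```python
import ast
from pathlib import Path


def find_undocumented_symbols(repo_path):
    """Return names of public top-level functions/classes under src/ that
    no Markdown file in the repo mentions."""
    repo = Path(repo_path)
    # Concatenate all Markdown docs so membership tests are a simple substring check.
    docs = " ".join(p.read_text() for p in repo.rglob("*.md"))
    gaps = []
    for py in repo.glob("src/**/*.py"):
        tree = ast.parse(py.read_text())
        for node in tree.body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                if node.name.startswith("_"):
                    continue  # private symbols are skipped, as the tests expect
                if node.name not in docs:
                    gaps.append(node.name)
    return gaps


# Mirroring test_undocumented_symbol above:
import tempfile, os

with tempfile.TemporaryDirectory() as d:
    os.makedirs(os.path.join(d, "src"))
    Path(d, "src", "calculator.py").write_text("def add(a, b):\n    return a + b\n")
    Path(d, "README.md").write_text("# Calculator\n")
    print(find_undocumented_symbols(d))  # ['add']
```

A real identifier would presumably return structured gap objects with a `gap_type` and `severity`, as the assertions above require, rather than bare names.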
319 tests/test_perf_bottleneck_finder.py Normal file
@@ -0,0 +1,319 @@
#!/usr/bin/env python3
"""
Tests for Performance Bottleneck Finder.
"""

import json
import os
import tempfile
import textwrap
from pathlib import Path

import pytest

# Add scripts to path
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "scripts"))

from perf_bottleneck_finder import (
    Bottleneck,
    PerfReport,
    find_slow_tests_by_scan,
    analyze_build_artifacts,
    analyze_makefile_targets,
    find_slow_imports,
    generate_report,
    format_markdown,
    severity_sort_key,
)


class TestBottleneck:
    """Test Bottleneck dataclass."""

    def test_creation(self):
        b = Bottleneck(
            category="test",
            name="test_foo",
            duration_s=5.0,
            severity="warning",
            recommendation="Mock it"
        )
        assert b.category == "test"
        assert b.name == "test_foo"
        assert b.duration_s == 5.0
        assert b.severity == "warning"
        assert b.recommendation == "Mock it"
        assert b.file_path is None
        assert b.line_number is None

    def test_with_location(self):
        b = Bottleneck(
            category="test",
            name="test_bar",
            duration_s=2.0,
            severity="info",
            recommendation="Consider",
            file_path="tests/test_bar.py",
            line_number=42
        )
        assert b.file_path == "tests/test_bar.py"
        assert b.line_number == 42

    def test_to_dict(self):
        b = Bottleneck("test", "x", 1.0, "info", "y")
        d = b.__dict__
        assert "category" in d
        assert "duration_s" in d


class TestPerfReport:
    """Test PerfReport dataclass."""

    def test_creation(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo"
        )
        assert report.timestamp == "2026-01-01T00:00:00Z"
        assert report.bottlenecks == []
        assert report.summary == {}

    def test_to_dict(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            bottlenecks=[Bottleneck("test", "x", 1.0, "info", "y")]
        )
        d = report.to_dict()
        assert "bottlenecks" in d
        assert len(d["bottlenecks"]) == 1


class TestSeveritySort:
    """Test severity sorting."""

    def test_critical_first(self):
        items = [
            Bottleneck("test", "a", 1.0, "info", ""),
            Bottleneck("test", "b", 0.5, "critical", ""),
            Bottleneck("test", "c", 2.0, "warning", ""),
        ]
        items.sort(key=severity_sort_key)
        assert items[0].severity == "critical"
        assert items[1].severity == "warning"
        assert items[2].severity == "info"

    def test_duration_within_severity(self):
        items = [
            Bottleneck("test", "slow", 10.0, "warning", ""),
            Bottleneck("test", "fast", 1.0, "warning", ""),
        ]
        items.sort(key=severity_sort_key)
        assert items[0].name == "slow"  # Higher duration first within same severity


class TestSlowTestScan:
    """Test slow test pattern scanning."""

    def test_finds_sleep(self, tmp_path):
        test_file = tmp_path / "test_sleepy.py"
        test_file.write_text(textwrap.dedent('''
            import time

            def test_slow():
                time.sleep(5)
                assert True
        '''))

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("sleep" in b.recommendation.lower() for b in bottlenecks)

    def test_finds_http_calls(self, tmp_path):
        test_file = tmp_path / "test_http.py"
        test_file.write_text(textwrap.dedent('''
            import requests

            def test_api():
                resp = requests.get("https://example.com")
                assert resp.status_code == 200
        '''))

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("HTTP" in b.recommendation or "mock" in b.recommendation.lower() for b in bottlenecks)

    def test_skips_non_test_files(self, tmp_path):
        src_file = tmp_path / "main.py"
        src_file.write_text("import time\ntime.sleep(10)\n")

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) == 0

    def test_handles_missing_dir(self):
        bottlenecks = find_slow_tests_by_scan("/nonexistent/path")
        assert bottlenecks == []

    def test_file_path_populated(self, tmp_path):
        test_file = tmp_path / "test_example.py"
        test_file.write_text("import time\n\ndef test_it():\n    time.sleep(2)\n")

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert bottlenecks[0].file_path is not None
        assert bottlenecks[0].line_number is not None


class TestBuildArtifacts:
    """Test build artifact analysis."""

    def test_finds_large_node_modules(self, tmp_path):
        nm = tmp_path / "node_modules"
        nm.mkdir()
        # Create a file > 10MB
        big_file = nm / "big.txt"
        big_file.write_bytes(b"x" * (11 * 1024 * 1024))

        bottlenecks = analyze_build_artifacts(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("node_modules" in b.name for b in bottlenecks)

    def test_ignores_small_dirs(self, tmp_path):
        nm = tmp_path / "node_modules"
        nm.mkdir()
        small_file = nm / "small.txt"
        small_file.write_bytes(b"x" * 100)

        bottlenecks = analyze_build_artifacts(str(tmp_path))
        assert not any("node_modules" in b.name for b in bottlenecks)

    def test_finds_pycache(self, tmp_path):
        cache = tmp_path / "__pycache__"
        cache.mkdir()
        big_file = cache / "big.pyc"
        big_file.write_bytes(b"x" * (11 * 1024 * 1024))

        bottlenecks = analyze_build_artifacts(str(tmp_path))
        assert any("__pycache__" in b.name for b in bottlenecks)


class TestMakefileAnalysis:
    """Test Makefile analysis."""

    def test_finds_pip_install(self, tmp_path):
        makefile = tmp_path / "Makefile"
        makefile.write_text(textwrap.dedent('''
            install:
                pip install -r requirements.txt

            test:
                pytest
        '''))

        bottlenecks = analyze_makefile_targets(str(tmp_path))
        assert len(bottlenecks) >= 1

    def test_no_makefile(self, tmp_path):
        bottlenecks = analyze_makefile_targets(str(tmp_path))
        assert bottlenecks == []


class TestImportAnalysis:
    """Test heavy import detection."""

    def test_finds_pandas(self, tmp_path):
        src = tmp_path / "analysis.py"
        src.write_text("import pandas as pd\n")

        bottlenecks = find_slow_imports(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("pandas" in b.name for b in bottlenecks)

    def test_finds_torch(self, tmp_path):
        src = tmp_path / "model.py"
        src.write_text("import torch\n")

        bottlenecks = find_slow_imports(str(tmp_path))
        assert any("torch" in b.name for b in bottlenecks)

    def test_skips_light_imports(self, tmp_path):
        src = tmp_path / "utils.py"
        src.write_text("import json\nimport os\nimport sys\n")

        bottlenecks = find_slow_imports(str(tmp_path))
        assert len(bottlenecks) == 0


class TestGenerateReport:
    """Test full report generation."""

    def test_empty_repo(self, tmp_path):
        report = generate_report(str(tmp_path))
        assert report.summary["total_bottlenecks"] >= 0
        assert "critical" in report.summary
        assert "warning" in report.summary

    def test_with_findings(self, tmp_path):
        # Create a test file with issues
        test_file = tmp_path / "test_slow.py"
        test_file.write_text(textwrap.dedent('''
            import time
            import requests

            def test_sleepy():
                time.sleep(3)

            def test_http():
                requests.get("https://example.com")
        '''))

        report = generate_report(str(tmp_path))
        assert report.summary["total_bottlenecks"] >= 2
        assert len(report.bottlenecks) > 0

    def test_summary_categories(self, tmp_path):
        report = generate_report(str(tmp_path))
        assert "by_category" in report.summary


class TestMarkdownReport:
    """Test markdown output."""

    def test_format(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            bottlenecks=[
                Bottleneck("test", "slow_test", 5.0, "critical", "Fix it")
            ],
            summary={
                "total_bottlenecks": 1,
                "critical": 1,
                "warning": 0,
                "info": 0,
                "estimated_total_slowdown_s": 5.0,
                "by_category": {"test": 1},
            }
        )
        md = format_markdown(report)
        assert "# Performance Bottleneck Report" in md
        assert "slow_test" in md
        assert "🔴" in md
        assert "Fix it" in md

    def test_empty_report(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            summary={
                "total_bottlenecks": 0,
                "critical": 0,
                "warning": 0,
                "info": 0,
                "estimated_total_slowdown_s": 0,
                "by_category": {},
            }
        )
        md = format_markdown(report)
        assert "Total bottlenecks:** 0" in md
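The `TestSeveritySort` cases above pin down the contract of `severity_sort_key` without showing its body: critical before warning before info, and longer durations first within a severity. A minimal implementation consistent with those tests might look like this; the simplified `Bottleneck` dataclass here is a stand-in for the real one in `perf_bottleneck_finder`, used only so the sketch is self-contained:

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    # Minimal stand-in for the module's dataclass, for this sketch only.
    category: str
    name: str
    duration_s: float
    severity: str
    recommendation: str


_SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def severity_sort_key(b):
    # Lower rank sorts first; negating duration puts longer items first
    # within the same severity, matching TestSeveritySort.
    return (_SEVERITY_RANK.get(b.severity, len(_SEVERITY_RANK)), -b.duration_s)


items = [
    Bottleneck("test", "a", 1.0, "info", ""),
    Bottleneck("test", "b", 0.5, "critical", ""),
    Bottleneck("test", "c", 2.0, "warning", ""),
]
items.sort(key=severity_sort_key)
print([b.name for b in items])  # ['b', 'c', 'a']
```

Unknown severities fall to the end via the `dict.get` default, which keeps the sort total even if new severity labels are added later.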