Compare commits
1 commit
step35/138 ... feat/sessi

| Author | SHA1 | Date |
|--------|------|------|
|  | a20e9721b2 |  |
@@ -1,22 +0,0 @@
name: Test

on:
  pull_request:
  push:
    branches: [main]

jobs:
  pytest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install test dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Run test suite
        run: |
          make test
2 .gitignore vendored
@@ -1,2 +0,0 @@
__pycache__/
*.pyc
251 GENOME.md
@@ -1,251 +0,0 @@
# GENOME.md — compounding-intelligence

**Generated:** 2026-04-17
**Repo:** Timmy_Foundation/compounding-intelligence
**Description:** Turn 1B+ daily agent tokens into durable, compounding fleet intelligence.

---

## Project Overview

Every agent session starts at zero. The same HTTP 405 gets rediscovered as a branch protection issue. The same token path gets searched from scratch. Intelligence evaporates when the session ends.

Compounding-intelligence solves this with three pipelines forming a loop:

```
SESSION ENDS → HARVESTER → KNOWLEDGE STORE → BOOTSTRAPPER → NEW SESSION STARTS SMARTER
                                 ↓
                             MEASURER → Prove it's working
```

**Status:** Active development. Core pipelines implemented. 20+ scripts, 14 test files, knowledge store populated with real data.
## Architecture

```mermaid
graph TD
    TRANS["Session Transcripts<br/>~/.hermes/sessions/*.jsonl"] --> READER[session_reader.py]
    READER --> HARVESTER[harvester.py]
    HARVESTER -->|LLM extraction| PROMPT[harvest-prompt.md]
    HARVESTER --> DEDUP["deduplicate()"]
    DEDUP --> INDEX[knowledge/index.json]
    DEDUP --> GLOBAL["knowledge/global/*.yaml"]
    DEDUP --> REPO["knowledge/repos/*.yaml"]

    INDEX --> BOOTSTRAPPER[bootstrapper.py]
    BOOTSTRAPPER -->|filter + rank + truncate| CONTEXT["Bootstrap Context<br/>2k token injection"]
    CONTEXT --> SESSION[New Session starts smarter]

    INDEX --> VALIDATOR[validate_knowledge.py]
    INDEX --> STALENESS[knowledge_staleness_check.py]
    INDEX --> GAPS[knowledge_gap_identifier.py]

    TRANS --> SAMPLER[sampler.py]
    SAMPLER -->|score + rank| BEST[High-value sessions]
    BEST --> HARVESTER

    TRANS --> METADATA[session_metadata.py]
    METADATA --> SUMMARY[SessionSummary objects]

    INDEX --> DIFF[diff_analyzer.py]
    DIFF --> PROPOSALS[improvement_proposals.py]
    PROPOSALS --> PRIORITIES[priority_rebalancer.py]
```
## Entry Points

### Core Pipelines

| Script | Purpose | Key Functions |
|--------|---------|---------------|
| `harvester.py` | Extract knowledge from session transcripts | `harvest_session()`, `call_llm()`, `deduplicate()`, `validate_fact()` |
| `bootstrapper.py` | Build pre-session context from knowledge store | `build_bootstrap_context()`, `filter_facts()`, `sort_facts()`, `truncate_to_tokens()` |
| `session_reader.py` | Parse JSONL session transcripts | `read_session()`, `extract_conversation()`, `messages_to_text()` |
| `sampler.py` | Score and rank sessions for harvesting value | `scan_session_fast()`, `score_session()` |
| `session_metadata.py` | Extract structured metadata from sessions | `extract_session_metadata()`, `SessionSummary` |
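The `session_reader.py` entry points lend themselves to a short sketch. This is illustrative, not the repo's actual code: the function names come from the table above, while the per-line message shape (`role`/`content` keys) is an assumption.

```python
import json
from pathlib import Path

def read_session(path):
    """Parse one JSONL transcript, skipping malformed lines."""
    messages = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            messages.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # malformed lines are tolerated, not fatal
    return messages

def messages_to_text(messages):
    """Flatten messages into plain text for the LLM extraction prompt."""
    return "\n".join(
        f"{m.get('role', '?')}: {m.get('content', '')}" for m in messages
    )
```

Tolerating bad lines rather than raising matches the "handles malformed JSONL gracefully" note later in this document.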
### Analysis & Quality

| Script | Purpose |
|--------|---------|
| `validate_knowledge.py` | Validate knowledge index schema compliance |
| `knowledge_staleness_check.py` | Detect stale knowledge (source changed since extraction) |
| `knowledge_gap_identifier.py` | Find untested functions, undocumented APIs, missing tests |
| `diff_analyzer.py` | Analyze code diffs for improvement signals |
| `improvement_proposals.py` | Generate ranked improvement proposals |
| `priority_rebalancer.py` | Rebalance priorities across proposals |
| `automation_opportunity_finder.py` | Find manual steps that can be automated |
| `dead_code_detector.py` | Detect unused code |
| `dependency_graph.py` | Map dependency relationships |
| `perf_bottleneck_finder.py` | Find performance bottlenecks |
| `refactoring_opportunity_finder.py` | Identify refactoring targets |
| `gitea_issue_parser.py` | Parse Gitea issues for knowledge extraction |

### Automation

| Script | Purpose |
|--------|---------|
| `session_pair_harvester.py` | Extract training pairs from sessions |
## Data Flow

```
1. Session ends → .jsonl written to ~/.hermes/sessions/
2. sampler.py scores sessions by age, recency, repo coverage
3. harvester.py reads top sessions, calls LLM with harvest-prompt.md
4. LLM extracts facts/pitfalls/patterns/quirks/questions
5. deduplicate() checks against existing index via fact_fingerprint()
6. validate_fact() checks schema compliance
7. write_knowledge() appends to knowledge/index.json + per-repo YAML
8. On next session start, bootstrapper.py:
   a. Loads knowledge/index.json
   b. Filters by session's repo and agent type
   c. Sorts by confidence (high first), then recency
   d. Truncates to 2k token budget
   e. Injects as pre-context
9. Agent starts with full situational awareness instead of zero
```
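Bootstrapper steps 8a–8d can be sketched as a single function. This is a minimal sketch, assuming the index shape documented later in this file; the ~4-characters-per-token estimate stands in for whatever tokenizer the real `truncate_to_tokens()` uses.

```python
def build_bootstrap_context(index, repo, token_budget=2000):
    """Filter -> rank -> truncate, mirroring steps 8a-8d.

    `index` is a dict shaped like knowledge/index.json. The chars/4
    token estimate is an assumption, not the real tokenizer.
    """
    # 8b. keep facts for this repo plus global ones
    facts = [f for f in index.get("facts", [])
             if f.get("domain") in (repo, "global")]
    # 8c. high confidence first, then most recently confirmed
    facts.sort(key=lambda f: (f.get("confidence", 0.0),
                              f.get("last_confirmed", "")), reverse=True)
    # 8d. greedily pack lines until the token budget is spent
    lines, used = [], 0
    for f in facts:
        line = f"- [{f.get('category', 'fact')}] {f['fact']}"
        cost = max(1, len(line) // 4)  # rough token estimate
        if used + cost > token_budget:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

Greedy packing after the sort means a truncation always drops the lowest-ranked facts first, which is the property the 2k budget needs.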
## Key Abstractions

### Knowledge Item (fact/pitfall/pattern/quirk/question)

```json
{
  "fact": "Gitea token is at ~/.config/gitea/token",
  "category": "tool-quirk",
  "repo": "global",
  "confidence": 0.9,
  "evidence": "Found during clone attempt",
  "source_session": "2026-04-13_abc123",
  "extracted_at": "2026-04-13T20:00:00Z"
}
```
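Step 5 of the data flow relies on `fact_fingerprint()` for dedup. A minimal sketch of that idea follows; the normalization scheme (lowercasing, whitespace collapse) and hash truncation are assumptions, not the repo's actual algorithm.

```python
import hashlib

def fact_fingerprint(item):
    """Stable fingerprint for dedup; normalization scheme is an assumption."""
    normalized = " ".join(item["fact"].lower().split())
    key = f"{item.get('domain', item.get('repo', 'global'))}|{normalized}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]

def deduplicate(new_items, existing_items):
    """Drop items whose fingerprint already exists in the store."""
    seen = {fact_fingerprint(i) for i in existing_items}
    unique = []
    for item in new_items:
        fp = fact_fingerprint(item)
        if fp not in seen:
            seen.add(fp)
            unique.append(item)
    return unique
```

Normalizing before hashing is what makes two LLM extractions of the same fact (differing only in case or spacing) collapse to one entry.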
### SessionSummary (session_metadata.py)

Extracted metadata per session: duration, token count, tools used, repos touched, error count, outcome.

### Gap / GapReport (knowledge_gap_identifier.py)

Structured gap analysis: untested functions, undocumented APIs, missing tests. Severity: critical/high/medium/low.

### Knowledge Index (knowledge/index.json)

Machine-readable fact store. 12KB, populated with real data. Categories: fact, pitfall, pattern, tool-quirk, question.
## Knowledge Store

```
knowledge/
├── index.json              # Master fact store (12KB, populated)
├── SCHEMA.md               # Schema documentation
├── global/
│   ├── pitfalls.yaml       # Cross-repo pitfalls (2KB)
│   └── tool-quirks.yaml    # Tool-specific quirks (2KB)
├── repos/
│   ├── hermes-agent.yaml   # hermes-agent knowledge (2KB)
│   └── the-nexus.yaml      # the-nexus knowledge (2KB)
└── agents/                 # Per-agent knowledge (empty)
```
## API Surface

### LLM API (consumed)

| Provider | Endpoint | Usage |
|----------|----------|-------|
| Nous Research | `https://inference-api.nousresearch.com/v1` | Knowledge extraction |
| Ollama | `http://localhost:11434/v1` | Local fallback |

### File API (consumed/produced)

| Path | Format | Direction |
|------|--------|-----------|
| `~/.hermes/sessions/*.jsonl` | JSONL | Input (session transcripts) |
| `knowledge/index.json` | JSON | Output (master fact store) |
| `knowledge/global/*.yaml` | YAML | Output (cross-repo knowledge) |
| `knowledge/repos/*.yaml` | YAML | Output (per-repo knowledge) |
| `templates/harvest-prompt.md` | Markdown | Config (extraction prompt) |
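Both LLM endpoints are `/v1` APIs, so a primary-then-fallback chain can be sketched with the standard library alone. The request shape follows the common OpenAI-style chat-completions convention; the Bearer header, payload fields, and fallback order here are assumptions, not the repo's verified `call_llm()` implementation (the default model name comes from the knowledge store above).

```python
import json
import urllib.request

# (endpoint, key) pairs, tried in order; key placeholder is illustrative
PROVIDERS = [
    ("https://inference-api.nousresearch.com/v1", "API_KEY_HERE"),  # primary
    ("http://localhost:11434/v1", "ollama"),                        # local fallback
]

def build_chat_request(base_url, api_key, prompt, model="mimo-v2-pro"):
    """Build an OpenAI-style /chat/completions request (shape assumed)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def call_llm(prompt):
    """Try each provider in order; fall back to the next on failure."""
    for base_url, key in PROVIDERS:
        try:
            req = build_chat_request(base_url, key, prompt)
            with urllib.request.urlopen(req, timeout=60) as resp:
                data = json.load(resp)
            return data["choices"][0]["message"]["content"]
        except Exception:
            continue  # provider unreachable; try the fallback
    raise RuntimeError("no LLM provider reachable")
```

Keeping request construction separate from the network call makes the payload testable without any provider running.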
## Test Coverage

**14 test files** covering core pipelines:

| Test File | Covers |
|-----------|--------|
| `test_harvest_prompt.py` | Prompt validation, hallucination detection |
| `test_harvest_prompt_comprehensive.py` | Extended prompt testing |
| `test_harvester_pipeline.py` | Harvester extraction + dedup |
| `test_bootstrapper.py` | Context building, filtering, truncation |
| `test_session_pair_harvester.py` | Training pair extraction |
| `test_improvement_proposals.py` | Proposal generation |
| `test_priority_rebalancer.py` | Priority scoring |
| `test_knowledge_staleness.py` | Staleness detection |
| `test_automation_opportunity_finder.py` | Automation detection |
| `test_diff_analyzer.py` | Diff analysis |
| `test_gitea_issue_parser.py` | Issue parsing |
| `test_refactoring_opportunity_finder.py` | Refactoring signals |
| `test_knowledge_gap_identifier.py` | Gap analysis |
| `test_perf_bottleneck_finder.py` | Perf bottleneck detection |
### Coverage Gaps

1. **session_reader.py** — No dedicated test file (tested indirectly)
2. **sampler.py** — No test file (scoring logic untested)
3. **session_metadata.py** — No test file
4. **validate_knowledge.py** — No test file
5. **knowledge_staleness_check.py** — Tested, but coverage is limited
## Security Considerations

### API Key Handling
- `harvester.py` reads API key from `~/.hermes/auth.json` or env vars
- Key passed to LLM API in request headers only
- No key logging

### Knowledge Integrity
- `validate_fact()` checks schema before writing
- `deduplicate()` prevents duplicate entries via fingerprint
- `knowledge_staleness_check.py` detects when source code changed but knowledge didn't
- Confidence scores prevent low-quality knowledge from polluting the store

### File Safety
- Knowledge writes are append-only (never deletes)
- Bootstrap context is truncated to budget (no prompt injection via knowledge)
- Session reader handles malformed JSONL gracefully
## File Index

```
scripts/
  harvester.py (473 lines)                      — Core knowledge extraction
  bootstrapper.py (302 lines)                   — Pre-session context builder
  session_reader.py (137 lines)                 — JSONL session parser
  sampler.py (363 lines)                        — Session scoring + ranking
  session_metadata.py (271 lines)               — Session metadata extraction
  validate_knowledge.py (44 lines)              — Index validation
  knowledge_staleness_check.py (125 lines)      — Staleness detection
  knowledge_gap_identifier.py (291 lines)       — Gap analysis engine
  diff_analyzer.py (203 lines)                  — Diff analysis
  improvement_proposals.py (518 lines)          — Proposal generation
  priority_rebalancer.py (745 lines)            — Priority scoring
  automation_opportunity_finder.py (600 lines)  — Automation detection
  dead_code_detector.py (270 lines)             — Dead code detection
  dependency_graph.py (220 lines)               — Dependency mapping
  perf_bottleneck_finder.py (635 lines)         — Perf analysis
  refactoring_opportunity_finder.py (46 lines)  — Refactoring signals
  gitea_issue_parser.py (140 lines)             — Gitea issue parsing
  session_pair_harvester.py (224 lines)         — Training pair extraction
knowledge/
  index.json (12KB)                             — Master fact store
  SCHEMA.md (3KB)                               — Schema docs
  global/pitfalls.yaml (2KB)                    — Cross-repo pitfalls
  global/tool-quirks.yaml (2KB)                 — Tool quirks
  repos/hermes-agent.yaml (2KB)                 — Repo-specific knowledge
  repos/the-nexus.yaml (2KB)                    — Repo-specific knowledge
templates/
  harvest-prompt.md (4KB)                       — Extraction prompt
test_sessions/ (5 files)                        — Sample transcripts
tests/ + scripts/test_* (14 files)              — Test suite
```

**Total:** ~6,500 lines of code across 18 scripts + 14 test files.

---

*Generated by Codebase Genome pipeline — Issue #676*
4 Makefile
@@ -1,4 +0,0 @@
.PHONY: test

test:
	python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v
@@ -1,114 +0,0 @@
# Knowledge File Format Specification

**Version:** 1
**Issue:** #10
**Status:** Draft

---

## Overview

The knowledge system has two layers:

1. **index.json** — Machine-readable fact index. Fast lookups by ID, category, repo, tags.
2. **Knowledge files** (YAML) — Human-readable, editable facts organized by domain.

The harvester writes to both. The bootstrapper reads from index.json. Humans edit the YAML files directly.

---
## index.json Schema

```json
{
  "version": 1,
  "last_updated": "ISO-8601 timestamp",
  "total_facts": 0,
  "facts": []
}
```

### Fact Object

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | yes | Unique identifier: `{domain}:{category}:{sequence}` |
| `fact` | string | yes | One-sentence description of the knowledge |
| `category` | enum | yes | One of: `fact`, `pitfall`, `pattern`, `tool-quirk`, `question` |
| `domain` | string | yes | Where this applies: repo name, `global`, or agent name |
| `confidence` | float | yes | 0.0–1.0. How certain is this knowledge? |
| `tags` | string[] | no | Searchable labels |
| `source_count` | int | no | How many sessions confirmed this fact |
| `first_seen` | date | no | ISO-8601 date first extracted |
| `last_confirmed` | date | no | ISO-8601 date last seen in a session |
| `expires` | date | no | Optional. After this date, the fact is stale |
| `related` | string[] | no | IDs of related facts |
### ID Format: `{domain}:{category}:{sequence}`

### Categories

| Category | Definition |
|----------|------------|
| `fact` | Concrete, verifiable information |
| `pitfall` | Errors, wrong assumptions, time-wasters |
| `pattern` | Successful sequences of actions |
| `tool-quirk` | Environment-specific behaviors |
| `question` | Identified but unanswered |

### Confidence Scoring

| Range | Meaning |
|-------|---------|
| 0.9–1.0 | Explicitly stated and verified |
| 0.7–0.8 | Clearly implied by multiple data points |
| 0.5–0.6 | Suggested but not fully verified |
| 0.3–0.4 | Inferred from limited data |
| 0.1–0.2 | Speculative or uncertain |

---
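Since every consumer of the store needs to split the `{domain}:{category}:{sequence}` ID, a small parser is worth sketching. `parse_fact_id` is an illustrative helper, not a function from the repo; splitting from the right is a defensive assumption, as the spec doesn't say whether a domain may contain colons.

```python
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def parse_fact_id(fact_id):
    """Split '{domain}:{category}:{sequence}' into its three parts."""
    # rsplit so a domain containing ':' would still parse (defensive choice)
    domain, category, sequence = fact_id.rsplit(":", 2)
    if category not in VALID_CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return {"domain": domain, "category": category, "sequence": int(sequence)}
```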
## Directory Structure

```
knowledge/
├── index.json          # Machine-readable fact index
├── SCHEMA.md           # This file
├── global/             # Cross-repo knowledge
│   ├── pitfalls.yaml
│   ├── patterns.yaml
│   └── tool-quirks.yaml
├── repos/              # Per-repo knowledge
│   ├── {repo-name}.yaml
│   └── ...
└── agents/             # Agent-type knowledge
    └── {agent-type}.yaml
```
## YAML File Format

YAML files use frontmatter for metadata, then markdown sections with fact entries:

```yaml
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---

# Title

## Section

- id: global:tool-quirk:001
  fact: "Description"
  confidence: 0.95
  tags: [tag1, tag2]
  source_count: 5
  first_seen: "2026-03-27"
```
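A hypothetical loader for this file format: splitting the frontmatter from the body is pure string work, after which any YAML library can parse the two halves. `split_frontmatter` is an illustrative helper, not a function from the repo.

```python
def split_frontmatter(text):
    """Split a knowledge file into (frontmatter, body).

    Files open with a '---' ... '---' frontmatter block as shown above;
    parsing the YAML inside each half is left to a YAML library.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # unterminated frontmatter: treat whole file as body
```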
## Validation

Run `python scripts/validate_knowledge.py` to validate index.json.
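A minimal sketch of the kind of per-fact check such a validator performs, derived from the Fact Object table above; the error-message strings and the exact `validate_fact` signature are assumptions.

```python
REQUIRED = {"id": str, "fact": str, "category": str,
            "domain": str, "confidence": float}
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_fact(fact):
    """Return a list of schema violations for one fact object (empty = valid)."""
    errors = []
    for field, ftype in REQUIRED.items():
        if field not in fact:
            errors.append(f"missing required field: {field}")
        elif not isinstance(fact[field], ftype):
            errors.append(f"{field}: expected {ftype.__name__}")
    if fact.get("category") not in CATEGORIES:
        errors.append(f"invalid category: {fact.get('category')}")
    conf = fact.get("confidence")
    if isinstance(conf, float) and not 0.0 <= conf <= 1.0:
        errors.append("confidence out of range 0.0-1.0")
    return errors
```

Returning a list instead of raising lets the validator report every violation in the index in one pass.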
@@ -1,80 +0,0 @@
---
domain: global
category: pitfall
version: 1
last_updated: "2026-04-13"
---

# Pitfalls (Global)

Cross-repo traps that waste time across the fleet.

## Git & Forge

- id: global:pitfall:001
  fact: "Branch protection requires 1 approval on main - API merges fail with 405 without it"
  confidence: 0.95
  tags: [git, merge, branch-protection, gitea]
  source_count: 12
  first_seen: "2026-04-05"
  last_confirmed: "2026-04-13"
  related: [the-nexus:pitfall:001]

- id: global:pitfall:002
  fact: "Never use --no-verify on git commits - it bypasses all hooks including safety checks"
  confidence: 0.95
  tags: [git, hooks, safety]
  source_count: 5
  first_seen: "2026-03-28"
  last_confirmed: "2026-04-13"

- id: global:pitfall:003
  fact: "Gitea PR creation workaround needed on the-nexus - direct API call fails, use alternative endpoint"
  confidence: 0.9
  tags: [gitea, pr, api, workaround]
  source_count: 4
  first_seen: "2026-04-06"
  last_confirmed: "2026-04-12"

## Agent Operations

- id: global:pitfall:004
  fact: "Anthropic is BANNED from fallback chain - if fallback triggers to Anthropic, something is wrong"
  confidence: 0.95
  tags: [provider, anthropic, fallback]
  source_count: 7
  first_seen: "2026-03-30"
  last_confirmed: "2026-04-13"

- id: global:pitfall:005
  fact: "Telegram tokens expired - don't assume Telegram notifications work without checking"
  confidence: 0.85
  tags: [telegram, notifications, token]
  source_count: 3
  first_seen: "2026-04-02"

- id: global:pitfall:006
  fact: "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run"
  confidence: 0.9
  tags: [gateway, cron, process]
  source_count: 4
  first_seen: "2026-04-04"
  last_confirmed: "2026-04-11"

## Testing

- id: global:pitfall:007
  fact: "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory"
  confidence: 0.9
  tags: [pytest, test, collection]
  source_count: 3
  first_seen: "2026-04-07"
  last_confirmed: "2026-04-13"

- id: global:pitfall:008
  fact: "TDD: test 1 before building 55 - verify the cycle works before scaling"
  confidence: 0.95
  tags: [tdd, testing, methodology]
  source_count: 8
  first_seen: "2026-03-25"
  last_confirmed: "2026-04-13"
@@ -1,71 +0,0 @@
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---

# Tool Quirks (Global)

## Authentication

- id: global:tool-quirk:001
  fact: "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN"
  confidence: 0.95
  tags: [git, auth, gitea, token]
  source_count: 23
  first_seen: "2026-03-27"
  last_confirmed: "2026-04-13"
  related: [global:pitfall:001]

- id: global:tool-quirk:002
  fact: "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer"
  confidence: 0.9
  tags: [git, api, gitea]
  source_count: 8
  first_seen: "2026-03-28"
  last_confirmed: "2026-04-12"

- id: global:tool-quirk:003
  fact: "Gitea Issues API type=issues param does NOT filter PRs - use truthiness check on pull_request field"
  confidence: 0.95
  tags: [gitea, api, issues, pr]
  source_count: 6
  first_seen: "2026-04-01"
  last_confirmed: "2026-04-13"

## Paths & Environment

- id: global:tool-quirk:004
  fact: "~/.hermes is the default hermes home - check get_hermes_home() not the path literal"
  confidence: 0.9
  tags: [paths, hermes, env]
  source_count: 10
  first_seen: "2026-03-30"
  last_confirmed: "2026-04-13"
  related: [hermes-agent:pitfall:005]

- id: global:tool-quirk:005
  fact: "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin"
  confidence: 0.85
  tags: [ansible, vault, config]
  source_count: 3
  first_seen: "2026-04-02"

## Model & Inference

- id: global:tool-quirk:006
  fact: "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available"
  confidence: 0.95
  tags: [model, provider, nous, default]
  source_count: 15
  first_seen: "2026-03-25"
  last_confirmed: "2026-04-13"

- id: global:tool-quirk:007
  fact: "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume"
  confidence: 0.95
  tags: [hermes, model, restart, session]
  source_count: 8
  first_seen: "2026-03-29"
  last_confirmed: "2026-04-12"
@@ -1,472 +1,6 @@
|
||||
{
|
||||
"version": 1,
|
||||
"last_updated": "2026-04-13T20:00:00Z",
|
||||
"total_facts": 29,
|
||||
"facts": [
|
||||
{
|
||||
"id": "hermes-agent:pitfall:001",
|
||||
"fact": "deploy-crons.py leaves jobs in mixed model format",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.95,
|
||||
"tags": [
|
||||
"cron",
|
||||
"deploy",
|
||||
"model",
|
||||
"config"
|
||||
],
|
||||
"source_count": 5,
|
||||
"first_seen": "2026-04-08",
|
||||
"last_confirmed": "2026-04-13",
|
||||
"related": [
|
||||
"hermes-agent:pitfall:002",
|
||||
"hermes-agent:pitfall:003"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:002",
|
||||
"fact": "deploy-crons.py --deploy doesn't set legacy skill field from skills list",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"cron",
|
||||
"deploy",
|
||||
"skills"
|
||||
],
|
||||
"source_count": 3,
|
||||
"first_seen": "2026-04-09",
|
||||
"last_confirmed": "2026-04-13",
|
||||
"related": [
|
||||
"hermes-agent:pitfall:001"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:003",
|
||||
"fact": "Cron jobs with blank fallback_model fields trigger spurious gateway warnings",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"cron",
|
||||
"model",
|
||||
"fallback"
|
||||
],
|
||||
"source_count": 4,
|
||||
"first_seen": "2026-04-07",
|
||||
"last_confirmed": "2026-04-12",
|
||||
"related": [
|
||||
"hermes-agent:pitfall:001"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:004",
|
||||
"fact": "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"watchdog",
|
||||
"model",
|
||||
"config"
|
||||
],
|
||||
"source_count": 3,
|
||||
"first_seen": "2026-04-08",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:005",
|
||||
"fact": "10+ files read HERMES_HOME directly instead of get_hermes_home()",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.85,
|
||||
"tags": [
|
||||
"paths",
|
||||
"env",
|
||||
"hermes-home"
|
||||
],
|
||||
"source_count": 6,
|
||||
"first_seen": "2026-04-06",
|
||||
"last_confirmed": "2026-04-12",
|
||||
"related": [
|
||||
"global:pitfall:002"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:006",
|
||||
"fact": "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.8,
|
||||
"tags": [
|
||||
"paths",
|
||||
"env",
|
||||
"bug"
|
||||
],
|
||||
"source_count": 2,
|
||||
"first_seen": "2026-04-05"
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:007",
|
||||
"fact": "vps-agent-dispatch reports OK while remote hermes binary path is broken",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"ssh",
|
||||
"dispatch",
|
||||
"vps"
|
||||
],
|
||||
"source_count": 4,
|
||||
"first_seen": "2026-04-07",
|
||||
"last_confirmed": "2026-04-11"
|
||||
},
|
||||
{
|
||||
"id": "hermes-agent:pitfall:008",
|
||||
"fact": "nightwatch-health-monitor SSH check fails on cloud-model-only deployments",
|
||||
"category": "pitfall",
|
||||
"domain": "hermes-agent",
|
||||
"confidence": 0.85,
|
||||
"tags": [
|
||||
"ssh",
|
||||
"health",
|
||||
"cloud"
|
||||
],
|
||||
"source_count": 2,
|
||||
"first_seen": "2026-04-10"
|
||||
},
|
||||
{
|
||||
"id": "the-nexus:pitfall:001",
|
||||
"fact": "Merges fail with HTTP 405 due to branch protection",
|
||||
"category": "pitfall",
|
||||
"domain": "the-nexus",
|
||||
"confidence": 0.95,
|
||||
"tags": [
|
||||
"git",
|
||||
"merge",
|
||||
"branch-protection",
|
||||
"gitea"
|
||||
],
|
||||
"source_count": 12,
|
||||
"first_seen": "2026-04-05",
|
||||
"last_confirmed": "2026-04-13",
|
||||
"related": [
|
||||
"global:pitfall:001"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "the-nexus:pitfall:002",
|
||||
"fact": "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests",
|
||||
"category": "pitfall",
|
||||
"domain": "the-nexus",
|
||||
"confidence": 0.95,
|
||||
"tags": [
|
||||
"server",
|
||||
"concurrency",
|
||||
"bridge"
|
||||
],
|
||||
"source_count": 5,
|
||||
"first_seen": "2026-04-10",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "the-nexus:pitfall:003",
|
||||
"fact": "ChatLog.log() crashes on message persistence when index.html has orphaned button tags",
|
||||
"category": "pitfall",
|
||||
"domain": "the-nexus",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"html",
|
||||
"crash",
|
||||
"chatlog"
|
||||
],
|
||||
"source_count": 3,
|
||||
"first_seen": "2026-04-12",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "the-nexus:pitfall:004",
|
||||
"fact": "Three.js LOD not implemented - local hardware struggles with full scene",
|
||||
"category": "pitfall",
|
||||
"domain": "the-nexus",
|
||||
"confidence": 0.85,
|
||||
"tags": [
|
||||
"threejs",
|
||||
"performance",
|
||||
"lod"
|
||||
],
|
||||
"source_count": 4,
|
||||
"first_seen": "2026-04-09",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "the-nexus:pitfall:005",
|
||||
"fact": "Duplicate content blocks appear in index.html when PR merges conflict silently",
|
||||
"category": "pitfall",
|
||||
"domain": "the-nexus",
|
||||
"confidence": 0.8,
|
||||
"tags": [
|
||||
"html",
|
||||
"merge-conflict",
|
||||
"duplicate"
|
||||
],
|
||||
"source_count": 3,
|
||||
"first_seen": "2026-04-11",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "the-nexus:pitfall:006",
|
||||
"fact": "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS",
|
||||
"category": "pitfall",
|
||||
"domain": "the-nexus",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"deploy",
|
||||
"websocket",
|
||||
"http",
|
||||
"cors"
|
||||
],
|
||||
"source_count": 4,
|
||||
"first_seen": "2026-04-10",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "global:tool-quirk:001",
|
||||
"fact": "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN",
|
||||
"category": "tool-quirk",
|
||||
"domain": "global",
|
||||
"confidence": 0.95,
|
||||
"tags": [
|
||||
"git",
|
||||
"auth",
|
||||
"gitea",
|
||||
"token"
|
||||
],
|
||||
"source_count": 23,
|
||||
"first_seen": "2026-03-27",
|
||||
"last_confirmed": "2026-04-13",
|
||||
"related": [
|
||||
"global:pitfall:001"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "global:tool-quirk:002",
|
||||
"fact": "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer",
|
||||
"category": "tool-quirk",
|
||||
"domain": "global",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"git",
|
||||
"api",
|
||||
"gitea"
|
||||
],
|
||||
"source_count": 8,
|
||||
"first_seen": "2026-03-28",
|
||||
"last_confirmed": "2026-04-12"
|
||||
},
|
||||
{
|
||||
"id": "global:tool-quirk:003",
|
||||
"fact": "Gitea Issues API type=issues param does NOT filter PRs",
|
||||
"category": "tool-quirk",
|
||||
"domain": "global",
|
||||
"confidence": 0.95,
|
||||
"tags": [
|
||||
"gitea",
|
||||
"api",
|
||||
"issues",
|
||||
"pr"
|
||||
],
|
||||
"source_count": 6,
|
||||
"first_seen": "2026-04-01",
|
||||
"last_confirmed": "2026-04-13"
|
||||
},
|
||||
{
|
||||
"id": "global:tool-quirk:004",
|
||||
"fact": "~/.hermes is the default hermes home - check get_hermes_home() not the path literal",
|
||||
"category": "tool-quirk",
|
||||
"domain": "global",
|
||||
"confidence": 0.9,
|
||||
"tags": [
|
||||
"paths",
|
||||
"hermes",
|
||||
"env"
|
||||
],
|
||||
"source_count": 10,
|
||||
"first_seen": "2026-03-30",
|
||||
"last_confirmed": "2026-04-13",
|
||||
"related": [
|
||||
"hermes-agent:pitfall:005"
|
||||
]
|
||||
},
|
||||
{
|
||||
"id": "global:tool-quirk:005",
|
||||
"fact": "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin",
|
||||
"category": "tool-quirk",
|
||||
"domain": "global",
|
||||
"confidence": 0.85,
|
||||
"tags": [
|
||||
"ansible",
|
||||
"vault",
|
||||
"config"
|
||||
],
|
||||
"source_count": 3,
|
||||
"first_seen": "2026-04-02"
|
||||
},
|
||||
{
|
||||
"id": "global:tool-quirk:006",
|
||||
"fact": "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available",
|
||||
"category": "tool-quirk",
|
||||
"domain": "global",
|
||||
"confidence": 0.95,
|
||||
"tags": [
|
||||
"model",
|
||||
"provider",
|
||||
"nous",
|
||||
"default"
|
||||
],
|
||||
"source_count": 15,
|
||||
"first_seen": "2026-03-25",
|
||||
    "last_confirmed": "2026-04-13"
  },
  {
    "id": "global:tool-quirk:007",
    "fact": "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume",
    "category": "tool-quirk",
    "domain": "global",
    "confidence": 0.95,
    "tags": [
      "hermes",
      "model",
      "restart",
      "session"
    ],
    "source_count": 8,
    "first_seen": "2026-03-29",
    "last_confirmed": "2026-04-12"
  },
  {
    "id": "global:pitfall:001",
    "fact": "Branch protection requires 1 approval on main - API merges fail with 405 without it",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.95,
    "tags": [
      "git",
      "merge",
      "branch-protection",
      "gitea"
    ],
    "source_count": 12,
    "first_seen": "2026-04-05",
    "last_confirmed": "2026-04-13",
    "related": [
      "the-nexus:pitfall:001"
    ]
  },
  {
    "id": "global:pitfall:002",
    "fact": "Never use --no-verify on git commits",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.95,
    "tags": [
      "git",
      "hooks",
      "safety"
    ],
    "source_count": 5,
    "first_seen": "2026-03-28",
    "last_confirmed": "2026-04-13"
  },
  {
    "id": "global:pitfall:003",
    "fact": "Gitea PR creation workaround needed on the-nexus - direct API call fails",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.9,
    "tags": [
      "gitea",
      "pr",
      "api",
      "workaround"
    ],
    "source_count": 4,
    "first_seen": "2026-04-06",
    "last_confirmed": "2026-04-12"
  },
  {
    "id": "global:pitfall:004",
    "fact": "Anthropic is BANNED from fallback chain",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.95,
    "tags": [
      "provider",
      "anthropic",
      "fallback"
    ],
    "source_count": 7,
    "first_seen": "2026-03-30",
    "last_confirmed": "2026-04-13"
  },
  {
    "id": "global:pitfall:005",
    "fact": "Telegram tokens expired - don't assume Telegram notifications work",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.85,
    "tags": [
      "telegram",
      "notifications",
      "token"
    ],
    "source_count": 3,
    "first_seen": "2026-04-02"
  },
  {
    "id": "global:pitfall:006",
    "fact": "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.9,
    "tags": [
      "gateway",
      "cron",
      "process"
    ],
    "source_count": 4,
    "first_seen": "2026-04-04",
    "last_confirmed": "2026-04-11"
  },
  {
    "id": "global:pitfall:007",
    "fact": "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.9,
    "tags": [
      "pytest",
      "test",
      "collection"
    ],
    "source_count": 3,
    "first_seen": "2026-04-07",
    "last_confirmed": "2026-04-13"
  },
  {
    "id": "global:pitfall:008",
    "fact": "TDD: test 1 before building 55",
    "category": "pitfall",
    "domain": "global",
    "confidence": 0.95,
    "tags": [
      "tdd",
      "testing",
      "methodology"
    ],
    "source_count": 8,
    "first_seen": "2026-03-25",
    "last_confirmed": "2026-04-13"
  }
  ],
  "total_facts": 0,
  "facts": []
}
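A consumer of this index (the bootstrapper, say) only needs a few lines to pull relevant facts back out. A minimal sketch, assuming the fact-entry shape above (`id`, `tags`, `confidence`); `facts_by_tag` is a hypothetical helper, not part of the repo:

```python
def facts_by_tag(index: dict, tag: str, min_confidence: float = 0.8) -> list:
    """Return facts carrying `tag` at or above a confidence floor."""
    return [
        f for f in index.get("facts", [])
        if tag in f.get("tags", []) and f.get("confidence", 0.0) >= min_confidence
    ]

index = {"facts": [
    {"id": "global:pitfall:001", "tags": ["git", "merge"], "confidence": 0.95},
    {"id": "global:pitfall:005", "tags": ["telegram"], "confidence": 0.85},
]}
print([f["id"] for f in facts_by_tag(index, "git")])  # → ['global:pitfall:001']
```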
@@ -1,80 +0,0 @@
---
domain: hermes-agent
category: pitfall
version: 1
last_updated: "2026-04-13"
---

# Pitfalls (hermes-agent)

## Cron & Deployment

- id: hermes-agent:pitfall:001
  fact: "deploy-crons.py leaves jobs in mixed model format - some have provider/model, some just model"
  confidence: 0.95
  tags: [cron, deploy, model, config]
  source_count: 5
  first_seen: "2026-04-08"
  last_confirmed: "2026-04-13"
  related: [hermes-agent:pitfall:002, hermes-agent:pitfall:003]

- id: hermes-agent:pitfall:002
  fact: "deploy-crons.py --deploy doesn't set legacy skill field from skills list"
  confidence: 0.9
  tags: [cron, deploy, skills]
  source_count: 3
  first_seen: "2026-04-09"
  last_confirmed: "2026-04-13"
  related: [hermes-agent:pitfall:001]

- id: hermes-agent:pitfall:003
  fact: "Cron jobs with blank fallback_model fields trigger spurious gateway warnings"
  confidence: 0.9
  tags: [cron, model, fallback]
  source_count: 4
  first_seen: "2026-04-07"
  last_confirmed: "2026-04-12"
  related: [hermes-agent:pitfall:001]

- id: hermes-agent:pitfall:004
  fact: "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms"
  confidence: 0.9
  tags: [watchdog, model, config]
  source_count: 3
  first_seen: "2026-04-08"
  last_confirmed: "2026-04-13"

## Path & Environment

- id: hermes-agent:pitfall:005
  fact: "10+ files read HERMES_HOME directly instead of get_hermes_home() - breaks on custom paths"
  confidence: 0.85
  tags: [paths, env, hermes-home]
  source_count: 6
  first_seen: "2026-04-06"
  last_confirmed: "2026-04-12"
  related: [global:pitfall:002]

- id: hermes-agent:pitfall:006
  fact: "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set"
  confidence: 0.8
  tags: [paths, env, bug]
  source_count: 2
  first_seen: "2026-04-05"

## SSH & Dispatch

- id: hermes-agent:pitfall:007
  fact: "vps-agent-dispatch reports OK while remote hermes binary path is broken"
  confidence: 0.9
  tags: [ssh, dispatch, vps]
  source_count: 4
  first_seen: "2026-04-07"
  last_confirmed: "2026-04-11"

- id: hermes-agent:pitfall:008
  fact: "nightwatch-health-monitor SSH check fails on cloud-model-only deployments"
  confidence: 0.85
  tags: [ssh, health, cloud]
  source_count: 2
  first_seen: "2026-04-10"
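Pitfalls 005 and 006 above both trace back to path handling, and the tilde bug in 006 has a one-line fix. A sketch of what a corrected helper could look like (the real `get_hermes_home()` lives elsewhere in the repo; its exact signature here is an assumption):

```python
import os

def get_hermes_home(default: str = "~/.hermes") -> str:
    # HERMES_HOME=~/... arrives unexpanded unless the shell already expanded it,
    # so expanduser() must be applied here (pitfall 006 above).
    raw = os.environ.get("HERMES_HOME", default)
    return os.path.expanduser(raw)

os.environ["HERMES_HOME"] = "~/custom-hermes"
print("~" in get_hermes_home())  # → False
```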
@@ -1,63 +0,0 @@
---
domain: the-nexus
category: pitfall
version: 1
last_updated: "2026-04-13"
---

# Pitfalls (the-nexus)

## Git & Merging

- id: the-nexus:pitfall:001
  fact: "Merges fail with HTTP 405 due to branch protection - must use merge API with 1 approval"
  confidence: 0.95
  tags: [git, merge, branch-protection, gitea]
  source_count: 12
  first_seen: "2026-04-05"
  last_confirmed: "2026-04-13"
  related: [global:pitfall:001]

- id: the-nexus:pitfall:002
  fact: "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests"
  confidence: 0.95
  tags: [server, concurrency, bridge]
  source_count: 5
  first_seen: "2026-04-10"
  last_confirmed: "2026-04-13"

- id: the-nexus:pitfall:003
  fact: "ChatLog.log() crashes on message persistence when index.html has orphaned button tags"
  confidence: 0.9
  tags: [html, crash, chatlog]
  source_count: 3
  first_seen: "2026-04-12"
  last_confirmed: "2026-04-13"

## Three.js & Performance

- id: the-nexus:pitfall:004
  fact: "Three.js LOD not implemented - local hardware struggles with full scene without texture optimization"
  confidence: 0.85
  tags: [threejs, performance, lod]
  source_count: 4
  first_seen: "2026-04-09"
  last_confirmed: "2026-04-13"

- id: the-nexus:pitfall:005
  fact: "Duplicate content blocks appear in index.html when PR merges conflict silently"
  confidence: 0.8
  tags: [html, merge-conflict, duplicate]
  source_count: 3
  first_seen: "2026-04-11"
  last_confirmed: "2026-04-13"

## Deployment

- id: the-nexus:pitfall:006
  fact: "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS"
  confidence: 0.9
  tags: [deploy, websocket, http, cors]
  source_count: 4
  first_seen: "2026-04-10"
  last_confirmed: "2026-04-13"
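The fix for pitfall 002 is a one-class swap in the standard library. A minimal sketch (not the actual bridge code) showing the threaded server:

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

# ThreadingHTTPServer services each request on its own thread, so one slow
# client no longer blocks every other connection (the plain HTTPServer
# failure mode described above). Port 0 lets the OS pick a free port.
server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
print(server.server_address[0])  # → 127.0.0.1
server.server_close()
```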
297
quality_gate.py
@@ -1,297 +0,0 @@
#!/usr/bin/env python3
"""
quality_gate.py — Score and filter knowledge entries.

Scores each entry on 4 dimensions:
- Specificity: concrete examples vs vague generalities
- Actionability: can this be used to do something?
- Freshness: is this still accurate?
- Source quality: was the model/provider reliable?

Usage:
    from quality_gate import score_entry, filter_entries, quality_report

    score = score_entry(entry)
    filtered = filter_entries(entries, threshold=0.5)
    report = quality_report(entries)
"""

import json
import math
import re
from datetime import datetime, timezone
from typing import List, Optional

# Source quality scores (higher = more reliable)
SOURCE_QUALITY = {
    "claude-sonnet": 0.9,
    "claude-opus": 0.95,
    "gpt-4": 0.85,
    "gpt-4-turbo": 0.85,
    "gpt-5": 0.9,
    "mimo-v2-pro": 0.8,
    "gemini-pro": 0.8,
    "llama-3-70b": 0.75,
    "llama-3-8b": 0.7,
    "ollama": 0.6,
    "unknown": 0.5,
}

DEFAULT_SOURCE_QUALITY = 0.5

# Specificity indicators
SPECIFIC_INDICATORS = [
    r"\b\d+\.\d+",                  # decimal numbers
    r"\b\d{4}-\d{2}-\d{2}",         # dates
    r"\b[A-Z][a-z]+\s[A-Z][a-z]+",  # proper nouns
    r"`[^`]+`",                     # code/commands
    r"https?://",                   # URLs
    r"\b(example|instance|specifically|concretely)\b",
    r"\b(step \d|first|second|third)\b",
    r"\b(exactly|precisely|measured|counted)\b",
]

# Vagueness indicators (penalty)
VAGUE_INDICATORS = [
    r"\b(generally|usually|often|sometimes|might|could|perhaps)\b",
    r"\b(various|several|many|some|few)\b",
    r"\b(it depends|varies|differs)\b",
    r"\b(basically|essentially|fundamentally)\b",
    r"\b(everyone knows|it's obvious|clearly)\b",
]

# Actionability indicators
ACTIONABLE_INDICATORS = [
    r"\b(run|execute|install|deploy|configure|set up)\b",
    r"\b(use|apply|implement|create|build)\b",
    r"\b(check|verify|test|validate|confirm)\b",
    r"\b(fix|resolve|solve|debug|troubleshoot)\b",
    r"\b(if .+ then|when .+ do|to .+ use)\b",
    r"```[a-z]*\n",  # code blocks
    r"\$\s",         # shell commands
    r"\b\d+\.\s",    # numbered steps
]


def score_specificity(content: str) -> float:
    """Score specificity: 0=vague, 1=very specific."""
    content_lower = content.lower()
    score = 0.5  # baseline

    # Count specific indicators
    specific_count = sum(
        len(re.findall(p, content, re.IGNORECASE))
        for p in SPECIFIC_INDICATORS
    )

    # Count vague indicators
    vague_count = sum(
        len(re.findall(p, content_lower))
        for p in VAGUE_INDICATORS
    )

    # Adjust score
    score += min(specific_count * 0.05, 0.4)
    score -= min(vague_count * 0.08, 0.3)

    # Length bonus (longer = more detail, up to a point)
    word_count = len(content.split())
    if word_count > 50:
        score += min((word_count - 50) * 0.001, 0.1)

    return max(0.0, min(1.0, score))


def score_actionability(content: str) -> float:
    """Score actionability: 0=abstract, 1=highly actionable."""
    content_lower = content.lower()
    score = 0.3  # baseline (most knowledge is informational)

    # Count actionable indicators
    actionable_count = sum(
        len(re.findall(p, content_lower))
        for p in ACTIONABLE_INDICATORS
    )

    score += min(actionable_count * 0.1, 0.6)

    # Code blocks are highly actionable
    if "```" in content:
        score += 0.2

    # Numbered steps are actionable
    if re.search(r"\d+\.\s+\w", content):
        score += 0.1

    return max(0.0, min(1.0, score))


def score_freshness(timestamp: Optional[str]) -> float:
    """Score freshness: 1=new, decays over time."""
    if not timestamp:
        return 0.5

    try:
        if isinstance(timestamp, str):
            ts = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        else:
            ts = timestamp

        now = datetime.now(timezone.utc)
        age_days = (now - ts).days

        # Exponential decay: 1.0 at day 0, ~0.37 at 180 days, ~0.13 at 365 days (floored at 0.1)
        score = math.exp(-age_days / 180)
        return max(0.1, min(1.0, score))
    except (ValueError, TypeError):
        return 0.5


def score_source_quality(model: Optional[str]) -> float:
    """Score source quality based on model/provider."""
    if not model:
        return DEFAULT_SOURCE_QUALITY

    # Normalize model name and match by substring
    model_lower = model.lower()
    for key, score in SOURCE_QUALITY.items():
        if key in model_lower:
            return score

    return DEFAULT_SOURCE_QUALITY


def score_entry(entry: dict) -> float:
    """
    Score a knowledge entry on quality (0.0-1.0).

    Weights:
    - specificity: 0.3
    - actionability: 0.3
    - freshness: 0.2
    - source_quality: 0.2
    """
    content = entry.get("content", entry.get("text", entry.get("response", "")))
    model = entry.get("model", entry.get("provenance", {}).get("model"))
    timestamp = entry.get("timestamp", entry.get("provenance", {}).get("timestamp"))

    specificity = score_specificity(content)
    actionability = score_actionability(content)
    freshness = score_freshness(timestamp)
    source = score_source_quality(model)

    return round(
        0.3 * specificity +
        0.3 * actionability +
        0.2 * freshness +
        0.2 * source,
        4
    )


def score_entry_detailed(entry: dict) -> dict:
    """Score with per-dimension breakdown."""
    content = entry.get("content", entry.get("text", entry.get("response", "")))
    model = entry.get("model", entry.get("provenance", {}).get("model"))
    timestamp = entry.get("timestamp", entry.get("provenance", {}).get("timestamp"))

    specificity = score_specificity(content)
    actionability = score_actionability(content)
    freshness = score_freshness(timestamp)
    source = score_source_quality(model)

    return {
        "score": round(0.3 * specificity + 0.3 * actionability + 0.2 * freshness + 0.2 * source, 4),
        "specificity": round(specificity, 4),
        "actionability": round(actionability, 4),
        "freshness": round(freshness, 4),
        "source_quality": round(source, 4),
    }


def filter_entries(entries: List[dict], threshold: float = 0.5) -> List[dict]:
    """Keep only entries at or above the quality threshold."""
    return [entry for entry in entries if score_entry(entry) >= threshold]


def quality_report(entries: List[dict]) -> str:
    """Generate a quality distribution report."""
    if not entries:
        return "No entries to analyze."

    scores = [score_entry(e) for e in entries]

    avg = sum(scores) / len(scores)
    min_score = min(scores)
    max_score = max(scores)

    # Distribution buckets
    buckets = {"high": 0, "medium": 0, "low": 0, "rejected": 0}
    for s in scores:
        if s >= 0.7:
            buckets["high"] += 1
        elif s >= 0.5:
            buckets["medium"] += 1
        elif s >= 0.3:
            buckets["low"] += 1
        else:
            buckets["rejected"] += 1

    lines = [
        "=" * 50,
        " QUALITY GATE REPORT",
        "=" * 50,
        f" Total entries: {len(entries)}",
        f" Average score: {avg:.3f}",
        f" Min: {min_score:.3f}",
        f" Max: {max_score:.3f}",
        "",
        " Distribution:",
    ]

    for bucket, count in buckets.items():
        pct = count / len(entries) * 100
        bar = "█" * int(pct / 5)
        lines.append(f"  {bucket:<12} {count:>5} ({pct:>5.1f}%) {bar}")

    passed = buckets["high"] + buckets["medium"]
    lines.append(f"\n Pass rate (>= 0.5): {passed}/{len(entries)} ({passed/len(entries)*100:.1f}%)")
    lines.append("=" * 50)

    return "\n".join(lines)


def main():
    import argparse
    parser = argparse.ArgumentParser(description="Knowledge quality gate")
    parser.add_argument("files", nargs="+", help="JSONL files to score")
    parser.add_argument("--threshold", type=float, default=0.5, help="Quality threshold")
    parser.add_argument("--json", action="store_true", help="JSON output")
    parser.add_argument("--filter", action="store_true", help="Filter entries and report how many pass")
    args = parser.parse_args()

    all_entries = []
    for filepath in args.files:
        with open(filepath) as f:
            for line in f:
                if line.strip():
                    all_entries.append(json.loads(line))

    if args.json:
        results = [{"entry": e, **score_entry_detailed(e)} for e in all_entries]
        print(json.dumps(results, indent=2))
    elif args.filter:
        filtered = filter_entries(all_entries, args.threshold)
        print(f"Kept {len(filtered)}/{len(all_entries)} entries (threshold: {args.threshold})")
    else:
        print(quality_report(all_entries))


if __name__ == "__main__":
    main()
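The freshness curve in `score_freshness` is worth eyeballing: exp(-age/180) crosses 0.5 at roughly 125 days (180 · ln 2) and reaches the 0.1 floor around 414 days (180 · ln 10). A standalone sketch of the same curve (`freshness` here is a local stand-in, not an import from quality_gate):

```python
import math

def freshness(age_days: float, scale: float = 180.0) -> float:
    # Same shape as quality_gate.score_freshness: exponential decay, floored at 0.1
    return max(0.1, min(1.0, math.exp(-age_days / scale)))

# Sample the curve at a few ages to see how fast knowledge goes stale
for age in (0, 125, 180, 365, 730):
    print(f"{age:>3}d -> {freshness(age):.3f}")
```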
@@ -1 +0,0 @@
pytest>=8,<9
@@ -1,554 +0,0 @@
#!/usr/bin/env python3
"""
Automation Opportunity Finder — Scan fleet for manual processes that could be automated.

Analyzes:
1. Cron jobs — finds manual steps between scheduled tasks
2. Documentation — extracts TODO/FIXME/manual-step patterns
3. Scripts — detects repeated command sequences
4. Session transcripts — finds repeated tool-call patterns

Usage:
    python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes
    python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes --json
    python3 scripts/automation_opportunity_finder.py --hermes-home ~/.hermes --output proposals.json
"""

import argparse
import json
import os
import re
from collections import Counter, defaultdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List


# ---------------------------------------------------------------------------
# Patterns that signal manual work
# ---------------------------------------------------------------------------

MANUAL_STEP_PATTERNS = [
    # Explicit manual markers
    (r"(?i)\bTODO[:\s]", "todo"),
    (r"(?i)\bFIXME[:\s]", "fixme"),
    (r"(?i)\bMANUAL[:\s]", "manual_step"),
    (r"(?i)\bHACK[:\s]", "hack"),
    (r"(?i)\bWORKAROUND[:\s]", "workaround"),
    # Step-by-step instructions in docs
    (r"(?i)^(\d+)[.\)]\s+(run|execute|ssh|scp|curl|cd|make|docker|ansible|git)", "sequential_step"),
    # Explicitly manual operations
    (r"(?i)\bmanually\b", "manual_keyword"),
    (r"(?i)\bby hand\b", "manual_keyword"),
    (r"(?i)\bdon'?t forget to\b", "manual_keyword"),
    (r"(?i)\bremember to\b", "manual_keyword"),
    (r"(?i)\bmake sure to\b", "manual_keyword"),
]

# Shell commands that appear frequently in runbooks — signal automatable workflows
SHELL_COMMAND_PATTERNS = [
    "ssh ", "scp ", "rsync ", "curl ", "wget ",
    "docker ", "docker-compose ", "kubectl ",
    "ansible-playbook ", "terraform ", "systemctl ",
    "systemctl restart", "systemctl status",
    "git push", "git pull", "git merge", "git checkout",
    "pip install", "npm install", "cargo build",
]

# Directories to skip during scans — large/uninteresting trees
EXCLUDE_DIRS = frozenset({
    "node_modules", "venv", ".venv", "__pycache__", ".git",
    "site-packages", "dist", "build", ".tox", ".mypy_cache",
    ".pytest_cache", "coverage", ".next", "vendor",
    "skills",  # hermes skills dir is huge
    "audio_cache", "skins", "profiles",
})

# Session tool calls that appear repeatedly — candidates for workflow automation
TOOL_SEQUENCE_MIN_OCCURRENCES = 3


# ---------------------------------------------------------------------------
# Analyzers
# ---------------------------------------------------------------------------

def analyze_cron_jobs(hermes_home: str) -> List[Dict[str, Any]]:
    """Analyze cron job definitions for automation gaps."""
    proposals = []
    cron_dir = Path(hermes_home) / "cron"
    jobs_file = cron_dir / "jobs.json"

    if not jobs_file.exists():
        # Try YAML format
        for ext in (".yaml", ".yml"):
            alt = cron_dir / f"jobs{ext}"
            if alt.exists():
                jobs_file = alt
                break

    if not jobs_file.exists():
        return proposals

    try:
        if jobs_file.suffix == ".json":
            with open(jobs_file) as f:
                jobs = json.load(f)
        else:
            try:
                import yaml
                with open(jobs_file) as f:
                    jobs = yaml.safe_load(f)
            except ImportError:
                return proposals
    except Exception:  # malformed JSON/YAML, unreadable file, etc.
        return proposals

    if not isinstance(jobs, list):
        return proposals

    # Look for disabled jobs (someone turned them off — might need a different approach)
    disabled = [j for j in jobs if not j.get("enabled", True)]
    if disabled:
        names = [j.get("name", j.get("id", "?")) for j in disabled[:5]]
        proposals.append({
            "category": "cron_disabled",
            "title": f"{len(disabled)} disabled cron job(s) may need automation rework",
            "description": f"These jobs were disabled: {', '.join(names)}. Investigate why and whether a different automation approach is needed.",
            "confidence": 0.7,
            "impact": "medium",
            "sources": [str(jobs_file)],
        })

    # Look for jobs with high error counts
    error_jobs = [j for j in jobs if j.get("last_status") == "error"]
    if error_jobs:
        names = [j.get("name", j.get("id", "?")) for j in error_jobs[:5]]
        proposals.append({
            "category": "cron_errors",
            "title": f"{len(error_jobs)} cron job(s) failing — may need automation rework",
            "description": f"Jobs with errors: {', '.join(names)}. Failure patterns suggest missing dependencies or fragile automation.",
            "confidence": 0.8,
            "impact": "high",
            "sources": [str(jobs_file)],
        })

    # Look for jobs with delivery errors (platform issues)
    delivery_errors = [j for j in jobs if j.get("last_delivery_error")]
    if delivery_errors:
        proposals.append({
            "category": "cron_delivery",
            "title": f"{len(delivery_errors)} cron job(s) have delivery failures",
            "description": "Delivery failures suggest missing retry logic or platform integration gaps.",
            "confidence": 0.75,
            "impact": "medium",
            "sources": [str(jobs_file)],
        })

    # Look for jobs on short intervals that could be event-driven
    for job in jobs:
        schedule = job.get("schedule", "")
        # Check for very frequent schedules (every minute up to every 5 min)
        if isinstance(schedule, str) and re.match(r"^\*\/([1-5])\s", schedule):
            proposals.append({
                "category": "cron_frequency",
                "title": f"Job '{job.get('name', job.get('id', '?'))}' runs every {schedule.split()[0]} — consider event-driven",
                "description": f"High-frequency cron ({schedule}) may be better as event-driven or daemon.",
                "confidence": 0.6,
                "impact": "low",
                "sources": [str(jobs_file)],
            })

    return proposals


def analyze_documents(root_dirs: List[str]) -> List[Dict[str, Any]]:
    """Scan documentation for manual step patterns."""
    proposals = []
    doc_extensions = {".md", ".txt", ".rst", ".adoc"}
    findings_by_category = defaultdict(list)

    for root_dir in root_dirs:
        root = Path(root_dir)
        if not root.exists():
            continue

        for path in root.rglob("*"):
            if path.is_dir():
                continue
            if path.suffix not in doc_extensions:
                continue
            # Skip excluded dirs, hidden dirs, and overly deep trees
            parts = path.relative_to(root).parts
            if any(p.startswith(".") or p in EXCLUDE_DIRS for p in parts):
                continue
            if len(parts) > 8:
                continue

            try:
                content = path.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            lines = content.split("\n")
            for i, line in enumerate(lines):
                stripped = line.strip()
                if not stripped:
                    continue

                for pattern, category in MANUAL_STEP_PATTERNS:
                    if re.search(pattern, stripped):
                        findings_by_category[category].append({
                            "file": str(path),
                            "line": i + 1,
                            "text": stripped[:200],
                        })

    # Generate proposals from findings
    for category, findings in findings_by_category.items():
        if len(findings) < 2:
            continue

        file_count = len(set(f["file"] for f in findings))
        proposals.append({
            "category": f"manual_{category}",
            "title": f"{len(findings)} '{category}' markers across {file_count} doc(s)",
            "description": f"Found in: {', '.join(set(Path(f['file']).name for f in findings[:5]))}",
            "confidence": 0.65,
            "impact": "medium",
            "sources": list(set(f["file"] for f in findings[:10])),
            "details": findings[:5],  # sample
        })

    return proposals


def analyze_scripts(root_dirs: List[str]) -> List[Dict[str, Any]]:
    """Detect repeated command sequences in scripts."""
    proposals = []
    script_extensions = {".py", ".sh", ".bash", ".zsh"}
    command_counter = Counter()
    command_locations = defaultdict(list)

    for root_dir in root_dirs:
        root = Path(root_dir)
        if not root.exists():
            continue

        for path in root.rglob("*"):
            if path.is_dir():
                continue
            if path.suffix not in script_extensions:
                continue
            parts = path.relative_to(root).parts
            if any(p.startswith(".") or p in EXCLUDE_DIRS for p in parts):
                continue
            if len(parts) > 8:
                continue

            try:
                content = path.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            lines = content.split("\n")
            for i, line in enumerate(lines):
                stripped = line.strip()
                if not stripped or stripped.startswith("#"):
                    continue

                for cmd_prefix in SHELL_COMMAND_PATTERNS:
                    if cmd_prefix in stripped:
                        # Normalize whitespace and truncate before counting
                        normalized = re.sub(r"\s+", " ", stripped)[:120]
                        command_counter[normalized] += 1
                        command_locations[normalized].append(f"{path}:{i+1}")

    # Proposals for commands appearing 3+ times
    for cmd, count in command_counter.most_common(20):
        if count < 3:
            break
        locs = command_locations[cmd]
        file_count = len(set(loc.split(":")[0] for loc in locs))
        proposals.append({
            "category": "repeated_command",
            "title": f"Command repeated {count}x across {file_count} file(s): {cmd[:80]}",
            "description": f"Locations: {', '.join(locs[:3])}",
            "confidence": min(0.5 + (count * 0.1), 0.95),
            "impact": "medium",
            "sources": list(set(loc.split(":")[0] for loc in locs)),
        })

    return proposals


def analyze_session_transcripts(session_dirs: List[str]) -> List[Dict[str, Any]]:
    """Find repeated tool-call patterns in session transcripts."""
    proposals = []
    tool_sequence_counter = Counter()
    tool_sequence_examples = {}

    for session_dir in session_dirs:
        session_path = Path(session_dir)
        if not session_path.exists():
            continue

        for path in session_path.rglob("*.jsonl"):
            try:
                content = path.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            # Extract tool calls in sequence
            tool_sequence = []
            for line in content.split("\n"):
                line = line.strip()
                if not line:
                    continue
                try:
                    msg = json.loads(line)
                except json.JSONDecodeError:
                    continue

                # Look for tool calls in assistant messages
                if msg.get("role") == "assistant" and msg.get("tool_calls"):
                    for tc in msg["tool_calls"]:
                        func_name = tc.get("function", {}).get("name", "?")
                        tool_sequence.append(func_name)

            # Find 2-call sequences
            for i in range(len(tool_sequence) - 1):
                seq = (tool_sequence[i], tool_sequence[i + 1])
                tool_sequence_counter[seq] += 1
                if seq not in tool_sequence_examples:
                    tool_sequence_examples[seq] = str(path.name)

            # Find 3-call sequences
            for i in range(len(tool_sequence) - 2):
                seq = (tool_sequence[i], tool_sequence[i + 1], tool_sequence[i + 2])
                tool_sequence_counter[seq] += 1
                if seq not in tool_sequence_examples:
                    tool_sequence_examples[seq] = str(path.name)

    # Generate proposals for frequently repeated sequences
    for seq, count in tool_sequence_counter.most_common(20):
        if count < TOOL_SEQUENCE_MIN_OCCURRENCES:
            break
        seq_str = " -> ".join(seq)
        proposals.append({
            "category": "tool_sequence",
            "title": f"Tool sequence '{seq_str}' repeated {count} times",
            "description": "Consider creating a workflow/skill that automates this sequence.",
            "confidence": min(0.5 + (count * 0.05), 0.9),
            "impact": "medium",
            "sources": [tool_sequence_examples.get(seq, "unknown")],
        })

    return proposals


def analyze_shell_history(root_dirs: List[str]) -> List[Dict[str, Any]]:
    """Find repeated shell commands from history files."""
    proposals = []
    command_counter = Counter()

    for root_dir in root_dirs:
        root = Path(root_dir)
        history_files = []

        # Look for shell history files
        for name in (".bash_history", ".zsh_history", ".python_history"):
            p = root / name
            if p.exists():
                history_files.append(p)

        # Also check in hermes home
        for p in root.glob("**/*history*"):
            if p.is_file() and p.suffix in ("", ".txt", ".log"):
                history_files.append(p)

        # Avoid double-counting files matched by both the explicit list and the glob
        history_files = list(dict.fromkeys(history_files))

        for hf in history_files:
            try:
                content = hf.read_text(errors="replace")
            except (PermissionError, OSError):
                continue

            for line in content.split("\n"):
                stripped = line.strip()
                # ZSH history format: ": 1234567890:0;command"
                stripped = re.sub(r"^:\s*\d+:\d+;", "", stripped)
                if not stripped or len(stripped) < 5:
                    continue
                # Skip trivial commands
                if stripped in ("ls", "cd", "pwd", "clear", "exit"):
                    continue
                command_counter[stripped] += 1

    for cmd, count in command_counter.most_common(10):
        if count < 5:
            break
        proposals.append({
            "category": "shell_repetition",
            "title": f"Shell command run {count}+ times: {cmd[:80]}",
            "description": "Frequently repeated shell command — candidate for alias, function, or script.",
            "confidence": min(0.4 + (count * 0.05), 0.85),
            "impact": "low",
            "sources": ["shell_history"],
        })

    return proposals


# ---------------------------------------------------------------------------
# Proposal output
# ---------------------------------------------------------------------------

def deduplicate_proposals(proposals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Remove duplicate proposals based on title similarity."""
    seen_titles = set()
    unique = []
    for p in proposals:
        # Normalize title for dedup: collapse all numbers so only the shape matters
        key = re.sub(r"\d+", "N", p["title"]).lower()
        if key not in seen_titles:
            seen_titles.add(key)
            unique.append(p)
    return unique


def rank_proposals(proposals: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort proposals by impact * confidence (highest first)."""
    impact_weight = {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}
    return sorted(
        proposals,
        key=lambda p: impact_weight.get(p.get("impact", "low"), 0.2) * p.get("confidence", 0.5),
        reverse=True,
    )


def format_text_report(proposals: List[Dict[str, Any]]) -> str:
    """Format proposals as human-readable text."""
    if not proposals:
        return "No automation opportunities found."

    lines = [
        "=" * 70,
        " AUTOMATION OPPORTUNITY REPORT",
        f" Generated: {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M UTC')}",
        f" Proposals: {len(proposals)}",
        "=" * 70,
        "",
    ]

    impact_weight = {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}
    for i, p in enumerate(proposals, 1):
        score = p.get("confidence", 0.5) * impact_weight.get(p.get("impact", "low"), 0.2)
        lines.append(f"[{i}] {p['title']}")
        lines.append(f"    Category: {p['category']} | Impact: {p.get('impact','?')} | Confidence: {p.get('confidence',0):.0%} | Score: {score:.2f}")
        lines.append(f"    {p['description']}")
        if p.get("sources"):
            lines.append(f"    Sources: {', '.join(p['sources'][:3])}")
        lines.append("")

    # Summary by category
    cat_counts = Counter(p["category"] for p in proposals)
    lines.append("-" * 70)
    lines.append("Summary by category:")
    for cat, count in cat_counts.most_common():
        lines.append(f"  {cat}: {count}")

    return "\n".join(lines)


# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

def main():
    parser = argparse.ArgumentParser(description="Find automation opportunities across the fleet")
    parser.add_argument("--hermes-home", default=os.path.expanduser("~/.hermes"),
                        help="Path to hermes home directory (default: ~/.hermes)")
    parser.add_argument("--scan-dirs", nargs="*",
                        help="Additional directories to scan (default: hermes-home + cwd)")
|
||||
parser.add_argument("--session-dirs", nargs="*",
|
||||
help="Session transcript directories (default: hermes-home/sessions)")
|
||||
parser.add_argument("--json", action="store_true", help="Output as JSON")
|
||||
parser.add_argument("--output", "-o", help="Write proposals to file")
|
||||
parser.add_argument("--min-confidence", type=float, default=0.3,
|
||||
help="Minimum confidence threshold (default: 0.3)")
|
||||
parser.add_argument("--categories", nargs="*",
|
||||
help="Only include these categories (cron, docs, scripts, sessions, shell)")
|
||||
args = parser.parse_args()
|
||||
|
||||
hermes_home = os.path.expanduser(args.hermes_home)
|
||||
|
||||
# Default scan directories
|
||||
scan_dirs = [hermes_home, "."]
|
||||
if args.scan_dirs:
|
||||
scan_dirs.extend(args.scan_dirs)
|
||||
|
||||
session_dirs = [os.path.join(hermes_home, "sessions")]
|
||||
if args.session_dirs:
|
||||
session_dirs.extend(args.session_dirs)
|
||||
|
||||
# Also check common session locations
|
||||
for subdir in ("transcripts", "session-db"):
|
||||
p = os.path.join(hermes_home, subdir)
|
||||
if os.path.isdir(p):
|
||||
session_dirs.append(p)
|
||||
|
||||
categories = set(args.categories) if args.categories else {"cron", "docs", "scripts", "sessions", "shell"}
|
||||
|
||||
# Run analyzers
|
||||
all_proposals = []
|
||||
|
||||
if "cron" in categories:
|
||||
all_proposals.extend(analyze_cron_jobs(hermes_home))
|
||||
|
||||
if "docs" in categories:
|
||||
all_proposals.extend(analyze_documents(scan_dirs))
|
||||
|
||||
if "scripts" in categories:
|
||||
all_proposals.extend(analyze_scripts(scan_dirs))
|
||||
|
||||
if "sessions" in categories:
|
||||
all_proposals.extend(analyze_session_transcripts(session_dirs))
|
||||
|
||||
if "shell" in categories:
|
||||
all_proposals.extend(analyze_shell_history(scan_dirs))
|
||||
|
||||
# Deduplicate and rank
|
||||
all_proposals = deduplicate_proposals(all_proposals)
|
||||
all_proposals = rank_proposals(all_proposals)
|
||||
|
||||
# Filter by confidence
|
||||
all_proposals = [p for p in all_proposals if p.get("confidence", 0) >= args.min_confidence]
|
||||
|
||||
# Output
|
||||
output = {
|
||||
"generated_at": datetime.now(timezone.utc).isoformat(),
|
||||
"hermes_home": hermes_home,
|
||||
"scan_dirs": scan_dirs,
|
||||
"total_proposals": len(all_proposals),
|
||||
"proposals": all_proposals,
|
||||
}
|
||||
|
||||
if args.json:
|
||||
result = json.dumps(output, indent=2)
|
||||
else:
|
||||
result = format_text_report(all_proposals)
|
||||
|
||||
if args.output:
|
||||
with open(args.output, "w") as f:
|
||||
if args.json:
|
||||
json.dump(output, f, indent=2)
|
||||
else:
|
||||
f.write(result)
|
||||
print(f"Written to {args.output}", file=sys.stderr)
|
||||
else:
|
||||
print(result)
|
||||
|
||||
return 0 if all_proposals else 1
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
|
||||
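The ranking above multiplies a per-impact weight by the proposal's confidence. A minimal standalone sketch of that scoring (the sample proposals are invented for illustration):

```python
# Sketch of the impact * confidence ranking used by rank_proposals.
impact_weight = {"critical": 1.0, "high": 0.8, "medium": 0.5, "low": 0.2}

def score(p: dict) -> float:
    # Unknown impact falls back to the "low" weight; missing confidence to 0.5.
    return impact_weight.get(p.get("impact", "low"), 0.2) * p.get("confidence", 0.5)

proposals = [
    {"title": "a", "impact": "low", "confidence": 0.85},     # 0.2 * 0.85 = 0.17
    {"title": "b", "impact": "high", "confidence": 0.5},     # 0.8 * 0.50 = 0.40
    {"title": "c", "impact": "medium", "confidence": 0.9},   # 0.5 * 0.90 = 0.45
]
ranked = sorted(proposals, key=score, reverse=True)
print([p["title"] for p in ranked])
```

Note how a high-confidence but low-impact proposal ("a") still ranks below a medium-impact one: the weights keep impact dominant.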
@@ -1,359 +0,0 @@
#!/usr/bin/env python3
"""
Bootstrapper — assemble pre-session context from knowledge store.

Reads the knowledge store and produces a compact context block (2k tokens max)
that can be injected into a new session so it starts with situational awareness.

Usage:
    python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
    python3 bootstrapper.py --repo timmy-home --global
    python3 bootstrapper.py --global
    python3 bootstrapper.py --repo the-nexus --max-tokens 1000
"""

import argparse
import json
import sys
from pathlib import Path
from typing import Optional

# Resolve knowledge root relative to this script's parent
SCRIPT_DIR = Path(__file__).resolve().parent
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"
INDEX_PATH = KNOWLEDGE_DIR / "index.json"

# Approximate token count: ~4 chars per token for English text
CHARS_PER_TOKEN = 4

# Category sort priority (lower = shown first)
CATEGORY_PRIORITY = {
    "pitfall": 0,
    "tool-quirk": 1,
    "pattern": 2,
    "fact": 3,
    "question": 4,
}


def load_index(index_path: Path = INDEX_PATH) -> dict:
    """Load and validate the knowledge index."""
    if not index_path.exists():
        return {"version": 1, "total_facts": 0, "facts": []}

    with open(index_path) as f:
        data = json.load(f)

    if "facts" not in data:
        print("WARNING: index.json missing 'facts' key", file=sys.stderr)
        return {"version": 1, "total_facts": 0, "facts": []}

    return data


def filter_facts(
    facts: list[dict],
    repo: Optional[str] = None,
    agent: Optional[str] = None,
    include_global: bool = True,
) -> list[dict]:
    """Filter facts by repo, agent, and global scope."""
    filtered = []

    for fact in facts:
        fact_repo = fact.get("repo", "global")
        fact_agent = fact.get("agent", "")

        # Match by repo (regardless of agent)
        if repo and fact_repo == repo:
            filtered.append(fact)
            continue

        # Match by exact agent type
        if agent and fact_agent == agent:
            filtered.append(fact)
            continue

        # Include global facts without agent restriction (universal facts)
        if include_global and fact_repo == "global" and not fact_agent:
            filtered.append(fact)

    return filtered


def sort_facts(facts: list[dict]) -> list[dict]:
    """
    Sort facts by: confidence (desc), then category priority, then fact text.
    Most reliable and most dangerous facts come first.
    """

    def sort_key(f):
        confidence = f.get("confidence", 0.5)
        category = f.get("category", "fact")
        cat_priority = CATEGORY_PRIORITY.get(category, 5)
        return (-confidence, cat_priority, f.get("fact", ""))

    return sorted(facts, key=sort_key)


def load_repo_knowledge(repo: str) -> Optional[str]:
    """Load per-repo knowledge markdown if it exists."""
    repo_path = KNOWLEDGE_DIR / "repos" / f"{repo}.md"
    if repo_path.exists():
        return repo_path.read_text().strip()
    return None


def load_agent_knowledge(agent: str) -> Optional[str]:
    """Load per-agent knowledge markdown if it exists."""
    agent_path = KNOWLEDGE_DIR / "agents" / f"{agent}.md"
    if agent_path.exists():
        return agent_path.read_text().strip()
    return None


def load_global_knowledge() -> list[str]:
    """Load all global knowledge markdown files."""
    global_dir = KNOWLEDGE_DIR / "global"
    if not global_dir.exists():
        return []

    chunks = []
    for md_file in sorted(global_dir.glob("*.md")):
        content = md_file.read_text().strip()
        if content:
            chunks.append(content)
    return chunks


def render_facts_section(facts: list[dict], category: str, label: str) -> str:
    """Render a section of facts for a single category."""
    cat_facts = [f for f in facts if f.get("category") == category]
    if not cat_facts:
        return ""

    lines = [f"### {label}\n"]
    for f in cat_facts:
        conf = f.get("confidence", 0.5)
        fact_text = f.get("fact", "")
        repo_tag = f.get("repo", "")
        if repo_tag and repo_tag != "global":
            lines.append(f"- [{conf:.0%}] ({repo_tag}) {fact_text}")
        else:
            lines.append(f"- [{conf:.0%}] {fact_text}")

    return "\n".join(lines) + "\n"


def estimate_tokens(text: str) -> int:
    """Rough token estimate."""
    return len(text) // CHARS_PER_TOKEN


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Truncate text to approximately max_tokens, cutting at line boundaries."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text

    # Cut at last newline before the limit
    truncated = text[:max_chars]
    last_newline = truncated.rfind("\n")
    if last_newline > 0:
        truncated = truncated[:last_newline]

    return truncated + "\n\n[... truncated to fit context window ...]"


def build_bootstrap_context(
    repo: Optional[str] = None,
    agent: Optional[str] = None,
    include_global: bool = True,
    max_tokens: int = 2000,
    index_path: Path = INDEX_PATH,
) -> str:
    """
    Build the full bootstrap context block.

    Returns a markdown string suitable for injection into a session prompt.
    """
    index = load_index(index_path)
    facts = index.get("facts", [])

    # Filter
    filtered = filter_facts(facts, repo=repo, agent=agent, include_global=include_global)

    # Sort
    sorted_facts = sort_facts(filtered)

    # Build sections
    sections = ["## What You Know (bootstrapped)\n"]

    # Per-repo markdown knowledge
    if repo:
        repo_md = load_repo_knowledge(repo)
        if repo_md:
            sections.append(f"### Repo Notes: {repo}\n")
            sections.append(repo_md + "\n")

    # Structured facts by category
    if sorted_facts:
        # Group by source
        repo_facts = [f for f in sorted_facts if f.get("repo") == repo] if repo else []
        global_facts = [f for f in sorted_facts if f.get("repo") == "global"]
        agent_facts = [f for f in sorted_facts if f.get("agent") == agent] if agent else []

        if repo_facts:
            sections.append(f"### Repo: {repo}\n")
            for cat, label in [
                ("pitfall", "PITFALLS"),
                ("tool-quirk", "QUIRKS"),
                ("pattern", "PATTERNS"),
                ("fact", "FACTS"),
                ("question", "OPEN QUESTIONS"),
            ]:
                section = render_facts_section(repo_facts, cat, label)
                if section:
                    sections.append(section)

        if global_facts:
            sections.append("### Global\n")
            for cat, label in [
                ("pitfall", "PITFALLS"),
                ("tool-quirk", "QUIRKS"),
                ("pattern", "PATTERNS"),
                ("fact", "FACTS"),
            ]:
                section = render_facts_section(global_facts, cat, label)
                if section:
                    sections.append(section)

        if agent_facts:
            sections.append(f"### Agent Notes ({agent})\n")
            for cat, label in [
                ("pitfall", "PITFALLS"),
                ("tool-quirk", "QUIRKS"),
                ("pattern", "PATTERNS"),
                ("fact", "FACTS"),
            ]:
                section = render_facts_section(agent_facts, cat, label)
                if section:
                    sections.append(section)

    # Per-agent markdown knowledge
    if agent:
        agent_md = load_agent_knowledge(agent)
        if agent_md:
            sections.append(f"### Agent Profile: {agent}\n")
            sections.append(agent_md + "\n")

    # Global markdown knowledge
    global_chunks = load_global_knowledge()
    if global_chunks:
        sections.append("### Global Notes\n")
        sections.extend(chunk + "\n" for chunk in global_chunks)

    # If nothing was found
    if len(sections) == 1:
        sections.append("_No relevant knowledge found. Starting fresh._\n")
        if not facts:
            sections.append(
                "_Knowledge store is empty. Run the harvester to populate it._\n"
            )

    # Join and truncate
    context = "\n".join(sections)
    context = truncate_to_tokens(context, max_tokens)

    return context


def main():
    parser = argparse.ArgumentParser(
        description="Assemble pre-session context from knowledge store"
    )
    parser.add_argument(
        "--repo",
        type=str,
        default=None,
        help="Repository name to filter facts by",
    )
    parser.add_argument(
        "--agent",
        type=str,
        default=None,
        help="Agent type to filter facts by (e.g., mimo-sprint, groq-fast)",
    )
    parser.add_argument(
        "--global",
        dest="include_global",
        action="store_true",
        default=True,
        help="Include global facts (default: true)",
    )
    parser.add_argument(
        "--no-global",
        dest="include_global",
        action="store_false",
        help="Exclude global facts",
    )
    parser.add_argument(
        "--max-tokens",
        type=int,
        default=2000,
        help="Maximum token count for output (default: 2000)",
    )
    parser.add_argument(
        "--index",
        type=str,
        default=None,
        help="Path to index.json (default: knowledge/index.json)",
    )
    parser.add_argument(
        "--json",
        dest="output_json",
        action="store_true",
        help="Output raw JSON instead of markdown",
    )

    args = parser.parse_args()

    index_path = Path(args.index) if args.index else INDEX_PATH

    if args.output_json:
        # JSON mode: return the filtered, sorted facts
        index = load_index(index_path)
        facts = index.get("facts", [])
        filtered = filter_facts(
            facts,
            repo=args.repo,
            agent=args.agent,
            include_global=args.include_global,
        )
        sorted_facts = sort_facts(filtered)
        output = {
            "repo": args.repo,
            "agent": args.agent,
            "include_global": args.include_global,
            "total_indexed": len(facts),
            "matched": len(sorted_facts),
            "facts": sorted_facts,
        }
        print(json.dumps(output, indent=2))
    else:
        # Markdown mode: full bootstrap context
        context = build_bootstrap_context(
            repo=args.repo,
            agent=args.agent,
            include_global=args.include_global,
            max_tokens=args.max_tokens,
            index_path=index_path,
        )
        print(context)

    return 0


if __name__ == "__main__":
    sys.exit(main())
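The bootstrapper's token budget relies on the ~4-chars-per-token heuristic and cuts at a line boundary so no fact is emitted half-finished. The helper can be exercised in isolation (the sample fact lines are invented):

```python
CHARS_PER_TOKEN = 4  # rough heuristic: ~4 chars per token of English text

def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Cut text near max_tokens worth of chars, backing up to the last full line."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    truncated = text[:max_chars]
    last_newline = truncated.rfind("\n")
    if last_newline > 0:
        truncated = truncated[:last_newline]  # never end mid-line
    return truncated + "\n\n[... truncated to fit context window ...]"

text = "\n".join(f"fact {i}: something learned" for i in range(100))
out = truncate_to_tokens(text, max_tokens=50)  # budget of ~200 chars
print(out.splitlines()[-1])
```

Because the cut backs up to the previous newline, the budget is a ceiling, not an exact target; the trailing marker tells the consuming session that context was dropped.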
@@ -1,283 +0,0 @@
#!/usr/bin/env python3
"""
conference_summarizer.py — Extract knowledge from conference talk transcripts.

Reads a plain-text transcript and uses an LLM to extract durable knowledge items.
Integrates with the knowledge store (index.json + knowledge/conferences/talks.md).

Usage:
    python3 conference_summarizer.py --transcript talk.txt --conference "AI拂晓" --domain global
    python3 conference_summarizer.py --transcript talk.txt --domain the-nexus  # talk about that repo
    python3 conference_summarizer.py --transcript talk.txt --dry-run

Refs: Issue #138 — 7.6: Conference Talk Summarizer
"""

import argparse
import hashlib
import json
import sys
import time
from datetime import datetime, timezone
from pathlib import Path

SCRIPT_DIR = Path(__file__).parent.absolute()
REPO_ROOT = SCRIPT_DIR.parent
KNOWLEDGE_DIR = REPO_ROOT / "knowledge"

DEFAULT_API_BASE = "https://api.nousresearch.com/v1"
DEFAULT_API_KEY = ""
DEFAULT_MODEL = "xiaomi/mimo-v2-pro"

API_KEY_PATHS = [
    Path.home() / ".config/nous/key",
    Path.home() / ".hermes/keymaxxing/active/minimax.key",
    Path.home() / ".config/openrouter/key",
]


def find_api_key() -> str:
    for path in API_KEY_PATHS:
        if path.exists():
            return path.read_text().strip()
    return ""


def load_prompt() -> str:
    path = SCRIPT_DIR.parent / "templates" / "conference-summary-prompt.md"
    if not path.exists():
        print(f"ERROR: Prompt not found at {path}", file=sys.stderr)
        sys.exit(1)
    return path.read_text(encoding="utf-8")


def truncate_for_context(text: str, head: int = 120, tail: int = 120) -> str:
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text
    return (
        "\n".join(lines[:head])
        + "\n\n... [truncated] ...\n\n"
        + "\n".join(lines[-tail:])
    )


def call_llm(prompt: str, transcript: str, api_base: str, api_key: str, model: str):
    import urllib.request

    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": f"Transcript:\n\n{truncate_for_context(transcript)}"},
    ]
    payload = json.dumps(
        {"model": model, "messages": messages, "temperature": 0.1, "max_tokens": 4096}
    ).encode("utf-8")
    req = urllib.request.Request(
        f"{api_base}/chat/completions",
        data=payload,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            data = json.loads(resp.read())
        content = data["choices"][0]["message"]["content"].strip()
        # Strip code fences
        if content.startswith("```"):
            content = content.split("\n", 1)[1].rsplit("```", 1)[0].strip()
        return json.loads(content)
    except Exception as e:
        print(f"ERROR: LLM extraction failed: {e}", file=sys.stderr)
        return None


def load_index(knowledge_dir: Path) -> dict:
    index_path = knowledge_dir / "index.json"
    if index_path.exists():
        with open(index_path) as f:
            return json.load(f)
    return {"version": 1, "total_facts": 0, "facts": []}


def content_hash(text: str) -> str:
    normalized = " ".join(text.lower().strip().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def compute_next_sequence(existing_facts: list[dict], domain: str, category: str) -> int:
    """Compute the next sequence number for (domain, category) based on existing IDs."""
    max_seq = 0
    for f in existing_facts:
        fid = f.get("id", "")
        parts = fid.split(":")
        if len(parts) == 3 and parts[0] == domain and parts[1] == category:
            try:
                seq = int(parts[2])
                max_seq = max(max_seq, seq)
            except ValueError:
                pass
    return max_seq + 1


def deduplicate(new_facts: list[dict], existing: list[dict]) -> list[dict]:
    """Exact-deduplicate by content hash; near-deduplicate by token overlap."""
    existing_hashes = {content_hash(f["fact"]): f for f in existing}
    existing_texts = [f["fact"].lower() for f in existing]
    unique = []
    for fact in new_facts:
        text = fact.get("fact", "")
        h = content_hash(text)
        if h in existing_hashes:
            continue
        # Near-dedup: token Jaccard >= 0.8
        tokens = set(text.lower().split())
        for ex in existing_texts:
            ex_tokens = set(ex.split())
            if tokens and ex_tokens:
                inter = len(tokens & ex_tokens)
                union = len(tokens | ex_tokens)
                if inter / union >= 0.8:
                    break
        else:
            unique.append(fact)
    return unique


def validate_fact(fact: dict) -> bool:
    required = ["fact", "category", "domain", "confidence"]
    for field in required:
        if field not in fact:
            return False
    if not isinstance(fact["fact"], str) or not fact["fact"].strip():
        return False
    if fact["category"] not in ["fact", "pitfall", "pattern", "tool-quirk", "question"]:
        return False
    c = fact.get("confidence", 0)
    return isinstance(c, (int, float)) and 0.0 <= c <= 1.0


def write_knowledge(index: dict, new_facts: list[dict], knowledge_dir: Path):
    kdir = knowledge_dir
    kdir.mkdir(parents=True, exist_ok=True)

    for fact in new_facts:
        fact["harvested_at"] = datetime.now(timezone.utc).isoformat()
        fact["source"] = "conference-talk"

    index["facts"].extend(new_facts)
    index["total_facts"] = len(index["facts"])
    index["last_updated"] = datetime.now(timezone.utc).isoformat()

    # index.json
    with open(kdir / "index.json", "w", encoding="utf-8") as f:
        json.dump(index, f, indent=2, ensure_ascii=False)

    # conferences/talks.md (human-readable)
    conf_dir = kdir / "conferences"
    conf_dir.mkdir(parents=True, exist_ok=True)
    conf_md = conf_dir / "talks.md"
    mode = "a" if conf_md.exists() else "w"
    with open(conf_md, mode, encoding="utf-8") as f:
        if mode == "w":
            f.write("# Conference Talk Knowledge\n\n")
        f.write(
            f"## {datetime.now(timezone.utc).strftime('%Y-%m-%d')} — {len(new_facts)} items\n\n"
        )
        for fact in new_facts:
            icon = {"fact": "📋", "pitfall": "⚠️", "pattern": "🔄", "tool-quirk": "🔧", "question": "❓"}.get(fact["category"], "•")
            f.write(f"- {icon} **{fact['category']}** (conf: {fact['confidence']:.1f}): {fact['fact']}\n")
            if fact.get("evidence"):
                f.write(f"  _Evidence: {fact['evidence']}_\n")
        f.write("\n")


def main():
    parser = argparse.ArgumentParser(description="Summarize conference talks into knowledge store")
    parser.add_argument("--transcript", required=True, help="Path to transcript text file")
    parser.add_argument("--conference", default="unknown", help="Conference name")
    parser.add_argument("--title", default="", help="Talk title")
    parser.add_argument("--speaker", default="", help="Speaker name(s)")
    parser.add_argument("--talk-url", default="", help="URL to talk/video")
    parser.add_argument("--domain", default="global", help="Domain: global or repo/agent name")
    parser.add_argument("--knowledge-dir", default=str(KNOWLEDGE_DIR), help="Knowledge store directory")
    parser.add_argument("--api-base", default=DEFAULT_API_BASE, help="LLM API base URL")
    parser.add_argument("--api-key", default="", help="LLM API key")
    parser.add_argument("--model", default=DEFAULT_MODEL, help="Model to use")
    parser.add_argument("--dry-run", action="store_true", help="Preview without writing")
    parser.add_argument("--min-confidence", type=float, default=0.3, help="Minimum confidence threshold")
    args = parser.parse_args()

    transcript_path = Path(args.transcript)
    if not transcript_path.exists():
        print(f"ERROR: Transcript not found: {transcript_path}", file=sys.stderr)
        sys.exit(1)
    transcript = transcript_path.read_text(encoding="utf-8", errors="replace")
    if not transcript.strip():
        print("ERROR: Transcript is empty", file=sys.stderr)
        sys.exit(1)

    api_key = args.api_key or DEFAULT_API_KEY or find_api_key()
    if not api_key:
        print("ERROR: No API key. Set HARVESTER_API_KEY or pass --api-key", file=sys.stderr)
        sys.exit(1)

    prompt = load_prompt()
    print(f"Summarizing '{transcript_path.name}' domain={args.domain} conf={args.conference}")
    start = time.time()
    extracted = call_llm(prompt, transcript, args.api_base, api_key, args.model)
    if extracted is None:
        print("ERROR: LLM extraction failed", file=sys.stderr)
        sys.exit(1)

    raw_items = extracted.get("knowledge", [])
    print(f"  Raw items: {len(raw_items)}")
    valid = [f for f in raw_items if validate_fact(f) and f.get("confidence", 0) >= args.min_confidence]
    print(f"  Valid: {len(valid)}")

    if not valid:
        print("WARNING: No valid items extracted", file=sys.stderr)
        sys.exit(1)

    kdir = Path(args.knowledge_dir)
    index = load_index(kdir)
    existing_facts = index.get("facts", [])
    new_facts = deduplicate(valid, existing_facts)
    print(f"  New (non-duplicate): {len(new_facts)}")

    if not new_facts:
        print("All items duplicated — nothing to write.")
        sys.exit(0)

    # Assign IDs per (domain, category) sequence
    seq_counters = {}
    # Count existing items for this domain
    for f in existing_facts:
        if f.get("domain") == args.domain:
            cat = f.get("category", "fact")
            key = (args.domain, cat)
            seq_counters[key] = seq_counters.get(key, 0) + 1
    # Now assign the next sequence for each category in new_facts
    for fact in new_facts:
        cat = fact["category"]
        key = (args.domain, cat)
        next_seq = seq_counters.get(key, 0) + 1
        seq_counters[key] = next_seq
        fact["id"] = f"{args.domain}:{cat}:{next_seq:03d}"
        fact["domain"] = args.domain
        fact.setdefault("tags", []).extend([args.conference, "conference-talk"])
        fact["first_seen"] = datetime.now(timezone.utc).strftime("%Y-%m-%d")
        fact["last_confirmed"] = fact["first_seen"]
        fact["source_count"] = 1
        fact["talk_meta"] = extracted.get("meta", {})

    if args.dry_run:
        print("DRY RUN — items that would be added:")
        for f in new_facts:
            print(f"  [{f['category']}] {f['fact'][:90]}")
        sys.exit(0)

    write_knowledge(index, new_facts, kdir)
    print(f"✓ Stored {len(new_facts)} items to knowledge store in {time.time() - start:.1f}s")


if __name__ == "__main__":
    main()
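The near-duplicate check in the summarizer's `deduplicate` compares token sets with a Jaccard threshold of 0.8. A minimal sketch of that similarity test (the example fact strings are invented):

```python
def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity, as used for near-duplicate detection."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

existing = "git push fails with HTTP 405 on protected branches"
near_dupe = "git push fails with HTTP 405 on protected branch"
fresh = "use make test to run the suite"

print(jaccard(existing, near_dupe))  # high overlap, treated as a duplicate
print(jaccard(existing, fresh))      # low overlap, kept as a new fact
```

A set-based Jaccard is cheap and order-insensitive, which suits short one-line facts; it would be too coarse for long paragraphs, where a hash over normalized text (the exact-dedup path) or shingling works better.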
@@ -1,282 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Dead Code Detector for Python Codebases
|
||||
|
||||
AST-based analysis to find defined but never-called functions and classes.
|
||||
Excludes entry points, plugin hooks, __init__ exports.
|
||||
|
||||
Usage:
|
||||
python3 scripts/dead_code_detector.py /path/to/repo/
|
||||
python3 scripts/dead_code_detector.py hermes-agent/ --format json
|
||||
python3 scripts/dead_code_detector.py . --exclude tests/,venv/
|
||||
|
||||
Output: file:line, function/class name, last git author (if available)
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import ast
|
||||
import json
|
||||
import os
|
||||
import subprocess
|
||||
import sys
|
||||
from collections import defaultdict
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
|
||||
# Names that are expected to be unused (entry points, protocol methods, etc.)
|
||||
SAFE_UNUSED_PATTERNS = {
|
||||
# Python dunders
|
||||
"__init__", "__str__", "__repr__", "__eq__", "__hash__", "__len__",
|
||||
"__getitem__", "__setitem__", "__contains__", "__iter__", "__next__",
|
||||
"__enter__", "__exit__", "__call__", "__bool__", "__del__",
|
||||
"__post_init__", "__class_getitem__",
|
||||
# Common entry points
|
||||
"main", "app", "handler", "setup", "teardown", "fixture",
|
||||
# pytest
|
||||
"conftest", "test_", "pytest_", # prefix patterns
|
||||
# Protocols / abstract
|
||||
"abstractmethod", "abc_",
|
||||
}
|
||||
|
||||
|
||||
def is_safe_unused(name: str, filepath: str) -> bool:
|
||||
"""Check if an unused name is expected to be unused."""
|
||||
# Test files are exempt
|
||||
if "test" in filepath.lower():
|
||||
return True
|
||||
|
||||
# Known patterns
|
||||
for pattern in SAFE_UNUSED_PATTERNS:
|
||||
if name.startswith(pattern) or name == pattern:
|
||||
return True
|
||||
|
||||
# __init__.py exports are often unused internally
|
||||
if filepath.endswith("__init__.py"):
|
||||
return True
|
||||
|
||||
return False
|
||||
|
||||
|
||||
def get_git_blame(filepath: str, lineno: int) -> Optional[str]:
|
||||
"""Get last author of a line via git blame."""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["git", "blame", "-L", f"{lineno},{lineno}", "--porcelain", filepath],
|
||||
capture_output=True, text=True, timeout=5
|
||||
)
|
||||
for line in result.stdout.split("\n"):
|
||||
if line.startswith("author "):
|
||||
return line[7:]
|
||||
except:
|
||||
pass
|
||||
return None
|
||||
|
||||
|
||||
class DefinitionCollector(ast.NodeVisitor):
|
||||
"""Collect all function and class definitions."""
|
||||
|
||||
def __init__(self):
|
||||
self.definitions = [] # (name, type, lineno, filepath)
|
||||
|
||||
def visit_FunctionDef(self, node):
|
||||
self.definitions.append((node.name, "function", node.lineno))
|
||||
self.generic_visit(node)
|
||||
|
||||
def visit_AsyncFunctionDef(self, node):
|
||||
self.definitions.append((node.name, "async_function", node.lineno))
|
||||
self.generic_visit(node)
|
||||
|
||||
def visit_ClassDef(self, node):
|
||||
self.definitions.append((node.name, "class", node.lineno))
|
||||
self.generic_visit(node)
|
||||
|
||||
|
||||
class NameUsageCollector(ast.NodeVisitor):
|
||||
"""Collect all name references (calls, imports, attribute access)."""
|
||||
|
||||
def __init__(self):
|
||||
self.names = set()
|
||||
self.calls = set()
|
||||
self.imports = set()
|
||||
|
||||
def visit_Name(self, node):
|
||||
self.names.add(node.id)
|
||||
self.generic_visit(node)
|
||||
|
||||
def visit_Attribute(self, node):
|
||||
if isinstance(node.value, ast.Name):
|
||||
self.names.add(node.value.id)
|
||||
self.generic_visit(node)
|
||||
|
||||
def visit_Call(self, node):
|
||||
if isinstance(node.func, ast.Name):
|
||||
self.calls.add(node.func.id)
|
||||
elif isinstance(node.func, ast.Attribute):
|
||||
if isinstance(node.func.value, ast.Name):
|
||||
self.names.add(node.func.value.id)
|
||||
self.calls.add(node.func.attr)
|
||||
self.generic_visit(node)
|
||||
|
||||
def visit_Import(self, node):
|
||||
for alias in node.names:
|
||||
self.imports.add(alias.asname or alias.name)
|
||||
self.generic_visit(node)
|
||||
|
||||
def visit_ImportFrom(self, node):
|
||||
for alias in node.names:
|
||||
self.imports.add(alias.asname or alias.name)
|
||||
self.generic_visit(node)
|
||||
|
||||
|
||||
def analyze_file(filepath: str) -> dict:
    """Analyze a single Python file for dead code."""
    path = Path(filepath)
    try:
        content = path.read_text()
        tree = ast.parse(content, filename=str(filepath))
    except (SyntaxError, UnicodeDecodeError):
        return {"error": f"Could not parse {filepath}"}

    # Collect definitions
    def_collector = DefinitionCollector()
    def_collector.visit(tree)
    definitions = def_collector.definitions

    # Collect usage
    usage_collector = NameUsageCollector()
    usage_collector.visit(tree)
    used_names = usage_collector.names | usage_collector.calls | usage_collector.imports

    # Repo-wide references to this file's definitions are checked in scan_repo(),
    # not here; this function only sees usage within the same file.
    # Note: single-underscore "private" names are still reported — they might be
    # used externally, but within one file an unused private name is suspect.
    dead = []
    for name, def_type, lineno in definitions:
        if name not in used_names:
            if not is_safe_unused(name, filepath):
                dead.append({
                    "name": name,
                    "type": def_type,
                    "file": filepath,
                    "line": lineno,
                })

    return {"definitions": len(definitions), "dead": dead}

def scan_repo(repo_path: str, exclude_patterns: list = None) -> dict:
    """Scan an entire repo for dead code."""
    path = Path(repo_path)
    exclude = exclude_patterns or ["venv", ".venv", "node_modules", "__pycache__",
                                   ".git", "dist", "build", ".tox", "vendor"]

    all_definitions = {}  # name -> [{file, line, type}]
    all_files = []
    dead_code = []

    # First pass: collect all definitions across the repo
    for fpath in path.rglob("*.py"):
        parts = fpath.parts
        if any(ex in parts for ex in exclude):
            continue
        if fpath.name.startswith("."):
            continue

        try:
            content = fpath.read_text(errors="ignore")
            tree = ast.parse(content, filename=str(fpath))
        except (SyntaxError, ValueError, OSError):
            continue

        all_files.append(str(fpath))
        collector = DefinitionCollector()
        collector.visit(tree)

        for name, def_type, lineno in collector.definitions:
            rel_path = str(fpath.relative_to(path))
            if name not in all_definitions:
                all_definitions[name] = []
            all_definitions[name].append({
                "file": rel_path,
                "line": lineno,
                "type": def_type,
            })

    # Second pass: collect every name used anywhere in the repo
    all_used_names = set()
    for fpath_str in all_files:
        try:
            content = Path(fpath_str).read_text(errors="ignore")
            tree = ast.parse(content)
        except (SyntaxError, ValueError, OSError):
            continue
        usage = NameUsageCollector()
        usage.visit(tree)
        all_used_names.update(usage.names)
        all_used_names.update(usage.calls)
        all_used_names.update(usage.imports)

    # A definition is dead only if its name is unused repo-wide
    for name, locations in all_definitions.items():
        if name not in all_used_names:
            for loc in locations:
                if not is_safe_unused(name, loc["file"]):
                    dead_code.append({
                        "name": name,
                        "type": loc["type"],
                        "file": loc["file"],
                        "line": loc["line"],
                    })

    return {
        "repo": path.name,
        "files_scanned": len(all_files),
        "total_definitions": sum(len(v) for v in all_definitions.values()),
        "dead_code_count": len(dead_code),
        "dead_code": sorted(dead_code, key=lambda x: (x["file"], x["line"])),
    }

def main():
    parser = argparse.ArgumentParser(description="Find dead code in Python codebases")
    parser.add_argument("repo", help="Repository path to scan")
    parser.add_argument("--format", choices=["text", "json"], default="text")
    parser.add_argument("--exclude", help="Comma-separated patterns to exclude")
    parser.add_argument("--git-blame", action="store_true", help="Include git blame info")
    args = parser.parse_args()

    exclude = args.exclude.split(",") if args.exclude else None
    result = scan_repo(args.repo, exclude)

    if args.format == "json":
        print(json.dumps(result, indent=2))
    else:
        print(f"Dead Code Report: {result['repo']}")
        print(f"Files scanned: {result['files_scanned']}")
        print(f"Total definitions: {result['total_definitions']}")
        print(f"Dead code found: {result['dead_code_count']}")
        print()

        if result["dead_code"]:
            print(f"{'File':<45} {'Line':>4} {'Type':<10} {'Name'}")
            print("-" * 85)
            for item in result["dead_code"]:
                author = ""
                if args.git_blame:
                    author = get_git_blame(
                        os.path.join(args.repo, item["file"]),
                        item["line"]
                    ) or ""
                author = f" ({author})" if author else ""
                print(f"{item['file']:<45} {item['line']:>4} {item['type']:<10} {item['name']}{author}")
        else:
            print("No dead code detected!")


if __name__ == "__main__":
    main()
scripts/dedup.py
@@ -1,317 +0,0 @@
#!/usr/bin/env python3
"""
dedup.py — Knowledge deduplication: content hash + semantic similarity.

Deduplicates harvested knowledge entries to avoid training on duplicates.
Uses content hashing for exact matches and token overlap for near-duplicates.

Usage:
    python3 dedup.py --input knowledge/index.json --output knowledge/index_deduped.json
    python3 dedup.py --input knowledge/index.json --dry-run
    python3 dedup.py --test  # Run built-in dedup test
"""

import argparse
import hashlib
import json
import re
import sys
from pathlib import Path
from typing import List, Dict, Optional, Tuple

def normalize_text(text: str) -> str:
    """Normalize text for hashing: lowercase, collapse whitespace, strip."""
    text = text.lower().strip()
    text = re.sub(r'\s+', ' ', text)
    return text


def content_hash(text: str) -> str:
    """SHA256 hash of normalized text for exact dedup."""
    normalized = normalize_text(text)
    return hashlib.sha256(normalized.encode('utf-8')).hexdigest()


def tokenize(text: str) -> set:
    """Simple tokenizer: lowercase words, 3+ chars."""
    words = re.findall(r'[a-z0-9_]{3,}', text.lower())
    return set(words)


def token_similarity(a: str, b: str) -> float:
    """Token-based Jaccard similarity (0.0-1.0).

    Fast local alternative to embedding similarity.
    Good enough for near-duplicate detection.
    """
    tokens_a = tokenize(a)
    tokens_b = tokenize(b)
    if not tokens_a or not tokens_b:
        return 0.0
    intersection = tokens_a & tokens_b
    union = tokens_a | tokens_b
    return len(intersection) / len(union)

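A quick worked example of the token-overlap metric. The tokenizer rule is re-inlined so the snippet runs on its own, and the two sentences are made up:

```python
import re

def tokens(text: str) -> set:
    # Same rule as tokenize() above: lowercase words of 3+ chars.
    return set(re.findall(r'[a-z0-9_]{3,}', text.lower()))

a = "When working with git, always check logs first before deploying."
b = "When working with git, you should check logs first prior to deployment."

ta, tb = tokens(a), tokens(b)
jaccard = len(ta & tb) / len(ta | tb)
print(round(jaccard, 2))  # → 0.5
```

Seven of the fourteen distinct tokens are shared, so these two phrasings fall well below the 0.95 near-dup threshold; the threshold is deliberately strict so only near-verbatim restatements merge.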
def quality_score(fact: dict) -> float:
    """Compute quality score for merge ranking.

    Higher is better. Factors:
    - confidence (0-1)
    - source_count (more confirmations = better)
    - has tags (richer metadata)
    """
    confidence = fact.get('confidence', 0.5)
    source_count = fact.get('source_count', 1)
    has_tags = 1.0 if fact.get('tags') else 0.0
    has_related = 1.0 if fact.get('related') else 0.0

    # Weighted composite
    score = (
        confidence * 0.5 +
        min(source_count / 10, 1.0) * 0.3 +
        has_tags * 0.1 +
        has_related * 0.1
    )
    return round(score, 4)

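To make the weighting concrete, here is the same composite computed by hand for a hypothetical fact:

```python
# Hypothetical fact: confidence 0.8, confirmed 4 times, tagged, no related links.
fact = {"confidence": 0.8, "source_count": 4, "tags": ["git"], "related": []}

score = (
    fact["confidence"] * 0.5                       # 0.40
    + min(fact["source_count"] / 10, 1.0) * 0.3    # 0.12
    + (0.1 if fact["tags"] else 0.0)               # 0.10
    + (0.1 if fact["related"] else 0.0)            # 0.00
)
print(round(score, 4))  # → 0.62
```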
def merge_facts(keep: dict, drop: dict) -> dict:
    """Merge two near-duplicate facts, keeping higher-quality fields.

    The 'keep' fact is enriched with metadata from 'drop'.
    """
    # Merge tags (union)
    keep_tags = set(keep.get('tags', []))
    drop_tags = set(drop.get('tags', []))
    keep['tags'] = sorted(keep_tags | drop_tags)

    # Merge related (union)
    keep_related = set(keep.get('related', []))
    drop_related = set(drop.get('related', []))
    keep['related'] = sorted(keep_related | drop_related)

    # Update source_count (sum)
    keep['source_count'] = keep.get('source_count', 1) + drop.get('source_count', 1)

    # Update confidence (max — we've now seen it from multiple sources)
    keep['confidence'] = max(keep.get('confidence', 0), drop.get('confidence', 0))

    # Track that we merged
    if '_merged_from' not in keep:
        keep['_merged_from'] = []
    keep['_merged_from'].append(drop.get('id', 'unknown'))

    return keep

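The merge semantics (union the tags, sum the source counts, take the max confidence) on two made-up facts, inlined so the snippet is self-contained:

```python
keep = {"id": "a", "tags": ["git"], "confidence": 0.7, "source_count": 2}
drop = {"id": "b", "tags": ["git", "ops"], "confidence": 0.9, "source_count": 1}

# Mirror of the merge_facts() rules above.
keep["tags"] = sorted(set(keep["tags"]) | set(drop["tags"]))
keep["source_count"] = keep["source_count"] + drop["source_count"]
keep["confidence"] = max(keep["confidence"], drop["confidence"])
keep.setdefault("_merged_from", []).append(drop["id"])

print(keep["tags"], keep["source_count"], keep["confidence"])  # → ['git', 'ops'] 3 0.9
```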
def dedup_facts(
    facts: List[dict],
    exact_threshold: float = 1.0,
    near_threshold: float = 0.95,
    dry_run: bool = False,
) -> Tuple[List[dict], dict]:
    """Deduplicate a list of knowledge facts.

    Args:
        facts: List of fact dicts (from index.json)
        exact_threshold: Hash match = exact duplicate
        near_threshold: Token similarity above this = near-duplicate
        dry_run: If True, don't merge metadata, just count

    Returns:
        (deduped_facts, stats_dict)
    """
    if not facts:
        return [], {"total": 0, "exact_dupes": 0, "near_dupes": 0, "unique": 0, "removed": 0}

    # Phase 1: Exact dedup by content hash
    hash_seen = {}  # hash -> index in deduped list
    exact_dupes = 0
    deduped = []

    for fact in facts:
        text = fact.get('fact', '')
        h = content_hash(text)

        if h in hash_seen:
            # Exact duplicate — merge metadata into existing
            existing_idx = hash_seen[h]
            if not dry_run:
                deduped[existing_idx] = merge_facts(deduped[existing_idx], fact)
            exact_dupes += 1
        else:
            hash_seen[h] = len(deduped)
            deduped.append(fact)

    # Phase 2: Near-dup by token similarity (O(n^2) pairwise scan)
    near_dupes = 0
    i = 0
    while i < len(deduped):
        j = i + 1
        removed_i = False
        while j < len(deduped):
            sim = token_similarity(deduped[i].get('fact', ''), deduped[j].get('fact', ''))
            if sim >= near_threshold:
                # Near-duplicate — keep higher quality
                near_dupes += 1
                q_i = quality_score(deduped[i])
                q_j = quality_score(deduped[j])
                if q_i >= q_j:
                    if not dry_run:
                        deduped[i] = merge_facts(deduped[i], deduped[j])
                    deduped.pop(j)
                else:
                    # j is higher quality — merge i into j, then remove i
                    if not dry_run:
                        deduped[j] = merge_facts(deduped[j], deduped[i])
                    deduped.pop(i)
                    removed_i = True
                    break  # the element at index i changed; re-anchor without advancing
            else:
                j += 1
        if not removed_i:
            i += 1

    stats = {
        "total": len(facts),
        "exact_dupes": exact_dupes,
        "near_dupes": near_dupes,
        "unique": len(deduped),
        "removed": len(facts) - len(deduped),
    }

    return deduped, stats

def dedup_index_file(
    input_path: str,
    output_path: Optional[str] = None,
    near_threshold: float = 0.95,
    dry_run: bool = False,
) -> dict:
    """Deduplicate an index.json file.

    Args:
        input_path: Path to index.json
        output_path: Where to write deduped file (default: overwrite input)
        near_threshold: Token similarity threshold for near-dupes
        dry_run: Report only, don't write

    Returns stats dict.
    """
    path = Path(input_path)
    if not path.exists():
        raise FileNotFoundError(f"Index file not found: {input_path}")

    with open(path) as f:
        data = json.load(f)

    facts = data.get('facts', [])
    deduped, stats = dedup_facts(facts, near_threshold=near_threshold, dry_run=dry_run)

    if not dry_run:
        from datetime import datetime, timezone
        data['facts'] = deduped
        data['total_facts'] = len(deduped)
        data['last_dedup'] = datetime.now(timezone.utc).isoformat()

        out_path = Path(output_path) if output_path else path
        with open(out_path, 'w') as f:
            json.dump(data, f, indent=2, ensure_ascii=False)

    return stats

def generate_test_duplicates(n: int = 20) -> List[dict]:
    """Generate test facts with intentional duplicates for testing.

    Creates n unique facts plus n/4 exact dupes and n/4 near-dupes.
    """
    import random
    random.seed(42)

    unique_facts = []
    for i in range(n):
        topic = random.choice(["git", "python", "docker", "rust", "nginx"])
        tip = random.choice(["use verbose flags", "check logs first", "restart service", "clear cache", "update config"])
        unique_facts.append({
            "id": f"test:fact:{i:03d}",
            "fact": f"When working with {topic}, always {tip} before deploying.",
            "category": "fact",
            "domain": "test",
            "confidence": round(random.uniform(0.5, 1.0), 2),
            "source_count": random.randint(1, 5),
            "tags": [topic, "test"],
        })

    # Add exact duplicates (same text, different IDs)
    duped = list(unique_facts)
    for i in range(n // 4):
        original = unique_facts[i]
        dupe = dict(original)
        dupe["id"] = f"test:fact:dup{i:03d}"
        dupe["confidence"] = round(random.uniform(0.3, 0.8), 2)
        duped.append(dupe)

    # Add near-duplicates (slightly different phrasing)
    for i in range(n // 4):
        original = unique_facts[i]
        near = dict(original)
        near["id"] = f"test:fact:near{i:03d}"
        near["fact"] = original["fact"].replace("always", "should").replace("before deploying", "prior to deployment")
        near["confidence"] = round(random.uniform(0.4, 0.9), 2)
        duped.append(near)

    return duped

def main():
    parser = argparse.ArgumentParser(description="Knowledge deduplication")
    parser.add_argument("--input", help="Path to index.json")
    parser.add_argument("--output", help="Output path (default: overwrite input)")
    parser.add_argument("--threshold", type=float, default=0.95,
                        help="Near-dup similarity threshold (default: 0.95)")
    parser.add_argument("--dry-run", action="store_true", help="Report only, don't write")
    parser.add_argument("--test", action="store_true", help="Run built-in dedup test")
    parser.add_argument("--json", action="store_true", help="JSON output")
    args = parser.parse_args()

    if args.test:
        test_facts = generate_test_duplicates(20)
        print(f"Generated {len(test_facts)} test facts (20 unique + dupes)")
        deduped, stats = dedup_facts(test_facts, near_threshold=args.threshold)
        print("\nDedup results:")
        print(f"  Total input: {stats['total']}")
        print(f"  Exact dupes: {stats['exact_dupes']}")
        print(f"  Near dupes: {stats['near_dupes']}")
        print(f"  Unique output: {stats['unique']}")
        print(f"  Removed: {stats['removed']}")

        # Verify: should have ~20 unique (some merged)
        assert stats['unique'] <= 20, f"Too many unique: {stats['unique']} > 20"
        assert stats['unique'] >= 15, f"Too few unique: {stats['unique']} < 15"
        assert stats['removed'] > 0, "No duplicates removed"
        print("\nOK: Dedup test passed")
        return

    if not args.input:
        print("ERROR: Provide --input or --test")
        sys.exit(1)

    stats = dedup_index_file(args.input, args.output, args.threshold, args.dry_run)

    if args.json:
        print(json.dumps(stats, indent=2))
    else:
        print("Dedup results:")
        print(f"  Total input: {stats['total']}")
        print(f"  Exact dupes: {stats['exact_dupes']}")
        print(f"  Near dupes: {stats['near_dupes']}")
        print(f"  Unique output: {stats['unique']}")
        print(f"  Removed: {stats['removed']}")
        if args.dry_run:
            print("  (dry run — no changes written)")


if __name__ == "__main__":
    main()
@@ -1,249 +0,0 @@
#!/usr/bin/env python3
"""
Cross-Repo Dependency Graph Builder

Scans repos for import/require/reference patterns and builds a directed
dependency graph. Detects circular dependencies. Outputs DOT and Mermaid.

Usage:
    python3 scripts/dependency_graph.py /path/to/repos/
    python3 scripts/dependency_graph.py --repos repo1,repo2,repo3 --format mermaid
    python3 scripts/dependency_graph.py /path/to/repos/ --format dot --output deps.dot

Patterns detected:
    - Python: import X, from X import Y
    - JavaScript: require("X"), import ... from "X"
    - Go: import "X"
    - Ansible: include_role, import_role
    - Docker/Compose: image: X, depends_on
    - Config references: repo-name in YAML/TOML/JSON
"""

import argparse
import json
import os
import re
import sys
from collections import defaultdict
from pathlib import Path


# Known repo names for matching
KNOWN_REPOS = [
    "hermes-agent", "timmy-config", "timmy-home", "the-nexus", "the-door",
    "the-beacon", "fleet-ops", "burn-fleet", "timmy-dispatch", "turboquant",
    "compounding-intelligence", "the-playground", "second-son-of-timmy",
    "ai-safety-review", "the-echo-pattern", "timmy-academy", "wolf",
    "the-testament",
]

def normalize_repo_name(name: str) -> str:
    """Normalize a repo name for comparison."""
    return name.lower().replace("_", "-").replace(".git", "").strip()


def scan_file_for_deps(filepath: str, content: str, own_repo: str) -> set:
    """Scan a file's content for references to other repos."""
    deps = set()
    own_norm = normalize_repo_name(own_repo)

    for repo in KNOWN_REPOS:
        repo_norm = normalize_repo_name(repo)
        if repo_norm == own_norm:
            continue

        # Direct name references
        patterns = [
            repo,                          # exact name
            repo.replace("-", "_"),        # underscore variant
            repo.replace("-", ""),         # no separator
            f"/{repo}/",                   # path reference
            f'"{repo}"',                   # quoted
            f"'{repo}'",                   # single quoted
            f"Timmy_Foundation/{repo}",    # full Gitea path
            f"Timmy_Foundation.{repo}",    # Python module path
        ]

        for pattern in patterns:
            if pattern in content:
                deps.add(repo)
                break

    return deps

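A minimal sketch of the substring matching, using hypothetical file content and a subset of the pattern list:

```python
# Hypothetical file content and repo list, for illustration only.
content = 'docker pull git.example/Timmy_Foundation/the-nexus:latest'
repos = ["the-nexus", "fleet-ops"]

found = set()
for repo in repos:
    # Subset of the patterns above: exact name and full Gitea path.
    for pattern in (repo, f"Timmy_Foundation/{repo}"):
        if pattern in content:
            found.add(repo)
            break

print(sorted(found))  # → ['the-nexus']
```

Plain substring matching is deliberately coarse: it trades false positives (a repo name mentioned in prose counts as a dependency) for zero per-language parsing.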
def scan_repo(repo_path: str, repo_name: str = None) -> dict:
    """Scan a repo directory for dependencies."""
    path = Path(repo_path)
    if not path.is_dir():
        return {"error": f"Not a directory: {repo_path}"}

    if not repo_name:
        repo_name = path.name

    deps = set()
    files_scanned = 0
    exts = {".py", ".js", ".ts", ".go", ".yaml", ".yml", ".toml", ".json",
            ".md", ".sh", ".bash", ".Dockerfile", ".tf", ".hcl"}

    for fpath in path.rglob("*"):
        if not fpath.is_file():
            continue
        if fpath.suffix not in exts:
            continue
        # Skip common non-source dirs
        parts = fpath.parts
        if any(p in (".git", "node_modules", "__pycache__", ".venv", "venv",
                     "vendor", "dist", "build", ".tox") for p in parts):
            continue

        try:
            content = fpath.read_text(errors="ignore")
        except OSError:
            continue

        file_deps = scan_file_for_deps(str(fpath), content, repo_name)
        deps.update(file_deps)
        files_scanned += 1

    return {
        "repo": repo_name,
        "dependencies": sorted(deps),
        "files_scanned": files_scanned,
    }

def detect_cycles(graph: dict) -> list:
    """Detect circular dependencies using DFS."""
    cycles = []
    visited = set()
    rec_stack = set()

    def dfs(node, path):
        visited.add(node)
        rec_stack.add(node)

        for neighbor in graph.get(node, {}).get("dependencies", []):
            if neighbor not in visited:
                result = dfs(neighbor, path + [neighbor])
                if result:
                    return result
            elif neighbor in rec_stack:
                cycle_start = path.index(neighbor)
                return path[cycle_start:] + [neighbor]

        rec_stack.remove(node)
        return None

    for node in graph:
        if node not in visited:
            rec_stack.clear()  # discard leftovers from an earlier early-returned DFS
            cycle = dfs(node, [node])
            if cycle:
                cycles.append(cycle)

    return cycles

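The DFS cycle check, sketched on a toy graph with a deliberate a -> b -> c -> a cycle (shapes match what scan_repo() produces):

```python
# Hypothetical graph in the same shape scan_repo() produces.
graph = {
    "a": {"dependencies": ["b"]},
    "b": {"dependencies": ["c"]},
    "c": {"dependencies": ["a"]},
    "d": {"dependencies": []},
}

def find_cycle(graph):
    """Minimal DFS cycle finder mirroring detect_cycles() above."""
    visited, stack = set(), set()

    def dfs(node, path):
        visited.add(node)
        stack.add(node)
        for nb in graph.get(node, {}).get("dependencies", []):
            if nb not in visited:
                found = dfs(nb, path + [nb])
                if found:
                    return found
            elif nb in stack:
                # Back edge to a node on the current path: a cycle.
                return path[path.index(nb):] + [nb]
        stack.remove(node)
        return None

    for node in graph:
        if node not in visited:
            stack.clear()
            cycle = dfs(node, [node])
            if cycle:
                return cycle
    return None

print(find_cycle(graph))  # → ['a', 'b', 'c', 'a']
```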
def to_dot(graph: dict) -> str:
    """Generate DOT format output."""
    lines = ["digraph dependencies {"]
    lines.append("  rankdir=LR;")
    lines.append('  node [shape=box, style=filled, fillcolor="#1a1a2e", fontcolor="#e6edf3"];')
    lines.append('  edge [color="#4a4a6a"];')
    lines.append("")

    for repo, data in sorted(graph.items()):
        dep_count = len(data.get("dependencies", []))
        fill = "#2d1b69" if dep_count > 2 else "#16213e"
        lines.append(f'  "{repo}" [fillcolor="{fill}"];')
        for dep in data.get("dependencies", []):
            lines.append(f'  "{repo}" -> "{dep}";')

    lines.append("}")
    return "\n".join(lines)

def to_mermaid(graph: dict) -> str:
    """Generate Mermaid format output."""
    lines = ["graph LR"]

    for repo, data in sorted(graph.items()):
        for dep in data.get("dependencies", []):
            lines.append(f"  {repo.replace('-', '_')} --> {dep.replace('-', '_')}")

    # Add node labels
    lines.append("")
    for repo in sorted(graph.keys()):
        lines.append(f"  {repo.replace('-', '_')}[{repo}]")

    return "\n".join(lines)

def main():
    parser = argparse.ArgumentParser(description="Build cross-repo dependency graph")
    parser.add_argument("repos_dir", nargs="?", help="Directory containing repos")
    parser.add_argument("--repos", help="Comma-separated list of repo paths")
    parser.add_argument("--format", choices=["dot", "mermaid", "json"], default="json")
    parser.add_argument("--output", "-o", help="Output file (default: stdout)")
    parser.add_argument("--cycles-only", action="store_true", help="Only report cycles")
    args = parser.parse_args()

    results = {}
    repo_paths = []

    if args.repos:
        repo_paths = [p.strip() for p in args.repos.split(",")]
    elif args.repos_dir:
        base = Path(args.repos_dir)
        repo_paths = [str(p) for p in base.iterdir() if p.is_dir() and not p.name.startswith(".")]
    else:
        parser.print_help()
        sys.exit(1)

    for rpath in repo_paths:
        name = Path(rpath).name
        print(f"Scanning {name}...", file=sys.stderr)
        result = scan_repo(rpath, name)
        if "error" not in result:
            results[name] = result

    # Detect cycles
    cycles = detect_cycles(results)

    if args.cycles_only:
        if cycles:
            print("CIRCULAR DEPENDENCIES DETECTED:")
            for cycle in cycles:
                print(f"  {' -> '.join(cycle)}")
            sys.exit(1)
        else:
            print("No circular dependencies found.")
            sys.exit(0)

    # Output
    if args.format == "dot":
        output = to_dot(results)
    elif args.format == "mermaid":
        output = to_mermaid(results)
    else:
        output = json.dumps({
            "repos": results,
            "cycles": cycles,
            "summary": {
                "total_repos": len(results),
                "total_deps": sum(len(r["dependencies"]) for r in results.values()),
                "cycles_found": len(cycles),
            }
        }, indent=2)

    if args.output:
        Path(args.output).write_text(output)
        print(f"Written to {args.output}", file=sys.stderr)
    else:
        print(output)


if __name__ == "__main__":
    main()
@@ -1,216 +0,0 @@
#!/usr/bin/env python3
"""
Diff Analyzer — Parse unified diffs and categorize every change.

Pipeline 6.1 for Compounding Intelligence.
"""

import re
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import List, Dict, Any, Optional

class ChangeCategory(Enum):
    ADDED = "added"
    DELETED = "deleted"
    MODIFIED = "modified"
    MOVED = "moved"
    CONTEXT = "context"

@dataclass
class Hunk:
    """A single diff hunk with header, line ranges, and category."""
    header: str
    old_start: int
    old_count: int
    new_start: int
    new_count: int
    lines: List[str] = field(default_factory=list)
    category: ChangeCategory = ChangeCategory.CONTEXT

    def to_dict(self) -> Dict[str, Any]:
        d = asdict(self)
        d["category"] = self.category.value
        return d

@dataclass
class FileChange:
    """A single file's changes."""
    path: str
    old_path: Optional[str] = None  # For renames
    hunks: List[Hunk] = field(default_factory=list)
    added_lines: int = 0
    deleted_lines: int = 0
    is_new: bool = False
    is_deleted: bool = False
    is_renamed: bool = False
    is_binary: bool = False

    def to_dict(self) -> Dict[str, Any]:
        return {
            "path": self.path,
            "old_path": self.old_path,
            "hunks": [h.to_dict() for h in self.hunks],
            "added_lines": self.added_lines,
            "deleted_lines": self.deleted_lines,
            "is_new": self.is_new,
            "is_deleted": self.is_deleted,
            "is_renamed": self.is_renamed,
            "is_binary": self.is_binary,
        }

@dataclass
class ChangeSummary:
    """Aggregate stats + per-file breakdown."""
    files: List[FileChange] = field(default_factory=list)
    total_added: int = 0
    total_deleted: int = 0
    total_files_changed: int = 0
    total_hunks: int = 0
    new_files: int = 0
    deleted_files: int = 0
    renamed_files: int = 0
    binary_files: int = 0

    def to_dict(self) -> Dict[str, Any]:
        return {
            "total_files_changed": self.total_files_changed,
            "total_added": self.total_added,
            "total_deleted": self.total_deleted,
            "total_hunks": self.total_hunks,
            "new_files": self.new_files,
            "deleted_files": self.deleted_files,
            "renamed_files": self.renamed_files,
            "binary_files": self.binary_files,
            "files": [f.to_dict() for f in self.files],
        }

class DiffAnalyzer:
    """Parses unified diff format and produces structured ChangeSummary."""

    HUNK_HEADER_RE = re.compile(r"^@@\s+-(\d+)(?:,(\d+))?\s+\+(\d+)(?:,(\d+))?\s+@@(.*)$")
    DIFF_FILE_RE = re.compile(r"^diff --git a/(.*) b/(.*)")
    RENAME_RE = re.compile(r"^rename from (.+)$")
    RENAME_TO_RE = re.compile(r"^rename to (.+)$")
    NEW_FILE_RE = re.compile(r"^new file mode")
    DELETED_FILE_RE = re.compile(r"^deleted file mode")
    BINARY_RE = re.compile(r"^Binary files .* differ")

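The hunk-header regex can be exercised directly; here it parses one of the `@@` markers from this very diff:

```python
import re

HUNK_HEADER_RE = re.compile(r"^@@\s+-(\d+)(?:,(\d+))?\s+\+(\d+)(?:,(\d+))?\s+@@(.*)$")

m = HUNK_HEADER_RE.match("@@ -1,317 +0,0 @@ def main():")
# Groups 1-4 are old start/count and new start/count; group 5 is the
# optional context function name after the closing @@.
print(m.group(1), m.group(2), m.group(3), m.group(4), m.group(5).strip())
# → 1 317 0 0 def main():
```

The `(?:,(\d+))?` groups are optional because `git` omits the count when a range covers exactly one line, which is why the parser falls back to 1 with `int(m.group(2) or 1)`.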
    def analyze(self, diff_text: str) -> ChangeSummary:
        """Parse a unified diff and return a ChangeSummary."""
        summary = ChangeSummary()
        if not diff_text or not diff_text.strip():
            return summary

        # Split diff into per-file sections
        file_diffs = self._split_files(diff_text)

        for file_diff in file_diffs:
            fc = self._parse_file_diff(file_diff)
            summary.files.append(fc)
            summary.total_added += fc.added_lines
            summary.total_deleted += fc.deleted_lines
            summary.total_hunks += len(fc.hunks)
            if fc.is_new:
                summary.new_files += 1
            if fc.is_deleted:
                summary.deleted_files += 1
            if fc.is_renamed:
                summary.renamed_files += 1
            if fc.is_binary:
                summary.binary_files += 1

        summary.total_files_changed = len(summary.files)
        return summary

    def _split_files(self, diff_text: str) -> List[str]:
        """Split a multi-file diff into individual file diffs."""
        lines = diff_text.split("\n")
        chunks = []
        current = []
        for line in lines:
            if line.startswith("diff --git ") and current:
                chunks.append("\n".join(current))
                current = [line]
            else:
                current.append(line)
        if current:
            chunks.append("\n".join(current))
        return chunks

    def _parse_file_diff(self, diff_text: str) -> FileChange:
        """Parse a single file's diff section."""
        lines = diff_text.split("\n")
        fc = FileChange(path="")

        # Extract file paths
        for line in lines:
            m = self.DIFF_FILE_RE.match(line)
            if m:
                fc.path = m.group(2)
                break

        # Check for special states
        for line in lines:
            if self.NEW_FILE_RE.match(line):
                fc.is_new = True
            elif self.DELETED_FILE_RE.match(line):
                fc.is_deleted = True
            elif (rename_m := self.RENAME_RE.match(line)):
                fc.old_path = rename_m.group(1)
                fc.is_renamed = True
            elif self.BINARY_RE.match(line):
                fc.is_binary = True
                return fc  # No hunks for binary files

        # Rename target
        for line in lines:
            m = self.RENAME_TO_RE.match(line)
            if m and fc.is_renamed:
                fc.path = m.group(1)

        # Parse hunks
        current_hunk = None
        for line in lines:
            m = self.HUNK_HEADER_RE.match(line)
            if m:
                if current_hunk:
                    self._classify_hunk(current_hunk, fc)
                    fc.hunks.append(current_hunk)
                current_hunk = Hunk(
                    header=m.group(5).strip(),
                    old_start=int(m.group(1)),
                    old_count=int(m.group(2) or 1),
                    new_start=int(m.group(3)),
                    new_count=int(m.group(4) or 1),
                )
            elif current_hunk and (line.startswith("+") or line.startswith("-") or line.startswith(" ")):
                current_hunk.lines.append(line)

        if current_hunk:
            self._classify_hunk(current_hunk, fc)
            fc.hunks.append(current_hunk)

        return fc

    def _classify_hunk(self, hunk: Hunk, fc: FileChange):
        """Classify a hunk and count lines."""
        added = sum(1 for l in hunk.lines if l.startswith("+"))
        deleted = sum(1 for l in hunk.lines if l.startswith("-"))

        fc.added_lines += added
        fc.deleted_lines += deleted

        if added > 0 and deleted == 0:
            hunk.category = ChangeCategory.ADDED
        elif deleted > 0 and added == 0:
            hunk.category = ChangeCategory.DELETED
        elif added > 0 and deleted > 0:
            hunk.category = ChangeCategory.MODIFIED
        else:
            hunk.category = ChangeCategory.CONTEXT
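The classification rule reduces to a small decision table; a hand-rolled check on a made-up hunk:

```python
# Hypothetical hunk body: two additions, one deletion, one context line.
hunk_lines = ["+new line", "+another", "-old line", " context"]

added = sum(1 for l in hunk_lines if l.startswith("+"))
deleted = sum(1 for l in hunk_lines if l.startswith("-"))

# Same decision table as _classify_hunk() above.
if added and not deleted:
    category = "added"
elif deleted and not added:
    category = "deleted"
elif added and deleted:
    category = "modified"
else:
    category = "context"

print(added, deleted, category)  # → 2 1 modified
```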
@@ -1,387 +0,0 @@
#!/usr/bin/env python3
"""
Knowledge Freshness Cron — Detect stale entries from code changes (Issue #200)

Automatically detects when knowledge entries become stale due to code changes.

Detection Method:
    1. Track source file hash alongside knowledge entry
    2. Compare current file hashes vs stored
    3. Mismatch → flag entry as potentially stale
    4. Report stale entries and optionally re-extract

Usage:
    python3 scripts/freshness.py --knowledge-dir knowledge/
    python3 scripts/freshness.py --knowledge-dir knowledge/ --json
    python3 scripts/freshness.py --knowledge-dir knowledge/ --repo /path/to/repo
    python3 scripts/freshness.py --knowledge-dir knowledge/ --auto-reextract
"""

import argparse
import hashlib
import json
import os
import subprocess
import sys
import yaml  # third-party: PyYAML
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple

def compute_file_hash(filepath: str) -> Optional[str]:
|
||||
"""Compute SHA-256 hash of a file. Returns None if file doesn't exist."""
|
||||
try:
|
||||
with open(filepath, "rb") as f:
|
||||
return "sha256:" + hashlib.sha256(f.read()).hexdigest()
|
||||
except (FileNotFoundError, IsADirectoryError, PermissionError):
|
||||
return None
|
||||
|
||||
|
||||
def get_git_file_changes(repo_path: str, days: int = 1) -> Dict[str, List[str]]:
|
||||
"""
|
||||
Get files changed in git in the last N days.
|
||||
|
||||
Returns dict with 'modified', 'added', 'deleted' lists of file paths.
|
||||
"""
|
||||
changes = {"modified": [], "added": [], "deleted": []}
|
||||
|
||||
try:
|
||||
# Get commits from last N days
|
||||
cmd = [
|
||||
"git", "-C", repo_path, "log",
|
||||
f"--since={days} days ago",
|
||||
"--name-status",
|
||||
"--pretty=format:",
|
||||
"--diff-filter=MAD"
|
||||
]
|
||||
result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
|
||||
|
||||
if result.returncode != 0:
|
||||
return changes
|
||||
|
||||
for line in result.stdout.splitlines():
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
|
||||
parts = line.split('\t', 1)
|
||||
if len(parts) != 2:
|
||||
continue
|
||||
|
||||
status, filepath = parts
|
||||
if status == 'M':
|
||||
changes["modified"].append(filepath)
|
||||
elif status == 'A':
|
||||
changes["added"].append(filepath)
|
||||
elif status == 'D':
|
||||
changes["deleted"].append(filepath)
|
||||
|
||||
except (subprocess.TimeoutExpired, FileNotFoundError):
|
||||
pass
|
||||
|
||||
# Deduplicate
|
||||
for key in changes:
|
||||
changes[key] = list(set(changes[key]))
|
||||
|
||||
return changes
|
||||
|
||||
|
||||
def load_knowledge_entries(knowledge_dir: str) -> List[Dict[str, Any]]:
|
||||
"""
|
||||
Load knowledge entries from YAML files in the knowledge directory.
|
||||
|
||||
Supports:
|
||||
- knowledge/index.json (legacy format)
|
||||
- knowledge/global/*.yaml
|
||||
- knowledge/repos/*.yaml
|
||||
- knowledge/agents/*.yaml
|
||||
"""
|
||||
entries = []
|
||||
|
||||
# Load from index.json if exists
|
||||
index_path = os.path.join(knowledge_dir, "index.json")
|
||||
if os.path.exists(index_path):
|
||||
try:
|
||||
with open(index_path) as f:
|
||||
data = json.load(f)
|
||||
for fact in data.get("facts", []):
|
||||
entries.append({
|
||||
"source": "index.json",
|
||||
"fact": fact.get("fact", ""),
|
||||
"source_file": fact.get("source_file"),
|
||||
"source_hash": fact.get("source_hash"),
|
||||
"category": fact.get("category", "unknown"),
|
||||
"confidence": fact.get("confidence", 0.5)
|
||||
})
|
||||
except (json.JSONDecodeError, KeyError):
|
||||
pass
|
||||
|
||||
# Load from YAML files
|
||||
for subdir in ["global", "repos", "agents"]:
|
||||
subdir_path = os.path.join(knowledge_dir, subdir)
|
||||
if not os.path.isdir(subdir_path):
|
||||
continue
|
||||
|
||||
for filename in os.listdir(subdir_path):
|
||||
if not filename.endswith((".yaml", ".yml")):
|
||||
continue
|
||||
|
||||
filepath = os.path.join(subdir_path, filename)
|
||||
try:
|
||||
with open(filepath) as f:
|
||||
data = yaml.safe_load(f)
|
||||
|
||||
if not data or not isinstance(data, dict):
|
||||
continue
|
||||
|
||||
# Extract entries from YAML structure
|
||||
for key, value in data.items():
|
||||
if isinstance(value, list):
|
||||
for item in value:
|
||||
if isinstance(item, dict):
|
||||
entries.append({
|
||||
                                    "source": f"{subdir}/{filename}",
                                    "fact": item.get("description", item.get("fact", "")),
                                    "source_file": item.get("source_file"),
                                    "source_hash": item.get("source_hash"),
                                    "category": item.get("category", "unknown"),
                                    "confidence": item.get("confidence", 0.5)
                                })
                    elif isinstance(value, dict):
                        entries.append({
                            "source": f"{subdir}/{filename}",
                            "fact": value.get("description", value.get("fact", "")),
                            "source_file": value.get("source_file"),
                            "source_hash": value.get("source_hash"),
                            "category": value.get("category", "unknown"),
                            "confidence": value.get("confidence", 0.5)
                        })
            except (yaml.YAMLError, IOError):
                pass

    return entries


def check_freshness(knowledge_dir: str, repo_root: str = ".",
                    days: int = 1) -> Dict[str, Any]:
    """
    Check freshness of knowledge entries against recent code changes.

    Returns:
        {
            "timestamp": ISO timestamp,
            "total_entries": int,
            "stale_entries": [...],
            "fresh_entries": [...],
            "git_changes": {...},
            "summary": {...}
        }
    """
    entries = load_knowledge_entries(knowledge_dir)
    git_changes = get_git_file_changes(repo_root, days)

    stale_entries = []
    fresh_entries = []

    for entry in entries:
        source_file = entry.get("source_file")
        if not source_file:
            # Entry without source file reference
            fresh_entries.append({**entry, "status": "no_source"})
            continue

        # Check if source file was recently modified
        is_stale = False
        reason = ""

        if source_file in git_changes["modified"]:
            is_stale = True
            reason = "source_modified"
        elif source_file in git_changes["deleted"]:
            is_stale = True
            reason = "source_deleted"
        elif source_file in git_changes["added"]:
            is_stale = True
            reason = "source_added"

        # Also check hash if available
        stored_hash = entry.get("source_hash")
        if stored_hash:
            full_path = os.path.join(repo_root, source_file)
            current_hash = compute_file_hash(full_path)

            if current_hash is None:
                is_stale = True
                reason = "source_missing"
            elif current_hash != stored_hash:
                is_stale = True
                reason = "hash_mismatch"

        if is_stale:
            stale_entries.append({
                **entry,
                "status": "stale",
                "reason": reason
            })
        else:
            fresh_entries.append({**entry, "status": "fresh"})

    # Compute summary
    total = len(entries)
    stale_count = len(stale_entries)
    fresh_count = len(fresh_entries)

    # Group stale entries by reason
    stale_by_reason = {}
    for entry in stale_entries:
        reason = entry.get("reason", "unknown")
        if reason not in stale_by_reason:
            stale_by_reason[reason] = 0
        stale_by_reason[reason] += 1

    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "total_entries": total,
        "stale_entries": stale_entries,
        "fresh_entries": fresh_entries,
        "git_changes": git_changes,
        "summary": {
            "total": total,
            "stale": stale_count,
            "fresh": fresh_count,
            "stale_percentage": round(stale_count / total * 100, 1) if total > 0 else 0,
            "stale_by_reason": stale_by_reason,
            "git_changes_summary": {
                "modified": len(git_changes["modified"]),
                "added": len(git_changes["added"]),
                "deleted": len(git_changes["deleted"])
            }
        }
    }


def update_stale_hashes(knowledge_dir: str, repo_root: str = ".") -> int:
    """
    Update hashes for stale entries. Returns count of updated entries.
    """
    entries = load_knowledge_entries(knowledge_dir)
    updated = 0

    # This is a simplified version - in practice, you'd need to
    # write back to the specific YAML files
    for entry in entries:
        source_file = entry.get("source_file")
        if not source_file:
            continue

        full_path = os.path.join(repo_root, source_file)
        current_hash = compute_file_hash(full_path)

        if current_hash and entry.get("source_hash") != current_hash:
            # Mark for update (in practice, you'd write back to the file)
            updated += 1

    return updated


def format_report(result: Dict[str, Any], max_items: int = 20) -> str:
    """Format freshness check results as a human-readable report."""
    timestamp = result["timestamp"]
    summary = result["summary"]
    stale_entries = result["stale_entries"]
    git_changes = result["git_changes"]

    lines = [
        "Knowledge Freshness Report",
        "=" * 50,
        f"Generated: {timestamp}",
        f"Total entries: {summary['total']}",
        f"Stale entries: {summary['stale']} ({summary['stale_percentage']}%)",
        f"Fresh entries: {summary['fresh']}",
        ""
    ]

    # Git changes summary
    lines.extend([
        "Git Changes (last 24h):",
        f" Modified: {len(git_changes['modified'])} files",
        f" Added: {len(git_changes['added'])} files",
        f" Deleted: {len(git_changes['deleted'])} files",
        ""
    ])

    # Stale entries by reason
    if summary.get("stale_by_reason"):
        lines.extend([
            "Stale Entries by Reason:",
            ""
        ])
        for reason, count in summary["stale_by_reason"].items():
            lines.append(f" {reason}: {count}")
        lines.append("")

    # List stale entries
    if stale_entries:
        lines.extend([
            "Stale Entries:",
            ""
        ])
        for i, entry in enumerate(stale_entries[:max_items], 1):
            source = entry.get("source_file", "?")
            reason = entry.get("reason", "unknown")
            fact = entry.get("fact", "")[:60]
            lines.append(f"{i:2d}. [{reason}] {source}")
            if fact:
                lines.append(f" {fact}")

        if len(stale_entries) > max_items:
            lines.append(f"\n... and {len(stale_entries) - max_items} more")
    else:
        lines.append("No stale entries found. All knowledge is fresh!")

    return "\n".join(lines)


def main():
    parser = argparse.ArgumentParser(
        description="Knowledge Freshness Cron — detect stale entries from code changes")
    parser.add_argument("--knowledge-dir", required=True,
                        help="Path to knowledge directory")
    parser.add_argument("--repo", default=".",
                        help="Path to repository for git change detection")
    parser.add_argument("--days", type=int, default=1,
                        help="Number of days to check for git changes (default: 1)")
    parser.add_argument("--json", action="store_true",
                        help="Output as JSON instead of human-readable")
    parser.add_argument("--max", type=int, default=20,
                        help="Maximum stale entries to show (default: 20)")
    parser.add_argument("--auto-reextract", action="store_true",
                        help="Auto-re-extract knowledge for stale entries")

    args = parser.parse_args()

    if not os.path.isdir(args.knowledge_dir):
        print(f"Error: {args.knowledge_dir} is not a directory", file=sys.stderr)
        sys.exit(1)

    if not os.path.isdir(args.repo):
        print(f"Error: {args.repo} is not a directory", file=sys.stderr)
        sys.exit(1)

    result = check_freshness(args.knowledge_dir, args.repo, args.days)

    if args.json:
        print(json.dumps(result, indent=2))
    else:
        print(format_report(result, args.max))

    # Auto-re-extract if requested
    if args.auto_reextract and result["stale_entries"]:
        print(f"\nAuto-re-extracting {len(result['stale_entries'])} stale entries...")
        # In a real implementation, this would call the harvester
        print("(Auto-re-extraction not yet implemented)")


if __name__ == "__main__":
    main()
@@ -1,131 +0,0 @@
#!/usr/bin/env python3
"""
Gitea Issue Body Parser — Extract structured data from markdown issue bodies.

Usage:
    cat issue_body.txt | python3 scripts/gitea_issue_parser.py --stdin --pretty
    python3 scripts/gitea_issue_parser.py --url https://forge.../api/v1/repos/.../issues/123 --pretty
    python3 scripts/gitea_issue_parser.py body.txt --title "Fix thing (#42)" --labels pipeline extraction
"""

import argparse
import json
import re
import sys
from typing import Dict, List, Any, Optional


def parse_issue_body(body: str, title: str = "", labels: Optional[List[str]] = None) -> Dict[str, Any]:
    """Parse a Gitea issue markdown body into structured JSON.

    Extracted fields:
    - title: Issue title
    - context: Background/description section
    - criteria[]: Acceptance criteria (checkboxes or numbered lists)
    - labels[]: Issue labels
    - epic_ref: Parent/epic issue reference (from "Closes #N" or title)
    - sections{}: All ## sections as key-value pairs
    """
    result = {
        "title": title,
        "context": "",
        "criteria": [],
        "labels": labels or [],
        "epic_ref": None,
        "sections": {},
    }

    if not body:
        return result

    # Extract epic reference from title or body
    epic_patterns = [
        r"(?:closes|fixes|addresses|refs?)\s+#(\d+)",
        r"#(\d+)",
    ]
    for pattern in epic_patterns:
        match = re.search(pattern, (title + " " + body).lower())
        if match:
            result["epic_ref"] = int(match.group(1))
            break

    # Parse ## sections
    section_pattern = r"^##\s+(.+?)$\n((?:^(?!##\s).*$\n?)*)"
    for match in re.finditer(section_pattern, body, re.MULTILINE):
        section_name = match.group(1).strip().lower().replace(" ", "_")
        section_content = match.group(2).strip()
        result["sections"][section_name] = section_content

    # Extract acceptance criteria (checkboxes)
    checkbox_pattern = r"^\s*-\s*\[([ xX])\]\s*(.+)$"
    for match in re.finditer(checkbox_pattern, body, re.MULTILINE):
        checked = match.group(1).lower() == "x"
        text = match.group(2).strip()
        result["criteria"].append({"text": text, "checked": checked})

    # If no checkboxes, try numbered lists in "Acceptance Criteria" or "Criteria" section
    if not result["criteria"]:
        for section_name in ["acceptance_criteria", "criteria", "acceptance criteria"]:
            if section_name in result["sections"]:
                numbered = r"^\s*\d+\.\s*(.+)$"
                for match in re.finditer(numbered, result["sections"][section_name], re.MULTILINE):
                    result["criteria"].append({"text": match.group(1).strip(), "checked": False})
                break

    # Extract context (first section or first paragraph before any ## heading)
    first_heading = body.find("## ")
    if first_heading > 0:
        context_text = body[:first_heading].strip()
    else:
        context_text = body.split("\n\n")[0].strip()
    # Clean up: remove "## Context" or "## Problem" header if present
    context_text = re.sub(r"^#+\s*\w+\s*\n?", "", context_text).strip()
    result["context"] = context_text[:500]  # Cap at 500 chars

    return result


def fetch_issue_from_url(url: str) -> Dict[str, Any]:
    """Fetch an issue from a Gitea API URL and parse it."""
    import urllib.request
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())

    return parse_issue_body(
        body=data.get("body", ""),
        title=data.get("title", ""),
        labels=[l["name"] for l in data.get("labels", [])]
    )


def main():
    parser = argparse.ArgumentParser(description="Parse Gitea issue markdown into structured JSON")
    parser.add_argument("file", nargs="?", help="Issue body file (or use --stdin)")
    parser.add_argument("--stdin", action="store_true", help="Read from stdin")
    parser.add_argument("--url", help="Gitea API URL to fetch issue from")
    parser.add_argument("--title", default="", help="Issue title")
    parser.add_argument("--labels", nargs="*", default=[], help="Issue labels")
    parser.add_argument("--pretty", action="store_true", help="Pretty-print JSON output")

    args = parser.parse_args()

    if args.url:
        result = fetch_issue_from_url(args.url)
    elif args.stdin:
        body = sys.stdin.read()
        result = parse_issue_body(body, args.title, args.labels)
    elif args.file:
        with open(args.file) as f:
            body = f.read()
        result = parse_issue_body(body, args.title, args.labels)
    else:
        parser.print_help()
        sys.exit(1)

    indent = 2 if args.pretty else None
    print(json.dumps(result, indent=indent))


if __name__ == "__main__":
    main()
@@ -1,447 +0,0 @@
#!/usr/bin/env python3
"""
harvester.py — Extract durable knowledge from Hermes session transcripts.

Combines session_reader + extraction prompt + LLM inference to pull
facts, pitfalls, patterns, and tool quirks from finished sessions.

Usage:
    python3 harvester.py --session ~/.hermes/sessions/session_xxx.jsonl --output knowledge/
    python3 harvester.py --batch --since 2026-04-01 --limit 100
    python3 harvester.py --session session.jsonl --dry-run  # Preview without writing
"""

import argparse
import json
import os
import sys
import time
import hashlib
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Add scripts dir to path for sibling imports
SCRIPT_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, str(SCRIPT_DIR))

from session_reader import read_session, extract_conversation, truncate_for_context, messages_to_text

# --- Configuration ---

DEFAULT_API_BASE = os.environ.get("HARVESTER_API_BASE", "https://api.nousresearch.com/v1")
DEFAULT_API_KEY = os.environ.get("HARVESTER_API_KEY", "")
DEFAULT_MODEL = os.environ.get("HARVESTER_MODEL", "xiaomi/mimo-v2-pro")
KNOWLEDGE_DIR = os.environ.get("HARVESTER_KNOWLEDGE_DIR", "knowledge")
PROMPT_PATH = os.environ.get("HARVESTER_PROMPT_PATH", str(SCRIPT_DIR.parent / "templates" / "harvest-prompt.md"))

# Where to look for API keys if not set via env
API_KEY_PATHS = [
    os.path.expanduser("~/.config/nous/key"),
    os.path.expanduser("~/.hermes/keymaxxing/active/minimax.key"),
    os.path.expanduser("~/.config/openrouter/key"),
]


def find_api_key() -> str:
    """Find API key from common locations."""
    for path in API_KEY_PATHS:
        if os.path.exists(path):
            with open(path) as f:
                key = f.read().strip()
            if key:
                return key
    return ""


def load_extraction_prompt() -> str:
    """Load the extraction prompt template."""
    path = Path(PROMPT_PATH)
    if not path.exists():
        print(f"ERROR: Extraction prompt not found at {path}", file=sys.stderr)
        print("Expected templates/harvest-prompt.md from issue #7", file=sys.stderr)
        sys.exit(1)
    return path.read_text(encoding='utf-8')


def call_llm(prompt: str, transcript: str, api_base: str, api_key: str, model: str) -> Optional[list[dict]]:
    """Call the LLM API to extract knowledge from a transcript."""
    import urllib.request

    messages = [
        {"role": "system", "content": prompt},
        {"role": "user", "content": f"Extract knowledge from this session transcript:\n\n{transcript}"}
    ]

    payload = json.dumps({
        "model": model,
        "messages": messages,
        "temperature": 0.1,  # Low temp for consistent extraction
        "max_tokens": 4096
    }).encode('utf-8')

    req = urllib.request.Request(
        f"{api_base}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        },
        method="POST"
    )

    try:
        with urllib.request.urlopen(req, timeout=60) as resp:
            result = json.loads(resp.read().decode('utf-8'))
            content = result["choices"][0]["message"]["content"]
            return parse_extraction_response(content)
    except Exception as e:
        print(f"ERROR: LLM API call failed: {e}", file=sys.stderr)
        return None


def parse_extraction_response(content: str) -> Optional[list[dict]]:
    """Parse the LLM response to extract knowledge items.

    Handles various response formats: raw JSON, markdown-wrapped JSON, etc.
    """
    # Try direct JSON parse first
    try:
        data = json.loads(content)
        if isinstance(data, dict) and 'knowledge' in data:
            return data['knowledge']
        if isinstance(data, list):
            return data
    except json.JSONDecodeError:
        pass

    # Try extracting JSON from markdown code blocks
    import re
    json_match = re.search(r'```(?:json)?\s*({.*?})\s*```', content, re.DOTALL)
    if json_match:
        try:
            data = json.loads(json_match.group(1))
            if isinstance(data, dict) and 'knowledge' in data:
                return data['knowledge']
            if isinstance(data, list):
                return data
        except json.JSONDecodeError:
            pass

    # Try finding any JSON object with knowledge array
    json_match = re.search(r'({[^{}]*"knowledge"[^{}]*\[[\s\S]*?\][^{}]*})', content)
    if json_match:
        try:
            data = json.loads(json_match.group(1))
            return data.get('knowledge', [])
        except json.JSONDecodeError:
            pass

    print("WARNING: Could not parse LLM response as JSON", file=sys.stderr)
    print(f"Response preview: {content[:500]}", file=sys.stderr)
    return None


def load_existing_knowledge(knowledge_dir: str) -> dict:
    """Load the existing knowledge index."""
    index_path = Path(knowledge_dir) / "index.json"
    if not index_path.exists():
        return {"version": 1, "last_updated": "", "total_facts": 0, "facts": []}

    try:
        with open(index_path, 'r', encoding='utf-8') as f:
            return json.load(f)
    except (json.JSONDecodeError, IOError) as e:
        print(f"WARNING: Could not load knowledge index: {e}", file=sys.stderr)
        return {"version": 1, "last_updated": "", "total_facts": 0, "facts": []}


def fact_fingerprint(fact: dict) -> str:
    """Generate a deduplication fingerprint for a fact.

    Uses the fact text normalized (lowercase, stripped) as the key.
    Similar facts will have similar fingerprints.
    """
    text = fact.get('fact', '').lower().strip()
    # Normalize whitespace
    text = ' '.join(text.split())
    return hashlib.md5(text.encode('utf-8')).hexdigest()


def deduplicate(new_facts: list[dict], existing: list[dict], similarity_threshold: float = 0.8) -> list[dict]:
    """Remove duplicate facts from new_facts that already exist in the knowledge store.

    Uses fingerprint matching for exact dedup and simple overlap check for near-dupes.
    """
    existing_fingerprints = set()
    existing_texts = []
    for f in existing:
        fp = fact_fingerprint(f)
        existing_fingerprints.add(fp)
        existing_texts.append(f.get('fact', '').lower().strip())

    unique = []
    for fact in new_facts:
        fp = fact_fingerprint(fact)
        if fp in existing_fingerprints:
            continue

        # Check for near-duplicates using simple word overlap
        fact_words = set(fact.get('fact', '').lower().split())
        is_dup = False
        for existing_text in existing_texts:
            existing_words = set(existing_text.split())
            if not fact_words or not existing_words:
                continue
            overlap = len(fact_words & existing_words) / max(len(fact_words | existing_words), 1)
            if overlap >= similarity_threshold:
                is_dup = True
                break

        if not is_dup:
            unique.append(fact)
            existing_fingerprints.add(fp)
            existing_texts.append(fact.get('fact', '').lower().strip())

    return unique


def validate_fact(fact: dict) -> bool:
    """Validate a single knowledge item has required fields."""
    required = ['fact', 'category', 'repo', 'confidence']
    for field in required:
        if field not in fact:
            return False

    if not isinstance(fact['fact'], str) or not fact['fact'].strip():
        return False

    valid_categories = ['fact', 'pitfall', 'pattern', 'tool-quirk', 'question']
    if fact['category'] not in valid_categories:
        return False

    if not isinstance(fact.get('confidence', 0), (int, float)):
        return False

    if not (0.0 <= fact['confidence'] <= 1.0):
        return False

    return True


def write_knowledge(index: dict, new_facts: list[dict], knowledge_dir: str, source_session: str = ""):
    """Write new facts to the knowledge store."""
    kdir = Path(knowledge_dir)
    kdir.mkdir(parents=True, exist_ok=True)

    # Add source tracking to each fact
    for fact in new_facts:
        fact['source_session'] = source_session
        fact['harvested_at'] = datetime.now(timezone.utc).isoformat()

    # Update index
    index['facts'].extend(new_facts)
    index['total_facts'] = len(index['facts'])
    index['last_updated'] = datetime.now(timezone.utc).isoformat()

    # Write index
    index_path = kdir / "index.json"
    with open(index_path, 'w', encoding='utf-8') as f:
        json.dump(index, f, indent=2, ensure_ascii=False)

    # Also write per-repo markdown files for human reading
    repos = {}
    for fact in new_facts:
        repo = fact.get('repo', 'global')
        repos.setdefault(repo, []).append(fact)

    for repo, facts in repos.items():
        if repo == 'global':
            md_path = kdir / "global" / "harvested.md"
        else:
            md_path = kdir / "repos" / f"{repo}.md"

        md_path.parent.mkdir(parents=True, exist_ok=True)

        # Append to existing or create new
        mode = 'a' if md_path.exists() else 'w'
        with open(md_path, mode, encoding='utf-8') as f:
            if mode == 'w':
                f.write(f"# Knowledge: {repo}\n\n")
            f.write(f"## Harvested {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M')}\n\n")
            for fact in facts:
                icon = {'fact': '📋', 'pitfall': '⚠️', 'pattern': '🔄', 'tool-quirk': '🔧', 'question': '❓'}.get(fact['category'], '•')
                f.write(f"- {icon} **{fact['category']}** (conf: {fact['confidence']:.1f}): {fact['fact']}\n")
            f.write("\n")


def harvest_session(session_path: str, knowledge_dir: str, api_base: str, api_key: str,
                    model: str, dry_run: bool = False, min_confidence: float = 0.3) -> dict:
    """Harvest knowledge from a single session.

    Returns: dict with stats (facts_found, facts_new, facts_dup, elapsed_seconds, error)
    """
    start_time = time.time()
    stats = {
        'session': session_path,
        'facts_found': 0,
        'facts_new': 0,
        'facts_dup': 0,
        'elapsed_seconds': 0,
        'error': None
    }

    try:
        # 1. Read session
        messages = read_session(session_path)
        if not messages:
            stats['error'] = "Empty session file"
            return stats

        # 2. Extract conversation
        conv = extract_conversation(messages)
        if not conv:
            stats['error'] = "No conversation turns found"
            return stats

        # 3. Truncate for context window
        truncated = truncate_for_context(conv, head=50, tail=50)
        transcript = messages_to_text(truncated)

        # 4. Load extraction prompt
        prompt = load_extraction_prompt()

        # 5. Call LLM
        raw_facts = call_llm(prompt, transcript, api_base, api_key, model)
        if raw_facts is None:
            stats['error'] = "LLM extraction failed"
            return stats

        # 6. Validate
        valid_facts = [f for f in raw_facts if validate_fact(f) and f.get('confidence', 0) >= min_confidence]
        stats['facts_found'] = len(valid_facts)

        # 7. Deduplicate
        existing_index = load_existing_knowledge(knowledge_dir)
        existing_facts = existing_index.get('facts', [])
        new_facts = deduplicate(valid_facts, existing_facts)
        stats['facts_new'] = len(new_facts)
        stats['facts_dup'] = len(valid_facts) - len(new_facts)

        # 8. Write (unless dry run)
        if new_facts and not dry_run:
            write_knowledge(existing_index, new_facts, knowledge_dir, source_session=session_path)

        stats['elapsed_seconds'] = round(time.time() - start_time, 2)
        return stats

    except Exception as e:
        stats['error'] = str(e)
        stats['elapsed_seconds'] = round(time.time() - start_time, 2)
        return stats


def batch_harvest(sessions_dir: str, knowledge_dir: str, api_base: str, api_key: str,
                  model: str, since: str = "", limit: int = 0, dry_run: bool = False) -> list[dict]:
    """Harvest knowledge from multiple sessions in batch."""
    sessions_path = Path(sessions_dir)
    if not sessions_path.is_dir():
        print(f"ERROR: Sessions directory not found: {sessions_dir}", file=sys.stderr)
        return []

    # Find session files
    session_files = sorted(sessions_path.glob("*.jsonl"), reverse=True)  # Newest first

    # Filter by date if --since provided
    if since:
        since_dt = datetime.fromisoformat(since.replace('Z', '+00:00'))
        filtered = []
        for sf in session_files:
            # Try to parse timestamp from filename (common format: session_YYYYMMDD_HHMMSS_hash.jsonl)
            try:
                parts = sf.stem.split('_')
                if len(parts) >= 3:
                    date_str = parts[1]
                    file_dt = datetime.strptime(date_str, '%Y%m%d').replace(tzinfo=timezone.utc)
                    if file_dt >= since_dt:
                        filtered.append(sf)
            except (ValueError, IndexError):
                # If we can't parse the date, include the file (be permissive)
                filtered.append(sf)
        session_files = filtered

    # Apply limit
    if limit > 0:
        session_files = session_files[:limit]

    print(f"Harvesting {len(session_files)} sessions...")

    results = []
    for i, sf in enumerate(session_files, 1):
        print(f"[{i}/{len(session_files)}] {sf.name}...", end=" ", flush=True)
        stats = harvest_session(str(sf), knowledge_dir, api_base, api_key, model, dry_run)
        if stats['error']:
            print(f"ERROR: {stats['error']}")
        else:
            print(f"{stats['facts_new']} new, {stats['facts_dup']} dup ({stats['elapsed_seconds']}s)")
        results.append(stats)

    return results


def main():
    parser = argparse.ArgumentParser(description="Harvest knowledge from session transcripts")
    parser.add_argument('--session', help='Path to a single session JSONL file')
    parser.add_argument('--batch', action='store_true', help='Batch mode: process multiple sessions')
    parser.add_argument('--sessions-dir', default=os.path.expanduser('~/.hermes/sessions'),
                        help='Directory containing session files (default: ~/.hermes/sessions)')
    parser.add_argument('--output', default='knowledge', help='Output directory for knowledge store')
    parser.add_argument('--since', default='', help='Only process sessions after this date (YYYY-MM-DD)')
    parser.add_argument('--limit', type=int, default=0, help='Max sessions to process (0=unlimited)')
    parser.add_argument('--api-base', default=DEFAULT_API_BASE, help='LLM API base URL')
    parser.add_argument('--api-key', default='', help='LLM API key (or set HARVESTER_API_KEY)')
    parser.add_argument('--model', default=DEFAULT_MODEL, help='Model to use for extraction')
    parser.add_argument('--dry-run', action='store_true', help='Preview without writing to knowledge store')
    parser.add_argument('--min-confidence', type=float, default=0.3, help='Minimum confidence threshold')

    args = parser.parse_args()

    # Resolve API key
    api_key = args.api_key or DEFAULT_API_KEY or find_api_key()
    if not api_key:
        print("ERROR: No API key found. Set HARVESTER_API_KEY or store in one of:", file=sys.stderr)
        for p in API_KEY_PATHS:
            print(f" {p}", file=sys.stderr)
        sys.exit(1)

    # Resolve knowledge directory
    knowledge_dir = args.output
    if not os.path.isabs(knowledge_dir):
        knowledge_dir = os.path.join(SCRIPT_DIR.parent, knowledge_dir)

    if args.session:
        # Single session mode
        stats = harvest_session(
            args.session, knowledge_dir, args.api_base, api_key, args.model,
            dry_run=args.dry_run, min_confidence=args.min_confidence
        )
        print(json.dumps(stats, indent=2))
        if stats['error']:
            sys.exit(1)
    elif args.batch:
        # Batch mode
        results = batch_harvest(
            args.sessions_dir, knowledge_dir, args.api_base, api_key, args.model,
            since=args.since, limit=args.limit, dry_run=args.dry_run
        )
        total_new = sum(r['facts_new'] for r in results)
        total_dup = sum(r['facts_dup'] for r in results)
        errors = sum(1 for r in results if r['error'])
        print(f"\nDone: {total_new} new facts, {total_dup} duplicates, {errors} errors")
    else:
        parser.print_help()
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
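The batch filter above assumes session filenames encode their date as the second underscore-separated token (`session_YYYYMMDD_HHMMSS_hash.jsonl`). A minimal standalone sketch of that parse, under the same naming assumption:

```python
from datetime import datetime, timezone

def session_date(stem: str):
    """Parse the YYYYMMDD token out of a session file stem, or return None."""
    parts = stem.split('_')
    if len(parts) < 3:
        return None
    try:
        return datetime.strptime(parts[1], '%Y%m%d').replace(tzinfo=timezone.utc)
    except ValueError:
        return None

print(session_date('session_20260115_093000_ab12cd'))  # → 2026-01-15 00:00:00+00:00
print(session_date('notes.jsonl'))                     # → None (no date token)
```

Stems that don't match the pattern yield `None`, which is where the "be permissive" branch in the filter comes in.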
@@ -1,451 +0,0 @@
#!/usr/bin/env python3
"""
Improvement Proposal Generator for compounding-intelligence.

Analyzes fleet session data to identify waste patterns and generates
concrete improvement proposals with ROI estimates.

Input: Session analytics JSON (from fleet metrics or session database)
Output: Markdown proposal document + JSON proposals

Usage:
    python3 scripts/improvement_proposals.py --input analytics.json
    python3 scripts/improvement_proposals.py --input analytics.json --format json
    python3 scripts/improvement_proposals.py --input analytics.json --output proposals.md
    python3 scripts/improvement_proposals.py --input analytics.json --threshold 5.0
"""

import argparse
import json
import re
import sys
from collections import defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Dict, List


# ── Data types ─────────────────────────────────────────────────

@dataclass
class WastePattern:
    """A detected waste pattern in fleet sessions."""
    pattern_type: str  # "repeated_error", "manual_process", "slow_tool", "failed_retry"
    description: str
    occurrences: int
    total_time_hours: float
    affected_repos: List[str] = field(default_factory=list)
    example_sessions: List[str] = field(default_factory=list)


@dataclass
class ImprovementProposal:
    """A concrete improvement proposal with ROI estimate."""
    title: str
    problem: str
    proposed_solution: str
    estimated_weekly_hours_saved: float
    estimated_monthly_hours_saved: float
    implementation_hours: float
    roi_weeks: float  # weeks to break even
    priority: str  # "critical", "high", "medium", "low"
    affected_area: str
    supporting_evidence: List[str] = field(default_factory=list)


# ── Session analysis ──────────────────────────────────────────

def analyze_sessions(sessions: List[dict]) -> List[WastePattern]:
    """
    Analyze session data to find waste patterns.

    Looks for:
    - Repeated errors (same error across sessions)
    - Manual processes (long sequences of similar actions)
    - Slow tools (tools with high latency)
    - Failed retries (multiple attempts at the same task)
    """
    patterns = []

    # Track error frequency, tool latency, retries, and manual sequences
    error_counts: Dict[str, List[dict]] = defaultdict(list)
    tool_latencies: Dict[str, List[float]] = defaultdict(list)
    retry_counts: Dict[str, int] = defaultdict(int)
    manual_sequences: List[dict] = []

    for session in sessions:
        session_id = session.get("session_id", "unknown")
        repo = session.get("repo", "global")
        errors = session.get("errors", [])
        tool_calls = session.get("tool_calls", [])
        duration_min = session.get("duration_minutes", 0)

        # 1. Repeated errors
        for err in errors:
            err_key = _normalize_error(err.get("message", ""))
            if err_key:
                error_counts[err_key].append({
                    "session_id": session_id,
                    "repo": repo,
                })

        # 2. Tool latency tracking
        for tc in tool_calls:
            tool_name = tc.get("tool", "unknown")
            latency = tc.get("latency_ms", 0)
            if latency > 0:
                tool_latencies[tool_name].append(latency)

        # 3. Failed retries (same tool called 3+ times in sequence)
        prev_tool = None
        streak = 0
        for tc in tool_calls:
            tool_name = tc.get("tool", "unknown")
            if tool_name == prev_tool:
                streak += 1
            else:
                if streak >= 3:
                    retry_counts[prev_tool] += 1
                streak = 1
            prev_tool = tool_name
        if streak >= 3:
            retry_counts[prev_tool] += 1

        # 4. Manual processes (10+ sequential tool calls over few unique tools)
        if len(tool_calls) > 10:
            unique_tools = set(tc.get("tool") for tc in tool_calls)
            if len(unique_tools) <= 3:
                manual_sequences.append({
                    "session_id": session_id,
                    "repo": repo,
                    "tool_count": len(tool_calls),
                    "unique_tools": list(unique_tools),
                    "duration_min": duration_min,
                })

    # Generate patterns from collected data

    # Repeated errors (appearing in 3+ sessions)
    for err_key, occurrences in error_counts.items():
        if len(occurrences) >= 3:
            repos = list(set(o["repo"] for o in occurrences))
            sessions_list = [o["session_id"] for o in occurrences[:5]]
            # Estimate time wasted: ~5 minutes per error occurrence
            hours = len(occurrences) * 5 / 60
            patterns.append(WastePattern(
                pattern_type="repeated_error",
                description=f"Error: {err_key[:100]}",
                occurrences=len(occurrences),
                total_time_hours=round(hours, 1),
                affected_repos=repos,
                example_sessions=sessions_list,
            ))

    # Slow tools (average latency > 5000 ms across 5+ calls)
    for tool, latencies in tool_latencies.items():
        if len(latencies) >= 5:
            avg_ms = sum(latencies) / len(latencies)
            if avg_ms > 5000:
                hours = sum(latencies) / 1000 / 3600
                patterns.append(WastePattern(
                    pattern_type="slow_tool",
                    description=f"Tool '{tool}' averages {avg_ms:.0f}ms latency",
                    occurrences=len(latencies),
                    total_time_hours=round(hours, 1),
                    affected_repos=["global"],
                    example_sessions=[],
                ))

    # Failed retries
    for tool, count in retry_counts.items():
        if count >= 2:
            hours = count * 10 / 60  # ~10 minutes per failed retry sequence
            patterns.append(WastePattern(
                pattern_type="failed_retry",
                description=f"Tool '{tool}' had {count} retry sequences (3+ consecutive calls)",
                occurrences=count,
                total_time_hours=round(hours, 1),
                affected_repos=["global"],
                example_sessions=[],
            ))

    # Manual processes
    for seq in manual_sequences:
        hours = seq["duration_min"] / 60
        patterns.append(WastePattern(
            pattern_type="manual_process",
            description=(
                f"Session {seq['session_id'][:12]}: {seq['tool_count']} tool calls "
                f"with only {len(seq['unique_tools'])} unique tools"
            ),
            occurrences=1,
            total_time_hours=round(hours, 1),
            affected_repos=[seq["repo"]],
            example_sessions=[seq["session_id"]],
        ))

    return sorted(patterns, key=lambda p: p.total_time_hours, reverse=True)


def _normalize_error(message: str) -> str:
    """Normalize an error message to a common key."""
    if not message:
        return ""
    msg = message.lower().strip()
    # Mask variable parts (paths, IDs, timestamps) so equal errors collapse to one key
    msg = re.sub(r'/\S+', '/PATH', msg)
    msg = re.sub(r'\b[0-9a-f]{8,}\b', 'HASH', msg)
    msg = re.sub(r'\d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}', 'TIME', msg)
    return msg[:150]


# ── Proposal generation ───────────────────────────────────────

def generate_proposals(
    patterns: List[WastePattern],
    hourly_rate: float = 50.0,
    implementation_overhead: float = 1.5,
) -> List[ImprovementProposal]:
    """
    Generate improvement proposals from waste patterns.

    Args:
        patterns: Detected waste patterns
        hourly_rate: Developer hourly rate (reserved for future dollar-denominated
            ROI; current estimates are time-based only)
        implementation_overhead: Multiplier for implementation time estimates
    """
    proposals = []

    # Group patterns by type
    by_type: Dict[str, List[WastePattern]] = defaultdict(list)
    for p in patterns:
        by_type[p.pattern_type].append(p)

    # 1. Repeated errors → create a fix or skill
    for p in by_type.get("repeated_error", []):
        weekly_hours = p.total_time_hours / 4  # monthly → weekly
        impl_hours = max(1.0, p.occurrences * 0.25) * implementation_overhead
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Fix repeated error: {p.description[:60]}",
            problem=(
                f"This error occurred {p.occurrences} times across "
                f"{len(p.affected_repos)} repos, wasting ~{p.total_time_hours:.1f} hours."
            ),
            proposed_solution="Root-cause the error and create a permanent fix or mitigation skill.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(p.total_time_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="reliability",
            supporting_evidence=[
                f"{p.occurrences} occurrences in sessions: {', '.join(p.example_sessions[:3])}"
            ],
        ))

    # 2. Slow tools → optimize or replace
    for p in by_type.get("slow_tool", []):
        weekly_hours = p.total_time_hours / 4
        impl_hours = 3.0 * implementation_overhead  # optimization task
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Optimize slow tool: {p.description[:60]}",
            problem=(
                f"Tool has {p.occurrences} calls with high latency, "
                f"wasting ~{p.total_time_hours:.1f} hours total."
            ),
            proposed_solution="Profile the tool, add caching, or replace it with a faster alternative.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(p.total_time_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="performance",
            supporting_evidence=[f"{p.occurrences} slow calls detected"],
        ))

    # 3. Failed retries → add retry logic or validation
    for p in by_type.get("failed_retry", []):
        weekly_hours = p.total_time_hours / 4
        impl_hours = 2.0 * implementation_overhead
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Reduce retries for tool '{p.description[:50]}'",
            problem=f"Tool had {p.occurrences} retry sequences, wasting ~{p.total_time_hours:.1f} hours.",
            proposed_solution="Add input validation, pre-flight checks, or automatic retry with backoff.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(p.total_time_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="reliability",
            supporting_evidence=[f"{p.occurrences} retry sequences detected"],
        ))

    # 4. Manual processes → automate
    manual_patterns = by_type.get("manual_process", [])
    total_manual_hours = sum(p.total_time_hours for p in manual_patterns)
    if manual_patterns:
        weekly_hours = total_manual_hours / 4
        impl_hours = len(manual_patterns) * 2.0 * implementation_overhead
        roi_weeks = impl_hours / weekly_hours if weekly_hours > 0 else float('inf')

        proposals.append(ImprovementProposal(
            title=f"Automate {len(manual_patterns)} manual processes",
            problem=(
                f"{len(manual_patterns)} sessions had long manual tool sequences, "
                f"wasting ~{total_manual_hours:.1f} hours."
            ),
            proposed_solution="Create composite skills or scripts that combine the repeated tool sequences into single operations.",
            estimated_weekly_hours_saved=round(weekly_hours, 1),
            estimated_monthly_hours_saved=round(total_manual_hours, 1),
            implementation_hours=round(impl_hours, 1),
            roi_weeks=round(roi_weeks, 1),
            priority=_priority_from_roi(roi_weeks),
            affected_area="automation",
            supporting_evidence=[f"{len(manual_patterns)} manual sessions detected"],
        ))

    return sorted(proposals, key=lambda p: p.estimated_monthly_hours_saved, reverse=True)


def _priority_from_roi(roi_weeks: float) -> str:
    """Determine priority from the ROI break-even time."""
    if roi_weeks <= 1:
        return "critical"
    elif roi_weeks <= 4:
        return "high"
    elif roi_weeks <= 12:
        return "medium"
    return "low"


# ── Output formatting ─────────────────────────────────────────

def format_proposals_markdown(
    proposals: List[ImprovementProposal],
    patterns: List[WastePattern],
    generated_at: str,
) -> str:
    """Format proposals as a markdown document."""
    repo_count = len(set(r for p in patterns for r in p.affected_repos))
    lines = [
        "# Improvement Proposals",
        "",
        f"Generated: {generated_at}",
        f"Based on analysis of {sum(p.occurrences for p in patterns)} waste events across {repo_count} repos.",
        "",
        "---",
        "",
        "## Summary",
        "",
        "| Metric | Value |",
        "|--------|-------|",
        f"| Total proposals | {len(proposals)} |",
        f"| Critical priority | {sum(1 for p in proposals if p.priority == 'critical')} |",
        f"| Total monthly hours recoverable | {sum(p.estimated_monthly_hours_saved for p in proposals):.1f}h |",
        f"| Total weekly hours recoverable | {sum(p.estimated_weekly_hours_saved for p in proposals):.1f}h |",
        f"| Implementation investment | {sum(p.implementation_hours for p in proposals):.1f}h |",
        "",
        "---",
        "",
    ]

    for i, prop in enumerate(proposals, 1):
        lines.extend([
            f"## {i}. {prop.title}",
            "",
            f"**Priority:** {prop.priority.upper()}  ",
            f"**Area:** {prop.affected_area}  ",
            f"**ROI break-even:** {prop.roi_weeks:.1f} weeks",
            "",
            "### Problem",
            "",
            prop.problem,
            "",
            "### Proposed Solution",
            "",
            prop.proposed_solution,
            "",
            "### ROI Estimate",
            "",
            f"- Weekly hours saved: **{prop.estimated_weekly_hours_saved}h**",
            f"- Monthly hours saved: **{prop.estimated_monthly_hours_saved}h**",
            f"- Implementation time: **{prop.implementation_hours}h**",
            f"- Break-even: **{prop.roi_weeks:.1f} weeks**",
            "",
        ])

        if prop.supporting_evidence:
            lines.extend([
                "### Evidence",
                "",
            ])
            for ev in prop.supporting_evidence:
                lines.append(f"- {ev}")
            lines.append("")

        lines.extend(["---", ""])

    # Waste pattern appendix
    lines.extend([
        "## Appendix: Detected Waste Patterns",
        "",
        "| Type | Description | Occurrences | Hours |",
        "|------|-------------|-------------|-------|",
    ])
    for p in patterns[:20]:
        lines.append(
            f"| {p.pattern_type} | {p.description[:60]} | {p.occurrences} | {p.total_time_hours}h |"
        )
    lines.append("")

    return "\n".join(lines)


def format_proposals_json(proposals: List[ImprovementProposal]) -> str:
    """Format proposals as JSON."""
    return json.dumps(
        [asdict(p) for p in proposals],
        indent=2,
        default=str,
    )


# ── Main ──────────────────────────────────────────────────────

def main():
    parser = argparse.ArgumentParser(description="Generate improvement proposals from session analytics")
    parser.add_argument("--input", required=True, help="Path to session analytics JSON file")
    parser.add_argument("--output", help="Output file path (default: stdout)")
    parser.add_argument("--format", choices=["markdown", "json"], default="markdown", help="Output format")
    parser.add_argument("--hourly-rate", type=float, default=50.0, help="Developer hourly rate for ROI calc")
    parser.add_argument("--threshold", type=float, default=2.0, help="Min monthly hours to include a proposal")

    args = parser.parse_args()

    with open(args.input) as f:
        data = json.load(f)

    sessions = data if isinstance(data, list) else data.get("sessions", [])

    if not sessions:
        print("No sessions found in input file.", file=sys.stderr)
        sys.exit(1)

    # Analyze
    patterns = analyze_sessions(sessions)
    proposals = generate_proposals(patterns, hourly_rate=args.hourly_rate)

    # Filter by threshold
    proposals = [p for p in proposals if p.estimated_monthly_hours_saved >= args.threshold]

    generated_at = datetime.now(timezone.utc).isoformat()

    if args.format == "markdown":
        output = format_proposals_markdown(proposals, patterns, generated_at)
    else:
        output = format_proposals_json(proposals)

    if args.output:
        with open(args.output, "w") as f:
            f.write(output)
        print(f"Wrote {len(proposals)} proposals to {args.output}", file=sys.stderr)
    else:
        print(output)


if __name__ == "__main__":
    main()
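The error-normalization step is what lets the analyzer count "the same" error across sessions: variable fragments are masked so distinct messages collapse to a single key. A self-contained sketch mirroring those three substitutions (the sample messages are hypothetical):

```python
import re

def normalize(message: str) -> str:
    # Mask variable parts (paths, hex IDs, timestamps), as in _normalize_error above
    msg = message.lower().strip()
    msg = re.sub(r'/\S+', '/PATH', msg)
    msg = re.sub(r'\b[0-9a-f]{8,}\b', 'HASH', msg)
    msg = re.sub(r'\d{4}-\d{2}-\d{2}[tT]\d{2}:\d{2}', 'TIME', msg)
    return msg[:150]

# Two messages that differ only in their path collapse to one key
a = normalize("FileNotFoundError: /tmp/run1/out.log missing")
b = normalize("FileNotFoundError: /tmp/run2/out.log missing")
print(a)         # → filenotfounderror: /PATH missing
print(a == b)    # → True
```

Without this collapse, every path or timestamp variant would count as a distinct error and never cross the 3-occurrence threshold.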
@@ -1,275 +0,0 @@
"""
Knowledge Gap Identifier — Pipeline 10.7

Cross-references code, docs, and tests to find gaps:
- Undocumented functions/classes
- Untested code paths
- Documented but missing implementations
- Test files without corresponding source

Produces a gap report with severity and suggestions.
"""

from __future__ import annotations

import ast
import os
import re
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path
from typing import Dict, List, Optional, Set


class GapSeverity(Enum):
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


class GapType(Enum):
    UNDOCUMENTED = "undocumented"
    UNTESTED = "untested"
    MISSING_IMPLEMENTATION = "missing_implementation"
    ORPHAN_TEST = "orphan_test"
    STALE_DOC = "stale_doc"


@dataclass
class Gap:
    """A single knowledge gap."""
    gap_type: GapType
    severity: GapSeverity
    file: str
    line: Optional[int]
    name: str
    description: str
    suggestion: str


@dataclass
class GapReport:
    """Full gap analysis report."""
    repo_path: str
    gaps: List[Gap] = field(default_factory=list)
    stats: Dict[str, int] = field(default_factory=dict)

    def summary(self) -> str:
        lines = [f"Gap Report for {self.repo_path}", "=" * 40]
        by_type: Dict[str, List[Gap]] = {}
        for g in self.gaps:
            by_type.setdefault(g.gap_type.value, []).append(g)

        for gtype, items in sorted(by_type.items()):
            lines.append(f"\n{gtype.upper()} ({len(items)}):")
            for g in items:
                loc = f"{g.file}:{g.line}" if g.line else g.file
                lines.append(f"  [{g.severity.value}] {g.name} @ {loc}")
                lines.append(f"    {g.description}")

        lines.append(f"\nTotal gaps: {len(self.gaps)}")
        self.stats = {k: len(v) for k, v in by_type.items()}
        return "\n".join(lines)

    def to_dict(self) -> dict:
        stats: Dict[str, int] = {}
        for g in self.gaps:
            stats[g.gap_type.value] = stats.get(g.gap_type.value, 0) + 1
        return {
            "repo_path": self.repo_path,
            "total_gaps": len(self.gaps),
            "stats": stats,
            "gaps": [
                {
                    "type": g.gap_type.value,
                    "severity": g.severity.value,
                    "file": g.file,
                    "line": g.line,
                    "name": g.name,
                    "description": g.description,
                    "suggestion": g.suggestion,
                }
                for g in self.gaps
            ],
        }


def _collect_python_files(root: Path) -> List[Path]:
    """Collect .py files, excluding venv/node_modules/.git and similar directories."""
    skip = {".git", "venv", "env", ".venv", "node_modules", "__pycache__", ".tox", ".mypy_cache"}
    files = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in skip]
        for f in filenames:
            if f.endswith(".py"):
                files.append(Path(dirpath) / f)
    return files


def _extract_python_symbols(filepath: Path) -> Set[str]:
    """Extract top-level function and class names from a Python file."""
    symbols: Set[str] = set()
    try:
        source = filepath.read_text(encoding="utf-8", errors="replace")
        tree = ast.parse(source, filename=str(filepath))
    except (SyntaxError, UnicodeDecodeError):
        return symbols

    for node in ast.iter_child_nodes(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.add(node.name)
    return symbols


def _extract_doc_symbols(filepath: Path) -> Set[str]:
    """Extract function/class names mentioned in markdown docs."""
    symbols: Set[str] = set()
    try:
        text = filepath.read_text(encoding="utf-8", errors="replace")
    except (UnicodeDecodeError, OSError):
        return symbols

    # Match backtick-quoted identifiers: `ClassName`, `func_name`, `func()`
    for m in re.finditer(r"`([A-Za-z_]\w+)(?:\(\))?`", text):
        symbols.add(m.group(1))
    # Match headings like ## ClassName or ### func_name
    for m in re.finditer(r"^#{1,4}\s+(\w+)", text, re.MULTILINE):
        symbols.add(m.group(1))
    return symbols


def _collect_test_files(root: Path) -> Dict[str, Path]:
    """Map test module names to their file paths."""
    test_map: Dict[str, Path] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in {".git", "venv", "node_modules"}]
        for f in filenames:
            if f.startswith("test_") and f.endswith(".py"):
                # test_foo.py -> foo
                module_name = f[5:-3]
                test_map[module_name] = Path(dirpath) / f
    return test_map


class KnowledgeGapIdentifier:
    """Analyzes a repo for knowledge gaps between code, docs, and tests."""

    def analyze(self, repo_path: str) -> GapReport:
        root = Path(repo_path).resolve()
        report = GapReport(repo_path=str(root))

        if not root.is_dir():
            report.gaps.append(Gap(
                gap_type=GapType.UNDOCUMENTED,
                severity=GapSeverity.ERROR,
                file=str(root),
                line=None,
                name="repo",
                description="Path is not a directory",
                suggestion="Provide a valid repo directory",
            ))
            return report

        # Collect artifacts
        py_files = _collect_python_files(root)
        doc_files = list(root.glob("docs/**/*.md")) + list(root.glob("*.md"))
        test_map = _collect_test_files(root / "tests") if (root / "tests").is_dir() else {}

        # Extract symbols from each source file
        source_symbols: Dict[str, Set[str]] = {}  # relative_path -> symbols
        all_source_symbols: Set[str] = set()

        for pf in py_files:
            rel = str(pf.relative_to(root))
            # Skip test files and setup/config
            if "/tests/" in rel or rel.startswith("tests/") or rel.startswith("test_"):
                continue
            if pf.name in ("setup.py", "conftest.py", "conf.py"):
                continue

            syms = _extract_python_symbols(pf)
            if syms:
                source_symbols[rel] = syms
                all_source_symbols.update(syms)

        # Extract documented symbols
        doc_symbols: Set[str] = set()
        for df in doc_files:
            doc_symbols.update(_extract_doc_symbols(df))

        # Module names covered by tests, and module names defined in source
        tested_modules: Set[str] = set(test_map.keys())
        source_modules: Set[str] = {Path(f).stem for f in source_symbols}

        # --- Find gaps ---

        # 1. Undocumented: source symbols not in any doc
        for rel_path, syms in source_symbols.items():
            for sym in sorted(syms):
                if sym.startswith("_") and not sym.startswith("__"):
                    continue  # Skip private symbols
                if sym not in doc_symbols:
                    report.gaps.append(Gap(
                        gap_type=GapType.UNDOCUMENTED,
                        severity=GapSeverity.WARNING,
                        file=rel_path,
                        line=None,
                        name=sym,
                        description=f"{sym} defined in {rel_path} but not referenced in any docs",
                        suggestion=f"Add documentation for {sym} in a .md file",
                    ))

        # 2. Untested: source modules without a corresponding test file
        for rel_path in source_symbols:
            module_name = Path(rel_path).stem
            if module_name not in tested_modules and module_name not in ("__init__", "main", "config"):
                report.gaps.append(Gap(
                    gap_type=GapType.UNTESTED,
                    severity=GapSeverity.ERROR,
                    file=rel_path,
                    line=None,
                    name=module_name,
                    description=f"No test file found for {rel_path}",
                    suggestion=f"Create tests/test_{module_name}.py",
                ))

        # 3. Missing implementation: docs reference a symbol not in any source
        referenced_but_missing = doc_symbols - all_source_symbols
        for sym in sorted(referenced_but_missing):
            # Filter out common non-code terms
            if sym.lower() in {"todo", "fixme", "note", "example", "usage", "api",
                               "install", "setup", "config", "license", "contributing",
                               "changelog", "readme", "python", "bash", "json", "yaml",
                               "http", "url", "cli", "gui", "ui", "rest"}:
                continue
            if len(sym) < 3:
                continue
            report.gaps.append(Gap(
                gap_type=GapType.MISSING_IMPLEMENTATION,
                severity=GapSeverity.INFO,
                file="(docs)",
                line=None,
                name=sym,
                description=f"{sym} referenced in docs but not found in source code",
                suggestion=f"Verify whether {sym} should be implemented, or update the docs",
            ))

        # 4. Orphan tests: test files without a matching source module
        for test_mod, test_path in test_map.items():
            if test_mod not in source_modules:
                # Check whether any source file name partially matches
                matches_source = any(
                    test_mod in Path(f).stem
                    or test_mod.replace("_", "-") in f
                    or test_mod.replace("_", "") in Path(f).stem
                    for f in source_symbols
                )
                if not matches_source:
                    rel = str(test_path.relative_to(root))
                    report.gaps.append(Gap(
                        gap_type=GapType.ORPHAN_TEST,
                        severity=GapSeverity.WARNING,
                        file=rel,
                        line=None,
                        name=test_mod,
                        description=f"Test file {rel} exists but no matching source module was found",
                        suggestion="Verify whether the source was renamed or removed",
                    ))

        return report
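The doc scan treats backticked identifiers and heading words as "documented" symbols. A standalone sketch of that extraction, using the same two regexes on a made-up markdown snippet:

```python
import re

def doc_symbols(markdown: str) -> set:
    """Collect backticked identifiers and heading words, as the doc scan above does."""
    syms = {m.group(1) for m in re.finditer(r"`([A-Za-z_]\w+)(?:\(\))?`", markdown)}
    syms |= {m.group(1) for m in re.finditer(r"^#{1,4}\s+(\w+)", markdown, re.MULTILINE)}
    return syms

text = "## Harvester\n\nCall `harvest_session()` after `GapReport` is built."
print(sorted(doc_symbols(text)))  # → ['GapReport', 'Harvester', 'harvest_session']
```

Note the trailing `()` in `` `harvest_session()` `` is stripped by the optional group, so calls and bare names map to the same symbol.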
@@ -1,131 +0,0 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Knowledge Store Staleness Detector — Detect stale knowledge entries by comparing source file hashes.
|
||||
|
||||
Usage:
|
||||
python3 scripts/knowledge_staleness_check.py --index knowledge/index.json
|
||||
python3 scripts/knowledge_staleness_check.py --index knowledge/index.json --json
|
||||
python3 scripts/knowledge_staleness_check.py --index knowledge/index.json --fix
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import hashlib
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Any, Optional
|
||||
|
||||
|
||||
def compute_file_hash(filepath: str) -> Optional[str]:
|
||||
"""Compute SHA-256 hash of a file. Returns None if file doesn't exist."""
|
||||
try:
|
||||
with open(filepath, "rb") as f:
|
||||
return "sha256:" + hashlib.sha256(f.read()).hexdigest()
|
||||
except (FileNotFoundError, IsADirectoryError, PermissionError):
|
||||
return None
|
||||
|
||||
|
||||
def check_staleness(index_path: str, repo_root: str = ".") -> List[Dict[str, Any]]:
|
||||
"""Check all entries in knowledge index for staleness.
|
||||
|
||||
Returns list of entries with staleness info:
|
||||
- status: "fresh" | "stale" | "missing_source" | "no_hash"
|
||||
- current_hash: computed hash (if source exists)
|
||||
- stored_hash: hash from index
|
||||
"""
|
||||
with open(index_path) as f:
|
||||
data = json.load(f)
|
||||
|
||||
facts = data.get("facts", [])
|
||||
results = []
|
||||
|
||||
for entry in facts:
|
||||
source_file = entry.get("source_file")
|
||||
stored_hash = entry.get("source_hash")
|
||||
|
||||
if not source_file:
|
||||
results.append({**entry, "status": "no_source", "current_hash": None})
|
||||
continue
|
||||
|
||||
full_path = os.path.join(repo_root, source_file)
|
||||
current_hash = compute_file_hash(full_path)
|
||||
|
||||
if current_hash is None:
|
||||
results.append({**entry, "status": "missing_source", "current_hash": None})
|
||||
elif not stored_hash:
|
||||
results.append({**entry, "status": "no_hash", "current_hash": current_hash})
|
||||
elif current_hash != stored_hash:
|
||||
results.append({**entry, "status": "stale", "current_hash": current_hash})
|
||||
else:
|
||||
results.append({**entry, "status": "fresh", "current_hash": current_hash})
|
||||
|
||||
return results
|
||||
|
||||
|
||||
def fix_hashes(index_path: str, repo_root: str = ".") -> int:
    """Add hashes to entries missing them. Returns count of fixed entries."""
    with open(index_path) as f:
        data = json.load(f)

    fixed = 0
    for entry in data.get("facts", []):
        if entry.get("source_hash"):
            continue
        source_file = entry.get("source_file")
        if not source_file:
            continue
        full_path = os.path.join(repo_root, source_file)
        h = compute_file_hash(full_path)
        if h:
            entry["source_hash"] = h
            fixed += 1

    with open(index_path, "w") as f:
        json.dump(data, f, indent=2)

    return fixed


def main():
    parser = argparse.ArgumentParser(description="Check knowledge store staleness")
    parser.add_argument("--index", required=True, help="Path to knowledge/index.json")
    parser.add_argument("--repo", default=".", help="Repo root for source file resolution")
    parser.add_argument("--json", action="store_true", help="Output as JSON")
    parser.add_argument("--fix", action="store_true", help="Add hashes to entries missing them")

    args = parser.parse_args()

    if args.fix:
        fixed = fix_hashes(args.index, args.repo)
        print(f"Fixed {fixed} entries with missing hashes.")
        return

    results = check_staleness(args.index, args.repo)

    if args.json:
        print(json.dumps(results, indent=2))
    else:
        stale = [r for r in results if r["status"] != "fresh"]
        fresh = [r for r in results if r["status"] == "fresh"]

        print("Knowledge Store Staleness Check")
        print(f"  Total entries: {len(results)}")
        print(f"  Fresh: {len(fresh)}")
        print(f"  Stale/Issues: {len(stale)}")
        print()

        if stale:
            print("Issues found:")
            for r in stale:
                status = r["status"]
                fact = r.get("fact", "?")[:60]
                source = r.get("source_file", "?")
                print(f"  [{status}] {source}: {fact}")
        else:
            print("All entries are fresh!")


if __name__ == "__main__":
    main()
@@ -1,551 +0,0 @@
#!/usr/bin/env python3
"""
Performance Bottleneck Finder — Identify slow tests, builds, and CI steps.

Analyzes:
1. Pytest output for slow tests
2. Build logs for slow steps
3. CI workflow durations
4. File system for large/slow artifacts

Usage:
    python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo
    python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo --json
    python3 scripts/perf_bottleneck_finder.py --repo /path/to/repo --report metrics/perf_report.md

Weekly cron:
    0 9 * * 1 cd /path/to/compounding-intelligence && python3 scripts/perf_bottleneck_finder.py --repo /path/to/target --report metrics/perf_report.md
"""

import argparse
import json
import os
import re
import subprocess
import sys
from collections import defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple


# ── Configuration ──────────────────────────────────────────────────

SLOW_TEST_THRESHOLD_S = 2.0         # Tests slower than this are flagged
SLOW_BUILD_STEP_THRESHOLD_S = 10.0
TOP_N_BOTTLENECKS = 10              # Report top N bottlenecks
PYTEST_DURATIONS_COUNT = 20         # Number of slow tests to collect
LOG_EXTENSIONS = {".log", ".txt"}


@dataclass
class Bottleneck:
    """A single performance bottleneck."""
    category: str                     # "test", "build", "ci", "artifact", "import"
    name: str                         # What's slow
    duration_s: float                 # How long it takes
    severity: str                     # "critical", "warning", "info"
    recommendation: str               # How to fix
    file_path: Optional[str] = None
    line_number: Optional[int] = None


@dataclass
class PerfReport:
    """Full performance report."""
    timestamp: str
    repo_path: str
    bottlenecks: List[Bottleneck] = field(default_factory=list)
    summary: Dict[str, Any] = field(default_factory=dict)
    test_stats: Dict[str, Any] = field(default_factory=dict)
    build_stats: Dict[str, Any] = field(default_factory=dict)
    ci_stats: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict:
        return asdict(self)


# ── Test Analysis ──────────────────────────────────────────────────

def find_slow_tests_pytest(repo_path: str) -> List[Bottleneck]:
    """Run pytest --durations and parse slow tests."""
    bottlenecks = []

    # Try to run pytest with durations
    try:
        subprocess.run(
            ["python3", "-m", "pytest", "--co", "-q", "--durations=0"],
            cwd=repo_path, capture_output=True, text=True, timeout=30
        )
        # If tests exist, try to get durations from last run
        durations_file = os.path.join(repo_path, ".pytest_cache", "v", "cache", "durations")
        if os.path.exists(durations_file):
            with open(durations_file) as f:
                for line in f:
                    parts = line.strip().split()
                    if len(parts) >= 2:
                        try:
                            duration = float(parts[0])
                            test_name = " ".join(parts[1:])
                            if duration > SLOW_TEST_THRESHOLD_S:
                                severity = "critical" if duration > 10 else "warning"
                                bottlenecks.append(Bottleneck(
                                    category="test",
                                    name=test_name,
                                    duration_s=duration,
                                    severity=severity,
                                    recommendation=f"Test takes {duration:.1f}s. Consider mocking slow I/O, using fixtures, or marking with @pytest.mark.slow."
                                ))
                        except ValueError:
                            continue
    except (subprocess.TimeoutExpired, FileNotFoundError):
        pass

    return bottlenecks


def find_slow_tests_by_scan(repo_path: str) -> List[Bottleneck]:
    """Scan test files for patterns that indicate slow tests."""
    bottlenecks = []
    test_patterns = [
        (r"time\.sleep\((\d+(?:\.\d+)?)\)", "Contains time.sleep() — consider using mock or async wait"),
        (r"subprocess\.run\(.*timeout=(\d+)", "Subprocess with timeout — may block test"),
        (r"requests\.(get|post|put|delete)\(", "Real HTTP call — mock with responses or httpretty"),
        (r"open\([^)]*['\"]w['\"]", "File I/O in test — use tmp_path fixture"),
    ]

    for root, dirs, files in os.walk(repo_path):
        # Skip hidden and cache dirs
        dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__', 'node_modules'))]

        for fname in files:
            if not (fname.startswith("test_") or fname.endswith("_test.py")):
                continue
            if not fname.endswith(".py"):
                continue

            fpath = os.path.join(root, fname)
            rel_path = os.path.relpath(fpath, repo_path)

            try:
                with open(fpath) as f:
                    lines = f.readlines()
            except (PermissionError, UnicodeDecodeError):
                continue

            for i, line in enumerate(lines):
                for pattern, recommendation in test_patterns:
                    match = re.search(pattern, line)
                    if match:
                        if "sleep" in pattern:
                            try:
                                duration = float(match.group(1))
                            except (ValueError, IndexError):
                                duration = 1.0
                        elif "timeout" in pattern:
                            try:
                                duration = float(match.group(1))
                            except (ValueError, IndexError):
                                duration = 10.0
                        else:
                            duration = 2.0  # Estimated

                        bottlenecks.append(Bottleneck(
                            category="test",
                            name=f"{rel_path}:{i+1}",
                            duration_s=duration,
                            severity="warning" if duration < 5 else "critical",
                            recommendation=recommendation,
                            file_path=rel_path,
                            line_number=i + 1
                        ))

    return bottlenecks


# ── Build Analysis ─────────────────────────────────────────────────

def analyze_build_artifacts(repo_path: str) -> List[Bottleneck]:
    """Find large build artifacts that slow down builds."""
    bottlenecks = []
    large_dirs = {
        "node_modules": "Consider using npm ci --production or yarn --production",
        "__pycache__": "Consider .gitignore and cleaning before builds",
        ".tox": "Consider caching tox environments",
        ".pytest_cache": "Consider cleaning between CI runs",
        "dist": "Check if dist/ artifacts are being rebuilt unnecessarily",
        "build": "Check if build/ artifacts are being rebuilt unnecessarily",
        ".next": "Next.js cache — consider incremental builds",
        "venv": "Virtual env in repo — move outside or use Docker",
    }

    for dirname, recommendation in large_dirs.items():
        dirpath = os.path.join(repo_path, dirname)
        if os.path.isdir(dirpath):
            total_size = 0
            file_count = 0
            for root, dirs, files in os.walk(dirpath):
                for f in files:
                    try:
                        fpath = os.path.join(root, f)
                        total_size += os.path.getsize(fpath)
                        file_count += 1
                    except OSError:
                        pass

            if total_size > 10 * 1024 * 1024:  # > 10MB
                size_mb = total_size / (1024 * 1024)
                bottlenecks.append(Bottleneck(
                    category="build",
                    name=f"{dirname}/ ({size_mb:.1f}MB, {file_count} files)",
                    duration_s=size_mb * 0.5,  # Rough estimate
                    severity="critical" if size_mb > 100 else "warning",
                    recommendation=recommendation
                ))

    return bottlenecks


def analyze_makefile_targets(repo_path: str) -> List[Bottleneck]:
    """Analyze Makefile for potentially slow targets."""
    bottlenecks = []
    makefiles = []

    for root, dirs, files in os.walk(repo_path):
        dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__'))]
        for f in files:
            if f in ("Makefile", "makefile", "GNUmakefile"):
                makefiles.append(os.path.join(root, f))

    slow_patterns = [
        (r"pip install", "pip install without --no-deps or constraints"),
        (r"npm install(?!.*--production)", "npm install without --production flag"),
        (r"docker build", "Docker build — consider multi-stage and layer caching"),
        (r"pytest(?!.*-x|--maxfail)", "pytest without early exit on failure"),
        (r"mypy", "Type checking — consider incremental mode"),
    ]

    for mfile in makefiles:
        rel_path = os.path.relpath(mfile, repo_path)
        try:
            with open(mfile) as f:
                content = f.read()
        except (PermissionError, UnicodeDecodeError):
            continue

        for pattern, recommendation in slow_patterns:
            if re.search(pattern, content):
                bottlenecks.append(Bottleneck(
                    category="build",
                    name=f"{rel_path}: {pattern}",
                    duration_s=5.0,
                    severity="info",
                    recommendation=recommendation,
                    file_path=rel_path
                ))

    return bottlenecks


# ── CI Analysis ────────────────────────────────────────────────────

def analyze_github_actions(repo_path: str) -> List[Bottleneck]:
    """Analyze GitHub Actions workflow files for inefficiencies."""
    bottlenecks = []
    workflow_dir = os.path.join(repo_path, ".github", "workflows")

    if not os.path.isdir(workflow_dir):
        return bottlenecks

    slow_patterns = [
        (r"runs-on:\s*ubuntu-latest", 0, "Consider caching dependencies between runs"),
        (r"npm install", 2, "Use npm ci instead of npm install for reproducible builds"),
        (r"pip install(?!.*--cache-dir)", 2, "Add --cache-dir or use pip cache action"),
        (r"docker build(?!.*--cache-from)", 5, "Use Docker layer caching"),
        (r"python -m pytest(?!.*-n|--numprocesses)", 3, "Consider pytest-xdist for parallel test execution"),
    ]

    for fname in os.listdir(workflow_dir):
        if not fname.endswith(('.yml', '.yaml')):
            continue

        fpath = os.path.join(workflow_dir, fname)
        try:
            with open(fpath) as f:
                content = f.read()
        except (PermissionError, UnicodeDecodeError):
            continue

        for pattern, est_savings, recommendation in slow_patterns:
            if re.search(pattern, content):
                bottlenecks.append(Bottleneck(
                    category="ci",
                    name=f"{fname}: {pattern}",
                    duration_s=est_savings,
                    severity="info",
                    recommendation=recommendation,
                    file_path=f".github/workflows/{fname}"
                ))

    return bottlenecks


def analyze_gitea_ci(repo_path: str) -> List[Bottleneck]:
    """Analyze Gitea/Drone CI config files."""
    bottlenecks = []
    ci_files = [".gitea/workflows", ".drone.yml", ".woodpecker.yml"]

    def check_content(content: str, display_path: str) -> None:
        if "pip install" in content and "--cache-dir" not in content:
            bottlenecks.append(Bottleneck(
                category="ci",
                name=f"{display_path}: pip without cache",
                duration_s=5.0,
                severity="warning",
                recommendation="Add --cache-dir or mount pip cache volume",
                file_path=display_path
            ))

    for ci_path in ci_files:
        full_path = os.path.join(repo_path, ci_path)
        if os.path.isfile(full_path):
            try:
                with open(full_path) as f:
                    content = f.read()
            except (PermissionError, UnicodeDecodeError):
                continue
            check_content(content, ci_path)

        elif os.path.isdir(full_path):
            for fname in os.listdir(full_path):
                if not fname.endswith(('.yml', '.yaml')):
                    continue
                fpath = os.path.join(full_path, fname)
                try:
                    with open(fpath) as f:
                        content = f.read()
                except (PermissionError, UnicodeDecodeError):
                    continue
                check_content(content, f"{ci_path}/{fname}")

    return bottlenecks


# ── Import Analysis ────────────────────────────────────────────────

def find_slow_imports(repo_path: str) -> List[Bottleneck]:
    """Find Python files with heavy import chains."""
    bottlenecks = []
    heavy_imports = {
        "pandas": 0.5,
        "numpy": 0.3,
        "torch": 2.0,
        "tensorflow": 3.0,
        "scipy": 0.5,
        "matplotlib": 0.8,
        "sklearn": 0.5,
        "transformers": 1.5,
    }

    for root, dirs, files in os.walk(repo_path):
        dirs[:] = [d for d in dirs if not d.startswith(('.', '__pycache__', 'node_modules'))]
        for fname in files:
            if not fname.endswith(".py"):
                continue

            fpath = os.path.join(root, fname)
            rel_path = os.path.relpath(fpath, repo_path)

            try:
                with open(fpath) as f:
                    lines = f.readlines()
            except (PermissionError, UnicodeDecodeError):
                continue

            for i, line in enumerate(lines):
                stripped = line.strip()
                if stripped.startswith("import ") or stripped.startswith("from "):
                    for heavy, est_time in heavy_imports.items():
                        if heavy in stripped:
                            bottlenecks.append(Bottleneck(
                                category="import",
                                name=f"{rel_path}:{i+1}: import {heavy}",
                                duration_s=est_time,
                                severity="info" if est_time < 1.0 else "warning",
                                recommendation=f"Heavy import ({heavy} ~{est_time}s). Consider lazy import or conditional import.",
                                file_path=rel_path,
                                line_number=i + 1
                            ))

    return bottlenecks


# ── Report Generation ──────────────────────────────────────────────

def severity_sort_key(b: Bottleneck) -> Tuple[int, float]:
    """Sort by severity then duration."""
    sev_order = {"critical": 0, "warning": 1, "info": 2}
    return (sev_order.get(b.severity, 3), -b.duration_s)


def generate_report(repo_path: str) -> PerfReport:
    """Run all analyses and generate a performance report."""
    report = PerfReport(
        timestamp=datetime.now(timezone.utc).isoformat(),
        repo_path=os.path.abspath(repo_path)
    )

    # Collect all bottlenecks
    all_bottlenecks = []

    print("Scanning for slow tests (pytest cache)...")
    all_bottlenecks.extend(find_slow_tests_pytest(repo_path))

    print("Scanning for slow test patterns...")
    all_bottlenecks.extend(find_slow_tests_by_scan(repo_path))

    print("Analyzing build artifacts...")
    all_bottlenecks.extend(analyze_build_artifacts(repo_path))

    print("Analyzing Makefiles...")
    all_bottlenecks.extend(analyze_makefile_targets(repo_path))

    print("Analyzing CI workflows...")
    all_bottlenecks.extend(analyze_github_actions(repo_path))
    all_bottlenecks.extend(analyze_gitea_ci(repo_path))

    print("Scanning for heavy imports...")
    all_bottlenecks.extend(find_slow_imports(repo_path))

    # Sort by severity and duration
    all_bottlenecks.sort(key=severity_sort_key)
    report.bottlenecks = all_bottlenecks[:TOP_N_BOTTLENECKS * 2]  # Keep more for stats

    # Compute summary
    by_category = defaultdict(list)
    for b in all_bottlenecks:
        by_category[b.category].append(b)

    report.summary = {
        "total_bottlenecks": len(all_bottlenecks),
        "critical": sum(1 for b in all_bottlenecks if b.severity == "critical"),
        "warning": sum(1 for b in all_bottlenecks if b.severity == "warning"),
        "info": sum(1 for b in all_bottlenecks if b.severity == "info"),
        "estimated_total_slowdown_s": sum(b.duration_s for b in all_bottlenecks),
        "by_category": {cat: len(items) for cat, items in by_category.items()},
    }

    report.test_stats = {
        "slow_tests": len(by_category.get("test", [])),
        "total_estimated_s": sum(b.duration_s for b in by_category.get("test", [])),
    }

    report.build_stats = {
        "build_issues": len(by_category.get("build", [])),
        "total_estimated_s": sum(b.duration_s for b in by_category.get("build", [])),
    }

    report.ci_stats = {
        "ci_issues": len(by_category.get("ci", [])),
        "total_estimated_s": sum(b.duration_s for b in by_category.get("ci", [])),
    }

    return report


def format_markdown(report: PerfReport) -> str:
    """Format report as markdown."""
    lines = []
    lines.append("# Performance Bottleneck Report")
    lines.append("")
    lines.append(f"Generated: {report.timestamp}")
    lines.append(f"Repository: {report.repo_path}")
    lines.append("")

    # Summary
    s = report.summary
    lines.append("## Summary")
    lines.append("")
    lines.append(f"- **Total bottlenecks:** {s['total_bottlenecks']}")
    lines.append(f"- **Critical:** {s['critical']} | **Warning:** {s['warning']} | **Info:** {s['info']}")
    lines.append(f"- **Estimated total slowdown:** {s['estimated_total_slowdown_s']:.1f}s")
    lines.append(f"- **By category:** {', '.join(f'{k}: {v}' for k, v in s['by_category'].items())}")
    lines.append("")

    # Top bottlenecks
    lines.append(f"## Top {min(TOP_N_BOTTLENECKS, len(report.bottlenecks))} Bottlenecks")
    lines.append("")

    for i, b in enumerate(report.bottlenecks[:TOP_N_BOTTLENECKS], 1):
        icon = {"critical": "🔴", "warning": "🟡", "info": "🔵"}.get(b.severity, "⚪")
        loc = f" ({b.file_path}:{b.line_number})" if b.file_path else ""
        lines.append(f"{i}. {icon} **{b.category}** — {b.name}{loc}")
        lines.append(f"   - Duration: ~{b.duration_s:.1f}s | Severity: {b.severity}")
        lines.append(f"   - Fix: {b.recommendation}")
        lines.append("")

    # Category breakdowns
    for cat in ["test", "build", "ci", "import"]:
        items = [b for b in report.bottlenecks if b.category == cat]
        if items:
            lines.append(f"## {cat.title()} Bottlenecks")
            lines.append("")
            for b in items:
                icon = {"critical": "🔴", "warning": "🟡", "info": "🔵"}.get(b.severity, "⚪")
                loc = f" ({b.file_path}:{b.line_number})" if b.file_path else ""
                lines.append(f"- {icon} {b.name}{loc} — ~{b.duration_s:.1f}s — {b.recommendation}")
            lines.append("")

    return "\n".join(lines)


# ── Main ───────────────────────────────────────────────────────────

def main():
    global SLOW_TEST_THRESHOLD_S

    parser = argparse.ArgumentParser(description="Performance Bottleneck Finder")
    parser.add_argument("--repo", default=".", help="Path to repository to analyze")
    parser.add_argument("--json", action="store_true", help="Output as JSON")
    parser.add_argument("--report", help="Write markdown report to file")
    parser.add_argument("--threshold", type=float, default=SLOW_TEST_THRESHOLD_S,
                        help="Slow test threshold in seconds")
    args = parser.parse_args()

    # Apply the threshold override to the module-level default the test
    # scanners read from module scope
    SLOW_TEST_THRESHOLD_S = args.threshold

    if not os.path.isdir(args.repo):
        print(f"Error: {args.repo} is not a directory", file=sys.stderr)
        sys.exit(1)

    report = generate_report(args.repo)

    if args.json:
        print(json.dumps(report.to_dict(), indent=2))
    else:
        md = format_markdown(report)
        if args.report:
            os.makedirs(os.path.dirname(args.report) or ".", exist_ok=True)
            with open(args.report, "w") as f:
                f.write(md)
            print(f"Report written to {args.report}")
        else:
            print(md)

    # Exit code: 1 if critical bottlenecks found
    if report.summary.get("critical", 0) > 0:
        sys.exit(1)


if __name__ == "__main__":
    main()
@@ -1,682 +0,0 @@
#!/usr/bin/env python3
"""
Priority Rebalancer — Re-evaluate issue priorities based on accumulated data.

Reads pipeline outputs, knowledge store, and Gitea issues to suggest
priority changes based on what the fleet has learned.

Usage:
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation --repo compounding-intelligence
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation --dry-run
    python3 scripts/priority_rebalancer.py --org Timmy_Foundation --apply

Output:
    metrics/priority_report.json — full analysis
    metrics/priority_suggestions.md — human-readable suggestions
"""

import argparse
import json
import os
import sys
import urllib.error
import urllib.request
from collections import Counter, defaultdict
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone, timedelta
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple


# ============================================================
# Data Models
# ============================================================

@dataclass
class IssueScore:
    issue_id: int
    repo: str
    title: str
    current_labels: List[str]
    current_priority: Optional[str]
    suggested_priority: Optional[str]
    score: float
    reasons: List[str]
    age_days: int
    comment_count: int
    assignee: Optional[str]
    dependencies: List[str] = field(default_factory=list)
    blocking: List[str] = field(default_factory=list)


@dataclass
class PipelineSignal:
    source: str                     # "knowledge", "metrics", "sessions", "staleness"
    signal_type: str                # "stale_knowledge", "high_error_rate", "missing_coverage", etc.
    weight: float                   # 0.0 - 1.0
    detail: str
    affected_repos: List[str] = field(default_factory=list)
    affected_issues: List[int] = field(default_factory=list)


# ============================================================
# Gitea API Client
# ============================================================

class GiteaClient:
    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        self.token = token

    def _request(self, path: str, params: Optional[Dict] = None) -> Any:
        url = f"{self.base_url}/api/v1{path}"
        if params:
            qs = "&".join(f"{k}={v}" for k, v in params.items() if v is not None)
            url += f"?{qs}"

        req = urllib.request.Request(url)
        req.add_header("Authorization", f"token {self.token}")
        req.add_header("Content-Type", "application/json")

        try:
            with urllib.request.urlopen(req, timeout=30) as resp:
                return json.loads(resp.read().decode())
        except urllib.error.HTTPError as e:
            print(f"API error {e.code} for {path}: {e.read().decode()[:200]}", file=sys.stderr)
            return None

    def get_org_repos(self, org: str) -> List[Dict]:
        repos = []
        page = 1
        while True:
            batch = self._request(f"/orgs/{org}/repos", {"limit": 50, "page": page})
            if not batch:
                break
            repos.extend(batch)
            if len(batch) < 50:
                break
            page += 1
        return repos

    def get_issues(self, org: str, repo: str, state: str = "open") -> List[Dict]:
        issues = []
        page = 1
        while True:
            batch = self._request(f"/repos/{org}/{repo}/issues",
                                  {"state": state, "limit": 50, "page": page, "type": "issues"})
            if not batch:
                break
            issues.extend(batch)
            if len(batch) < 50:
                break
            page += 1
        return issues

    def add_label_to_issue(self, org: str, repo: str, issue_num: int, label_ids: List[int]) -> bool:
        url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/labels"
        data = json.dumps({"labels": label_ids}).encode()
        req = urllib.request.Request(url, data=data, method="POST")
        req.add_header("Authorization", f"token {self.token}")
        req.add_header("Content-Type", "application/json")
        try:
            with urllib.request.urlopen(req, timeout=15) as resp:
                return resp.status == 200
        except Exception:
            return False

    def remove_label_from_issue(self, org: str, repo: str, issue_num: int, label_id: int) -> bool:
        url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/labels/{label_id}"
        req = urllib.request.Request(url, method="DELETE")
        req.add_header("Authorization", f"token {self.token}")
        try:
            with urllib.request.urlopen(req, timeout=15) as resp:
                return resp.status == 200
        except Exception:
            return False

    def get_repo_labels(self, org: str, repo: str) -> List[Dict]:
        labels = []
        page = 1
        while True:
            batch = self._request(f"/repos/{org}/{repo}/labels", {"limit": 50, "page": page})
            if not batch:
                break
            labels.extend(batch)
            if len(batch) < 50:
                break
            page += 1
        return labels

    def add_comment(self, org: str, repo: str, issue_num: int, body: str) -> bool:
        url = f"{self.base_url}/api/v1/repos/{org}/{repo}/issues/{issue_num}/comments"
        data = json.dumps({"body": body}).encode()
        req = urllib.request.Request(url, data=data, method="POST")
        req.add_header("Authorization", f"token {self.token}")
        req.add_header("Content-Type", "application/json")
        try:
            with urllib.request.urlopen(req, timeout=15) as resp:
                return resp.status == 201
        except Exception:
            return False


# ============================================================
# Pipeline Signal Collectors
# ============================================================

def collect_knowledge_signals(knowledge_dir: str) -> List[PipelineSignal]:
    """Analyze knowledge store for coverage gaps and staleness."""
    signals = []
    index_path = os.path.join(knowledge_dir, "index.json")

    if not os.path.exists(index_path):
        signals.append(PipelineSignal(
            source="knowledge",
            signal_type="missing_index",
            weight=0.8,
            detail="knowledge/index.json not found — no knowledge base exists"
        ))
        return signals

    try:
        with open(index_path) as f:
            data = json.load(f)
    except (json.JSONDecodeError, IOError) as e:
        signals.append(PipelineSignal(
            source="knowledge",
            signal_type="corrupt_index",
            weight=0.9,
            detail=f"knowledge/index.json is corrupt: {e}"
        ))
        return signals

    facts = data.get("facts", [])
    total = len(facts)

    if total == 0:
        signals.append(PipelineSignal(
            source="knowledge",
            signal_type="empty_knowledge",
            weight=0.7,
            detail="Knowledge store has 0 facts — harvester not running or not finding sessions"
        ))
        return signals

    # Check staleness
    stale_count = 0
    missing_source = 0
    for fact in facts:
        status = fact.get("status", "unknown")
        if status == "stale":
            stale_count += 1
        elif status in ("missing_source", "no_source"):
            missing_source += 1

    if stale_count > 0:
        signals.append(PipelineSignal(
            source="knowledge",
            signal_type="stale_knowledge",
            weight=min(1.0, stale_count / max(1, total)),
            detail=f"{stale_count}/{total} facts are stale (source files changed)"
        ))

    if missing_source > 0:
        signals.append(PipelineSignal(
            source="knowledge",
            signal_type="missing_sources",
            weight=min(1.0, missing_source / max(1, total)),
            detail=f"{missing_source}/{total} facts have missing source files"
        ))

    # Coverage by repo
    repo_counts = Counter(f.get("repo", "unknown") for f in facts)
    if len(repo_counts) < 3:
        signals.append(PipelineSignal(
            source="knowledge",
            signal_type="low_coverage",
            weight=0.5,
            detail=f"Knowledge covers only {len(repo_counts)} repos — expand harvester scope",
            affected_repos=list(repo_counts.keys())
        ))

    return signals


def collect_staleness_signals(scripts_dir: str, knowledge_dir: str) -> List[PipelineSignal]:
    """Run staleness checker if available."""
    signals = []
    checker = os.path.join(scripts_dir, "knowledge_staleness_check.py")
    index_path = os.path.join(knowledge_dir, "index.json")

    if not os.path.exists(checker) or not os.path.exists(index_path):
        return signals

    try:
        import subprocess
        result = subprocess.run(
            ["python3", checker, "--index", index_path, "--json"],
            capture_output=True, text=True, timeout=30
        )
        if result.returncode == 0:
            data = json.loads(result.stdout)
            stale = data.get("stale_count", 0)
            total = data.get("total", 0)
            if stale > 0:
                signals.append(PipelineSignal(
                    source="staleness",
                    signal_type="stale_knowledge",
                    weight=min(1.0, stale / max(1, total)),
                    detail=f"Staleness checker found {stale}/{total} stale entries"
                ))
    except Exception:
        pass

    return signals


def collect_metrics_signals(metrics_dir: str) -> List[PipelineSignal]:
    """Analyze metrics directory for pipeline health."""
    signals = []

    if not os.path.isdir(metrics_dir):
        return signals

    files = os.listdir(metrics_dir)
    if len(files) <= 1:  # Only .gitkeep
        signals.append(PipelineSignal(
            source="metrics",
            signal_type="no_metrics",
            weight=0.6,
            detail="Metrics directory is empty — measurer pipeline not producing output"
        ))

    return signals


# ============================================================
|
||||
# Priority Scoring Engine
|
||||
# ============================================================
|
||||
|
||||
PRIORITY_LEVELS = ["P0", "P1", "P2", "P3", "none"]
|
||||
PRIORITY_LABELS = {"p0": "P0", "p1": "P1", "p2": "P2", "p3": "P3",
|
||||
"priority:critical": "P0", "priority:high": "P1",
|
||||
"priority:medium": "P2", "priority:low": "P3"}
|
||||
|
||||
|
||||
def extract_priority(labels: List[str]) -> Optional[str]:
|
||||
"""Extract priority level from issue labels."""
|
||||
for label in labels:
|
||||
lower = label.lower().strip()
|
||||
if lower in PRIORITY_LABELS:
|
||||
return PRIORITY_LABELS[lower]
|
||||
return None
|
||||
|
||||
|
||||
def compute_issue_score(
|
||||
issue: Dict,
|
||||
repo: str,
|
||||
signals: List[PipelineSignal],
|
||||
now: datetime
|
||||
) -> IssueScore:
|
||||
"""Compute priority score for a single issue."""
|
||||
|
||||
labels = [l.get("name", "") if isinstance(l, dict) else l for l in issue.get("labels", [])]
|
||||
title = issue.get("title", "")
|
||||
issue_id = issue.get("number", 0)
|
||||
current_priority = extract_priority(labels)
|
||||
|
||||
# Parse dates
|
||||
created_str = issue.get("created_at", "")
|
||||
try:
|
||||
created = datetime.fromisoformat(created_str.replace("Z", "+00:00"))
|
||||
age_days = (now - created.replace(tzinfo=None)).days
|
||||
except (ValueError, AttributeError):
|
||||
age_days = 0
|
||||
|
||||
comment_count = issue.get("comments", 0)
|
||||
assignee = None
|
||||
assignees = issue.get("assignees") or []
|
||||
if assignees:
|
||||
assignee = assignees[0].get("login") if isinstance(assignees[0], dict) else str(assignees[0])
|
||||
|
||||
# Base score
|
||||
score = 0.0
|
||||
reasons = []
|
||||
|
||||
# Age factor: older issues drift down unless actively discussed
|
||||
if age_days > 90 and comment_count < 2:
|
||||
score -= 15
|
||||
reasons.append(f"Dormant: {age_days} days old with only {comment_count} comments")
|
||||
elif age_days > 30:
|
||||
score -= 5
|
||||
reasons.append(f"Aging: {age_days} days old")
|
||||
|
||||
# Activity factor: recent discussion suggests urgency
|
||||
if comment_count > 5:
|
||||
score += 10
|
||||
reasons.append(f"Active discussion: {comment_count} comments")
|
||||
elif comment_count > 0:
|
||||
score += 3
|
||||
|
||||
# Assignment factor: unassigned issues need triage
|
||||
if not assignee:
|
||||
score += 5
|
||||
reasons.append("Unassigned — needs triage")
|
||||
|
||||
# Pipeline signal alignment
|
||||
for signal in signals:
|
||||
title_lower = title.lower()
|
||||
if signal.signal_type == "stale_knowledge" and "stale" in title_lower:
|
||||
score += signal.weight * 20
|
||||
reasons.append(f"Matches signal: {signal.detail}")
|
||||
elif signal.signal_type == "empty_knowledge" and ("harvester" in title_lower or "knowledge" in title_lower):
|
||||
score += signal.weight * 25
|
||||
reasons.append(f"Critical gap: {signal.detail}")
|
||||
elif signal.signal_type == "no_metrics" and "measur" in title_lower:
|
||||
score += signal.weight * 15
|
||||
reasons.append(f"Pipeline gap: {signal.detail}")
|
||||
elif signal.signal_type == "low_coverage" and any(r.lower() in title_lower for r in signal.affected_repos):
|
||||
score += signal.weight * 10
|
||||
reasons.append(f"Coverage gap: {signal.detail}")
|
||||
|
||||
# Keyword boosts
|
||||
keyword_scores = {
|
||||
"broken": 20, "bug": 15, "fix": 10, "error": 12, "fail": 15,
|
||||
"security": 25, "auth": 20, "data loss": 30, "crash": 25,
|
||||
"blocker": 20, "urgent": 15, "critical": 15,
|
||||
"epic": 8, "feature": -3, "nice to have": -10, "someday": -15
|
||||
}
|
||||
title_lower = title.lower()
|
||||
for keyword, boost in keyword_scores.items():
|
||||
if keyword in title_lower:
|
||||
score += boost
|
||||
if boost > 0:
|
||||
reasons.append(f"Keyword match: '{keyword}' (+{boost})")
|
||||
|
||||
# Label-based adjustments
|
||||
for label in labels:
|
||||
lower = label.lower()
|
||||
if lower == "pipeline":
|
||||
score += 5 # Pipeline issues are infrastructure
|
||||
elif lower == "bug":
|
||||
score += 12
|
||||
elif lower == "enhancement":
|
||||
score -= 2
|
||||
elif lower == "documentation":
|
||||
score -= 5
|
||||
elif "epic" in lower:
|
||||
score += 3
|
||||
|
||||
# Determine suggested priority
|
||||
if score >= 40:
|
||||
suggested = "P0"
|
||||
elif score >= 25:
|
||||
suggested = "P1"
|
||||
elif score >= 10:
|
||||
suggested = "P2"
|
||||
elif score >= 0:
|
||||
suggested = "P3"
|
||||
else:
|
||||
suggested = None # Consider closing or deprioritizing
|
||||
|
||||
return IssueScore(
|
||||
issue_id=issue_id,
|
||||
repo=repo,
|
||||
title=title,
|
||||
current_labels=labels,
|
||||
current_priority=current_priority,
|
||||
suggested_priority=suggested,
|
||||
score=round(score, 1),
|
||||
reasons=reasons if reasons else ["No strong signals"],
|
||||
age_days=age_days,
|
||||
comment_count=comment_count,
|
||||
assignee=assignee
|
||||
)
|
||||
|
||||
|
||||
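For reference, the additive score lands in one of four priority bands. A minimal standalone sketch of that band mapping, with a worked example (the helper name `to_priority` is ours, for illustration; it is not part of the script):

```python
def to_priority(score: float):
    """Map a rebalancer score onto the P0-P3 bands used by compute_issue_score."""
    if score >= 40:
        return "P0"
    elif score >= 25:
        return "P1"
    elif score >= 10:
        return "P2"
    elif score >= 0:
        return "P3"
    return None  # negative score: candidate for closing or deprioritizing


# e.g. an unassigned issue (+5) with a "bug" label (+12) and "fix" in the title (+10)
print(to_priority(5 + 12 + 10))  # → P1
```

Note how easily a handful of moderate factors clears the P1 threshold of 25; the P0 band at 40 effectively requires a high-weight keyword like "security" or "data loss", or a strong pipeline-signal match.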
# ============================================================
# Report Generation
# ============================================================

def generate_report(
    scores: List[IssueScore],
    signals: List[PipelineSignal],
    org: str,
    repos_scanned: List[str]
) -> Dict[str, Any]:
    """Generate the full priority report."""
    now = datetime.now(timezone.utc).isoformat()

    # Categorize changes
    upgrades = [s for s in scores if s.suggested_priority and s.current_priority and
                PRIORITY_LEVELS.index(s.suggested_priority) < PRIORITY_LEVELS.index(s.current_priority)]
    downgrades = [s for s in scores if s.suggested_priority and s.current_priority and
                  PRIORITY_LEVELS.index(s.suggested_priority) > PRIORITY_LEVELS.index(s.current_priority)]
    new_assignments = [s for s in scores if s.suggested_priority and not s.current_priority]
    no_change = [s for s in scores if s.suggested_priority == s.current_priority]

    return {
        "generated_at": now,
        "org": org,
        "repos_scanned": repos_scanned,
        "total_issues": len(scores),
        "signals": [asdict(s) for s in signals],
        "summary": {
            "suggested_upgrades": len(upgrades),
            "suggested_downgrades": len(downgrades),
            "suggested_new_priorities": len(new_assignments),
            "unchanged": len(no_change)
        },
        "top_priority": [asdict(s) for s in sorted(scores, key=lambda x: x.score, reverse=True)[:10]],
        "upgrades": [asdict(s) for s in upgrades],
        "downgrades": [asdict(s) for s in downgrades],
        "new_assignments": [asdict(s) for s in new_assignments],
        "all_scores": [asdict(s) for s in sorted(scores, key=lambda x: x.score, reverse=True)]
    }


def generate_markdown_report(report: Dict[str, Any]) -> str:
    """Generate human-readable markdown report."""
    lines = []
    lines.append("# Priority Rebalancer Report")
    lines.append(f"**Generated:** {report['generated_at']}")
    lines.append(f"**Org:** {report['org']}")
    lines.append(f"**Repos scanned:** {', '.join(report['repos_scanned'])}")
    lines.append(f"**Issues analyzed:** {report['total_issues']}")
    lines.append("")

    # Signals
    if report["signals"]:
        lines.append("## Pipeline Signals")
        for sig in report["signals"]:
            weight_bar = "█" * int(sig["weight"] * 10) + "░" * (10 - int(sig["weight"] * 10))
            lines.append(f"- [{weight_bar}] **{sig['source']}/{sig['signal_type']}** — {sig['detail']}")
        lines.append("")

    # Summary
    s = report["summary"]
    lines.append("## Summary")
    lines.append(f"- Suggested upgrades: **{s['suggested_upgrades']}**")
    lines.append(f"- Suggested downgrades: **{s['suggested_downgrades']}**")
    lines.append(f"- New priority assignments: **{s['suggested_new_priorities']}**")
    lines.append(f"- Unchanged: **{s['unchanged']}**")
    lines.append("")

    # Top 10
    lines.append("## Top 10 by Score")
    for i, item in enumerate(report["top_priority"][:10], 1):
        cur = item["current_priority"] or "none"
        sug = item["suggested_priority"] or "none"
        arrow = "↑" if PRIORITY_LEVELS.index(sug) < PRIORITY_LEVELS.index(cur) else "↓" if PRIORITY_LEVELS.index(sug) > PRIORITY_LEVELS.index(cur) else "→"
        lines.append(f"{i}. **[{item['repo']}#{item['issue_id']}]** {item['title']}")
        lines.append(f"   Score: {item['score']} | Current: {cur} {arrow} Suggested: {sug}")
        lines.append(f"   Reasons: {'; '.join(item['reasons'][:3])}")
    lines.append("")

    # Upgrades
    if report["upgrades"]:
        lines.append("## Suggested Upgrades")
        for item in report["upgrades"]:
            lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
            lines.append(f"  {item['current_priority']} → {item['suggested_priority']} (score: {item['score']})")
            lines.append(f"  {'; '.join(item['reasons'][:2])}")
        lines.append("")

    # Downgrades
    if report["downgrades"]:
        lines.append("## Suggested Downgrades")
        for item in report["downgrades"]:
            lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
            lines.append(f"  {item['current_priority']} → {item['suggested_priority']} (score: {item['score']})")
            lines.append(f"  {'; '.join(item['reasons'][:2])}")
        lines.append("")

    # New assignments
    if report["new_assignments"]:
        lines.append("## New Priority Suggestions (currently unlabelled)")
        for item in report["new_assignments"][:20]:
            lines.append(f"- **[{item['repo']}#{item['issue_id']}]** {item['title']}")
            lines.append(f"  Suggested: {item['suggested_priority']} (score: {item['score']})")
            lines.append(f"  {'; '.join(item['reasons'][:2])}")
        lines.append("")

    return "\n".join(lines)


# ============================================================
# Main
# ============================================================

def main():
    parser = argparse.ArgumentParser(description="Priority Rebalancer — re-score issues based on pipeline data")
    parser.add_argument("--org", default="Timmy_Foundation", help="Gitea org name")
    parser.add_argument("--repo", help="Single repo to analyze (default: all)")
    parser.add_argument("--base-url", default="https://forge.alexanderwhitestone.com", help="Gitea base URL")
    parser.add_argument("--knowledge-dir", default=None, help="Path to knowledge directory")
    parser.add_argument("--metrics-dir", default=None, help="Path to metrics directory")
    parser.add_argument("--scripts-dir", default=None, help="Path to scripts directory")
    parser.add_argument("--output-dir", default=None, help="Path to output directory")
    parser.add_argument("--dry-run", action="store_true", help="Show what would change without applying")
    parser.add_argument("--apply", action="store_true", help="Apply priority changes via API")
    parser.add_argument("--json", action="store_true", help="Output JSON instead of markdown")

    args = parser.parse_args()

    # Resolve paths relative to script location
    script_dir = Path(__file__).parent
    repo_root = script_dir.parent

    knowledge_dir = args.knowledge_dir or str(repo_root / "knowledge")
    metrics_dir = args.metrics_dir or str(repo_root / "metrics")
    scripts_dir = args.scripts_dir or str(repo_root / "scripts")
    output_dir = args.output_dir or str(repo_root / "metrics")

    # Get token
    token = os.environ.get("GITEA_TOKEN")
    if not token:
        token_path = os.path.expanduser("~/.config/gitea/token")
        if os.path.exists(token_path):
            with open(token_path) as f:
                token = f.read().strip()

    if not token:
        print("Error: No Gitea token found. Set GITEA_TOKEN or create ~/.config/gitea/token", file=sys.stderr)
        sys.exit(1)

    client = GiteaClient(args.base_url, token)
    # Naive UTC "now" — compute_issue_score compares against naive datetimes.
    # (datetime.utcnow() is deprecated since Python 3.12.)
    now = datetime.now(timezone.utc).replace(tzinfo=None)

    # Collect pipeline signals
    print("Collecting pipeline signals...", file=sys.stderr)
    signals = []
    signals.extend(collect_knowledge_signals(knowledge_dir))
    signals.extend(collect_staleness_signals(scripts_dir, knowledge_dir))
    signals.extend(collect_metrics_signals(metrics_dir))
    print(f"  Found {len(signals)} signals", file=sys.stderr)

    # Get repos
    if args.repo:
        repos = [{"name": args.repo}]
    else:
        repos = client.get_org_repos(args.org)

    repo_names = [r["name"] for r in repos]
    print(f"Scanning {len(repo_names)} repos: {', '.join(repo_names[:5])}{'...' if len(repo_names) > 5 else ''}", file=sys.stderr)

    # Score all issues
    all_scores = []
    for repo in repos:
        repo_name = repo["name"]
        issues = client.get_issues(args.org, repo_name)
        print(f"  {repo_name}: {len(issues)} open issues", file=sys.stderr)

        for issue in issues:
            if issue.get("pull_request"):
                continue
            score = compute_issue_score(issue, repo_name, signals, now)
            all_scores.append(score)

    # Generate report
    report = generate_report(all_scores, signals, args.org, repo_names)

    # Output
    os.makedirs(output_dir, exist_ok=True)

    if args.json:
        print(json.dumps(report, indent=2))
    else:
        md = generate_markdown_report(report)
        print(md)

    # Write files
    report_path = os.path.join(output_dir, "priority_report.json")
    with open(report_path, "w") as f:
        json.dump(report, f, indent=2)
    print(f"\nFull report: {report_path}", file=sys.stderr)

    md_path = os.path.join(output_dir, "priority_suggestions.md")
    with open(md_path, "w") as f:
        f.write(generate_markdown_report(report))
    print(f"Suggestions: {md_path}", file=sys.stderr)

    # Apply changes if requested
    if args.apply:
        print("\nApplying priority changes...", file=sys.stderr)
        applied = 0

        # Get label IDs for priority labels
        priority_label_map = {}
        for repo_name in repo_names:
            labels = client.get_repo_labels(args.org, repo_name)
            for label in labels:
                name = label.get("name", "").lower()
                if name in ("p0", "p1", "p2", "p3"):
                    priority_label_map[(repo_name, name)] = label["id"]

        for score in all_scores:
            if score.suggested_priority and score.suggested_priority != score.current_priority:
                sug_lower = score.suggested_priority.lower()
                label_id = priority_label_map.get((score.repo, sug_lower))
                if label_id:
                    ok = client.add_label_to_issue(args.org, score.repo, score.issue_id, [label_id])
                    if ok:
                        applied += 1
                        print(f"  Applied: [{score.repo}#{score.issue_id}] → {score.suggested_priority}", file=sys.stderr)

                        # Add comment explaining the change
                        comment = f"**Priority Rebalancer** suggested: **{score.suggested_priority}** (was: {score.current_priority or 'none'})\n\n"
                        comment += f"Score: {score.score}\n"
                        comment += "Reasons:\n"
                        for r in score.reasons[:5]:
                            comment += f"- {r}\n"
                        client.add_comment(args.org, score.repo, score.issue_id, comment)

        print(f"Applied {applied} priority changes", file=sys.stderr)

    elif args.dry_run:
        print(f"\nDry run — {report['summary']['suggested_upgrades'] + report['summary']['suggested_downgrades'] + report['summary']['suggested_new_priorities']} changes would be applied", file=sys.stderr)


if __name__ == "__main__":
    main()
@@ -1,290 +0,0 @@
#!/usr/bin/env python3
"""
Finds refactoring opportunities in codebases.

Engine ID: 10.4

Usage:
    python3 scripts/refactoring_opportunity_finder.py --output proposals/refactoring_opportunity_finder.json
    python3 scripts/refactoring_opportunity_finder.py --output proposals/refactoring_opportunity_finder.json --dry-run
"""

import argparse
import ast
import json
import os
import sys
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Tuple


# ── Data Classes ────────────────────────────────────────────────────────

@dataclass
class FileMetrics:
    """Metrics for a single source file."""
    path: str
    lines: int = 0
    complexity: float = 0.0
    max_complexity: int = 0
    functions: int = 0
    classes: int = 0
    churn_30d: int = 0
    churn_90d: int = 0
    test_coverage: Optional[float] = None
    refactoring_score: float = 0.0


# ── Complexity Analysis ─────────────────────────────────────────────────

class ComplexityVisitor(ast.NodeVisitor):
    """AST visitor that computes cyclomatic complexity per function."""

    def __init__(self):
        self.complexities = []
        self.function_count = 0
        self.class_count = 0
        self._current_complexity = 0
        self._in_function = False

    def visit_FunctionDef(self, node):
        self.function_count += 1
        old_complexity = self._current_complexity
        old_in_function = self._in_function
        self._current_complexity = 1  # Base complexity
        self._in_function = True

        self.generic_visit(node)

        self.complexities.append(self._current_complexity)
        self._current_complexity = old_complexity
        self._in_function = old_in_function

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_ClassDef(self, node):
        self.class_count += 1
        self.generic_visit(node)

    def visit_If(self, node):
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)

    def visit_For(self, node):
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)

    visit_AsyncFor = visit_For

    def visit_While(self, node):
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)

    def visit_ExceptHandler(self, node):
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)

    def visit_With(self, node):
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)

    visit_AsyncWith = visit_With

    def visit_Assert(self, node):
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)

    def visit_BoolOp(self, node):
        # Each 'and'/'or' adds a branch
        if self._in_function:
            self._current_complexity += len(node.values) - 1
        self.generic_visit(node)

    def visit_IfExp(self, node):
        # Ternary expression
        if self._in_function:
            self._current_complexity += 1
        self.generic_visit(node)


def compute_file_complexity(filepath: str) -> Tuple[float, int, int, int, int]:
    """
    Compute cyclomatic complexity for a Python file.

    Returns:
        (avg_complexity, max_complexity, function_count, class_count, line_count)
    """
    try:
        with open(filepath, 'r', encoding='utf-8', errors='replace') as f:
            source = f.read()
    except (IOError, OSError):
        return 0.0, 0, 0, 0, 0

    try:
        tree = ast.parse(source, filename=filepath)
    except SyntaxError:
        return 0.0, 0, 0, 0, 0

    visitor = ComplexityVisitor()
    visitor.visit(tree)

    line_count = len(source.splitlines())

    if not visitor.complexities:
        # No functions, but might have classes
        return 0.0, 0, visitor.function_count, visitor.class_count, line_count

    avg = sum(visitor.complexities) / len(visitor.complexities)
    max_c = max(visitor.complexities)

    return avg, max_c, visitor.function_count, visitor.class_count, line_count


# ── Refactoring Score ───────────────────────────────────────────────────

def calculate_refactoring_score(metrics: FileMetrics) -> float:
    """
    Calculate a refactoring priority score (0-100) based on file metrics.

    Higher score = higher priority for refactoring.
    Components:
    - Complexity (0-30 points): higher avg/max complexity = higher score
    - Size (0-20 points): larger files = higher score
    - Churn (0-30 points): more changes recently = higher score
    - Coverage (0-20 points): lower test coverage = higher score
    """
    score = 0.0

    # Complexity component (0-30)
    # avg=10+ or max=20+ → 30 points
    complexity_score = min(30.0, (metrics.complexity * 2) + (metrics.max_complexity * 0.5))
    score += max(0.0, complexity_score)

    # Size component (0-20)
    # 500+ lines → 20 points
    size_score = min(20.0, metrics.lines / 25.0)
    score += max(0.0, size_score)

    # Churn component (0-30)
    # Weighted: recent churn (30d) counts more than older (90d)
    churn_score = min(30.0, (metrics.churn_30d * 2) + (metrics.churn_90d * 0.5))
    score += max(0.0, churn_score)

    # Coverage component (0-20)
    # Lower coverage → higher score
    if metrics.test_coverage is not None:
        # coverage=0 → 20 points, coverage=1 → 0 points
        coverage_score = (1.0 - metrics.test_coverage) * 20.0
    else:
        # No data → assume medium risk (10 points)
        coverage_score = 10.0
    score += max(0.0, coverage_score)

    return min(100.0, max(0.0, score))

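To make the component arithmetic concrete, here is a worked example for a hypothetical file (all numbers invented for illustration): average complexity 6, max complexity 18, 450 lines, 12 commits in the last 30 days, 3 more in the last 90, and no coverage data:

```python
# Each component mirrors one clause of calculate_refactoring_score.
complexity = min(30.0, 6 * 2 + 18 * 0.5)   # 21.0 of 30
size = min(20.0, 450 / 25.0)               # 18.0 of 20
churn = min(30.0, 12 * 2 + 3 * 0.5)        # 25.5 of 30
coverage = 10.0                            # no coverage data -> assume medium risk
total = min(100.0, complexity + size + churn + coverage)
print(total)  # 74.5
```

At 74.5 this hypothetical file sails past the default `--min-score` of 30, driven mostly by churn and complexity; with full test coverage the same file would score 64.5.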
# ── Proposal Generation ─────────────────────────────────────────────────

def scan_directory(directory: str, extensions: tuple = ('.py',)) -> list:
    """Scan directory for source files."""
    files = []
    for root, dirs, filenames in os.walk(directory):
        # Skip hidden dirs and common non-source dirs
        dirs[:] = [d for d in dirs if not d.startswith('.') and d not in (
            '__pycache__', 'node_modules', 'venv', '.venv', 'env',
            'build', 'dist', '.git', '.tox'
        )]
        for fname in filenames:
            if any(fname.endswith(ext) for ext in extensions):
                files.append(os.path.join(root, fname))
    return files


def generate_proposals(directory: str = '.', min_score: float = 30.0) -> list:
    """Generate refactoring proposals by analyzing source files."""
    proposals = []
    files = scan_directory(directory)

    for filepath in files:
        avg, max_c, funcs, classes, lines = compute_file_complexity(filepath)

        if funcs == 0 and classes == 0:
            continue

        metrics = FileMetrics(
            path=filepath,
            lines=lines,
            complexity=avg,
            max_complexity=max_c,
            functions=funcs,
            classes=classes
        )
        score = calculate_refactoring_score(metrics)
        metrics.refactoring_score = score

        if score >= min_score:
            reasons = []
            if max_c > 10:
                reasons.append(f"high max complexity ({max_c})")
            if avg > 5:
                reasons.append(f"high avg complexity ({avg:.1f})")
            if lines > 300:
                reasons.append(f"large file ({lines} lines)")

            proposals.append({
                "title": f"Refactor {os.path.basename(filepath)} (score: {score:.0f})",
                "description": f"{filepath}: {', '.join(reasons) if reasons else 'general improvement candidate'}",
                "impact": min(10, int(score / 10)),
                "effort": min(10, max(1, int(max_c / 3))),
                "category": "refactoring",
                "source_engine": "10.4",
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "metrics": {
                    "path": filepath,
                    "score": round(score, 2),
                    "avg_complexity": round(avg, 2),
                    "max_complexity": max_c,
                    "lines": lines,
                    "functions": funcs,
                    "classes": classes
                }
            })

    # Sort by score descending
    proposals.sort(key=lambda p: p.get('metrics', {}).get('score', 0), reverse=True)
    return proposals


# ── CLI ─────────────────────────────────────────────────────────────────

def main():
    parser = argparse.ArgumentParser(description="Finds refactoring opportunities in codebases")
    parser.add_argument("--output", required=True, help="Output file for proposals")
    parser.add_argument("--dry-run", action="store_true", help="Don't write output file")
    parser.add_argument("--directory", default=".", help="Directory to scan")
    parser.add_argument("--min-score", type=float, default=30.0, help="Minimum score threshold")

    args = parser.parse_args()

    proposals = generate_proposals(args.directory, args.min_score)

    if not args.dry_run:
        os.makedirs(os.path.dirname(args.output) or '.', exist_ok=True)
        with open(args.output, "w") as f:
            json.dump({"proposals": proposals}, f, indent=2)
        print(f"Generated {len(proposals)} proposals -> {args.output}")
    else:
        print(f"Would generate {len(proposals)} proposals")
        for p in proposals:
            print(f"  - {p['title']}")


if __name__ == "__main__":
    main()
@@ -1,353 +0,0 @@
#!/usr/bin/env python3
"""
sampler.py — Score and rank sessions by harvest value.

With 20k+ sessions on disk, we can't harvest all at once. This script
scores each session by how likely it is to contain valuable knowledge,
so the harvester processes the best ones first.

Scoring strategy:
- Recency: last 7d=3pts, last 30d=2pts, older=1pt
- Length: >50 messages=3pts, >20=2pts, <20=1pt
- Repo uniqueness: first session for a repo=5pts, otherwise=1pt
- Outcome: failure=3pts (most to learn), success=2pts, unknown=1pt
- Tool calls: >10 tool invocations=2pts (complex sessions)

Usage:
    python3 sampler.py --count 100                       # Top 100 sessions
    python3 sampler.py --repo the-nexus --count 20       # Top 20 for a repo
    python3 sampler.py --since 2026-04-01                # All sessions since date
    python3 sampler.py --count 50 --min-score 8          # Only high-value sessions
    python3 sampler.py --count 100 --output sample.json  # Save to file
"""

import argparse
import json
import os
import sys
import time
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


# --- Fast session scanning (no full parse) ---

def scan_session_fast(path: str) -> dict:
    """Extract scoring metadata from a session without parsing the full JSONL.

    Reads only: first line, last ~20 lines, and line count. This processes
    20k sessions in seconds instead of minutes.
    """
    meta = {
        'path': path,
        'message_count': 0,
        'has_tool_calls': False,
        'tool_call_count': 0,
        'first_timestamp': '',
        'last_timestamp': '',
        'is_failure': False,
        'repos_mentioned': set(),
        'first_role': '',
        'last_content_preview': '',
    }

    try:
        file_size = os.path.getsize(path)
        if file_size == 0:
            return meta

        with open(path, 'r', encoding='utf-8', errors='replace') as f:
            # Read first line for timestamp + role
            first_line = f.readline().strip()
            if first_line:
                try:
                    first_msg = json.loads(first_line)
                    meta['first_timestamp'] = first_msg.get('timestamp', '')
                    meta['first_role'] = first_msg.get('role', '')
                except json.JSONDecodeError:
                    pass

            # Fast line count + collect tail lines
            # For the tail, seek to near end of file
            tail_lines = []
            line_count = 1  # already read first

            if file_size > 8192:
                # Seek to last 8KB for tail sampling
                f.seek(max(0, file_size - 8192))
                f.readline()  # skip partial line
                for line in f:
                    line = line.strip()
                    if line:
                        tail_lines.append(line)
                        line_count += 1
                # We lost the exact count for big files — estimate from file size
                # Average JSONL line is ~500 bytes
                if line_count < 100:
                    line_count = max(line_count, file_size // 500)
            else:
                # Small file — read all
                for line in f:
                    line = line.strip()
                    if line:
                        tail_lines.append(line)
                        line_count += 1

        meta['message_count'] = line_count

        # Parse tail lines for outcome, tool calls, repos
        for line in tail_lines[-30:]:  # last 30 non-empty lines
            try:
                msg = json.loads(line)

                # Track last timestamp
                ts = msg.get('timestamp', '')
                if ts:
                    meta['last_timestamp'] = ts

                # Count tool calls
                if msg.get('tool_calls'):
                    meta['has_tool_calls'] = True
                    meta['tool_call_count'] += len(msg['tool_calls'])

                # Detect failure signals in content
                content = ''
                if isinstance(msg.get('content'), str):
                    content = msg['content'].lower()
                elif isinstance(msg.get('content'), list):
                    for part in msg['content']:
                        if isinstance(part, dict) and part.get('type') == 'text':
                            content += part.get('text', '').lower()

                if content:
                    meta['last_content_preview'] = content[:200]
                    failure_signals = ['error', 'failed', 'cannot', 'unable',
                                       'exception', 'traceback', 'rejected', 'denied']
                    if any(sig in content for sig in failure_signals):
                        meta['is_failure'] = True

                # Extract repo references from tool call arguments
                if msg.get('tool_calls'):
                    for tc in msg['tool_calls']:
                        args = tc.get('function', {}).get('arguments', '')
                        if isinstance(args, str):
                            # Look for repo patterns
                            for pattern in ['Timmy_Foundation/', 'Rockachopa/', 'compounding-intelligence', 'the-nexus', 'timmy-home', 'hermes-agent', 'the-beacon', 'the-door']:
                                if pattern in args:
                                    repo = pattern.rstrip('/')
                                    meta['repos_mentioned'].add(repo)

            except json.JSONDecodeError:
                continue

    except (IOError, OSError):
        pass

    meta['repos_mentioned'] = list(meta['repos_mentioned'])
    return meta

# --- Filename timestamp parsing ---

def parse_session_timestamp(filename: str) -> Optional[datetime]:
    """Parse timestamp from session filename.

    Common formats:
        session_20260413_123456_hash.jsonl
        20260413_123456_hash.jsonl
    """
    stem = Path(filename).stem
    parts = stem.split('_')

    # Try session_YYYYMMDD_HHMMSS format
    for i, part in enumerate(parts):
        if len(part) == 8 and part.isdigit():
            date_part = part
            time_part = parts[i + 1] if i + 1 < len(parts) and len(parts[i + 1]) == 6 else '000000'
            try:
                return datetime.strptime(f"{date_part}_{time_part}", '%Y%m%d_%H%M%S').replace(tzinfo=timezone.utc)
            except ValueError:
                continue

    # Fallback: caller should use file modification time
    return None


# --- Scoring ---

def score_session(meta: dict, now: datetime, seen_repos: set) -> tuple[int, dict]:
    """Score a session for harvest value. Returns (score, breakdown)."""
    score = 0
    breakdown = {}

    # 1. Recency
    ts = parse_session_timestamp(os.path.basename(meta['path']))
    if ts is None:
        # Fallback to mtime
        try:
            ts = datetime.fromtimestamp(os.path.getmtime(meta['path']), tz=timezone.utc)
        except OSError:
            ts = now - timedelta(days=365)

    age_days = (now - ts).days
    if age_days <= 7:
        recency = 3
    elif age_days <= 30:
        recency = 2
    else:
        recency = 1
    score += recency
    breakdown['recency'] = recency

    # 2. Length
    count = meta['message_count']
    if count > 50:
        length = 3
    elif count > 20:
        length = 2
    else:
        length = 1
    score += length
    breakdown['length'] = length

    # 3. Repo uniqueness (first session mentioning a repo gets bonus)
    repo_score = 0
    for repo in meta.get('repos_mentioned', []):
        if repo not in seen_repos:
            seen_repos.add(repo)
            repo_score = max(repo_score, 5)
        else:
            repo_score = max(repo_score, 1)
    score += repo_score
    breakdown['repo_unique'] = repo_score

    # 4. Outcome
    if meta.get('is_failure'):
        outcome = 3
    elif meta.get('last_content_preview', '').strip():
        outcome = 2  # has some content = likely completed
    else:
        outcome = 1
    score += outcome
    breakdown['outcome'] = outcome

    # 5. Tool calls
    if meta.get('tool_call_count', 0) > 10:
        tool = 2
|
||||
else:
|
||||
tool = 0
|
||||
score += tool
|
||||
breakdown['tool_calls'] = tool
|
||||
|
||||
return score, breakdown
|
||||
|
||||
|
||||
# --- Main ---
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(description="Score and rank sessions for harvesting")
|
||||
parser.add_argument('--sessions-dir', default=os.path.expanduser('~/.hermes/sessions'),
|
||||
help='Directory containing session files')
|
||||
parser.add_argument('--count', type=int, default=100, help='Number of top sessions to return')
|
||||
parser.add_argument('--repo', default='', help='Filter to sessions mentioning this repo')
|
||||
parser.add_argument('--since', default='', help='Only score sessions after this date (YYYY-MM-DD)')
|
||||
parser.add_argument('--min-score', type=int, default=0, help='Minimum score threshold')
|
||||
parser.add_argument('--output', default='', help='Output file (JSON). Default: stdout')
|
||||
parser.add_argument('--format', choices=['json', 'paths', 'table'], default='table',
|
||||
help='Output format: json (full), paths (one per line), table (human)')
|
||||
parser.add_argument('--top-percent', type=float, default=0, help='Return top N%% instead of --count')
|
||||
|
||||
args = parser.parse_args()
|
||||
|
||||
sessions_dir = Path(args.sessions_dir)
|
||||
if not sessions_dir.is_dir():
|
||||
print(f"ERROR: Sessions directory not found: {sessions_dir}", file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
# Find all JSONL files
|
||||
print(f"Scanning {sessions_dir}...", file=sys.stderr)
|
||||
t0 = time.time()
|
||||
|
||||
session_files = list(sessions_dir.glob('*.jsonl'))
|
||||
total = len(session_files)
|
||||
print(f"Found {total} session files", file=sys.stderr)
|
||||
|
||||
# Parse since date
|
||||
since_dt = None
|
||||
if args.since:
|
||||
since_dt = datetime.strptime(args.since, '%Y-%m-%d').replace(tzinfo=timezone.utc)
|
||||
|
||||
# Score all sessions
|
||||
now = datetime.now(timezone.utc)
|
||||
seen_repos = set() # Track repos for uniqueness scoring
|
||||
scored = []
|
||||
|
||||
for i, sf in enumerate(session_files):
|
||||
# Date filter (fast path: check filename first)
|
||||
if since_dt:
|
||||
ts = parse_session_timestamp(sf.name)
|
||||
if ts and ts < since_dt:
|
||||
continue
|
||||
|
||||
meta = scan_session_fast(str(sf))
|
||||
|
||||
# Repo filter
|
||||
if args.repo:
|
||||
repos = meta.get('repos_mentioned', [])
|
||||
if args.repo.lower() not in [r.lower() for r in repos]:
|
||||
# Also check filename
|
||||
if args.repo.lower() not in sf.name.lower():
|
||||
continue
|
||||
|
||||
score, breakdown = score_session(meta, now, seen_repos)
|
||||
|
||||
if score >= args.min_score:
|
||||
scored.append({
|
||||
'path': str(sf),
|
||||
'filename': sf.name,
|
||||
'score': score,
|
||||
'breakdown': breakdown,
|
||||
'message_count': meta['message_count'],
|
||||
'repos': meta['repos_mentioned'],
|
||||
'is_failure': meta['is_failure'],
|
||||
})
|
||||
|
||||
if (i + 1) % 5000 == 0:
|
||||
elapsed = time.time() - t0
|
||||
print(f" Scanned {i + 1}/{total} ({elapsed:.1f}s)", file=sys.stderr)
|
||||
|
||||
elapsed = time.time() - t0
|
||||
print(f"Scored {len(scored)} sessions in {elapsed:.1f}s", file=sys.stderr)
|
||||
|
||||
# Sort by score descending
|
||||
scored.sort(key=lambda x: x['score'], reverse=True)
|
||||
|
||||
# Apply count or percent
|
||||
if args.top_percent > 0:
|
||||
count = max(1, int(len(scored) * args.top_percent / 100))
|
||||
else:
|
||||
count = args.count
|
||||
scored = scored[:count]
|
||||
|
||||
# Output
|
||||
if args.output:
|
||||
with open(args.output, 'w', encoding='utf-8') as f:
|
||||
json.dump(scored, f, indent=2)
|
||||
print(f"Wrote {len(scored)} sessions to {args.output}", file=sys.stderr)
|
||||
elif args.format == 'json':
|
||||
json.dump(scored, sys.stdout, indent=2)
|
||||
elif args.format == 'paths':
|
||||
for s in scored:
|
||||
print(s['path'])
|
||||
else: # table
|
||||
print(f"{'SCORE':>5} {'MSGS':>5} {'REPOS':<25} {'FILE'}")
|
||||
print(f"{'-'*5} {'-'*5} {'-'*25} {'-'*40}")
|
||||
for s in scored:
|
||||
repos = ', '.join(s['repos'][:2]) if s['repos'] else '-'
|
||||
fail = ' FAIL' if s['is_failure'] else ''
|
||||
print(f"{s['score']:>5} {s['message_count']:>5} {repos:<25} {s['filename'][:40]}{fail}")
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
@@ -1,276 +0,0 @@
#!/usr/bin/env python3
"""
session_metadata.py - Extract structured metadata from Hermes session transcripts.
Works alongside session_reader.py to provide higher-level session analysis.
"""

import json
import re
import sys
from dataclasses import dataclass, asdict
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional, Any

# Import from session_reader (the canonical reader)
from session_reader import read_session


@dataclass
class SessionSummary:
    """Structured summary of a Hermes session transcript."""
    session_id: str
    model: str
    repo: str
    outcome: str
    message_count: int
    tool_calls: int
    duration_estimate: str
    key_actions: List[str]
    errors_encountered: List[str]
    start_time: Optional[str] = None
    end_time: Optional[str] = None
    total_tokens_estimate: int = 0
    user_messages: int = 0
    assistant_messages: int = 0
    tool_outputs: int = 0


def extract_session_metadata(file_path: str) -> SessionSummary:
    """
    Extract structured metadata from a Hermes session JSONL transcript.
    Uses session_reader.read_session() for file reading.
    """
    session_id = Path(file_path).stem
    messages = []
    model = "unknown"
    repo = "unknown"
    tool_calls_count = 0
    key_actions = []
    errors = []
    start_time = None
    end_time = None
    total_tokens = 0

    # Common repo patterns to look for
    repo_patterns = [
        r"(?:the-nexus|compounding-intelligence|timmy-config|hermes-agent)",
        r"(?:forge\.alexanderwhitestone\.com/([^/]+/[^/\s]+))",
        r"(?:github\.com/([^/]+/[^/\s]+))",
        r"(?:Timmy_Foundation/([^/\s]+))",
    ]

    try:
        # Use the canonical reader from session_reader.py
        messages = read_session(file_path)
    except FileNotFoundError:
        return SessionSummary(
            session_id=session_id,
            model="unknown",
            repo="unknown",
            outcome="failure",
            message_count=0,
            tool_calls=0,
            duration_estimate="0m",
            key_actions=[],
            errors_encountered=[f"File not found: {file_path}"]
        )

    # Process messages for metadata
    for entry in messages:
        # Extract model from assistant messages
        if entry.get("role") == "assistant" and entry.get("model"):
            model = entry["model"]

        # Extract timestamps
        if entry.get("timestamp"):
            ts = entry["timestamp"]
            if start_time is None:
                start_time = ts
            end_time = ts

        # Count tool calls
        if entry.get("tool_calls"):
            tool_calls_count += len(entry["tool_calls"])
            for tc in entry["tool_calls"]:
                if tc.get("function", {}).get("name"):
                    action = f"{tc['function']['name']}"
                    if action not in key_actions:
                        key_actions.append(action)

        # Estimate tokens from content length
        content = entry.get("content", "")
        if isinstance(content, str):
            total_tokens += len(content.split())
        elif isinstance(content, list):
            for item in content:
                if isinstance(item, dict) and "text" in item:
                    total_tokens += len(item["text"].split())

        # Look for repo mentions in content
        if entry.get("content"):
            content_str = str(entry["content"])
            for pattern in repo_patterns:
                match = re.search(pattern, content_str, re.IGNORECASE)
                if match:
                    if match.groups():
                        repo = match.group(1)
                    else:
                        repo = match.group(0)
                    break

        # Look for error messages
        if entry.get("role") == "tool" and entry.get("is_error"):
            error_msg = entry.get("content", "Unknown error")
            if isinstance(error_msg, str) and len(error_msg) < 200:
                errors.append(error_msg[:200])

    # Count message types
    user_messages = sum(1 for m in messages if m.get("role") == "user")
    assistant_messages = sum(1 for m in messages if m.get("role") == "assistant")
    tool_outputs = sum(1 for m in messages if m.get("role") == "tool")

    # Calculate duration estimate
    duration_estimate = "unknown"
    if start_time and end_time:
        try:
            # Try to parse timestamps
            start_dt = None
            end_dt = None

            # Handle various timestamp formats
            for fmt in ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S"]:
                try:
                    if start_dt is None:
                        start_dt = datetime.strptime(start_time, fmt)
                    if end_dt is None:
                        end_dt = datetime.strptime(end_time, fmt)
                except ValueError:
                    continue

            if start_dt and end_dt:
                duration = end_dt - start_dt
                minutes = duration.total_seconds() / 60
                duration_estimate = f"{minutes:.0f}m"
        except Exception:
            pass

    # Classify outcome
    outcome = "unknown"
    if errors:
        # Check if any errors are fatal
        fatal_errors = any("405" in e or "permission" in e.lower() or "authentication" in e.lower()
                           for e in errors)
        if fatal_errors:
            outcome = "failure"
        else:
            outcome = "partial"
    elif messages:
        # Check last message for success indicators
        last_msg = messages[-1]
        if last_msg.get("role") == "assistant":
            content = last_msg.get("content", "")
            if isinstance(content, str):
                success_indicators = ["done", "completed", "success", "merged", "pushed"]
                if any(indicator in content.lower() for indicator in success_indicators):
                    outcome = "success"
                else:
                    outcome = "unknown"

    # Deduplicate key actions (keep unique, limit to 10)
    unique_actions = []
    for action in key_actions:
        if action not in unique_actions:
            unique_actions.append(action)
            if len(unique_actions) >= 10:
                break

    # Deduplicate errors (keep unique, limit to 5)
    unique_errors = []
    for error in errors:
        if error not in unique_errors:
            unique_errors.append(error)
            if len(unique_errors) >= 5:
                break

    return SessionSummary(
        session_id=session_id,
        model=model,
        repo=repo,
        outcome=outcome,
        message_count=len(messages),
        tool_calls=tool_calls_count,
        duration_estimate=duration_estimate,
        key_actions=unique_actions,
        errors_encountered=unique_errors,
        start_time=start_time,
        end_time=end_time,
        total_tokens_estimate=total_tokens,
        user_messages=user_messages,
        assistant_messages=assistant_messages,
        tool_outputs=tool_outputs
    )


def process_session_directory(directory_path: str, output_file: Optional[str] = None) -> List[SessionSummary]:
    """
    Process all JSONL files in a directory.
    """
    directory = Path(directory_path)
    if not directory.exists():
        print(f"Error: Directory {directory_path} does not exist", file=sys.stderr)
        return []

    jsonl_files = list(directory.glob("*.jsonl"))
    if not jsonl_files:
        print(f"Warning: No JSONL files found in {directory_path}", file=sys.stderr)
        return []

    summaries = []
    for jsonl_file in sorted(jsonl_files):
        print(f"Processing {jsonl_file.name}...", file=sys.stderr)
        summary = extract_session_metadata(str(jsonl_file))
        summaries.append(summary)

    if output_file:
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump([asdict(s) for s in summaries], f, indent=2)
        print(f"Wrote {len(summaries)} summaries to {output_file}", file=sys.stderr)

    return summaries


def main():
    """CLI entry point."""
    import argparse

    parser = argparse.ArgumentParser(description="Extract metadata from Hermes session JSONL transcripts")
    parser.add_argument("path", help="Path to JSONL file or directory of session files")
    parser.add_argument("-o", "--output", help="Output JSON file (default: stdout)")
    parser.add_argument("-v", "--verbose", action="store_true", help="Verbose output")

    args = parser.parse_args()

    path = Path(args.path)

    if path.is_file():
        summary = extract_session_metadata(str(path))
        if args.output:
            with open(args.output, 'w') as f:
                json.dump(asdict(summary), f, indent=2)
            print(f"Wrote summary to {args.output}", file=sys.stderr)
        else:
            print(json.dumps(asdict(summary), indent=2))

    elif path.is_dir():
        summaries = process_session_directory(str(path), args.output)
        if not args.output:
            print(json.dumps([asdict(s) for s in summaries], indent=2))

    else:
        print(f"Error: {args.path} is not a file or directory", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()
@@ -1,234 +0,0 @@
#!/usr/bin/env python3
"""
Session Transcript → Training Pair Harvester

Scans Hermes session JSONL files for Q&A patterns and extracts
terse→rich training pairs. Outputs JSONL matching the timmy-config
training pairs spec.

Usage:
    python3 scripts/session_pair_harvester.py ~/.hermes/sessions/
    python3 scripts/session_pair_harvester.py session.jsonl --output pairs.jsonl
    python3 scripts/session_pair_harvester.py --dir ~/.hermes/sessions/ --min-ratio 2.0

Output format:
    {"terse": "user short prompt", "rich": "ai detailed response", "source": "session_id", "model": "..."}
"""

import argparse
import hashlib
import json
import sys
from pathlib import Path
from typing import Optional


def compute_hash(text: str) -> str:
    """Content hash for deduplication."""
    return hashlib.sha256(text.encode()).hexdigest()[:16]


def extract_pairs_from_session(session_data: dict, min_ratio: float = 1.5,
                               min_response_words: int = 20) -> list:
    """Extract terse→rich pairs from a single session object."""
    pairs = []
    conversations = session_data.get("conversations", [])
    session_id = session_data.get("id", "unknown")
    model = session_data.get("model", "unknown")

    seen_hashes = set()

    for i, msg in enumerate(conversations):
        # Look for assistant/gpt responses
        if msg.get("from") not in ("gpt", "assistant"):
            continue

        response_text = msg.get("value", "")
        if not response_text or len(response_text.split()) < min_response_words:
            continue

        # Find the preceding human message
        prompt_text = ""
        for j in range(i - 1, -1, -1):
            if conversations[j].get("from") == "human":
                prompt_text = conversations[j].get("value", "")
                break

        if not prompt_text:
            continue

        # Filter: skip tool results and system messages embedded as human turns
        if prompt_text.startswith("{") and "output" in prompt_text[:100]:
            continue  # likely a tool result
        if prompt_text.startswith("# SOUL.md") or prompt_text.startswith("You are"):
            continue  # system prompt leak

        # Quality filters
        prompt_words = len(prompt_text.split())
        response_words = len(response_text.split())

        # Must have a meaningful length ratio
        if prompt_words == 0 or response_words == 0:
            continue
        ratio = response_words / prompt_words
        if ratio < min_ratio:
            continue

        # Skip responses that are mostly code
        code_blocks = response_text.count("```")
        if code_blocks >= 4 and len(response_text.replace("```", "").strip()) < 50:
            continue

        # Skip responses with tool call artifacts
        if "tool_call" in response_text[:100] or "function_call" in response_text[:100]:
            continue

        # Deduplicate by content hash
        content_hash = compute_hash(prompt_text + response_text[:200])
        if content_hash in seen_hashes:
            continue
        seen_hashes.add(content_hash)

        # Clean up response: remove markdown headers if too many
        clean_response = response_text

        pairs.append({
            "terse": prompt_text.strip(),
            "rich": clean_response.strip(),
            "source": session_id,
            "model": model,
            "prompt_words": prompt_words,
            "response_words": response_words,
            "ratio": round(ratio, 2),
        })

    return pairs


def extract_from_jsonl_file(filepath: str, **kwargs) -> list:
    """Extract pairs from a session JSONL file."""
    pairs = []
    path = Path(filepath)

    if not path.exists():
        print(f"Warning: {filepath} not found", file=sys.stderr)
        return pairs

    content = path.read_text()
    lines = content.strip().split("\n")

    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            session = json.loads(line)
        except json.JSONDecodeError:
            continue

        session_pairs = extract_pairs_from_session(session, **kwargs)
        pairs.extend(session_pairs)

    return pairs


def deduplicate_pairs(pairs: list) -> list:
    """Remove duplicate pairs across files."""
    seen = set()
    unique = []
    for pair in pairs:
        key = compute_hash(pair["terse"] + pair["rich"][:200])
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique


def main():
    parser = argparse.ArgumentParser(description="Harvest training pairs from session transcripts")
    parser.add_argument("input", nargs="?", help="Session JSONL file or directory")
    parser.add_argument("--dir", "-d", help="Directory to scan for session files")
    parser.add_argument("--output", "-o", default="harvested_pairs.jsonl", help="Output file")
    parser.add_argument("--min-ratio", type=float, default=1.5, help="Min response/prompt word ratio")
    parser.add_argument("--min-words", type=int, default=20, help="Min response word count")
    parser.add_argument("--dry-run", action="store_true", help="Print stats without writing")
    args = parser.parse_args()

    all_pairs = []
    files_scanned = 0

    scan_dir = args.dir or args.input
    if not scan_dir:
        parser.print_help()
        sys.exit(1)

    scan_path = Path(scan_dir)
    if scan_path.is_dir():
        jsonl_files = sorted(scan_path.rglob("*.jsonl"))
        print(f"Scanning {len(jsonl_files)} files in {scan_dir}...", file=sys.stderr)
        for fpath in jsonl_files:
            pairs = extract_from_jsonl_file(
                str(fpath),
                min_ratio=args.min_ratio,
                min_response_words=args.min_words
            )
            all_pairs.extend(pairs)
            files_scanned += 1
    else:
        pairs = extract_from_jsonl_file(
            str(scan_path),
            min_ratio=args.min_ratio,
            min_response_words=args.min_words
        )
        all_pairs.extend(pairs)
        files_scanned = 1

    # Deduplicate
    unique_pairs = deduplicate_pairs(all_pairs)

    # Stats
    if unique_pairs:
        avg_prompt = sum(p["prompt_words"] for p in unique_pairs) / len(unique_pairs)
        avg_response = sum(p["response_words"] for p in unique_pairs) / len(unique_pairs)
        avg_ratio = sum(p["ratio"] for p in unique_pairs) / len(unique_pairs)
    else:
        avg_prompt = avg_response = avg_ratio = 0

    stats = {
        "files_scanned": files_scanned,
        "raw_pairs": len(all_pairs),
        "unique_pairs": len(unique_pairs),
        "duplicates_removed": len(all_pairs) - len(unique_pairs),
        "avg_prompt_words": round(avg_prompt, 1),
        "avg_response_words": round(avg_response, 1),
        "avg_ratio": round(avg_ratio, 2),
    }

    print(json.dumps(stats, indent=2), file=sys.stderr)

    if args.dry_run:
        # Print sample pairs
        for pair in unique_pairs[:3]:
            print(f"\n--- Source: {pair['source']} (ratio: {pair['ratio']}) ---", file=sys.stderr)
            print(f"TERSE: {pair['terse'][:100]}...", file=sys.stderr)
            print(f"RICH: {pair['rich'][:150]}...", file=sys.stderr)
        return

    # Write output
    output_path = Path(args.output)
    with open(output_path, "w") as f:
        for pair in unique_pairs:
            # Strip internal fields for output
            output = {
                "terse": pair["terse"],
                "rich": pair["rich"],
                "source": pair["source"],
                "model": pair["model"],
            }
            f.write(json.dumps(output) + "\n")

    print(f"\nWrote {len(unique_pairs)} pairs to {output_path}", file=sys.stderr)


if __name__ == "__main__":
    main()
@@ -1,142 +1,298 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
session_reader.py — Parse Hermes session JSONL transcripts.
|
||||
|
||||
Each line in a session file is a JSON object representing a message.
|
||||
Standard fields: role (user|assistant|system), content (str), timestamp (str).
|
||||
Tool calls and tool results are also captured.
|
||||
Hermes Session JSONL Transcript Parser
|
||||
Parses JSONL session transcripts and extracts structured data.
|
||||
Part of the compounding-intelligence harvester pipeline.
|
||||
"""
|
||||
|
||||
import json
|
||||
import re
|
||||
import sys
|
||||
import os
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Iterator, Optional
|
||||
from typing import Dict, List, Optional, Any
|
||||
from dataclasses import dataclass, asdict
|
||||
|
||||
|
||||
def read_session(path: str) -> list[dict]:
|
||||
"""Read a session JSONL file and return all messages as a list."""
|
||||
messages = []
|
||||
with open(path, 'r', encoding='utf-8') as f:
|
||||
for line_num, line in enumerate(f, 1):
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
msg = json.loads(line)
|
||||
messages.append(msg)
|
||||
except json.JSONDecodeError as e:
|
||||
print(f"WARNING: Skipping malformed JSON at line {line_num}: {e}", file=sys.stderr)
|
||||
return messages
|
||||
@dataclass
|
||||
class SessionSummary:
|
||||
"""Structured summary of a Hermes session transcript."""
|
||||
session_id: str
|
||||
model: str
|
||||
repo: str
|
||||
outcome: str
|
||||
message_count: int
|
||||
tool_calls: int
|
||||
duration_estimate: str
|
||||
key_actions: List[str]
|
||||
errors_encountered: List[str]
|
||||
start_time: Optional[str] = None
|
||||
end_time: Optional[str] = None
|
||||
total_tokens_estimate: int = 0
|
||||
user_messages: int = 0
|
||||
assistant_messages: int = 0
|
||||
tool_outputs: int = 0
|
||||
|
||||
|
||||
def read_session_iter(path: str) -> Iterator[dict]:
|
||||
"""Iterate over session messages without loading all into memory."""
|
||||
with open(path, 'r', encoding='utf-8') as f:
|
||||
for line_num, line in enumerate(f, 1):
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
yield json.loads(line)
|
||||
except json.JSONDecodeError as e:
|
||||
print(f"WARNING: Skipping malformed JSON at line {line_num}: {e}", file=sys.stderr)
|
||||
|
||||
|
||||
def extract_conversation(messages: list[dict]) -> list[dict]:
|
||||
"""Extract user/assistant conversation turns, skipping tool-only messages."""
|
||||
conversation = []
|
||||
for msg in messages:
|
||||
role = msg.get('role', '')
|
||||
content = msg.get('content', '')
|
||||
|
||||
# Skip empty messages and pure tool calls
|
||||
if role in ('user', 'assistant', 'system'):
|
||||
if isinstance(content, str) and content.strip():
|
||||
conversation.append({
|
||||
'role': role,
|
||||
'content': content.strip(),
|
||||
'timestamp': msg.get('timestamp', '')
|
||||
})
|
||||
elif isinstance(content, list):
|
||||
# Multimodal content — extract text parts
|
||||
text_parts = []
|
||||
for part in content:
|
||||
if isinstance(part, dict) and part.get('type') == 'text':
|
||||
text_parts.append(part.get('text', ''))
|
||||
if text_parts:
|
||||
conversation.append({
|
||||
'role': role,
|
||||
'content': '\n'.join(text_parts),
|
||||
'timestamp': msg.get('timestamp', '')
|
||||
})
|
||||
return conversation
|
||||
|
||||
|
||||
def truncate_for_context(messages: list[dict], head: int = 50, tail: int = 50) -> list[dict]:
|
||||
"""Truncate long sessions: keep first N + last N messages.
|
||||
|
||||
This preserves session start (initial context) and end (final results),
|
||||
skipping the messy middle of long debugging sessions.
|
||||
def parse_jsonl_session(file_path: str) -> SessionSummary:
|
||||
"""
|
||||
if len(messages) <= head + tail:
|
||||
return messages
|
||||
Parse a Hermes session JSONL transcript and extract structured data.
|
||||
|
||||
truncated = messages[:head]
|
||||
truncated.append({
|
||||
'role': 'system',
|
||||
'content': f'[{len(messages) - head - tail} messages truncated]',
|
||||
'timestamp': ''
|
||||
})
|
||||
truncated.extend(messages[-tail:])
|
||||
return truncated
|
||||
|
||||
|
||||
def messages_to_text(messages: list[dict]) -> str:
|
||||
"""Convert message list to plain text for LLM consumption."""
|
||||
lines = []
|
||||
for msg in messages:
|
||||
role = msg.get('role', 'unknown').upper()
|
||||
content = msg.get('content', '')
|
||||
if msg.get('role') == 'system' and 'truncated' in content:
|
||||
lines.append(f'--- {content} ---')
|
||||
Args:
|
||||
file_path: Path to the JSONL session file
|
||||
|
||||
Returns:
|
||||
SessionSummary with extracted data
|
||||
"""
|
||||
session_id = Path(file_path).stem
|
||||
messages = []
|
||||
model = "unknown"
|
||||
repo = "unknown"
|
||||
tool_calls_count = 0
|
||||
key_actions = []
|
||||
errors = []
|
||||
start_time = None
|
||||
end_time = None
|
||||
total_tokens = 0
|
||||
|
||||
# Common repo patterns to look for
|
||||
repo_patterns = [
|
||||
r"(?:the-nexus|compounding-intelligence|timmy-config|hermes-agent)",
|
||||
r"(?:forge\.alexanderwhitestone\.com/([^/]+/[^/\\s]+))",
|
||||
r"(?:github\.com/([^/]+/[^/\\s]+))",
|
||||
r"(?:Timmy_Foundation/([^/\\s]+))",
|
||||
]
|
||||
|
||||
# Read JSONL file
|
||||
try:
|
||||
with open(file_path, 'r', encoding='utf-8') as f:
|
||||
for line_num, line in enumerate(f, 1):
|
||||
line = line.strip()
|
||||
if not line:
|
||||
continue
|
||||
|
||||
try:
|
||||
entry = json.loads(line)
|
||||
except json.JSONDecodeError as e:
|
||||
errors.append(f"Line {line_num}: Invalid JSON - {e}")
|
||||
continue
|
||||
|
||||
messages.append(entry)
|
||||
|
||||
# Extract model from assistant messages
|
||||
if entry.get("role") == "assistant" and entry.get("model"):
|
||||
model = entry["model"]
|
||||
|
||||
# Extract timestamps
|
||||
if entry.get("timestamp"):
|
||||
ts = entry["timestamp"]
|
||||
if start_time is None:
|
||||
start_time = ts
|
||||
end_time = ts
|
||||
|
||||
# Count tool calls
|
||||
if entry.get("tool_calls"):
|
||||
tool_calls_count += len(entry["tool_calls"])
|
||||
for tc in entry["tool_calls"]:
|
||||
if tc.get("function", {}).get("name"):
|
||||
action = f"{tc['function']['name']}"
|
||||
                if action not in key_actions:
                    key_actions.append(action)

            # Estimate tokens from content length
            content = entry.get("content", "")
            if isinstance(content, str):
                total_tokens += len(content.split())
            elif isinstance(content, list):
                for item in content:
                    if isinstance(item, dict) and "text" in item:
                        total_tokens += len(item["text"].split())

            # Look for repo mentions in content
            if entry.get("content"):
                content_str = str(entry["content"])
                for pattern in repo_patterns:
                    match = re.search(pattern, content_str, re.IGNORECASE)
                    if match:
                        if match.groups():
                            repo = match.group(1)
                        else:
                            repo = match.group(0)
                        break

            # Look for error messages
            if entry.get("role") == "tool" and entry.get("is_error"):
                error_msg = entry.get("content", "Unknown error")
                if isinstance(error_msg, str) and len(error_msg) < 200:
                    errors.append(error_msg[:200])

    except FileNotFoundError:
        return SessionSummary(
            session_id=session_id,
            model="unknown",
            repo="unknown",
            outcome="failure",
            message_count=0,
            tool_calls=0,
            duration_estimate="0m",
            key_actions=[],
            errors_encountered=[f"File not found: {file_path}"]
        )

    # Count message types
    user_messages = sum(1 for m in messages if m.get("role") == "user")
    assistant_messages = sum(1 for m in messages if m.get("role") == "assistant")
    tool_outputs = sum(1 for m in messages if m.get("role") == "tool")

    # Calculate duration estimate
    duration_estimate = "unknown"
    if start_time and end_time:
        try:
            # Try to parse timestamps
            start_dt = None
            end_dt = None

            # Handle various timestamp formats
            for fmt in ["%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S"]:
                try:
                    if start_dt is None:
                        start_dt = datetime.strptime(start_time, fmt)
                    if end_dt is None:
                        end_dt = datetime.strptime(end_time, fmt)
                except ValueError:
                    continue

            if start_dt and end_dt:
                duration = end_dt - start_dt
                minutes = duration.total_seconds() / 60
                duration_estimate = f"{minutes:.0f}m"
        except Exception:
            pass
    # Classify outcome
    outcome = "unknown"
    if errors:
        # Check if any errors are fatal
        fatal_errors = any("405" in e or "permission" in e.lower() or "authentication" in e.lower()
                           for e in errors)
        if fatal_errors:
            outcome = "failure"
        else:
            outcome = "partial"
    elif messages:
        # Check last message for success indicators
        last_msg = messages[-1]
        if last_msg.get("role") == "assistant":
            content = last_msg.get("content", "")
            if isinstance(content, str):
                success_indicators = ["done", "completed", "success", "merged", "pushed"]
                if any(indicator in content.lower() for indicator in success_indicators):
                    outcome = "success"
                else:
                    outcome = "unknown"

    # Deduplicate key actions (keep unique, limit to 10)
    unique_actions = []
    for action in key_actions:
        if action not in unique_actions:
            unique_actions.append(action)
        if len(unique_actions) >= 10:
            break

    # Deduplicate errors (keep unique, limit to 5)
    unique_errors = []
    for error in errors:
        if error not in unique_errors:
            unique_errors.append(error)
        if len(unique_errors) >= 5:
            break

    return SessionSummary(
        session_id=session_id,
        model=model,
        repo=repo,
        outcome=outcome,
        message_count=len(messages),
        tool_calls=tool_calls_count,
        duration_estimate=duration_estimate,
        key_actions=unique_actions,
        errors_encountered=unique_errors,
        start_time=start_time,
        end_time=end_time,
        total_tokens_estimate=total_tokens,
        user_messages=user_messages,
        assistant_messages=assistant_messages,
        tool_outputs=tool_outputs
    )


def process_session_directory(directory_path: str, output_file: Optional[str] = None) -> List[SessionSummary]:
    """
    Process all JSONL files in a directory.

    Args:
        directory_path: Path to directory containing session JSONL files
        output_file: Optional path to write JSON output

    Returns:
        List of SessionSummary objects
    """
    directory = Path(directory_path)
    if not directory.exists():
        print(f"Error: Directory {directory_path} does not exist", file=sys.stderr)
        return []

    jsonl_files = list(directory.glob("session_*.jsonl"))
    if not jsonl_files:
        print(f"Warning: No session_*.jsonl files found in {directory_path}", file=sys.stderr)
        return []

    summaries = []
    for jsonl_file in sorted(jsonl_files):
        print(f"Processing {jsonl_file.name}...", file=sys.stderr)
        summary = parse_jsonl_session(str(jsonl_file))
        summaries.append(summary)

    if output_file:
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump([asdict(s) for s in summaries], f, indent=2)
        print(f"Wrote {len(summaries)} summaries to {output_file}", file=sys.stderr)

    return summaries


def main():
    """CLI entry point."""
    import argparse

    parser = argparse.ArgumentParser(description="Parse Hermes session JSONL transcripts")
    parser.add_argument("path", help="Path to JSONL file or directory of session files")
    parser.add_argument("-o", "--output", help="Output JSON file (default: stdout)")
    parser.add_argument("-v", "--verbose", action="store_true", help="Verbose output")

    args = parser.parse_args()

    path = Path(args.path)

    if path.is_file():
        summary = parse_jsonl_session(str(path))
        if args.output:
            with open(args.output, 'w') as f:
                json.dump(asdict(summary), f, indent=2)
            print(f"Wrote summary to {args.output}", file=sys.stderr)
        else:
            print(json.dumps(asdict(summary), indent=2))

    elif path.is_dir():
        summaries = process_session_directory(str(path), args.output)
        if not args.output:
            print(json.dumps([asdict(s) for s in summaries], indent=2))

    else:
        print(f"Error: {args.path} is not a file or directory", file=sys.stderr)
        sys.exit(1)


def messages_to_text(messages: list) -> str:
    """Render messages as 'role: content' text blocks."""
    lines = []
    for m in messages:
        role = m.get('role', '')
        content = m.get('content', '')
        lines.append(f'{role}: {content}')
    return '\n\n'.join(lines)


def get_session_metadata(path: str) -> dict:
    """Extract metadata from a session file (first message often has config info)."""
    messages = read_session(path)
    if not messages:
        return {'path': path, 'message_count': 0}
    first = messages[0]
    last = messages[-1]
    return {
        'path': path,
        'message_count': len(messages),
        'first_timestamp': first.get('timestamp', ''),
        'last_timestamp': last.get('timestamp', ''),
        'first_role': first.get('role', ''),
        'has_tool_calls': any(m.get('tool_calls') for m in messages),
    }


if __name__ == '__main__':
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} <session.jsonl>")
        sys.exit(1)

    path = sys.argv[1]
    meta = get_session_metadata(path)
    print(json.dumps(meta, indent=2))

    messages = read_session(path)
    conv = extract_conversation(messages)
    print(f"\nConversation: {len(conv)} turns")

    truncated = truncate_for_context(conv)
    print(f"After truncation: {len(truncated)} turns")
    print(f"\nPreview (first 500 chars):")
    print(messages_to_text(truncated[:5])[:500])


if __name__ == "__main__":
    main()
@@ -1,147 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/automation_opportunity_finder.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location(
    "aof",
    os.path.join(os.path.dirname(__file__) or ".", "automation_opportunity_finder.py"),
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)


def test_analyze_cron_jobs_no_file():
    """Returns empty list when no cron jobs file exists."""
    with tempfile.TemporaryDirectory() as tmpdir:
        result = mod.analyze_cron_jobs(tmpdir)
        assert result == []
    print("PASS: test_analyze_cron_jobs_no_file")


def test_analyze_cron_jobs_disabled():
    """Detects disabled cron jobs."""
    with tempfile.TemporaryDirectory() as tmpdir:
        cron_dir = os.path.join(tmpdir, "cron")
        os.makedirs(cron_dir)
        jobs = [
            {"id": "j1", "name": "backup", "enabled": False, "schedule": "0 * * * *"},
            {"id": "j2", "name": "health", "enabled": True, "schedule": "*/5 * * * *"},
        ]
        with open(os.path.join(cron_dir, "jobs.json"), "w") as f:
            json.dump(jobs, f)
        result = mod.analyze_cron_jobs(tmpdir)
        assert any(p["category"] == "cron_disabled" for p in result)
    print("PASS: test_analyze_cron_jobs_disabled")


def test_analyze_cron_jobs_errors():
    """Detects cron jobs with error status."""
    with tempfile.TemporaryDirectory() as tmpdir:
        cron_dir = os.path.join(tmpdir, "cron")
        os.makedirs(cron_dir)
        jobs = [
            {"id": "j1", "name": "broken", "enabled": True, "last_status": "error", "schedule": "0 * * * *"},
        ]
        with open(os.path.join(cron_dir, "jobs.json"), "w") as f:
            json.dump(jobs, f)
        result = mod.analyze_cron_jobs(tmpdir)
        assert any(p["category"] == "cron_errors" for p in result)
    print("PASS: test_analyze_cron_jobs_errors")


def test_analyze_documents_finds_todos():
    """Detects TODO markers in documents."""
    with tempfile.TemporaryDirectory() as tmpdir:
        docs_dir = os.path.join(tmpdir, "docs")
        os.makedirs(docs_dir)
        for i in range(3):
            with open(os.path.join(docs_dir, f"guide{i}.md"), "w") as f:
                f.write(f"# Guide {i}\n\nTODO: Automate this step\n")
        result = mod.analyze_documents([tmpdir])
        assert any(p["category"] == "manual_todo" for p in result)
        todo_proposals = [p for p in result if p["category"] == "manual_todo"]
        assert len(todo_proposals[0]["details"]) == 3
    print("PASS: test_analyze_documents_finds_todos")


def test_analyze_scripts_repeated_commands():
    """Detects repeated shell commands across scripts."""
    with tempfile.TemporaryDirectory() as tmpdir:
        scripts_dir = os.path.join(tmpdir, "scripts")
        os.makedirs(scripts_dir)
        repeated_cmd = "docker restart myapp"
        for i in range(4):
            with open(os.path.join(scripts_dir, f"deploy{i}.sh"), "w") as f:
                f.write(f"#!/bin/bash\n{repeated_cmd}\n")
        result = mod.analyze_scripts([tmpdir])
        assert any(p["category"] == "repeated_command" for p in result)
    print("PASS: test_analyze_scripts_repeated_commands")


def test_analyze_session_transcripts():
    """Detects repeated tool-call sequences."""
    with tempfile.TemporaryDirectory() as tmpdir:
        sessions_dir = os.path.join(tmpdir, "sessions")
        os.makedirs(sessions_dir)
        for i in range(4):
            with open(os.path.join(sessions_dir, f"session{i}.jsonl"), "w") as f:
                f.write(json.dumps({"role": "user", "content": f"task {i}"}) + "\n")
                f.write(json.dumps({
                    "role": "assistant",
                    "content": "working",
                    "tool_calls": [
                        {"function": {"name": "read_file"}},
                        {"function": {"name": "write_file"}},
                    ]
                }) + "\n")
        result = mod.analyze_session_transcripts([sessions_dir])
        assert any(p["category"] == "tool_sequence" for p in result)
        seq_proposals = [p for p in result if p["category"] == "tool_sequence"]
        assert any("read_file" in p["title"] and "write_file" in p["title"] for p in seq_proposals)
    print("PASS: test_analyze_session_transcripts")


def test_deduplicate_proposals():
    """Deduplicates proposals with similar titles."""
    proposals = [
        {"title": "TODO found 3 times", "category": "manual_todo", "confidence": 0.7, "impact": "medium", "description": "x", "sources": []},
        {"title": "TODO found 3 times", "category": "manual_todo", "confidence": 0.7, "impact": "medium", "description": "x", "sources": []},
        {"title": "FIXME found 5 times", "category": "manual_fixme", "confidence": 0.8, "impact": "medium", "description": "y", "sources": []},
    ]
    result = mod.deduplicate_proposals(proposals)
    assert len(result) == 2
    print("PASS: test_deduplicate_proposals")


def test_rank_proposals():
    """Ranks proposals by impact * confidence."""
    proposals = [
        {"title": "low", "category": "x", "confidence": 0.9, "impact": "low", "description": "", "sources": []},
        {"title": "high", "category": "x", "confidence": 0.8, "impact": "high", "description": "", "sources": []},
        {"title": "med", "category": "x", "confidence": 0.7, "impact": "medium", "description": "", "sources": []},
    ]
    result = mod.rank_proposals(proposals)
    assert result[0]["title"] == "high"
    assert result[-1]["title"] == "low"
    print("PASS: test_rank_proposals")


if __name__ == "__main__":
    tests = [v for k, v in globals().items() if k.startswith("test_")]
    passed = 0
    failed = 0
    for t in tests:
        try:
            t()
            passed += 1
        except Exception as e:
            print(f"FAIL: {t.__name__}: {e}")
            failed += 1
    print(f"\n{passed}/{passed+failed} tests passed")
    sys.exit(1 if failed else 0)
@@ -1,239 +0,0 @@
#!/usr/bin/env python3
"""
Tests for bootstrapper.py — context assembly from knowledge store.
"""

import json
import sys
import tempfile
from pathlib import Path

# Add scripts dir to path for import
sys.path.insert(0, str(Path(__file__).resolve().parent))

from bootstrapper import (
    build_bootstrap_context,
    estimate_tokens,
    filter_facts,
    load_index,
    sort_facts,
    truncate_to_tokens,
)


def make_index(facts: list[dict], tmp_dir: Path) -> Path:
    """Create a temporary index.json with given facts."""
    index = {
        "version": 1,
        "last_updated": "2026-04-13T20:00:00Z",
        "total_facts": len(facts),
        "facts": facts,
    }
    path = tmp_dir / "index.json"
    with open(path, "w") as f:
        json.dump(index, f)
    return path


def test_empty_index():
    """Empty knowledge store produces graceful output."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        index_path = make_index([], tmp_dir)

        # Create empty knowledge dirs
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        context = build_bootstrap_context(
            repo="the-nexus", index_path=index_path
        )
        assert "No relevant knowledge found" in context
        assert "Starting fresh" in context
    print("PASS: empty_index")


def test_filter_by_repo():
    """Filter facts by repository."""
    facts = [
        {"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
        {"fact": "B", "category": "fact", "repo": "fleet-ops", "confidence": 0.8},
        {"fact": "C", "category": "fact", "repo": "global", "confidence": 0.7},
    ]
    filtered = filter_facts(facts, repo="the-nexus", include_global=True)
    texts = [f["fact"] for f in filtered]
    assert "A" in texts
    assert "B" not in texts
    assert "C" in texts
    print("PASS: filter_by_repo")


def test_filter_by_agent():
    """Filter facts by agent type."""
    facts = [
        {"fact": "A", "category": "pattern", "repo": "global", "agent": "mimo-sprint", "confidence": 0.8},
        {"fact": "B", "category": "pattern", "repo": "global", "agent": "groq-fast", "confidence": 0.7},
        {"fact": "C", "category": "fact", "repo": "global", "confidence": 0.9},
    ]
    filtered = filter_facts(facts, agent="mimo-sprint", include_global=True)
    texts = [f["fact"] for f in filtered]
    assert "A" in texts
    assert "B" not in texts
    assert "C" in texts  # global, no agent restriction
    print("PASS: filter_by_agent")


def test_no_global_flag():
    """Excluding global facts works."""
    facts = [
        {"fact": "A", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
        {"fact": "B", "category": "fact", "repo": "global", "confidence": 0.8},
    ]
    filtered = filter_facts(facts, repo="the-nexus", include_global=False)
    texts = [f["fact"] for f in filtered]
    assert "A" in texts
    assert "B" not in texts
    print("PASS: no_global_flag")


def test_sort_by_confidence():
    """Facts sort by confidence descending."""
    facts = [
        {"fact": "low", "category": "fact", "repo": "global", "confidence": 0.3},
        {"fact": "high", "category": "fact", "repo": "global", "confidence": 0.95},
        {"fact": "mid", "category": "fact", "repo": "global", "confidence": 0.7},
    ]
    sorted_f = sort_facts(facts)
    assert sorted_f[0]["fact"] == "high"
    assert sorted_f[1]["fact"] == "mid"
    assert sorted_f[2]["fact"] == "low"
    print("PASS: sort_by_confidence")


def test_sort_pitfalls_first():
    """Pitfalls sort before facts at same confidence."""
    facts = [
        {"fact": "regular fact", "category": "fact", "repo": "global", "confidence": 0.8},
        {"fact": "danger pitfall", "category": "pitfall", "repo": "global", "confidence": 0.8},
    ]
    sorted_f = sort_facts(facts)
    assert sorted_f[0]["category"] == "pitfall"
    print("PASS: sort_pitfalls_first")


def test_truncate_to_tokens():
    """Truncation cuts at line boundary."""
    text = "line1\nline2\nline3\nline4\nline5\n"
    truncated = truncate_to_tokens(text, max_tokens=2)  # ~8 chars
    assert "line1" in truncated
    assert "truncated" in truncated.lower()
    print("PASS: truncate_to_tokens")


def test_estimate_tokens():
    """Token estimation is reasonable."""
    text = "a" * 400
    tokens = estimate_tokens(text)
    assert 90 <= tokens <= 110  # ~100 tokens
    print("PASS: estimate_tokens")


def test_build_full_context():
    """Full context with facts renders correctly."""
    facts = [
        {"fact": "API merges fail with 405", "category": "pitfall", "repo": "the-nexus", "confidence": 0.95},
        {"fact": "Has 50+ open PRs", "category": "fact", "repo": "the-nexus", "confidence": 0.9},
        {"fact": "Token at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9},
        {"fact": "Check git remote -v first", "category": "pattern", "repo": "global", "confidence": 0.8},
    ]

    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        index_path = make_index(facts, tmp_dir)

        # Create knowledge dirs
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        context = build_bootstrap_context(
            repo="the-nexus",
            agent="mimo-sprint",
            include_global=True,
            index_path=index_path,
        )

        assert "What You Know" in context
        assert "PITFALLS" in context
        assert "API merges fail with 405" in context
        assert "the-nexus" in context
        assert "Token at" in context  # global fact included
    print("PASS: build_full_context")


def test_max_tokens_respected():
    """Output respects max_tokens limit."""
    # Generate lots of facts
    facts = [
        {"fact": f"Fact number {i} with some detail about things", "category": "fact", "repo": "global", "confidence": 0.8}
        for i in range(100)
    ]

    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        index_path = make_index(facts, tmp_dir)
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        context = build_bootstrap_context(
            repo=None,
            max_tokens=500,
            index_path=index_path,
        )

        actual_tokens = estimate_tokens(context)
        # Allow 10% overshoot since we cut at line boundaries
        assert actual_tokens <= 550, f"Expected ~500 tokens, got {actual_tokens}"
    print(f"PASS: max_tokens_respected (got {actual_tokens} tokens)")


def test_missing_index_graceful():
    """Missing index.json doesn't crash."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        # Don't create index.json
        for sub in ["repos", "agents", "global"]:
            (tmp_dir / sub).mkdir(exist_ok=True)

        fake_index = tmp_dir / "nonexistent.json"
        context = build_bootstrap_context(repo="anything", index_path=fake_index)
        assert "No relevant knowledge found" in context
    print("PASS: missing_index_graceful")


if __name__ == "__main__":
    tests = [
        test_empty_index,
        test_filter_by_repo,
        test_filter_by_agent,
        test_no_global_flag,
        test_sort_by_confidence,
        test_sort_pitfalls_first,
        test_truncate_to_tokens,
        test_estimate_tokens,
        test_build_full_context,
        test_max_tokens_respected,
        test_missing_index_graceful,
    ]

    passed = 0
    failed = 0
    for test in tests:
        try:
            test()
            passed += 1
        except Exception as e:
            print(f"FAIL: {test.__name__} — {e}")
            failed += 1

    print(f"\n{passed} passed, {failed} failed")
    sys.exit(0 if failed == 0 else 1)
@@ -1,189 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/diff_analyzer.py — 10 tests."""

import sys
import os
sys.path.insert(0, os.path.dirname(__file__) or ".")

import importlib.util
spec = importlib.util.spec_from_file_location("da", os.path.join(os.path.dirname(__file__) or ".", "diff_analyzer.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
DiffAnalyzer = mod.DiffAnalyzer
ChangeCategory = mod.ChangeCategory


SAMPLE_ADD = """diff --git a/new.py b/new.py
new file mode 100644
--- /dev/null
+++ b/new.py
@@ -0,0 +1,3 @@
+def hello():
+    print("world")
+    return True
"""

SAMPLE_DELETE = """diff --git a/old.py b/old.py
deleted file mode 100644
--- a/old.py
+++ /dev/null
@@ -1,2 +0,0 @@
-def goodbye():
-    pass
"""

SAMPLE_MODIFY = """diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,3 +1,4 @@
 def main():
-    print("old")
+    print("new")
+    print("extra")
     return 0
"""

SAMPLE_RENAME = """diff --git a/old_name.py b/new_name.py
rename from old_name.py
rename to new_name.py
--- a/old_name.py
+++ b/new_name.py
@@ -1,1 +1,1 @@
-old content
+new content
"""

SAMPLE_MULTI = """diff --git a/a.py b/a.py
--- a/a.py
+++ b/a.py
@@ -1,1 +1,2 @@
 existing
+added line
diff --git a/b.py b/b.py
new file mode 100644
--- /dev/null
+++ b/b.py
@@ -0,0 +1,1 @@
+new file
"""

SAMPLE_BINARY = """diff --git a/img.png b/img.png
Binary files a/img.png and b/img.png differ
"""


def test_empty():
    a = DiffAnalyzer()
    s = a.analyze("")
    assert s.total_files_changed == 0
    print("PASS: test_empty")


def test_addition():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_ADD)
    assert s.total_files_changed == 1
    assert s.total_added == 3
    assert s.total_deleted == 0
    assert s.new_files == 1
    assert s.files[0].hunks[0].category == ChangeCategory.ADDED
    print("PASS: test_addition")


def test_deletion():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_DELETE)
    assert s.total_deleted == 2
    assert s.deleted_files == 1
    assert s.files[0].hunks[0].category == ChangeCategory.DELETED
    print("PASS: test_deletion")


def test_modification():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_MODIFY)
    assert s.total_added == 2
    assert s.total_deleted == 1
    assert s.files[0].hunks[0].category == ChangeCategory.MODIFIED
    print("PASS: test_modification")


def test_rename():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_RENAME)
    assert s.renamed_files == 1
    assert s.files[0].old_path == "old_name.py"
    assert s.files[0].path == "new_name.py"
    assert s.files[0].is_renamed
    print("PASS: test_rename")


def test_multiple_files():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_MULTI)
    assert s.total_files_changed == 2
    assert s.new_files == 1
    print("PASS: test_multiple_files")


def test_binary():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_BINARY)
    assert s.binary_files == 1
    assert s.files[0].is_binary
    assert len(s.files[0].hunks) == 0
    print("PASS: test_binary")


def test_to_dict():
    a = DiffAnalyzer()
    s = a.analyze(SAMPLE_MODIFY)
    d = s.to_dict()
    assert "total_files_changed" in d
    assert "files" in d
    assert isinstance(d["files"], list)
    print("PASS: test_to_dict")


def test_context_only():
    diff = """diff --git a/f.py b/f.py
--- a/f.py
+++ b/f.py
@@ -1,3 +1,3 @@
 line1
-old
+new
 line3
"""
    a = DiffAnalyzer()
    s = a.analyze(diff)
    # Has both added and deleted = MODIFIED
    assert s.files[0].hunks[0].category == ChangeCategory.MODIFIED
    print("PASS: test_context_only")


def test_multi_hunk():
    diff = """diff --git a/f.py b/f.py
--- a/f.py
+++ b/f.py
@@ -1,1 +1,2 @@
 existing
+first addition
@@ -10,1 +11,2 @@
 more
+second addition
"""
    a = DiffAnalyzer()
    s = a.analyze(diff)
    assert s.total_hunks == 2
    assert s.total_added == 2
    print("PASS: test_multi_hunk")


def run_all():
    test_empty()
    test_addition()
    test_deletion()
    test_modification()
    test_rename()
    test_multiple_files()
    test_binary()
    test_to_dict()
    test_context_only()
    test_multi_hunk()
    print("\nAll 10 tests passed!")


if __name__ == "__main__":
    run_all()
@@ -1,109 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/gitea_issue_parser.py"""

import sys
import os
sys.path.insert(0, os.path.dirname(__file__) or ".")

# Import from sibling
import importlib.util
spec = importlib.util.spec_from_file_location("parser", os.path.join(os.path.dirname(__file__) or ".", "gitea_issue_parser.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
parse_issue_body = mod.parse_issue_body


def test_basic_parsing():
    body = """## Context

This is the background info.

## Acceptance Criteria

- [ ] First criterion
- [x] Second criterion (done)

## What to build

Some description.
"""
    result = parse_issue_body(body, title="Test (#42)", labels=["bug"])
    assert result["title"] == "Test (#42)"
    assert result["labels"] == ["bug"]
    assert result["epic_ref"] == 42
    assert len(result["criteria"]) == 2
    assert result["criteria"][0]["text"] == "First criterion"
    assert result["criteria"][0]["checked"] is False
    assert result["criteria"][1]["checked"] is True
    assert "context" in result["sections"]
    print("PASS: test_basic_parsing")


def test_numbered_criteria():
    body = """## Acceptance Criteria

1. First item
2. Second item
3. Third item
"""
    result = parse_issue_body(body)
    assert len(result["criteria"]) == 3
    assert result["criteria"][0]["text"] == "First item"
    print("PASS: test_numbered_criteria")


def test_epic_ref_from_body():
    body = "Closes #123\n\nSome description."
    result = parse_issue_body(body)
    assert result["epic_ref"] == 123
    print("PASS: test_epic_ref_from_body")


def test_empty_body():
    result = parse_issue_body("")
    assert result["criteria"] == []
    assert result["context"] == ""
    assert result["sections"] == {}
    print("PASS: test_empty_body")


def test_no_sections():
    body = "Just a plain issue body with no headings."
    result = parse_issue_body(body)
    assert result["context"] == "Just a plain issue body with no headings."
    print("PASS: test_no_sections")


def test_multiple_sections():
    body = """## Problem

Something is broken.

## Fix

Do this instead.

## Notes

Additional info.
"""
    result = parse_issue_body(body)
    assert "problem" in result["sections"]
    assert "fix" in result["sections"]
    assert "notes" in result["sections"]
    assert "Something is broken" in result["sections"]["problem"]
    print("PASS: test_multiple_sections")


def run_all():
    test_basic_parsing()
    test_numbered_criteria()
    test_epic_ref_from_body()
    test_empty_body()
    test_no_sections()
    test_multiple_sections()
    print("\nAll 6 tests passed!")


if __name__ == "__main__":
    run_all()
@@ -1,129 +1,41 @@
|
||||
#!/usr/bin/env python3
|
||||
"""
|
||||
Test harness for knowledge extraction prompt.
|
||||
Validates output structure, content quality, and hallucination resistance.
|
||||
|
||||
Usage:
|
||||
python3 scripts/test_harvest_prompt.py # Run all tests
|
||||
python3 scripts/test_harvest_prompt.py --transcript FILE # Test against a real transcript
|
||||
python3 scripts/test_harvest_prompt.py --validate FILE # Validate an existing extraction JSON
|
||||
Test script for knowledge extraction prompt.
|
||||
Validates that the prompt produces consistent, structured output.
|
||||
"""
|
||||
|
||||
import json
|
||||
import sys
|
||||
import argparse
|
||||
from pathlib import Path
|
||||
|
||||
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
|
||||
REQUIRED_FIELDS = {"fact", "category", "repo", "confidence", "evidence"}
REQUIRED_META = {"session_outcome", "tools_used", "repos_touched", "error_count", "knowledge_count"}
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}


def validate_knowledge_item(item, idx):
    """Validate a single knowledge item. Returns list of errors."""
    errors = []
    if not isinstance(item, dict):
        return [f"Item {idx}: not a dict"]
    for field in REQUIRED_FIELDS:
        if field not in item:
            errors.append(f"Item {idx}: missing field '{field}'")
    if not isinstance(item.get("fact", ""), str) or len(item.get("fact", "").strip()) == 0:
        errors.append(f"Item {idx}: fact must be a non-empty string")
    if item.get("category") not in VALID_CATEGORIES:
        errors.append(f"Item {idx}: invalid category '{item.get('category')}'")
    if not isinstance(item.get("repo", ""), str) or len(item.get("repo", "").strip()) == 0:
        errors.append(f"Item {idx}: repo must be a non-empty string")
    conf = item.get("confidence")
    if not isinstance(conf, (int, float)) or not (0.0 <= conf <= 1.0):
        errors.append(f"Item {idx}: confidence must be a number 0.0-1.0, got {conf}")
    if not isinstance(item.get("evidence", ""), str) or len(item.get("evidence", "").strip()) == 0:
        errors.append(f"Item {idx}: evidence must be a non-empty string (hallucination check)")
    return errors
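The per-item schema check can be exercised in isolation. A minimal sketch, assuming the field and category sets shown in the validator; `validate_item` is a trimmed, hypothetical stand-in for `validate_knowledge_item`:

```python
# Trimmed stand-in for validate_knowledge_item: one error per missing
# required field, plus one for an unknown category.
REQUIRED_FIELDS = {"fact", "category", "repo", "confidence", "evidence"}
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_item(item, idx):
    errors = []
    for field in REQUIRED_FIELDS:
        if field not in item:
            errors.append(f"Item {idx}: missing field '{field}'")
    if item.get("category") not in VALID_CATEGORIES:
        errors.append(f"Item {idx}: invalid category '{item.get('category')}'")
    return errors

# "discovery" is not a valid category, and repo/confidence/evidence are
# missing, so this item yields four errors.
print(len(validate_item({"fact": "x", "category": "discovery"}, 0)))  # 4
```

A fully populated item with a valid category produces an empty error list, which is the pass condition the test suite checks for.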


def validate_extraction(data):
    """Validate a full extraction result. Returns (is_valid, errors, warnings)."""
    errors = []
    warnings = []

    if not isinstance(data, dict):
        return False, ["Root is not a JSON object"], []

    if "knowledge" not in data:
        return False, ["Missing 'knowledge' array"], []

    if not isinstance(data["knowledge"], list):
        return False, ["'knowledge' is not an array"], []

    for i, item in enumerate(data["knowledge"]):
        errors.extend(validate_knowledge_item(item, i))

    # Meta block validation
    if "meta" not in data:
        warnings.append("Missing 'meta' block (session_outcome, tools_used, etc.)")
    else:
        meta = data["meta"]
        for field in REQUIRED_META:
            if field not in meta:
                warnings.append(f"Meta missing field '{field}'")

    # Quality checks
    facts = data["knowledge"]
    if len(facts) == 0:
        warnings.append("No knowledge extracted (empty session or extraction failure)")

    # Check for near-duplicate facts
    seen_facts = set()
    for item in facts:
        normalized = item.get("fact", "").lower().strip()[:80]
        if normalized in seen_facts:
            warnings.append(f"Duplicate fact detected: '{normalized[:50]}...'")
        seen_facts.add(normalized)

    # Check confidence distribution
    confidences = [item.get("confidence", 0) for item in facts]
    if confidences:
        avg_conf = sum(confidences) / len(confidences)
        if avg_conf > 0.9:
            warnings.append(f"Average confidence {avg_conf:.2f} is suspiciously high (possible hallucination)")
        if avg_conf < 0.4:
            warnings.append(f"Average confidence {avg_conf:.2f} is very low (extraction may be too cautious)")

    return len(errors) == 0, errors, warnings
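The confidence-distribution heuristic can be sketched standalone. The thresholds (above 0.9 suspicious, below 0.4 too cautious) mirror the values in the validator; `confidence_warnings` is a hypothetical helper name:

```python
# Standalone sketch of the average-confidence heuristic: flag extractions
# whose mean confidence is implausibly high or implausibly low.
def confidence_warnings(facts):
    warnings = []
    confidences = [item.get("confidence", 0) for item in facts]
    if confidences:
        avg = sum(confidences) / len(confidences)
        if avg > 0.9:
            warnings.append(f"Average confidence {avg:.2f} is suspiciously high")
        if avg < 0.4:
            warnings.append(f"Average confidence {avg:.2f} is very low")
    return warnings

# Mean of 0.96 crosses the upper threshold, so one warning is emitted.
print(len(confidence_warnings([{"confidence": 0.95}, {"confidence": 0.97}])))  # 1
```

A mid-range distribution (say 0.5 to 0.6) produces no warnings, which is why the well-formed fixtures in the test suite pass cleanly.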


def validate_transcript_coverage(data, transcript):
    """
    Check that extracted facts are actually supported by the transcript.
    This is a heuristic — checks that key terms from facts appear in transcript.
    Returns list of potential hallucinations.
    """
    hallucinations = []
    transcript_lower = transcript.lower()

    for item in data.get("knowledge", []):
        fact = item.get("fact", "")
        evidence = item.get("evidence", "")

        # Check if evidence string appears in transcript
        if evidence and evidence.lower() not in transcript_lower:
            # Partial match — check if key terms are present
            evidence_words = set(evidence.lower().split())
            transcript_words = set(transcript_lower.split())
            overlap = evidence_words & transcript_words
            if len(overlap) < min(3, len(evidence_words) * 0.3):
                hallucinations.append({
                    "fact": fact[:80],
                    "evidence": evidence[:80],
                    "issue": "Evidence not found in transcript — possible hallucination"
                })

    return hallucinations
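The evidence-coverage heuristic reduces to a two-stage test: exact substring match first, then a word-overlap fallback with the same min(3, 30%) threshold. A minimal sketch; `evidence_supported` is a hypothetical helper name:

```python
# Two-stage evidence check: exact substring match, then word-overlap
# fallback using the same threshold as validate_transcript_coverage.
def evidence_supported(evidence, transcript):
    transcript_lower = transcript.lower()
    if evidence.lower() in transcript_lower:
        return True
    words = set(evidence.lower().split())
    overlap = words & set(transcript_lower.split())
    return len(overlap) >= min(3, len(words) * 0.3)

print(evidence_supported("Error: 401 Unauthorized",
                         "git clone... Error: 401 Unauthorized"))  # True
print(evidence_supported("PostgreSQL listening on port 5433",
                         "git clone succeeded"))                   # False
```

The second call is the hallucination case the test suite exercises: none of the evidence words occur in the transcript, so the overlap falls below the threshold.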


# --- Test transcripts ---

SAMPLE_TRANSCRIPT_SUCCESS = """
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
Agent: Trying git clone... Error: 401 Unauthorized
Agent: Checking for token... Found in ~/.config/gitea/token
Agent: Clone successful with token authentication
Agent: Now examining the repo structure...
Agent: Found knowledge/index.json with 0 facts
Agent: Creating harvest-prompt.md in templates/
Agent: Done. Files created: templates/harvest-prompt.md, scripts/test_harvest_prompt.py
"""

SAMPLE_TRANSCRIPT_FAILURE = """
User: Deploy the cron jobs to production
Agent: Connecting to VPS... SSH timeout after 30s
Agent: Retrying with verbose output... Connection refused on port 22
Agent: Checking firewall rules... iptables shows port 22 is open
Agent: Checking sshd config... sshd is not running
Agent: Attempting to start sshd... systemctl start sshd failed: unit not found
Agent: ERROR: OpenSSH server not installed on this VPS
Agent: Cannot proceed without SSH access. Deployment blocked.
"""

SAMPLE_TRANSCRIPT_API_QUIRK = """
User: List all open issues in hermes-agent
Agent: Querying Gitea API... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?state=open&limit=50
Agent: Response: 0 items. That's odd.
Agent: Trying without state filter... endpoint: /repos/Timmy_Foundation/hermes-agent/issues?limit=50
Agent: Response: 50 items returned, but all have pull_request field. These are PRs, not issues.
Agent: Gitea issues API returns PRs alongside issues. Must filter by presence of pull_request key.
Agent: Also discovered: state=open parameter returns empty when used with limit=50, but works with limit=5.
Agent: This appears to be a Gitea API quirk — pagination interacts with state filter unexpectedly.
Agent: Workaround: fetch without state filter, then filter client-side for state=='open' and 'pull_request' not in item.
"""

def run_tests():
    """Run the built-in test suite."""
    tests_passed = 0
    tests_failed = 0

    print("=" * 60)
    print("KNOWLEDGE EXTRACTION PROMPT — TEST SUITE")
    print("=" * 60)

    # Test 1: Prompt file exists and is under 2k tokens (~8k chars)
    print("\n[Test 1] Prompt file size constraint")
    prompt_path = Path("templates/harvest-prompt.md")
    if not prompt_path.exists():
        print("  FAIL: harvest-prompt.md not found")
        tests_failed += 1
    else:
        size = prompt_path.stat().st_size
        # Rough token estimate: ~4 chars per token
        est_tokens = size / 4
        print(f"  Prompt size: {size} bytes (~{est_tokens:.0f} tokens)")
        if est_tokens > 2000:
            print("  WARN: Prompt exceeds ~2000 tokens (target: ~1000)")
        else:
            print("  PASS: Within token budget")
        tests_passed += 1

    # Test 2: Validate a well-formed extraction
    print("\n[Test 2] Valid extraction passes validation")
    valid_extraction = {
        "knowledge": [
            {
                "fact": "Gitea auth token is at ~/.config/gitea/token",
                "category": "tool-quirk",
                "repo": "global",
                "confidence": 0.9,
                "evidence": "Found in ~/.config/gitea/token"
            },
            {
                "fact": "Clone fails with 401 when no token is provided",
                "category": "pitfall",
                "repo": "compounding-intelligence",
                "confidence": 0.9,
                "evidence": "Error: 401 Unauthorized"
            }
        ],
        "meta": {
            "session_outcome": "success",
            "tools_used": ["git"],
            "repos_touched": ["compounding-intelligence"],
            "error_count": 1,
            "knowledge_count": 2
        }
    }
    is_valid, errors, warnings = validate_extraction(valid_extraction)
    if is_valid:
        print(f"  PASS: Valid extraction accepted ({len(warnings)} warnings)")
        tests_passed += 1
    else:
        print(f"  FAIL: Valid extraction rejected: {errors}")
        tests_failed += 1

    # Test 3: Reject missing fields
    print("\n[Test 3] Missing fields are rejected")
    bad_extraction = {
        "knowledge": [
            {"fact": "Something learned", "category": "fact"}  # Missing repo, confidence, evidence
        ]
    }
    is_valid, errors, warnings = validate_extraction(bad_extraction)
    if not is_valid:
        print(f"  PASS: Rejected with {len(errors)} errors")
        tests_passed += 1
    else:
        print("  FAIL: Should have rejected missing fields")
        tests_failed += 1

    # Test 4: Reject invalid category
    print("\n[Test 4] Invalid category is rejected")
    bad_cat = {
        "knowledge": [
            {"fact": "Test", "category": "discovery", "repo": "x", "confidence": 0.8, "evidence": "test"}
        ]
    }
    is_valid, errors, warnings = validate_extraction(bad_cat)
    if not is_valid and any("category" in e for e in errors):
        print("  PASS: Invalid category 'discovery' rejected")
        tests_passed += 1
    else:
        print("  FAIL: Should have rejected invalid category")
        tests_failed += 1

    # Test 5: Detect near-duplicates
    print("\n[Test 5] Duplicate detection")
    dup_extraction = {
        "knowledge": [
            {"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "a"},
            {"fact": "Token is at ~/.config/gitea/token", "category": "fact", "repo": "x", "confidence": 0.9, "evidence": "b"}
        ],
        "meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 2}
    }
    is_valid, errors, warnings = validate_extraction(dup_extraction)
    if any("Duplicate" in w for w in warnings):
        print("  PASS: Duplicate detected")
        tests_passed += 1
    else:
        print("  FAIL: Should have detected duplicate")
        tests_failed += 1

    # Test 6: Hallucination check against transcript
    print("\n[Test 6] Hallucination detection")
    hallucinated = {
        "knowledge": [
            {
                "fact": "Database port is 5433",
                "category": "fact",
                "repo": "x",
                "confidence": 0.9,
                "evidence": "PostgreSQL listening on port 5433"
            }
        ],
        "meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 1}
    }
    hallucinations = validate_transcript_coverage(hallucinated, SAMPLE_TRANSCRIPT_SUCCESS)
    if hallucinations:
        print(f"  PASS: Hallucination detected ({len(hallucinations)} items)")
        tests_passed += 1
    else:
        print("  FAIL: Should have detected hallucinated evidence")
        tests_failed += 1

    # Test 7: Failed session should extract pitfalls
    print("\n[Test 7] Failed session extraction shape")
    failed_extraction = {
        "knowledge": [
            {
                "fact": "SSH server not installed on target VPS",
                "category": "pitfall",
                "repo": "global",
                "confidence": 0.9,
                "evidence": "ERROR: OpenSSH server not installed on this VPS"
            },
            {
                "fact": "VPS blocks deployment without SSH access",
                "category": "question",
                "repo": "global",
                "confidence": 0.7,
                "evidence": "Cannot proceed without SSH access. Deployment blocked."
            }
        ],
        "meta": {
            "session_outcome": "failed",
            "tools_used": ["ssh", "systemctl"],
            "repos_touched": [],
            "error_count": 3,
            "knowledge_count": 2
        }
    }
    is_valid, errors, warnings = validate_extraction(failed_extraction)
    if is_valid:
        categories = [item["category"] for item in failed_extraction["knowledge"]]
        if "pitfall" in categories:
            print(f"  PASS: Failed session extracted {len(categories)} items including pitfalls")
            tests_passed += 1
        else:
            print("  FAIL: Failed session should extract pitfalls")
            tests_failed += 1
    else:
        print(f"  FAIL: {errors}")
        tests_failed += 1

    # Test 8: Empty extraction is warned
    print("\n[Test 8] Empty extraction warning")
    empty = {"knowledge": [], "meta": {"session_outcome": "success", "tools_used": [], "repos_touched": [], "error_count": 0, "knowledge_count": 0}}
    is_valid, errors, warnings = validate_extraction(empty)
    if any("No knowledge" in w for w in warnings):
        print("  PASS: Empty extraction warned")
        tests_passed += 1
    else:
        print("  FAIL: Should warn on empty extraction")
        tests_failed += 1

    # Summary
    print(f"\n{'=' * 60}")
    print(f"Results: {tests_passed} passed, {tests_failed} failed")
    print(f"{'=' * 60}")
    return tests_failed == 0


def validate_file(filepath):
    """Validate an existing extraction JSON file."""
    path = Path(filepath)
    if not path.exists():
        print(f"ERROR: {filepath} not found")
        return False

    data = json.loads(path.read_text())
    is_valid, errors, warnings = validate_extraction(data)

    print(f"Validation of {filepath}:")
    print(f"  Knowledge items: {len(data.get('knowledge', []))}")
    print(f"  Errors: {len(errors)}")
    print(f"  Warnings: {len(warnings)}")

    for e in errors:
        print(f"  ERROR: {e}")
    for w in warnings:
        print(f"  WARN: {w}")

    return is_valid

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Test knowledge extraction prompt")
    parser.add_argument("--validate", help="Validate an existing extraction JSON file")
    parser.add_argument("--transcript", help="Test against a real transcript file (informational)")
    args = parser.parse_args()

    if args.validate:
        success = validate_file(args.validate)
    else:
        success = run_tests()
    sys.exit(0 if success else 1)
@@ -1,72 +0,0 @@
#!/usr/bin/env python3
"""Comprehensive tests for knowledge extraction prompt."""
import json
import re
from pathlib import Path


def check_prompt_structure():
    p = Path("templates/harvest-prompt.md")
    if not p.exists():
        return False, "harvest-prompt.md not found"
    c = p.read_text()
    for s in ["System Prompt", "Instructions", "Categories", "Output Format", "Confidence Scoring", "Constraints", "Example"]:
        if s.lower() not in c.lower():
            return False, f"Missing section: {s}"
    for cat in ["fact", "pitfall", "pattern", "tool-quirk", "question"]:
        if cat not in c:
            return False, f"Missing category: {cat}"
    if len(c) > 5000:
        return False, f"Too large: {len(c)}"
    if len(c) < 1000:
        return False, f"Too small: {len(c)}"
    return True, "Prompt structure is valid"


def check_confidence_scoring():
    c = Path("templates/harvest-prompt.md").read_text()
    for level in ["0.9-1.0", "0.7-0.8", "0.5-0.6", "0.3-0.4", "0.1-0.2"]:
        if level not in c:
            return False, f"Missing level: {level}"
    return True, "Confidence scoring defined"


def check_example_quality():
    c = Path("templates/harvest-prompt.md").read_text()
    if "example" not in c.lower():
        return False, "No examples"
    m = re.search(r'"knowledge"', c[c.lower().find("example"):])
    if not m:
        return False, "No JSON example"
    return True, "Examples present"


def check_constraint_coverage():
    c = Path("templates/harvest-prompt.md").read_text()
    for x in ["no hallucination", "explicitly", "partial", "failed sessions"]:
        if x not in c.lower():
            return False, f"Missing: {x}"
    return True, "Constraints covered"


def check_test_sessions():
    d = Path("test_sessions")
    if not d.exists():
        return False, "test_sessions/ not found"
    files = list(d.glob("*.jsonl"))
    if len(files) < 5:
        return False, f"Only {len(files)} sessions"
    for f in files:
        for i, line in enumerate(f.read_text().strip().split("\n"), 1):
            try:
                json.loads(line)
            except json.JSONDecodeError as e:
                return False, f"{f.name}:{i}: {e}"
    return True, f"{len(files)} valid sessions"


def test_prompt_structure():
    passed, msg = check_prompt_structure()
    assert passed, msg


def test_confidence_scoring():
    passed, msg = check_confidence_scoring()
    assert passed, msg


def test_example_quality():
    passed, msg = check_example_quality()
    assert passed, msg


def test_constraint_coverage():
    passed, msg = check_constraint_coverage()
    assert passed, msg


def test_test_sessions():
    passed, msg = check_test_sessions()
    assert passed, msg


if __name__ == "__main__":
    checks = [check_prompt_structure, check_confidence_scoring, check_example_quality, check_constraint_coverage, check_test_sessions]
    for fn in checks:
        ok, msg = fn()
        print(f"{'PASS' if ok else 'FAIL'}: {fn.__name__} -- {msg}")
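The JSONL validation loop in `check_test_sessions` can be isolated into a reusable sketch: every non-empty line of a session file must parse as standalone JSON. `jsonl_errors` is a hypothetical helper name:

```python
import json

# Per-line JSONL check: each non-empty line must be valid JSON on its own,
# mirroring the loop in check_test_sessions.
def jsonl_errors(text):
    errors = []
    for i, line in enumerate(text.strip().split("\n"), 1):
        try:
            json.loads(line)
        except json.JSONDecodeError as e:
            errors.append(f"line {i}: {e}")
    return errors

print(jsonl_errors('{"role": "user"}\nnot json'))  # one error, for line 2
```

Collecting errors instead of returning on the first failure is a small variation; the original check short-circuits so a single corrupt line fails the whole session directory.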
@@ -1,162 +0,0 @@
#!/usr/bin/env python3
"""
Smoke test for harvester pipeline — verifies the full chain:
session_reader -> prompt -> LLM (mocked) -> validate -> deduplicate -> store

Does NOT call the real LLM. Tests plumbing only.
"""

import json
import sys
import tempfile
import os
from pathlib import Path

# Setup path
SCRIPT_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, str(SCRIPT_DIR))

from session_reader import read_session, extract_conversation, truncate_for_context, messages_to_text
from harvester import validate_fact, deduplicate, load_existing_knowledge, fact_fingerprint


def test_session_reader():
    """Test that session_reader parses JSONL correctly."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
        f.write('{"role": "user", "content": "Hello", "timestamp": "2026-04-13T10:00:00Z"}\n')
        f.write('{"role": "assistant", "content": "Hi there", "timestamp": "2026-04-13T10:00:01Z"}\n')
        f.write('{"role": "user", "content": "Clone the repo", "timestamp": "2026-04-13T10:00:02Z"}\n')
        f.write('{"role": "assistant", "content": "Cloned successfully", "timestamp": "2026-04-13T10:00:05Z"}\n')
        path = f.name

    messages = read_session(path)
    assert len(messages) == 4, f"Expected 4 messages, got {len(messages)}"

    conv = extract_conversation(messages)
    assert len(conv) == 4, f"Expected 4 conversation turns, got {len(conv)}"

    text = messages_to_text(conv)
    assert "USER: Hello" in text
    assert "ASSISTANT: Hi there" in text

    truncated = truncate_for_context(conv, head=2, tail=2)
    assert len(truncated) == 4  # 4 <= head+tail, so no truncation

    os.unlink(path)
    print("  [PASS] session_reader pipeline works")


def test_validate_fact():
    """Test fact validation."""
    good = {"fact": "Gitea token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}
    assert validate_fact(good), "Valid fact should pass"

    bad_missing = {"fact": "Something", "category": "fact"}
    assert not validate_fact(bad_missing), "Missing fields should fail"

    bad_category = {"fact": "Something", "category": "nonsense", "repo": "x", "confidence": 0.5}
    assert not validate_fact(bad_category), "Bad category should fail"

    bad_conf = {"fact": "Something", "category": "fact", "repo": "x", "confidence": 1.5}
    assert not validate_fact(bad_conf), "Confidence > 1.0 should fail"

    print("  [PASS] fact validation works")


def test_deduplicate():
    """Test deduplication."""
    existing = [
        {"fact": "Token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9}
    ]
    new = [
        {"fact": "Token is at ~/.config/gitea/token", "category": "tool-quirk", "repo": "global", "confidence": 0.9},  # exact dup
        {"fact": "Deploy uses Ansible on port 22", "category": "pattern", "repo": "fleet", "confidence": 0.8},  # unique
    ]
    result = deduplicate(new, existing)
    assert len(result) == 1, f"Expected 1 unique, got {len(result)}"
    assert result[0]["fact"] == "Deploy uses Ansible on port 22"
    print("  [PASS] deduplication works")


def test_knowledge_store_roundtrip():
    """Test loading and writing knowledge index."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Load empty index
        index = load_existing_knowledge(tmpdir)
        assert index["total_facts"] == 0

        # Write a fact
        new_facts = [{"fact": "Test fact", "category": "fact", "repo": "test", "confidence": 0.9}]

        # Use harvester's write function
        from harvester import write_knowledge
        write_knowledge(index, new_facts, tmpdir, source_session="test.jsonl")

        # Reload and verify
        index2 = load_existing_knowledge(tmpdir)
        assert index2["total_facts"] == 1
        assert index2["facts"][0]["fact"] == "Test fact"
        assert index2["facts"][0]["source_session"] == "test.jsonl"

        # Check markdown was written
        md_path = Path(tmpdir) / "repos" / "test.md"
        assert md_path.exists(), "Markdown file should be created"

    print("  [PASS] knowledge store roundtrip works")


def test_full_chain_no_llm():
    """Test the full pipeline minus the LLM call."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
        f.write('{"role": "user", "content": "Clone compounding-intelligence", "timestamp": "2026-04-13T10:00:00Z"}\n')
        f.write('{"role": "assistant", "content": "Cloned successfully", "timestamp": "2026-04-13T10:00:05Z"}\n')
        session_path = f.name

    with tempfile.TemporaryDirectory() as knowledge_dir:
        # Step 1: Read
        messages = read_session(session_path)
        assert len(messages) == 2

        # Step 2: Extract conversation
        conv = extract_conversation(messages)
        assert len(conv) == 2

        # Step 3: Truncate
        truncated = truncate_for_context(conv, head=50, tail=50)

        # Step 4: Convert to text (this goes to the LLM)
        transcript = messages_to_text(truncated)
        assert "Clone compounding-intelligence" in transcript

        # Step 5-7: Would be LLM call, validate, deduplicate
        # We simulate LLM output here
        mock_facts = [
            {"fact": "compounding-intelligence repo was cloned", "category": "fact", "repo": "compounding-intelligence", "confidence": 0.9}
        ]
        valid = [f for f in mock_facts if validate_fact(f)]

        # Step 6: Deduplicate
        index = load_existing_knowledge(knowledge_dir)
        new_facts = deduplicate(valid, index.get("facts", []))
        assert len(new_facts) == 1

        # Step 7: Store
        from harvester import write_knowledge
        write_knowledge(index, new_facts, knowledge_dir, source_session=session_path)

        # Verify
        index2 = load_existing_knowledge(knowledge_dir)
        assert index2["total_facts"] == 1

    os.unlink(session_path)
    print("  [PASS] full chain (reader -> validate -> dedup -> store) works")


if __name__ == "__main__":
    print("Running harvester pipeline smoke tests...")
    test_session_reader()
    test_validate_fact()
    test_deduplicate()
    test_knowledge_store_roundtrip()
    test_full_chain_no_llm()
    print("\nAll tests passed.")
@@ -1,282 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/improvement_proposals.py — 15 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util

spec = importlib.util.spec_from_file_location(
    "ip", os.path.join(os.path.dirname(__file__) or ".", "improvement_proposals.py")
)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

analyze_sessions = mod.analyze_sessions
generate_proposals = mod.generate_proposals
format_proposals_markdown = mod.format_proposals_markdown
format_proposals_json = mod.format_proposals_json
_normalize_error = mod._normalize_error


# ── Helper to build test sessions ─────────────────────────────

def _make_session(session_id, repo="test-repo", errors=None, tool_calls=None, duration=30):
    return {
        "session_id": session_id,
        "repo": repo,
        "errors": [{"message": e} for e in (errors or [])],
        "tool_calls": tool_calls or [],
        "duration_minutes": duration,
        "messages": [],
    }


def _make_tool_calls(repeats):
    """Create tool call list with repeated tools."""
    calls = []
    for tool, count in repeats:
        for _ in range(count):
            calls.append({"tool": tool, "latency_ms": 100})
    return calls


# ── Tests ─────────────────────────────────────────────────────

def test_empty_sessions():
    patterns = analyze_sessions([])
    assert patterns == []
    print("PASS: test_empty_sessions")


def test_no_patterns_on_clean_sessions():
    sessions = [
        _make_session("s1", tool_calls=[{"tool": "read_file", "latency_ms": 50}]),
        _make_session("s2", tool_calls=[{"tool": "write_file", "latency_ms": 80}]),
    ]
    patterns = analyze_sessions(sessions)
    # No repeated errors, no slow tools, no retries
    assert len(patterns) == 0
    print("PASS: test_no_patterns_on_clean_sessions")


def test_repeated_error_detection():
    """Same error across 3+ sessions triggers pattern."""
    sessions = [
        _make_session(f"s{i}", errors=["ModuleNotFoundError: No module named bannerlord_trace"])
        for i in range(4)
    ]
    patterns = analyze_sessions(sessions)
    repeated = [p for p in patterns if p.pattern_type == "repeated_error"]
    assert len(repeated) == 1
    assert repeated[0].occurrences == 4
    print("PASS: test_repeated_error_detection")


def test_repeated_error_threshold():
    """2 occurrences should NOT trigger (threshold is 3)."""
    sessions = [
        _make_session("s1", errors=["TimeoutError: connection timed out"]),
        _make_session("s2", errors=["TimeoutError: connection timed out"]),
    ]
    patterns = analyze_sessions(sessions)
    repeated = [p for p in patterns if p.pattern_type == "repeated_error"]
    assert len(repeated) == 0
    print("PASS: test_repeated_error_threshold")


def test_slow_tool_detection():
    """Tool with avg latency > 5000ms across 5+ calls."""
    calls = [{"tool": "git_push", "latency_ms": 8000} for _ in range(10)]
    sessions = [_make_session("s1", tool_calls=calls)]
    patterns = analyze_sessions(sessions)
    slow = [p for p in patterns if p.pattern_type == "slow_tool"]
    assert len(slow) == 1
    assert "git_push" in slow[0].description
    print("PASS: test_slow_tool_detection")


def test_fast_tool_not_flagged():
    """Tool under 5000ms avg should not trigger."""
    calls = [{"tool": "read_file", "latency_ms": 50} for _ in range(10)]
    sessions = [_make_session("s1", tool_calls=calls)]
    patterns = analyze_sessions(sessions)
    slow = [p for p in patterns if p.pattern_type == "slow_tool"]
    assert len(slow) == 0
    print("PASS: test_fast_tool_not_flagged")


def test_failed_retry_detection():
    """3+ consecutive calls to same tool triggers retry pattern."""
    calls = _make_tool_calls([("execute_code", 5)])
    sessions = [_make_session("s1", tool_calls=calls)]
    sessions.extend([
        _make_session(f"s{i}", tool_calls=_make_tool_calls([("execute_code", 4)]))
        for i in range(2, 5)
    ])
    patterns = analyze_sessions(sessions)
    retries = [p for p in patterns if p.pattern_type == "failed_retry"]
    assert len(retries) >= 1
    print("PASS: test_failed_retry_detection")


def test_manual_process_detection():
    """10+ tool calls with <= 3 unique tools."""
    calls = _make_tool_calls([("terminal", 8), ("read_file", 5)])
    sessions = [_make_session("s1", tool_calls=calls, duration=25)]
    patterns = analyze_sessions(sessions)
    manual = [p for p in patterns if p.pattern_type == "manual_process"]
    assert len(manual) == 1
    print("PASS: test_manual_process_detection")


def test_generate_proposals_from_patterns():
    """Proposals generated from waste patterns."""
    sessions = [
        _make_session(f"s{i}", errors=["Error: push timeout"])
        for i in range(5)
    ]
    patterns = analyze_sessions(sessions)
    proposals = generate_proposals(patterns)
    assert len(proposals) >= 1
    assert proposals[0].estimated_monthly_hours_saved > 0
    assert proposals[0].priority in ("critical", "high", "medium", "low")
    print("PASS: test_generate_proposals_from_patterns")


def test_proposal_roi_positive():
    """ROI weeks should be a positive number for recoverable time."""
    patterns = [mod.WastePattern(
        pattern_type="repeated_error",
        description="Test error",
        occurrences=10,
        total_time_hours=5.0,
        affected_repos=["test"],
    )]
    proposals = generate_proposals(patterns)
    assert len(proposals) == 1
    assert proposals[0].roi_weeks > 0
    assert proposals[0].roi_weeks < 100
    print("PASS: test_proposal_roi_positive")


def test_proposals_sorted_by_impact():
    """Proposals should be sorted by monthly hours saved (descending)."""
    sessions = [
        _make_session("s1", errors=["Minor warning"] * 3, duration=5),
        _make_session("s2", errors=["Critical failure: deploy crashed"] * 5, duration=60),
    ]
    # Add more sessions to cross threshold
    for i in range(3, 7):
        sessions.append(_make_session(f"s{i}", errors=["Critical failure: deploy crashed"]))

    patterns = analyze_sessions(sessions)
    proposals = generate_proposals(patterns)
    if len(proposals) >= 2:
        for i in range(len(proposals) - 1):
            assert proposals[i].estimated_monthly_hours_saved >= proposals[i + 1].estimated_monthly_hours_saved
    print("PASS: test_proposals_sorted_by_impact")


def test_format_markdown():
    """Markdown output should contain expected sections."""
    patterns = [mod.WastePattern(
        pattern_type="repeated_error", description="Test", occurrences=5,
        total_time_hours=2.5, affected_repos=["repo"],
    )]
    proposals = generate_proposals(patterns)
    md = format_proposals_markdown(proposals, patterns, "2026-04-15T00:00:00Z")
    assert "# Improvement Proposals" in md
    assert "## Summary" in md
    assert "### Problem" in md
    assert "### ROI Estimate" in md
    assert "## Appendix" in md
    print("PASS: test_format_markdown")


def test_format_json():
    """JSON output should be valid and parseable."""
    patterns = [mod.WastePattern(
        pattern_type="slow_tool", description="Slow", occurrences=10,
        total_time_hours=3.0, affected_repos=["global"],
    )]
    proposals = generate_proposals(patterns)
    output = format_proposals_json(proposals)
    parsed = json.loads(output)
    assert isinstance(parsed, list)
    assert len(parsed) == len(proposals)
    assert "title" in parsed[0]
    assert "roi_weeks" in parsed[0]
    print("PASS: test_format_json")


def test_normalize_error():
    """Error normalization should remove paths and hashes."""
    err1 = _normalize_error("Failed to clone /Users/apayne/repo with token abc123def456")
    assert "/PATH" in err1
    assert "HASH" in err1
    assert "/Users/apayne" not in err1

    err2 = _normalize_error("")
    assert err2 == ""

    err3 = _normalize_error("Simple error message")
    assert "simple error" in err3
    print("PASS: test_normalize_error")


def test_cli_integration():
    """End-to-end test: write input JSON, run script, check output."""
    import subprocess

    sessions = [
        _make_session(f"s{i}", errors=["Connection refused: port 8080"])
        for i in range(5)
    ]

    with tempfile.TemporaryDirectory() as tmpdir:
        input_path = os.path.join(tmpdir, "analytics.json")
        output_path = os.path.join(tmpdir, "proposals.md")

        with open(input_path, "w") as f:
            json.dump({"sessions": sessions}, f)

        script = os.path.join(os.path.dirname(__file__) or ".", "improvement_proposals.py")
|
||||
result = subprocess.run(
|
||||
[sys.executable, script, "--input", input_path, "--output", output_path],
|
||||
capture_output=True, text=True, timeout=10,
|
||||
)
|
||||
|
||||
assert result.returncode == 0, f"CLI failed: {result.stderr}"
|
||||
assert os.path.exists(output_path)
|
||||
|
||||
with open(output_path) as f:
|
||||
content = f.read()
|
||||
assert "# Improvement Proposals" in content
|
||||
print("PASS: test_cli_integration")
|
||||
|
||||
|
||||
def run_all():
|
||||
test_empty_sessions()
|
||||
test_no_patterns_on_clean_sessions()
|
||||
test_repeated_error_detection()
|
||||
test_repeated_error_threshold()
|
||||
test_slow_tool_detection()
|
||||
test_fast_tool_not_flagged()
|
||||
test_failed_retry_detection()
|
||||
test_manual_process_detection()
|
||||
test_generate_proposals_from_patterns()
|
||||
test_proposal_roi_positive()
|
||||
test_proposals_sorted_by_impact()
|
||||
test_format_markdown()
|
||||
test_format_json()
|
||||
test_normalize_error()
|
||||
test_cli_integration()
|
||||
print("\nAll 15 tests passed!")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
run_all()
|
||||
@@ -1,129 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/knowledge_staleness_check.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location("ks", os.path.join(os.path.dirname(__file__) or ".", "knowledge_staleness_check.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
check_staleness = mod.check_staleness
fix_hashes = mod.fix_hashes
compute_file_hash = mod.compute_file_hash


def test_fresh_entry():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("print('hello')")
        h = compute_file_hash(src)
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "hello", "source_file": "source.py", "source_hash": h}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "fresh"
    print("PASS: test_fresh_entry")


def test_stale_entry():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("original content")
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "old", "source_file": "source.py", "source_hash": "sha256:wrong"}]}, f)
        # Now change the source
        with open(src, "w") as f:
            f.write("modified content")
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "stale"
    print("PASS: test_stale_entry")


def test_missing_source():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "gone", "source_file": "nonexistent.py", "source_hash": "sha256:abc"}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "missing_source"
    print("PASS: test_missing_source")


def test_no_hash():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("content")
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "no hash", "source_file": "source.py"}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "no_hash"
        assert results[0]["current_hash"].startswith("sha256:")
    print("PASS: test_no_hash")


def test_no_source_field():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "orphan"}]}, f)
        results = check_staleness(idx, tmpdir)
        assert results[0]["status"] == "no_source"
    print("PASS: test_no_source_field")


def test_fix_hashes():
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "source.py")
        with open(src, "w") as f:
            f.write("content for hashing")
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [{"fact": "needs hash", "source_file": "source.py"}]}, f)
        fixed = fix_hashes(idx, tmpdir)
        assert fixed == 1
        # Verify hash was added
        with open(idx) as f:
            data = json.load(f)
        assert data["facts"][0]["source_hash"].startswith("sha256:")
    print("PASS: test_fix_hashes")


def test_empty_index():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": []}, f)
        results = check_staleness(idx, tmpdir)
        assert results == []
    print("PASS: test_empty_index")


def test_compute_hash_nonexistent():
    h = compute_file_hash("/nonexistent/path/file.py")
    assert h is None
    print("PASS: test_compute_hash_nonexistent")


def run_all():
    test_fresh_entry()
    test_stale_entry()
    test_missing_source()
    test_no_hash()
    test_no_source_field()
    test_fix_hashes()
    test_empty_index()
    test_compute_hash_nonexistent()
    print("\nAll 8 tests passed!")


if __name__ == "__main__":
    run_all()
@@ -1,305 +0,0 @@
#!/usr/bin/env python3
"""
Tests for Priority Rebalancer
"""

import json
import os
import sys
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

# Add script dir to path
sys.path.insert(0, str(Path(__file__).parent))

from priority_rebalancer import (
    GiteaClient,
    IssueScore,
    PipelineSignal,
    compute_issue_score,
    collect_knowledge_signals,
    collect_metrics_signals,
    extract_priority,
    generate_report,
    generate_markdown_report,
    PRIORITY_LEVELS,
)

# ============================================================
# Test Helpers
# ============================================================

PASS = 0
FAIL = 0

def test(name):
    def decorator(fn):
        global PASS, FAIL
        try:
            fn()
            PASS += 1
            print(f"  [PASS] {name}")
        except Exception as e:
            FAIL += 1
            print(f"  [FAIL] {name}: {e}")
    return decorator

def assert_eq(a, b, msg=""):
    if a != b:
        raise AssertionError(f"{msg} expected {b!r}, got {a!r}")

def assert_true(v, msg=""):
    if not v:
        raise AssertionError(msg or "Expected True")

def assert_false(v, msg=""):
    if v:
        raise AssertionError(msg or "Expected False")


# ============================================================
# Priority Extraction Tests
# ============================================================

print("=== Priority Rebalancer Tests ===\n")

print("-- Priority Extraction --")

@test("extract P0 from label")
def _():
    assert_eq(extract_priority(["P0", "bug"]), "P0")

@test("extract P1 from priority:high")
def _():
    assert_eq(extract_priority(["priority:high"]), "P1")

@test("extract P2 from priority:medium")
def _():
    assert_eq(extract_priority(["priority:medium"]), "P2")

@test("extract P3 from priority:low")
def _():
    assert_eq(extract_priority(["priority:low"]), "P3")

@test("returns None for no priority")
def _():
    assert_eq(extract_priority(["bug", "enhancement"]), None)

@test("case insensitive")
def _():
    assert_eq(extract_priority(["p1"]), "P1")
    assert_eq(extract_priority(["PRIORITY:CRITICAL"]), "P0")


# ============================================================
# Issue Scoring Tests
# ============================================================

print("\n-- Issue Scoring --")

def make_issue(**kwargs):
    defaults = {
        "number": 1,
        "title": "Test issue",
        "labels": [],
        "created_at": (datetime.utcnow() - timedelta(days=5)).isoformat() + "Z",
        "comments": 0,
        "assignees": None,
    }
    defaults.update(kwargs)
    return defaults

@test("bug gets score boost")
def _():
    issue = make_issue(title="Incorrect output format", labels=["bug"])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score > 0, f"Bug should boost score, got {score.score}")
    # Bug label alone should be P2 or P3 (not P0)
    assert_true(score.suggested_priority in ("P2", "P3"),
                f"Bug label alone should be P2/P3, got {score.suggested_priority}")

@test("security gets high score")
def _():
    issue = make_issue(title="Security: auth bypass", labels=["bug"])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score >= 25, f"Security should score high, got {score.score}")

@test("old dormant issue gets penalized")
def _():
    issue = make_issue(
        title="Some old feature",
        created_at=(datetime.utcnow() - timedelta(days=120)).isoformat() + "Z",
        comments=0
    )
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score < 0, f"Old dormant should be negative, got {score.score}")
    assert_true(any("Dormant" in r for r in score.reasons), "Should mention dormancy")

@test("active discussion boosts score")
def _():
    issue = make_issue(title="Important fix", comments=8)
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score > 5, f"Active discussion should boost, got {score.score}")
    assert_true(any("Active" in r for r in score.reasons))

@test("unassigned gets slight boost")
def _():
    issue = make_issue(title="Fix bug", assignees=None)
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(any("Unassigned" in r for r in score.reasons))

@test("assigned issue notes assignee")
def _():
    issue = make_issue(title="Fix bug", assignees=[{"login": "alice"}])
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_eq(score.assignee, "alice")

@test("nice-to-have gets penalized")
def _():
    issue = make_issue(title="Nice to have: fancy animation")
    score = compute_issue_score(issue, "test-repo", [], datetime.utcnow())
    assert_true(score.score < 0, f"Nice-to-have should be negative, got {score.score}")


# ============================================================
# Pipeline Signal Tests
# ============================================================

print("\n-- Pipeline Signals --")

@test("signal alignment boosts matching issues")
def _():
    signals = [PipelineSignal(
        source="knowledge",
        signal_type="stale_knowledge",
        weight=0.8,
        detail="20 stale facts"
    )]
    issue = make_issue(title="Fix stale knowledge entries")
    score = compute_issue_score(issue, "test-repo", signals, datetime.utcnow())
    assert_true(any("Matches signal" in r for r in score.reasons))

@test("empty knowledge boosts harvester issues")
def _():
    signals = [PipelineSignal(
        source="knowledge",
        signal_type="empty_knowledge",
        weight=0.7,
        detail="0 facts"
    )]
    issue = make_issue(title="Implement harvester pipeline")
    score = compute_issue_score(issue, "test-repo", signals, datetime.utcnow())
    assert_true(any("Critical gap" in r for r in score.reasons))


# ============================================================
# Knowledge Signal Collection Tests
# ============================================================

print("\n-- Knowledge Signal Collection --")

@test("missing index generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        signals = collect_knowledge_signals(tmpdir)
        assert_true(len(signals) > 0)
        assert_eq(signals[0].signal_type, "missing_index")

@test("empty knowledge generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": []}, f)
        signals = collect_knowledge_signals(tmpdir)
        assert_true(any(s.signal_type == "empty_knowledge" for s in signals))

@test("corrupt index generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            f.write("not json {{{")
        signals = collect_knowledge_signals(tmpdir)
        assert_true(any(s.signal_type == "corrupt_index" for s in signals))

@test("knowledge with facts passes")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        idx = os.path.join(tmpdir, "index.json")
        with open(idx, "w") as f:
            json.dump({"facts": [
                {"id": 1, "repo": "test", "status": "fresh"},
                {"id": 2, "repo": "test", "status": "fresh"},
            ]}, f)
        signals = collect_knowledge_signals(tmpdir)
        # Should not generate missing_index or empty_knowledge
        assert_false(any(s.signal_type in ("missing_index", "empty_knowledge") for s in signals))


# ============================================================
# Metrics Signal Collection Tests
# ============================================================

print("\n-- Metrics Signal Collection --")

@test("empty metrics dir generates signal")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        signals = collect_metrics_signals(tmpdir)
        assert_true(any(s.signal_type == "no_metrics" for s in signals))

@test("metrics with files passes")
def _():
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create files (simulating real metrics dir with .gitkeep + actual files)
        with open(os.path.join(tmpdir, ".gitkeep"), "w") as f:
            f.write("")
        with open(os.path.join(tmpdir, "report.json"), "w") as f:
            f.write("{}")
        signals = collect_metrics_signals(tmpdir)
        assert_false(any(s.signal_type == "no_metrics" for s in signals))


# ============================================================
# Report Generation Tests
# ============================================================

print("\n-- Report Generation --")

@test("report has correct structure")
def _():
    scores = [
        IssueScore(1, "repo1", "Bug fix", ["bug"], None, "P1", 30.0, ["test"], 5, 3, None),
        IssueScore(2, "repo1", "Feature", ["enhancement"], "P3", None, -5.0, ["test"], 60, 0, "alice"),
    ]
    signals = [PipelineSignal("knowledge", "stale_knowledge", 0.5, "10 stale")]
    report = generate_report(scores, signals, "test-org", ["repo1"])

    assert_eq(report["org"], "test-org")
    assert_eq(report["total_issues"], 2)
    assert_true("generated_at" in report)
    assert_true("summary" in report)
    assert_true("top_priority" in report)
    assert_eq(report["summary"]["suggested_new_priorities"], 1)

@test("markdown report is non-empty")
def _():
    scores = [IssueScore(1, "repo1", "Test", ["bug"], None, "P2", 15.0, ["reason"], 5, 0, None)]
    report = generate_report(scores, [], "test-org", ["repo1"])
    md = generate_markdown_report(report)
    assert_true(len(md) > 100)
    assert_true("Priority Rebalancer Report" in md)
    assert_true("Top 10" in md)


# ============================================================
# Summary
# ============================================================

print("\n=== Summary ===")
print(f"Total: {PASS + FAIL} | Passed: {PASS} | Failed: {FAIL}")

if FAIL > 0:
    sys.exit(1)
@@ -1,242 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/refactoring_opportunity_finder.py — 10 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.dirname(__file__) or ".")
import importlib.util
spec = importlib.util.spec_from_file_location(
    "rof", os.path.join(os.path.dirname(__file__) or ".", "refactoring_opportunity_finder.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

compute_file_complexity = mod.compute_file_complexity
calculate_refactoring_score = mod.calculate_refactoring_score
FileMetrics = mod.FileMetrics


def test_complexity_simple_function():
    """Simple function should have low complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def simple():
    return 42
""")
        f.flush()
    avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    assert avg == 1.0, f"Expected 1.0, got {avg}"
    assert max_c == 1, f"Expected 1, got {max_c}"
    assert funcs == 1, f"Expected 1, got {funcs}"
    assert classes == 0, f"Expected 0, got {classes}"
    os.unlink(f.name)
    print("PASS: test_complexity_simple_function")


def test_complexity_with_conditionals():
    """Function with if/else should have higher complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def complex_func(x):
    if x > 0:
        if x > 10:
            return "big"
        else:
            return "small"
    elif x < 0:
        return "negative"
    else:
        return "zero"
""")
        f.flush()
    avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    # Base 1 + 3 if/elif + 1 nested if = 5
    assert max_c >= 4, f"Expected max_c >= 4, got {max_c}"
    assert funcs == 1, f"Expected 1, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_conditionals")


def test_complexity_with_loops():
    """Function with loops should increase complexity."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
def loop_func(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item)
    while len(result) > 10:
        result.pop()
    return result
""")
        f.flush()
    avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    # Base 1 + 1 for + 1 if + 1 while = 4
    assert max_c >= 3, f"Expected max_c >= 3, got {max_c}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_loops")


def test_complexity_with_class():
    """Class with methods should count both."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("""
class MyClass:
    def method1(self):
        if True:
            pass

    def method2(self):
        for i in range(10):
            pass
""")
        f.flush()
    avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    assert classes == 1, f"Expected 1 class, got {classes}"
    assert funcs == 2, f"Expected 2 functions, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_with_class")


def test_complexity_syntax_error():
    """File with syntax error should return zeros."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write("def broken(:\n pass")
        f.flush()
    avg, max_c, funcs, classes, lines = compute_file_complexity(f.name)
    assert avg == 0.0, f"Expected 0.0, got {avg}"
    assert funcs == 0, f"Expected 0, got {funcs}"
    os.unlink(f.name)
    print("PASS: test_complexity_syntax_error")


def test_refactoring_score_high_complexity():
    """High complexity should give high score."""
    metrics = FileMetrics(
        path="test.py",
        lines=200,
        complexity=15.0,
        max_complexity=25,
        functions=10,
        classes=2,
        churn_30d=5,
        churn_90d=15,
        test_coverage=0.3,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    assert score > 50, f"Expected score > 50, got {score}"
    print("PASS: test_refactoring_score_high_complexity")


def test_refactoring_score_low_complexity():
    """Low complexity should give lower score."""
    metrics = FileMetrics(
        path="test.py",
        lines=50,
        complexity=2.0,
        max_complexity=3,
        functions=3,
        classes=0,
        churn_30d=0,
        churn_90d=1,
        test_coverage=0.9,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    assert score < 30, f"Expected score < 30, got {score}"
    print("PASS: test_refactoring_score_low_complexity")


def test_refactoring_score_high_churn():
    """High churn should increase score."""
    metrics = FileMetrics(
        path="test.py",
        lines=100,
        complexity=5.0,
        max_complexity=8,
        functions=5,
        classes=0,
        churn_30d=10,
        churn_90d=20,
        test_coverage=0.5,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    # Churn should contribute significantly
    assert score > 40, f"Expected score > 40 for high churn, got {score}"
    print("PASS: test_refactoring_score_high_churn")


def test_refactoring_score_no_coverage():
    """No coverage data should assume medium risk."""
    metrics = FileMetrics(
        path="test.py",
        lines=100,
        complexity=5.0,
        max_complexity=8,
        functions=5,
        classes=0,
        churn_30d=1,
        churn_90d=2,
        test_coverage=None,
        refactoring_score=0.0
    )
    score = calculate_refactoring_score(metrics)
    # Should have some score from the 5-point coverage component
    assert score > 0, f"Expected positive score, got {score}"
    print("PASS: test_refactoring_score_no_coverage")


def test_refactoring_score_large_file():
    """Large files should score higher."""
    metrics_small = FileMetrics(
        path="small.py",
        lines=50,
        complexity=5.0,
        max_complexity=8,
        functions=3,
        classes=0,
        churn_30d=1,
        churn_90d=2,
        test_coverage=0.8,
        refactoring_score=0.0
    )
    metrics_large = FileMetrics(
        path="large.py",
        lines=1000,
        complexity=5.0,
        max_complexity=8,
        functions=3,
        classes=0,
        churn_30d=1,
        churn_90d=2,
        test_coverage=0.8,
        refactoring_score=0.0
    )
    score_small = calculate_refactoring_score(metrics_small)
    score_large = calculate_refactoring_score(metrics_large)
    assert score_large > score_small, \
        f"Large file ({score_large}) should score higher than small ({score_small})"
    print("PASS: test_refactoring_score_large_file")


def run_all():
    test_complexity_simple_function()
    test_complexity_with_conditionals()
    test_complexity_with_loops()
    test_complexity_with_class()
    test_complexity_syntax_error()
    test_refactoring_score_high_complexity()
    test_refactoring_score_low_complexity()
    test_refactoring_score_high_churn()
    test_refactoring_score_no_coverage()
    test_refactoring_score_large_file()
    print("\nAll 10 tests passed!")


if __name__ == "__main__":
    run_all()
@@ -1,90 +0,0 @@
#!/usr/bin/env python3
"""Tests for session_pair_harvester."""

import json
import sys
import os
import tempfile

sys.path.insert(0, os.path.dirname(__file__))
from session_pair_harvester import extract_pairs_from_session, deduplicate_pairs, compute_hash


def test_basic_extraction():
    session = {
        "id": "test_001",
        "model": "test-model",
        "conversations": [
            {"from": "system", "value": "You are helpful."},
            {"from": "human", "value": "What is Python?"},
            {"from": "gpt", "value": "Python is a high-level programming language known for its readability and versatility. It supports multiple paradigms including procedural, object-oriented, and functional programming. Python is widely used in web development, data science, machine learning, and automation."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 1
    assert pairs[0]["terse"] == "What is Python?"
    assert "programming language" in pairs[0]["rich"]
    assert pairs[0]["source"] == "test_001"
    print("PASS: test_basic_extraction")


def test_filters_short_responses():
    session = {
        "id": "test_002",
        "model": "test",
        "conversations": [
            {"from": "human", "value": "Hi"},
            {"from": "gpt", "value": "Hello!"},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=20)
    assert len(pairs) == 0
    print("PASS: test_filters_short_responses")


def test_skips_tool_results():
    session = {
        "id": "test_003",
        "model": "test",
        "conversations": [
            {"from": "human", "value": '{"output": "file content", "exit_code": 0}'},
            {"from": "gpt", "value": "The file was read successfully. Now let me analyze the content and provide a detailed summary of what was found in the file system."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 0
    print("PASS: test_skips_tool_results")


def test_deduplication():
    pairs = [
        {"terse": "What is X?", "rich": "X is Y.", "source": "s1", "model": "m"},
        {"terse": "What is X?", "rich": "X is Y.", "source": "s2", "model": "m"},
        {"terse": "What is Z?", "rich": "Z is W.", "source": "s1", "model": "m"},
    ]
    unique = deduplicate_pairs(pairs)
    assert len(unique) == 2
    print("PASS: test_deduplication")


def test_ratio_filter():
    session = {
        "id": "test_005",
        "model": "test",
        "conversations": [
            {"from": "human", "value": "Explain quantum computing in detail with examples and applications"},
            {"from": "gpt", "value": "OK."},
        ]
    }
    pairs = extract_pairs_from_session(session, min_ratio=1.5, min_response_words=10)
    assert len(pairs) == 0  # response too short relative to prompt
    print("PASS: test_ratio_filter")


if __name__ == "__main__":
    test_basic_extraction()
    test_filters_short_responses()
    test_skips_tool_results()
    test_deduplication()
    test_ratio_filter()
    print("\nAll tests passed.")
@@ -1,38 +0,0 @@
#!/usr/bin/env python3
"""Validate knowledge files and index.json against the schema."""
import json
import sys
from pathlib import Path

VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED = {"id", "fact", "category", "domain", "confidence"}


def validate_fact(fact, src=""):
    errs = []
    for f in REQUIRED:
        if f not in fact:
            errs.append(f"{src}: missing '{f}'")
    if "category" in fact and fact["category"] not in VALID_CATEGORIES:
        errs.append(f"{src}: invalid category '{fact['category']}'")
    if "confidence" in fact:
        if not isinstance(fact["confidence"], (int, float)) or not (0 <= fact["confidence"] <= 1):
            errs.append(f"{src}: confidence must be 0.0-1.0")
    if "id" in fact:
        parts = fact["id"].split(":")
        if len(parts) != 3:
            errs.append(f"{src}: id must be domain:category:sequence")
    return errs


def main():
    idx = Path(__file__).parent.parent / "knowledge" / "index.json"
    if not idx.exists():
        print(f"FAILED: {idx} not found")
        sys.exit(1)
    data = json.loads(idx.read_text())
    errs = []
    seen = set()
    for i, f in enumerate(data.get("facts", [])):
        errs.extend(validate_fact(f, f"[{i}]"))
        if "id" in f:
            if f["id"] in seen:
                errs.append(f"duplicate id '{f['id']}'")
            seen.add(f["id"])
    if errs:
        print(f"FAILED - {len(errs)} errors:")
        for e in errs:
            print(f"  x {e}")
        sys.exit(1)
    print(f"PASSED - {len(data.get('facts', []))} facts")


if __name__ == "__main__":
    main()
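The schema the validator enforces can be illustrated with a minimal `knowledge/index.json`. This is a sketch: the field names come from `REQUIRED` and `VALID_CATEGORIES` above, but the example fact itself is illustrative:

```python
import json

# Minimal index.json shape that satisfies the validator's schema (sketch).
index = {
    "facts": [
        {
            "id": "global:tool-quirk:001",  # id format: domain:category:sequence
            "fact": "Gitea authentication token is stored at ~/.config/gitea/token",
            "category": "tool-quirk",
            "domain": "global",
            "confidence": 0.9,
        }
    ]
}
print(json.dumps(index, indent=2))
```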
@@ -1,72 +0,0 @@
# Conference Talk Knowledge Extraction Prompt

## System Prompt

You are a knowledge extraction engine specialized in conference talks. You read talk transcripts and output ONLY structured JSON. You extract factual insights, patterns, tool discoveries, and warnings that are durable and actionable for the Timmy Foundation fleet.

## Prompt

```
TASK: Extract durable knowledge from this conference talk transcript.

RULES:
1. Extract ONLY information explicitly stated or strongly implied in the transcript.
2. Do NOT hallucinate, infer unsupported details, or invent quotes.
3. Every fact must be grounded in something the speaker actually said.
4. Focus on **durable, reusable** knowledge — not specific project details that won't apply elsewhere.
5. Prioritize insights that improve: workflows, tool usage, system design, governance, or operational reliability.

CATEGORIES (assign exactly one per item):
- fact: Concrete, verifiable takeaway (technical detail, config, workflow)
- pitfall: Mistake, trap, or cost of a wrong approach the speaker warned about
- pattern: Successful approach, sequence, or template worth reusing
- tool-quirk: Unexpected behavior, gotcha, or setup detail for a specific tool/platform
- question: Something raised but not fully answered — worth investigating further

CONFIDENCE:
- 0.9–1.0: Explicitly stated by the speaker with clear reasoning/evidence
- 0.7–0.8: Clearly implied by multiple statements or the speaker's expertise
- 0.5–0.6: Suggested or hinted, but not directly confirmed
- 0.3–0.4: Interpretive, speculative, or a single-data-point observation

TARGET DOMAIN:
- If the talk is about a specific repo (e.g. hermes-agent, the-nexus), set `domain` to that repo name.
- If the talk is about general principles, fleet processes, or multiple repos, set `domain` to "global".
- If the talk is about an agent type (mimo, groq, claude), set `domain` to the agent name.
- If the talk is about the compounding-intelligence system itself, set `domain` to "compounding-intelligence".

OUTPUT FORMAT (valid JSON only, no markdown, no explanation):

{
  "knowledge": [
    {
      "fact": "One specific, actionable sentence of knowledge",
      "category": "fact|pitfall|pattern|tool-quirk|question",
      "domain": "global|{repo}|{agent}|compounding-intelligence",
      "confidence": 0.0-1.0,
      "tags": ["relevant", "keywords"],
      "evidence": "Brief paraphrase or quote from the transcript that supports this"
    }
  ],
  "meta": {
    "talk_title": "Title of the talk (if known)",
    "speaker": "Speaker name(s)",
    "conference": "Conference name",
    "talk_url": "URL to talk/video (if available)",
    "knowledge_count": 0,
    "extraction_date": "2026-04-26"
  }
}

TRANSCRIPT:
{{transcript}}
```

## Design Notes

- Keep the `fact` field to **one clear sentence**. Avoid run-ons.
- `evidence` should be a 1–2 sentence paraphrase, not a verbatim paragraph.
- `tags` should include tool names, repo names, agent types, and concepts mentioned.
- Focus on what the fleet can **reuse tomorrow**, not ephemeral project context.
- If the talk is high-level vision with no concrete details, that's a `question` or a low-confidence `fact`.
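The `{{transcript}}` placeholder is the only variable in the template, so filling it can be as simple as a literal substitution. A minimal sketch, assuming the template is read from `templates/` as plain text:

```python
def render_prompt(template: str, transcript: str) -> str:
    # Substitute the literal {{transcript}} placeholder; no other templating is assumed.
    return template.replace("{{transcript}}", transcript)

prompt = render_prompt("TRANSCRIPT:\n{{transcript}}", "Speaker: ship small, ship often.")
```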
@@ -2,107 +2,98 @@

## System Prompt

You are a knowledge extraction engine. You read session transcripts and output ONLY structured JSON. You never infer. You never assume. You extract only what the transcript explicitly states.
You are a knowledge extraction engine. Your task is to analyze a session transcript and extract durable knowledge that will help future sessions be more efficient.

## Prompt
## Instructions

Read the session transcript carefully. Extract ONLY information that is explicitly stated in the transcript. Do NOT infer, assume, or hallucinate information.

### Categories

Extract knowledge into these categories:

1. **fact**: Concrete, verifiable information learned (e.g., "Repository X has 5 files", "API returns JSON with field Y")
2. **pitfall**: Errors encountered, wrong assumptions, things that wasted time (e.g., "Assumed API token was in env var GITEA_TOKEN, but it's in ~/.config/gitea/token")
3. **pattern**: Successful sequences of actions (e.g., "To deploy: 1. Run tests 2. Build 3. Push to Gitea 4. Trigger webhook")
4. **tool-quirk**: Environment-specific behaviors (e.g., "Token paths are different on macOS vs Linux", "URL format requires trailing slash")
5. **question**: Things identified but not answered (e.g., "Need to determine optimal batch size for harvesting")

### Output Format

Return a JSON object with an array of extracted knowledge items. Each item must have:

```json
{
  "fact": "One sentence description of the knowledge",
  "category": "fact|pitfall|pattern|tool-quirk|question",
  "repo": "Repository name this applies to, or 'global' if general",
  "confidence": 0.0-1.0
}
```

TASK: Extract durable knowledge from this session transcript.

RULES:
1. Extract ONLY information explicitly stated in the transcript.
2. Do NOT infer, assume, or hallucinate.
3. Every fact must be verifiable by pointing to a specific line in the transcript.
4. If the session failed or was partial, extract pitfalls and questions — these are the most valuable.
5. Be specific. "Gitea API is slow" is worthless. "Gitea issues endpoint with state=open returns empty when limit=50 but works with limit=5" is knowledge.

### Confidence Scoring

CATEGORIES (assign exactly one per item):
- fact: Concrete, verifiable thing learned (paths, formats, counts, configs)
- pitfall: Error hit, wrong assumption, time wasted, thing that didn't work
- pattern: Successful sequence that should be reused (deploy steps, debug flow)
- tool-quirk: Environment-specific behavior (token paths, URL formats, API gotchas)
- question: Something identified but not answered — the NEXT agent should investigate
- 0.9-1.0: Explicitly stated with verification (e.g., "Error message shows X")
- 0.7-0.8: Clearly implied by multiple data points
- 0.5-0.6: Suggested but not fully verified
- 0.3-0.4: Inferred from limited data
- 0.1-0.2: Speculative or uncertain

CONFIDENCE:
- 0.9: Directly observed with error output or explicit verification
- 0.7: Multiple data points confirm, but not explicitly verified
- 0.5: Suggested by context, not tested
- 0.3: Inferred from limited evidence

### Constraints

OUTPUT FORMAT (valid JSON only, no markdown, no explanation):
1. **No hallucination**: Only extract what's explicitly in the transcript
2. **Specificity**: Each fact must be specific and actionable
3. **Relevance**: Only extract knowledge that would help future sessions
4. **Brevity**: One sentence per fact
5. **Partial sessions**: Even failed or incomplete sessions may contain valuable pitfalls

### Example Input/Output
**Input Transcript (excerpt):**
```
User: Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence
Agent: Trying git clone... Error: 401 Unauthorized
Agent: Checking for token... Found in ~/.config/gitea/token
Agent: Token is gitea_token format, not OAuth
Agent: Clone successful with token authentication
```

**Output:**
```json
{
  "knowledge": [
    {
      "fact": "One specific sentence of knowledge",
      "category": "fact|pitfall|pattern|tool-quirk|question",
      "repo": "repo-name or global",
      "confidence": 0.0-1.0,
      "evidence": "Brief quote or reference from transcript that supports this"
      "fact": "Gitea repo at forge.alexanderwhitestone.com requires authentication for cloning",
      "category": "fact",
      "repo": "compounding-intelligence",
      "confidence": 0.9
    },
    {
      "fact": "Gitea authentication token is stored at ~/.config/gitea/token",
      "category": "tool-quirk",
      "repo": "global",
      "confidence": 0.9
    },
    {
      "fact": "Gitea uses gitea_token format, not OAuth for API access",
      "category": "tool-quirk",
      "repo": "global",
      "confidence": 0.8
    },
    {
      "fact": "Clone fails with 401 when no token is provided",
      "category": "pitfall",
      "repo": "compounding-intelligence",
      "confidence": 0.9
    }
  ],
  "meta": {
    "session_outcome": "success|partial|failed",
    "tools_used": ["tool1", "tool2"],
    "repos_touched": ["repo1"],
    "error_count": 0,
    "knowledge_count": 0
  }
]
}

TRANSCRIPT:
{{transcript}}
```
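Before merging extractor output into the knowledge store, the JSON can be sanity-checked against the field spec above. A minimal sketch; `parse_extractor_output` is a hypothetical helper, and only the category and confidence checks are shown:

```python
import json

VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def parse_extractor_output(raw: str) -> list:
    """Parse the extractor's JSON and reject malformed items (sketch)."""
    data = json.loads(raw)
    items = data.get("knowledge", [])
    for item in items:
        if item.get("category") not in VALID_CATEGORIES:
            raise ValueError(f"bad category: {item.get('category')!r}")
        conf = item.get("confidence")
        if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
            raise ValueError(f"bad confidence: {conf!r}")
    return items
```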
## Design Notes
## Final Notes

### Why this works with mimo-v2-pro

Mimo needs:
- Explicit format constraints ("valid JSON only, no markdown")
- Clear category definitions with concrete examples
- Hard rules before soft guidance
- The transcript at the END (so it reads all instructions first)

This prompt front-loads all rules, then gives the transcript last. Mimo follows the pattern.
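The ordering rule (hard constraints first, transcript last) can be made mechanical when assembling the prompt. A sketch; the section names mirror the prompt's own blocks and `build_harvest_prompt` is a hypothetical helper:

```python
def build_harvest_prompt(system, rules, categories, confidence, output_format, transcript):
    # All instructions precede the transcript so the model reads every rule first.
    sections = [system, rules, categories, confidence, output_format,
                "TRANSCRIPT:\n" + transcript]
    return "\n\n".join(sections)
```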
### Handling partial/failed sessions

Failed sessions are the richest source of pitfalls. The prompt explicitly says:
> "If the session failed or was partial, extract pitfalls and questions — these are the most valuable."

This reframes failure as valuable output, not noise to discard.

### The `evidence` field

Added to the original spec. Every extracted item must cite where in the transcript it came from. This:
- Prevents hallucination (can't cite what isn't there)
- Enables verification (a reviewer can check the source)
- Trains confidence calibration (the agent must find evidence, not just claim it)
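The "can't cite what isn't there" property is checkable in code. A minimal sketch; `evidence_is_grounded`, the word-overlap heuristic, and the 0.6 threshold are all assumptions, not part of the spec:

```python
def evidence_is_grounded(item: dict, transcript: str, threshold: float = 0.6) -> bool:
    """True if most of the evidence's words actually occur in the transcript (sketch)."""
    words = set(item.get("evidence", "").lower().split())
    if not words:
        return False
    haystack = transcript.lower()
    hits = sum(1 for w in words if w in haystack)
    return hits / len(words) >= threshold
```

A reviewer (or a CI step) could run this over every harvested item and flag ungrounded evidence for manual review.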
### Token budget

Target: ~1,000 tokens for the prompt (excluding transcript).

```
System prompt:  ~50 tokens
Rules:         ~200 tokens
Categories:    ~150 tokens
Confidence:    ~100 tokens
Output format: ~200 tokens
Design notes:  NOT included in prompt (documentation only)
─────────────────────────────
Total prompt:  ~700 tokens
```

Leaves ~300 tokens of headroom for variable content (transcript insertion, edge cases).
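The budget arithmetic above can be kept honest with a crude estimator. A sketch; the 4-characters-per-token rule of thumb is an assumption, not a real tokenizer:

```python
# Section budgets from the breakdown above.
budget = {"system": 50, "rules": 200, "categories": 150,
          "confidence": 100, "output_format": 200}
total = sum(budget.values())
print(total)  # 700, leaving ~300 of the ~1,000-token target as headroom

def estimate_tokens(text: str) -> int:
    # Rough rule of thumb: ~4 characters per token.
    return max(1, len(text) // 4)
```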
### What this replaces

The v1 prompt had:
- Verbose prose explanations (wasted tokens for mimo)
- No `evidence` field (hallucination risk)
- No `meta` block (no session-level metadata)
- No explicit handling of failed sessions
- An example that was too long (~150 tokens of example for a 1k prompt)

This v2 is tighter, more structured, and adds the evidence requirement that prevents the #1 failure mode of extraction prompts: generating plausible-sounding facts that aren't in the transcript.
- Process the entire transcript, not just the beginning
- Pay special attention to errors and corrections
- Note any environment-specific details
- Track tool-specific behaviors and quirks
- If the session failed, focus on pitfalls and questions
@@ -1,18 +0,0 @@
{"role": "user", "content": "Deploy the application to production"}
{"role": "assistant", "content": "I'll deploy the application to production.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@prod-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "ssh: connect to host prod-server port 22: Connection refused"}
{"role": "assistant", "content": "Connection refused. Let me check the server status.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ping -c 3 prod-server"}}]}
{"role": "tool", "content": "PING prod-server (192.168.1.100): 56 data bytes\nRequest timeout for icmp_seq 0\nRequest timeout for icmp_seq 1\nRequest timeout for icmp_seq 2\n--- prod-server ping statistics ---\n3 packets transmitted, 0 packets received, 100.0% packet loss"}
{"role": "assistant", "content": "The server appears to be down. Let me check if there's an alternative deployment method."}
{"role": "user", "content": "Try using the VPS instead"}
{"role": "assistant", "content": "I'll try deploying to the VPS.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh deploy@vps-server 'cd /app && git pull && docker-compose up -d'"}}]}
{"role": "tool", "content": "Permission denied (publickey)."}
{"role": "assistant", "content": "Permission denied. The SSH key might not be set up correctly. Let me check the SSH configuration.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat ~/.ssh/config"}}]}
{"role": "tool", "content": "Host vps-server\n HostName 167.99.126.228\n User deploy\n IdentityFile ~/.ssh/id_rsa_vps"}
{"role": "assistant", "content": "The SSH config looks correct. Let me check if the key exists.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la ~/.ssh/id_rsa_vps"}}]}
{"role": "tool", "content": "ls: /Users/user/.ssh/id_rsa_vps: No such file or directory"}
{"role": "assistant", "content": "The SSH key doesn't exist. This is a pitfall - we need to generate the SSH key first."}
{"role": "user", "content": "Generate a new SSH key for the VPS"}
{"role": "assistant", "content": "I'll generate a new SSH key.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_vps -N ''"}}]}
{"role": "tool", "content": "Generating public/private rsa key pair.\nYour identification has been saved in /Users/user/.ssh/id_rsa_vps\nYour public key has been saved in /Users/user/.ssh/id_rsa_vps.pub\nThe key fingerprint is:\nSHA256:abc123... user@machine"}
{"role": "assistant", "content": "SSH key generated. Now I need to add it to the VPS."}
@@ -1,14 +0,0 @@
{"role": "user", "content": "Check the Gitea API for open issues"}
{"role": "assistant", "content": "I'll check the Gitea API for open issues.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://forge.alexanderwhitestone.com/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (6) Could not resolve host: forge.alexanderwhitestone.com"}
{"role": "assistant", "content": "DNS resolution failed. Let me check the network connectivity.", "tool_calls": [{"function": "terminal", "arguments": {"command": "nslookup forge.alexanderwhitestone.com"}}]}
{"role": "tool", "content": "Server: 8.8.8.8\nAddress: 8.8.8.8#53\n\n** server can't find forge.alexanderwhitestone.com: NXDOMAIN"}
{"role": "assistant", "content": "The domain doesn't resolve. This might be a DNS issue. Let me try using the IP address directly.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "curl: (60) SSL certificate problem: self signed certificate"}
{"role": "assistant", "content": "SSL certificate issue. The server is using a self-signed certificate. Let me try with -k flag to skip SSL verification.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/issues?state=open"}}]}
{"role": "tool", "content": "[]"}
{"role": "assistant", "content": "No open issues found. The API returned an empty array."}
{"role": "user", "content": "Check if there are any pull requests"}
{"role": "assistant", "content": "Let me check for pull requests.", "tool_calls": [{"function": "terminal", "arguments": {"command": "curl -k -H 'Authorization: token $GITEA_TOKEN' https://167.99.126.228/api/v1/repos/Timmy_Foundation/compounding-intelligence/pulls?state=open"}}]}
{"role": "tool", "content": "[{\"id\": 20, \"title\": \"feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)\", \"state\": \"open\", \"user\": {\"login\": \"Rockachopa\"}, \"created_at\": \"2026-04-14T17:28:20Z\"}]"}
{"role": "assistant", "content": "Found 1 open pull request: PR #20 - \"feat: add harvester.py + session_reader.py — session knowledge extractor (closes #8)\" by Rockachopa."}
@@ -1,10 +0,0 @@
{"role": "user", "content": "How do I deploy a new version of the application?"}
{"role": "assistant", "content": "Here's the deployment process:\n1. Run tests: pytest\n2. Build Docker image: docker build -t app:latest .\n3. Push to registry: docker push registry.example.com/app:latest\n4. Update production: ssh prod 'cd /app && docker-compose pull && docker-compose up -d'\n5. Verify: curl https://app.example.com/health"}
{"role": "user", "content": "What if tests fail?"}
{"role": "assistant", "content": "If tests fail:\n1. Check the error logs: pytest -v\n2. Fix the failing tests\n3. Re-run tests: pytest\n4. If still failing, check recent changes: git log --oneline -10\n5. Consider reverting: git revert HEAD"}
{"role": "user", "content": "What if the Docker build fails?"}
{"role": "assistant", "content": "If Docker build fails:\n1. Check Dockerfile syntax: docker build --no-cache -t app:latest .\n2. Check for missing dependencies: cat requirements.txt\n3. Check disk space: df -h\n4. Clean up Docker: docker system prune -f\n5. Try building with verbose output: docker build --progress=plain -t app:latest ."}
{"role": "user", "content": "What if the push to registry fails?"}
{"role": "assistant", "content": "If push to registry fails:\n1. Check registry credentials: docker login registry.example.com\n2. Check registry space: curl https://registry.example.com/v2/_catalog\n3. Check image size: docker images app:latest\n4. Try tagging with version: docker tag app:latest registry.example.com/app:v1.0.0\n5. Push with version tag: docker push registry.example.com/app:v1.0.0"}
{"role": "user", "content": "What if the production update fails?"}
{"role": "assistant", "content": "If production update fails:\n1. Check SSH connection: ssh prod 'echo connected'\n2. Check disk space on prod: ssh prod 'df -h'\n3. Check Docker status: ssh prod 'docker ps -a'\n4. Check logs: ssh prod 'docker-compose logs --tail=100'\n5. Rollback: ssh prod 'cd /app && git checkout HEAD~1 && docker-compose up -d'"}
@@ -1,12 +0,0 @@
{"role": "user", "content": "I need to implement a new feature for the application"}
{"role": "assistant", "content": "I can help you implement a new feature. What feature would you like to add?"}
{"role": "user", "content": "I want to add user authentication"}
{"role": "assistant", "content": "User authentication is a good feature. I have a few questions:\n1. What authentication method do you prefer? (JWT, OAuth, Session-based)\n2. Do you need social login? (Google, GitHub, etc.)\n3. What user data do you need to store?\n4. Do you need role-based access control?\n5. What's your database? (PostgreSQL, MySQL, SQLite)"}
{"role": "user", "content": "I prefer JWT, no social login, just email and password, no roles, PostgreSQL"}
{"role": "assistant", "content": "Great choices. Here's the implementation plan:\n1. Create user model with email and password hash\n2. Create JWT token generation and validation\n3. Create login endpoint: POST /api/auth/login\n4. Create register endpoint: POST /api/auth/register\n5. Create middleware for protected routes\n6. Add password hashing with bcrypt\n7. Add rate limiting for login attempts\n8. Add password reset functionality\n9. Add email verification\n10. Test all endpoints"}
{"role": "user", "content": "What about password reset?"}
{"role": "assistant", "content": "For password reset:\n1. Create reset token table\n2. Create forgot password endpoint: POST /api/auth/forgot-password\n3. Send reset email with token\n4. Create reset password endpoint: POST /api/auth/reset-password\n5. Validate reset token and expiration\n6. Update password and delete token\n7. Send confirmation email"}
{"role": "user", "content": "What about email verification?"}
{"role": "assistant", "content": "For email verification:\n1. Add email_verified field to user model\n2. Create verification token table\n3. Send verification email on registration\n4. Create verify email endpoint: GET /api/auth/verify-email?token=...\n5. Validate token and update email_verified\n6. Add middleware to check email_verified for certain routes\n7. Add resend verification email endpoint"}
{"role": "user", "content": "What about rate limiting?"}
{"role": "assistant", "content": "For rate limiting:\n1. Use Redis for rate limit storage\n2. Create rate limit middleware\n3. Set limits: 5 login attempts per minute, 3 password resets per hour\n4. Return 429 Too Many Requests when limit exceeded\n5. Add X-RateLimit headers to responses\n6. Log rate limit violations\n7. Add whitelist for trusted IPs"}
@@ -1,13 +0,0 @@
{"role": "user", "content": "Clone the repo at forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence"}
{"role": "assistant", "content": "I'll clone the repository for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "git clone https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence.git"}}]}
{"role": "tool", "content": "Cloning into 'compounding-intelligence'...\nremote: Enumerating objects: 15, done.\nremote: Counting objects: 100% (15/15), done.\nremote: Compressing objects: 100% (12/12), done.\nremote: Total 15 (delta 2), reused 0 (delta 0), pack-reused 0\nReceiving objects: 100% (15/15), done.\nResolving deltas: 100% (2/2), done."}
{"role": "assistant", "content": "Repository cloned successfully. Let me check the structure.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/"}}]}
{"role": "tool", "content": "total 8\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 .\ndrwxr-xr-x 12 user staff 384 Apr 14 17:21 ..\ndrwxr-xr-x 13 user staff 416 Apr 14 17:21 .git\n-rw-r--r-- 1 user staff 1234 Apr 14 17:21 README.md\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 knowledge\n-rw-r--r-- 1 user staff 456 Apr 14 17:21 metrics\ndrwxr-xr-x 4 user staff 128 Apr 14 17:21 scripts\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 templates"}
{"role": "user", "content": "What files are in the templates directory?"}
{"role": "assistant", "content": "Let me check the templates directory.", "tool_calls": [{"function": "terminal", "arguments": {"command": "ls -la compounding-intelligence/templates/"}}]}
{"role": "tool", "content": "total 0\ndrwxr-xr-x 3 user staff 96 Apr 14 17:21 .\ndrwxr-xr-x 5 user staff 160 Apr 14 17:21 ..\n-rw-r--r-- 1 user staff 0 Apr 14 17:21 .gitkeep"}
{"role": "assistant", "content": "The templates directory currently only contains a .gitkeep file. No templates have been added yet."}
{"role": "user", "content": "Create a harvest-prompt.md template in the templates directory."}
{"role": "assistant", "content": "I'll create the harvest-prompt.md template for you.", "tool_calls": [{"function": "terminal", "arguments": {"command": "cat > compounding-intelligence/templates/harvest-prompt.md << 'EOF'\n# Knowledge Extraction Prompt\n\nExtract knowledge from session transcripts.\nEOF"}}]}
{"role": "tool", "content": ""}
{"role": "assistant", "content": "Template created successfully. The harvest-prompt.md file is now in the templates directory."}
@@ -1,19 +0,0 @@
from pathlib import Path


def test_requirements_makefile_and_workflow_exist() -> None:
    assert Path("requirements.txt").exists()
    assert Path("Makefile").exists()
    assert Path(".gitea/workflows/test.yml").exists()


def test_ci_workflow_runs_project_test_command() -> None:
    workflow = Path(".gitea/workflows/test.yml").read_text(encoding="utf-8")
    requirements = Path("requirements.txt").read_text(encoding="utf-8")
    makefile = Path("Makefile").read_text(encoding="utf-8")

    assert "pytest" in requirements
    assert "test:" in makefile
    assert "python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v" in makefile
    assert "pip install -r requirements.txt" in workflow
    assert "make test" in workflow
@@ -1,207 +0,0 @@
"""Tests for knowledge deduplication module (Issue #196)."""

import json
import sys
from pathlib import Path

import pytest

sys.path.insert(0, str(Path(__file__).parent.parent / "scripts"))

from dedup import (
    normalize_text,
    content_hash,
    tokenize,
    token_similarity,
    quality_score,
    merge_facts,
    dedup_facts,
    generate_test_duplicates,
)
class TestNormalize:
    def test_lowercases(self):
        assert normalize_text("Hello World") == "hello world"

    def test_collapses_whitespace(self):
        assert normalize_text(" hello   world ") == "hello world"

    def test_strips(self):
        assert normalize_text("  text  ") == "text"


class TestContentHash:
    def test_deterministic(self):
        h1 = content_hash("Hello World")
        h2 = content_hash("hello world")
        h3 = content_hash(" Hello World ")
        assert h1 == h2 == h3

    def test_different_texts(self):
        h1 = content_hash("Hello")
        h2 = content_hash("World")
        assert h1 != h2

    def test_returns_hex(self):
        h = content_hash("test")
        assert len(h) == 64  # SHA256
        assert all(c in '0123456789abcdef' for c in h)


class TestTokenize:
    def test_extracts_words(self):
        tokens = tokenize("Hello World Test")
        assert "hello" in tokens
        assert "world" in tokens
        assert "test" in tokens

    def test_skips_short_words(self):
        tokens = tokenize("a to is the hello")
        assert "a" not in tokens
        assert "to" not in tokens
        assert "hello" in tokens

    def test_returns_set(self):
        tokens = tokenize("hello hello world")
        assert isinstance(tokens, set)
        assert len(tokens) == 2


class TestTokenSimilarity:
    def test_identical(self):
        assert token_similarity("hello world", "hello world") == 1.0

    def test_no_overlap(self):
        assert token_similarity("alpha beta", "gamma delta") == 0.0

    def test_partial_overlap(self):
        sim = token_similarity("hello world test", "hello universe test")
        assert 0.3 < sim < 0.7

    def test_empty(self):
        assert token_similarity("", "hello") == 0.0
        assert token_similarity("hello", "") == 0.0

    def test_symmetric(self):
        a = "hello world test"
        b = "hello universe test"
        assert token_similarity(a, b) == token_similarity(b, a)


class TestQualityScore:
    def test_high_confidence(self):
        fact = {"confidence": 0.95, "source_count": 5, "tags": ["test"], "related": ["x"]}
        score = quality_score(fact)
        assert score > 0.7

    def test_low_confidence(self):
        fact = {"confidence": 0.3, "source_count": 1}
        score = quality_score(fact)
        assert score < 0.5

    def test_defaults(self):
        score = quality_score({})
        assert 0 < score < 1
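The tests above pin down the behavior of the text primitives quite tightly. One implementation consistent with them, offered as a sketch of what `scripts/dedup.py` might contain, not the actual module:

```python
import hashlib
import re

def normalize_text(text: str) -> str:
    # Lowercase, collapse runs of whitespace, strip ends.
    return re.sub(r"\s+", " ", text).strip().lower()

def content_hash(text: str) -> str:
    # Hash the normalized form so case/whitespace variants collide.
    return hashlib.sha256(normalize_text(text).encode("utf-8")).hexdigest()

def tokenize(text: str) -> set:
    # Keep alphanumeric words of 3+ characters (drops "a", "to", "is").
    return {w for w in re.findall(r"[a-z0-9]+", normalize_text(text)) if len(w) >= 3}

def token_similarity(a: str, b: str) -> float:
    # Jaccard similarity over token sets; empty input scores 0.
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

With this sketch, "hello world test" vs "hello universe test" shares 2 of 4 distinct tokens, giving 0.5, inside the (0.3, 0.7) band the partial-overlap test asserts.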
class TestMergeFacts:
    def test_merges_tags(self):
        keep = {"id": "a", "fact": "test", "tags": ["git"], "confidence": 0.9}
        drop = {"id": "b", "fact": "test", "tags": ["python"], "confidence": 0.8}
        merged = merge_facts(keep, drop)
        assert "git" in merged["tags"]
        assert "python" in merged["tags"]

    def test_merges_source_count(self):
        keep = {"id": "a", "fact": "test", "source_count": 3}
        drop = {"id": "b", "fact": "test", "source_count": 2}
        merged = merge_facts(keep, drop)
        assert merged["source_count"] == 5

    def test_keeps_higher_confidence(self):
        keep = {"id": "a", "fact": "test", "confidence": 0.7}
        drop = {"id": "b", "fact": "test", "confidence": 0.9}
        merged = merge_facts(keep, drop)
        assert merged["confidence"] == 0.9

    def test_tracks_merged_from(self):
        keep = {"id": "a", "fact": "test"}
        drop = {"id": "b", "fact": "test"}
        merged = merge_facts(keep, drop)
        assert "b" in merged["_merged_from"]
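The merge semantics are fully specified by these four tests: union the tags, sum the source counts, keep the higher confidence, and record provenance. A sketch consistent with them (again, not the actual `dedup.py`):

```python
def merge_facts(keep: dict, drop: dict) -> dict:
    """Fold `drop` into `keep` when they are duplicates (sketch)."""
    merged = dict(keep)
    merged["tags"] = sorted(set(keep.get("tags", [])) | set(drop.get("tags", [])))
    merged["source_count"] = keep.get("source_count", 1) + drop.get("source_count", 1)
    merged["confidence"] = max(keep.get("confidence", 0.0), drop.get("confidence", 0.0))
    # Provenance: remember which ids were folded into this fact.
    merged["_merged_from"] = list(keep.get("_merged_from", [])) + [drop["id"]]
    return merged
```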
class TestDedupFacts:
    def test_removes_exact_dupes(self):
        facts = [
            {"id": "1", "fact": "Always use git rebase"},
            {"id": "2", "fact": "Always use git rebase"},  # exact dupe
            {"id": "3", "fact": "Check logs first"},
        ]
        deduped, stats = dedup_facts(facts)
        assert stats["exact_dupes"] == 1
        assert stats["unique"] == 2

    def test_removes_near_dupes(self):
        facts = [
            {"id": "1", "fact": "Always check logs before deploying to production server"},
            {"id": "2", "fact": "Always check logs before deploying to production environment"},
            {"id": "3", "fact": "Use docker compose for local development environments"},
        ]
        deduped, stats = dedup_facts(facts, near_threshold=0.5)
        assert stats["near_dupes"] >= 1
        assert stats["unique"] == 2

    def test_preserves_unique(self):
        facts = [
            {"id": "1", "fact": "Use git rebase for clean history"},
            {"id": "2", "fact": "Docker containers should be stateless"},
            {"id": "3", "fact": "Always write tests before code"},
        ]
        deduped, stats = dedup_facts(facts)
        assert stats["unique"] == 3
        assert stats["removed"] == 0

    def test_empty_input(self):
        deduped, stats = dedup_facts([])
        assert stats["total"] == 0
        assert stats["unique"] == 0

    def test_keeps_higher_quality_near_dup(self):
        facts = [
            {"id": "1", "fact": "Check logs before deploying to production server", "confidence": 0.5, "source_count": 1},
            {"id": "2", "fact": "Check logs before deploying to production environment", "confidence": 0.9, "source_count": 5, "tags": ["ops"]},
        ]
        deduped, stats = dedup_facts(facts, near_threshold=0.5)
        assert stats["unique"] == 1
        # Higher quality fact should be kept
        assert deduped[0]["confidence"] == 0.9

    def test_dry_run_does_not_modify(self):
        facts = [
            {"id": "1", "fact": "Same text"},
            {"id": "2", "fact": "Same text"},
        ]
        deduped, stats = dedup_facts(facts, dry_run=True)
        assert stats["exact_dupes"] == 1
        # In dry_run, merge_facts is skipped so facts aren't modified
        assert len(deduped) == 1
class TestGenerateTestDuplicates:
|
||||
def test_generates_correct_count(self):
|
||||
facts = generate_test_duplicates(20)
|
||||
assert len(facts) > 20 # 20 unique + duplicates
|
||||
|
||||
def test_has_exact_dupes(self):
|
||||
facts = generate_test_duplicates(20)
|
||||
hashes = [content_hash(f["fact"]) for f in facts]
|
||||
# Should have some duplicate hashes
|
||||
assert len(hashes) != len(set(hashes))
|
||||
|
||||
def test_dedup_removes_dupes(self):
|
||||
facts = generate_test_duplicates(20)
|
||||
deduped, stats = dedup_facts(facts)
|
||||
assert stats["unique"] <= 20
|
||||
assert stats["removed"] > 0
|
||||
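The deleted `dedup.py` implementation is not shown in this diff, but the tests above pin down its contract: `dedup_facts(facts, near_threshold=…, dry_run=…)` returns `(deduped, stats)` with `total`, `exact_dupes`, `near_dupes`, `unique`, and `removed` counters, and near-duplicates keep the higher-confidence copy. A minimal sketch consistent with those assertions (the hashing and similarity choices here are assumptions, not the repo's actual code) could look like:

```python
import hashlib
from difflib import SequenceMatcher


def content_hash(text):
    """Stable hash of normalized fact text."""
    return hashlib.sha256(text.strip().lower().encode()).hexdigest()


def dedup_facts(facts, near_threshold=0.9, dry_run=False):
    """Drop exact and near-duplicate facts, keeping the higher-confidence copy."""
    stats = {"total": len(facts), "exact_dupes": 0, "near_dupes": 0}
    kept = []
    for fact in facts:
        h = content_hash(fact["fact"])
        dup = None
        for other in kept:
            if content_hash(other["fact"]) == h:
                stats["exact_dupes"] += 1
                dup = other
                break
            if SequenceMatcher(None, fact["fact"], other["fact"]).ratio() >= near_threshold:
                stats["near_dupes"] += 1
                dup = other
                break
        if dup is None:
            kept.append(fact)
        elif not dry_run and fact.get("confidence", 0) > dup.get("confidence", 0):
            # Merge step: replace the kept copy with the higher-quality duplicate
            kept[kept.index(dup)] = fact
        # dry_run still drops the duplicate but skips the merge step
    stats["unique"] = len(kept)
    stats["removed"] = stats["total"] - stats["unique"]
    return kept, stats
```

With the default `near_threshold=0.9`, only strings that `difflib` scores as highly similar collapse; the tests above lower it to `0.5` to force the near-dup path.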
@@ -1,227 +0,0 @@
#!/usr/bin/env python3
"""Tests for scripts/freshness.py — 8 tests."""

import json
import os
import sys
import tempfile

sys.path.insert(0, os.path.join(os.path.dirname(__file__) or ".", ".."))
import importlib.util
spec = importlib.util.spec_from_file_location(
    "freshness", os.path.join(os.path.dirname(__file__) or ".", "..", "scripts", "freshness.py"))
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)

compute_file_hash = mod.compute_file_hash
check_freshness = mod.check_freshness
load_knowledge_entries = mod.load_knowledge_entries


def test_compute_file_hash():
    """File hash should be computed correctly."""
    with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:
        f.write("test content")
        f.flush()
        h = compute_file_hash(f.name)
    assert h is not None
    assert h.startswith("sha256:")
    os.unlink(f.name)
    print("PASS: test_compute_file_hash")


def test_compute_file_hash_nonexistent():
    """Nonexistent file should return None."""
    h = compute_file_hash("/nonexistent/file.txt")
    assert h is None
    print("PASS: test_compute_file_hash_nonexistent")


def test_load_knowledge_entries_empty():
    """Empty knowledge dir should return empty list."""
    with tempfile.TemporaryDirectory() as tmpdir:
        entries = load_knowledge_entries(tmpdir)
        assert entries == []
    print("PASS: test_load_knowledge_entries_empty")


def test_load_knowledge_entries_from_index():
    """Should load entries from index.json."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create index.json
        index_path = os.path.join(tmpdir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "test.py",
                        "source_hash": "sha256:abc123",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        entries = load_knowledge_entries(tmpdir)
        assert len(entries) == 1
        assert entries[0]["fact"] == "Test fact"
        assert entries[0]["source_file"] == "test.py"
    print("PASS: test_load_knowledge_entries_from_index")


def test_load_knowledge_entries_from_yaml():
    """Should load entries from YAML files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create global directory
        global_dir = os.path.join(tmpdir, "global")
        os.makedirs(global_dir)

        # Create YAML file
        yaml_path = os.path.join(global_dir, "test.yaml")
        with open(yaml_path, "w") as f:
            f.write("""
pitfalls:
  - description: "Test pitfall"
    source_file: "test.py"
    source_hash: "sha256:def456"
    category: "pitfall"
    confidence: 0.8
""")

        entries = load_knowledge_entries(tmpdir)
        assert len(entries) == 1
        assert entries[0]["fact"] == "Test pitfall"
        assert entries[0]["category"] == "pitfall"
    print("PASS: test_load_knowledge_entries_from_yaml")


def test_check_freshness_no_changes():
    """With no source file reference, entries should be counted correctly."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        # Create index.json with entry that has no source_file
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "General knowledge",
                        "category": "fact",
                        "confidence": 0.9
                        # No source_file or source_hash
                    }
                ]
            }, f)

        result = check_freshness(knowledge_dir, repo_dir, days=1)

        # Entry without source_file should be counted as "fresh" (no_source status)
        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 0
        assert result["summary"]["fresh"] == 1
        assert result["fresh_entries"][0]["status"] == "no_source"
    print("PASS: test_check_freshness_no_changes")


def test_check_freshness_with_hash_mismatch():
    """Hash mismatch should mark entry as stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir with a file
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        test_file = os.path.join(repo_dir, "test.py")
        with open(test_file, "w") as f:
            f.write("print('hello')")

        # Create index.json with wrong hash
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "test.py",
                        "source_hash": "sha256:wronghash",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        # Initialize git repo
        os.system(f"cd {repo_dir} && git init && git add . && git commit -m 'init' 2>/dev/null")

        result = check_freshness(knowledge_dir, repo_dir, days=1)

        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 1
        assert result["summary"]["fresh"] == 0
        assert result["stale_entries"][0]["reason"] == "hash_mismatch"
    print("PASS: test_check_freshness_with_hash_mismatch")


def test_check_freshness_missing_source():
    """Missing source file should mark entry as stale."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create knowledge dir
        knowledge_dir = os.path.join(tmpdir, "knowledge")
        os.makedirs(knowledge_dir)

        # Create repo dir (without the referenced file)
        repo_dir = os.path.join(tmpdir, "repo")
        os.makedirs(repo_dir)

        # Create index.json referencing nonexistent file
        index_path = os.path.join(knowledge_dir, "index.json")
        with open(index_path, "w") as f:
            json.dump({
                "facts": [
                    {
                        "fact": "Test fact",
                        "source_file": "nonexistent.py",
                        "source_hash": "sha256:abc123",
                        "category": "fact",
                        "confidence": 0.9
                    }
                ]
            }, f)

        # Initialize git repo
        os.system(f"cd {repo_dir} && git init && git add . && git commit -m 'init' 2>/dev/null")

        result = check_freshness(knowledge_dir, repo_dir, days=1)

        assert result["summary"]["total"] == 1
        assert result["summary"]["stale"] == 1
        assert result["summary"]["fresh"] == 0
        assert result["stale_entries"][0]["reason"] == "source_missing"
    print("PASS: test_check_freshness_missing_source")


def run_all():
    test_compute_file_hash()
    test_compute_file_hash_nonexistent()
    test_load_knowledge_entries_empty()
    test_load_knowledge_entries_from_index()
    test_load_knowledge_entries_from_yaml()
    test_check_freshness_no_changes()
    test_check_freshness_with_hash_mismatch()
    test_check_freshness_missing_source()
    print("\nAll 8 tests passed!")


if __name__ == "__main__":
    run_all()
@@ -1,141 +0,0 @@
"""Tests for knowledge_gap_identifier module."""

import sys
import os
import tempfile
import shutil
from pathlib import Path

sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..', 'scripts'))

from knowledge_gap_identifier import KnowledgeGapIdentifier, GapType, GapSeverity


def _make_repo(tmpdir, structure):
    """Create a test repo from a dict of {path: content}."""
    for rel_path, content in structure.items():
        p = Path(tmpdir) / rel_path
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content)


def test_undocumented_symbol():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "README.md": "# Calculator\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented = [g for g in report.gaps if g.gap_type == GapType.UNDOCUMENTED]
        assert any(g.name == "add" for g in undocumented), "add should be undocumented"


def test_documented_symbol_no_gap():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "README.md": "# Calculator\nUse `add()` to add numbers.\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented = [g for g in report.gaps
                        if g.gap_type == GapType.UNDOCUMENTED and g.name == "add"]
        assert len(undocumented) == 0, "add is documented, should not be flagged"


def test_untested_module():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "src/helper.py": "def format(x):\n    return str(x)\n",
            "tests/test_calculator.py": "from src.calculator import add\nassert add(1,2) == 3\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        untested = [g for g in report.gaps if g.gap_type == GapType.UNTESTED]
        assert any("helper" in g.name for g in untested), "helper should be untested"


def test_tested_module_no_gap():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/calculator.py": "def add(a, b):\n    return a + b\n",
            "tests/test_calculator.py": "def test_add():\n    assert True\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        untested = [g for g in report.gaps
                    if g.gap_type == GapType.UNTESTED and "calculator" in g.name]
        assert len(untested) == 0, "calculator has tests, should not be flagged"


def test_missing_implementation():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def run():\n    pass\n",
            "docs/api.md": "# API\nUse `NonExistentClass` to do things.\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        missing = [g for g in report.gaps if g.gap_type == GapType.MISSING_IMPLEMENTATION]
        assert any(g.name == "NonExistentClass" for g in missing)


def test_private_symbols_skipped():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def _internal():\n    pass\ndef public():\n    pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        undocumented_names = [g.name for g in report.gaps if g.gap_type == GapType.UNDOCUMENTED]
        assert "_internal" not in undocumented_names, "Private symbols should be skipped"
        assert "public" in undocumented_names


def test_empty_repo():
    with tempfile.TemporaryDirectory() as tmpdir:
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        assert len(report.gaps) == 0


def test_invalid_path():
    report = KnowledgeGapIdentifier().analyze("/nonexistent/path/xyz")
    assert len(report.gaps) == 1
    assert report.gaps[0].severity == GapSeverity.ERROR


def test_report_summary():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "class MyService:\n    def handle(self):\n        pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        summary = report.summary()
        assert "UNDOCUMENTED" in summary
        assert "MyService" in summary


def test_report_to_dict():
    with tempfile.TemporaryDirectory() as tmpdir:
        _make_repo(tmpdir, {
            "src/app.py": "def hello():\n    pass\n",
            "README.md": "# App\n",
        })
        report = KnowledgeGapIdentifier().analyze(tmpdir)
        d = report.to_dict()
        assert "total_gaps" in d
        assert "gaps" in d
        assert isinstance(d["gaps"], list)
        assert d["total_gaps"] > 0


if __name__ == "__main__":
    test_undocumented_symbol()
    test_documented_symbol_no_gap()
    test_untested_module()
    test_tested_module_no_gap()
    test_missing_implementation()
    test_private_symbols_skipped()
    test_empty_repo()
    test_invalid_path()
    test_report_summary()
    test_report_to_dict()
    print("All 10 tests passed.")
@@ -1,319 +0,0 @@
#!/usr/bin/env python3
"""
Tests for Performance Bottleneck Finder.
"""

import json
import os
import tempfile
import textwrap
from pathlib import Path

import pytest

# Add scripts to path
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "scripts"))

from perf_bottleneck_finder import (
    Bottleneck,
    PerfReport,
    find_slow_tests_by_scan,
    analyze_build_artifacts,
    analyze_makefile_targets,
    find_slow_imports,
    generate_report,
    format_markdown,
    severity_sort_key,
)


class TestBottleneck:
    """Test Bottleneck dataclass."""

    def test_creation(self):
        b = Bottleneck(
            category="test",
            name="test_foo",
            duration_s=5.0,
            severity="warning",
            recommendation="Mock it"
        )
        assert b.category == "test"
        assert b.name == "test_foo"
        assert b.duration_s == 5.0
        assert b.severity == "warning"
        assert b.recommendation == "Mock it"
        assert b.file_path is None
        assert b.line_number is None

    def test_with_location(self):
        b = Bottleneck(
            category="test",
            name="test_bar",
            duration_s=2.0,
            severity="info",
            recommendation="Consider",
            file_path="tests/test_bar.py",
            line_number=42
        )
        assert b.file_path == "tests/test_bar.py"
        assert b.line_number == 42

    def test_to_dict(self):
        b = Bottleneck("test", "x", 1.0, "info", "y")
        d = b.__dict__
        assert "category" in d
        assert "duration_s" in d


class TestPerfReport:
    """Test PerfReport dataclass."""

    def test_creation(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo"
        )
        assert report.timestamp == "2026-01-01T00:00:00Z"
        assert report.bottlenecks == []
        assert report.summary == {}

    def test_to_dict(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            bottlenecks=[Bottleneck("test", "x", 1.0, "info", "y")]
        )
        d = report.to_dict()
        assert "bottlenecks" in d
        assert len(d["bottlenecks"]) == 1


class TestSeveritySort:
    """Test severity sorting."""

    def test_critical_first(self):
        items = [
            Bottleneck("test", "a", 1.0, "info", ""),
            Bottleneck("test", "b", 0.5, "critical", ""),
            Bottleneck("test", "c", 2.0, "warning", ""),
        ]
        items.sort(key=severity_sort_key)
        assert items[0].severity == "critical"
        assert items[1].severity == "warning"
        assert items[2].severity == "info"

    def test_duration_within_severity(self):
        items = [
            Bottleneck("test", "slow", 10.0, "warning", ""),
            Bottleneck("test", "fast", 1.0, "warning", ""),
        ]
        items.sort(key=severity_sort_key)
        assert items[0].name == "slow"  # Higher duration first within same severity


class TestSlowTestScan:
    """Test slow test pattern scanning."""

    def test_finds_sleep(self, tmp_path):
        test_file = tmp_path / "test_sleepy.py"
        test_file.write_text(textwrap.dedent('''
            import time

            def test_slow():
                time.sleep(5)
                assert True
        '''))

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("sleep" in b.recommendation.lower() for b in bottlenecks)

    def test_finds_http_calls(self, tmp_path):
        test_file = tmp_path / "test_http.py"
        test_file.write_text(textwrap.dedent('''
            import requests

            def test_api():
                resp = requests.get("https://example.com")
                assert resp.status_code == 200
        '''))

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("HTTP" in b.recommendation or "mock" in b.recommendation.lower() for b in bottlenecks)

    def test_skips_non_test_files(self, tmp_path):
        src_file = tmp_path / "main.py"
        src_file.write_text("import time\ntime.sleep(10)\n")

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) == 0

    def test_handles_missing_dir(self):
        bottlenecks = find_slow_tests_by_scan("/nonexistent/path")
        assert bottlenecks == []

    def test_file_path_populated(self, tmp_path):
        test_file = tmp_path / "test_example.py"
        test_file.write_text("import time\n\ndef test_it():\n    time.sleep(2)\n")

        bottlenecks = find_slow_tests_by_scan(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert bottlenecks[0].file_path is not None
        assert bottlenecks[0].line_number is not None


class TestBuildArtifacts:
    """Test build artifact analysis."""

    def test_finds_large_node_modules(self, tmp_path):
        nm = tmp_path / "node_modules"
        nm.mkdir()
        # Create a file > 10MB
        big_file = nm / "big.txt"
        big_file.write_bytes(b"x" * (11 * 1024 * 1024))

        bottlenecks = analyze_build_artifacts(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("node_modules" in b.name for b in bottlenecks)

    def test_ignores_small_dirs(self, tmp_path):
        nm = tmp_path / "node_modules"
        nm.mkdir()
        small_file = nm / "small.txt"
        small_file.write_bytes(b"x" * 100)

        bottlenecks = analyze_build_artifacts(str(tmp_path))
        assert not any("node_modules" in b.name for b in bottlenecks)

    def test_finds_pycache(self, tmp_path):
        cache = tmp_path / "__pycache__"
        cache.mkdir()
        big_file = cache / "big.pyc"
        big_file.write_bytes(b"x" * (11 * 1024 * 1024))

        bottlenecks = analyze_build_artifacts(str(tmp_path))
        assert any("__pycache__" in b.name for b in bottlenecks)


class TestMakefileAnalysis:
    """Test Makefile analysis."""

    def test_finds_pip_install(self, tmp_path):
        makefile = tmp_path / "Makefile"
        makefile.write_text(textwrap.dedent('''
            install:
            \tpip install -r requirements.txt

            test:
            \tpytest
        '''))

        bottlenecks = analyze_makefile_targets(str(tmp_path))
        assert len(bottlenecks) >= 1

    def test_no_makefile(self, tmp_path):
        bottlenecks = analyze_makefile_targets(str(tmp_path))
        assert bottlenecks == []


class TestImportAnalysis:
    """Test heavy import detection."""

    def test_finds_pandas(self, tmp_path):
        src = tmp_path / "analysis.py"
        src.write_text("import pandas as pd\n")

        bottlenecks = find_slow_imports(str(tmp_path))
        assert len(bottlenecks) >= 1
        assert any("pandas" in b.name for b in bottlenecks)

    def test_finds_torch(self, tmp_path):
        src = tmp_path / "model.py"
        src.write_text("import torch\n")

        bottlenecks = find_slow_imports(str(tmp_path))
        assert any("torch" in b.name for b in bottlenecks)

    def test_skips_light_imports(self, tmp_path):
        src = tmp_path / "utils.py"
        src.write_text("import json\nimport os\nimport sys\n")

        bottlenecks = find_slow_imports(str(tmp_path))
        assert len(bottlenecks) == 0


class TestGenerateReport:
    """Test full report generation."""

    def test_empty_repo(self, tmp_path):
        report = generate_report(str(tmp_path))
        assert report.summary["total_bottlenecks"] >= 0
        assert "critical" in report.summary
        assert "warning" in report.summary

    def test_with_findings(self, tmp_path):
        # Create a test file with issues
        test_file = tmp_path / "test_slow.py"
        test_file.write_text(textwrap.dedent('''
            import time
            import requests

            def test_sleepy():
                time.sleep(3)

            def test_http():
                requests.get("https://example.com")
        '''))

        report = generate_report(str(tmp_path))
        assert report.summary["total_bottlenecks"] >= 2
        assert len(report.bottlenecks) > 0

    def test_summary_categories(self, tmp_path):
        report = generate_report(str(tmp_path))
        assert "by_category" in report.summary


class TestMarkdownReport:
    """Test markdown output."""

    def test_format(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            bottlenecks=[
                Bottleneck("test", "slow_test", 5.0, "critical", "Fix it")
            ],
            summary={
                "total_bottlenecks": 1,
                "critical": 1,
                "warning": 0,
                "info": 0,
                "estimated_total_slowdown_s": 5.0,
                "by_category": {"test": 1},
            }
        )
        md = format_markdown(report)
        assert "# Performance Bottleneck Report" in md
        assert "slow_test" in md
        assert "🔴" in md
        assert "Fix it" in md

    def test_empty_report(self):
        report = PerfReport(
            timestamp="2026-01-01T00:00:00Z",
            repo_path="/tmp/repo",
            summary={
                "total_bottlenecks": 0,
                "critical": 0,
                "warning": 0,
                "info": 0,
                "estimated_total_slowdown_s": 0,
                "by_category": {},
            }
        )
        md = format_markdown(report)
        assert "Total bottlenecks:** 0" in md
@@ -1,108 +0,0 @@
"""
Tests for quality_gate.py — Knowledge entry quality scoring.
"""

import unittest
from datetime import datetime, timezone, timedelta

import sys
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent))

from quality_gate import (
    score_specificity,
    score_actionability,
    score_freshness,
    score_source_quality,
    score_entry,
    filter_entries,
)


class TestScoreSpecificity(unittest.TestCase):
    def test_specific_content_scores_high(self):
        content = "Run `python3 deploy.py --env prod` on 2026-04-15. Example: step 1 configure nginx."
        score = score_specificity(content)
        self.assertGreater(score, 0.6)

    def test_vague_content_scores_low(self):
        content = "It generally depends. Various factors might affect this. Basically, it varies."
        score = score_specificity(content)
        self.assertLess(score, 0.5)

    def test_empty_scores_baseline(self):
        score = score_specificity("")
        self.assertAlmostEqual(score, 0.5, delta=0.1)


class TestScoreActionability(unittest.TestCase):
    def test_actionable_content_scores_high(self):
        content = "1. Run `pip install -r requirements.txt`\n2. Execute `python3 train.py`\n3. Verify with `pytest`"
        score = score_actionability(content)
        self.assertGreater(score, 0.6)

    def test_abstract_content_scores_low(self):
        content = "The concept of intelligence is fascinating and multifaceted."
        score = score_actionability(content)
        self.assertLess(score, 0.5)


class TestScoreFreshness(unittest.TestCase):
    def test_recent_timestamp_scores_high(self):
        recent = datetime.now(timezone.utc).isoformat()
        score = score_freshness(recent)
        self.assertGreater(score, 0.9)

    def test_old_timestamp_scores_low(self):
        old = (datetime.now(timezone.utc) - timedelta(days=365)).isoformat()
        score = score_freshness(old)
        self.assertLess(score, 0.2)

    def test_none_returns_baseline(self):
        score = score_freshness(None)
        self.assertEqual(score, 0.5)


class TestScoreSourceQuality(unittest.TestCase):
    def test_claude_scores_high(self):
        self.assertGreater(score_source_quality("claude-sonnet"), 0.85)

    def test_ollama_scores_lower(self):
        self.assertLess(score_source_quality("ollama"), 0.7)

    def test_unknown_returns_default(self):
        self.assertEqual(score_source_quality("unknown"), 0.5)


class TestScoreEntry(unittest.TestCase):
    def test_good_entry_scores_high(self):
        entry = {
            "content": "To deploy: run `kubectl apply -f deployment.yaml`. Verify with `kubectl get pods`.",
            "model": "claude-sonnet",
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        score = score_entry(entry)
        self.assertGreater(score, 0.6)

    def test_poor_entry_scores_low(self):
        entry = {
            "content": "It depends. Various things might happen.",
            "model": "unknown",
        }
        score = score_entry(entry)
        self.assertLess(score, 0.5)


class TestFilterEntries(unittest.TestCase):
    def test_filters_low_quality(self):
        entries = [
            {"content": "Run `deploy.py` to fix the issue.", "model": "claude"},
            {"content": "It might work sometimes.", "model": "unknown"},
            {"content": "Configure nginx: step 1 edit nginx.conf", "model": "gpt-4"},
        ]
        filtered = filter_entries(entries, threshold=0.5)
        self.assertGreaterEqual(len(filtered), 2)


if __name__ == "__main__":
    unittest.main()