86 Commits

Author SHA1 Message Date
e8359cf10a feat: automation opportunity finder (#170)
Analyzes cron jobs, docs, scripts, session transcripts, and shell history to find manual processes that could be automated.

Outputs ranked proposals with confidence scores and impact ratings.
2026-04-15 14:51:29 +00:00
b3592e14ad test: add tests for Performance Bottleneck Finder
Refs #171
2026-04-15 14:48:59 +00:00
f1175df79d test: add improvement proposal generator tests (#168) 2026-04-15 14:47:30 +00:00
be805a1b4c feat: add Performance Bottleneck Finder (#171)
Analyzes: slow tests, build artifacts, CI workflows, heavy imports.
Outputs: markdown report or JSON. Designed for weekly cron.

Closes #171
2026-04-15 14:47:27 +00:00
1d47665dd4 feat: add improvement proposal generator (#168) 2026-04-15 14:47:26 +00:00
5eab5e4aac test: knowledge gap identifier tests (#172) 2026-04-15 14:42:30 +00:00
71dd801575 feat: knowledge gap identifier — Pipeline 10.7 (#172) 2026-04-15 14:42:28 +00:00
e6f1b07f16 Merge pull request 'feat: Knowledge store staleness detector (closes #179)' (#185) from feat/179-staleness-check into main 2026-04-15 06:09:14 +00:00
81c02f6709 feat: Add staleness detector tests (closes #179) 2026-04-15 04:00:46 +00:00
c2c3c6a3b9 feat: Add knowledge staleness detector (closes #179) 2026-04-15 04:00:12 +00:00
d664119b9c feat: Add diff analyzer tests (closes #176) 2026-04-15 03:57:21 +00:00
764414d4d5 feat: Add diff analyzer (closes #176) 2026-04-15 03:56:27 +00:00
54f3bef7fc feat: Add parser tests (closes #177) 2026-04-15 03:50:04 +00:00
4fcd372de4 feat: Add Gitea issue body parser (closes #177) 2026-04-15 03:49:00 +00:00
77a753f6f2 feat: dead code detector for Python codebases (#94) 2026-04-15 03:46:43 +00:00
cbebd93cbb feat: cross-repo dependency graph builder (#93) 2026-04-15 03:44:12 +00:00
b36f617d4a test: add tests for session pair harvester (#91) 2026-04-15 03:39:09 +00:00
b5466dc938 feat: session transcript → training pair harvester (#91) 2026-04-15 03:39:08 +00:00
55797c8a3e feat: add sampler.py — session value scorer (#17) 2026-04-15 03:02:12 +00:00
7342fc7cb2 fix(#7): full test harness for knowledge extraction
- 8 tests: structure, validation, hallucination, duplicates, failed sessions
- validate_extraction() checks all required fields + meta block
- validate_transcript_coverage() heuristic hallucination detection
- CLI: --validate FILE for checking existing extractions
- 3 sample transcripts for testing
2026-04-15 00:22:55 +00:00
206cfbb498 fix(#7): redesign knowledge extraction prompt
- Tightened to ~700 tokens (target: ~1k)
- Added evidence field: every fact must cite transcript source
- Added meta block: session_outcome, tools_used, repos_touched
- Explicit handling of partial/failed sessions
- Front-loaded rules before transcript for mimo-v2-pro

Closes #7
2026-04-15 00:22:39 +00:00
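The two fix(#7) commits above mention a required meta block (session_outcome, tools_used, repos_touched), an evidence field on every fact, and a heuristic hallucination check. A minimal sketch of what such checks could look like follows. The function names, the top-level "facts" key, and the substring-matching heuristic are assumptions for illustration, not the repository's actual validate_extraction() or validate_transcript_coverage() code.

```python
# Hypothetical sketch: names and data layout are assumptions, not the repo's API.
META_KEYS = ("session_outcome", "tools_used", "repos_touched")


def missing_meta_keys(extraction):
    """Return the meta keys that are absent from an extraction dict."""
    meta = extraction.get("meta")
    if not isinstance(meta, dict):
        return list(META_KEYS)
    return [key for key in META_KEYS if key not in meta]


def _normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return " ".join(str(text).lower().split())


def unsupported_facts(extraction, transcript):
    """Flag facts whose evidence string does not occur in the transcript.

    Assumed heuristic: a fact with empty evidence, or evidence that cannot be
    found verbatim (after normalization) in the transcript, is suspicious.
    """
    haystack = _normalize(transcript)
    flagged = []
    for fact in extraction.get("facts", []):
        evidence = _normalize(fact.get("evidence", ""))
        if not evidence or evidence not in haystack:
            flagged.append(fact)
    return flagged
```

A substring match is deliberately strict. It will flag paraphrased evidence as unsupported, which may or may not be the trade-off the real harness makes.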
cdb71adddf docs: GENOME.md — full codebase analysis #676 2026-04-14 22:58:55 +00:00
160dfcf419 feat: add session_metadata.py — structured session metadata extractor (#6) 2026-04-14 19:06:16 +00:00
8d716ff03f Add comprehensive test script for harvest prompt validation 2026-04-14 19:02:41 +00:00
920510996e Add test session 5: Session with questions 2026-04-14 19:01:03 +00:00
1fafeaf5a4 Add test session 4: Session with patterns 2026-04-14 19:01:00 +00:00
36b440f998 Add test session 3: Partial session with tool quirks 2026-04-14 19:00:58 +00:00
9f3caabf42 Add test session 2: Failed session with pitfalls 2026-04-14 19:00:56 +00:00
a21f3a44e1 Add test session 1: Successful session 2026-04-14 18:58:05 +00:00
Timmy
b32d316023 feat(#10): knowledge file format schema + example knowledge files
- SCHEMA.md: full specification for index.json and YAML knowledge files
- knowledge/global/pitfalls.yaml: 8 cross-repo pitfalls
- knowledge/global/tool-quirks.yaml: 7 environment quirk facts
- knowledge/repos/hermes-agent.yaml: 8 per-repo pitfalls (cron, paths, SSH)
- knowledge/repos/the-nexus.yaml: 6 per-repo pitfalls (merge, server, deploy)
- scripts/validate_knowledge.py: schema validator (29 facts, all passing)
- knowledge/index.json: populated with 29 seed facts from real fleet data

Design decisions:
- YAML for humans, index.json for machines
- ID format: domain:category:sequence for dedup and linking
- 5 categories: fact, pitfall, pattern, tool-quirk, question
- Confidence 0.0-1.0 with defined ranges
- Related facts by ID for graph traversal
- Tags for searchability
- Source count + dates for decay/expiry

Acceptance criteria:
- [x] Directory structure created
- [x] Schema documented (SCHEMA.md)
- [x] index.json with real facts (29 total)
- [x] Example knowledge files for 2 repos (hermes-agent, the-nexus)
- [x] Validation script passes
2026-04-14 14:21:21 -04:00
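The design decisions in the feat(#10) commit above (an ID of the form domain:category:sequence, five categories, confidence between 0.0 and 1.0) can be sketched as a small validator. This is an illustration only: the field names "id", "category", and "confidence", the numeric sequence, and the error messages are assumptions, and scripts/validate_knowledge.py may work differently.

```python
import re

# The five categories listed in the commit message.
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

# Assumption: three colon-separated parts, with a numeric sequence at the end.
ID_PATTERN = re.compile(r"^[^:]+:[^:]+:\d+$")


def validate_fact(fact):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []

    if not ID_PATTERN.match(str(fact.get("id", ""))):
        errors.append("id must look like domain:category:sequence")

    if fact.get("category") not in CATEGORIES:
        errors.append("category must be one of: " + ", ".join(sorted(CATEGORIES)))

    confidence = fact.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        errors.append("confidence must be a number between 0.0 and 1.0")

    return errors
```

For example, `validate_fact({"id": "hermes-agent:pitfall:001", "category": "pitfall", "confidence": 0.9})` returns an empty list, while a fact with a malformed ID, an unknown category, and a confidence of 1.5 returns three messages.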
Timmy
b65256bf76 feat: build bootstrapper.py - pre-session context assembler
Assembles relevant knowledge from the store into a compact 2k-token
context block for session injection.

Features:
- Filter by repo, agent type, and global scope
- Sort by confidence (pitfalls first, patterns, facts)
- Per-repo and per-agent markdown knowledge files
- Graceful empty-store handling
- JSON output mode for programmatic use
- Token-count-aware truncation at line boundaries

Closes #11
2026-04-14 14:05:30 -04:00
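The bootstrapper features above (ordering pitfalls before patterns before facts, sorting by confidence, and truncating at line boundaries within a token budget) can be sketched as follows. The fact layout, the function names, and the four-characters-per-token estimate are assumptions for illustration; the commit does not say how bootstrapper.py counts tokens.

```python
# Hypothetical sketch: not the actual bootstrapper.py implementation.
CATEGORY_ORDER = {"pitfall": 0, "pattern": 1, "fact": 2}


def estimate_tokens(text):
    """Rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def sort_facts(facts):
    """Order by category priority, then by descending confidence."""
    return sorted(
        facts,
        key=lambda f: (
            CATEGORY_ORDER.get(f["category"], len(CATEGORY_ORDER)),
            -f["confidence"],
        ),
    )


def build_context(facts, max_tokens=2000):
    """Assemble a context block, stopping before the budget is exceeded.

    Whole lines are either kept or dropped, so the output is never cut
    mid-line. An empty store yields an empty string.
    """
    lines = []
    used = 0
    for fact in sort_facts(facts):
        line = f"- [{fact['category']}] {fact['text']}"
        cost = estimate_tokens(line)
        if used + cost > max_tokens:
            break
        lines.append(line)
        used += cost
    return "\n".join(lines)
```

Because the facts are sorted before the budget is applied, the lowest-priority entries are the ones dropped when the store outgrows the 2k-token block.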
Alexander Whitestone
da073ad7cf feat: add harvester.py — session knowledge extractor (#8)
Main harvester module that chains:
  session_reader → extraction prompt → LLM → validate → deduplicate → store

Includes:
- scripts/harvester.py — main module (reader + prompt + storage pipeline)
- scripts/session_reader.py — JSONL transcript parser
- scripts/test_harvester_pipeline.py — smoke tests (all passing)

Pipeline:
  1. Read session JSONL via session_reader
  2. Truncate long sessions (first 50 + last 50 messages)
  3. Send transcript + extraction prompt to LLM (mimo-v2-pro)
  4. Parse structured JSON response (facts/pitfalls/patterns/quirks/questions)
  5. Validate fields + confidence threshold
  6. Deduplicate against knowledge/index.json (fingerprint + word overlap)
  7. Write to knowledge store (index.json + per-repo markdown)

CLI:
  Single:  python3 harvester.py --session <path> --output knowledge/
  Batch:   python3 harvester.py --batch --since 2026-04-01 --limit 100
  Dry-run: python3 harvester.py --session <path> --dry-run
2026-04-14 14:03:30 -04:00
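Steps 2 and 6 of the harvester pipeline above (keeping the first 50 and last 50 messages, and deduplicating by fingerprint plus word overlap) can be sketched as follows. The function names, the SHA-1 fingerprint over normalized text, the Jaccard overlap measure, and the 0.8 threshold are all assumptions for illustration; the commit message does not specify how harvester.py computes either check.

```python
import hashlib
import re


def truncate_messages(messages, head=50, tail=50):
    """Keep the first `head` and last `tail` messages of a long session."""
    if len(messages) <= head + tail:
        return list(messages)
    return list(messages[:head]) + list(messages[len(messages) - tail:])


def _tokens(text):
    """Lowercased alphanumeric words, in order of appearance."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fingerprint(text):
    """Hash of the normalized text (assumption: SHA-1 over joined words)."""
    normalized = " ".join(_tokens(text))
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def is_duplicate(candidate, existing, overlap_threshold=0.8):
    """True if the candidate matches an existing fact exactly or nearly.

    Exact match: identical fingerprints after normalization.
    Near match: Jaccard word overlap at or above the (assumed) threshold.
    """
    cand_fp = fingerprint(candidate)
    cand_words = set(_tokens(candidate))
    for fact in existing:
        if fingerprint(fact) == cand_fp:
            return True
        other_words = set(_tokens(fact))
        union = cand_words | other_words
        if union and len(cand_words & other_words) / len(union) >= overlap_threshold:
            return True
    return False
```

The fingerprint catches facts that differ only in case or punctuation cheaply, while the overlap check catches light rewordings at the cost of comparing against every stored fact.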
102ef67a8e Add test script for knowledge extraction prompt 2026-04-14 17:22:17 +00:00
d9f51b30a9 Add knowledge extraction prompt template for issue #7 2026-04-14 17:21:25 +00:00
Alexander Whitestone
b5873e9e3d Initial structure: knowledge store, scripts, metrics, templates 2026-04-14 11:17:01 -04:00
8252ef5b80 Initial commit 2026-04-14 15:11:53 +00:00