Compare commits

...

2 Commits

Author SHA1 Message Date
Alexander Whitestone
0626a3fc33 feat: add compounding-intelligence genome analysis (#676)
Some checks failed
Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 25s
Smoke Test / smoke (pull_request) Failing after 30s
Agent PR Gate / gate (pull_request) Failing after 38s
Agent PR Gate / report (pull_request) Successful in 10s
2026-04-21 03:27:24 -04:00
Alexander Whitestone
98f861b713 wip: add compounding-intelligence genome regression for #676 2026-04-21 03:23:05 -04:00
2 changed files with 595 additions and 0 deletions

View File

@@ -0,0 +1,534 @@
# GENOME.md — compounding-intelligence
*Generated: 2026-04-21 07:23:18 UTC | Refreshed for timmy-home #676 from `Timmy_Foundation/compounding-intelligence` @ `fe8a70a` on `main`*
## Project Overview
`compounding-intelligence` is a Python-first analysis toolkit for turning prior agent work into reusable fleet knowledge.
At a high level it does four things:
1. reads Hermes session transcripts and diff/session artifacts
2. extracts durable knowledge into a structured store
3. assembles bootstrap context for future sessions
4. mines the corpus for higher-order opportunities: automation, refactors, performance, knowledge gaps, and issue-priority changes
The repo's own README still presents the system as three largely planned pipelines. That is now stale.
Current repo truth from live inspection:
- tracked files: 56
- 33 Python files
- 15 test Python files
- Python LOC: 8,394
- workflow files: `.gitea/workflows/test.yml`
- persistent data fixtures: 5 JSONL files under `test_sessions/`
- existing target-repo genome already present upstream: `GENOME.md`
Most important architecture fact:
- this repo is no longer just prompt scaffolding for a future harvester/bootstrapper/measurer loop
- it already contains a growing family of concrete analysis engines under `scripts/`
Largest Python modules by size:
- `scripts/priority_rebalancer.py` — 682 lines
- `scripts/automation_opportunity_finder.py` — 554 lines
- `scripts/perf_bottleneck_finder.py` — 551 lines
- `scripts/improvement_proposals.py` — 451 lines
- `scripts/harvester.py` — 447 lines
- `scripts/bootstrapper.py` — 359 lines
- `scripts/sampler.py` — 353 lines
- `scripts/dead_code_detector.py` — 282 lines
## Architecture
The repo is best understood as three layers: ingestion, knowledge storage/bootstrap, and meta-analysis.
```mermaid
flowchart TD
A[Hermes session JSONL] --> B[session_reader.py]
B --> C[harvester.py]
B --> D[session_pair_harvester.py]
C --> E[knowledge/index.json]
C --> F[knowledge/global/*.yaml or .md]
C --> G[knowledge/repos/*.yaml]
C --> H[knowledge/agents/*]
E --> I[bootstrapper.py]
F --> I
G --> I
H --> I
I --> J[Bootstrapped session context]
E --> K[knowledge_staleness_check.py]
E --> L[priority_rebalancer.py]
E --> M[improvement_proposals.py]
N[test_sessions/*.jsonl] --> C
N --> D
N --> M
O[repo source tree] --> P[knowledge_gap_identifier.py]
O --> Q[dead_code_detector.py]
O --> R[automation_opportunity_finder.py]
O --> S[perf_bottleneck_finder.py]
O --> T[dependency_graph.py]
O --> U[diff_analyzer.py]
O --> V[refactoring_opportunity_finder.py]
W[Gitea issues API] --> L
L --> X[metrics/priority_report.json]
L --> Y[metrics/priority_suggestions.md]
```
What exists today:
- transcript parsing: `scripts/session_reader.py`
- knowledge extraction + dedup + writing: `scripts/harvester.py`
- context assembly: `scripts/bootstrapper.py`
- pair harvesting: `scripts/session_pair_harvester.py`
- staleness detection: `scripts/knowledge_staleness_check.py`
- gap analysis: `scripts/knowledge_gap_identifier.py`
- improvement mining: `scripts/improvement_proposals.py`
- automation mining: `scripts/automation_opportunity_finder.py`
- priority scoring against Gitea: `scripts/priority_rebalancer.py`
- diff scanning: `scripts/diff_analyzer.py`
- dead code analysis: `scripts/dead_code_detector.py`
What exists but is currently broken or incomplete:
- `scripts/refactoring_opportunity_finder.py` is still a stub that only emits sample proposals
- `scripts/perf_bottleneck_finder.py` does not parse
- `scripts/dependency_graph.py` does not parse
## Runtime Truth and Docs Drift
The repo ships its own `GENOME.md`, but that document is materially stale relative to the current codebase.
The strongest drift example:
- upstream `GENOME.md` says core pipeline scripts such as `harvester.py`, `bootstrapper.py`, `measurer.py`, and `session_reader.py` are planned or not yet implemented
- live source inspection shows `scripts/harvester.py`, `scripts/bootstrapper.py`, and `scripts/session_reader.py` are real, non-trivial implementations
- live source inspection also shows additional implemented engines not foregrounded by the README's original three-pipeline framing:
- `scripts/priority_rebalancer.py`
- `scripts/automation_opportunity_finder.py`
- `scripts/improvement_proposals.py`
- `scripts/knowledge_gap_identifier.py`
- `scripts/dead_code_detector.py`
- `scripts/session_pair_harvester.py`
- `scripts/diff_analyzer.py`
So the honest current description is:
- README = founding vision
- existing target-repo `GENOME.md` = partially outdated snapshot
- source + tests = current system truth
This is not a repo with only a single harvester/bootstrapper loop anymore. It is becoming a general-purpose compounding-analysis workbench.
## Entry Points
### 1. CI / canonical test entry point
The only checked-in workflow is `.gitea/workflows/test.yml`.
It installs:
- `requirements.txt`
Then runs:
```bash
make test
```
The Makefile defines:
```make
python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v
```
This is the repo's canonical automation contract today.
### 2. Knowledge extraction entry point
`scripts/harvester.py`
Docstring usage:
```bash
python3 harvester.py --session ~/.hermes/sessions/session_xxx.jsonl --output knowledge/
python3 harvester.py --batch --since 2026-04-01 --limit 100
python3 harvester.py --session session.jsonl --dry-run
```
This is the main LLM-integrated path.
### 3. Session bootstrap entry point
`scripts/bootstrapper.py`
Docstring usage:
```bash
python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
python3 bootstrapper.py --repo timmy-home --global
python3 bootstrapper.py --global
python3 bootstrapper.py --repo the-nexus --max-tokens 1000
```
### 4. Priority rebalancer entry point
`scripts/priority_rebalancer.py`
Docstring usage:
```bash
python3 scripts/priority_rebalancer.py --org Timmy_Foundation
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --repo compounding-intelligence
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --dry-run
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --apply
```
### 5. Secondary analysis engines
Additional operational entry points exist in `scripts/`:
- `automation_opportunity_finder.py`
- `improvement_proposals.py`
- `knowledge_gap_identifier.py`
- `knowledge_staleness_check.py`
- `dead_code_detector.py`
- `diff_analyzer.py`
- `sampler.py`
- `gitea_issue_parser.py`
- `session_pair_harvester.py`
### 6. Seed knowledge content
The knowledge store is not empty scaffolding.
Concrete checked-in knowledge already exists at:
- `knowledge/repos/hermes-agent.yaml`
- `knowledge/repos/the-nexus.yaml`
- `knowledge/global/pitfalls.yaml`
- `knowledge/global/tool-quirks.yaml`
- `knowledge/index.json`
- `knowledge/SCHEMA.md`
## Data Flow
### Flow A — transcript to durable knowledge
1. Raw session JSONL enters via `scripts/session_reader.py`.
2. `read_session()` loads the transcript.
3. `extract_conversation()` strips to meaningful user/assistant/system turns.
4. `truncate_for_context()` compresses long sessions to head + tail.
5. `messages_to_text()` converts structured turns to a plain-text transcript block.
6. `scripts/harvester.py` loads `templates/harvest-prompt.md`.
7. The harvester calls an LLM endpoint, parses the JSON response, validates facts, fingerprints them, deduplicates, then writes `knowledge/index.json` and human-readable per-domain files.
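As a rough illustration of step 7's dedup stage, here is a minimal sketch built around the repo's `fact_fingerprint()` and `deduplicate()` names; the normalization and hashing details are my assumptions, not the actual implementation:

```python
import hashlib

def fact_fingerprint(fact: dict) -> str:
    """Stable fingerprint: whitespace-normalized, lowercased fact text plus repo scope."""
    normalized = " ".join(fact["fact"].lower().split())
    key = f"{fact.get('repo', 'global')}::{normalized}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

def deduplicate(new_facts: list[dict], index: list[dict]) -> list[dict]:
    """Drop incoming facts whose fingerprint already exists in the knowledge index."""
    seen = {fact_fingerprint(f) for f in index}
    unique = []
    for fact in new_facts:
        fp = fact_fingerprint(fact)
        if fp not in seen:
            seen.add(fp)
            unique.append(fact)
    return unique

index = [{"fact": "CI runs make test", "repo": "compounding-intelligence"}]
incoming = [
    # duplicate after whitespace normalization
    {"fact": "CI runs  make   test", "repo": "compounding-intelligence"},
    {"fact": "yaml is an undeclared dependency", "repo": "compounding-intelligence"},
]
print(len(deduplicate(incoming, index)))  # → 1
```

The value of a fingerprint like this is that near-identical restatements of the same fact across sessions collapse into one stored entry.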
### Flow B — durable knowledge to session bootstrap
1. `scripts/bootstrapper.py` loads `knowledge/index.json`.
2. It filters facts by repo, agent, and global scope.
3. It sorts them by confidence and category priority.
4. It optionally merges markdown knowledge from repo-specific, agent-specific, and global files.
5. It truncates the result to a token budget and emits a bootstrap context block.
### Flow C — corpus to meta-analysis
Several scripts mine the repo and/or session corpus for second-order leverage:
- `scripts/improvement_proposals.py` mines repeated errors, slow tools, manual processes, and retries into proposal objects
- `scripts/automation_opportunity_finder.py` scans transcripts, scripts, docs, and cron jobs for automatable work
- `scripts/knowledge_gap_identifier.py` cross-references code, docs, and tests
- `scripts/priority_rebalancer.py` combines knowledge signals, staleness signals, metrics, and Gitea issues into suggested priority shifts
### Flow D — repo/static inspection
- `scripts/dead_code_detector.py` walks Python ASTs and optionally uses git blame
- `scripts/diff_analyzer.py` parses patches into structured change objects
- `scripts/dependency_graph.py` is intended to scan repos and emit JSON / Mermaid / DOT dependency graphs, but is currently syntactically broken
- `scripts/perf_bottleneck_finder.py` is intended to scan tests/build/CI for bottlenecks, but is currently syntactically broken
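To make the AST-walking approach concrete, here is a deliberately simplified sketch of the kind of pass `dead_code_detector.py` performs; this heuristic (top-level functions never referenced by name) is mine, not the script's actual algorithm:

```python
import ast

def unreferenced_functions(source: str) -> set[str]:
    """Very rough dead-code pass: defined functions whose names never appear elsewhere."""
    tree = ast.parse(source)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    used |= {n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)}
    return defined - used

code = """
def used():
    return 1

def dead():
    return 2

print(used())
"""
print(unreferenced_functions(code))  # → {'dead'}
```

A real detector needs more care (dynamic dispatch, `getattr`, exports), which is presumably why the repo's version also consults git blame.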
## Key Abstractions
### Knowledge item
Defined in practice by `templates/harvest-prompt.md`, `scripts/harvester.py`, and `knowledge/SCHEMA.md`.
Important fields:
- `fact`
- `category`
- `repo` / domain
- `confidence`
- source/evidence metadata
Categories consistently used across the repo:
- fact
- pitfall
- pattern
- tool-quirk
- question
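The fields and categories above imply a validation contract. A minimal sketch of what `validate_fact()` might enforce (the field rules here are my assumptions; only the category list comes from the repo):

```python
ALLOWED_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_fact(fact: dict) -> bool:
    """Reject facts missing required fields or using unknown categories."""
    if not isinstance(fact.get("fact"), str) or not fact["fact"].strip():
        return False
    if fact.get("category") not in ALLOWED_CATEGORIES:
        return False
    confidence = fact.get("confidence")
    return isinstance(confidence, (int, float)) and 0.0 <= confidence <= 1.0

good = {"fact": "make test is the CI contract", "category": "fact", "confidence": 0.8}
bad = {"fact": "", "category": "rumor", "confidence": 2.0}
print(validate_fact(good), validate_fact(bad))  # → True False
```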
### Session transcript model
`session_reader.py` treats JSONL transcripts as ordered message sequences with:
- role
- content
- timestamp
- optional multimodal text extraction
- optional tool-call metadata
This module is the ingestion foundation for the rest of the system.
### Knowledge store
The repo uses a two-layer representation:
1. machine-readable index: `knowledge/index.json`
2. human-editable domain files: YAML/markdown under `knowledge/global/`, `knowledge/repos/`, and `knowledge/agents/`
`knowledge/SCHEMA.md` is the contract for that store.
### Bootstrap context
`bootstrapper.py` makes the design concrete:
- `filter_facts()` narrows by repo/agent/global scope
- `sort_facts()` orders by confidence and category priority
- `render_facts_section()` groups output by category
- `estimate_tokens()` and `truncate_to_tokens()` implement the context-window budget
- `build_bootstrap_context()` assembles the final injected context block
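The token-budget pieces are worth illustrating. A sketch of what `estimate_tokens()` and `truncate_to_tokens()` could look like, assuming a crude characters-per-token heuristic (the real functions may differ):

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def truncate_to_tokens(lines: list[str], budget: int) -> str:
    """Append whole lines until the estimated token budget is exhausted."""
    out, used = [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if used + cost > budget:
            break  # never split a line mid-way
        out.append(line)
        used += cost
    return "\n".join(out)

lines = ["## Pitfalls", "- never force-push to main", "- CI only runs make test"]
context = truncate_to_tokens(lines, budget=10)
print(context)
```

Truncating on whole lines keeps the emitted context block well-formed even when the budget cuts off mid-section.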
### Harvester dedup and validation
The central harvester abstractions are not classes but functions:
- `parse_extraction_response()`
- `fact_fingerprint()`
- `deduplicate()`
- `validate_fact()`
- `write_knowledge()`
- `harvest_session()`
This makes the core pipeline easy to test in pieces.
### Priority scoring model
`priority_rebalancer.py` introduces explicit data models:
- `IssueScore`
- `PipelineSignal`
- `GiteaClient`
That script is important because it bridges the local knowledge store to live Gitea issue state.
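To show the shape of that bridge, here is a hypothetical sketch of `IssueScore` and `PipelineSignal` as weighted-signal dataclasses; the field names and scoring formula are my assumptions, not the script's actual definitions:

```python
from dataclasses import dataclass, field

@dataclass
class PipelineSignal:
    """One signal feeding an issue's score: staleness, knowledge hits, etc."""
    name: str
    weight: float
    value: float

@dataclass
class IssueScore:
    """Aggregated priority score for a single Gitea issue."""
    issue_number: int
    title: str
    signals: list[PipelineSignal] = field(default_factory=list)

    @property
    def score(self) -> float:
        # Simple weighted sum over all contributing signals.
        return sum(s.weight * s.value for s in self.signals)

score = IssueScore(676, "genome analysis", [
    PipelineSignal("staleness", 0.5, 1.0),
    PipelineSignal("knowledge-hits", 0.3, 2.0),
])
print(round(score.score, 2))  # → 1.1
```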
### Gap report model
`knowledge_gap_identifier.py` formalizes another analysis lane with:
- `GapSeverity`
- `GapType`
- `Gap`
- `GapReport`
- `KnowledgeGapIdentifier`
This is one of the clearest examples that the repo has moved beyond a single harvester/bootstrapper loop into a platform of analyzers.
## API Surface
This repo is primarily a CLI/library surface, not a long-running service.
### Core CLIs
- `scripts/harvester.py`
- `scripts/bootstrapper.py`
- `scripts/priority_rebalancer.py`
- `scripts/improvement_proposals.py`
- `scripts/automation_opportunity_finder.py`
- `scripts/knowledge_staleness_check.py`
- `scripts/dead_code_detector.py`
- `scripts/diff_analyzer.py`
- `scripts/gitea_issue_parser.py`
- `scripts/session_pair_harvester.py`
### External API dependencies
- LLM chat-completions endpoint in `scripts/harvester.py`
- Gitea REST API in `scripts/priority_rebalancer.py`
### File-format APIs
- session input: JSONL files under `test_sessions/`
- knowledge schema: `knowledge/SCHEMA.md`
- extraction prompt contract: `templates/harvest-prompt.md`
- machine store: `knowledge/index.json`
- repo knowledge examples:
- `knowledge/repos/hermes-agent.yaml`
- `knowledge/repos/the-nexus.yaml`
### Output artifacts
Documented or implied outputs include:
- `knowledge/index.json`
- repo/global/agent knowledge files
- `metrics/priority_report.json`
- `metrics/priority_suggestions.md`
- text/markdown/json proposal reports
## Test Coverage Gaps
### Current verified state
I verified the repo in three layers.
#### Layer 1 — focused passing slice
Command run:
```bash
python3 -m pytest \
scripts/test_bootstrapper.py \
scripts/test_harvester_pipeline.py \
scripts/test_session_pair_harvester.py \
scripts/test_knowledge_staleness.py \
scripts/test_improvement_proposals.py \
scripts/test_automation_opportunity_finder.py \
scripts/test_gitea_issue_parser.py \
tests/test_ci_config.py \
tests/test_knowledge_gap_identifier.py -q
```
Result:
- `70 passed`
This proves the repo has substantial working logic today.
#### Layer 2 — canonical CI command
Command run:
```bash
make test
```
Result:
- CI command collected 76 items and failed during collection with 1 error
- failure source: `scripts/test_refactoring_opportunity_finder.py`
- exact issue filed: `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/210`
#### Layer 3 — full test collection
Commands run:
```bash
python3 -m pytest --collect-only -q
python3 -m pytest -q
```
Result:
- `86 tests collected, 2 errors`
- collection blockers:
1. `scripts/test_refactoring_opportunity_finder.py` expects a real refactoring API that `scripts/refactoring_opportunity_finder.py` does not implement
2. `tests/test_perf_bottleneck_finder.py` cannot import `scripts/perf_bottleneck_finder.py` due to a `SyntaxError`
Additional verification:
```bash
python3 -m py_compile scripts/perf_bottleneck_finder.py
python3 -m py_compile scripts/dependency_graph.py
```
Both fail.
Filed follow-ups:
- `compounding-intelligence/issues/210` — refactoring finder API missing
- `compounding-intelligence/issues/211` — `scripts/perf_bottleneck_finder.py` SyntaxError
- `compounding-intelligence/issues/212` — `scripts/dependency_graph.py` SyntaxError
### What is well covered
Strongly exercised subsystems include:
- bootstrapper logic
- harvester pipeline helpers
- session pair harvesting
- knowledge staleness checking
- improvement proposal generation
- automation opportunity mining
- Gitea issue parsing
- CI configuration contract
- knowledge gap analysis
### What is weak or broken
1. `scripts/refactoring_opportunity_finder.py`
- current implementation is a sample stub
- tests expect real complexity and scoring helpers
2. `scripts/perf_bottleneck_finder.py`
- parser broken before runtime
- test module exists but cannot import target script
3. `scripts/dependency_graph.py`
- parser broken before runtime
- no active test lane caught it before this analysis
4. CI scope gap
- `.gitea/workflows/test.yml` runs `make test`
- `make test` does not cover every `tests/*.py` module
- specifically, `tests/test_perf_bottleneck_finder.py` sits outside the Makefile target and the syntax break only shows up when running broader pytest commands
5. warning hygiene
- `scripts/test_priority_rebalancer.py` emits repeated `datetime.utcnow()` deprecation warnings under Python 3.12
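The deprecation noise has a standard fix on Python 3.12: replace the naive `datetime.utcnow()` call with an explicitly timezone-aware one. A minimal before/after sketch:

```python
from datetime import datetime, timezone

# Deprecated on Python 3.12 (returns a naive datetime):
# stamp = datetime.utcnow()

# Preferred: an explicitly timezone-aware UTC timestamp.
stamp = datetime.now(timezone.utc)
print(stamp.tzinfo)  # → UTC
```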
## Security Considerations
1. Secret extraction risk
- this repo is literally designed to ingest transcripts and distill knowledge
- if the harvester prompt or filtering logic misses a credential, the system can preserve secrets into the knowledge store
- the risk is explicitly recognized in the target repo's existing `GENOME.md`, but enforcement still depends on implementation discipline
2. Knowledge poisoning
- the system trusts transcripts as source material for compounding facts
- confidence scores and evidence fields help, but there is no hard verification layer proving extracted facts are true before reuse
3. Cross-repo sensitivity
- seeded files such as `knowledge/repos/hermes-agent.yaml` and `knowledge/repos/the-nexus.yaml` store operational quirks and deployment pitfalls
- that is high-value knowledge and can also expose internal operational assumptions if shared broadly
4. External API use
- `scripts/harvester.py` depends on an LLM API endpoint and local key discovery
- `scripts/priority_rebalancer.py` talks to the Gitea API with write-capable operations such as labels and comments
- these scripts deserve careful credential-handling and least-privilege tokens
5. Transcript privacy
- session JSONL can contain user content, repo details, operational mistakes, and potentially sensitive environment facts
- durable storage multiplies the blast radius of accidental retention
## Dependencies
Explicit repo dependency file:
- `requirements.txt` — contains only `pytest>=8,<9`
Observed runtime/import dependencies from source:
- Python stdlib-heavy design: `json`, `argparse`, `pathlib`, `urllib`, `ast`, `datetime`, `hashlib`, `subprocess`, `collections`, `re`
- `yaml` imported by `scripts/automation_opportunity_finder.py`
Important dependency note:
- `requirements.txt` only declares pytest
- static source inspection shows `yaml` usage, which implies an undeclared dependency on PyYAML or equivalent
- I did not prove a clean-environment failure because the local environment already had `yaml` importable during targeted tests
- this is best treated as dependency drift to verify in a clean environment
## Deployment
This is not a traditional server deployment repo.
Operational modes are:
1. local CLI execution of scripts under `scripts/`
2. CI execution via `.gitea/workflows/test.yml`
3. file-based knowledge store mutation under `knowledge/`
Canonical repo commands observed:
```bash
make test
python3 -m pytest -q
python3 -m pytest --collect-only -q
python3 ~/.hermes/pipelines/codebase-genome.py --path /tmp/compounding-intelligence-676 --output /tmp/compounding-intelligence-676-base-GENOME.md
```
There is no checked-in Dockerfile, packaging metadata, or service runner. The repo behaves more like an internal analysis toolkit than an application service.
## Technical Debt
1. Docs/runtime drift
- README and target-repo `GENOME.md` still describe a repo that is less implemented than reality
- this makes the project look earlier-stage than the current source actually is
2. Broken parser state in two flagship analyzers
- `scripts/perf_bottleneck_finder.py`
- `scripts/dependency_graph.py`
3. Stub-vs-test mismatch
- `scripts/refactoring_opportunity_finder.py` is a placeholder
- `scripts/test_refactoring_opportunity_finder.py` assumes a mature implementation
4. CI blind spot
- `make test` does not represent full-repo pytest health
- broader collection surfaces more problems than the workflow currently enforces
5. Dependency declaration drift
- `yaml` appears in source while `requirements.txt` only lists pytest
6. Warning debt
- `datetime.utcnow()` deprecation noise in `scripts/test_priority_rebalancer.py`
7. Existing target-repo genome drift
- checked-in `GENOME.md` already exists on upstream main, but it undersells the real code surface and should not be treated as authoritative without fresh source verification
## Key Findings
1. `compounding-intelligence` has already evolved into a multi-engine analysis toolkit, not just a future three-pipeline concept.
2. The most grounded working path today is transcript → `session_reader.py` → `harvester.py` / `bootstrapper.py` with a structured knowledge store.
3. The repo has real, working higher-order analyzers beyond harvesting: `knowledge_gap_identifier.py`, `priority_rebalancer.py`, `improvement_proposals.py`, `automation_opportunity_finder.py`, and `dead_code_detector.py`.
4. The current target-repo `GENOME.md` is useful evidence but stale as a full architectural description.
5. Test health is mixed: a broad, meaningful passing slice exists (`70 passed`), but canonical CI is currently broken by the refactoring finder contract mismatch, and full collection exposes additional syntax failures.
6. Three concrete follow-up issues were warranted and filed during this genome pass:
- `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/210`
- `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/211`
- `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/212`
---
This host-repo genome artifact is the grounded cross-repo analysis requested by timmy-home #676. It intentionally treats the target repo's own `GENOME.md` as evidence rather than gospel, because current source, tests, and verification commands show a significantly more mature — and partially broken — system than the older upstream genome describes.

View File

@@ -0,0 +1,61 @@
from pathlib import Path
import unittest

ROOT = Path(__file__).resolve().parent.parent
GENOME_PATH = ROOT / "compounding-intelligence-GENOME.md"


class TestCompoundingIntelligenceGenome(unittest.TestCase):
    def test_genome_file_exists_with_required_sections(self):
        self.assertTrue(GENOME_PATH.exists(), "missing compounding-intelligence-GENOME.md")
        text = GENOME_PATH.read_text(encoding="utf-8")
        required_sections = [
            "# GENOME.md — compounding-intelligence",
            "## Project Overview",
            "## Architecture",
            "## Entry Points",
            "## Data Flow",
            "## Key Abstractions",
            "## API Surface",
            "## Test Coverage Gaps",
            "## Security Considerations",
            "## Dependencies",
            "## Deployment",
            "## Technical Debt",
        ]
        for section in required_sections:
            self.assertIn(section, text)

    def test_genome_names_current_repo_specific_findings(self):
        text = GENOME_PATH.read_text(encoding="utf-8")
        required_snippets = [
            "```mermaid",
            "scripts/harvester.py",
            "scripts/bootstrapper.py",
            "scripts/priority_rebalancer.py",
            "scripts/perf_bottleneck_finder.py",
            "scripts/dependency_graph.py",
            "scripts/refactoring_opportunity_finder.py",
            "knowledge/SCHEMA.md",
            "templates/harvest-prompt.md",
            ".gitea/workflows/test.yml",
            "70 passed",
            "86 tests collected, 2 errors",
            "33 Python files",
            "8,394",
            "compounding-intelligence/issues/210",
            "compounding-intelligence/issues/211",
            "compounding-intelligence/issues/212",
        ]
        for snippet in required_snippets:
            self.assertIn(snippet, text)

    def test_genome_is_substantial(self):
        text = GENOME_PATH.read_text(encoding="utf-8")
        self.assertGreaterEqual(len(text.splitlines()), 140)
        self.assertGreaterEqual(len(text), 9000)


if __name__ == "__main__":
    unittest.main()