Compare commits
2 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0626a3fc33 | | |
| | 98f861b713 | | |
534
compounding-intelligence-GENOME.md
Normal file
@@ -0,0 +1,534 @@
# GENOME.md — compounding-intelligence

*Generated: 2026-04-21 07:23:18 UTC | Refreshed for timmy-home #676 from `Timmy_Foundation/compounding-intelligence` @ `fe8a70a` on `main`*

## Project Overview

`compounding-intelligence` is a Python-first analysis toolkit for turning prior agent work into reusable fleet knowledge.

At a high level it does four things:
1. reads Hermes session transcripts and diff/session artifacts
2. extracts durable knowledge into a structured store
3. assembles bootstrap context for future sessions
4. mines the corpus for higher-order opportunities: automation, refactors, performance, knowledge gaps, and issue-priority changes

The repo's own README still presents the system as three largely planned pipelines. That is now stale.

Current repo truth from live inspection:
- tracked files: 56
- 33 Python files
- 15 test Python files
- Python LOC: 8,394
- workflow files: `.gitea/workflows/test.yml`
- persistent data fixtures: 5 JSONL files under `test_sessions/`
- existing target-repo genome already present upstream: `GENOME.md`

Most important architecture fact:
- this repo is no longer just prompt scaffolding for a future harvester/bootstrapper/measurer loop
- it already contains a growing family of concrete analysis engines under `scripts/`

Largest Python modules by size:
- `scripts/priority_rebalancer.py` — 682 lines
- `scripts/automation_opportunity_finder.py` — 554 lines
- `scripts/perf_bottleneck_finder.py` — 551 lines
- `scripts/improvement_proposals.py` — 451 lines
- `scripts/harvester.py` — 447 lines
- `scripts/bootstrapper.py` — 359 lines
- `scripts/sampler.py` — 353 lines
- `scripts/dead_code_detector.py` — 282 lines

## Architecture

The repo is best understood as three layers: ingestion, knowledge storage/bootstrap, and meta-analysis.

```mermaid
flowchart TD
    A[Hermes session JSONL] --> B[session_reader.py]
    B --> C[harvester.py]
    B --> D[session_pair_harvester.py]
    C --> E[knowledge/index.json]
    C --> F[knowledge/global/*.yaml or .md]
    C --> G[knowledge/repos/*.yaml]
    C --> H[knowledge/agents/*]

    E --> I[bootstrapper.py]
    F --> I
    G --> I
    H --> I
    I --> J[Bootstrapped session context]

    E --> K[knowledge_staleness_check.py]
    E --> L[priority_rebalancer.py]
    E --> M[improvement_proposals.py]

    N[test_sessions/*.jsonl] --> C
    N --> D
    N --> M

    O[repo source tree] --> P[knowledge_gap_identifier.py]
    O --> Q[dead_code_detector.py]
    O --> R[automation_opportunity_finder.py]
    O --> S[perf_bottleneck_finder.py]
    O --> T[dependency_graph.py]
    O --> U[diff_analyzer.py]
    O --> V[refactoring_opportunity_finder.py]

    W[Gitea issues API] --> L
    L --> X[metrics/priority_report.json]
    L --> Y[metrics/priority_suggestions.md]
```

What exists today:
- transcript parsing: `scripts/session_reader.py`
- knowledge extraction + dedup + writing: `scripts/harvester.py`
- context assembly: `scripts/bootstrapper.py`
- pair harvesting: `scripts/session_pair_harvester.py`
- staleness detection: `scripts/knowledge_staleness_check.py`
- gap analysis: `scripts/knowledge_gap_identifier.py`
- improvement mining: `scripts/improvement_proposals.py`
- automation mining: `scripts/automation_opportunity_finder.py`
- priority scoring against Gitea: `scripts/priority_rebalancer.py`
- diff scanning: `scripts/diff_analyzer.py`
- dead code analysis: `scripts/dead_code_detector.py`

What exists but is currently broken or incomplete:
- `scripts/refactoring_opportunity_finder.py` is still a stub that only emits sample proposals
- `scripts/perf_bottleneck_finder.py` does not parse
- `scripts/dependency_graph.py` does not parse

## Runtime Truth and Docs Drift

The repo ships its own `GENOME.md`, but that document is materially stale relative to the current codebase.

The strongest drift example:
- upstream `GENOME.md` says core pipeline scripts such as `harvester.py`, `bootstrapper.py`, `measurer.py`, and `session_reader.py` are planned or not yet implemented
- live source inspection shows `scripts/harvester.py`, `scripts/bootstrapper.py`, and `scripts/session_reader.py` are real, non-trivial implementations
- live source inspection also shows additional implemented engines not foregrounded by the README's original three-pipeline framing:
  - `scripts/priority_rebalancer.py`
  - `scripts/automation_opportunity_finder.py`
  - `scripts/improvement_proposals.py`
  - `scripts/knowledge_gap_identifier.py`
  - `scripts/dead_code_detector.py`
  - `scripts/session_pair_harvester.py`
  - `scripts/diff_analyzer.py`

So the honest current description is:
- README = founding vision
- existing target-repo `GENOME.md` = partially outdated snapshot
- source + tests = current system truth

This is not a repo with only a single harvester/bootstrapper loop anymore. It is becoming a general-purpose compounding-analysis workbench.

## Entry Points

### 1. CI / canonical test entry point
The only checked-in workflow is `.gitea/workflows/test.yml`.

It installs:
- `requirements.txt`

Then runs:
```bash
make test
```

The Makefile defines:
```make
python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v
```

This is the repo's canonical automation contract today.

### 2. Knowledge extraction entry point
`scripts/harvester.py`

Docstring usage:
```bash
python3 harvester.py --session ~/.hermes/sessions/session_xxx.jsonl --output knowledge/
python3 harvester.py --batch --since 2026-04-01 --limit 100
python3 harvester.py --session session.jsonl --dry-run
```

This is the main LLM-integrated path.

### 3. Session bootstrap entry point
`scripts/bootstrapper.py`

Docstring usage:
```bash
python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
python3 bootstrapper.py --repo timmy-home --global
python3 bootstrapper.py --global
python3 bootstrapper.py --repo the-nexus --max-tokens 1000
```

### 4. Priority rebalancer entry point
`scripts/priority_rebalancer.py`

Docstring usage:
```bash
python3 scripts/priority_rebalancer.py --org Timmy_Foundation
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --repo compounding-intelligence
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --dry-run
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --apply
```

### 5. Secondary analysis engines
Additional operational entry points exist in `scripts/`:
- `automation_opportunity_finder.py`
- `improvement_proposals.py`
- `knowledge_gap_identifier.py`
- `knowledge_staleness_check.py`
- `dead_code_detector.py`
- `diff_analyzer.py`
- `sampler.py`
- `gitea_issue_parser.py`
- `session_pair_harvester.py`

### 6. Seed knowledge content
The knowledge store is not empty scaffolding.

Concrete checked-in knowledge already exists at:
- `knowledge/repos/hermes-agent.yaml`
- `knowledge/repos/the-nexus.yaml`
- `knowledge/global/pitfalls.yaml`
- `knowledge/global/tool-quirks.yaml`
- `knowledge/index.json`
- `knowledge/SCHEMA.md`

## Data Flow

### Flow A — transcript to durable knowledge
1. Raw session JSONL enters via `scripts/session_reader.py`.
2. `read_session()` loads the transcript.
3. `extract_conversation()` strips to meaningful user/assistant/system turns.
4. `truncate_for_context()` compresses long sessions to head + tail.
5. `messages_to_text()` converts structured turns to a plain-text transcript block.
6. `scripts/harvester.py` loads `templates/harvest-prompt.md`.
7. The harvester calls an LLM endpoint, parses the JSON response, validates facts, fingerprints them, deduplicates, then writes `knowledge/index.json` and human-readable per-domain files.
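Steps 1 through 5 above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `scripts/session_reader.py`: the four function names come from the list, while the signatures, the `role` and `content` keys, the set of kept roles, and the head/tail sizes are guesses made for the example.

```python
import json
from pathlib import Path

# Assumption: only these roles count as "meaningful" turns.
KEPT_ROLES = {"user", "assistant", "system"}


def read_session(path):
    """Load a JSONL transcript: one JSON object per non-empty line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def extract_conversation(messages):
    """Keep turns that have a kept role and non-empty string content."""
    return [
        m for m in messages
        if m.get("role") in KEPT_ROLES
        and isinstance(m.get("content"), str)
        and m["content"].strip()
    ]


def truncate_for_context(messages, head=2, tail=2):
    """Compress a long session to its first `head` and last `tail` turns."""
    if len(messages) <= head + tail:
        return list(messages)
    tail_part = messages[-tail:] if tail else []
    return messages[:head] + tail_part


def messages_to_text(messages):
    """Render structured turns as a plain-text transcript block."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)
```

Chained together, `messages_to_text(truncate_for_context(extract_conversation(read_session(path))))` yields the text block that would be handed to the extraction prompt in step 6.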

### Flow B — durable knowledge to session bootstrap
1. `scripts/bootstrapper.py` loads `knowledge/index.json`.
2. It filters facts by repo, agent, and global scope.
3. It sorts them by confidence and category priority.
4. It optionally merges markdown knowledge from repo-specific, agent-specific, and global files.
5. It truncates the result to a token budget and emits a bootstrap context block.
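The steps above amount to a filter, a sort, and a budgeted cut. The sketch below shows one way that could look; it is not the code in `scripts/bootstrapper.py`. The function names match ones the script is described as having, but the fact keys (`repo`, `agent`, `confidence`, `category`), the category ordering, the sort-key order, and the four-characters-per-token estimate are all assumptions.

```python
# Assumed ordering: lower number means the category is surfaced first.
CATEGORY_PRIORITY = {"pitfall": 0, "tool-quirk": 1, "pattern": 2, "fact": 3, "question": 4}


def filter_facts(facts, repo=None, agent=None, include_global=True):
    """Keep facts scoped to the repo, the agent, or (optionally) global scope."""
    kept = []
    for fact in facts:
        scope = fact.get("repo", "global")
        if repo and scope == repo:
            kept.append(fact)
        elif agent and fact.get("agent") == agent:
            kept.append(fact)
        elif include_global and scope == "global":
            kept.append(fact)
    return kept


def sort_facts(facts):
    """Highest confidence first; ties broken by category priority."""
    return sorted(
        facts,
        key=lambda f: (-f.get("confidence", 0.0),
                       CATEGORY_PRIORITY.get(f.get("category"), 99)),
    )


def estimate_tokens(text):
    """Rough heuristic (an assumption): about four characters per token."""
    return (len(text) + 3) // 4


def truncate_to_tokens(lines, max_tokens):
    """Keep whole lines, in order, until the token budget is exhausted."""
    out, used = [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if used + cost > max_tokens:
            break
        out.append(line)
        used += cost
    return out
```

Cutting on whole lines rather than characters keeps each emitted fact intact, at the cost of sometimes leaving part of the budget unused.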

### Flow C — corpus to meta-analysis
Several scripts mine the repo and/or session corpus for second-order leverage:
- `scripts/improvement_proposals.py` mines repeated errors, slow tools, manual processes, and retries into proposal objects
- `scripts/automation_opportunity_finder.py` scans transcripts, scripts, docs, and cron jobs for automatable work
- `scripts/knowledge_gap_identifier.py` cross-references code, docs, and tests
- `scripts/priority_rebalancer.py` combines knowledge signals, staleness signals, metrics, and Gitea issues into suggested priority shifts

### Flow D — repo/static inspection
- `scripts/dead_code_detector.py` walks Python ASTs and optionally uses git blame
- `scripts/diff_analyzer.py` parses patches into structured change objects
- `scripts/dependency_graph.py` is intended to scan repos and emit JSON / Mermaid / DOT dependency graphs, but is currently syntactically broken
- `scripts/perf_bottleneck_finder.py` is intended to scan tests/build/CI for bottlenecks, but is currently syntactically broken

## Key Abstractions

### Knowledge item
Defined in practice by `templates/harvest-prompt.md`, `scripts/harvester.py`, and `knowledge/SCHEMA.md`.

Important fields:
- `fact`
- `category`
- `repo` / domain
- `confidence`
- source/evidence metadata

Categories consistently used across the repo:
- fact
- pitfall
- pattern
- tool-quirk
- question
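For orientation, the fields and categories above could be modeled as follows. `KnowledgeItem` is a hypothetical class for illustration only: the defaults, the `[0.0, 1.0]` confidence range, and the shape of `source` are assumptions, and `knowledge/SCHEMA.md` remains the actual contract.

```python
from dataclasses import dataclass, field

# The five categories listed above.
CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}


@dataclass
class KnowledgeItem:
    fact: str
    category: str
    repo: str = "global"                        # assumed default scope
    confidence: float = 0.5                     # assumed to lie in [0.0, 1.0]
    source: dict = field(default_factory=dict)  # evidence metadata, shape assumed

    def __post_init__(self):
        # Reject items that would pollute the store.
        if not self.fact.strip():
            raise ValueError("fact must be non-empty")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
```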

### Session transcript model
`session_reader.py` treats JSONL transcripts as ordered message sequences with:
- role
- content
- timestamp
- optional multimodal text extraction
- optional tool-call metadata

This module is the ingestion foundation for the rest of the system.

### Knowledge store
The repo uses a two-layer representation:
1. machine-readable index: `knowledge/index.json`
2. human-editable domain files: YAML/markdown under `knowledge/global/`, `knowledge/repos/`, and `knowledge/agents/`

`knowledge/SCHEMA.md` is the contract for that store.

### Bootstrap context
`bootstrapper.py` makes the design concrete:
- `filter_facts()` narrows by repo/agent/global scope
- `sort_facts()` orders by confidence and category priority
- `render_facts_section()` groups output by category
- `estimate_tokens()` and `truncate_to_tokens()` implement the context-window budget
- `build_bootstrap_context()` assembles the final injected context block

### Harvester dedup and validation
The central harvester abstractions are not classes but functions:
- `parse_extraction_response()`
- `fact_fingerprint()`
- `deduplicate()`
- `validate_fact()`
- `write_knowledge()`
- `harvest_session()`

This makes the core pipeline easy to test in pieces.
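As an illustration of why function-level pieces are easy to test, here is a minimal sketch of fingerprint-based deduplication. The names `fact_fingerprint` and `deduplicate` are taken from the list above, but the normalization rule, the SHA-256 choice, the 16-character truncation, and the signatures are assumptions rather than the harvester's actual behavior.

```python
import hashlib


def fact_fingerprint(fact):
    """Hash the normalized fact text so trivial variations collide."""
    # Lowercase and collapse whitespace before hashing (assumed normalization).
    normalized = " ".join(fact["fact"].lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def deduplicate(new_facts, existing_facts):
    """Return the new facts whose fingerprint is not already in the store."""
    seen = {fact_fingerprint(f) for f in existing_facts}
    unique = []
    for fact in new_facts:
        fp = fact_fingerprint(fact)
        if fp not in seen:
            seen.add(fp)  # also guards against duplicates within new_facts
            unique.append(fact)
    return unique
```

Because both functions are pure, a test needs only a couple of dictionaries and no LLM call or filesystem access.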

### Priority scoring model
`priority_rebalancer.py` introduces explicit data models:
- `IssueScore`
- `PipelineSignal`
- `GiteaClient`

That script is important because it bridges the local knowledge store to live Gitea issue state.

### Gap report model
`knowledge_gap_identifier.py` formalizes another analysis lane with:
- `GapSeverity`
- `GapType`
- `Gap`
- `GapReport`
- `KnowledgeGapIdentifier`

This is one of the clearest examples that the repo has moved beyond a single harvester/bootstrapper loop into a platform of analyzers.

## API Surface

This repo is primarily a CLI/library surface, not a long-running service.

### Core CLIs
- `scripts/harvester.py`
- `scripts/bootstrapper.py`
- `scripts/priority_rebalancer.py`
- `scripts/improvement_proposals.py`
- `scripts/automation_opportunity_finder.py`
- `scripts/knowledge_staleness_check.py`
- `scripts/dead_code_detector.py`
- `scripts/diff_analyzer.py`
- `scripts/gitea_issue_parser.py`
- `scripts/session_pair_harvester.py`

### External API dependencies
- LLM chat-completions endpoint in `scripts/harvester.py`
- Gitea REST API in `scripts/priority_rebalancer.py`

### File-format APIs
- session input: JSONL files under `test_sessions/`
- knowledge schema: `knowledge/SCHEMA.md`
- extraction prompt contract: `templates/harvest-prompt.md`
- machine store: `knowledge/index.json`
- repo knowledge examples:
  - `knowledge/repos/hermes-agent.yaml`
  - `knowledge/repos/the-nexus.yaml`

### Output artifacts
Documented or implied outputs include:
- `knowledge/index.json`
- repo/global/agent knowledge files
- `metrics/priority_report.json`
- `metrics/priority_suggestions.md`
- text/markdown/json proposal reports

## Test Coverage Gaps

### Current verified state
I verified the repo in three layers.

### Layer 1 — focused passing slice
Command run:
```bash
python3 -m pytest \
  scripts/test_bootstrapper.py \
  scripts/test_harvester_pipeline.py \
  scripts/test_session_pair_harvester.py \
  scripts/test_knowledge_staleness.py \
  scripts/test_improvement_proposals.py \
  scripts/test_automation_opportunity_finder.py \
  scripts/test_gitea_issue_parser.py \
  tests/test_ci_config.py \
  tests/test_knowledge_gap_identifier.py -q
```

Result:
- `70 passed`

This shows the repo has substantial working logic today.

### Layer 2 — canonical CI command
Command run:
```bash
make test
```

Result:
- CI command collected 76 items and failed during collection with 1 error
- failure source: `scripts/test_refactoring_opportunity_finder.py`
- exact issue filed: `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/210`

### Layer 3 — full test collection
Commands run:
```bash
python3 -m pytest --collect-only -q
python3 -m pytest -q
```

Result:
- `86 tests collected, 2 errors`
- collection blockers:
  1. `scripts/test_refactoring_opportunity_finder.py` expects a real refactoring API that `scripts/refactoring_opportunity_finder.py` does not implement
  2. `tests/test_perf_bottleneck_finder.py` cannot import `scripts/perf_bottleneck_finder.py` due to a SyntaxError

Additional verification:
```bash
python3 -m py_compile scripts/perf_bottleneck_finder.py
python3 -m py_compile scripts/dependency_graph.py
```

Both fail.
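The two `py_compile` checks generalize to a sweep over every Python file, which would also have caught `scripts/dependency_graph.py` without a dedicated test. The sketch below is one way to do that; `find_unparseable` is a hypothetical helper, not something in the repo, and it uses the built-in `compile()` instead of `py_compile` so that no bytecode files are written.

```python
from pathlib import Path


def find_unparseable(root):
    """Return (relative path, line number) for each .py file with a SyntaxError."""
    root = Path(root)
    failures = []
    for path in sorted(root.rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        try:
            # compile() parses the source without executing it.
            compile(source, str(path), "exec")
        except SyntaxError as exc:
            failures.append((path.relative_to(root).as_posix(), exc.lineno))
    return failures


if __name__ == "__main__":
    for name, lineno in find_unparseable("scripts"):
        print(f"{name}: SyntaxError at line {lineno}")
```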

Filed follow-ups:
- `compounding-intelligence/issues/210` — refactoring finder API missing
- `compounding-intelligence/issues/211` — `scripts/perf_bottleneck_finder.py` SyntaxError
- `compounding-intelligence/issues/212` — `scripts/dependency_graph.py` SyntaxError

### What is well covered
Strongly exercised subsystems include:
- bootstrapper logic
- harvester pipeline helpers
- session pair harvesting
- knowledge staleness checking
- improvement proposal generation
- automation opportunity mining
- Gitea issue parsing
- CI configuration contract
- knowledge gap analysis

### What is weak or broken
1. `scripts/refactoring_opportunity_finder.py`
   - current implementation is a sample stub
   - tests expect real complexity and scoring helpers

2. `scripts/perf_bottleneck_finder.py`
   - fails to parse (SyntaxError), so it breaks before any code runs
   - the test module exists but cannot import the target script

3. `scripts/dependency_graph.py`
   - fails to parse (SyntaxError), so it breaks before any code runs
   - no active test lane caught it before this analysis

4. CI scope gap
   - `.gitea/workflows/test.yml` runs `make test`
   - `make test` does not cover every `tests/*.py` module
   - specifically, `tests/test_perf_bottleneck_finder.py` sits outside the Makefile target and the syntax break only shows up when running broader pytest commands

5. warning hygiene
   - `scripts/test_priority_rebalancer.py` emits repeated `datetime.utcnow()` deprecation warnings under Python 3.12
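The CI scope gap in item 4 can be checked mechanically by comparing discovered test modules against the patterns the Makefile passes to pytest. A minimal sketch, assuming the two patterns from the Makefile line quoted under Entry Points are the whole target; `uncovered_test_modules` is a hypothetical helper name.

```python
from fnmatch import fnmatchcase

# Patterns taken from the Makefile's pytest invocation.
MAKE_TEST_TARGETS = ["tests/test_ci_config.py", "scripts/test_*.py"]


def uncovered_test_modules(test_paths, targets=MAKE_TEST_TARGETS):
    """Return test module paths that match none of the target patterns."""
    return [
        path for path in sorted(test_paths)
        if not any(fnmatchcase(path, target) for target in targets)
    ]


if __name__ == "__main__":
    from pathlib import Path

    # Discover test modules in the two directories the repo uses for tests.
    found = [p.as_posix() for d in ("tests", "scripts") for p in Path(d).glob("test_*.py")]
    for path in uncovered_test_modules(found):
        print("not run by make test:", path)
```

Given the file names mentioned in this document, such a check would flag `tests/test_perf_bottleneck_finder.py` and `tests/test_knowledge_gap_identifier.py`.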

## Security Considerations

1. Secret extraction risk
   - this repo is designed to ingest transcripts and distill knowledge
   - if the harvester prompt or filtering logic misses a credential, the system can preserve secrets into the knowledge store
   - the risk is explicitly recognized in the target repo's existing `GENOME.md`, but enforcement still depends on implementation discipline

2. Knowledge poisoning
   - the system trusts transcripts as source material for compounding facts
   - confidence scores and evidence fields help, but there is no hard verification layer proving extracted facts are true before reuse

3. Cross-repo sensitivity
   - seeded files such as `knowledge/repos/hermes-agent.yaml` and `knowledge/repos/the-nexus.yaml` store operational quirks and deployment pitfalls
   - that is high-value knowledge and can also expose internal operational assumptions if shared broadly

4. External API use
   - `scripts/harvester.py` depends on an LLM API endpoint and local key discovery
   - `scripts/priority_rebalancer.py` talks to the Gitea API with write-capable operations such as labels and comments
   - these scripts deserve careful credential-handling and least-privilege tokens

5. Transcript privacy
   - session JSONL can contain user content, repo details, operational mistakes, and potentially sensitive environment facts
   - durable storage multiplies the blast radius of accidental retention
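As one possible mitigation for item 1, fact text could pass through a redaction step before it is written to the store. The sketch below is purely illustrative: `redact_secrets` and both patterns are assumptions, nothing here is claimed to exist in the harvester, and a real deployment would need a vetted pattern list.

```python
import re

# Illustrative patterns only (assumed, not exhaustive).
SECRET_PATTERNS = [
    # key=value or key: value pairs with a suspicious key name
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\s*[:=]\s*\S+"),
    # strings shaped like an AWS access key ID
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
]


def redact_secrets(text, placeholder="[REDACTED]"):
    """Replace anything matching a secret pattern with a placeholder."""
    out = text
    for pattern in SECRET_PATTERNS:
        out = pattern.sub(placeholder, out)
    return out
```

Pattern-based redaction only reduces the risk; credentials in unexpected formats would still pass through, so it complements rather than replaces prompt-level filtering.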

## Dependencies

Explicit repo dependency file:
- `requirements.txt` → `pytest>=8,<9`

Observed runtime/import dependencies from source:
- Python stdlib-heavy design: `json`, `argparse`, `pathlib`, `urllib`, `ast`, `datetime`, `hashlib`, `subprocess`, `collections`, `re`
- `yaml` imported by `scripts/automation_opportunity_finder.py`

Important dependency note:
- `requirements.txt` only declares pytest
- static source inspection shows `yaml` usage, which implies an undeclared dependency on PyYAML or equivalent
- I did not prove a clean-environment failure because the local environment already had `yaml` importable during targeted tests
- this is best treated as dependency drift to verify in a clean environment

## Deployment

This is not a traditional server deployment repo.

Operational modes are:
1. local CLI execution of scripts under `scripts/`
2. CI execution via `.gitea/workflows/test.yml`
3. file-based knowledge store mutation under `knowledge/`

Canonical repo commands observed:
```bash
make test
python3 -m pytest -q
python3 -m pytest --collect-only -q
python3 ~/.hermes/pipelines/codebase-genome.py --path /tmp/compounding-intelligence-676 --output /tmp/compounding-intelligence-676-base-GENOME.md
```

There is no checked-in Dockerfile, packaging metadata, or service runner. The repo behaves more like an internal analysis toolkit than an application service.

## Technical Debt

1. Docs/runtime drift
   - README and target-repo `GENOME.md` still describe a repo that is less implemented than reality
   - this makes the project look earlier-stage than the current source actually is

2. Two flagship analyzers fail to parse
   - `scripts/perf_bottleneck_finder.py`
   - `scripts/dependency_graph.py`

3. Stub-vs-test mismatch
   - `scripts/refactoring_opportunity_finder.py` is a placeholder
   - `scripts/test_refactoring_opportunity_finder.py` assumes a mature implementation

4. CI blind spot
   - `make test` does not represent full-repo pytest health
   - broader collection surfaces more problems than the workflow currently enforces

5. Dependency declaration drift
   - `yaml` appears in source while `requirements.txt` only lists pytest

6. Warning debt
   - `datetime.utcnow()` deprecation noise in `scripts/test_priority_rebalancer.py`

7. Existing target-repo genome drift
   - checked-in `GENOME.md` already exists on upstream main, but it undersells the real code surface and should not be treated as authoritative without fresh source verification
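The warning debt in item 6 usually has a small fix. `datetime.utcnow()` is deprecated as of Python 3.12, and the documented replacement is a timezone-aware call. The sketch below shows the pattern; `utc_now` and `age_in_days` are hypothetical helpers, and where exactly the deprecated call lives (the test module or the script it exercises) was not established in this analysis.

```python
from datetime import datetime, timedelta, timezone


def utc_now():
    """Timezone-aware replacement for the deprecated datetime.utcnow()."""
    return datetime.now(timezone.utc)


def age_in_days(timestamp, now=None):
    """Days elapsed since a timezone-aware timestamp."""
    now = now or utc_now()
    return (now - timestamp) / timedelta(days=1)
```

One caveat when migrating: aware and naive datetimes cannot be subtracted or compared, so stored naive timestamps need a `tzinfo` attached (for example via `.replace(tzinfo=timezone.utc)`) before they meet values from `utc_now()`.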

## Key Findings

1. `compounding-intelligence` has already evolved into a multi-engine analysis toolkit, not just a future three-pipeline concept.
2. The most grounded working path today is transcript → `session_reader.py` → `harvester.py` / `bootstrapper.py` with a structured knowledge store.
3. The repo has real, working higher-order analyzers beyond harvesting: `knowledge_gap_identifier.py`, `priority_rebalancer.py`, `improvement_proposals.py`, `automation_opportunity_finder.py`, and `dead_code_detector.py`.
4. The current target-repo `GENOME.md` is useful evidence but stale as a full architectural description.
5. Test health is mixed: a broad, meaningful passing slice exists (`70 passed`), but canonical CI is currently broken by the refactoring finder contract mismatch, and full collection exposes additional syntax failures.
6. Three concrete follow-up issues were warranted and filed during this genome pass:
   - `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/210`
   - `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/211`
   - `https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/212`

---

This host-repo genome artifact is the grounded cross-repo analysis requested by timmy-home #676. It intentionally treats the target repo's own `GENOME.md` as evidence rather than gospel, because current source, tests, and verification commands show a significantly more mature — and partially broken — system than the older upstream genome describes.

61
tests/test_compounding_intelligence_genome.py
Normal file
@@ -0,0 +1,61 @@
from pathlib import Path
import unittest


ROOT = Path(__file__).resolve().parent.parent
GENOME_PATH = ROOT / "compounding-intelligence-GENOME.md"


class TestCompoundingIntelligenceGenome(unittest.TestCase):
    def test_genome_file_exists_with_required_sections(self):
        self.assertTrue(GENOME_PATH.exists(), "missing compounding-intelligence-GENOME.md")
        text = GENOME_PATH.read_text(encoding="utf-8")
        required_sections = [
            "# GENOME.md — compounding-intelligence",
            "## Project Overview",
            "## Architecture",
            "## Entry Points",
            "## Data Flow",
            "## Key Abstractions",
            "## API Surface",
            "## Test Coverage Gaps",
            "## Security Considerations",
            "## Dependencies",
            "## Deployment",
            "## Technical Debt",
        ]
        for section in required_sections:
            self.assertIn(section, text)

    def test_genome_names_current_repo_specific_findings(self):
        text = GENOME_PATH.read_text(encoding="utf-8")
        required_snippets = [
            "```mermaid",
            "scripts/harvester.py",
            "scripts/bootstrapper.py",
            "scripts/priority_rebalancer.py",
            "scripts/perf_bottleneck_finder.py",
            "scripts/dependency_graph.py",
            "scripts/refactoring_opportunity_finder.py",
            "knowledge/SCHEMA.md",
            "templates/harvest-prompt.md",
            ".gitea/workflows/test.yml",
            "70 passed",
            "86 tests collected, 2 errors",
            "33 Python files",
            "8,394",
            "compounding-intelligence/issues/210",
            "compounding-intelligence/issues/211",
            "compounding-intelligence/issues/212",
        ]
        for snippet in required_snippets:
            self.assertIn(snippet, text)

    def test_genome_is_substantial(self):
        text = GENOME_PATH.read_text(encoding="utf-8")
        self.assertGreaterEqual(len(text.splitlines()), 140)
        self.assertGreaterEqual(len(text), 9000)


if __name__ == "__main__":
    unittest.main()
