GENOME.md — compounding-intelligence

Generated: 2026-04-21 07:23:18 UTC | Refreshed for timmy-home #676 from Timmy_Foundation/compounding-intelligence @ fe8a70a on main

Project Overview

compounding-intelligence is a Python-first analysis toolkit for turning prior agent work into reusable fleet knowledge.

At a high level it does four things:

  1. reads Hermes session transcripts and diff/session artifacts
  2. extracts durable knowledge into a structured store
  3. assembles bootstrap context for future sessions
  4. mines the corpus for higher-order opportunities: automation, refactors, performance, knowledge gaps, and issue-priority changes

The repo's own README still presents the system as three largely planned pipelines. That is now stale.

Current repo truth from live inspection:

  • tracked files: 56
  • 33 Python files
  • 15 test Python files
  • Python LOC: 8,394
  • workflow files: .gitea/workflows/test.yml
  • persistent data fixtures: 5 JSONL files under test_sessions/
  • existing target-repo genome already present upstream: GENOME.md

Most important architecture fact:

  • this repo is no longer just prompt scaffolding for a future harvester/bootstrapper/measurer loop
  • it already contains a growing family of concrete analysis engines under scripts/

Largest Python modules by size:

  • scripts/priority_rebalancer.py — 682 lines
  • scripts/automation_opportunity_finder.py — 554 lines
  • scripts/perf_bottleneck_finder.py — 551 lines
  • scripts/improvement_proposals.py — 451 lines
  • scripts/harvester.py — 447 lines
  • scripts/bootstrapper.py — 359 lines
  • scripts/sampler.py — 353 lines
  • scripts/dead_code_detector.py — 282 lines

Architecture

The repo is best understood as three layers: ingestion, knowledge storage/bootstrap, and meta-analysis.

flowchart TD
    A[Hermes session JSONL] --> B[session_reader.py]
    B --> C[harvester.py]
    B --> D[session_pair_harvester.py]
    C --> E[knowledge/index.json]
    C --> F[knowledge/global/*.yaml or .md]
    C --> G[knowledge/repos/*.yaml]
    C --> H[knowledge/agents/*]

    E --> I[bootstrapper.py]
    F --> I
    G --> I
    H --> I
    I --> J[Bootstrapped session context]

    E --> K[knowledge_staleness_check.py]
    E --> L[priority_rebalancer.py]
    E --> M[improvement_proposals.py]

    N[test_sessions/*.jsonl] --> C
    N --> D
    N --> M

    O[repo source tree] --> P[knowledge_gap_identifier.py]
    O --> Q[dead_code_detector.py]
    O --> R[automation_opportunity_finder.py]
    O --> S[perf_bottleneck_finder.py]
    O --> T[dependency_graph.py]
    O --> U[diff_analyzer.py]
    O --> V[refactoring_opportunity_finder.py]

    W[Gitea issues API] --> L
    L --> X[metrics/priority_report.json]
    L --> Y[metrics/priority_suggestions.md]

What exists today:

  • transcript parsing: scripts/session_reader.py
  • knowledge extraction + dedup + writing: scripts/harvester.py
  • context assembly: scripts/bootstrapper.py
  • pair harvesting: scripts/session_pair_harvester.py
  • staleness detection: scripts/knowledge_staleness_check.py
  • gap analysis: scripts/knowledge_gap_identifier.py
  • improvement mining: scripts/improvement_proposals.py
  • automation mining: scripts/automation_opportunity_finder.py
  • priority scoring against Gitea: scripts/priority_rebalancer.py
  • diff scanning: scripts/diff_analyzer.py
  • dead code analysis: scripts/dead_code_detector.py

What exists but is currently broken or incomplete:

  • scripts/refactoring_opportunity_finder.py is still a stub that only emits sample proposals
  • scripts/perf_bottleneck_finder.py does not parse
  • scripts/dependency_graph.py does not parse

Runtime Truth and Docs Drift

The repo ships its own GENOME.md, but that document is materially stale relative to the current codebase.

The strongest drift example:

  • upstream GENOME.md says core pipeline scripts such as harvester.py, bootstrapper.py, measurer.py, and session_reader.py are planned or not yet implemented
  • live source inspection shows scripts/harvester.py, scripts/bootstrapper.py, and scripts/session_reader.py are real, non-trivial implementations
  • live source inspection also shows additional implemented engines not foregrounded by the README's original three-pipeline framing:
    • scripts/priority_rebalancer.py
    • scripts/automation_opportunity_finder.py
    • scripts/improvement_proposals.py
    • scripts/knowledge_gap_identifier.py
    • scripts/dead_code_detector.py
    • scripts/session_pair_harvester.py
    • scripts/diff_analyzer.py

So the honest current description is:

  • README = founding vision
  • existing target-repo GENOME.md = partially outdated snapshot
  • source + tests = current system truth

This is not a repo with only a single harvester/bootstrapper loop anymore. It is becoming a general-purpose compounding-analysis workbench.

Entry Points

1. CI / canonical test entry point

The only checked-in workflow is .gitea/workflows/test.yml.

It installs:

  • requirements.txt

Then runs:

make test

The Makefile defines:

python3 -m pytest tests/test_ci_config.py scripts/test_*.py -v

This is the repo's canonical automation contract today.

2. Knowledge extraction entry point

scripts/harvester.py

Docstring usage:

python3 harvester.py --session ~/.hermes/sessions/session_xxx.jsonl --output knowledge/
python3 harvester.py --batch --since 2026-04-01 --limit 100
python3 harvester.py --session session.jsonl --dry-run

This is the main LLM-integrated path.

3. Session bootstrap entry point

scripts/bootstrapper.py

Docstring usage:

python3 bootstrapper.py --repo the-nexus --agent mimo-sprint
python3 bootstrapper.py --repo timmy-home --global
python3 bootstrapper.py --global
python3 bootstrapper.py --repo the-nexus --max-tokens 1000

4. Priority rebalancer entry point

scripts/priority_rebalancer.py

Docstring usage:

python3 scripts/priority_rebalancer.py --org Timmy_Foundation
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --repo compounding-intelligence
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --dry-run
python3 scripts/priority_rebalancer.py --org Timmy_Foundation --apply

5. Secondary analysis engines

Additional operational entry points exist in scripts/:

  • automation_opportunity_finder.py
  • improvement_proposals.py
  • knowledge_gap_identifier.py
  • knowledge_staleness_check.py
  • dead_code_detector.py
  • diff_analyzer.py
  • sampler.py
  • gitea_issue_parser.py
  • session_pair_harvester.py

6. Seed knowledge content

The knowledge store is not empty scaffolding.

Concrete checked-in knowledge already exists at:

  • knowledge/repos/hermes-agent.yaml
  • knowledge/repos/the-nexus.yaml
  • knowledge/global/pitfalls.yaml
  • knowledge/global/tool-quirks.yaml
  • knowledge/index.json
  • knowledge/SCHEMA.md

Data Flow

Flow A — transcript to durable knowledge

  1. Raw session JSONL enters via scripts/session_reader.py.
  2. read_session() loads the transcript.
  3. extract_conversation() strips to meaningful user/assistant/system turns.
  4. truncate_for_context() compresses long sessions to head + tail.
  5. messages_to_text() converts structured turns to a plain-text transcript block.
  6. scripts/harvester.py loads templates/harvest-prompt.md.
  7. The harvester calls an LLM endpoint, parses the JSON response, validates facts, fingerprints them, deduplicates, then writes knowledge/index.json and human-readable per-domain files.

Flow B — durable knowledge to session bootstrap

  1. scripts/bootstrapper.py loads knowledge/index.json.
  2. It filters facts by repo, agent, and global scope.
  3. It sorts them by confidence and category priority.
  4. It optionally merges markdown knowledge from repo-specific, agent-specific, and global files.
  5. It truncates the result to a token budget and emits a bootstrap context block.
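A minimal sketch of this flow, with the fact shape, category priorities, and the four-characters-per-token heuristic all assumed for illustration:

```python
# Illustrative bootstrap flow; not the actual scripts/bootstrapper.py code.
def filter_facts(facts, repo=None, agent=None, include_global=True):
    """Keep facts scoped to the requested repo/agent, plus globals if asked."""
    def keep(f):
        if include_global and f.get("scope") == "global":
            return True
        return bool((repo and f.get("repo") == repo) or (agent and f.get("agent") == agent))
    return [f for f in facts if keep(f)]

# Assumed ordering: operational hazards before plain facts.
CATEGORY_PRIORITY = {"pitfall": 0, "tool-quirk": 1, "pattern": 2, "fact": 3, "question": 4}

def sort_facts(facts):
    """Order by category priority, then descending confidence."""
    return sorted(facts, key=lambda f: (CATEGORY_PRIORITY.get(f.get("category"), 9),
                                        -f.get("confidence", 0.0)))

def estimate_tokens(text):
    return len(text) // 4  # crude heuristic, assumed for the sketch

def truncate_to_tokens(lines, max_tokens):
    """Emit lines until the assumed token budget is exhausted."""
    out, used = [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if used + cost > max_tokens:
            break
        out.append(line)
        used += cost
    return out
```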

Flow C — corpus to meta-analysis

Several scripts mine the repo and/or session corpus for second-order leverage:

  • scripts/improvement_proposals.py mines repeated errors, slow tools, manual processes, and retries into proposal objects
  • scripts/automation_opportunity_finder.py scans transcripts, scripts, docs, and cron jobs for automatable work
  • scripts/knowledge_gap_identifier.py cross-references code, docs, and tests
  • scripts/priority_rebalancer.py combines knowledge signals, staleness signals, metrics, and Gitea issues into suggested priority shifts

Flow D — repo/static inspection

  • scripts/dead_code_detector.py walks Python ASTs and optionally uses git blame
  • scripts/diff_analyzer.py parses patches into structured change objects
  • scripts/dependency_graph.py is intended to scan repos and emit JSON / Mermaid / DOT dependency graphs, but is currently syntactically broken
  • scripts/perf_bottleneck_finder.py is intended to scan tests, builds, and CI for bottlenecks, but is currently syntactically broken

Key Abstractions

Knowledge item

Defined in practice by templates/harvest-prompt.md, scripts/harvester.py, and knowledge/SCHEMA.md.

Important fields:

  • fact
  • category
  • repo / domain
  • confidence
  • source/evidence metadata

Categories consistently used across the repo:

  • fact
  • pitfall
  • pattern
  • tool-quirk
  • question
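Putting the fields and categories together, a hypothetical item and a minimal shape check might look like the following; the authoritative contract is knowledge/SCHEMA.md, and every field value here is invented:

```python
# Hypothetical knowledge item; knowledge/SCHEMA.md is the real contract.
item = {
    "fact": "Gitea API paginates issue listings at 50 items by default",
    "category": "tool-quirk",
    "repo": "the-nexus",
    "confidence": 0.8,
    "source": {"session": "session_xxx.jsonl", "evidence": "retry loop at turn 42"},
}

REQUIRED = {"fact", "category", "confidence"}
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}

def validate_fact(item):
    """Minimal shape check mirroring the documented fields."""
    return (REQUIRED <= item.keys()
            and item["category"] in VALID_CATEGORIES
            and 0.0 <= item["confidence"] <= 1.0)
```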

Session transcript model

session_reader.py treats JSONL transcripts as ordered message sequences with:

  • role
  • content
  • timestamp
  • optional multimodal text extraction
  • optional tool-call metadata

This module is the ingestion foundation for the rest of the system.
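The "optional multimodal text extraction" bullet implies a helper along these lines; the content shape is assumed, not taken from the module:

```python
def message_text(message):
    """Extract plain text from string content or an assumed multimodal part list."""
    content = message.get("content", "")
    if isinstance(content, str):
        return content
    # Multimodal case: keep only text parts, skip images and tool payloads.
    return " ".join(part.get("text", "")
                    for part in content if part.get("type") == "text")
```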

Knowledge store

The repo uses a two-layer representation:

  1. machine-readable index: knowledge/index.json
  2. human-editable domain files: YAML/markdown under knowledge/global/, knowledge/repos/, and knowledge/agents/

knowledge/SCHEMA.md is the contract for that store.

Bootstrap context

bootstrapper.py makes the design concrete:

  • filter_facts() narrows by repo/agent/global scope
  • sort_facts() orders by confidence and category priority
  • render_facts_section() groups output by category
  • estimate_tokens() and truncate_to_tokens() implement the context-window budget
  • build_bootstrap_context() assembles the final injected context block
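The grouping step that render_facts_section() performs could look like this sketch, with the output format assumed rather than copied from the repo:

```python
from collections import defaultdict

# Illustrative grouping-by-category renderer; the real markup differs.
def render_facts_section(facts):
    """Group facts by category and render one heading per category."""
    groups = defaultdict(list)
    for f in facts:
        groups[f["category"]].append(f["fact"])
    lines = []
    for category in sorted(groups):
        lines.append(f"## {category}")
        lines.extend(f"- {fact}" for fact in groups[category])
    return "\n".join(lines)
```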

Harvester dedup and validation

The central harvester abstractions are not classes but functions:

  • parse_extraction_response()
  • fact_fingerprint()
  • deduplicate()
  • validate_fact()
  • write_knowledge()
  • harvest_session()

This makes the core pipeline easy to test in pieces.
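A minimal illustration of the fingerprint-plus-dedup idea; the normalization rules here are invented for the example and are not the harvester's actual ones:

```python
import hashlib
import re

def fact_fingerprint(item):
    """Normalize the fact text and hash it so near-identical wordings collide."""
    text = re.sub(r"\s+", " ", item["fact"].lower().strip())
    return hashlib.sha256(text.encode()).hexdigest()[:16]

def deduplicate(new_items, existing_items):
    """Drop new items whose fingerprint already exists in the store."""
    seen = {fact_fingerprint(i) for i in existing_items}
    out = []
    for item in new_items:
        fp = fact_fingerprint(item)
        if fp not in seen:
            seen.add(fp)
            out.append(item)
    return out
```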

Priority scoring model

priority_rebalancer.py introduces explicit data models:

  • IssueScore
  • PipelineSignal
  • GiteaClient

That script is important because it bridges the local knowledge store to live Gitea issue state.
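Hypothetical shapes for those models, inferred only from their names; the actual fields in scripts/priority_rebalancer.py will differ:

```python
from dataclasses import dataclass, field

# Name-inspired sketches, not the repo's real data models.
@dataclass
class PipelineSignal:
    source: str          # e.g. "staleness", "knowledge", "metrics" (assumed)
    issue_number: int
    weight: float
    reason: str

@dataclass
class IssueScore:
    issue_number: int
    title: str
    score: float = 0.0
    signals: list = field(default_factory=list)

    def apply(self, signal: PipelineSignal) -> None:
        """Fold one pipeline signal into the running score."""
        self.score += signal.weight
        self.signals.append(signal)
```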

Gap report model

knowledge_gap_identifier.py formalizes another analysis lane with:

  • GapSeverity
  • GapType
  • Gap
  • GapReport
  • KnowledgeGapIdentifier

This is one of the clearest examples that the repo has moved beyond a single harvester/bootstrapper loop into a platform of analyzers.
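By the same token, the gap-report names suggest shapes like the following; everything beyond the class names is an assumption:

```python
from dataclasses import dataclass
from enum import Enum

# Name-inspired sketches; the real fields in knowledge_gap_identifier.py differ.
class GapSeverity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

class GapType(Enum):
    UNDOCUMENTED_CODE = "undocumented-code"
    UNTESTED_CODE = "untested-code"
    STALE_DOC = "stale-doc"

@dataclass
class Gap:
    path: str
    gap_type: GapType
    severity: GapSeverity
    detail: str

@dataclass
class GapReport:
    gaps: list

    def worst_first(self):
        """Sort gaps so the highest-severity items lead the report."""
        return sorted(self.gaps, key=lambda g: g.severity.value, reverse=True)
```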

API Surface

This repo is primarily a CLI/library surface, not a long-running service.

Core CLIs

  • scripts/harvester.py
  • scripts/bootstrapper.py
  • scripts/priority_rebalancer.py
  • scripts/improvement_proposals.py
  • scripts/automation_opportunity_finder.py
  • scripts/knowledge_staleness_check.py
  • scripts/dead_code_detector.py
  • scripts/diff_analyzer.py
  • scripts/gitea_issue_parser.py
  • scripts/session_pair_harvester.py

External API dependencies

  • LLM chat-completions endpoint in scripts/harvester.py
  • Gitea REST API in scripts/priority_rebalancer.py

File-format APIs

  • session input: JSONL files under test_sessions/
  • knowledge schema: knowledge/SCHEMA.md
  • extraction prompt contract: templates/harvest-prompt.md
  • machine store: knowledge/index.json
  • repo knowledge examples:
    • knowledge/repos/hermes-agent.yaml
    • knowledge/repos/the-nexus.yaml

Output artifacts

Documented or implied outputs include:

  • knowledge/index.json
  • repo/global/agent knowledge files
  • metrics/priority_report.json
  • metrics/priority_suggestions.md
  • text/markdown/json proposal reports

Test Coverage Gaps

Current verified state

I verified the repo in three layers.

Layer 1 — focused passing slice

Command run:

python3 -m pytest \
  scripts/test_bootstrapper.py \
  scripts/test_harvester_pipeline.py \
  scripts/test_session_pair_harvester.py \
  scripts/test_knowledge_staleness.py \
  scripts/test_improvement_proposals.py \
  scripts/test_automation_opportunity_finder.py \
  scripts/test_gitea_issue_parser.py \
  tests/test_ci_config.py \
  tests/test_knowledge_gap_identifier.py -q

Result:

  • 70 passed

This proves the repo has substantial working logic today.

Layer 2 — canonical CI command

Command run:

make test

Result:

  • CI command collected 76 items and failed during collection with 1 error
  • failure source: scripts/test_refactoring_opportunity_finder.py
  • exact issue filed: https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/210

Layer 3 — full test collection

Commands run:

python3 -m pytest --collect-only -q
python3 -m pytest -q

Result:

  • 86 tests collected, 2 errors
  • collection blockers:
    1. scripts/test_refactoring_opportunity_finder.py expects a real refactoring API that scripts/refactoring_opportunity_finder.py does not implement
    2. tests/test_perf_bottleneck_finder.py cannot import scripts/perf_bottleneck_finder.py due to a SyntaxError

Additional verification:

python3 -m py_compile scripts/perf_bottleneck_finder.py
python3 -m py_compile scripts/dependency_graph.py

Both fail.

Filed follow-ups:

  • compounding-intelligence/issues/210 — refactoring finder API missing
  • compounding-intelligence/issues/211 — scripts/perf_bottleneck_finder.py SyntaxError
  • compounding-intelligence/issues/212 — scripts/dependency_graph.py SyntaxError

What is well covered

Strongly exercised subsystems include:

  • bootstrapper logic
  • harvester pipeline helpers
  • session pair harvesting
  • knowledge staleness checking
  • improvement proposal generation
  • automation opportunity mining
  • Gitea issue parsing
  • CI configuration contract
  • knowledge gap analysis

What is weak or broken

  1. scripts/refactoring_opportunity_finder.py

    • current implementation is a sample stub
    • tests expect real complexity and scoring helpers
  2. scripts/perf_bottleneck_finder.py

    • parser broken before runtime
    • test module exists but cannot import target script
  3. scripts/dependency_graph.py

    • parser broken before runtime
    • no active test lane caught it before this analysis
  4. CI scope gap

    • .gitea/workflows/test.yml runs make test
    • make test does not cover every tests/*.py module
    • specifically, tests/test_perf_bottleneck_finder.py sits outside the Makefile target and the syntax break only shows up when running broader pytest commands
  5. warning hygiene

    • scripts/test_priority_rebalancer.py emits repeated datetime.utcnow() deprecation warnings under Python 3.12
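The utcnow() deprecation in item 5 has a drop-in stdlib replacement that yields an aware UTC timestamp:

```python
from datetime import datetime, timezone

# datetime.utcnow() is deprecated as of Python 3.12 and returns a naive datetime.
# The recommended replacement is an aware timestamp:
stamp = datetime.now(timezone.utc)
print(stamp.isoformat())  # includes the +00:00 offset the naive form lacked
```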

Security Considerations

  1. Secret extraction risk

    • this repo is literally designed to ingest transcripts and distill knowledge
    • if the harvester prompt or filtering logic misses a credential, the system can preserve secrets into the knowledge store
    • the risk is explicitly recognized in the target repo's existing GENOME.md, but enforcement still depends on implementation discipline
  2. Knowledge poisoning

    • the system trusts transcripts as source material for compounding facts
    • confidence scores and evidence fields help, but there is no hard verification layer proving extracted facts are true before reuse
  3. Cross-repo sensitivity

    • seeded files such as knowledge/repos/hermes-agent.yaml and knowledge/repos/the-nexus.yaml store operational quirks and deployment pitfalls
    • that is high-value knowledge and can also expose internal operational assumptions if shared broadly
  4. External API use

    • scripts/harvester.py depends on an LLM API endpoint and local key discovery
    • scripts/priority_rebalancer.py talks to the Gitea API with write-capable operations such as labels and comments
    • these scripts deserve careful credential-handling and least-privilege tokens
  5. Transcript privacy

    • session JSONL can contain user content, repo details, operational mistakes, and potentially sensitive environment facts
    • durable storage multiplies the blast radius of accidental retention

Dependencies

Explicit repo dependency file:

  • requirements.txt — pytest>=8,<9

Observed runtime/import dependencies from source:

  • Python stdlib-heavy design: json, argparse, pathlib, urllib, ast, datetime, hashlib, subprocess, collections, re
  • yaml imported by scripts/automation_opportunity_finder.py

Important dependency note:

  • requirements.txt only declares pytest
  • static source inspection shows yaml usage, which implies an undeclared dependency on PyYAML or equivalent
  • I did not reproduce a clean-environment failure, because yaml was already importable in the local environment during the targeted tests
  • this is best treated as dependency drift to verify in a clean environment
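A quick probe for this drift, to be run inside a clean virtualenv with only requirements.txt installed. The assumption being tested is that the yaml module comes from an undeclared PyYAML (or equivalent) install:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if a top-level module is importable without importing it."""
    return importlib.util.find_spec(name) is not None

if not has_module("yaml"):
    print("yaml not importable: requirements.txt is missing PyYAML (or equivalent)")
```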

Deployment

This is not a traditional server deployment repo.

Operational modes are:

  1. local CLI execution of scripts under scripts/
  2. CI execution via .gitea/workflows/test.yml
  3. file-based knowledge store mutation under knowledge/

Canonical repo commands observed:

make test
python3 -m pytest -q
python3 -m pytest --collect-only -q
python3 ~/.hermes/pipelines/codebase-genome.py --path /tmp/compounding-intelligence-676 --output /tmp/compounding-intelligence-676-base-GENOME.md

There is no checked-in Dockerfile, packaging metadata, or service runner. The repo behaves more like an internal analysis toolkit than an application service.

Technical Debt

  1. Docs/runtime drift

    • README and target-repo GENOME.md still describe a repo that is less implemented than reality
    • this makes the project look earlier-stage than the current source actually is
  2. Broken parser state in two flagship analyzers

    • scripts/perf_bottleneck_finder.py
    • scripts/dependency_graph.py
  3. Stub-vs-test mismatch

    • scripts/refactoring_opportunity_finder.py is a placeholder
    • scripts/test_refactoring_opportunity_finder.py assumes a mature implementation
  4. CI blind spot

    • make test does not represent full-repo pytest health
    • broader collection surfaces more problems than the workflow currently enforces
  5. Dependency declaration drift

    • yaml appears in source while requirements.txt only lists pytest
  6. Warning debt

    • datetime.utcnow() deprecation noise in scripts/test_priority_rebalancer.py
  7. Existing target-repo genome drift

    • checked-in GENOME.md already exists on upstream main, but it undersells the real code surface and should not be treated as authoritative without fresh source verification

Key Findings

  1. compounding-intelligence has already evolved into a multi-engine analysis toolkit, not just a future three-pipeline concept.
  2. The most grounded working path today is transcript → session_reader.py → harvester.py / bootstrapper.py with a structured knowledge store.
  3. The repo has real, working higher-order analyzers beyond harvesting: knowledge_gap_identifier.py, priority_rebalancer.py, improvement_proposals.py, automation_opportunity_finder.py, and dead_code_detector.py.
  4. The current target-repo GENOME.md is useful evidence but stale as a full architectural description.
  5. Test health is mixed: a broad, meaningful passing slice exists (70 passed), but canonical CI is currently broken by the refactoring finder contract mismatch, and full collection exposes additional syntax failures.
  6. Three concrete follow-up issues were warranted and filed during this genome pass:
    • https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/210
    • https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/211
    • https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/issues/212

This host-repo genome artifact is the grounded cross-repo analysis requested by timmy-home #676. It intentionally treats the target repo's own GENOME.md as evidence rather than gospel, because current source, tests, and verification commands show a significantly more mature — and partially broken — system than the older upstream genome describes.