# GENOME.md — compounding-intelligence
Auto-generated codebase genome. Repo 9/16 in the Codebase Genome series.
## Project Overview
compounding-intelligence turns 1B+ daily tokens into durable, compounding fleet intelligence. It solves the core problem of AI agent amnesia: every session starts at zero, rediscovering the same facts, pitfalls, and patterns that previous sessions already learned.
The project implements three pipelines forming a compounding loop:

```
SESSION ENDS --> HARVESTER --> KNOWLEDGE STORE --> BOOTSTRAPPER --> NEW SESSION STARTS SMARTER
                                      |
                                  MEASURER --> prove it's working
```

**Key insight:** Intelligence from a million tokens of work evaporates when the session ends. This project captures it, stores it, and injects it into future sessions so they start smarter.
## Architecture

```mermaid
graph LR
    A[Session Transcripts] -->|Harvester| B[Knowledge Store]
    B -->|Bootstrapper| C[New Session Context]
    C --> D[Agent Work]
    D --> A
    B -->|Measurer| E[Dashboard]
    E -->|Metrics| F[Proof of Compounding]
    subgraph Knowledge Store
        B1["index.json"]
        B2["global/"]
        B3["repos/{repo}.md"]
        B4["agents/{agent}.md"]
    end
```
### Pipeline 1: Harvester
- **Input:** Finished session transcripts (JSONL format)
- **Process:** An LLM extracts durable knowledge using a structured prompt
- **Output:** Facts stored in the `knowledge/` directory
- **Categories:** fact, pitfall, pattern, tool-quirk, question
- **Deduplication:** Content-hash based; existing knowledge takes priority
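The content-hash deduplication step could be sketched as follows. This is a minimal illustration, not the repo's implementation: the choice of SHA-256 and the normalization (strip + lowercase) are assumptions.

```python
import hashlib

def fact_hash(fact: str) -> str:
    """Content hash over normalized fact text (SHA-256 and the
    strip/lowercase normalization are assumptions, not the repo's spec)."""
    return hashlib.sha256(fact.strip().lower().encode("utf-8")).hexdigest()

def merge_facts(existing: list[dict], harvested: list[dict]) -> list[dict]:
    """Append only facts whose hash is unseen; existing knowledge wins."""
    seen = {fact_hash(item["fact"]) for item in existing}
    merged = list(existing)
    for item in harvested:
        h = fact_hash(item["fact"])
        if h not in seen:
            seen.add(h)
            merged.append(item)
    return merged
```

Because existing items seed the `seen` set first, a re-harvested duplicate never displaces the stored original, matching the "existing knowledge has priority" rule.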
### Pipeline 2: Bootstrapper
- **Input:** The `knowledge/` store
- **Process:** Queries for relevant facts and assembles a compact 2k-token context
- **Output:** Context injected at session start
- **Goal:** New sessions start with full situational awareness
### Pipeline 3: Measurer
- **Input:** Knowledge store + session metrics
- **Output:** `Dashboard.md` + daily reports
- **Process:** Tracks knowledge velocity, error reduction, and hit rate
- **Goal:** Prove the compounding loop works
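Two of the metrics above might be computed like this. The definitions (velocity as facts per day over a trailing window, hit rate as referenced/injected) are assumptions for illustration; the repo does not specify formulas.

```python
from datetime import datetime, timezone

def knowledge_velocity(facts: list[dict], days: int = 7) -> float:
    """Facts added per day over a trailing window, based on the
    migrated_at ISO 8601 timestamps in the schema."""
    now = datetime.now(timezone.utc)
    recent = sum(
        1 for f in facts
        if (now - datetime.fromisoformat(f["migrated_at"])).days < days
    )
    return recent / days

def hit_rate(injected: int, referenced: int) -> float:
    """Fraction of injected facts a session actually used
    (this definition of 'hit rate' is assumed)."""
    return referenced / injected if injected else 0.0
```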
## Directory Structure

```
compounding-intelligence/
|-- README.md                    # Project overview and roadmap
|-- knowledge/
|   |-- index.json               # Machine-readable fact index (versioned)
|   |-- global/                  # Cross-repo knowledge
|   |-- repos/{repo}.md          # Per-repo knowledge files
|   |-- agents/{agent}.md        # Agent-type notes
|-- scripts/
|   |-- test_harvest_prompt.py                 # Validation for harvest prompt output
|   |-- test_harvest_prompt_comprehensive.py   # Extended test suite
|-- templates/
|   |-- harvest-prompt.md        # LLM prompt for knowledge extraction
|-- metrics/
|   |-- .gitkeep                 # Placeholder for dashboard
|-- test_sessions/
    |-- session_failure.jsonl    # Test data: failed session
    |-- session_partial.jsonl    # Test data: partial session
    |-- session_patterns.jsonl   # Test data: pattern extraction
    |-- session_questions.jsonl  # Test data: question identification
    |-- session_success.jsonl    # Test data: successful session
```
## Entry Points

| File | Purpose | Entry |
|---|---|---|
| `templates/harvest-prompt.md` | Extraction prompt | LLM input template |
| `scripts/test_harvest_prompt.py` | Validation | `python3 test_harvest_prompt.py` |
| `knowledge/index.json` | Data store | Read/write by all pipelines |
## Data Flow
1. Agent completes session -> session transcript (JSONL)
2. Harvester reads transcript
3. LLM processes via harvest-prompt.md template
4. Extracted knowledge validated against schema
5. Deduplicated against existing index.json
6. New facts appended with source attribution
7. Bootstrapper queries index.json for relevant facts
8. Context injected into next session
9. Measurer tracks velocity and quality metrics
## Knowledge Schema

Each knowledge item in `index.json` has this shape (range and pipe notations describe allowed values, not literal JSON):

```json
{
  "fact": "One sentence description",
  "category": "fact|pitfall|pattern|tool-quirk|question",
  "repo": "Repository name or 'global'",
  "confidence": 0.0-1.0,
  "source": "mempalace|fact_store|skill|harvester",
  "source_file": "Origin file if applicable",
  "migrated_at": "ISO 8601 timestamp"
}
```
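A validator for this schema might look like the following sketch. The function name and error-list return style are illustrative; the allowed value sets come straight from the schema above.

```python
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
VALID_SOURCES = {"mempalace", "fact_store", "skill", "harvester"}

def validate_item(item: dict) -> list[str]:
    """Return a list of schema violations (empty list means valid)."""
    errors = []
    if not isinstance(item.get("fact"), str) or not item["fact"].strip():
        errors.append("fact must be a non-empty string")
    if item.get("category") not in VALID_CATEGORIES:
        errors.append(f"invalid category: {item.get('category')!r}")
    conf = item.get("confidence")
    if not isinstance(conf, (int, float)) or not 0.0 <= conf <= 1.0:
        errors.append("confidence must be a number in [0.0, 1.0]")
    if item.get("source") not in VALID_SOURCES:
        errors.append(f"invalid source: {item.get('source')!r}")
    return errors
```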
## Confidence Scoring
- 0.9-1.0: Explicitly stated with verification
- 0.7-0.8: Clearly implied by multiple data points
- 0.5-0.6: Suggested but not fully verified
- 0.3-0.4: Inferred from limited data
- 0.1-0.2: Speculative or uncertain
## Key Abstractions

- **Knowledge Item:** Atomic unit of extracted intelligence. One fact, one category, one confidence score.
- **Knowledge Store:** Directory-based persistent storage with a JSON index.
- **Harvest Prompt:** Structured LLM prompt that converts session transcripts to knowledge items.
- **Bootstrap Context:** Compact 2k-token summary injected at session start.
- **Compounding Loop:** The cycle of extract -> store -> inject -> work -> extract.
## API Surface

### Knowledge Store (file-based)
- **Read:** `knowledge/index.json` — all facts
- **Write:** Append to `index.json` after deduplication
- **Query:** Filter by category, repo, or confidence threshold

### Templates
- `harvest-prompt.md`: Input template for LLM extraction
- `bootstrap-context.md`: Output template for session injection
## Test Coverage

| Test File | Covers | Status |
|---|---|---|
| `test_harvest_prompt.py` | Schema validation, required fields | Present |
| `test_harvest_prompt_comprehensive.py` | Extended validation, edge cases | Present |
| `test_sessions/session_failure.jsonl` | Failure extraction | Test data |
| `test_sessions/session_partial.jsonl` | Partial session handling | Test data |
| `test_sessions/session_patterns.jsonl` | Pattern extraction | Test data |
| `test_sessions/session_questions.jsonl` | Question identification | Test data |
| `test_sessions/session_success.jsonl` | Full extraction | Test data |
### Gaps
- No integration tests for full harvester pipeline
- No tests for bootstrapper context assembly
- No tests for measurer metrics computation
- No tests for deduplication logic
- No CI pipeline configured
## Security Considerations
- Knowledge injection: Bootstrapper injects context from knowledge store. Malicious facts in the store could influence agent behavior. Trust scoring partially mitigates this.
- Session transcripts: May contain sensitive data (tokens, API keys). Harvester must filter sensitive patterns before storage.
- LLM extraction: Harvest prompt instructs "no hallucination" but LLMs can still confabulate. Confidence scoring and source attribution provide auditability.
- File-based storage: No access control on knowledge files. Anyone with filesystem access can read/modify.
## Dependencies
- Python 3.10+
- No external packages (stdlib only)
- LLM access for harvester pipeline (Ollama or cloud provider)
- Hermes agent framework for session management
## Status
- Phase: Early development
- Epics: 4 (Harvester, Knowledge Store, Bootstrap, Measurement)
- Milestone: 4 (Retroactive Harvest)
- Open Issues: Active development across harvester and knowledge store pipelines