Compare commits

..

4 Commits

10 changed files with 1078 additions and 845 deletions

View File

@@ -1,114 +0,0 @@
# Knowledge File Format Specification
**Version:** 1
**Issue:** #10
**Status:** Draft
---
## Overview
The knowledge system has two layers:
1. **index.json** — Machine-readable fact index. Fast lookups by ID, category, repo, tags.
2. **Knowledge files** (YAML) — Human-readable, editable facts organized by domain.
The harvester writes to both. The bootstrapper reads from index.json. Humans edit the YAML files directly.
---
## index.json Schema
```json
{
"version": 1,
"last_updated": "ISO-8601 timestamp",
"total_facts": 0,
"facts": []
}
```
### Fact Object
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `id` | string | yes | Unique identifier: `{domain}:{category}:{sequence}` |
| `fact` | string | yes | One-sentence description of the knowledge |
| `category` | enum | yes | One of: `fact`, `pitfall`, `pattern`, `tool-quirk`, `question` |
| `domain` | string | yes | Where this applies: repo name, `global`, or agent name |
| `confidence` | float | yes | 0.01.0. How certain is this knowledge? |
| `tags` | string[] | no | Searchable labels |
| `source_count` | int | no | How many sessions confirmed this fact |
| `first_seen` | date | no | ISO-8601 date first extracted |
| `last_confirmed` | date | no | ISO-8601 date last seen in a session |
| `expires` | date | no | Optional. After this date, fact is stale |
| `related` | string[] | no | IDs of related facts |
### ID Format: `{domain}:{category}:{sequence}`
### Categories
| Category | Definition |
|----------|------------|
| `fact` | Concrete, verifiable information |
| `pitfall` | Errors, wrong assumptions, time-wasters |
| `pattern` | Successful sequences of actions |
| `tool-quirk` | Environment-specific behaviors |
| `question` | Identified but unanswered |
### Confidence Scoring
| Range | Meaning |
|-------|---------|
| 0.91.0 | Explicitly stated and verified |
| 0.70.8 | Clearly implied by multiple data points |
| 0.50.6 | Suggested but not fully verified |
| 0.30.4 | Inferred from limited data |
| 0.10.2 | Speculative or uncertain |
---
## Directory Structure
```
knowledge/
├── index.json # Machine-readable fact index
├── SCHEMA.md # This file
├── global/ # Cross-repo knowledge
│ ├── pitfalls.yaml
│ ├── patterns.yaml
│ └── tool-quirks.yaml
├── repos/ # Per-repo knowledge
│ ├── {repo-name}.yaml
│ └── ...
└── agents/ # Agent-type knowledge
└── {agent-type}.yaml
```
## YAML File Format
YAML files use frontmatter for metadata, then markdown sections with fact entries:
```yaml
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---
# Title
## Section
- id: global:tool-quirk:001
fact: "Description"
confidence: 0.95
tags: [tag1, tag2]
source_count: 5
first_seen: "2026-03-27"
```
## Validation
Run `python scripts/validate_knowledge.py` to validate index.json.

View File

@@ -1,80 +0,0 @@
---
domain: global
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (Global)
Cross-repo traps that waste time across the fleet.
## Git & Forge
- id: global:pitfall:001
fact: "Branch protection requires 1 approval on main - API merges fail with 405 without it"
confidence: 0.95
tags: [git, merge, branch-protection, gitea]
source_count: 12
first_seen: "2026-04-05"
last_confirmed: "2026-04-13"
related: [the-nexus:pitfall:001]
- id: global:pitfall:002
fact: "Never use --no-verify on git commits - it bypasses all hooks including safety checks"
confidence: 0.95
tags: [git, hooks, safety]
source_count: 5
first_seen: "2026-03-28"
last_confirmed: "2026-04-13"
- id: global:pitfall:003
fact: "Gitea PR creation workaround needed on the-nexus - direct API call fails, use alternative endpoint"
confidence: 0.9
tags: [gitea, pr, api, workaround]
source_count: 4
first_seen: "2026-04-06"
last_confirmed: "2026-04-12"
## Agent Operations
- id: global:pitfall:004
fact: "Anthropic is BANNED from fallback chain - if fallback triggers to Anthropic, something is wrong"
confidence: 0.95
tags: [provider, anthropic, fallback]
source_count: 7
first_seen: "2026-03-30"
last_confirmed: "2026-04-13"
- id: global:pitfall:005
fact: "Telegram tokens expired - don't assume Telegram notifications work without checking"
confidence: 0.85
tags: [telegram, notifications, token]
source_count: 3
first_seen: "2026-04-02"
- id: global:pitfall:006
fact: "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run"
confidence: 0.9
tags: [gateway, cron, process]
source_count: 4
first_seen: "2026-04-04"
last_confirmed: "2026-04-11"
## Testing
- id: global:pitfall:007
fact: "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory"
confidence: 0.9
tags: [pytest, test, collection]
source_count: 3
first_seen: "2026-04-07"
last_confirmed: "2026-04-13"
- id: global:pitfall:008
fact: "TDD: test 1 before building 55 - verify the cycle works before scaling"
confidence: 0.95
tags: [tdd, testing, methodology]
source_count: 8
first_seen: "2026-03-25"
last_confirmed: "2026-04-13"

View File

@@ -1,71 +0,0 @@
---
domain: global
category: tool-quirk
version: 1
last_updated: "2026-04-13"
---
# Tool Quirks (Global)
## Authentication
- id: global:tool-quirk:001
fact: "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN"
confidence: 0.95
tags: [git, auth, gitea, token]
source_count: 23
first_seen: "2026-03-27"
last_confirmed: "2026-04-13"
related: [global:pitfall:001]
- id: global:tool-quirk:002
fact: "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer"
confidence: 0.9
tags: [git, api, gitea]
source_count: 8
first_seen: "2026-03-28"
last_confirmed: "2026-04-12"
- id: global:tool-quirk:003
fact: "Gitea Issues API type=issues param does NOT filter PRs - use truthiness check on pull_request field"
confidence: 0.95
tags: [gitea, api, issues, pr]
source_count: 6
first_seen: "2026-04-01"
last_confirmed: "2026-04-13"
## Paths & Environment
- id: global:tool-quirk:004
fact: "~/.hermes is the default hermes home - check get_hermes_home() not the path literal"
confidence: 0.9
tags: [paths, hermes, env]
source_count: 10
first_seen: "2026-03-30"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:005]
- id: global:tool-quirk:005
fact: "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin"
confidence: 0.85
tags: [ansible, vault, config]
source_count: 3
first_seen: "2026-04-02"
## Model & Inference
- id: global:tool-quirk:006
fact: "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available"
confidence: 0.95
tags: [model, provider, nous, default]
source_count: 15
first_seen: "2026-03-25"
last_confirmed: "2026-04-13"
- id: global:tool-quirk:007
fact: "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume"
confidence: 0.95
tags: [hermes, model, restart, session]
source_count: 8
first_seen: "2026-03-29"
last_confirmed: "2026-04-12"

View File

@@ -0,0 +1,10 @@
{
"last_harvest": "2026-04-14T18:04:45.484759+00:00",
"harvested_sessions": [
"20260413_175935_20cb44",
"20260413_171106_62c276",
"20260413_181734_aed35b"
],
"total_sessions_processed": 3,
"total_facts_extracted": 59
}

View File

@@ -1,472 +1,597 @@
{
"version": 1,
"last_updated": "2026-04-13T20:00:00Z",
"total_facts": 29,
"last_updated": "2026-04-14T18:04:45.484238+00:00",
"total_facts": 59,
"facts": [
{
"id": "hermes-agent:pitfall:001",
"fact": "deploy-crons.py leaves jobs in mixed model format",
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_z8ielhro/script.py",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.95,
"tags": [
"cron",
"deploy",
"model",
"config"
],
"source_count": 5,
"first_seen": "2026-04-08",
"last_confirmed": "2026-04-13",
"related": [
"hermes-agent:pitfall:002",
"hermes-agent:pitfall:003"
]
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477585+00:00",
"harvested_at": "2026-04-14T18:04:45.479057+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:002",
"fact": "deploy-crons.py --deploy doesn't set legacy skill field from skills list",
"fact": "Error encountered with file: crons.py",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"cron",
"deploy",
"skills"
],
"source_count": 3,
"first_seen": "2026-04-09",
"last_confirmed": "2026-04-13",
"related": [
"hermes-agent:pitfall:001"
]
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477603+00:00",
"harvested_at": "2026-04-14T18:04:45.479059+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:003",
"fact": "Cron jobs with blank fallback_model fields trigger spurious gateway warnings",
"fact": "Error encountered with file: 300.07",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"cron",
"model",
"fallback"
],
"source_count": 4,
"first_seen": "2026-04-07",
"last_confirmed": "2026-04-12",
"related": [
"hermes-agent:pitfall:001"
]
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477614+00:00",
"harvested_at": "2026-04-14T18:04:45.479060+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:004",
"fact": "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms",
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox__3wxy21d/script.py",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"watchdog",
"model",
"config"
],
"source_count": 3,
"first_seen": "2026-04-08",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477622+00:00",
"harvested_at": "2026-04-14T18:04:45.479061+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:005",
"fact": "10+ files read HERMES_HOME directly instead of get_hermes_home()",
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_dimnu9ba/script.py",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.85,
"tags": [
"paths",
"env",
"hermes-home"
],
"source_count": 6,
"first_seen": "2026-04-06",
"last_confirmed": "2026-04-12",
"related": [
"global:pitfall:002"
]
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477633+00:00",
"harvested_at": "2026-04-14T18:04:45.479062+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:006",
"fact": "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set",
"fact": "Error encountered with file: nhermes_cli/cron.py",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.8,
"tags": [
"paths",
"env",
"bug"
],
"source_count": 2,
"first_seen": "2026-04-05"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477664+00:00",
"harvested_at": "2026-04-14T18:04:45.479062+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:007",
"fact": "vps-agent-dispatch reports OK while remote hermes binary path is broken",
"fact": "Error encountered with file: hermes_cli/cron.py",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.9,
"tags": [
"ssh",
"dispatch",
"vps"
],
"source_count": 4,
"first_seen": "2026-04-07",
"last_confirmed": "2026-04-11"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477793+00:00",
"harvested_at": "2026-04-14T18:04:45.479063+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "hermes-agent:pitfall:008",
"fact": "nightwatch-health-monitor SSH check fails on cloud-model-only deployments",
"fact": "Error encountered with file: config.yaml",
"category": "pitfall",
"domain": "hermes-agent",
"confidence": 0.85,
"tags": [
"ssh",
"health",
"cloud"
],
"source_count": 2,
"first_seen": "2026-04-10"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.477921+00:00",
"harvested_at": "2026-04-14T18:04:45.479064+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "the-nexus:pitfall:001",
"fact": "Merges fail with HTTP 405 due to branch protection",
"fact": "Error encountered with file: ~/.hermes",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.95,
"tags": [
"git",
"merge",
"branch-protection",
"gitea"
],
"source_count": 12,
"first_seen": "2026-04-05",
"last_confirmed": "2026-04-13",
"related": [
"global:pitfall:001"
]
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478092+00:00",
"harvested_at": "2026-04-14T18:04:45.479065+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "the-nexus:pitfall:002",
"fact": "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests",
"fact": "Error encountered with file: ncli.py",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.95,
"tags": [
"server",
"concurrency",
"bridge"
],
"source_count": 5,
"first_seen": "2026-04-10",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478281+00:00",
"harvested_at": "2026-04-14T18:04:45.479065+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "the-nexus:pitfall:003",
"fact": "ChatLog.log() crashes on message persistence when index.html has orphaned button tags",
"fact": "Error encountered with file: 300.17",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.9,
"tags": [
"html",
"crash",
"chatlog"
],
"source_count": 3,
"first_seen": "2026-04-12",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478293+00:00",
"harvested_at": "2026-04-14T18:04:45.479066+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "the-nexus:pitfall:004",
"fact": "Three.js LOD not implemented - local hardware struggles with full scene",
"fact": "Error encountered with file: 10.88",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.85,
"tags": [
"threejs",
"performance",
"lod"
],
"source_count": 4,
"first_seen": "2026-04-09",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478370+00:00",
"harvested_at": "2026-04-14T18:04:45.479067+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "the-nexus:pitfall:005",
"fact": "Duplicate content blocks appear in index.html when PR merges conflict silently",
"fact": "Error encountered with file: k2.5",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.8,
"tags": [
"html",
"merge-conflict",
"duplicate"
],
"source_count": 3,
"first_seen": "2026-04-11",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478386+00:00",
"harvested_at": "2026-04-14T18:04:45.479067+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "the-nexus:pitfall:006",
"fact": "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS",
"fact": "Error encountered with file: 300.92",
"category": "pitfall",
"domain": "the-nexus",
"confidence": 0.9,
"tags": [
"deploy",
"websocket",
"http",
"cors"
],
"source_count": 4,
"first_seen": "2026-04-10",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478410+00:00",
"harvested_at": "2026-04-14T18:04:45.479068+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:tool-quirk:001",
"fact": "Gitea token stored at ~/.config/gitea/token, not env var GITEA_TOKEN",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"git",
"auth",
"gitea",
"token"
],
"source_count": 23,
"first_seen": "2026-03-27",
"last_confirmed": "2026-04-13",
"related": [
"global:pitfall:001"
]
"fact": "Successful command pattern: python observatory.py --check ",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478498+00:00",
"harvested_at": "2026-04-14T18:04:45.479069+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:tool-quirk:002",
"fact": "Gitea API uses 'Authorization: token TOKEN' header format, not Bearer",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.9,
"tags": [
"git",
"api",
"gitea"
],
"source_count": 8,
"first_seen": "2026-03-28",
"last_confirmed": "2026-04-12"
},
{
"id": "global:tool-quirk:003",
"fact": "Gitea Issues API type=issues param does NOT filter PRs",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"gitea",
"api",
"issues",
"pr"
],
"source_count": 6,
"first_seen": "2026-04-01",
"last_confirmed": "2026-04-13"
},
{
"id": "global:tool-quirk:004",
"fact": "~/.hermes is the default hermes home - check get_hermes_home() not the path literal",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.9,
"tags": [
"paths",
"hermes",
"env"
],
"source_count": 10,
"first_seen": "2026-03-30",
"last_confirmed": "2026-04-13",
"related": [
"hermes-agent:pitfall:005"
]
},
{
"id": "global:tool-quirk:005",
"fact": "Ansible vault-encrypted vars in YAML require vault_inline_vars plugin",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.85,
"tags": [
"ansible",
"vault",
"config"
],
"source_count": 3,
"first_seen": "2026-04-02"
},
{
"id": "global:tool-quirk:006",
"fact": "mimo-v2-pro via Nous Research is the default model - don't assume Anthropic is available",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"model",
"provider",
"nous",
"default"
],
"source_count": 15,
"first_seen": "2026-03-25",
"last_confirmed": "2026-04-13"
},
{
"id": "global:tool-quirk:007",
"fact": "Kill + restart with 'hermes chat' preserves old model state - NEVER use --resume",
"category": "tool-quirk",
"domain": "global",
"confidence": 0.95,
"tags": [
"hermes",
"model",
"restart",
"session"
],
"source_count": 8,
"first_seen": "2026-03-29",
"last_confirmed": "2026-04-12"
},
{
"id": "global:pitfall:001",
"fact": "Branch protection requires 1 approval on main - API merges fail with 405 without it",
"fact": "Error encountered with file: devkit/health.py",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"git",
"merge",
"branch-protection",
"gitea"
],
"source_count": 12,
"first_seen": "2026-04-05",
"last_confirmed": "2026-04-13",
"related": [
"the-nexus:pitfall:001"
]
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478571+00:00",
"harvested_at": "2026-04-14T18:04:45.479069+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:002",
"fact": "Never use --no-verify on git commits",
"fact": "Error encountered with file: CHANGELOG.md",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"git",
"hooks",
"safety"
],
"source_count": 5,
"first_seen": "2026-03-28",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478608+00:00",
"harvested_at": "2026-04-14T18:04:45.479070+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:003",
"fact": "Gitea PR creation workaround needed on the-nexus - direct API call fails",
"fact": "Error encountered with file: 300.06",
"category": "pitfall",
"domain": "global",
"confidence": 0.9,
"tags": [
"gitea",
"pr",
"api",
"workaround"
],
"source_count": 4,
"first_seen": "2026-04-06",
"last_confirmed": "2026-04-12"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478635+00:00",
"harvested_at": "2026-04-14T18:04:45.479071+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:004",
"fact": "Anthropic is BANNED from fallback chain",
"fact": "Error encountered with file: 300.03",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"provider",
"anthropic",
"fallback"
],
"source_count": 7,
"first_seen": "2026-03-30",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478658+00:00",
"harvested_at": "2026-04-14T18:04:45.479072+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:005",
"fact": "Telegram tokens expired - don't assume Telegram notifications work",
"fact": "Error encountered with file: crons.py",
"category": "pitfall",
"domain": "global",
"confidence": 0.85,
"tags": [
"telegram",
"notifications",
"token"
],
"source_count": 3,
"first_seen": "2026-04-02"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478703+00:00",
"harvested_at": "2026-04-14T18:04:45.479072+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:006",
"fact": "Multiple gateways = 'cannot schedule futures' error - only one gateway process should run",
"fact": "Error encountered with file: crons.py",
"category": "pitfall",
"domain": "global",
"confidence": 0.9,
"tags": [
"gateway",
"cron",
"process"
],
"source_count": 4,
"first_seen": "2026-04-04",
"last_confirmed": "2026-04-11"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478757+00:00",
"harvested_at": "2026-04-14T18:04:45.479073+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:007",
"fact": "pytest root collection picks up operational *_test.py scripts - restrict to tests/ directory",
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_1h5nj9lg/script.py",
"category": "pitfall",
"domain": "global",
"confidence": 0.9,
"tags": [
"pytest",
"test",
"collection"
],
"source_count": 3,
"first_seen": "2026-04-07",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478778+00:00",
"harvested_at": "2026-04-14T18:04:45.479074+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"id": "global:pitfall:008",
"fact": "TDD: test 1 before building 55",
"fact": "Error encountered with file: job.get",
"category": "pitfall",
"domain": "global",
"confidence": 0.95,
"tags": [
"tdd",
"testing",
"methodology"
],
"source_count": 8,
"first_seen": "2026-03-25",
"last_confirmed": "2026-04-13"
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478833+00:00",
"harvested_at": "2026-04-14T18:04:45.479074+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"fact": "Error encountered with file: CreateIssueOption.Labels",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.478975+00:00",
"harvested_at": "2026-04-14T18:04:45.479075+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"fact": "Successful command pattern: git process seems to be running in this repository",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_175935_20cb44",
"extracted_at": "2026-04-14T18:04:45.479018+00:00",
"harvested_at": "2026-04-14T18:04:45.479076+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_175935_20cb44.json"
},
{
"fact": "Error encountered with file: ~/.hermes",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.479242+00:00",
"harvested_at": "2026-04-14T18:04:45.482379+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: pokayoke/hermes_constants.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.479346+00:00",
"harvested_at": "2026-04-14T18:04:45.482380+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: Path.home",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.479565+00:00",
"harvested_at": "2026-04-14T18:04:45.482380+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_5pwgex20/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.479901+00:00",
"harvested_at": "2026-04-14T18:04:45.482381+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: 300.11",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.480675+00:00",
"harvested_at": "2026-04-14T18:04:45.482382+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: AIAgent.__init__",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.480862+00:00",
"harvested_at": "2026-04-14T18:04:45.482383+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: job.ge",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481044+00:00",
"harvested_at": "2026-04-14T18:04:45.482383+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: cron/scheduler.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481254+00:00",
"harvested_at": "2026-04-14T18:04:45.482384+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: __main__.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481644+00:00",
"harvested_at": "2026-04-14T18:04:45.482385+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: tests/test_prompt_injection_defense.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481654+00:00",
"harvested_at": "2026-04-14T18:04:45.482385+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_v2umc709/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481666+00:00",
"harvested_at": "2026-04-14T18:04:45.482386+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: pytest.mark",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481733+00:00",
"harvested_at": "2026-04-14T18:04:45.482387+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: ntests/test_prompt_injection_defense.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481788+00:00",
"harvested_at": "2026-04-14T18:04:45.482388+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: result.get",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.481979+00:00",
"harvested_at": "2026-04-14T18:04:45.482388+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: concurrent.future",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.482228+00:00",
"harvested_at": "2026-04-14T18:04:45.482389+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: 0.0",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.482252+00:00",
"harvested_at": "2026-04-14T18:04:45.482390+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_mjbblg0z/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_171106_62c276",
"extracted_at": "2026-04-14T18:04:45.482315+00:00",
"harvested_at": "2026-04-14T18:04:45.482390+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_171106_62c276.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_u2ngkm60/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.482463+00:00",
"harvested_at": "2026-04-14T18:04:45.484207+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_i63vbaem/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.482569+00:00",
"harvested_at": "2026-04-14T18:04:45.484208+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: 3.12",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.482589+00:00",
"harvested_at": "2026-04-14T18:04:45.484209+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Successful command pattern: git restore --staged ",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.482629+00:00",
"harvested_at": "2026-04-14T18:04:45.484209+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: forge.alexanderwhitestone",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.482645+00:00",
"harvested_at": "2026-04-14T18:04:45.484210+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Successful command pattern: git restore --staged ",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483301+00:00",
"harvested_at": "2026-04-14T18:04:45.484211+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: ntests/test_repo_truth.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483472+00:00",
"harvested_at": "2026-04-14T18:04:45.484211+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Successful command pattern: git restore --staged ",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483479+00:00",
"harvested_at": "2026-04-14T18:04:45.484212+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: 300.02",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483596+00:00",
"harvested_at": "2026-04-14T18:04:45.484213+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Successful command pattern: git restore --staged ",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483603+00:00",
"harvested_at": "2026-04-14T18:04:45.484213+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Successful command pattern: git restore --staged ",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483697+00:00",
"harvested_at": "2026-04-14T18:04:45.484214+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: 300.37",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483785+00:00",
"harvested_at": "2026-04-14T18:04:45.484215+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_2k0n79t8/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483792+00:00",
"harvested_at": "2026-04-14T18:04:45.484216+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: 300.19",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483864+00:00",
"harvested_at": "2026-04-14T18:04:45.484216+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: /private/var/folders/9k/v07xkpp133v03yynn9nx80fr0000gn/T/hermes_sandbox_qxzsy_kv/script.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483919+00:00",
"harvested_at": "2026-04-14T18:04:45.484217+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: CreateIssueOption.Labels",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483930+00:00",
"harvested_at": "2026-04-14T18:04:45.484218+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
},
{
"fact": "Error encountered with file: verify_triage_status.py",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": "20260413_181734_aed35b",
"extracted_at": "2026-04-14T18:04:45.483963+00:00",
"harvested_at": "2026-04-14T18:04:45.484218+00:00",
"session_path": "/Users/apayne/.hermes/sessions/session_20260413_181734_aed35b.json"
}
]
}

View File

@@ -1,80 +0,0 @@
---
domain: hermes-agent
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (hermes-agent)
## Cron & Deployment
- id: hermes-agent:pitfall:001
fact: "deploy-crons.py leaves jobs in mixed model format - some have provider/model, some just model"
confidence: 0.95
tags: [cron, deploy, model, config]
source_count: 5
first_seen: "2026-04-08"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:002, hermes-agent:pitfall:003]
- id: hermes-agent:pitfall:002
fact: "deploy-crons.py --deploy doesn't set legacy skill field from skills list"
confidence: 0.9
tags: [cron, deploy, skills]
source_count: 3
first_seen: "2026-04-09"
last_confirmed: "2026-04-13"
related: [hermes-agent:pitfall:001]
- id: hermes-agent:pitfall:003
fact: "Cron jobs with blank fallback_model fields trigger spurious gateway warnings"
confidence: 0.9
tags: [cron, model, fallback]
source_count: 4
first_seen: "2026-04-07"
last_confirmed: "2026-04-12"
related: [hermes-agent:pitfall:001]
- id: hermes-agent:pitfall:004
fact: "model-watchdog.py checks first provider line, not model.provider - causes false drift alarms"
confidence: 0.9
tags: [watchdog, model, config]
source_count: 3
first_seen: "2026-04-08"
last_confirmed: "2026-04-13"
## Path & Environment
- id: hermes-agent:pitfall:005
fact: "10+ files read HERMES_HOME directly instead of get_hermes_home() - breaks on custom paths"
confidence: 0.85
tags: [paths, env, hermes-home]
source_count: 6
first_seen: "2026-04-06"
last_confirmed: "2026-04-12"
related: [global:pitfall:002]
- id: hermes-agent:pitfall:006
fact: "get_hermes_home() doesn't expand tilde when HERMES_HOME=~/... is set"
confidence: 0.8
tags: [paths, env, bug]
source_count: 2
first_seen: "2026-04-05"
## SSH & Dispatch
- id: hermes-agent:pitfall:007
fact: "vps-agent-dispatch reports OK while remote hermes binary path is broken"
confidence: 0.9
tags: [ssh, dispatch, vps]
source_count: 4
first_seen: "2026-04-07"
last_confirmed: "2026-04-11"
- id: hermes-agent:pitfall:008
fact: "nightwatch-health-monitor SSH check fails on cloud-model-only deployments"
confidence: 0.85
tags: [ssh, health, cloud]
source_count: 2
first_seen: "2026-04-10"

View File

@@ -1,63 +0,0 @@
---
domain: the-nexus
category: pitfall
version: 1
last_updated: "2026-04-13"
---
# Pitfalls (the-nexus)
## Git & Merging
- id: the-nexus:pitfall:001
fact: "Merges fail with HTTP 405 due to branch protection - must use merge API with 1 approval"
confidence: 0.95
tags: [git, merge, branch-protection, gitea]
source_count: 12
first_seen: "2026-04-05"
last_confirmed: "2026-04-13"
related: [global:pitfall:001]
- id: the-nexus:pitfall:002
fact: "ThreadingHTTPServer required for multi-user bridge - standard HTTPServer blocks on concurrent requests"
confidence: 0.95
tags: [server, concurrency, bridge]
source_count: 5
first_seen: "2026-04-10"
last_confirmed: "2026-04-13"
- id: the-nexus:pitfall:003
fact: "ChatLog.log() crashes on message persistence when index.html has orphaned button tags"
confidence: 0.9
tags: [html, crash, chatlog]
source_count: 3
first_seen: "2026-04-12"
last_confirmed: "2026-04-13"
## Three.js & Performance
- id: the-nexus:pitfall:004
fact: "Three.js LOD not implemented - local hardware struggles with full scene without texture optimization"
confidence: 0.85
tags: [threejs, performance, lod]
source_count: 4
first_seen: "2026-04-09"
last_confirmed: "2026-04-13"
- id: the-nexus:pitfall:005
fact: "Duplicate content blocks appear in index.html when PR merges conflict silently"
confidence: 0.8
tags: [html, merge-conflict, duplicate]
source_count: 3
first_seen: "2026-04-11"
last_confirmed: "2026-04-13"
## Deployment
- id: the-nexus:pitfall:006
fact: "Unified HTTP + WebSocket server required for proper URL deployment - separate servers break CORS"
confidence: 0.9
tags: [deploy, websocket, http, cors]
source_count: 4
first_seen: "2026-04-10"
last_confirmed: "2026-04-13"

350
scripts/harvester.py Normal file
View File

@@ -0,0 +1,350 @@
#!/usr/bin/env python3
"""
Session Harvester for Compounding Intelligence.
Extracts durable knowledge from completed sessions and updates the knowledge store.
"""
import json
import os
import sys
import logging
from datetime import datetime, timezone, timedelta
from pathlib import Path
from typing import List, Dict, Any, Optional
# Add parent directory to path for imports
sys.path.insert(0, str(Path(__file__).parent))
from session_reader import SessionReader
# Configure logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(levelname)s - %(message)s',
handlers=[
logging.FileHandler(Path(__file__).parent.parent / 'metrics' / 'harvester.log'),
logging.StreamHandler()
]
)
logger = logging.getLogger(__name__)
class KnowledgeHarvester:
"""Extracts knowledge from completed sessions."""
def __init__(self, repo_root: str = None):
"""Initialize the harvester."""
if repo_root is None:
repo_root = str(Path(__file__).parent.parent)
self.repo_root = Path(repo_root)
self.knowledge_dir = self.repo_root / "knowledge"
self.index_path = self.knowledge_dir / "index.json"
self.prompt_path = self.repo_root / "templates" / "harvest-prompt.md"
# Load or create knowledge index
self.index = self._load_index()
# Initialize session reader
self.reader = SessionReader()
# Harvest state file
self.state_path = self.knowledge_dir / "harvest_state.json"
self.state = self._load_state()
def _load_index(self) -> Dict[str, Any]:
"""Load or create the knowledge index."""
if self.index_path.exists():
with open(self.index_path, 'r') as f:
return json.load(f)
else:
return {
"version": 1,
"last_updated": datetime.now(timezone.utc).isoformat(),
"total_facts": 0,
"facts": []
}
def _save_index(self):
"""Save the knowledge index."""
self.index["last_updated"] = datetime.now(timezone.utc).isoformat()
with open(self.index_path, 'w') as f:
json.dump(self.index, f, indent=2)
def _load_state(self) -> Dict[str, Any]:
"""Load harvest state."""
if self.state_path.exists():
with open(self.state_path, 'r') as f:
return json.load(f)
else:
return {
"last_harvest": None,
"harvested_sessions": [],
"total_sessions_processed": 0,
"total_facts_extracted": 0
}
def _save_state(self):
"""Save harvest state."""
with open(self.state_path, 'w') as f:
json.dump(self.state, f, indent=2)
def get_sessions_to_harvest(self, max_age_hours: float = 24) -> List[Dict[str, Any]]:
"""
Get sessions that need harvesting.
Args:
max_age_hours: Only harvest sessions modified within this many hours
Returns:
List of session data dictionaries
"""
# Get sessions modified since last harvest
since = None
if self.state["last_harvest"]:
try:
since = datetime.fromisoformat(self.state["last_harvest"].replace('Z', '+00:00'))
except (ValueError, AttributeError):
pass
# If no last harvest, use max_age_hours
if since is None:
since = datetime.now(timezone.utc) - timedelta(hours=max_age_hours)
# Get recent sessions
sessions = self.reader.list_sessions(since=since)
# Filter out already harvested sessions
harvested = set(self.state["harvested_sessions"])
to_harvest = []
for path in sessions:
session = self.reader.read_session(path)
if "error" in session:
logger.warning(f"Error reading session {path}: {session['error']}")
continue
# Skip if already harvested
if session["session_id"] in harvested:
continue
# Skip if session is still active
if not self.reader.is_session_complete(session):
continue
to_harvest.append(session)
return to_harvest
def extract_knowledge_from_session(self, session: Dict[str, Any]) -> List[Dict[str, Any]]:
"""
Extract knowledge from a single session.
This is a simplified extraction that looks for patterns in the session.
In a full implementation, this would use an LLM with the harvest prompt.
Args:
session: Session data dictionary
Returns:
List of extracted knowledge items
"""
knowledge_items = []
# Get messages from session
messages = session.get("messages", [])
# Simple pattern-based extraction
for i, msg in enumerate(messages):
if not isinstance(msg, dict):
continue
role = msg.get("role", "")
content = msg.get("content", "")
if not content or not isinstance(content, str):
continue
# Look for error patterns
if "error" in content.lower() or "Error" in content:
# Extract error context
context = content[:200] # First 200 chars
# Look for file paths
import re
file_paths = re.findall(r'[~/.]?[\w/]+\.\w+', context)
if file_paths:
knowledge_items.append({
"fact": f"Error encountered with file: {file_paths[0]}",
"category": "pitfall",
"repo": "global",
"confidence": 0.7,
"session_id": session["session_id"],
"extracted_at": datetime.now(timezone.utc).isoformat()
})
# Look for successful patterns
if "success" in content.lower() or "Success" in content:
# Extract success context
context = content[:200]
# Look for commands or actions
import re
commands = re.findall(r'(?:git|npm|pip|python|curl|ssh)\s+[\w\s\-\.]+', context)
if commands:
knowledge_items.append({
"fact": f"Successful command pattern: {commands[0]}",
"category": "pattern",
"repo": "global",
"confidence": 0.6,
"session_id": session["session_id"],
"extracted_at": datetime.now(timezone.utc).isoformat()
})
return knowledge_items
def harvest_session(self, session: Dict[str, Any]) -> Dict[str, Any]:
"""
Harvest knowledge from a single session.
Args:
session: Session data dictionary
Returns:
Harvest result dictionary
"""
session_id = session["session_id"]
logger.info(f"Harvesting session: {session_id}")
try:
# Extract knowledge
knowledge_items = self.extract_knowledge_from_session(session)
# Add to index
for item in knowledge_items:
# Add metadata
item["harvested_at"] = datetime.now(timezone.utc).isoformat()
item["session_path"] = session.get("path", "")
# Add to facts
self.index["facts"].append(item)
# Update state
self.state["harvested_sessions"].append(session_id)
self.state["total_sessions_processed"] += 1
self.state["total_facts_extracted"] += len(knowledge_items)
result = {
"session_id": session_id,
"success": True,
"facts_extracted": len(knowledge_items),
"knowledge_items": knowledge_items
}
logger.info(f"Extracted {len(knowledge_items)} facts from session {session_id}")
except Exception as e:
logger.error(f"Error harvesting session {session_id}: {e}")
result = {
"session_id": session_id,
"success": False,
"error": str(e),
"facts_extracted": 0
}
return result
def harvest_batch(self, max_sessions: int = 10, max_age_hours: float = 24) -> Dict[str, Any]:
"""
Harvest a batch of sessions.
Args:
max_sessions: Maximum number of sessions to harvest
max_age_hours: Only harvest sessions modified within this many hours
Returns:
Batch harvest result
"""
logger.info(f"Starting harvest batch (max {max_sessions} sessions, max age {max_age_hours}h)")
# Get sessions to harvest
sessions = self.get_sessions_to_harvest(max_age_hours)
if not sessions:
logger.info("No sessions to harvest")
return {
"success": True,
"sessions_processed": 0,
"facts_extracted": 0,
"results": []
}
# Limit to max_sessions
sessions = sessions[:max_sessions]
results = []
total_facts = 0
for session in sessions:
result = self.harvest_session(session)
results.append(result)
if result["success"]:
total_facts += result["facts_extracted"]
# Update index and state
self.index["total_facts"] = len(self.index["facts"])
self._save_index()
self.state["last_harvest"] = datetime.now(timezone.utc).isoformat()
self._save_state()
batch_result = {
"success": True,
"sessions_processed": len(sessions),
"facts_extracted": total_facts,
"results": results,
"timestamp": datetime.now(timezone.utc).isoformat()
}
logger.info(f"Harvest batch complete: {len(sessions)} sessions, {total_facts} facts")
return batch_result
def main():
"""Main entry point for the harvester."""
import argparse
parser = argparse.ArgumentParser(description="Harvest knowledge from completed sessions")
parser.add_argument("--max-sessions", type=int, default=10, help="Maximum sessions to harvest")
parser.add_argument("--max-age-hours", type=float, default=24, help="Max age in hours")
parser.add_argument("--dry-run", action="store_true", help="Don't save, just report")
args = parser.parse_args()
harvester = KnowledgeHarvester()
if args.dry_run:
sessions = harvester.get_sessions_to_harvest(args.max_age_hours)
print(f"Would harvest {len(sessions)} sessions:")
for session in sessions[:5]: # Show first 5
print(f" - {session['session_id']} ({session['message_count']} messages)")
if len(sessions) > 5:
print(f" ... and {len(sessions) - 5} more")
return
result = harvester.harvest_batch(
max_sessions=args.max_sessions,
max_age_hours=args.max_age_hours
)
if result["success"]:
print(f"Harvest complete: {result['sessions_processed']} sessions, {result['facts_extracted']} facts")
else:
print(f"Harvest failed: {result.get('error', 'Unknown error')}")
sys.exit(1)
if __name__ == "__main__":
main()

194
scripts/session_reader.py Normal file
View File

@@ -0,0 +1,194 @@
#!/usr/bin/env python3
"""
Session reader for Compounding Intelligence.
Reads and parses Hermes session files from ~/.hermes/sessions/.
"""
import json
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import List, Dict, Any, Optional
class SessionReader:
"""Reads and parses Hermes session files."""
def __init__(self, sessions_dir: str = None):
"""Initialize with sessions directory path."""
if sessions_dir is None:
sessions_dir = os.path.expanduser("~/.hermes/sessions")
self.sessions_dir = Path(sessions_dir)
self.supported_extensions = {'.json', '.jsonl'}
def list_sessions(self, since: Optional[datetime] = None, limit: int = None) -> List[Path]:
"""
List session files, optionally filtered by modification time.
Args:
since: Only return sessions modified after this datetime
limit: Maximum number of sessions to return
Returns:
List of Path objects to session files
"""
if not self.sessions_dir.exists():
return []
sessions = []
for f in self.sessions_dir.iterdir():
if f.suffix in self.supported_extensions:
if since is not None:
mtime = datetime.fromtimestamp(f.stat().st_mtime, tz=timezone.utc)
if mtime <= since:
continue
sessions.append(f)
# Sort by modification time (newest first)
sessions.sort(key=lambda p: p.stat().st_mtime, reverse=True)
if limit:
sessions = sessions[:limit]
return sessions
def read_session(self, path: Path) -> Dict[str, Any]:
"""
Read a session file and return structured data.
Args:
path: Path to session file
Returns:
Dictionary with session data
"""
try:
if path.suffix == '.jsonl':
return self._read_jsonl_session(path)
elif path.suffix == '.json':
return self._read_json_session(path)
else:
return {"error": f"Unsupported format: {path.suffix}"}
except Exception as e:
return {"error": str(e), "path": str(path)}
def _read_json_session(self, path: Path) -> Dict[str, Any]:
"""Read a JSON format session file."""
with open(path, 'r') as f:
data = json.load(f)
return {
"session_id": data.get("session_id", path.stem),
"model": data.get("model", "unknown"),
"created_at": data.get("session_start"),
"last_updated": data.get("last_updated"),
"message_count": data.get("message_count", len(data.get("messages", []))),
"messages": data.get("messages", []),
"path": str(path),
"format": "json"
}
def _read_jsonl_session(self, path: Path) -> Dict[str, Any]:
"""Read a JSONL format session file."""
messages = []
session_meta = None
with open(path, 'r') as f:
for line in f:
line = line.strip()
if not line:
continue
try:
entry = json.loads(line)
if entry.get("role") == "session_meta":
session_meta = entry
else:
messages.append(entry)
except json.JSONDecodeError:
continue
session_id = path.stem
if session_meta:
session_id = session_meta.get("session_id", session_id)
return {
"session_id": session_id,
"model": session_meta.get("model", "unknown") if session_meta else "unknown",
"created_at": session_meta.get("timestamp") if session_meta else None,
"last_updated": messages[-1].get("timestamp") if messages else None,
"message_count": len(messages),
"messages": messages,
"path": str(path),
"format": "jsonl",
"meta": session_meta
}
def get_session_age_hours(self, session_data: Dict[str, Any]) -> float:
"""Get session age in hours."""
last_updated = session_data.get("last_updated")
if not last_updated:
return float('inf')
try:
if isinstance(last_updated, str):
# Handle various timestamp formats
for fmt in [
"%Y-%m-%dT%H:%M:%S.%fZ",
"%Y-%m-%dT%H:%M:%SZ",
"%Y-%m-%dT%H:%M:%S.%f",
"%Y-%m-%dT%H:%M:%S"
]:
try:
dt = datetime.strptime(last_updated, fmt)
dt = dt.replace(tzinfo=timezone.utc)
break
except ValueError:
continue
else:
# Try parsing with fromisoformat
dt = datetime.fromisoformat(last_updated.replace('Z', '+00:00'))
else:
dt = last_updated
now = datetime.now(timezone.utc)
age = now - dt
return age.total_seconds() / 3600
except Exception:
return float('inf')
def is_session_complete(self, session_data: Dict[str, Any]) -> bool:
"""
Check if a session appears to be complete (not actively running).
Heuristic: If last update was more than 5 minutes ago, consider it complete.
"""
age_hours = self.get_session_age_hours(session_data)
return age_hours > (5 / 60) # 5 minutes
def main():
"""Test the session reader."""
reader = SessionReader()
# List recent sessions
sessions = reader.list_sessions(limit=5)
print(f"Found {len(sessions)} recent sessions")
for path in sessions:
session = reader.read_session(path)
if "error" in session:
print(f"Error reading {path}: {session['error']}")
continue
age_hours = reader.get_session_age_hours(session)
complete = reader.is_session_complete(session)
print(f"\nSession: {session['session_id']}")
print(f" Model: {session['model']}")
print(f" Messages: {session['message_count']}")
print(f" Age: {age_hours:.1f} hours")
print(f" Complete: {complete}")
if __name__ == "__main__":
main()

View File

@@ -1,38 +0,0 @@
#!/usr/bin/env python3
"""Validate knowledge files and index.json against the schema."""
import json, sys
from pathlib import Path
VALID_CATEGORIES = {"fact", "pitfall", "pattern", "tool-quirk", "question"}
REQUIRED = {"id", "fact", "category", "domain", "confidence"}
def validate_fact(fact, src=""):
errs = []
for f in REQUIRED:
if f not in fact: errs.append(f"{src}: missing '{f}'")
if "category" in fact and fact["category"] not in VALID_CATEGORIES:
errs.append(f"{src}: invalid category '{fact['category']}'")
if "confidence" in fact:
if not isinstance(fact["confidence"], (int, float)) or not (0 <= fact["confidence"] <= 1):
errs.append(f"{src}: confidence must be 0.0-1.0")
if "id" in fact:
parts = fact["id"].split(":")
if len(parts) != 3: errs.append(f"{src}: id must be domain:category:sequence")
return errs
def main():
idx = Path(__file__).parent.parent / "knowledge" / "index.json"
if not idx.exists(): print(f"FAILED: {idx} not found"); sys.exit(1)
data = json.load(open(idx))
errs = []
seen = set()
for i, f in enumerate(data.get("facts", [])):
errs.extend(validate_fact(f, f"[{i}]"))
if "id" in f:
if f["id"] in seen: errs.append(f"duplicate id '{f['id']}'")
seen.add(f["id"])
if errs:
print(f"FAILED - {len(errs)} errors:"); [print(f" x {e}") for e in errs]; sys.exit(1)
print(f"PASSED - {len(data.get('facts', []))} facts")
if __name__ == "__main__": main()