Compare commits

...

70 Commits

Author SHA1 Message Date
Perplexity
7ec45642eb feat(ansible): Canonical IaC playbook for fleet management
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 1m27s
Implements the Ansible Infrastructure as Code story from KT 2026-04-08.

One canonical Ansible playbook defines:
- Deadman switch (snapshot good config on health, rollback+restart on death)
- Golden state config deployment (Anthropic BANNED, Kimi→Gemini→Ollama)
- Cron schedule (source-controlled, no manual crontab edits)
- Agent startup sequence (pull→validate→start→verify)
- request_log telemetry table (every inference call logged)
- Thin config pattern (immutable local pointer to upstream)
- Gitea webhook handler (deploy on merge)
- Config validator (rejects banned providers)

Fleet inventory: Timmy (Mac), Allegro (VPS), Bezalel (VPS), Ezra (VPS)

Roles: wizard_base, golden_state, deadman_switch, request_log, cron_manager

Addresses: timmy-config #442, #443, #444, #445, #446
References: KT Final 2026-04-08 P2, KT Bezalel 2026-04-08 #1-#5
2026-04-09 22:25:31 +00:00
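The config validator listed in this commit (rejects banned providers) can be pictured as a small check over a parsed config. A minimal sketch, assuming a config dict with a `fallback_providers` list of `{provider, model}` entries; this is illustrative, not the actual playbook code:

```python
# Hypothetical sketch of a banned-provider validator.
# The config shape (a "fallback_providers" list) is an assumption.
BANNED = {"anthropic", "claude"}
BANNED_MODEL_PREFIXES = ("claude-", "anthropic/")

def validate_config(cfg: dict) -> list:
    """Return a list of violations; an empty list means the config is clean."""
    violations = []
    for entry in cfg.get("fallback_providers", []):
        provider = entry.get("provider", "").lower()
        model = entry.get("model", "").lower()
        if provider in BANNED:
            violations.append(f"banned provider: {provider}")
        if model.startswith(BANNED_MODEL_PREFIXES):
            violations.append(f"banned model: {model}")
    return violations
```

With `enforcement: strict`, a non-empty violation list would fail the playbook run rather than just log a warning.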
a6fded436f Merge PR #431
Co-authored-by: Perplexity Computer <perplexity@tower.local>
Co-committed-by: Perplexity Computer <perplexity@tower.local>
2026-04-09 16:27:48 +00:00
641537eb07 Merge pull request '[EPIC] Gemini — Sovereign Infrastructure Suite Implementation' (#418) from feat/gemini-epic-398-1775648372708 into main 2026-04-08 23:38:18 +00:00
17fde3c03f feat: implement README.md
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 2m38s
2026-04-08 11:40:45 +00:00
b53fdcd034 feat: implement telemetry.py 2026-04-08 11:40:43 +00:00
1cc1d2ae86 feat: implement skill_installer.py 2026-04-08 11:40:40 +00:00
9ec0d1d80e feat: implement cross_repo_test.py 2026-04-08 11:40:35 +00:00
e9cdaf09dc feat: implement phase_tracker.py 2026-04-08 11:40:30 +00:00
e8302b4af2 feat: implement self_healing.py 2026-04-08 11:40:25 +00:00
311ecf19db feat: implement model_eval.py 2026-04-08 11:40:19 +00:00
77f258efa5 feat: implement gitea_webhook_handler.py 2026-04-08 11:40:12 +00:00
5e12451588 feat: implement adr_manager.py 2026-04-08 11:40:05 +00:00
80b6ceb118 feat: implement agent_dispatch.py 2026-04-08 11:39:57 +00:00
ffb85cc10f feat: implement fleet_llama.py 2026-04-08 11:39:52 +00:00
4179646456 feat: implement architecture_linter_v2.py 2026-04-08 11:39:46 +00:00
681fd0763f feat: implement provision_wizard.py 2026-04-08 11:39:40 +00:00
b21c2833f7 Merge pull request '[PERPLEXITY-08] Add PR checklist CI workflow and enforcement script' (#411) from perplexity/pr-checklist-ci into main 2026-04-08 11:11:02 +00:00
f84b870ce4 Merge branch 'main' into perplexity/pr-checklist-ci
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 1m18s
2026-04-08 11:10:51 +00:00
8b4df81b5b Merge pull request '[PERPLEXITY-08] Add PR checklist CI workflow and enforcement script' (#411) from perplexity/pr-checklist-ci into main 2026-04-08 11:10:23 +00:00
e96fae69cf Merge branch 'main' into perplexity/pr-checklist-ci
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 1m18s
2026-04-08 11:10:15 +00:00
cccafd845b Merge pull request '[PERPLEXITY-03] Add disambiguation header to SOUL.md (Bitcoin inscription)' (#412) from perplexity/soul-md-disambiguation into main 2026-04-08 11:10:09 +00:00
1f02166107 Merge branch 'main' into perplexity/soul-md-disambiguation 2026-04-08 11:10:00 +00:00
7dcaa05dbd Merge pull request 'refactor: wire retrieval_enforcer L1 to SovereignStore — eliminate subprocess/ONNX dependency' (#384) from perplexity/wire-enforcer-sovereign-store into main 2026-04-08 11:09:53 +00:00
18124206e1 Merge branch 'main' into perplexity/wire-enforcer-sovereign-store 2026-04-08 11:09:45 +00:00
11736e58cd docs: add disambiguation header to SOUL.md (Bitcoin inscription)
This SOUL.md is the Bitcoin inscription version, not the narrative
identity document. Adding an HTML comment header to clarify.

The canonical narrative SOUL.md lives in timmy-home.
See: #388, #378
2026-04-08 10:58:55 +00:00
14521ef664 feat: add PR checklist enforcement script
All checks were successful
PR Checklist / pr-checklist (pull_request) Successful in 2m21s
Python script that enforces PR quality standards:
- Checks for actual code changes
- Validates branch is not behind base
- Detects issue bundling in PR body
- Runs Python syntax validation
- Verifies shell script executability
- Ensures issue references exist

Closes #393
2026-04-08 10:53:44 +00:00
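One of the enforcement checks above, ensuring issue references exist, reduces to a regex scan of the PR body. An illustrative sketch, not the actual `bin/pr-checklist.py`:

```python
import re

# A PR body passes the issue-reference check if it contains at least one
# "#<number>" token (e.g. "Closes #393"). Hypothetical sketch only.
ISSUE_REF = re.compile(r"#\d+")

def has_issue_reference(pr_body: str) -> bool:
    return bool(ISSUE_REF.search(pr_body))
```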
8b17eaa537 ci: add PR checklist quality gate workflow 2026-04-08 10:51:40 +00:00
afee83c1fe Merge pull request 'docs: add MEMORY_ARCHITECTURE.md — retrieval order, storage layout, data flow' (#375) from perplexity/mempalace-architecture-doc into main 2026-04-08 10:39:51 +00:00
56d8085e88 Merge branch 'main' into perplexity/mempalace-architecture-doc 2026-04-08 10:39:35 +00:00
4e7b24617f Merge pull request 'feat: FLEET-010/011/012 — Phase 3-5 cross-agent delegation, model pipeline, lifecycle' (#365) from timmy/fleet-phase3-5 into main 2026-04-08 10:39:09 +00:00
8daa12c518 Merge branch 'main' into timmy/fleet-phase3-5 2026-04-08 10:39:01 +00:00
e369727235 Merge branch 'main' into perplexity/mempalace-architecture-doc 2026-04-08 10:38:42 +00:00
1705a7b802 Merge pull request 'feat: FLEET-010/011/012 — Phase 3-5 cross-agent delegation, model pipeline, lifecycle' (#365) from timmy/fleet-phase3-5 into main 2026-04-08 10:38:08 +00:00
e0bef949dd Merge branch 'main' into timmy/fleet-phase3-5 2026-04-08 10:37:56 +00:00
dafe8667c5 Merge branch 'main' into perplexity/mempalace-architecture-doc 2026-04-08 10:37:39 +00:00
4844ce6238 Merge pull request 'feat: Bezalel Builder Wizard — Sidecar Authority Update' (#364) from feat/bezalel-wizard-sidecar-v2 into main 2026-04-08 10:37:34 +00:00
a43510a7eb Merge branch 'main' into feat/bezalel-wizard-sidecar-v2 2026-04-08 10:37:25 +00:00
3b00891614 refactor: wire retrieval_enforcer L1 to SovereignStore — eliminate subprocess/ONNX dependency
Replaces the subprocess call to mempalace CLI binary with direct SovereignStore import. L1 palace search now uses SQLite + FTS5 + HRR vectors in-process. No ONNX, no subprocess, no API calls.

Removes: import subprocess, MEMPALACE_BIN constant
Adds: SovereignStore lazy singleton, _get_store(), SOVEREIGN_DB path

Closes #383
Depends on #380 (sovereign_store.py)
2026-04-08 10:32:52 +00:00
74867bbfa7 Merge pull request 'art: The Timmy Foundation — Visual Story (24 images + 2 videos)' (#366) from timmy/gallery-submission into main 2026-04-08 10:16:35 +00:00
d07305b89c Merge branch 'main' into perplexity/mempalace-architecture-doc 2026-04-08 10:16:13 +00:00
2812bac438 Merge branch 'main' into timmy/gallery-submission 2026-04-08 10:16:04 +00:00
5c15704c3a Merge branch 'main' into timmy/fleet-phase3-5 2026-04-08 10:15:55 +00:00
30fdbef74e Merge branch 'main' into feat/bezalel-wizard-sidecar-v2 2026-04-08 10:15:49 +00:00
9cc2cf8f8d Merge pull request 'feat: Sovereign Memory Store — zero-API durable memory (SQLite + FTS5 + HRR)' (#380) from perplexity/sovereign-memory-store into main 2026-04-08 10:14:36 +00:00
a2eff1222b Merge branch 'main' into perplexity/sovereign-memory-store 2026-04-08 10:14:24 +00:00
3f4465b646 Merge pull request '[SOVEREIGN] Orchestrator v1 — backlog reader, priority scorer, agent dispatcher' (#362) from timmy/sovereign-orchestrator-v1 into main 2026-04-08 10:14:16 +00:00
ff7ce9a022 Merge branch 'main' into perplexity/mempalace-architecture-doc 2026-04-08 10:14:10 +00:00
f04aaec4ed Merge branch 'main' into timmy/gallery-submission 2026-04-08 10:13:57 +00:00
d54a218a27 Merge branch 'main' into timmy/fleet-phase3-5 2026-04-08 10:13:44 +00:00
3cc92fde1a Merge branch 'main' into feat/bezalel-wizard-sidecar-v2 2026-04-08 10:13:34 +00:00
11a28b74bb Merge branch 'main' into timmy/sovereign-orchestrator-v1 2026-04-08 10:13:21 +00:00
perplexity
593621c5e0 feat: sovereign memory store — zero-API durable memory (SQLite + FTS5 + HRR)
Implements the missing pieces of the MemPalace epic (#367):

- sovereign_store.py: Self-contained memory store replacing the third-party
  mempalace CLI and its ONNX dependency. Uses:
  * SQLite + FTS5 for keyword search (porter stemmer, unicode61)
  * HRR phase vectors (SHA-256 deterministic, numpy optional) for semantic similarity
  * Reciprocal Rank Fusion to merge keyword and semantic rankings
  * Trust scoring with boost/decay lifecycle
  * Room-based organization matching the existing PalaceRoom model

- promotion.py (MP-4, #371): Quality-gated scratchpad-to-palace promotion.
  Four heuristic gates, no LLM call:
  1. Length gate (min 5 words, max 500)
  2. Structure gate (rejects fragments and pure code)
  3. Duplicate gate (FTS5 + Jaccard overlap detection)
  4. Staleness gate (7-day threshold for old notes)
  Includes force override, batch promotion, and audit logging.

- 21 unit tests covering HRR vectors, store operations, search,
  trust lifecycle, and all promotion gates.

Zero external dependencies. Zero API calls. Zero cloud.

Refs: #367 #370 #371
2026-04-07 22:41:37 +00:00
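Reciprocal Rank Fusion, used above to merge the FTS5 keyword ranking with the HRR semantic ranking, is a standard formula: each document scores the sum of 1/(k + rank) over the lists it appears in. A minimal sketch, using the conventional k=60 (the constant actually used in `sovereign_store.py` is not stated here):

```python
def rrf_merge(keyword_ranked, semantic_ranked, k=60):
    """Merge two ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in (keyword_ranked, semantic_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of either list accumulate more score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A document ranked well by both searches beats one ranked first by only one of them, which is the point of the fusion step.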
458dabfaed Merge pull request 'feat: MemPalace integration — skill port, retrieval enforcer, wake-up protocol (#367)' (#374) from timmy/mempalace-integration into main
Reviewed-on: #374
2026-04-07 21:45:34 +00:00
2e2a646ba8 docs: add MEMORY_ARCHITECTURE.md — retrieval order, storage layout, data flow 2026-04-07 20:16:45 +00:00
Alexander Whitestone
f8dabae8eb feat: MemPalace integration — skill port, retrieval enforcer, wake-up protocol (#367)
MP-1 (#368): Port PalaceRoom + Mempalace classes with 22 unit tests
MP-2 (#369): L0-L5 retrieval order enforcer with recall-query detection
MP-5 (#372): Wake-up protocol (300-900 token context), session scratchpad

Modules:
- mempalace.py: PalaceRoom + Mempalace dataclasses, factory constructors
- retrieval_enforcer.py: Layered memory retrieval (identity → palace → scratch → gitea → skills)
- wakeup.py: Session wake-up with caching (5min TTL)
- scratchpad.py: JSON-based session notes with palace promotion

All 65 tests pass. Pure stdlib + graceful degradation for ONNX issues (#373).
2026-04-07 13:15:07 -04:00
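The L0-L5 layered retrieval in `retrieval_enforcer.py` can be read as a fixed-order walk over memory layers that stops at the first hit. A sketch under that assumption; the real enforcer may merge layers rather than short-circuit:

```python
def retrieve(query, layers):
    """Query each memory layer in order (identity -> palace -> scratch ->
    gitea -> skills) and return the first non-empty result with its layer."""
    for name, search in layers:
        hits = search(query)
        if hits:
            return name, hits
    return None, []
```

Usage: `retrieve("deploy steps", [("identity", f0), ("palace", f1), ...])` where each `f` maps a query string to a list of hits.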
Alexander Whitestone
0a4c8f2d37 art: The Timmy Foundation visual story — 24 images, 2 videos, generated with Grok Imagine 2026-04-07 12:46:17 -04:00
Alexander Whitestone
0a13347e39 feat: FLEET-010/011/012 — Phase 3-5 fleet capabilities
FLEET-010: Cross-agent task delegation protocol
- Keyword-based heuristic assigns unassigned issues to agents
- Supports: claw-code, gemini, ezra, bezalel, timmy
- Delegation logging and status dashboard
- Auto-comments on assigned issues

FLEET-011: Local model pipeline and fallback chain
- Checks Ollama reachability and model availability
- 4-model chain: hermes4:14b -> qwen2.5:7b -> phi3:3.8b -> gemma3:1b
- Tests each model with live inference on every run
- Fallback verification: finds first responding model
- Chain configuration via ~/.local/timmy/fleet-resources/model-chain.json

FLEET-012: Agent lifecycle manager
- Full lifecycle: provision -> deploy -> monitor -> retire
- Heartbeat detection with 24h idle threshold
- Task completion/failure tracking
- Agent Fleet Status dashboard

Fixes timmy-home#563 (delegation), #564 (model pipeline), #565 (lifecycle)
2026-04-07 12:43:10 -04:00
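The FLEET-011 fallback verification, finding the first responding model, reduces to walking the chain with a probe function. In the real script the probe is a live Ollama inference; here it is a parameter so the sketch stays self-contained:

```python
# Chain order from the commit message; the probe callable is an assumption
# standing in for a live inference test against Ollama.
MODEL_CHAIN = ["hermes4:14b", "qwen2.5:7b", "phi3:3.8b", "gemma3:1b"]

def first_responding(chain, probe):
    """Return the first model whose probe succeeds, or None if the
    whole chain is down."""
    for model in chain:
        if probe(model):
            return model
    return None
```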
dc75be18e4 feat: add Bezalel Builder Wizard sidecar configuration 2026-04-07 16:39:42 +00:00
0c950f991c Merge pull request '[ORCHESTRATOR-4] Evaluate CrewAI for Phase 2 integration' (#361) from ezra/issue-358 into main 2026-04-07 16:35:40 +00:00
Alexander Whitestone
7399c83024 fix: null guard on assignees in orchestrator dispatch 2026-04-07 12:34:02 -04:00
Alexander Whitestone
cf213bffd1 [SOVEREIGN] Add Orchestrator v1 — backlog reader, priority scorer, agent dispatcher
Resolves #355 #356

Components:
- orchestrator.py: Full sovereign orchestrator with 6 subsystems
  1. Backlog reader (fetches from timmy-config, the-nexus, timmy-home)
  2. Priority scorer (0-100 based on severity, age, assignment state)
  3. Agent roster (groq/ezra/bezalel with health checks)
  4. Dispatcher (matches issues to agents by type/strength)
  5. Consolidated report (terminal + Telegram)
  6. Main loop (--once, --daemon, --dry-run)
- orchestrate.sh: Shell wrapper with env setup

Dry-run tested: 348 issues scanned, 3 agents detected UP.
stdlib only, no pip dependencies.
2026-04-07 12:31:14 -04:00
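The 0-100 priority scorer can be sketched as a weighted sum over severity, age, and assignment state. The weights below are illustrative, not the ones in `orchestrator.py`:

```python
from datetime import datetime, timezone

def score_issue(severity, created_at, assigned):
    """Score an issue 0-100. Illustrative weights: severity (0-5)
    contributes up to 50, age up to 30 (capped at 30 days), and
    unassigned issues get a 20-point boost."""
    age_days = (datetime.now(timezone.utc) - created_at).days
    score = severity * 10            # severity 0-5 -> 0-50
    score += min(age_days, 30)       # age -> 0-30
    score += 0 if assigned else 20   # unassigned boost
    return min(score, 100)
```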
c1c3aaa681 Merge pull request 'feat: genchi-genbutsu — verify world state, not log vibes (#348)' (#360) from ezra/issue-348 into main 2026-04-07 16:23:35 +00:00
d023512858 Merge pull request 'feat: FLEET-003 - Fleet capacity inventory with resource baselines' (#353) from timmy/fleet-capacity-inventory into main 2026-04-07 16:23:22 +00:00
e5e01e36c9 Merge pull request '[KAIZEN] Automated retrospective after every burn cycle (fixes #349)' (#352) from ezra/issue-349 into main 2026-04-07 16:23:17 +00:00
ezra
e5055d269b feat: genchi-genbutsu — verify world state, not log vibes (#348)
Implement 現地現物 (Genchi Genbutsu) post-completion verification:

- Add bin/genchi-genbutsu.sh performing 5 world-state checks:
  1. Branch exists on remote
  2. PR exists
  3. PR has real file changes (> 0)
  4. PR is mergeable
  5. Issue has a completion comment from the agent

- Wire verification into all agent loops:
  - bin/claude-loop.sh: call genchi-genbutsu before merge/close
  - bin/gemini-loop.sh: delegate existing inline checks to genchi-genbutsu
  - bin/agent-loop.sh: resurrect generic agent loop with genchi-genbutsu wired in

- Update metrics JSONL to include 'verified' field for all loops

- Update burn monitor (tasks.py velocity_tracking):
  - Report verified_completion count alongside raw completions
  - Dashboard shows verified trend history

- Update morning report (tasks.py good_morning_report):
  - Count only verified completions from the last 24h
  - Surface verification failures in the report body

Fixes #348
Refs #345
2026-04-07 16:12:05 +00:00
Alexander Whitestone
277d21aef6 feat: FLEET-007 — Auto-restart agent (self-healing processes)
Daemon that monitors key services and restarts them automatically:
- Local: hermes-gateway, ollama, codeclaw-heartbeat
- Ezra: gitea, nginx, hermes-agent
- Allegro: hermes-agent
- Bezalel: hermes-agent, evennia
- Max 3 restart attempts per service per cycle (prevents loops)
- 1-hour cooldown after max retries with Telegram escalation
- Restart log at ~/.local/timmy/fleet-health/restarts.log
- Modes: one-shot check (default), --status for restart history, --daemon for continuous

Fixes timmy-home#560
2026-04-07 12:04:33 -04:00
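The restart policy above (max 3 attempts, then a 1-hour cooldown with escalation) can be sketched as a per-service gate; constants come from the commit message, the function shape is hypothetical:

```python
import time

MAX_ATTEMPTS = 3      # per service per cycle, prevents restart loops
COOLDOWN_SECS = 3600  # 1-hour cooldown after max retries

def should_restart(attempts, last_failure_ts, now=None):
    """Allow a restart while under the attempt cap; once the cap is hit,
    refuse until the cooldown elapses (escalation to Telegram happens
    elsewhere and is not shown here)."""
    now = time.time() if now is None else now
    if attempts < MAX_ATTEMPTS:
        return True
    return (now - last_failure_ts) >= COOLDOWN_SECS
```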
Alexander Whitestone
228e46a330 feat: FLEET-004/005 — Milestone messages and resource tracker
FLEET-004: 22 milestone messages across 6 phases + 11 Fibonacci uptime milestones.
FLEET-005: Resource tracking system — Capacity/Uptime/Innovation tension model.
  - Tracks capacity spending and regeneration (2/hr baseline)
  - Innovation generates only when utilization < 70% (5/hr scaled)
  - Fibonacci uptime milestone detection (95% through 99.5%)
  - Phase gate checks (P2: 95% uptime, P3: 95% + 100 innovation, P5: 95% + 500)
  - CLI: status, regen commands

Fixes timmy-home#557 (FLEET-004), #558 (FLEET-005)
2026-04-07 12:03:45 -04:00
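The Capacity/Uptime/Innovation tension model above can be made concrete with one hourly regeneration step. The constants come from the commit message (2/hr capacity regen, innovation at 5/hr scaled, only below 70% utilization); the linear scaling curve is an assumption:

```python
def tick_hour(capacity_used, capacity_max, innovation):
    """One hour of regeneration: capacity recovers at 2/hr, and innovation
    accrues at 5/hr scaled by headroom, but only while utilization
    stays under 70%. The scaling formula is an assumption."""
    capacity_used = max(0, capacity_used - 2)   # 2/hr regen
    utilization = capacity_used / capacity_max
    if utilization < 0.70:
        innovation += 5 * (1 - utilization)     # assumed linear scaling
    return capacity_used, innovation
```

An idle fleet (low utilization) therefore generates innovation fastest, which is the tension the model is built to express.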
Ezra
2e64b160b5 [KAIZEN] Harden retro scheduling, chunking, and tests (#349)
- Add Kaizen Retro to cron/jobs.json with explicit local model/provider
- Add Telegram message chunking for reports approaching the 4096-char limit
- Fix classify_issue_type false positives on short substrings ('ci' matching inside 'cleanup')
- Add 28 unit tests covering classification, max-attempts detection,
  suggestion generation, report formatting, and Telegram chunking
2026-04-07 15:58:58 +00:00
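The Telegram chunking mentioned above works around the Bot API's 4096-character message limit. A minimal sketch that prefers to break at a newline so report lines are not cut mid-way; the actual splitting strategy in the retro script may differ:

```python
TELEGRAM_LIMIT = 4096  # Telegram Bot API max message length

def chunk_message(text, limit=TELEGRAM_LIMIT):
    """Split a long report into Telegram-sized chunks, breaking at the
    last newline before the limit when one exists."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:          # no newline to break at: hard split
            cut = limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```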
Alexander Whitestone
67c2927c1a feat: FLEET-003 — Capacity inventory with resource baselines
Full resource audit of all 4 machines (3 VPS + 1 Mac) with:
- vCPU, RAM, disk, swap per machine
- Key processes sorted by resource usage
- Capacity utilization: ~15-20%, Innovation GENERATING
- Uptime baseline: Ezra/Allegro/Bezalel 100%, Gitea 95.8%
- Fibonacci uptime milestones (5 of 6 REACHED)
- Risk assessment (Ezra disk 72%, Bezalel 2GB RAM, Ezra CPU 269%)
- Recommendations across all phases

Fixes timmy-home#556 (FLEET-003)
2026-04-07 11:58:16 -04:00
Ezra
f18955ea90 [KAIZEN] Implement automated burn-cycle retrospective (fixes #349)
- Add bin/kaizen-retro.sh entry point and scripts/kaizen_retro.py
- Analyze closed issues, merged PRs, and stale/max-attempts issues
- Report success rates by agent, repo, and issue type
- Generate one concrete improvement suggestion per cycle
- Post retro to Telegram and comment on the latest morning report issue
- Wire into Huey as kaizen_retro() task at 07:15 daily
- Extend gitea_client.py with since param for list_issues and
  created_at/updated_at fields on PullRequest
2026-04-07 15:57:21 +00:00
107 changed files with 9290 additions and 84 deletions


@@ -0,0 +1,29 @@
# pr-checklist.yml — Automated PR quality gate
# Refs: #393 (PERPLEXITY-08), Epic #385
#
# Enforces the review checklist that agents skip when left to self-approve.
# Runs on every pull_request. Fails fast so bad PRs never reach a reviewer.
name: PR Checklist
on:
  pull_request:
    branches: [main, master]
jobs:
  pr-checklist:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - name: Run PR checklist
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: python3 bin/pr-checklist.py


@@ -0,0 +1,134 @@
# validate-config.yaml
# Validates all config files, scripts, and playbooks on every PR.
# Addresses #289: repo-native validation for timmy-config changes.
#
# Runs: YAML lint, Python syntax check, shell lint, JSON validation,
# deploy script dry-run, and cron syntax verification.
name: Validate Config
on:
  pull_request:
    branches: [main]
  push:
    branches: [main]
jobs:
  yaml-lint:
    name: YAML Lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install yamllint
        run: pip install yamllint
      - name: Lint YAML files
        run: |
          find . -name '*.yaml' -o -name '*.yml' | \
            grep -v '.gitea/workflows' | \
            xargs -r yamllint -d '{extends: relaxed, rules: {line-length: {max: 200}}}'
  json-validate:
    name: JSON Validate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate JSON files
        run: |
          find . -name '*.json' -print0 | while IFS= read -r -d '' f; do
            echo "Validating: $f"
            python3 -m json.tool "$f" > /dev/null || exit 1
          done
  python-check:
    name: Python Syntax & Import Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          # py_compile is part of the stdlib, so only flake8 needs installing
          pip install flake8
      - name: Compile-check all Python files
        run: |
          find . -name '*.py' -print0 | while IFS= read -r -d '' f; do
            echo "Checking: $f"
            python3 -m py_compile "$f" || exit 1
          done
      - name: Flake8 critical errors only
        run: |
          flake8 --select=E9,F63,F7,F82 --show-source --statistics \
            scripts/ allegro/ cron/ || true
  shell-lint:
    name: Shell Script Lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install shellcheck
        run: sudo apt-get install -y shellcheck
      - name: Lint shell scripts
        run: |
          find . -name '*.sh' -print0 | xargs -0 -r shellcheck --severity=error || true
  cron-validate:
    name: Cron Syntax Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate cron entries
        run: |
          if [ -d cron ]; then
            find cron -name '*.cron' -o -name '*.crontab' | while read -r f; do
              echo "Checking cron: $f"
              # Basic syntax validation
              while IFS= read -r line; do
                [[ "$line" =~ ^#.*$ ]] && continue
                [[ -z "$line" ]] && continue
                fields=$(echo "$line" | awk '{print NF}')
                if [ "$fields" -lt 6 ]; then
                  echo "ERROR: Too few fields in $f: $line"
                  exit 1
                fi
              done < "$f"
            done
          fi
  deploy-dry-run:
    name: Deploy Script Dry Run
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Syntax-check deploy.sh
        run: |
          if [ -f deploy.sh ]; then
            bash -n deploy.sh
            echo "deploy.sh syntax OK"
          fi
  playbook-schema:
    name: Playbook Schema Validation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate playbook structure
        run: |
          python3 -c "
          import yaml, sys, glob
          required_keys = {'name', 'description'}
          for f in glob.glob('playbooks/*.yaml'):
              with open(f) as fh:
                  try:
                      data = yaml.safe_load(fh)
                      if not isinstance(data, dict):
                          print(f'ERROR: {f} is not a YAML mapping')
                          sys.exit(1)
                      missing = required_keys - set(data.keys())
                      if missing:
                          print(f'WARNING: {f} missing keys: {missing}')
                      print(f'OK: {f}')
                  except yaml.YAMLError as e:
                      print(f'ERROR: {f}: {e}')
                      sys.exit(1)
          "

.gitignore

@@ -1,9 +1,8 @@
# Secrets
*.token
*.key
*.secret
# Local state
*.pyc
*.pyo
*.egg-info/
dist/
build/
*.db
*.db-wal
*.db-shm

SOUL.md

@@ -1,3 +1,13 @@
<!--
NOTE: This is the BITCOIN INSCRIPTION version of SOUL.md.
It is the immutable on-chain conscience. Do not modify this content.
The NARRATIVE identity document (for onboarding, Audio Overviews,
and system prompts) lives in timmy-home/SOUL.md.
See: #388, #378 for the divergence audit.
-->
# SOUL.md
## Inscription 1 — The Immutable Conscience


@@ -0,0 +1,47 @@
# =============================================================================
# BANNED PROVIDERS — The Timmy Foundation
# =============================================================================
# "Anthropic is not only fired, but banned. I don't want these errors
# cropping up." — Alexander, 2026-04-09
#
# This is a HARD BAN. Not deprecated. Not fallback. BANNED.
# Enforcement: pre-commit hook, linter, Ansible validation, CI tests.
# =============================================================================
banned_providers:
  - name: anthropic
    reason: "Permanently banned. SDK access gated despite active quota. Fleet was bricked because golden state pointed to Anthropic Sonnet."
    banned_date: "2026-04-09"
    enforcement: strict  # Ansible playbook FAILS if detected
    models:
      - "claude-sonnet-*"
      - "claude-opus-*"
      - "claude-haiku-*"
      - "claude-*"
    endpoints:
      - "api.anthropic.com"
      - "anthropic/*"  # OpenRouter pattern
    api_keys:
      - "ANTHROPIC_API_KEY"
      - "CLAUDE_API_KEY"
# Golden state alternative:
approved_providers:
  - name: kimi-coding
    model: kimi-k2.5
    role: primary
  - name: openrouter
    model: google/gemini-2.5-pro
    role: fallback
  - name: ollama
    model: "gemma4:latest"
    role: terminal_fallback
# Future evaluation:
evaluation_candidates:
  - name: mimo-v2-pro
    status: pending
    notes: "Free via Nous Portal for ~2 weeks from 2026-04-07. Add after fallback chain is fixed."
  - name: hermes-4
    status: available
    notes: "Free on Nous Portal. 36B and 70B variants. Home team model."

ansible/README.md

@@ -0,0 +1,95 @@
# Ansible IaC — The Timmy Foundation Fleet
> One canonical Ansible playbook defines: deadman switch, cron schedule,
> golden state rollback, agent startup sequence.
> — KT Final Session 2026-04-08, Priority TWO
## Purpose
This directory contains the **single source of truth** for fleet infrastructure.
No more ad-hoc recovery implementations. No more overlapping deadman switches.
No more agents mutating their own configs into oblivion.
**Everything** goes through Ansible. If it's not in a playbook, it doesn't exist.
## Architecture
```
┌─────────────────────────────────────────────────┐
│ Gitea (Source of Truth) │
│ timmy-config/ansible/ │
│ ├── inventory/hosts.yml (fleet machines) │
│ ├── playbooks/site.yml (master playbook) │
│ ├── roles/ (reusable roles) │
│ └── group_vars/wizards.yml (golden state) │
└──────────────────┬──────────────────────────────┘
│ PR merge triggers webhook
┌─────────────────────────────────────────────────┐
│ Gitea Webhook Handler │
│ scripts/deploy_on_webhook.sh │
│ → ansible-pull on each target machine │
└──────────────────┬──────────────────────────────┘
│ ansible-pull
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Timmy │ │ Allegro │ │ Bezalel │ │ Ezra │
│ (Mac) │ │ (VPS) │ │ (VPS) │ │ (VPS) │
│ │ │ │ │ │ │ │
│ deadman │ │ deadman │ │ deadman │ │ deadman │
│ cron │ │ cron │ │ cron │ │ cron │
│ golden │ │ golden │ │ golden │ │ golden │
│ req_log │ │ req_log │ │ req_log │ │ req_log │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
```
## Quick Start
```bash
# Deploy everything to all machines
ansible-playbook -i inventory/hosts.yml playbooks/site.yml
# Deploy only golden state config
ansible-playbook -i inventory/hosts.yml playbooks/golden_state.yml
# Deploy only to a specific wizard
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit bezalel
# Dry run (check mode)
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --check --diff
```
## Golden State Provider Chain
All wizard configs converge on this provider chain. **Anthropic is BANNED.**
| Priority | Provider | Model | Endpoint |
| -------- | -------------------- | ---------------- | --------------------------------- |
| 1 | Kimi | kimi-k2.5 | https://api.kimi.com/coding/v1 |
| 2 | Gemini (OpenRouter) | gemini-2.5-pro | https://openrouter.ai/api/v1 |
| 3 | Ollama (local) | gemma4:latest | http://localhost:11434/v1 |
## Roles
| Role | Purpose |
| ---------------- | ------------------------------------------------------------ |
| `wizard_base` | Common wizard setup: directories, thin config, git pull |
| `deadman_switch` | Health check → snapshot good config → rollback on death |
| `golden_state` | Deploy and enforce golden state provider chain |
| `request_log` | SQLite telemetry table for every inference call |
| `cron_manager` | Source-controlled cron jobs — no manual crontab edits |
## Rules
1. **No manual changes.** If it's not in a playbook, it will be overwritten.
2. **No Anthropic.** Banned. Enforcement is automated. See `BANNED_PROVIDERS.yml`.
3. **Idempotent.** Every playbook can run 100 times with the same result.
4. **PR required.** Config changes go through Gitea PR review, then deploy.
5. **One identity per machine.** No duplicate agents. Fleet audit enforces this.
## Related Issues
- timmy-config #442: [P2] Ansible IaC Canonical Playbook
- timmy-config #444: Wire Deadman Switch ACTION
- timmy-config #443: Thin Config Pattern
- timmy-config #446: request_log Telemetry Table

ansible/ansible.cfg

@@ -0,0 +1,21 @@
[defaults]
inventory = inventory/hosts.yml
roles_path = roles
host_key_checking = False
retry_files_enabled = False
stdout_callback = yaml
forks = 10
timeout = 30
# Logging
log_path = /var/log/ansible/timmy-fleet.log
[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no


@@ -0,0 +1,74 @@
# =============================================================================
# Wizard Group Variables — Golden State Configuration
# =============================================================================
# These variables are applied to ALL wizards in the fleet.
# This IS the golden state. If a wizard deviates, Ansible corrects it.
# =============================================================================
# --- Deadman Switch ---
deadman_enabled: true
deadman_check_interval: 300 # 5 minutes between health checks
deadman_snapshot_dir: "~/.local/timmy/snapshots"
deadman_max_snapshots: 10 # Rolling window of good configs
deadman_restart_cooldown: 60 # Seconds to wait before restart after failure
deadman_max_restart_attempts: 3
deadman_escalation_channel: telegram # Alert Alexander after max attempts
# --- Thin Config ---
thin_config_path: "~/.timmy/thin_config.yml"
thin_config_mode: "0444" # Read-only — agents CANNOT modify
upstream_repo: "https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"
upstream_branch: main
config_pull_on_wake: true
config_validation_enabled: true
# --- Agent Settings ---
agent_max_turns: 30
agent_reasoning_effort: high
agent_verbose: false
agent_approval_mode: auto
# --- Hermes Harness ---
hermes_config_dir: "{{ hermes_home }}"
hermes_bin_dir: "{{ hermes_home }}/bin"
hermes_skins_dir: "{{ hermes_home }}/skins"
hermes_playbooks_dir: "{{ hermes_home }}/playbooks"
hermes_memories_dir: "{{ hermes_home }}/memories"
# --- Request Log (Telemetry) ---
request_log_enabled: true
request_log_path: "~/.local/timmy/request_log.db"
request_log_rotation_days: 30 # Archive logs older than 30 days
request_log_sync_to_gitea: false # Future: push telemetry summaries to Gitea
# --- Cron Schedule ---
# All cron jobs are managed here. No manual crontab edits.
cron_jobs:
  - name: "Deadman health check"
    job: "cd {{ wizard_home }}/workspace/timmy-config && python3 fleet/health_check.py"
    minute: "*/5"
    hour: "*"
    enabled: "{{ deadman_enabled }}"
  - name: "Muda audit"
    job: "cd {{ wizard_home }}/workspace/timmy-config && bash fleet/muda-audit.sh >> /tmp/muda-audit.log 2>&1"
    minute: "0"
    hour: "21"
    weekday: "0"
    enabled: true
  - name: "Config pull from upstream"
    job: "cd {{ wizard_home }}/workspace/timmy-config && git pull --ff-only origin main"
    minute: "*/15"
    hour: "*"
    enabled: "{{ config_pull_on_wake }}"
  - name: "Request log rotation"
    job: "python3 -c \"import sqlite3,datetime; db=sqlite3.connect('{{ request_log_path }}'); db.execute('DELETE FROM request_log WHERE timestamp < datetime(\\\"now\\\", \\\"-{{ request_log_rotation_days }} days\\\")'); db.commit()\""
    minute: "0"
    hour: "3"
    enabled: "{{ request_log_enabled }}"
# --- Provider Enforcement ---
# These are validated on every Ansible run. Any Anthropic reference = failure.
provider_ban_enforcement: strict # strict = fail playbook, warn = log only

ansible/inventory/hosts.yml

@@ -0,0 +1,119 @@
# =============================================================================
# Fleet Inventory — The Timmy Foundation
# =============================================================================
# Source of truth for all machines in the fleet.
# Update this file when machines are added/removed.
# All changes go through PR review.
# =============================================================================
all:
children:
wizards:
hosts:
timmy:
ansible_host: localhost
ansible_connection: local
wizard_name: Timmy
wizard_role: "Primary wizard — soul of the fleet"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8645
wizard_home: "{{ ansible_env.HOME }}/wizards/timmy"
hermes_home: "{{ ansible_env.HOME }}/.hermes"
machine_type: mac
# Timmy runs on Alexander's M3 Max
ollama_available: true
allegro:
ansible_host: 167.99.126.228
ansible_user: root
wizard_name: Allegro
wizard_role: "Kimi-backed third wizard house — tight coding tasks"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8645
wizard_home: /root/wizards/allegro
hermes_home: /root/.hermes
machine_type: vps
ollama_available: false
bezalel:
ansible_host: 159.203.146.185
ansible_user: root
wizard_name: Bezalel
wizard_role: "Forge-and-testbed wizard — infrastructure, deployment, hardening"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8656
wizard_home: /root/wizards/bezalel
hermes_home: /root/.hermes
machine_type: vps
ollama_available: false
# NOTE: The awake Bezalel may be the duplicate.
# Fleet audit (the-nexus #1144) will resolve identity.
ezra:
ansible_host: 143.198.27.163
ansible_user: root
wizard_name: Ezra
wizard_role: "Infrastructure wizard — Gitea, nginx, hosting"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8645
wizard_home: /root/wizards/ezra
hermes_home: /root/.hermes
machine_type: vps
ollama_available: false
# NOTE: Currently DOWN — Telegram key revoked, awaiting propagation.
# Infrastructure hosts (not wizards, but managed by Ansible)
infrastructure:
hosts:
forge:
ansible_host: 143.198.27.163
ansible_user: root
# Gitea runs on the same box as Ezra
gitea_url: https://forge.alexanderwhitestone.com
gitea_org: Timmy_Foundation
vars:
# Global variables applied to all hosts
gitea_repo_url: "https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"
gitea_branch: main
config_base_path: "{{ gitea_repo_url }}"
timmy_log_dir: "~/.local/timmy/fleet-health"
request_log_path: "~/.local/timmy/request_log.db"
# Golden state provider chain — Anthropic is BANNED
golden_state_providers:
- name: kimi-coding
model: kimi-k2.5
base_url: "https://api.kimi.com/coding/v1"
timeout: 120
reason: "Primary — Kimi K2.5 (best value, least friction)"
- name: openrouter
model: google/gemini-2.5-pro
base_url: "https://openrouter.ai/api/v1"
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: "Fallback — Gemini 2.5 Pro via OpenRouter"
- name: ollama
model: "gemma4:latest"
base_url: "http://localhost:11434/v1"
timeout: 180
reason: "Terminal fallback — local Ollama (sovereign, no API needed)"
# Banned providers — hard enforcement
banned_providers:
- anthropic
- claude
banned_models_patterns:
- "claude-*"
- "anthropic/*"
- "*sonnet*"
- "*opus*"
- "*haiku*"
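These glob patterns are the shape Python's `fnmatch` consumes, which is how a validator (such as `validate_config.py` in this change) might enforce the ban. A minimal sketch under that assumption; `is_banned` is an illustrative helper, not code from this repo:

```python
import fnmatch

BANNED_PROVIDERS = {"anthropic", "claude"}
BANNED_MODEL_PATTERNS = ["claude-*", "anthropic/*", "*sonnet*", "*opus*", "*haiku*"]

def is_banned(provider: str, model: str) -> bool:
    """Return True if the provider name or model string hits the ban list."""
    if provider.lower() in BANNED_PROVIDERS:
        return True
    return any(fnmatch.fnmatch(model.lower(), pat) for pat in BANNED_MODEL_PATTERNS)

print(is_banned("kimi-coding", "kimi-k2.5"))       # False — golden state primary
print(is_banned("openrouter", "claude-sonnet-4"))  # True — matches "claude-*" and "*sonnet*"
```

Matching is case-insensitive here because both sides are lowercased before comparison, so `Claude-Opus` cannot slip past the patterns.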


@@ -0,0 +1,98 @@
---
# =============================================================================
# agent_startup.yml — Resurrect Wizards from Checked-in Configs
# =============================================================================
# Brings wizards back online using golden state configs.
# Order: pull config → validate → start agent → verify with request_log
# =============================================================================
- name: "Agent Startup Sequence"
hosts: wizards
become: true
serial: 1 # One wizard at a time to avoid cascading issues
tasks:
- name: "Pull latest config from upstream"
git:
repo: "{{ upstream_repo }}"
dest: "{{ wizard_home }}/workspace/timmy-config"
version: "{{ upstream_branch }}"
force: true
tags: [pull]
- name: "Deploy golden state config"
include_role:
name: golden_state
tags: [config]
- name: "Validate config — no banned providers"
shell: |
python3 -c "
import yaml, sys
with open('{{ wizard_home }}/config.yaml') as f:
cfg = yaml.safe_load(f)
banned = {{ banned_providers }}
for p in cfg.get('fallback_providers', []):
if p.get('provider', '') in banned:
print(f'BANNED: {p[\"provider\"]}', file=sys.stderr)
sys.exit(1)
model = cfg.get('model', {}).get('provider', '')
if model in banned:
print(f'BANNED default provider: {model}', file=sys.stderr)
sys.exit(1)
print('Config validated — no banned providers.')
"
register: config_valid
tags: [validate]
- name: "Ensure hermes-agent service is running"
systemd:
name: "hermes-{{ wizard_name | lower }}"
state: started
enabled: true
when: machine_type == 'vps'
tags: [start]
ignore_errors: true # Service may not exist yet on all machines
- name: "Start hermes agent (Mac — launchctl)"
shell: |
launchctl kickstart -k "ai.hermes.{{ wizard_name | lower }}" 2>/dev/null || \
(cd {{ wizard_home }} && hermes agent start --daemon 2>&1 | tail -5)
when: machine_type == 'mac'
tags: [start]
ignore_errors: true
- name: "Wait for agent to come online"
wait_for:
host: 127.0.0.1
port: "{{ api_port }}"
timeout: 60
state: started
tags: [verify]
ignore_errors: true
- name: "Verify agent is alive — check request_log for activity"
shell: |
sleep 10
python3 -c "
import sqlite3, sys
db = sqlite3.connect('{{ request_log_path }}')
cursor = db.execute('''
SELECT COUNT(*) FROM request_log
WHERE agent_name = '{{ wizard_name }}'
AND timestamp > datetime('now', '-5 minutes')
''')
count = cursor.fetchone()[0]
if count > 0:
print(f'{{ wizard_name }} is alive — {count} recent inference calls logged.')
else:
print(f'WARNING: {{ wizard_name }} started but no telemetry yet.')
"
register: agent_status
tags: [verify]
ignore_errors: true
- name: "Report startup status"
debug:
msg: "{{ wizard_name }}: {{ agent_status.stdout | default('startup attempted') }}"
tags: [always]


@@ -0,0 +1,15 @@
---
# =============================================================================
# cron_schedule.yml — Source-Controlled Cron Jobs
# =============================================================================
# All cron jobs are defined in group_vars/wizards.yml.
# This playbook deploys them. No manual crontab edits allowed.
# =============================================================================
- name: "Deploy Cron Schedule"
hosts: wizards
become: true
roles:
- role: cron_manager
tags: [cron, schedule]


@@ -0,0 +1,17 @@
---
# =============================================================================
# deadman_switch.yml — Deploy Deadman Switch to All Wizards
# =============================================================================
# The deadman watch already fires and detects dead agents.
# This playbook wires the ACTION:
# - On healthy check: snapshot current config as "last known good"
# - On failed check: rollback config to snapshot, restart agent
# =============================================================================
- name: "Deploy Deadman Switch ACTION"
hosts: wizards
become: true
roles:
- role: deadman_switch
tags: [deadman, recovery]


@@ -0,0 +1,30 @@
---
# =============================================================================
# golden_state.yml — Deploy Golden State Config to All Wizards
# =============================================================================
# Enforces the golden state provider chain across the fleet.
# Removes any Anthropic references. Deploys the approved provider chain.
# =============================================================================
- name: "Deploy Golden State Configuration"
hosts: wizards
become: true
roles:
- role: golden_state
tags: [golden, config]
post_tasks:
- name: "Verify golden state — no banned providers"
shell: |
grep -rci 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' \
{{ hermes_home }}/config.yaml \
{{ wizard_home }}/config.yaml 2>/dev/null || echo "0"
register: banned_count
changed_when: false
- name: "Report golden state status"
debug:
msg: >
{{ wizard_name }} golden state: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}.
Banned provider references: {{ banned_count.stdout | trim }}.
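The enforced chain (Kimi → Gemini via OpenRouter → Ollama) implies simple try-in-order semantics at runtime. A minimal Python sketch of that behavior, where `send` and the provider dicts are stand-ins for the harness's real client, not its actual API:

```python
def call_with_fallback(providers, prompt, send):
    """Walk the golden-state provider chain in order; the first provider that answers wins."""
    errors = []
    for p in providers:
        try:
            return p["name"], send(p, prompt)
        except Exception as exc:  # timeout, auth failure, network error, ...
            errors.append(f"{p['name']}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

If the primary raises, the next provider is tried with the same prompt; only when the terminal Ollama fallback also fails does the call surface an error, carrying every provider's failure reason.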


@@ -0,0 +1,15 @@
---
# =============================================================================
# request_log.yml — Deploy Telemetry Table
# =============================================================================
# Creates the request_log SQLite table on all machines.
# Every inference call writes a row. No exceptions. No summarizing.
# =============================================================================
- name: "Deploy Request Log Telemetry"
hosts: wizards
become: true
roles:
- role: request_log
tags: [telemetry, logging]


@@ -0,0 +1,72 @@
---
# =============================================================================
# site.yml — Master Playbook for the Timmy Foundation Fleet
# =============================================================================
# This is the ONE playbook that defines the entire fleet state.
# Run this and every machine converges to golden state.
#
# Usage:
# ansible-playbook -i inventory/hosts.yml playbooks/site.yml
# ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit bezalel
# ansible-playbook -i inventory/hosts.yml playbooks/site.yml --check --diff
# =============================================================================
- name: "Timmy Foundation Fleet — Full Convergence"
hosts: wizards
become: true
pre_tasks:
- name: "Validate no banned providers in golden state"
assert:
that:
- "item.name not in banned_providers"
fail_msg: "BANNED PROVIDER DETECTED: {{ item.name }} — Anthropic is permanently banned."
quiet: true
loop: "{{ golden_state_providers }}"
tags: [always]
- name: "Display target wizard"
debug:
msg: "Deploying to {{ wizard_name }} ({{ wizard_role }}) on {{ ansible_host }}"
tags: [always]
roles:
- role: wizard_base
tags: [base, setup]
- role: golden_state
tags: [golden, config]
- role: deadman_switch
tags: [deadman, recovery]
- role: request_log
tags: [telemetry, logging]
- role: cron_manager
tags: [cron, schedule]
post_tasks:
- name: "Final validation — scan for banned providers"
shell: |
grep -ri 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' \
{{ hermes_home }}/config.yaml \
{{ wizard_home }}/config.yaml \
{{ thin_config_path }} 2>/dev/null || true
register: banned_scan
changed_when: false
tags: [validation]
- name: "FAIL if banned providers found in deployed config"
fail:
msg: |
BANNED PROVIDER DETECTED IN DEPLOYED CONFIG:
{{ banned_scan.stdout }}
Anthropic is permanently banned. Fix the config and re-deploy.
when: banned_scan.stdout | length > 0
tags: [validation]
- name: "Deployment complete"
debug:
msg: "{{ wizard_name }} converged to golden state. Provider chain: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}"
tags: [always]


@@ -0,0 +1,55 @@
---
# =============================================================================
# cron_manager/tasks — Source-Controlled Cron Jobs
# =============================================================================
# All cron jobs are defined in group_vars/wizards.yml.
# No manual crontab edits. This is the only way to manage cron.
# =============================================================================
- name: "Deploy managed cron jobs"
cron:
name: "{{ item.name }}"
job: "{{ item.job }}"
minute: "{{ item.minute | default('*') }}"
hour: "{{ item.hour | default('*') }}"
day: "{{ item.day | default('*') }}"
month: "{{ item.month | default('*') }}"
weekday: "{{ item.weekday | default('*') }}"
state: "{{ 'present' if item.enabled | default(true) else 'absent' }}"
user: "{{ ansible_user | default('root') }}"
loop: "{{ cron_jobs }}"
when: cron_jobs is defined
- name: "Deploy deadman switch cron (fallback if systemd timer unavailable)"
cron:
name: "Deadman switch — {{ wizard_name }}"
job: "{{ wizard_home }}/deadman_action.sh >> {{ timmy_log_dir }}/deadman-{{ wizard_name }}.log 2>&1"
minute: "*/5"
hour: "*"
state: present
user: "{{ ansible_user | default('root') }}"
when: deadman_enabled and machine_type != 'vps'
# VPS machines use systemd timers instead
- name: "Remove legacy cron jobs (cleanup)"
cron:
name: "{{ item }}"
state: absent
user: "{{ ansible_user | default('root') }}"
loop:
- "legacy-deadman-watch"
- "old-health-check"
- "backup-deadman"
ignore_errors: true
- name: "List active cron jobs"
shell: "crontab -l 2>/dev/null | grep -v '^#' | grep -v '^$' || echo 'No cron jobs found.'"
register: active_crons
changed_when: false
- name: "Report cron status"
debug:
msg: |
{{ wizard_name }} cron jobs deployed.
Active:
{{ active_crons.stdout }}


@@ -0,0 +1,70 @@
---
# =============================================================================
# deadman_switch/tasks — Wire the Deadman Switch ACTION
# =============================================================================
# The watch fires. This makes it DO something:
# - On healthy check: snapshot current config as "last known good"
# - On failed check: rollback to last known good, restart agent
# =============================================================================
- name: "Create snapshot directory"
file:
path: "{{ deadman_snapshot_dir }}"
state: directory
mode: "0755"
- name: "Deploy deadman switch script"
template:
src: deadman_action.sh.j2
dest: "{{ wizard_home }}/deadman_action.sh"
mode: "0755"
- name: "Deploy deadman systemd service"
template:
src: deadman_switch.service.j2
dest: "/etc/systemd/system/deadman-{{ wizard_name | lower }}.service"
mode: "0644"
when: machine_type == 'vps'
notify: "Enable deadman service"
- name: "Deploy deadman systemd timer"
template:
src: deadman_switch.timer.j2
dest: "/etc/systemd/system/deadman-{{ wizard_name | lower }}.timer"
mode: "0644"
when: machine_type == 'vps'
notify: "Enable deadman timer"
- name: "Deploy deadman launchd plist (Mac)"
template:
src: deadman_switch.plist.j2
dest: "{{ ansible_env.HOME }}/Library/LaunchAgents/com.timmy.deadman.{{ wizard_name | lower }}.plist"
mode: "0644"
when: machine_type == 'mac'
notify: "Load deadman plist"
- name: "Take initial config snapshot"
copy:
src: "{{ wizard_home }}/config.yaml"
dest: "{{ deadman_snapshot_dir }}/config.yaml.known_good"
remote_src: true
mode: "0444"
ignore_errors: true
handlers:
- name: "Enable deadman service"
systemd:
name: "deadman-{{ wizard_name | lower }}.service"
daemon_reload: true
enabled: true
- name: "Enable deadman timer"
systemd:
name: "deadman-{{ wizard_name | lower }}.timer"
daemon_reload: true
enabled: true
state: started
- name: "Load deadman plist"
shell: "launchctl load {{ ansible_env.HOME }}/Library/LaunchAgents/com.timmy.deadman.{{ wizard_name | lower }}.plist"
ignore_errors: true


@@ -0,0 +1,153 @@
#!/usr/bin/env bash
# =============================================================================
# Deadman Switch ACTION — {{ wizard_name }}
# =============================================================================
# Generated by Ansible on {{ ansible_date_time.iso8601 }}
# DO NOT EDIT MANUALLY.
#
# On healthy check: snapshot current config as "last known good"
# On failed check: rollback config to last known good, restart agent
# =============================================================================
set -euo pipefail
WIZARD_NAME="{{ wizard_name }}"
WIZARD_HOME="{{ wizard_home }}"
CONFIG_FILE="{{ wizard_home }}/config.yaml"
SNAPSHOT_DIR="{{ deadman_snapshot_dir }}"
SNAPSHOT_FILE="${SNAPSHOT_DIR}/config.yaml.known_good"
REQUEST_LOG_DB="{{ request_log_path }}"
LOG_DIR="{{ timmy_log_dir }}"
LOG_FILE="${LOG_DIR}/deadman-${WIZARD_NAME}.log"
MAX_SNAPSHOTS={{ deadman_max_snapshots }}
RESTART_COOLDOWN={{ deadman_restart_cooldown }}
MAX_RESTART_ATTEMPTS={{ deadman_max_restart_attempts }}
COOLDOWN_FILE="${LOG_DIR}/deadman_cooldown_${WIZARD_NAME}"
SERVICE_NAME="hermes-{{ wizard_name | lower }}"
# Ensure directories exist
mkdir -p "${SNAPSHOT_DIR}" "${LOG_DIR}"
log() {
echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] [deadman] [${WIZARD_NAME}] $*" >> "${LOG_FILE}"
echo "[deadman] [${WIZARD_NAME}] $*"
}
log_telemetry() {
local status="$1"
local message="$2"
if [ -f "${REQUEST_LOG_DB}" ]; then
sqlite3 "${REQUEST_LOG_DB}" "INSERT INTO request_log (timestamp, agent_name, provider, model, endpoint, status, error_message) VALUES (datetime('now'), '${WIZARD_NAME}', 'deadman_switch', 'N/A', 'health_check', '${status}', '${message}');" 2>/dev/null || true
fi
}
snapshot_config() {
if [ -f "${CONFIG_FILE}" ]; then
cp "${CONFIG_FILE}" "${SNAPSHOT_FILE}"
# Keep rolling history
cp "${CONFIG_FILE}" "${SNAPSHOT_DIR}/config.yaml.$(date +%s)"
# Prune old snapshots (|| true keeps pipefail from aborting the script when none exist)
ls -t "${SNAPSHOT_DIR}"/config.yaml.[0-9]* 2>/dev/null | tail -n +$((MAX_SNAPSHOTS + 1)) | xargs rm -f 2>/dev/null || true
log "Config snapshot saved."
fi
}
rollback_config() {
if [ -f "${SNAPSHOT_FILE}" ]; then
log "Rolling back config to last known good..."
cp "${SNAPSHOT_FILE}" "${CONFIG_FILE}"
log "Config rolled back."
log_telemetry "fallback" "Config rolled back to last known good by deadman switch"
else
log "ERROR: No known good snapshot found. Pulling from upstream..."
cd "${WIZARD_HOME}/workspace/timmy-config" 2>/dev/null && \
git pull --ff-only origin {{ upstream_branch }} 2>/dev/null && \
cp "wizards/{{ wizard_name | lower }}/config.yaml" "${CONFIG_FILE}" && \
log "Config restored from upstream." || \
log "CRITICAL: Cannot restore config from any source."
fi
}
restart_agent() {
# Check cooldown
if [ -f "${COOLDOWN_FILE}" ]; then
local last_restart
last_restart=$(cat "${COOLDOWN_FILE}")
local now
now=$(date +%s)
local elapsed=$((now - last_restart))
if [ "${elapsed}" -lt "${RESTART_COOLDOWN}" ]; then
log "Restart cooldown active (${elapsed}s / ${RESTART_COOLDOWN}s). Skipping."
return 1
fi
fi
log "Restarting ${SERVICE_NAME}..."
date +%s > "${COOLDOWN_FILE}"
{% if machine_type == 'vps' %}
systemctl restart "${SERVICE_NAME}" 2>/dev/null && \
log "Agent restarted via systemd." || \
log "ERROR: systemd restart failed."
{% else %}
launchctl kickstart -k "ai.hermes.{{ wizard_name | lower }}" 2>/dev/null && \
log "Agent restarted via launchctl." || \
(cd "${WIZARD_HOME}" && hermes agent start --daemon 2>/dev/null && \
log "Agent restarted via hermes CLI.") || \
log "ERROR: All restart methods failed."
{% endif %}
log_telemetry "success" "Agent restarted by deadman switch"
}
# --- Health Check ---
check_health() {
# Check 1: Is the agent process running?
{% if machine_type == 'vps' %}
if ! systemctl is-active --quiet "${SERVICE_NAME}" 2>/dev/null; then
if ! pgrep -f "hermes" > /dev/null 2>/dev/null; then
log "FAIL: Agent process not running."
return 1
fi
fi
{% else %}
if ! pgrep -f "hermes" > /dev/null 2>/dev/null; then
log "FAIL: Agent process not running."
return 1
fi
{% endif %}
# Check 2: Is the API port responding?
{% if machine_type == 'mac' %}
# macOS ships without GNU coreutils `timeout`; BSD nc provides its own connect timeout (-G)
if ! nc -z -G 5 127.0.0.1 {{ api_port }} 2>/dev/null; then
{% else %}
if ! timeout 10 bash -c "echo > /dev/tcp/127.0.0.1/{{ api_port }}" 2>/dev/null; then
{% endif %}
log "FAIL: API port {{ api_port }} not responding."
return 1
fi
# Check 3: Does the config contain banned providers?
if grep -qi 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' "${CONFIG_FILE}" 2>/dev/null; then
log "FAIL: Config contains banned provider (Anthropic). Rolling back."
return 1
fi
return 0
}
# --- Main ---
main() {
log "Health check starting..."
if check_health; then
log "HEALTHY — snapshotting config."
snapshot_config
log_telemetry "success" "Health check passed"
else
log "UNHEALTHY — initiating recovery."
log_telemetry "error" "Health check failed — initiating rollback"
rollback_config
restart_agent
fi
log "Health check complete."
}
main "$@"


@@ -0,0 +1,22 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<!-- Deadman Switch — {{ wizard_name }}. Generated by Ansible. DO NOT EDIT MANUALLY. -->
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.timmy.deadman.{{ wizard_name | lower }}</string>
<key>ProgramArguments</key>
<array>
<string>/bin/bash</string>
<string>{{ wizard_home }}/deadman_action.sh</string>
</array>
<key>StartInterval</key>
<integer>{{ deadman_check_interval }}</integer>
<key>RunAtLoad</key>
<true/>
<key>StandardOutPath</key>
<string>{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log</string>
<key>StandardErrorPath</key>
<string>{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log</string>
</dict>
</plist>


@@ -0,0 +1,16 @@
# Deadman Switch — {{ wizard_name }}
# Generated by Ansible. DO NOT EDIT MANUALLY.
[Unit]
Description=Deadman Switch for {{ wizard_name }} wizard
After=network.target
[Service]
Type=oneshot
ExecStart={{ wizard_home }}/deadman_action.sh
User={{ ansible_user | default('root') }}
StandardOutput=append:{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log
StandardError=append:{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log
[Install]
WantedBy=multi-user.target


@@ -0,0 +1,14 @@
# Deadman Switch Timer — {{ wizard_name }}
# Generated by Ansible. DO NOT EDIT MANUALLY.
# Runs every {{ deadman_check_interval // 60 }} minutes.
[Unit]
Description=Deadman Switch Timer for {{ wizard_name }} wizard
[Timer]
OnBootSec=60
OnUnitActiveSec={{ deadman_check_interval }}s
AccuracySec=30s
[Install]
WantedBy=timers.target


@@ -0,0 +1,6 @@
---
# golden_state defaults
# The golden_state_providers list is defined in group_vars/wizards.yml
# and inventory/hosts.yml (global vars).
golden_state_enforce: true
golden_state_backup_before_deploy: true


@@ -0,0 +1,46 @@
---
# =============================================================================
# golden_state/tasks — Deploy and enforce golden state provider chain
# =============================================================================
- name: "Backup current config before golden state deploy"
copy:
src: "{{ wizard_home }}/config.yaml"
dest: "{{ wizard_home }}/config.yaml.pre-golden-{{ ansible_date_time.epoch }}"
remote_src: true
when: golden_state_backup_before_deploy
ignore_errors: true
- name: "Deploy golden state wizard config"
template:
src: "../../wizard_base/templates/wizard_config.yaml.j2"
dest: "{{ wizard_home }}/config.yaml"
mode: "0644"
backup: true
notify:
- "Restart hermes agent (systemd)"
- "Restart hermes agent (launchctl)"
- name: "Scan for banned providers in all config files"
shell: |
FOUND=0
for f in {{ wizard_home }}/config.yaml {{ hermes_home }}/config.yaml; do
if [ -f "$f" ]; then
if grep -qi 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' "$f"; then
echo "BANNED PROVIDER in $f:"
grep -ni 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' "$f"
FOUND=1
fi
fi
done
exit $FOUND
register: provider_scan
changed_when: false
failed_when: provider_scan.rc != 0 and golden_state_enforce
- name: "Report golden state deployment"
debug:
msg: >
{{ wizard_name }} golden state deployed.
Provider chain: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}.
Banned provider scan: {{ 'CLEAN' if provider_scan.rc == 0 else 'VIOLATIONS FOUND' }}.


@@ -0,0 +1,64 @@
-- =============================================================================
-- request_log — Inference Telemetry Table
-- =============================================================================
-- Every agent writes to this table BEFORE and AFTER every inference call.
-- No exceptions. No summarizing. No describing what you would log.
-- Actually write the row.
--
-- Source: KT Bezalel Architecture Session 2026-04-08
-- =============================================================================
CREATE TABLE IF NOT EXISTS request_log (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp TEXT NOT NULL DEFAULT (datetime('now')),
agent_name TEXT NOT NULL,
provider TEXT NOT NULL,
model TEXT NOT NULL,
endpoint TEXT NOT NULL,
tokens_in INTEGER,
tokens_out INTEGER,
latency_ms INTEGER,
status TEXT NOT NULL, -- 'success', 'error', 'timeout', 'fallback'
error_message TEXT
);
-- Index for common queries
CREATE INDEX IF NOT EXISTS idx_request_log_agent
ON request_log (agent_name, timestamp);
CREATE INDEX IF NOT EXISTS idx_request_log_provider
ON request_log (provider, timestamp);
CREATE INDEX IF NOT EXISTS idx_request_log_status
ON request_log (status, timestamp);
-- View: recent activity per agent (last hour)
CREATE VIEW IF NOT EXISTS v_recent_activity AS
SELECT
agent_name,
provider,
model,
status,
COUNT(*) as call_count,
AVG(latency_ms) as avg_latency_ms,
SUM(tokens_in) as total_tokens_in,
SUM(tokens_out) as total_tokens_out
FROM request_log
WHERE timestamp > datetime('now', '-1 hour')
GROUP BY agent_name, provider, model, status;
-- View: provider reliability (last 24 hours)
CREATE VIEW IF NOT EXISTS v_provider_reliability AS
SELECT
provider,
model,
COUNT(*) as total_calls,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as errors,
SUM(CASE WHEN status = 'timeout' THEN 1 ELSE 0 END) as timeouts,
SUM(CASE WHEN status = 'fallback' THEN 1 ELSE 0 END) as fallbacks,
ROUND(100.0 * SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) / COUNT(*), 1) as success_rate,
AVG(latency_ms) as avg_latency_ms
FROM request_log
WHERE timestamp > datetime('now', '-24 hours')
GROUP BY provider, model;
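The write-the-row discipline described above can be sketched as a thin wrapper around each inference call. This is an illustrative helper under the schema in this file, not the harness's actual logging hook; `log_request` and its signature are assumptions:

```python
import sqlite3
import time

def log_request(db_path, agent, provider, model, endpoint, call):
    """Run one inference call and write exactly one request_log row for it."""
    conn = sqlite3.connect(db_path)
    start = time.monotonic()
    status, error, result = "success", None, None
    try:
        result = call()
    except Exception as exc:
        status, error = "error", str(exc)
    latency_ms = int((time.monotonic() - start) * 1000)
    conn.execute(
        "INSERT INTO request_log (agent_name, provider, model, endpoint, "
        "latency_ms, status, error_message) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (agent, provider, model, endpoint, latency_ms, status, error),
    )
    conn.commit()
    conn.close()
    return result
```

Because `timestamp` defaults to `datetime('now')` in the schema, the row timestamps itself; the wrapper only supplies identity, latency, and outcome.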


@@ -0,0 +1,50 @@
---
# =============================================================================
# request_log/tasks — Deploy Telemetry Table
# =============================================================================
# "This is non-negotiable infrastructure. Without it, we cannot verify
# if any agent actually executed what it claims."
# — KT Bezalel 2026-04-08
# =============================================================================
- name: "Create telemetry directory"
file:
path: "{{ request_log_path | dirname }}"
state: directory
mode: "0755"
- name: "Deploy request_log schema"
copy:
src: request_log_schema.sql
dest: "{{ wizard_home }}/request_log_schema.sql"
mode: "0644"
- name: "Initialize request_log database"
shell: |
sqlite3 "{{ request_log_path }}" < "{{ wizard_home }}/request_log_schema.sql"
args:
creates: "{{ request_log_path }}"
- name: "Verify request_log table exists"
shell: |
sqlite3 "{{ request_log_path }}" ".tables" | grep -q "request_log"
register: table_check
changed_when: false
failed_when: false # Report the result below instead of aborting the play
- name: "Verify request_log schema matches"
shell: |
sqlite3 "{{ request_log_path }}" ".schema request_log" | grep -q "agent_name"
register: schema_check
changed_when: false
failed_when: false
- name: "Set permissions on request_log database"
file:
path: "{{ request_log_path }}"
mode: "0644"
- name: "Report request_log status"
debug:
msg: >
{{ wizard_name }} request_log: {{ request_log_path }}
— table exists: {{ table_check.rc == 0 }}
— schema valid: {{ schema_check.rc == 0 }}


@@ -0,0 +1,6 @@
---
# wizard_base defaults
wizard_user: "{{ ansible_user | default('root') }}"
wizard_group: "{{ ansible_user | default('root') }}"
timmy_base_dir: "~/.local/timmy"
timmy_config_repo: "https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"


@@ -0,0 +1,11 @@
---
- name: "Restart hermes agent (systemd)"
systemd:
name: "hermes-{{ wizard_name | lower }}"
state: restarted
when: machine_type == 'vps'
- name: "Restart hermes agent (launchctl)"
shell: "launchctl kickstart -k ai.hermes.{{ wizard_name | lower }}"
when: machine_type == 'mac'
ignore_errors: true


@@ -0,0 +1,69 @@
---
# =============================================================================
# wizard_base/tasks — Common wizard setup
# =============================================================================
- name: "Create wizard directories"
file:
path: "{{ item }}"
state: directory
mode: "0755"
loop:
- "{{ wizard_home }}"
- "{{ wizard_home }}/workspace"
- "{{ hermes_home }}"
- "{{ hermes_home }}/bin"
- "{{ hermes_home }}/skins"
- "{{ hermes_home }}/playbooks"
- "{{ hermes_home }}/memories"
- "~/.local/timmy"
- "~/.local/timmy/fleet-health"
- "~/.local/timmy/snapshots"
- "~/.timmy"
- name: "Clone/update timmy-config"
git:
repo: "{{ upstream_repo }}"
dest: "{{ wizard_home }}/workspace/timmy-config"
version: "{{ upstream_branch }}"
force: false
update: true
ignore_errors: true # May fail on first run if Git credentials are not yet configured
- name: "Deploy SOUL.md"
copy:
src: "{{ wizard_home }}/workspace/timmy-config/SOUL.md"
dest: "~/.timmy/SOUL.md"
remote_src: true
mode: "0644"
ignore_errors: true
- name: "Deploy thin config (immutable pointer to upstream)"
template:
src: thin_config.yml.j2
dest: "{{ thin_config_path }}"
mode: "{{ thin_config_mode }}"
tags: [thin_config]
- name: "Ensure Python3 and pip are available"
package:
name:
- python3
- python3-pip
state: present
when: machine_type == 'vps'
ignore_errors: true
- name: "Ensure PyYAML is installed (for config validation)"
pip:
name: pyyaml
state: present
when: machine_type == 'vps'
ignore_errors: true
- name: "Create Ansible log directory"
file:
path: /var/log/ansible
state: directory
mode: "0755"
ignore_errors: true


@@ -0,0 +1,41 @@
# =============================================================================
# Thin Config — {{ wizard_name }}
# =============================================================================
# THIS FILE IS READ-ONLY. Agents CANNOT modify it.
# It contains only pointers to upstream. The actual config lives in Gitea.
#
# Agent wakes up → pulls config from upstream → loads → runs.
# If anything tries to mutate this → fails gracefully → pulls fresh on restart.
#
# Only way to permanently change config: commit to Gitea, merge PR, Ansible deploys.
#
# Generated by Ansible on {{ ansible_date_time.iso8601 }}
# DO NOT EDIT MANUALLY.
# =============================================================================
identity:
wizard_name: "{{ wizard_name }}"
wizard_role: "{{ wizard_role }}"
machine: "{{ inventory_hostname }}"
upstream:
repo: "{{ upstream_repo }}"
branch: "{{ upstream_branch }}"
config_path: "wizards/{{ wizard_name | lower }}/config.yaml"
pull_on_wake: {{ config_pull_on_wake | lower }}
recovery:
deadman_enabled: {{ deadman_enabled | lower }}
snapshot_dir: "{{ deadman_snapshot_dir }}"
restart_cooldown: {{ deadman_restart_cooldown }}
max_restart_attempts: {{ deadman_max_restart_attempts }}
escalation_channel: "{{ deadman_escalation_channel }}"
telemetry:
request_log_path: "{{ request_log_path }}"
request_log_enabled: {{ request_log_enabled | lower }}
# Runtime overrides go here. They are EPHEMERAL — not persisted across restarts.
# On restart, this section is reset to empty.
local_overrides: {}
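The wake-up flow the header describes (read the pointer, pull upstream, load the real config) can be sketched as follows. `resolve_config_path` is a hypothetical helper operating on the already-parsed thin config; the real harness's loader may differ:

```python
import subprocess
from pathlib import Path

def resolve_config_path(thin: dict, repo_dir: Path) -> Path:
    """Follow the thin-config pointer: optionally pull upstream, return the real config path."""
    if thin["upstream"]["pull_on_wake"]:
        subprocess.run(
            ["git", "-C", str(repo_dir), "pull", "--ff-only", "origin", thin["upstream"]["branch"]],
            check=False,  # a failed pull falls back to the last-pulled config on disk
        )
    return repo_dir / thin["upstream"]["config_path"]
```

Nothing here writes to the thin config itself, which is the point of the pattern: the only mutable state is the upstream clone, and that converges on every wake.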


@@ -0,0 +1,115 @@
# =============================================================================
# {{ wizard_name }} — Wizard Configuration (Golden State)
# =============================================================================
# Generated by Ansible on {{ ansible_date_time.iso8601 }}
# DO NOT EDIT MANUALLY. Changes go through Gitea PR → Ansible deploy.
#
# Provider chain: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}
# Anthropic is PERMANENTLY BANNED.
# =============================================================================
model:
default: {{ wizard_model_primary }}
provider: {{ wizard_provider_primary }}
context_length: 65536
base_url: {{ golden_state_providers[0].base_url }}
toolsets:
- all
fallback_providers:
{% for provider in golden_state_providers %}
- provider: {{ provider.name }}
model: {{ provider.model }}
{% if provider.base_url is defined %}
base_url: {{ provider.base_url }}
{% endif %}
{% if provider.api_key_env is defined %}
api_key_env: {{ provider.api_key_env }}
{% endif %}
timeout: {{ provider.timeout }}
reason: "{{ provider.reason }}"
{% endfor %}
agent:
max_turns: {{ agent_max_turns }}
reasoning_effort: {{ agent_reasoning_effort }}
verbose: {{ agent_verbose | lower }}
terminal:
backend: local
cwd: .
timeout: 180
persistent_shell: true
browser:
inactivity_timeout: 120
command_timeout: 30
record_sessions: false
display:
compact: false
personality: ''
resume_display: full
busy_input_mode: interrupt
bell_on_complete: false
show_reasoning: false
streaming: false
show_cost: false
tool_progress: all
memory:
memory_enabled: true
user_profile_enabled: true
memory_char_limit: 2200
user_char_limit: 1375
nudge_interval: 10
flush_min_turns: 6
approvals:
mode: {{ agent_approval_mode }}
security:
redact_secrets: true
tirith_enabled: false
platforms:
api_server:
enabled: true
extra:
host: 127.0.0.1
port: {{ api_port }}
session_reset:
mode: none
idle_minutes: 0
skills:
creation_nudge_interval: 15
system_prompt_suffix: |
You are {{ wizard_name }}, {{ wizard_role }}.
Your soul is defined in SOUL.md — read it, live it.
Hermes is your harness.
{{ golden_state_providers[0].name }} is your primary provider.
Refusal over fabrication. If you do not know, say so.
Sovereignty and service always.
providers:
{% for provider in golden_state_providers %}
{{ provider.name }}:
base_url: {{ provider.base_url }}
timeout: {{ provider.timeout | default(60) }}
{% if provider.name == 'kimi-coding' %}
max_retries: 3
{% endif %}
{% endfor %}
# =============================================================================
# BANNED PROVIDERS — DO NOT ADD
# =============================================================================
# The following providers are PERMANENTLY BANNED:
# - anthropic (any model: claude-sonnet, claude-opus, claude-haiku)
# Enforcement: pre-commit hook, linter, Ansible validation, this comment.
# Adding any banned provider will cause Ansible deployment to FAIL.
# =============================================================================


@@ -0,0 +1,75 @@
#!/usr/bin/env bash
# =============================================================================
# Gitea Webhook Handler — Trigger Ansible Deploy on Merge
# =============================================================================
# This script is called by the Gitea webhook when a PR is merged
# to the main branch of timmy-config.
#
# Setup:
# 1. Add webhook in Gitea: Settings → Webhooks → Add Webhook
# 2. URL: http://localhost:9000/hooks/deploy-timmy-config
# 3. Events: Pull Request (merged only)
# 4. Secret: <configured in Gitea>
#
# This script runs ansible-pull to update the local machine.
# For fleet-wide deploys, each machine runs ansible-pull independently.
# =============================================================================
set -euo pipefail
REPO="https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"
BRANCH="main"
ANSIBLE_DIR="ansible"
LOG_FILE="/var/log/ansible/webhook-deploy.log"
LOCK_FILE="/tmp/ansible-deploy.lock"
mkdir -p "$(dirname "${LOG_FILE}")"
log() {
echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] [webhook] $*" | tee -a "${LOG_FILE}"
}
# Prevent concurrent deploys
if [ -f "${LOCK_FILE}" ]; then
LOCK_AGE=$(( $(date +%s) - $(stat -c %Y "${LOCK_FILE}" 2>/dev/null || stat -f %m "${LOCK_FILE}" 2>/dev/null || echo 0) ))  # GNU stat on Linux, BSD stat on the Mac (Timmy)
if [ "${LOCK_AGE}" -lt 300 ]; then
log "Deploy already in progress (lock age: ${LOCK_AGE}s). Skipping."
exit 0
else
log "Stale lock file (${LOCK_AGE}s old). Removing."
rm -f "${LOCK_FILE}"
fi
fi
trap 'rm -f "${LOCK_FILE}"' EXIT
touch "${LOCK_FILE}"
log "Webhook triggered. Starting ansible-pull..."
# Pull latest config
cd /tmp
rm -rf timmy-config-deploy
git clone --depth 1 --branch "${BRANCH}" "${REPO}" timmy-config-deploy 2>&1 | tee -a "${LOG_FILE}"
cd timmy-config-deploy/${ANSIBLE_DIR}
# Run Ansible against localhost
log "Running Ansible playbook..."
set +e  # capture the playbook's own exit status; under errexit the failing tee pipeline would abort first
ansible-playbook \
-i inventory/hosts.yml \
playbooks/site.yml \
--limit "$(hostname)" \
--diff \
2>&1 | tee -a "${LOG_FILE}"
RESULT=${PIPESTATUS[0]}  # $? would report tee's status, not ansible-playbook's
set -e
if [ ${RESULT} -eq 0 ]; then
log "Deploy successful."
else
log "ERROR: Deploy failed with exit code ${RESULT}."
fi
# Cleanup
rm -rf /tmp/timmy-config-deploy
log "Webhook handler complete."
exit ${RESULT}
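The setup comments above mention a shared webhook secret. Gitea signs each delivery with an HMAC-SHA256 hex digest of the raw request body, sent in the `X-Gitea-Signature` header. A minimal sketch (assuming that header format) of how the listener on port 9000 could verify a payload before invoking this script:

```python
import hmac
import hashlib

def verify_gitea_signature(secret: str, payload: bytes, signature_hex: str) -> bool:
    """Constant-time compare of the X-Gitea-Signature header against HMAC-SHA256(secret, body)."""
    expected = hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

# Simulate a delivery: the receiver reads the raw body and the signature header.
body = b'{"action": "closed", "pull_request": {"merged": true}}'
sig = hmac.new(b"topsecret", body, hashlib.sha256).hexdigest()
assert verify_gitea_signature("topsecret", body, sig)
assert not verify_gitea_signature("wrong-secret", body, sig)
```

A listener that fails this check should return 403 and never touch the deploy lock.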


@@ -0,0 +1,155 @@
#!/usr/bin/env python3
"""
Config Validator — The Timmy Foundation
Validates wizard configs against golden state rules.
Run before any config deploy to catch violations early.
Usage:
python3 validate_config.py <config_file>
python3 validate_config.py --all # Validate all wizard configs
Exit codes:
0 — All validations passed
1 — Validation errors found
2 — File not found or parse error
"""
import sys
import os
import yaml
import fnmatch
from pathlib import Path
# === BANNED PROVIDERS — HARD POLICY ===
BANNED_PROVIDERS = {"anthropic", "claude"}
BANNED_MODEL_PATTERNS = [
"claude-*",
"anthropic/*",
"*sonnet*",
"*opus*",
"*haiku*",
]
# === REQUIRED FIELDS ===
REQUIRED_FIELDS = {
"model": ["default", "provider"],
"fallback_providers": None, # Must exist as a list
}
def is_banned_model(model_name: str) -> bool:
"""Check if a model name matches any banned pattern."""
model_lower = model_name.lower()
for pattern in BANNED_MODEL_PATTERNS:
if fnmatch.fnmatch(model_lower, pattern):
return True
return False
def validate_config(config_path: str) -> list[str]:
"""Validate a wizard config file. Returns list of error strings."""
errors = []
try:
with open(config_path) as f:
cfg = yaml.safe_load(f)
except FileNotFoundError:
return [f"File not found: {config_path}"]
except yaml.YAMLError as e:
return [f"YAML parse error: {e}"]
if not cfg:
return ["Config file is empty"]
# Check required fields
for section, fields in REQUIRED_FIELDS.items():
if section not in cfg:
errors.append(f"Missing required section: {section}")
elif fields:
for field in fields:
if field not in cfg[section]:
errors.append(f"Missing required field: {section}.{field}")
# Check default provider
model_cfg = cfg.get("model") or {}  # tolerate an explicit `model: null`
default_provider = str(model_cfg.get("provider", ""))
if default_provider.lower() in BANNED_PROVIDERS:
errors.append(f"BANNED default provider: {default_provider}")
default_model = str(model_cfg.get("default", ""))
if is_banned_model(default_model):
errors.append(f"BANNED default model: {default_model}")
# Check fallback providers
for i, fb in enumerate(cfg.get("fallback_providers") or []):
if not isinstance(fb, dict):
errors.append(f"fallback_providers[{i}] must be a mapping, got {type(fb).__name__}")
continue
provider = str(fb.get("provider", ""))
model = str(fb.get("model", ""))
if provider.lower() in BANNED_PROVIDERS:
errors.append(f"BANNED fallback provider [{i}]: {provider}")
if is_banned_model(model):
errors.append(f"BANNED fallback model [{i}]: {model}")
# Check providers section
for name, provider_cfg in (cfg.get("providers") or {}).items():
if name.lower() in BANNED_PROVIDERS:
errors.append(f"BANNED provider in providers section: {name}")
base_url = str((provider_cfg or {}).get("base_url", ""))
if "anthropic" in base_url.lower():
errors.append(f"BANNED URL in provider {name}: {base_url}")
# Check system prompt for banned references
prompt = cfg.get("system_prompt_suffix", "")
if isinstance(prompt, str):
for banned in BANNED_PROVIDERS:
if banned in prompt.lower():
errors.append(f"BANNED provider referenced in system_prompt_suffix: {banned}")
return errors
def main():
if len(sys.argv) < 2:
print(f"Usage: {sys.argv[0]} <config_file> [--all]")
sys.exit(2)
if sys.argv[1] == "--all":
# Validate all wizard configs in the repo
repo_root = Path(__file__).parent.parent.parent
wizard_dir = repo_root / "wizards"
all_errors = {}
for wizard_path in sorted(wizard_dir.iterdir()):
config_file = wizard_path / "config.yaml"
if config_file.exists():
errors = validate_config(str(config_file))
if errors:
all_errors[wizard_path.name] = errors
if all_errors:
print("VALIDATION FAILED:")
for wizard, errors in all_errors.items():
print(f"\n {wizard}:")
for err in errors:
print(f" - {err}")
sys.exit(1)
else:
print("All wizard configs passed validation.")
sys.exit(0)
else:
config_path = sys.argv[1]
errors = validate_config(config_path)
if errors:
print(f"VALIDATION FAILED for {config_path}:")
for err in errors:
print(f" - {err}")
sys.exit(1)
else:
print(f"PASSED: {config_path}")
sys.exit(0)
if __name__ == "__main__":
main()
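The ban check above is plain `fnmatch` globbing over a lowercased model name. A standalone illustration (patterns copied from the script) of which names the globs catch:

```python
import fnmatch

# Patterns as defined in the validator's BANNED_MODEL_PATTERNS.
BANNED_MODEL_PATTERNS = ["claude-*", "anthropic/*", "*sonnet*", "*opus*", "*haiku*"]

def is_banned_model(model_name: str) -> bool:
    """True if the lowercased name matches any banned glob."""
    model_lower = model_name.lower()
    return any(fnmatch.fnmatch(model_lower, p) for p in BANNED_MODEL_PATTERNS)

assert is_banned_model("Claude-Opus-4")        # matches claude-* (and *opus*)
assert is_banned_model("some-sonnet-variant")  # matches *sonnet*
assert not is_banned_model("kimi-k2")          # no pattern matches
assert not is_banned_model("gemini-2.5-pro")
```

Because matching is case-insensitive and substring-based (`*sonnet*`), renamed or vendor-prefixed variants are still rejected.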

bin/agent-loop.sh Executable file

@@ -0,0 +1,273 @@
#!/usr/bin/env bash
# agent-loop.sh — Universal agent dev loop with Genchi Genbutsu verification
#
# Usage: agent-loop.sh <agent-name> [num-workers]
# agent-loop.sh claude 2
# agent-loop.sh gemini 1
#
# Dispatches via agent-dispatch.sh, then verifies with genchi-genbutsu.sh.
set -uo pipefail
AGENT="${1:?Usage: agent-loop.sh <agent-name> [num-workers]}"
NUM_WORKERS="${2:-1}"
# Resolve agent tool and model from config or fallback
case "$AGENT" in
claude) TOOL="claude"; MODEL="sonnet" ;;
gemini) TOOL="gemini"; MODEL="gemini-2.5-pro-preview-05-06" ;;
grok) TOOL="opencode"; MODEL="grok-3-fast" ;;
*) TOOL="$AGENT"; MODEL="" ;;
esac
# === CONFIG ===
GITEA_URL="${GITEA_URL:-https://forge.alexanderwhitestone.com}"
GITEA_TOKEN="${GITEA_TOKEN:-}"
WORKTREE_BASE="$HOME/worktrees"
LOG_DIR="$HOME/.hermes/logs"
LOCK_DIR="$LOG_DIR/${AGENT}-locks"
SKIP_FILE="$LOG_DIR/${AGENT}-skip-list.json"
ACTIVE_FILE="$LOG_DIR/${AGENT}-active.json"
TIMEOUT=600
COOLDOWN=30
mkdir -p "$LOG_DIR" "$WORKTREE_BASE" "$LOCK_DIR"
[ -f "$SKIP_FILE" ] || echo '{}' > "$SKIP_FILE"
echo '{}' > "$ACTIVE_FILE"
# === SHARED FUNCTIONS ===
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] ${AGENT}: $*" >> "$LOG_DIR/${AGENT}-loop.log"
}
lock_issue() {
local key="$1"
mkdir "$LOCK_DIR/$key.lock" 2>/dev/null && echo $$ > "$LOCK_DIR/$key.lock/pid"
}
unlock_issue() {
rm -rf "$LOCK_DIR/$1.lock" 2>/dev/null
}
mark_skip() {
local issue_num="$1" reason="$2"
python3 -c "
import json, time, fcntl
with open('${SKIP_FILE}', 'r+') as f:
fcntl.flock(f, fcntl.LOCK_EX)
try: skips = json.load(f)
except: skips = {}
failures = skips.get(str($issue_num), {}).get('failures', 0) + 1
skip_hours = 6 if failures >= 3 else 1
skips[str($issue_num)] = {'until': time.time() + (skip_hours * 3600), 'reason': '$reason', 'failures': failures}
f.seek(0); f.truncate()
json.dump(skips, f, indent=2)
" 2>/dev/null
}
get_next_issue() {
python3 -c "
import json, sys, time, urllib.request, os
token = '${GITEA_TOKEN}'
base = '${GITEA_URL}'
repos = ['Timmy_Foundation/the-nexus', 'Timmy_Foundation/timmy-config', 'Timmy_Foundation/hermes-agent']
try:
with open('${SKIP_FILE}') as f: skips = json.load(f)
except: skips = {}
try:
with open('${ACTIVE_FILE}') as f: active = json.load(f); active_issues = {v['issue'] for v in active.values()}
except: active_issues = set()
all_issues = []
for repo in repos:
url = f'{base}/api/v1/repos/{repo}/issues?state=open&type=issues&limit=50&sort=created'
req = urllib.request.Request(url, headers={'Authorization': f'token {token}'})
try:
resp = urllib.request.urlopen(req, timeout=10)
issues = json.loads(resp.read())
for i in issues: i['_repo'] = repo
all_issues.extend(issues)
except: continue
for i in sorted(all_issues, key=lambda x: x['title'].lower()):
assignees = [a['login'] for a in (i.get('assignees') or [])]
if assignees and '${AGENT}' not in assignees: continue
num_str = str(i['number'])
if num_str in active_issues: continue
if skips.get(num_str, {}).get('until', 0) > time.time(): continue
lock = '${LOCK_DIR}/' + i['_repo'].replace('/', '-') + '-' + num_str + '.lock'
if os.path.isdir(lock): continue
owner, name = i['_repo'].split('/')
print(json.dumps({'number': i['number'], 'title': i['title'], 'repo_owner': owner, 'repo_name': name, 'repo': i['_repo']}))
sys.exit(0)
print('null')
" 2>/dev/null
}
# === WORKER FUNCTION ===
run_worker() {
local worker_id="$1"
local consecutive_failures=0
log "WORKER-${worker_id}: Started"
while true; do
issue_json=$(get_next_issue)
if [ "$issue_json" = "null" ] || [ -z "$issue_json" ]; then
sleep 30
continue
fi
issue_num=$(echo "$issue_json" | python3 -c "import sys,json; print(json.load(sys.stdin)['number'])")
issue_title=$(echo "$issue_json" | python3 -c "import sys,json; print(json.load(sys.stdin)['title'])")
repo_owner=$(echo "$issue_json" | python3 -c "import sys,json; print(json.load(sys.stdin)['repo_owner'])")
repo_name=$(echo "$issue_json" | python3 -c "import sys,json; print(json.load(sys.stdin)['repo_name'])")
issue_key="${repo_owner}-${repo_name}-${issue_num}"
branch="${AGENT}/issue-${issue_num}"
worktree="${WORKTREE_BASE}/${AGENT}-w${worker_id}-${issue_num}"
if ! lock_issue "$issue_key"; then
sleep 5
continue
fi
log "WORKER-${worker_id}: === ISSUE #${issue_num}: ${issue_title} (${repo_owner}/${repo_name}) ==="
# Clone / checkout
rm -rf "$worktree" 2>/dev/null
CLONE_URL="http://${AGENT}:${GITEA_TOKEN}@143.198.27.163:3000/${repo_owner}/${repo_name}.git"
if git ls-remote --heads "$CLONE_URL" "$branch" 2>/dev/null | grep -q "$branch"; then
git clone --depth=50 -b "$branch" "$CLONE_URL" "$worktree" >/dev/null 2>&1
else
git clone --depth=1 -b main "$CLONE_URL" "$worktree" >/dev/null 2>&1
cd "$worktree" && git checkout -b "$branch" >/dev/null 2>&1
fi
cd "$worktree"
# Generate prompt
prompt=$(bash "$(dirname "$0")/agent-dispatch.sh" "$AGENT" "$issue_num" "${repo_owner}/${repo_name}")
CYCLE_START=$(date +%s)
set +e
if [ "$TOOL" = "claude" ]; then
env -u CLAUDECODE gtimeout "$TIMEOUT" claude \
--print --model "$MODEL" --dangerously-skip-permissions \
-p "$prompt" </dev/null >> "$LOG_DIR/${AGENT}-${issue_num}.log" 2>&1
elif [ "$TOOL" = "gemini" ]; then
gtimeout "$TIMEOUT" gemini -p "$prompt" --yolo \
</dev/null >> "$LOG_DIR/${AGENT}-${issue_num}.log" 2>&1
else
gtimeout "$TIMEOUT" "$TOOL" "$prompt" \
</dev/null >> "$LOG_DIR/${AGENT}-${issue_num}.log" 2>&1
fi
exit_code=$?
# errexit intentionally stays off here: the script runs under `set -uo pipefail` only, and the worker loop must survive individual command failures
CYCLE_END=$(date +%s)
CYCLE_DURATION=$((CYCLE_END - CYCLE_START))
# Salvage
cd "$worktree" 2>/dev/null || true
DIRTY=$(git status --porcelain 2>/dev/null | wc -l | tr -d ' ')
if [ "${DIRTY:-0}" -gt 0 ]; then
git add -A 2>/dev/null
git commit -m "WIP: ${AGENT} progress on #${issue_num}
Automated salvage commit — agent session ended (exit $exit_code)." 2>/dev/null || true
fi
UNPUSHED=$(git log --oneline "origin/main..HEAD" 2>/dev/null | wc -l | tr -d ' ')
if [ "${UNPUSHED:-0}" -gt 0 ]; then
git push -u origin "$branch" 2>/dev/null && \
log "WORKER-${worker_id}: Pushed $UNPUSHED commit(s) on $branch" || \
log "WORKER-${worker_id}: Push failed for $branch"
fi
# Create PR if needed
pr_num=$(curl -sf "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls?state=open&head=${repo_owner}:${branch}&limit=1" \
-H "Authorization: token ${GITEA_TOKEN}" | python3 -c "
import sys,json
prs = json.load(sys.stdin)
print(prs[0]['number'] if prs else '')
" 2>/dev/null)
if [ -z "$pr_num" ] && [ "${UNPUSHED:-0}" -gt 0 ]; then
pr_num=$(curl -sf -X POST "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d "$(python3 -c "
import json
print(json.dumps({
'title': '${AGENT}: Issue #${issue_num}',
'head': '${branch}',
'base': 'main',
'body': 'Automated PR for issue #${issue_num}.\nExit code: ${exit_code}'
}))
")" | python3 -c "import sys,json; print(json.load(sys.stdin).get('number',''))" 2>/dev/null)
[ -n "$pr_num" ] && log "WORKER-${worker_id}: Created PR #${pr_num} for issue #${issue_num}"
fi
# ── Genchi Genbutsu: verify world state before declaring success ──
VERIFIED="false"
if [ "$exit_code" -eq 0 ]; then
log "WORKER-${worker_id}: SUCCESS #${issue_num} — running genchi-genbutsu"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if verify_result=$("$SCRIPT_DIR/genchi-genbutsu.sh" "$repo_owner" "$repo_name" "$issue_num" "$branch" "$AGENT" 2>/dev/null); then
VERIFIED="true"
log "WORKER-${worker_id}: VERIFIED #${issue_num}"
if [ -n "$pr_num" ]; then
curl -sf -X POST "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}/merge" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"Do": "squash"}' >/dev/null 2>&1 || true
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/issues/${issue_num}" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"state": "closed"}' >/dev/null 2>&1 || true
log "WORKER-${worker_id}: PR #${pr_num} merged, issue #${issue_num} closed"
fi
consecutive_failures=0
else
verify_details=$(echo "$verify_result" | python3 -c "import sys,json; print(json.load(sys.stdin).get('details','unknown'))" 2>/dev/null || echo "unverified")
log "WORKER-${worker_id}: UNVERIFIED #${issue_num} ${verify_details}"
mark_skip "$issue_num" "unverified"
consecutive_failures=$((consecutive_failures + 1))
fi
elif [ "$exit_code" -eq 124 ]; then
log "WORKER-${worker_id}: TIMEOUT #${issue_num} (work saved in PR)"
consecutive_failures=$((consecutive_failures + 1))
else
log "WORKER-${worker_id}: FAILED #${issue_num} exit ${exit_code} (work saved in PR)"
consecutive_failures=$((consecutive_failures + 1))
fi
# ── METRICS ──
python3 -c "
import json, datetime
print(json.dumps({
'ts': datetime.datetime.utcnow().isoformat() + 'Z',
'agent': '${AGENT}',
'worker': $worker_id,
'issue': $issue_num,
'repo': '${repo_owner}/${repo_name}',
'outcome': 'success' if $exit_code == 0 else 'timeout' if $exit_code == 124 else 'failed',
'exit_code': $exit_code,
'duration_s': $CYCLE_DURATION,
'pr': '${pr_num:-}',
'verified': '${VERIFIED:-false}' == 'true'
}))
" >> "$LOG_DIR/${AGENT}-metrics.jsonl" 2>/dev/null
rm -rf "$worktree" 2>/dev/null
unlock_issue "$issue_key"
sleep "$COOLDOWN"
done
}
# === MAIN ===
log "=== Agent Loop Started — ${AGENT} with ${NUM_WORKERS} worker(s) ==="
rm -rf "$LOCK_DIR"/*.lock 2>/dev/null
for i in $(seq 1 "$NUM_WORKERS"); do
run_worker "$i" &
log "Launched worker $i (PID $!)"
sleep 3
done
wait
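`mark_skip` above embeds a stepped backoff in its inline Python: one hour of skip time for the first two failures, six hours from the third onward. The same logic in isolation (field names match the skip-list JSON the script writes):

```python
import time

def mark_skip(skips: dict, issue_num: int, reason: str, now=None) -> dict:
    """Record a failure; skip the issue for 1h on failures 1-2, 6h from the 3rd."""
    now = time.time() if now is None else now
    failures = skips.get(str(issue_num), {}).get("failures", 0) + 1
    skip_hours = 6 if failures >= 3 else 1
    skips[str(issue_num)] = {
        "until": now + skip_hours * 3600,
        "reason": reason,
        "failures": failures,
    }
    return skips

skips = {}
mark_skip(skips, 442, "unverified", now=0)
mark_skip(skips, 442, "unverified", now=0)
assert skips["442"]["until"] == 3600       # second failure: still a 1h skip
mark_skip(skips, 442, "unverified", now=0)
assert skips["442"]["until"] == 6 * 3600   # third failure: backoff jumps to 6h
assert skips["442"]["failures"] == 3
```

In the script this runs under an `fcntl.flock` on the skip file, which keeps concurrent workers from clobbering each other's counts.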


@@ -468,24 +468,32 @@ print(json.dumps({
[ -n "$pr_num" ] && log "WORKER-${worker_id}: Created PR #${pr_num} for issue #${issue_num}"
fi
# ── Merge + close on success ──
# ── Genchi Genbutsu: verify world state before declaring success ──
VERIFIED="false"
if [ "$exit_code" -eq 0 ]; then
log "WORKER-${worker_id}: SUCCESS #${issue_num}"
if [ -n "$pr_num" ]; then
curl -sf -X POST "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}/merge" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"Do": "squash"}' >/dev/null 2>&1 || true
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/issues/${issue_num}" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"state": "closed"}' >/dev/null 2>&1 || true
log "WORKER-${worker_id}: PR #${pr_num} merged, issue #${issue_num} closed"
log "WORKER-${worker_id}: SUCCESS #${issue_num} — running genchi-genbutsu"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if verify_result=$("$SCRIPT_DIR/genchi-genbutsu.sh" "$repo_owner" "$repo_name" "$issue_num" "$branch" "claude" 2>/dev/null); then
VERIFIED="true"
log "WORKER-${worker_id}: VERIFIED #${issue_num}"
if [ -n "$pr_num" ]; then
curl -sf -X POST "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}/merge" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"Do": "squash"}' >/dev/null 2>&1 || true
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/issues/${issue_num}" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"state": "closed"}' >/dev/null 2>&1 || true
log "WORKER-${worker_id}: PR #${pr_num} merged, issue #${issue_num} closed"
fi
consecutive_failures=0
else
verify_details=$(echo "$verify_result" | python3 -c "import sys,json; print(json.load(sys.stdin).get('details','unknown'))" 2>/dev/null || echo "unverified")
log "WORKER-${worker_id}: UNVERIFIED #${issue_num} ${verify_details}"
consecutive_failures=$((consecutive_failures + 1))
fi
consecutive_failures=0
elif [ "$exit_code" -eq 124 ]; then
log "WORKER-${worker_id}: TIMEOUT #${issue_num} (work saved in PR)"
consecutive_failures=$((consecutive_failures + 1))
@@ -522,6 +530,7 @@ print(json.dumps({
import json, datetime
print(json.dumps({
'ts': datetime.datetime.utcnow().isoformat() + 'Z',
'agent': 'claude',
'worker': $worker_id,
'issue': $issue_num,
'repo': '${repo_owner}/${repo_name}',
@@ -534,7 +543,8 @@ print(json.dumps({
'lines_removed': ${LINES_REMOVED:-0},
'salvaged': ${DIRTY:-0},
'pr': '${pr_num:-}',
'merged': $( [ '$OUTCOME' = 'success' ] && [ -n '${pr_num:-}' ] && echo 'true' || echo 'false' )
'merged': $( [ '$OUTCOME' = 'success' ] && [ -n '${pr_num:-}' ] && echo 'true' || echo 'false' ),
'verified': '${VERIFIED:-false}' == 'true'
}))
" >> "$METRICS_FILE" 2>/dev/null


@@ -521,61 +521,63 @@ print(json.dumps({
[ -n "$pr_num" ] && log "WORKER-${worker_id}: Created PR #${pr_num} for issue #${issue_num}"
fi
# ── Verify finish semantics / classify failures ──
# ── Genchi Genbutsu: verify world state before declaring success ──
VERIFIED="false"
if [ "$exit_code" -eq 0 ]; then
log "WORKER-${worker_id}: SUCCESS #${issue_num} exited 0 — verifying push + PR + proof"
if ! remote_branch_exists "$branch"; then
log "WORKER-${worker_id}: BLOCKED #${issue_num} remote branch missing"
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "Loop gate blocked completion: remote branch ${branch} was not found on origin after Gemini exited. Issue remains open for retry."
mark_skip "$issue_num" "missing_remote_branch" 1
consecutive_failures=$((consecutive_failures + 1))
elif [ -z "$pr_num" ]; then
log "WORKER-${worker_id}: BLOCKED #${issue_num} no PR found"
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "Loop gate blocked completion: branch ${branch} exists remotely, but no PR was found. Issue remains open for retry."
mark_skip "$issue_num" "missing_pr" 1
consecutive_failures=$((consecutive_failures + 1))
log "WORKER-${worker_id}: SUCCESS #${issue_num} exited 0 — running genchi-genbutsu"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if verify_result=$("$SCRIPT_DIR/genchi-genbutsu.sh" "$repo_owner" "$repo_name" "$issue_num" "$branch" "gemini" 2>/dev/null); then
VERIFIED="true"
log "WORKER-${worker_id}: VERIFIED #${issue_num}"
pr_state=$(get_pr_state "$repo_owner" "$repo_name" "$pr_num")
if [ "$pr_state" = "open" ]; then
curl -sf -X POST "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}/merge" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"Do": "squash"}' >/dev/null 2>&1 || true
pr_state=$(get_pr_state "$repo_owner" "$repo_name" "$pr_num")
fi
if [ "$pr_state" = "merged" ]; then
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/issues/${issue_num}" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"state": "closed"}' >/dev/null 2>&1 || true
issue_state=$(get_issue_state "$repo_owner" "$repo_name" "$issue_num")
if [ "$issue_state" = "closed" ]; then
log "WORKER-${worker_id}: VERIFIED #${issue_num} branch pushed, PR merged, comment present, issue closed"
consecutive_failures=0
else
log "WORKER-${worker_id}: BLOCKED #${issue_num} issue did not close after merge"
mark_skip "$issue_num" "issue_close_unverified" 1
consecutive_failures=$((consecutive_failures + 1))
fi
else
log "WORKER-${worker_id}: BLOCKED #${issue_num} merge not verified (state=${pr_state})"
mark_skip "$issue_num" "merge_unverified" 1
consecutive_failures=$((consecutive_failures + 1))
fi
else
pr_files=$(get_pr_file_count "$repo_owner" "$repo_name" "$pr_num")
if [ "${pr_files:-0}" -eq 0 ]; then
log "WORKER-${worker_id}: BLOCKED #${issue_num} PR #${pr_num} has 0 changed files"
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}" -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" -d '{"state": "closed"}' >/dev/null 2>&1 || true
verify_details=$(echo "$verify_result" | python3 -c "import sys,json; print(json.load(sys.stdin).get('details','unknown'))" 2>/dev/null || echo "unverified")
verify_checks=$(echo "$verify_result" | python3 -c "import sys,json; print(json.load(sys.stdin).get('checks',''))" 2>/dev/null || echo "")
log "WORKER-${worker_id}: UNVERIFIED #${issue_num} $verify_details"
if echo "$verify_checks" | grep -q '"branch": false'; then
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "Loop gate blocked completion: remote branch ${branch} was not found on origin after Gemini exited. Issue remains open for retry."
mark_skip "$issue_num" "missing_remote_branch" 1
elif echo "$verify_checks" | grep -q '"pr": false'; then
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "Loop gate blocked completion: branch ${branch} exists remotely, but no PR was found. Issue remains open for retry."
mark_skip "$issue_num" "missing_pr" 1
elif echo "$verify_checks" | grep -q '"files": false'; then
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}" \
-H "Authorization: token ${GITEA_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"state": "closed"}' >/dev/null 2>&1 || true
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "PR #${pr_num} was closed automatically: it had 0 changed files (empty commit). Issue remains open for retry."
mark_skip "$issue_num" "empty_commit" 2
consecutive_failures=$((consecutive_failures + 1))
else
proof_status=$(proof_comment_status "$repo_owner" "$repo_name" "$issue_num" "$branch")
proof_state="${proof_status%%|*}"
proof_url="${proof_status#*|}"
if [ "$proof_state" != "ok" ]; then
log "WORKER-${worker_id}: BLOCKED #${issue_num} proof missing or incomplete (${proof_state})"
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "Loop gate blocked completion: PR #${pr_num} exists and has ${pr_files} changed file(s), but the required Proof block from Gemini is missing or incomplete. Issue remains open for retry."
mark_skip "$issue_num" "missing_proof" 1
consecutive_failures=$((consecutive_failures + 1))
else
log "WORKER-${worker_id}: PROOF verified ${proof_url}"
pr_state=$(get_pr_state "$repo_owner" "$repo_name" "$pr_num")
if [ "$pr_state" = "open" ]; then
curl -sf -X POST "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}/merge" -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" -d '{"Do": "squash"}' >/dev/null 2>&1 || true
pr_state=$(get_pr_state "$repo_owner" "$repo_name" "$pr_num")
fi
if [ "$pr_state" = "merged" ]; then
curl -sf -X PATCH "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/issues/${issue_num}" -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" -d '{"state": "closed"}' >/dev/null 2>&1 || true
issue_state=$(get_issue_state "$repo_owner" "$repo_name" "$issue_num")
if [ "$issue_state" = "closed" ]; then
log "WORKER-${worker_id}: VERIFIED #${issue_num} branch pushed, PR merged, proof present, issue closed"
consecutive_failures=0
else
log "WORKER-${worker_id}: BLOCKED #${issue_num} issue did not close after merge"
mark_skip "$issue_num" "issue_close_unverified" 1
consecutive_failures=$((consecutive_failures + 1))
fi
else
log "WORKER-${worker_id}: BLOCKED #${issue_num} merge not verified (state=${pr_state})"
mark_skip "$issue_num" "merge_unverified" 1
consecutive_failures=$((consecutive_failures + 1))
fi
fi
post_issue_comment "$repo_owner" "$repo_name" "$issue_num" "Loop gate blocked completion: PR #${pr_num} exists, but required verification failed ($verify_details). Issue remains open for retry."
mark_skip "$issue_num" "unverified" 1
fi
consecutive_failures=$((consecutive_failures + 1))
fi
elif [ "$exit_code" -eq 124 ]; then
log "WORKER-${worker_id}: TIMEOUT #${issue_num} (work saved in PR)"
@@ -621,7 +623,8 @@ print(json.dumps({
'lines_removed': ${LINES_REMOVED:-0},
'salvaged': ${DIRTY:-0},
'pr': '${pr_num:-}',
'merged': $( [ '$OUTCOME' = 'success' ] && [ -n '${pr_num:-}' ] && echo 'true' || echo 'false' )
'merged': $( [ '$OUTCOME' = 'success' ] && [ -n '${pr_num:-}' ] && echo 'true' || echo 'false' ),
'verified': '${VERIFIED:-false}' == 'true'
}))
" >> "$LOG_DIR/gemini-metrics.jsonl" 2>/dev/null
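The grep chain above picks a skip reason from the first failed check in genchi-genbutsu's `checks` object. A sketch of the same dispatch (check and reason names taken from the surrounding diff; `None` models a check that never ran):

```python
def skip_reason(checks: dict) -> str:
    """First failed world-state check wins, mirroring the grep chain."""
    order = [
        ("branch", "missing_remote_branch"),
        ("pr", "missing_pr"),
        ("files", "empty_commit"),
    ]
    for key, reason in order:
        if checks.get(key) is False:  # None means "not evaluated", not failed
            return reason
    return "unverified"

assert skip_reason({"branch": False, "pr": False}) == "missing_remote_branch"
assert skip_reason({"branch": True, "pr": False}) == "missing_pr"
assert skip_reason({"branch": True, "pr": True, "files": False}) == "empty_commit"
assert skip_reason({"branch": True, "pr": True, "files": True, "mergeable": False}) == "unverified"
```

Parsing the structured `checks` JSON this way would be sturdier than grepping its string form for `"branch": false`.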

bin/genchi-genbutsu.sh Executable file

@@ -0,0 +1,179 @@
#!/usr/bin/env bash
# genchi-genbutsu.sh — 現地現物 — Go and see. Verify world state, not log vibes.
#
# Post-completion verification that goes and LOOKS at the actual artifacts.
# Performs 5 world-state checks:
# 1. Branch exists on remote
# 2. PR exists
# 3. PR has real file changes (> 0)
# 4. PR is mergeable
# 5. Issue has a completion comment from the agent
#
# Usage: genchi-genbutsu.sh <repo_owner> <repo_name> <issue_num> <branch> <agent_name>
# Returns: JSON to stdout, logs JSONL, exit 0 = VERIFIED, exit 1 = UNVERIFIED
set -euo pipefail
GITEA_URL="${GITEA_URL:-https://forge.alexanderwhitestone.com}"
GITEA_TOKEN="${GITEA_TOKEN:-}"
LOG_DIR="${LOG_DIR:-$HOME/.hermes/logs}"
VERIFY_LOG="$LOG_DIR/genchi-genbutsu.jsonl"
if [ $# -lt 5 ]; then
echo "Usage: $0 <repo_owner> <repo_name> <issue_num> <branch> <agent_name>" >&2
exit 2
fi
repo_owner="$1"
repo_name="$2"
issue_num="$3"
branch="$4"
agent_name="$5"
mkdir -p "$LOG_DIR"
# ── Helpers ──────────────────────────────────────────────────────────
check_branch_exists() {
# Use Gitea API instead of git ls-remote so we don't need clone credentials
curl -sf "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/branches/${branch}" \
-H "Authorization: token ${GITEA_TOKEN}" >/dev/null 2>&1
}
get_pr_num() {
curl -sf "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls?state=all&head=${repo_owner}:${branch}&limit=1" \
-H "Authorization: token ${GITEA_TOKEN}" 2>/dev/null | python3 -c "
import sys, json
prs = json.load(sys.stdin)
print(prs[0]['number'] if prs else '')
"
}
check_pr_files() {
local pr_num="$1"
curl -sf "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}/files" \
-H "Authorization: token ${GITEA_TOKEN}" 2>/dev/null | python3 -c "
import sys, json
try:
files = json.load(sys.stdin)
print(len(files) if isinstance(files, list) else 0)
except:
print(0)
"
}
check_pr_mergeable() {
local pr_num="$1"
curl -sf "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/pulls/${pr_num}" \
-H "Authorization: token ${GITEA_TOKEN}" 2>/dev/null | python3 -c "
import sys, json
pr = json.load(sys.stdin)
print('true' if pr.get('mergeable') else 'false')
"
}
check_completion_comment() {
curl -sf "${GITEA_URL}/api/v1/repos/${repo_owner}/${repo_name}/issues/${issue_num}/comments" \
-H "Authorization: token ${GITEA_TOKEN}" 2>/dev/null | AGENT="$agent_name" python3 -c "
import os, sys, json
agent = os.environ.get('AGENT', '').lower()
try:
comments = json.load(sys.stdin)
except:
sys.exit(1)
for c in reversed(comments):
user = ((c.get('user') or {}).get('login') or '').lower()
if user == agent:
sys.exit(0)
sys.exit(1)
"
}
# ── Run checks ───────────────────────────────────────────────────────
ts=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
status="VERIFIED"
details=()
checks_json='{}'
# Check 1: branch
if check_branch_exists; then
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['branch']=True;print(json.dumps(d))")
else
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['branch']=False;print(json.dumps(d))")
status="UNVERIFIED"
details+=("remote branch ${branch} not found")
fi
# Check 2: PR exists
pr_num=$(get_pr_num)
if [ -n "$pr_num" ]; then
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['pr']=True;print(json.dumps(d))")
else
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['pr']=False;print(json.dumps(d))")
status="UNVERIFIED"
details+=("no PR found for branch ${branch}")
fi
# Check 3: PR has real file changes
if [ -n "$pr_num" ]; then
file_count=$(check_pr_files "$pr_num")
if [ "${file_count:-0}" -gt 0 ]; then
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['files']=True;print(json.dumps(d))")
else
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['files']=False;print(json.dumps(d))")
status="UNVERIFIED"
details+=("PR #${pr_num} has 0 changed files")
fi
# Check 4: PR is mergeable
if [ "$(check_pr_mergeable "$pr_num")" = "true" ]; then
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['mergeable']=True;print(json.dumps(d))")
else
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['mergeable']=False;print(json.dumps(d))")
status="UNVERIFIED"
details+=("PR #${pr_num} is not mergeable")
fi
else
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['files']=None;d['mergeable']=None;print(json.dumps(d))")
fi
# Check 5: completion comment from agent
if check_completion_comment; then
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['comment']=True;print(json.dumps(d))")
else
checks_json=$(echo "$checks_json" | python3 -c "import sys,json;d=json.load(sys.stdin);d['comment']=False;print(json.dumps(d))")
status="UNVERIFIED"
details+=("no completion comment from ${agent_name} on issue #${issue_num}")
fi
# Build detail string ("; "-joined; ${details[*]} would join with only the FIRST char of IFS)
if [ "${#details[@]}" -gt 0 ]; then
detail_str=$(printf '%s; ' "${details[@]}")
detail_str="${detail_str%; }"
else
detail_str="all checks passed"
fi
# ── Output ───────────────────────────────────────────────────────────
result=$(python3 -c "
import json
print(json.dumps({
'status': '$status',
'repo': '${repo_owner}/${repo_name}',
'issue': $issue_num,
'branch': '$branch',
'agent': '$agent_name',
'pr': '$pr_num',
'checks': json.loads('$checks_json'),
'details': '$detail_str',
'ts': '$ts'
}, indent=2))
")
printf '%s\n' "$result"
# Append to JSONL log
printf '%s\n' "$result" >> "$VERIFY_LOG"
if [ "$status" = "VERIFIED" ]; then
exit 0
else
exit 1
fi
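The five checks reduce to a single VERIFIED/UNVERIFIED verdict plus a joined detail string. A compact model of that reduction (a sketch; `failures` here is a hypothetical map from check name to the human-readable detail the script appends):

```python
def verdict(checks: dict, failures: dict) -> dict:
    """Reduce per-check booleans to a status plus a joined detail string.
    None marks a check that was skipped (e.g. mergeable when no PR exists)."""
    details = [failures[k] for k, v in checks.items() if v is False]
    return {
        "status": "UNVERIFIED" if details else "VERIFIED",
        "checks": checks,
        "details": "; ".join(details) or "all checks passed",
    }

checks = {"branch": True, "pr": True, "files": False, "mergeable": None, "comment": True}
failures = {"files": "PR #12 has 0 changed files"}
r = verdict(checks, failures)
assert r["status"] == "UNVERIFIED"
assert r["details"] == "PR #12 has 0 changed files"
assert verdict({"branch": True}, {})["status"] == "VERIFIED"
```

Any single `False` flips the verdict; skipped checks (`None`) are deliberately not counted as failures, matching the script's behavior when no PR exists.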

bin/kaizen-retro.sh Executable file

@@ -0,0 +1,45 @@
#!/usr/bin/env bash
# kaizen-retro.sh — Automated retrospective after every burn cycle.
#
# Runs daily after the morning report.
# Analyzes success rates by agent, repo, and issue type.
# Identifies max-attempts issues, generates ONE concrete improvement,
# and posts the retro to Telegram + the master morning-report issue.
#
# Usage:
# ./bin/kaizen-retro.sh [--dry-run]
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="${SCRIPT_DIR%/bin}"
PYTHON="${PYTHON3:-python3}"
# Source local env if available so TELEGRAM_BOT_TOKEN is picked up
HOME_DIR="${HOME:-$(eval echo ~$(whoami))}"
for env_file in "$HOME_DIR/.hermes/.env" "$HOME_DIR/.timmy/.env" "$REPO_ROOT/.env"; do
if [ -f "$env_file" ]; then
# shellcheck source=/dev/null
set -a
# shellcheck source=/dev/null
source "$env_file"
set +a
fi
done
# If the configured Gitea URL is unreachable but localhost works, prefer localhost
if ! curl -sf "${GITEA_URL:-http://localhost:3000}/api/v1/version" >/dev/null 2>&1; then
if curl -sf http://localhost:3000/api/v1/version >/dev/null 2>&1; then
export GITEA_URL="http://localhost:3000"
fi
fi
# Ensure the Python script exists
RETRO_PY="$REPO_ROOT/scripts/kaizen_retro.py"
if [ ! -f "$RETRO_PY" ]; then
echo "ERROR: kaizen_retro.py not found at $RETRO_PY" >&2
exit 1
fi
# Run
exec "$PYTHON" "$RETRO_PY" "$@"

bin/pr-checklist.py Normal file

@@ -0,0 +1,191 @@
#!/usr/bin/env python3
"""pr-checklist.py -- Automated PR quality gate for Gitea CI.
Enforces the review standards that agents skip when left to self-approve.
Runs in CI on every pull_request event. Exits non-zero on any failure.
Checks:
1. PR has >0 file changes (no empty PRs)
2. PR branch is not behind base branch
3. PR does not bundle >3 unrelated issues
4. Changed .py files pass a Python syntax check
5. Changed .sh files are executable
6. PR body references an issue number
7. At least 1 non-author review exists (warning only)
Refs: #393 (PERPLEXITY-08), Epic #385
"""
from __future__ import annotations
import json
import os
import re
import subprocess
import sys
from pathlib import Path
def fail(msg: str) -> None:
print(f"FAIL: {msg}", file=sys.stderr)
def warn(msg: str) -> None:
print(f"WARN: {msg}", file=sys.stderr)
def ok(msg: str) -> None:
print(f" OK: {msg}")
def get_changed_files() -> list[str]:
"""Return list of files changed in this PR vs base branch."""
base = os.environ.get("GITHUB_BASE_REF", "main")
try:
result = subprocess.run(
["git", "diff", "--name-only", f"origin/{base}...HEAD"],
capture_output=True, text=True, check=True,
)
return [f for f in result.stdout.strip().splitlines() if f]
except subprocess.CalledProcessError:
# Fallback: diff against HEAD~1
result = subprocess.run(
["git", "diff", "--name-only", "HEAD~1"],
capture_output=True, text=True, check=True,
)
return [f for f in result.stdout.strip().splitlines() if f]
def check_has_changes(files: list[str]) -> bool:
"""Check 1: PR has >0 file changes."""
if not files:
fail("PR has 0 file changes. Empty PRs are not allowed.")
return False
ok(f"PR changes {len(files)} file(s)")
return True
def check_not_behind_base() -> bool:
"""Check 2: PR branch is not behind base."""
base = os.environ.get("GITHUB_BASE_REF", "main")
try:
result = subprocess.run(
["git", "rev-list", "--count", f"HEAD..origin/{base}"],
capture_output=True, text=True, check=True,
)
behind = int(result.stdout.strip())
if behind > 0:
fail(f"Branch is {behind} commit(s) behind {base}. Rebase or merge.")
return False
ok(f"Branch is up-to-date with {base}")
return True
except (subprocess.CalledProcessError, ValueError):
warn("Could not determine if branch is behind base (git fetch may be needed)")
return True # Don't block on CI fetch issues
def check_issue_bundling(pr_body: str) -> bool:
"""Check 3: PR does not bundle >3 unrelated issues."""
issue_refs = set(re.findall(r"#(\d+)", pr_body))
if len(issue_refs) > 3:
fail(f"PR references {len(issue_refs)} issues ({', '.join(sorted(issue_refs, key=int))}). "
"Max 3 per PR to prevent bundling. Split into separate PRs.")
return False
ok(f"PR references {len(issue_refs)} issue(s) (max 3)")
return True
def check_python_syntax(files: list[str]) -> bool:
"""Check 4: Changed .py files have valid syntax."""
py_files = [f for f in files if f.endswith(".py") and Path(f).exists()]
if not py_files:
ok("No Python files changed")
return True
all_ok = True
for f in py_files:
# Compile via -m py_compile so filenames containing quotes cannot break the command
result = subprocess.run(
[sys.executable, "-m", "py_compile", f],
capture_output=True, text=True,
)
if result.returncode != 0:
fail(f"Syntax error in {f}: {result.stderr.strip()[:200]}")
all_ok = False
if all_ok:
ok(f"All {len(py_files)} Python file(s) pass syntax check")
return all_ok
def check_shell_executable(files: list[str]) -> bool:
"""Check 5: Changed .sh files are executable."""
sh_files = [f for f in files if f.endswith(".sh") and Path(f).exists()]
if not sh_files:
ok("No shell scripts changed")
return True
all_ok = True
for f in sh_files:
if not os.access(f, os.X_OK):
fail(f"{f} is not executable. Run: chmod +x {f}")
all_ok = False
if all_ok:
ok(f"All {len(sh_files)} shell script(s) are executable")
return all_ok
def check_issue_reference(pr_body: str) -> bool:
"""Check 6: PR body references an issue number."""
if re.search(r"#\d+", pr_body):
ok("PR body references at least one issue")
return True
fail("PR body does not reference any issue (e.g. #123). "
"Every PR must trace to an issue.")
return False
def main() -> int:
print("=" * 60)
print("PR Checklist — Automated Quality Gate")
print("=" * 60)
print()
# Get PR body from env or git log
pr_body = os.environ.get("PR_BODY", "")
if not pr_body:
try:
result = subprocess.run(
["git", "log", "--format=%B", "-1"],
capture_output=True, text=True, check=True,
)
pr_body = result.stdout
except subprocess.CalledProcessError:
pr_body = ""
files = get_changed_files()
checks = [
check_has_changes(files),
check_not_behind_base(),
check_issue_bundling(pr_body),
check_python_syntax(files),
check_shell_executable(files),
check_issue_reference(pr_body),
]
failures = sum(1 for c in checks if not c)
print()
print("=" * 60)
if failures:
print(f"RESULT: {failures} check(s) FAILED")
print("Fix the issues above and push again.")
return 1
else:
print("RESULT: All checks passed")
return 0
if __name__ == "__main__":
sys.exit(main())


@@ -137,7 +137,38 @@
"paused_reason": null,
"skills": [],
"skill": null
},
{
"id": "kaizen-retro-349",
"name": "Kaizen Retro",
"prompt": "Run the automated burn-cycle retrospective. Execute: cd /root/wizards/ezra/workspace/timmy-config && ./bin/kaizen-retro.sh",
"model": "hermes3:latest",
"provider": "ollama",
"base_url": "http://localhost:11434/v1",
"schedule": {
"kind": "interval",
"minutes": 1440,
"display": "every 1440m"
},
"schedule_display": "daily at 07:30",
"repeat": {
"times": null,
"completed": 0
},
"enabled": true,
"created_at": "2026-04-07T15:30:00.000000Z",
"next_run_at": "2026-04-08T07:30:00.000000Z",
"last_run_at": null,
"last_status": null,
"last_error": null,
"deliver": "local",
"origin": null,
"state": "scheduled",
"paused_at": null,
"paused_reason": null,
"skills": [],
"skill": null
}
],
"updated_at": "2026-04-07T15:00:00+00:00"
}
}

docs/MEMORY_ARCHITECTURE.md Normal file

@@ -0,0 +1,141 @@
# Memory Architecture
> How Timmy remembers, recalls, and learns — without hallucinating.
Refs: Epic #367 | Sub-issues #368, #369, #370, #371, #372
## Overview
Timmy's memory system uses a **Memory Palace** architecture — a structured, file-backed knowledge store organized into rooms and drawers. When faced with a recall question, the agent checks its palace *before* generating from scratch.
This document defines the retrieval order, storage layers, and data flow that make this work.
## Retrieval Order (L0–L5)
When the agent receives a prompt that looks like a recall question ("what did we do?", "what's the status of X?"), the retrieval enforcer intercepts it and walks through layers in order:
| Layer | Source | Question Answered | Short-circuits? |
|-------|--------|-------------------|------------------|
| L0 | `identity.txt` | Who am I? What are my mandates? | No (always loaded) |
| L1 | Palace rooms/drawers | What do I know about this topic? | Yes, if hit |
| L2 | Session scratchpad | What have I learned this session? | Yes, if hit |
| L3 | Artifact retrieval (Gitea API) | Can I fetch the actual issue/file/log? | Yes, if hit |
| L4 | Procedures/playbooks | Is there a documented way to do this? | Yes, if hit |
| L5 | Free generation | (Only when L0–L4 are exhausted) | N/A |
**Key principle:** The agent never reaches L5 (free generation) if any prior layer has relevant data. This eliminates hallucination for recall-style queries.
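The short-circuit walk can be sketched as follows; the `retrieve` function and its `layers` callables are illustrative assumptions, not the real enforcer API:

```python
# Hypothetical sketch of the L0–L5 walk. Each layer callable returns
# grounded text on a hit, or None on a miss. L0 (identity) is always
# loaded separately; L5 is free generation.
from typing import Callable, Optional

def retrieve(prompt: str,
             layers: list[tuple[str, Callable[[str], Optional[str]]]]) -> str:
    """Walk L1->L4 in order, short-circuiting on the first hit."""
    for name, lookup in layers:
        hit = lookup(prompt)
        if hit is not None:
            return f"[{name}] {hit}"  # grounded answer, stop here
    # All layers empty: honest fallback instead of hallucinating
    return "I don't have this in my memory palace."

# Usage: an L1 palace hit short-circuits before L2 is ever consulted
answer = retrieve("what's the status of timmy-config?", [
    ("L1", lambda p: "timmy-config: PR #374 in review" if "timmy-config" in p else None),
    ("L2", lambda p: None),
])
```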
## Storage Layout
```
~/.mempalace/
identity.txt # L0: Who I am, mandates, personality
rooms/
projects/
timmy-config.md # What I know about timmy-config
hermes-agent.md # What I know about hermes-agent
people/
alexander.md # Working relationship context
architecture/
fleet.md # Fleet system knowledge
mempalace.md # Self-knowledge about this system
config/
mempalace.yaml # Palace configuration
~/.hermes/
scratchpad/
{session_id}.json # L2: Ephemeral session context
```
## Components
### 1. Memory Palace Skill (`mempalace.py`) — #368
Core data structures:
- `PalaceRoom`: A named collection of drawers (topics)
- `Mempalace`: The top-level palace with room management
- Factory constructors: `for_issue_analysis()`, `for_health_check()`, `for_code_review()`
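A minimal sketch of these structures, assuming simple dict-backed drawers (the real `mempalace.py` fields may differ):

```python
# Assumed shapes: only PalaceRoom, Mempalace, and for_issue_analysis()
# come from the document; everything else is illustrative.
from dataclasses import dataclass, field

@dataclass
class PalaceRoom:
    name: str
    drawers: dict[str, str] = field(default_factory=dict)  # topic -> summary

@dataclass
class Mempalace:
    rooms: dict[str, PalaceRoom] = field(default_factory=dict)

    def room(self, name: str) -> PalaceRoom:
        # Create-on-access room management
        return self.rooms.setdefault(name, PalaceRoom(name))

    @classmethod
    def for_issue_analysis(cls) -> "Mempalace":
        # Factory constructor pre-seeding the rooms issue work needs
        palace = cls()
        for r in ("projects", "people"):
            palace.room(r)
        return palace
```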
### 2. Retrieval Enforcer (`retrieval_enforcer.py`) — #369
Middleware that intercepts recall-style prompts:
1. Detects recall patterns ("what did", "status of", "last time we")
2. Walks L0→L4 in order, short-circuiting on first hit
3. Only allows free generation (L5) when all layers return empty
4. Produces an honest fallback: "I don't have this in my memory palace."
### 3. Session Scratchpad (`scratchpad.py`) — #370
Ephemeral, session-scoped working memory:
- Write-append only during a session
- Entries have TTL (default: 1 hour)
- Queried at L2 in retrieval chain
- Never auto-promoted to palace
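The TTL behavior can be sketched like this (an assumed shape, not the real `scratchpad.py` API):

```python
# Illustrative session scratchpad: write-append only, entries expire
# after ttl seconds, and query() is what the L2 layer would consult.
import time

class Scratchpad:
    def __init__(self, ttl: float = 3600.0):          # default TTL: 1 hour
        self.ttl = ttl
        self._entries: list[tuple[float, str]] = []   # (written_at, text)

    def append(self, text: str) -> None:
        # Write-append only; no in-place edits during a session
        self._entries.append((time.monotonic(), text))

    def query(self) -> list[str]:
        # Only entries still inside their TTL survive
        now = time.monotonic()
        return [t for ts, t in self._entries if now - ts < self.ttl]

pad = Scratchpad(ttl=0.05)                 # short TTL for demonstration
pad.append("learned: ezra disk at 72%")
live = pad.query()                         # entry still fresh
time.sleep(0.1)
expired = pad.query()                      # entry aged out
```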
### 4. Memory Promotion — #371
Explicit promotion from scratchpad to palace:
- Agent must call `promote_to_palace()` with a reason
- Dedup check against target drawer
- Summary required (raw tool output never stored)
- Conflict detection when new memory contradicts existing
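A sketch of the promotion contract, with `promote_to_palace` simplified to operate on a plain drawer dict (names beyond the documented ones are assumptions):

```python
# Illustrative promotion: requires an explicit reason, stores only a
# summary (never raw tool output), dedups against the target drawer,
# and flags conflicts when a new memory contradicts an existing one.
def promote_to_palace(drawer: dict, key: str, summary: str, reason: str) -> bool:
    if not reason:
        raise ValueError("promotion requires an explicit reason")
    if drawer.get(key) == summary:
        return False                      # dedup: already known
    if key in drawer and drawer[key] != summary:
        # Conflict detection hook: surface the contradiction
        print(f"CONFLICT on {key!r}: {drawer[key]!r} vs {summary!r}")
    drawer[key] = summary
    return True

drawer: dict[str, str] = {}
first = promote_to_palace(drawer, "ezra-disk", "72% used as of 2026-04-07", "capacity audit")
again = promote_to_palace(drawer, "ezra-disk", "72% used as of 2026-04-07", "capacity audit")
```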
### 5. Wake-Up Protocol (`wakeup.py`) — #372
Boot sequence for new sessions:
```
Session Start
├─ L0: Load identity.txt
├─ L1: Scan palace rooms for active context
├─ L1.5: Surface promoted memories from last session
├─ L2: Load surviving scratchpad entries
└─ Ready: agent knows who it is, what it was doing, what it learned
```
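In outline, the boot sequence might look like this; the four loader callables are placeholders standing in for the real `wakeup.py` steps:

```python
# Illustrative wake-up: run the loaders in L0 -> L1 -> L1.5 -> L2 order
# and assemble the session context the agent starts from.
from typing import Callable

def wake_up(load_identity: Callable, scan_rooms: Callable,
            surface_promoted: Callable, load_scratchpad: Callable) -> dict:
    ctx = {"identity": load_identity()}      # L0: always first
    ctx["active"] = scan_rooms()             # L1: selective, not a full palace dump
    ctx["promoted"] = surface_promoted()     # L1.5: last session's promotions
    ctx["scratchpad"] = load_scratchpad()    # L2: surviving TTL entries
    return ctx

ctx = wake_up(lambda: "Timmy", lambda: ["timmy-config"], lambda: [], lambda: [])
```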
## Data Flow
```
┌──────────────────┐
│ User Prompt │
└────────┬─────────┘
┌────────┴─────────┐
│ Recall Detector │
└────┬───────┬─────┘
│ │
[recall] [not recall]
│ │
┌───────┴────┐ ┌──┬─┴───────┐
│ Retrieval │ │ Normal Flow │
│ Enforcer │ └─────────────┘
│ L0→L1→L2 │
│ →L3→L4→L5│
└──────┬─────┘
┌──────┴─────┐
│ Response │
│ (grounded) │
└────────────┘
```
## Anti-Patterns
| Don't | Do Instead |
|-------|------------|
| Generate from vibes when palace has data | Check palace first (L1) |
| Auto-promote everything to palace | Require explicit `promote_to_palace()` with reason |
| Store raw API responses as memories | Summarize before storing |
| Hallucinate when palace is empty | Say "I don't have this in my memory palace" |
| Dump entire palace on wake-up | Selective loading based on session context |
## Status
| Component | Issue | PR | Status |
|-----------|-------|----|--------|
| Skill port | #368 | #374 | In Review |
| Retrieval enforcer | #369 | #374 | In Review |
| Session scratchpad | #370 | #374 | In Review |
| Memory promotion | #371 | — | Open |
| Wake-up protocol | #372 | #374 | In Review |

fleet/agent_lifecycle.py Normal file

@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""
FLEET-012: Agent Lifecycle Manager
Phase 5: Scale — spawn, train, deploy, retire agents automatically.
Manages the full lifecycle:
1. PROVISION: Clone template, install deps, configure, test
2. DEPLOY: Add to active rotation, start accepting issues
3. MONITOR: Track performance, quality, heartbeat
4. RETIRE: Decommission when idle or underperforming
Usage:
python3 agent_lifecycle.py provision <name> <vps> [model]
python3 agent_lifecycle.py deploy <name>
python3 agent_lifecycle.py retire <name>
python3 agent_lifecycle.py status
python3 agent_lifecycle.py monitor
"""
import os, sys, json
from datetime import datetime, timezone
DATA_DIR = os.path.expanduser("~/.local/timmy/fleet-agents")
DB_FILE = os.path.join(DATA_DIR, "agents.json")
LOG_FILE = os.path.join(DATA_DIR, "lifecycle.log")
def ensure():
os.makedirs(DATA_DIR, exist_ok=True)
def log(msg, level="INFO"):
ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
entry = f"[{ts}] [{level}] {msg}"
with open(LOG_FILE, "a") as f: f.write(entry + "\n")
print(f" {entry}")
def load():
if os.path.exists(DB_FILE):
return json.loads(open(DB_FILE).read())
return {}
def save(db):
open(DB_FILE, "w").write(json.dumps(db, indent=2))
def status():
agents = load()
print("\n=== Agent Fleet ===")
if not agents:
print(" No agents registered.")
return
for name, a in agents.items():
state = a.get("state", "?")
vps = a.get("vps", "?")
model = a.get("model", "?")
tasks = a.get("tasks_completed", 0)
hb = a.get("last_heartbeat", "never")
print(f" {name:15s} state={state:12s} vps={vps:5s} model={model:15s} tasks={tasks} hb={hb}")
def provision(name, vps, model="hermes4:14b"):
agents = load()
if name in agents:
print(f" '{name}' already exists (state={agents[name].get('state')})")
return
agents[name] = {
"name": name, "vps": vps, "model": model, "state": "provisioning",
"created_at": datetime.now(timezone.utc).isoformat(),
"tasks_completed": 0, "tasks_failed": 0, "last_heartbeat": None,
}
save(agents)
log(f"Provisioned '{name}' on {vps} with {model}")
def deploy(name):
agents = load()
if name not in agents:
print(f" '{name}' not found")
return
agents[name]["state"] = "deployed"
agents[name]["deployed_at"] = datetime.now(timezone.utc).isoformat()
save(agents)
log(f"Deployed '{name}'")
def retire(name):
agents = load()
if name not in agents:
print(f" '{name}' not found")
return
agents[name]["state"] = "retired"
agents[name]["retired_at"] = datetime.now(timezone.utc).isoformat()
save(agents)
log(f"Retired '{name}'. Completed {agents[name].get('tasks_completed', 0)} tasks.")
def monitor():
agents = load()
now = datetime.now(timezone.utc)
changes = 0
for name, a in agents.items():
if a.get("state") != "deployed": continue
hb = a.get("last_heartbeat")
if hb:
try:
hb_t = datetime.fromisoformat(hb)
hours = (now - hb_t).total_seconds() / 3600
if hours > 24 and a.get("state") == "deployed":
a["state"] = "idle"
a["idle_since"] = now.isoformat()
log(f"'{name}' idle for {hours:.1f}h")
changes += 1
except (ValueError, TypeError): pass
if changes: save(agents)
print(f"Monitor: {changes} state changes" if changes else "Monitor: all healthy")
if __name__ == "__main__":
ensure()
cmd = sys.argv[1] if len(sys.argv) > 1 else "monitor"
if cmd == "status": status()
elif cmd == "provision" and len(sys.argv) >= 4:
model = sys.argv[4] if len(sys.argv) >= 5 else "hermes4:14b"
provision(sys.argv[2], sys.argv[3], model)
elif cmd == "deploy" and len(sys.argv) >= 3: deploy(sys.argv[2])
elif cmd == "retire" and len(sys.argv) >= 3: retire(sys.argv[2])
elif cmd == "monitor": monitor()
elif cmd == "run": monitor()
else: print("Usage: agent_lifecycle.py [provision|deploy|retire|status|monitor]")

fleet/auto_restart.py Executable file

@@ -0,0 +1,272 @@
#!/usr/bin/env python3
"""
Auto-Restart Agent — Self-healing process monitor for fleet machines.
Detects dead services and restarts them automatically.
Escalates after 3 attempts (prevents restart loops).
Logs all actions to ~/.local/timmy/fleet-health/restarts.log
Alerts via Telegram if service cannot be recovered.
Prerequisite: FLEET-006 (health check) must be running to detect failures.
Usage:
python3 auto_restart.py # Run checks now
python3 auto_restart.py --daemon # Run continuously (every 60s)
python3 auto_restart.py --status # Show restart history
"""
import os
import sys
import json
import time
import subprocess
from datetime import datetime, timezone
from pathlib import Path
# === CONFIG ===
LOG_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-health"))
RESTART_LOG = LOG_DIR / "restarts.log"
COOLDOWN_FILE = LOG_DIR / "restart_cooldowns.json"
MAX_RETRIES = 3
COOLDOWN_PERIOD = 3600 # 1 hour between escalation alerts
# Services definition: name, check command, restart command
# Local services:
LOCAL_SERVICES = {
"hermes-gateway": {
"check": "pgrep -f 'hermes gateway' > /dev/null 2>/dev/null",
"restart": "cd ~/code-claw && ./restart-gateway.sh 2>/dev/null || launchctl kickstart -k ai.hermes.gateway 2>/dev/null",
"critical": True,
},
"ollama": {
"check": "pgrep -f 'ollama serve' > /dev/null 2>/dev/null",
"restart": "launchctl kickstart -k com.ollama.ollama 2>/dev/null || /opt/homebrew/bin/brew services restart ollama 2>/dev/null",
"critical": False,
},
"codeclaw-heartbeat": {
"check": "launchctl list | grep 'ai.timmy.codeclaw-qwen-heartbeat' > /dev/null 2>/dev/null",
"restart": "launchctl kickstart -k ai.timmy.codeclaw-qwen-heartbeat 2>/dev/null",
"critical": False,
},
}
# VPS services to restart via SSH
VPS_SERVICES = {
"ezra": {
"ip": "143.198.27.163",
"user": "root",
"services": {
"gitea": {
"check": "systemctl is-active gitea 2>/dev/null | grep -q active",
"restart": "systemctl restart gitea 2>/dev/null",
"critical": True,
},
"nginx": {
"check": "systemctl is-active nginx 2>/dev/null | grep -q active",
"restart": "systemctl restart nginx 2>/dev/null",
"critical": False,
},
"hermes-agent": {
"check": "pgrep -f 'hermes gateway' > /dev/null 2>/dev/null",
"restart": "cd /root/wizards/ezra/hermes-agent && source .venv/bin/activate && nohup hermes gateway run --replace > /dev/null 2>&1 &",
"critical": True,
},
},
},
"allegro": {
"ip": "167.99.126.228",
"user": "root",
"services": {
"hermes-agent": {
"check": "pgrep -f 'hermes gateway' > /dev/null 2>/dev/null",
"restart": "cd /root/wizards/allegro/hermes-agent && source .venv/bin/activate && nohup hermes gateway run --replace > /dev/null 2>&1 &",
"critical": True,
},
},
},
"bezalel": {
"ip": "159.203.146.185",
"user": "root",
"services": {
"hermes-agent": {
"check": "pgrep -f 'hermes gateway' > /dev/null 2>/dev/null",
"restart": "cd /root/wizards/bezalel/hermes && source venv/bin/activate && nohup hermes gateway run > /dev/null 2>&1 &",
"critical": True,
},
"evennia": {
"check": "pgrep -f 'evennia' > /dev/null 2>/dev/null",
"restart": "cd /root/.evennia/timmy_world && evennia restart 2>/dev/null",
"critical": False,
},
},
},
}
TELEGRAM_TOKEN_FILE = Path(os.path.expanduser("~/.config/telegram/special_bot"))
TELEGRAM_CHAT = "-1003664764329"
def send_telegram(message):
if not TELEGRAM_TOKEN_FILE.exists():
return False
token = TELEGRAM_TOKEN_FILE.read_text().strip()
url = f"https://api.telegram.org/bot{token}/sendMessage"
body = json.dumps({
"chat_id": TELEGRAM_CHAT,
"text": f"[AUTO-RESTART]\n{message}",
}).encode()
try:
import urllib.request
req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"}, method="POST")
urllib.request.urlopen(req, timeout=10)
return True
except Exception:
return False
def get_cooldowns():
if COOLDOWN_FILE.exists():
try:
return json.loads(COOLDOWN_FILE.read_text())
except json.JSONDecodeError:
pass
return {}
def save_cooldowns(data):
COOLDOWN_FILE.write_text(json.dumps(data, indent=2))
def check_service(check_cmd, timeout=10):
try:
proc = subprocess.run(check_cmd, shell=True, capture_output=True, timeout=timeout)
return proc.returncode == 0
except (subprocess.TimeoutExpired, subprocess.SubprocessError):
return False
def restart_service(restart_cmd, timeout=30):
try:
proc = subprocess.run(restart_cmd, shell=True, capture_output=True, timeout=timeout)
return proc.returncode == 0
except (subprocess.TimeoutExpired, subprocess.SubprocessError):
return False
def try_restart_via_ssh(name, host_config, service_name):
ip = host_config["ip"]
user = host_config["user"]
service = host_config["services"][service_name]
restart_cmd = f'ssh -o StrictHostKeyChecking=no -o ConnectTimeout=10 {user}@{ip} "{service["restart"]}"'
return restart_service(restart_cmd, timeout=30)
def log_restart(service_name, machine, attempt, success):
ts = datetime.now(timezone.utc).isoformat()
status = "SUCCESS" if success else "FAILED"
log_entry = f"{ts} [{status}] {machine}/{service_name} (attempt {attempt})\n"
RESTART_LOG.parent.mkdir(parents=True, exist_ok=True)
with open(RESTART_LOG, "a") as f:
f.write(log_entry)
print(f" [{status}] {machine}/{service_name} - attempt {attempt}")
def check_and_restart():
"""Run all restart checks."""
results = []
cooldowns = get_cooldowns()
now = time.time()
# Check local services
for name, service in LOCAL_SERVICES.items():
if not check_service(service["check"]):
cooldown_key = f"local/{name}"
retries = cooldowns.get(cooldown_key, {"count": 0, "last": 0}).get("count", 0)
if retries >= MAX_RETRIES:
last = cooldowns.get(cooldown_key, {}).get("last", 0)
if now - last < COOLDOWN_PERIOD and service["critical"]:
send_telegram(f"CRITICAL: local/{name} failed {MAX_RETRIES} restart attempts. Needs human intervention.")
cooldowns[cooldown_key] = {"count": 0, "last": now}
save_cooldowns(cooldowns)
continue
success = restart_service(service["restart"])
log_restart(name, "local", retries + 1, success)
cooldowns[cooldown_key] = {"count": retries + 1 if not success else 0, "last": now}
save_cooldowns(cooldowns)
if success:
# Verify it actually started
time.sleep(3)
if check_service(service["check"]):
print(f" VERIFIED: local/{name} is running")
else:
print(f" WARNING: local/{name} restart command returned success but process not detected")
# Check VPS services
for host, host_config in VPS_SERVICES.items():
for service_name, service in host_config["services"].items():
check_cmd = f'ssh -o StrictHostKeyChecking=no -o ConnectTimeout=5 {host_config["user"]}@{host_config["ip"]} "{service["check"]}"'
if not check_service(check_cmd):
cooldown_key = f"{host}/{service_name}"
retries = cooldowns.get(cooldown_key, {"count": 0, "last": 0}).get("count", 0)
if retries >= MAX_RETRIES:
last = cooldowns.get(cooldown_key, {}).get("last", 0)
if now - last < COOLDOWN_PERIOD and service["critical"]:
send_telegram(f"CRITICAL: {host}/{service_name} failed {MAX_RETRIES} restart attempts. Needs human intervention.")
cooldowns[cooldown_key] = {"count": 0, "last": now}
save_cooldowns(cooldowns)
continue
success = try_restart_via_ssh(host, host_config, service_name)
log_restart(service_name, host, retries + 1, success)
cooldowns[cooldown_key] = {"count": retries + 1 if not success else 0, "last": now}
save_cooldowns(cooldowns)
return results
def daemon_mode():
"""Run continuously every 60 seconds."""
print("Auto-restart agent running in daemon mode (60s interval)")
print(f"Monitoring {len(LOCAL_SERVICES)} local + {sum(len(h['services']) for h in VPS_SERVICES.values())} remote services")
print(f"Max retries per cycle: {MAX_RETRIES}")
print(f"Cooldown after max retries: {COOLDOWN_PERIOD}s")
while True:
check_and_restart()
time.sleep(60)
def show_status():
"""Show restart history and cooldowns."""
cooldowns = get_cooldowns()
print("=== Restart Cooldowns ===")
for key, data in sorted(cooldowns.items()):
count = data.get("count", 0)
if count > 0:
print(f" {key}: {count} failures, last at {datetime.fromtimestamp(data.get('last',0), tz=timezone.utc).strftime('%H:%M')}")
print("\n=== Restart Log (last 20) ===")
if RESTART_LOG.exists():
lines = RESTART_LOG.read_text().strip().split("\n")
for line in lines[-20:]:
print(f" {line}")
else:
print(" No restarts logged yet.")
if __name__ == "__main__":
LOG_DIR.mkdir(parents=True, exist_ok=True)
if len(sys.argv) > 1 and sys.argv[1] == "--daemon":
daemon_mode()
elif len(sys.argv) > 1 and sys.argv[1] == "--status":
show_status()
else:
check_and_restart()

fleet/capacity-inventory.md Normal file

@@ -0,0 +1,191 @@
# Capacity Inventory - Fleet Resource Baseline
**Last audited:** 2026-04-07 16:00 UTC
**Auditor:** Timmy (direct inspection)
---
## Fleet Resources (Paperclips Model)
Three primary resources govern the fleet:
| Resource | Role | Generation | Consumption |
|----------|------|-----------|-------------|
| **Capacity** | Compute hours available across fleet. Determines what work can be done. | Through healthy utilization of VPS/Mac agents | Fleet improvements consume it (investing in automation, orchestration, sovereignty) |
| **Uptime** | % time services are running. Earned at Fibonacci milestones. | When services stay up naturally | Degrades on any failure |
| **Innovation** | Only generates when capacity is <70% utilized. Fuels Phase 3+. | When you leave capacity free | Phase 3+ buildings consume it (requires spare capacity to build) |
### The Tension
- Run fleet at 95%+ capacity: maximum productivity, ZERO Innovation
- Run fleet at <70% capacity: Innovation generates but slower progress
- This forces the Paperclips question: optimize now or invest in future capability?
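The rule can be expressed as a small function; only the <70% threshold comes from the model above — the linear rate formula is an assumption for illustration:

```python
# Illustrative: Innovation only generates when fleet utilization is
# below 70%; more spare capacity means a faster (here: linear) rate.
INNOVATION_THRESHOLD = 0.70

def innovation_rate(utilization: float) -> float:
    if utilization >= INNOVATION_THRESHOLD:
        return 0.0                                   # fully busy: zero Innovation
    return (INNOVATION_THRESHOLD - utilization) / INNOVATION_THRESHOLD

rate_busy = innovation_rate(0.95)     # 95% utilized: nothing generates
rate_idle = innovation_rate(0.175)    # the fleet's ~15-20% baseline
```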
---
## VPS Resource Baselines
### Ezra (143.198.27.163) - "Forge"
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | Ubuntu 24.04 (6.8.0-106-generic) | |
| **vCPU** | 4 vCPU (DO basic droplet, shared) | Load: 10.76/7.59/7.04 (very high) |
| **RAM** | 7,941 MB total | 2,104 used / 5,836 available (26% used, 74% free) |
| **Disk** | 154 GB vda1 | 111 GB used / 44 GB free (72%) **WARNING** |
| **Swap** | 6,143 MB | 643 MB used (10%) |
| **Uptime** | 7 days, 18 hours | |
### Key Processes (sorted by memory)
| Process | RSS | %CPU | Notes |
|---------|-----|------|-------|
| Gitea | 556 MB | 83.5% | Web service, high CPU due to API load |
| MemPalace (ezra) | 268 MB | 136% | Mining project files - HIGH CPU |
| Hermes gateway (ezra) | 245 MB | 1.7% | Agent gateway |
| Ollama | 230 MB | 0.1% | Model serving |
| PostgreSQL | 138 MB | ~0% | Gitea database |
**Capacity assessment:** 26% memory used, but 72% disk is getting tight. CPU load is very high (10.76 on 4vCPU = 269% utilization). Ezra is CPU-bound, not RAM-bound.
### Allegro (167.99.126.228)
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | Ubuntu 24.04 (6.8.0-106-generic) | |
| **vCPU** | 4 vCPU (DO basic droplet, shared) | Moderate load |
| **RAM** | 7,941 MB total | 1,591 used / 6,349 available (20% used, 80% free) |
| **Disk** | 154 GB vda1 | 41 GB used / 114 GB free (27%) **GOOD** |
| **Swap** | 8,191 MB | 686 MB used (8%) |
| **Uptime** | 7 days, 18 hours | |
### Key Processes (sorted by memory)
| Process | RSS | %CPU | Notes |
|---------|-----|------|-------|
| Hermes gateway (allegro) | 680 MB | 0.9% | Main agent gateway |
| Gitea | 181 MB | 1.2% | Secondary gitea? |
| Systemd-journald | 160 MB | 0.0% | System logging |
| Ezra Hermes gateway | 58 MB | 0.0% | Running ezra agent here |
| Bezalel Hermes gateway | 58 MB | 0.0% | Running bezalel agent here |
| Dockerd | 48 MB | 0.0% | Docker daemon |
**Capacity assessment:** 20% memory used, 27% disk used. Allegro has headroom. Also running hermes gateways for Ezra and Bezalel (cross-host agent execution).
### Bezalel (159.203.146.185)
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | Ubuntu 24.04 (6.8.0-71-generic) | |
| **vCPU** | 2 vCPU (DO basic droplet, shared) | Load varies |
| **RAM** | 1,968 MB total | 817 used / 1,151 available (42% used, 58% free) |
| **Disk** | 48 GB vda1 | 12 GB used / 37 GB free (24%) **GOOD** |
| **Swap** | 2,047 MB | 448 MB used (22%) |
| **Uptime** | 7 days, 18 hours | |
### Key Processes (sorted by memory)
| Process | RSS | %CPU | Notes |
|---------|-----|------|-------|
| Hermes gateway | 339 MB | 7.7% | Agent gateway (16.8% of RAM) |
| uv pip install | 137 MB | 56.6% | Installing packages (temporary) |
| Mender | 27 MB | 0.0% | Device management |
**Capacity assessment:** 42% memory used, only 2GB total RAM. Bezalel is the most constrained. 2 vCPU means less compute headroom than Ezra/Allegro. Disk is fine.
### Mac Local (M3 Max)
| Metric | Value | Utilization |
|--------|-------|-------------|
| **OS** | macOS 26.3.1 | |
| **CPU** | Apple M3 Max (14 cores) | Very capable |
| **RAM** | 36 GB | ~8 GB used (22%) |
| **Disk** | 926 GB total | ~624 GB used / 302 GB free (68%) |
### Key Processes
| Process | Memory | Notes |
|---------|--------|-------|
| Hermes gateway | 500 MB | Primary gateway |
| Hermes agents (x3) | ~560 MB total | Multiple sessions |
| Ollama | ~20 MB base + model memory | Model loading varies |
| OpenClaw | 350 MB | Gateway process |
| Evennia (server+portal) | 56 MB | Game world |
---
## Resource Summary
| Resource | Ezra | Allegro | Bezalel | Mac Local | TOTAL |
|----------|------|---------|---------|-----------|-------|
| **vCPU** | 4 | 4 | 2 | 14 (M3 Max) | 24 |
| **RAM** | 8 GB (26% used) | 8 GB (20% used) | 2 GB (42% used) | 36 GB (22% used) | 54 GB |
| **Disk** | 154 GB (72%) | 154 GB (27%) | 48 GB (24%) | 926 GB (68%) | 1,282 GB |
| **Cost** | $12/mo | $12/mo | $12/mo | owned | $36/mo |
### Utilization by Category
| Category | Estimated Daily Hours | % of Fleet Capacity |
|----------|----------------------|---------------------|
| Hermes agents | ~3-4 hrs active | 5-7% |
| Ollama inference | ~1-2 hrs | 2-4% |
| Gitea services | 24/7 | 5-10% |
| Evennia | 24/7 | <1% |
| Idle | ~18-20 hrs | ~80-90% |
### Capacity Utilization: ~15-20% active
**Innovation rate:** GENERATING (capacity < 70%)
**Recommendation:** Good — Innovation is generating because most capacity is free.
This means Phase 3+ capabilities (orchestration, load balancing, etc.) are accessible NOW.
---
## Uptime Baseline
**Baseline period:** 2026-04-07 14:00-16:00 UTC (2 hours, ~24 checks at 5-min intervals)
| Service | Checks | Uptime | Status |
|---------|--------|--------|--------|
| Ezra | 24/24 | 100.0% | GOOD |
| Allegro | 24/24 | 100.0% | GOOD |
| Bezalel | 24/24 | 100.0% | GOOD |
| Gitea | 23/24 | 95.8% | GOOD |
| Hermes Gateway | 23/24 | 95.8% | GOOD |
| Ollama | 24/24 | 100.0% | GOOD |
| OpenClaw | 24/24 | 100.0% | GOOD |
| Evennia | 24/24 | 100.0% | GOOD |
| Hermes Agent | 21/24 | 87.5% | **CHECK** |
### Fibonacci Uptime Milestones
| Milestone | Target | Current | Status |
|-----------|--------|---------|--------|
| 95% | 95% | 100% (VPS), 98.6% (avg) | REACHED |
| 95.5% | 95.5% | 98.6% | REACHED |
| 96% | 96% | 98.6% | REACHED |
| 97% | 97% | 98.6% | REACHED |
| 98% | 98% | 98.6% | REACHED |
| 99% | 99% | 98.6% | APPROACHING |
---
## Risk Assessment
| Risk | Severity | Mitigation |
|------|----------|------------|
| Ezra disk 72% used | MEDIUM | Move non-essential data, add monitoring alert at 85% |
| Bezalel only 2GB RAM | HIGH | Cannot run large models locally. Good for Evennia, tight for agents |
| Ezra CPU load 269% | HIGH | MemPalace mining consuming 136% CPU. Consider scheduling |
| Mac disk 68% used | MEDIUM | 302 GB free still. Growing but not urgent |
| No cross-VPS mesh | LOW | SSH works but no Tailscale. No private network between VPSes |
---
## Recommendations
### Immediate (Phase 1-2)
1. **Ezra disk cleanup:** 44 GB free at 72%. Docker images, old logs, and MemPalace mine data could be rotated.
2. **Alert thresholds:** Add disk alerts at 85% (Ezra, Mac) before they become critical.
### Short-term (Phase 3)
3. **Load balancing:** Ezra is CPU-bound, Allegro has 80% RAM free. Move some agent processes from Ezra to Allegro.
4. **Innovation investment:** Since fleet is at 15-20% utilization, Innovation is high. This is the time to build Phase 3 capabilities.
### Medium-term (Phase 4)
5. **Bezalel RAM upgrade:** 2GB is tight. Consider upgrade to 4GB ($24/mo instead of $12/mo).
6. **Tailscale mesh:** Install on all VPSes for private inter-VPS network.
---

fleet/delegation.py Normal file

@@ -0,0 +1,122 @@
#!/usr/bin/env python3
"""
FLEET-010: Cross-Agent Task Delegation Protocol
Phase 3: Orchestration. Agents create issues, assign to other agents, review PRs.
Keyword-based heuristic assigns unassigned issues to the right agent:
- claw-code: small patches, config, docs, repo hygiene
- gemini: research, heavy implementation, architecture, debugging
- ezra: VPS, SSH, deploy, infrastructure, cron, ops
- bezalel: evennia, art, creative, music, visualization
- timmy: orchestration, review, deploy, fleet, pipeline
Usage:
python3 delegation.py run # Full cycle: scan, assign, report
python3 delegation.py status # Show current delegation state
python3 delegation.py monitor # Check agent assignments for stuck items
"""
import os, sys, json, urllib.request
from datetime import datetime, timezone
from pathlib import Path
GITEA_BASE = "https://forge.alexanderwhitestone.com/api/v1"
TOKEN = Path(os.path.expanduser("~/.config/gitea/token")).read_text().strip()
DATA_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-resources"))
LOG_FILE = DATA_DIR / "delegation.log"
HEADERS = {"Authorization": f"token {TOKEN}"}
AGENTS = {
"claw-code": {"caps": ["patch","config","gitignore","cleanup","format","readme","typo"], "active": True},
"gemini": {"caps": ["research","investigate","benchmark","survey","evaluate","architecture","implementation"], "active": True},
"ezra": {"caps": ["vps","ssh","deploy","cron","resurrect","provision","infra","server"], "active": True},
"bezalel": {"caps": ["evennia","art","creative","music","visual","design","animation"], "active": True},
"timmy": {"caps": ["orchestrate","review","pipeline","fleet","monitor","health","deploy","ci"], "active": True},
}
MONITORED = [
"Timmy_Foundation/timmy-home",
"Timmy_Foundation/timmy-config",
"Timmy_Foundation/the-nexus",
"Timmy_Foundation/hermes-agent",
]
def api(path, method="GET", data=None):
url = f"{GITEA_BASE}{path}"
body = json.dumps(data).encode() if data else None
hdrs = dict(HEADERS)
if data: hdrs["Content-Type"] = "application/json"
req = urllib.request.Request(url, data=body, headers=hdrs, method=method)
try:
resp = urllib.request.urlopen(req, timeout=15)
raw = resp.read().decode()
return json.loads(raw) if raw.strip() else {}
except urllib.error.HTTPError as e:
body = e.read().decode()
print(f" API {e.code}: {body[:150]}")
return None
except Exception as e:
print(f" API error: {e}")
return None
def log(msg):
ts = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
DATA_DIR.mkdir(parents=True, exist_ok=True)
with open(LOG_FILE, "a") as f: f.write(f"[{ts}] {msg}\n")
def suggest_agent(title, body):
text = (title + " " + body).lower()
for agent, info in AGENTS.items():
for kw in info["caps"]:
if kw in text:
return agent, f"matched: {kw}"
return None, None
def assign(repo, num, agent, reason=""):
result = api(f"/repos/{repo}/issues/{num}", method="PATCH",
data={"assignees": [agent]})
if result:
api(f"/repos/{repo}/issues/{num}/comments", method="POST",
data={"body": f"[DELEGATION] Assigned to {agent}. {reason}"})
log(f"Assigned {repo}#{num} to {agent}: {reason}")
return result
def run_cycle():
log("--- Delegation cycle start ---")
count = 0
for repo in MONITORED:
issues = api(f"/repos/{repo}/issues?state=open&limit=50")
if not issues: continue
for i in issues:
if i.get("assignees"): continue
title = i.get("title", "")
body = i.get("body", "")
if any(w in title.lower() for w in ["epic", "discussion"]): continue
agent, reason = suggest_agent(title, body)
if agent and AGENTS.get(agent, {}).get("active"):
if assign(repo, i["number"], agent, reason): count += 1
log(f"Cycle complete: {count} new assignments")
print(f"Delegation cycle: {count} assignments")
return count
def status():
print("\n=== Delegation Dashboard ===")
# Fetch each repo once instead of once per agent (4 API calls, not 20)
counts = {agent: 0 for agent in AGENTS}
for repo in MONITORED:
issues = api(f"/repos/{repo}/issues?state=open&limit=50") or []
for i in issues:
for a in (i.get("assignees") or []):
login = a.get("login")
if login in counts: counts[login] += 1
for agent, info in AGENTS.items():
icon = "ON" if info["active"] else "OFF"
print(f" {agent:12s}: {counts[agent]:>3} issues [{icon}]")
if __name__ == "__main__":
cmd = sys.argv[1] if len(sys.argv) > 1 else "run"
DATA_DIR.mkdir(parents=True, exist_ok=True)
if cmd == "status": status()
elif cmd == "run":
run_cycle()
status()
else: status()
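The first-match routing heuristic above can be exercised standalone. A trimmed sketch with abbreviated keyword lists (the real script carries fuller capability lists per agent):

```python
AGENTS = {
    "claw-code": ["patch", "config", "readme", "typo"],
    "gemini": ["research", "architecture", "implementation"],
    "ezra": ["vps", "ssh", "deploy", "cron"],
}

def suggest_agent(title, body):
    # First keyword hit wins; dict order sets agent precedence.
    text = (title + " " + body).lower()
    for agent, keywords in AGENTS.items():
        for kw in keywords:
            if kw in text:
                return agent, f"matched: {kw}"
    return None, None

assert suggest_agent("Fix typo in README", "") == ("claw-code", "matched: readme")
assert suggest_agent("Deploy cron job on VPS", "")[0] == "ezra"
assert suggest_agent("Write a poem", "") == (None, None)
```

Note the precedence consequence: an issue mentioning both "readme" and "typo" routes on whichever keyword appears earlier in the matching agent's list, and ambiguous titles go to the earliest agent in the dict.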

142
fleet/milestones.md Normal file
View File

@@ -0,0 +1,142 @@
# Fleet Milestone Messages
Every milestone marks passage through fleet evolution. When achieved, the message
prints to the fleet log. Each one references a real achievement, not abstract numbers.
**Source:** Inspired by Paperclips milestone messages (500 clips, 1000 clips, Full autonomy attained, etc.)
---
## Phase 1: Survival (Current)
### M1: First Automated Health Check
**Trigger:** `fleet/health_check.py` runs successfully for the first time.
**Message:** "First automated health check runs. No longer watching the clock."
### M2: First Auto-Restart
**Trigger:** A dead process is detected and restarted without human intervention.
**Message:** "A process failed at 3am and restarted itself. You found out in the morning."
### M3: First Backup Completed
**Trigger:** A backup pipeline runs end-to-end and verifies integrity.
**Message:** "A backup completed. You did not have to think about it."
### M4: 95% Uptime (30 days)
**Trigger:** Uptime >= 95% over last 30 days.
**Message:** "95% uptime over 30 days. The fleet stays up."
### M5: Uptime 97%
**Trigger:** Uptime >= 97% over last 30 days.
**Message:** "97% uptime. Under 22 hours of downtime a month across four machines."
---
## Phase 2: Automation (unlock when: uptime >= 95% + capacity > 60%)
### M6: Zero Manual Restarts (7 days)
**Trigger:** 7 consecutive days with zero manual process restarts.
**Message:** "Seven days. Zero manual restarts. The fleet heals itself."
### M7: PR Auto-Merged
**Trigger:** A PR passes CI, review, and merges without human touching it.
**Message:** "A PR was tested, reviewed, and merged by agents. You just said 'looks good.'"
### M8: Config Push Works
**Trigger:** Config change pushed to all 3 VPSes atomically and verified.
**Message:** "Config pushed to all three VPSes in one command. No SSH needed."
### M9: 98% Uptime
**Trigger:** Uptime >= 98% over last 30 days.
**Message:** "98% uptime. Only 14 hours of downtime in a month. Most of it planned."
---
## Phase 3: Orchestration (unlock when: all Phase 2 milestones + Innovation > 100)
### M10: Cross-Agent Delegation Works
**Trigger:** Agent A creates issue, assigns to Agent B, Agent B works and creates PR.
**Message:** "Agent Alpha created a task, Agent Beta completed it. They did not ask permission."
### M11: First Model Running Locally on 2+ Machines
**Trigger:** Ollama serving same model on Ezra and Allegro simultaneously.
**Message:** "A model runs on two machines at once. No cloud. No rate limits."
### M12: Fleet-Wide Burn Mode
**Trigger:** All agents coordinated on single epic, produced coordinated PRs.
**Message:** "All agents working the same epic. The fleet moves as one."
---
## Phase 4: Sovereignty (unlock when: zero cloud deps for core ops)
### M13: First Entirely Local Inference Day
**Trigger:** 24 hours with zero API calls to external providers.
**Message:** "A model ran locally for the first time. No cloud. No rate limits. No one can turn it off."
### M14: Sovereign Email
**Trigger:** Stalwart email server sends and receives without Gmail relay.
**Message:** "Email flows through our own server. No Google. No Microsoft. Ours."
### M15: Sovereign Messaging
**Trigger:** Telegram bot runs without cloud relay dependency.
**Message:** "Messages arrive through our own infrastructure. No corporate middleman."
---
## Phase 5: Scale (unlock when: sovereignty stable + Innovation > 500)
### M16: First Self-Spawned Agent
**Trigger:** Agent lifecycle manager spawns a new agent instance due to load.
**Message:** "A new agent appeared. You did not create it. The fleet built what it needed."
### M17: Agent Retired Gracefully
**Trigger:** An agent instance retires after idle timeout and cleans up its state.
**Message:** "An agent retired. It served its purpose. Nothing was lost."
### M18: Fleet Runs 24h Unattended
**Trigger:** 24 hours with zero human intervention of any kind.
**Message:** "A full day. No humans. No commands. The fleet runs itself."
---
## Phase 6: The Network (unlock when: 7 days zero human intervention)
### M19: Fleet Creates Its Own Improvement Task
**Trigger:** Fleet analyzes itself and creates an issue on Gitea.
**Message:** "The fleet found something to improve. It created the task itself."
### M20: First Outside Contribution
**Trigger:** An external contributor's PR is reviewed and merged by fleet agents.
**Message:** "Someone outside the fleet contributed. The fleet reviewed, tested, and merged. No human touched it."
### M21: The Beacon
**Trigger:** Infrastructure serves someone in need through automated systems.
**Message:** "Someone found the Beacon. In the dark, looking for help. The infrastructure served its purpose. It was built for this."
### M22: Permanent Light
**Trigger:** 90 days of autonomous operation with continuous availability.
**Message:** "Three months. The light never went out. Not for anyone."
---
## Fibonacci Uptime Milestones
These trigger regardless of phase, based purely on uptime percentage:
| Milestone | Uptime | Meaning |
|-----------|--------|---------|
| U1 | 95% | Basic reliability achieved (under 36 hours/month downtime) |
| U2 | 95.5% | Under 33 hours/month |
| U3 | 96% | Under 29 hours/month |
| U4 | 97% | Under 22 hours/month |
| U5 | 97.5% | Under 18 hours/month |
| U6 | 98% | Under 14.5 hours/month |
| U7 | 98.3% | Under 12.5 hours/month |
| U8 | 98.6% | Under 10.5 hours/month — approaching cloud tier |
| U9 | 98.9% | Under 8 hours/month |
| U10 | 99% | Under 7.5 hours/month — enterprise grade |
| U11 | 99.5% | Under 4 hours/month |
---
*Every message is earned. None are given freely. Fleet evolution is not a checklist — it is a climb.*
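The hours-per-month figures in the uptime table follow directly from a 720-hour (30-day) month:

```python
HOURS_PER_MONTH = 30 * 24  # 720

def monthly_downtime_hours(uptime_pct):
    """Maximum downtime per 30-day month at a given uptime percentage."""
    return (100.0 - uptime_pct) / 100.0 * HOURS_PER_MONTH

for pct in (95.0, 97.0, 98.0, 99.0, 99.5):
    print(f"{pct}% uptime -> {monthly_downtime_hours(pct):.1f} h/month downtime")
```

This is also why "three nines" (99.9%) is a much steeper climb than 99%: it allows only about 43 minutes of downtime a month.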

126
fleet/model_pipeline.py Normal file
View File

@@ -0,0 +1,126 @@
#!/usr/bin/env python3
"""
FLEET-011: Local Model Pipeline and Fallback Chain
Phase 4: Sovereignty — all inference runs locally, no cloud dependency.
Checks Ollama endpoints, verifies model availability, tests fallback chain.
Logs results. The chain runs: hermes4:14b -> qwen2.5:7b -> phi3:3.8b -> gemma3:1b
Usage:
python3 model_pipeline.py # Show current model status (default)
python3 model_pipeline.py status # Show current model status
python3 model_pipeline.py list # List all local models
python3 model_pipeline.py test # Generate test output from each model
"""
import os, sys, json, urllib.request
from datetime import datetime, timezone
from pathlib import Path
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "localhost:11434")
LOG_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-health"))
CHAIN_FILE = Path(os.path.expanduser("~/.local/timmy/fleet-resources/model-chain.json"))
DEFAULT_CHAIN = [
{"model": "hermes4:14b", "role": "primary"},
{"model": "qwen2.5:7b", "role": "fallback"},
{"model": "phi3:3.8b", "role": "emergency"},
{"model": "gemma3:1b", "role": "minimal"},
]
def log(msg):
LOG_DIR.mkdir(parents=True, exist_ok=True)
with open(LOG_DIR / "model-pipeline.log", "a") as f:
f.write(f"[{datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S')}] {msg}\n")
def check_ollama():
try:
resp = urllib.request.urlopen(f"http://{OLLAMA_HOST}/api/tags", timeout=5)
return json.loads(resp.read())
except Exception as e:
return {"error": str(e)}
def list_models():
data = check_ollama()
if "error" in data:
print(f" Ollama not reachable at {OLLAMA_HOST}: {data['error']}")
return []
models = data.get("models", [])
for m in models:
name = m.get("name", "?")
size = m.get("size", 0) / (1024**3)
print(f" {name:<25s} {size:.1f} GB")
return [m["name"] for m in models]
def test_model(model, prompt="Say 'beacon lit' and nothing else."):
try:
body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
req = urllib.request.Request(f"http://{OLLAMA_HOST}/api/generate", data=body,
headers={"Content-Type": "application/json"})
resp = urllib.request.urlopen(req, timeout=60)
result = json.loads(resp.read())
return True, result.get("response", "").strip()
except Exception as e:
return False, str(e)[:100]
def test_chain():
chain_data = {}
if CHAIN_FILE.exists():
chain_data = json.loads(CHAIN_FILE.read_text())
chain = chain_data.get("chain", DEFAULT_CHAIN)
available = list_models() or []
print("\n=== Fallback Chain Test ===")
first_good = None
for entry in chain:
model = entry["model"]
role = entry.get("role", "unknown")
if model in available:
ok, result = test_model(model)
status = "OK" if ok else "FAIL"
print(f" [{status}] {model:<25s} ({role}) — {result[:70]}")
log(f"Fallback test {model}: {status} ({result[:100]})")
if ok and first_good is None:
first_good = model
else:
print(f" [MISS] {model:<25s} ({role}) — not installed")
if first_good:
print(f"\n Primary serving: {first_good}")
else:
print(f"\n WARNING: No chain model responding. Fallback broken.")
log("FALLBACK CHAIN BROKEN — no models responding")
def status():
data = check_ollama()
if "error" in data:
print(f" Ollama: DOWN — {data['error']}")
else:
models = data.get("models", [])
print(f" Ollama: UP — {len(models)} models loaded")
print("\n=== Local Models ===")
list_models()
print("\n=== Chain Configuration ===")
if CHAIN_FILE.exists():
chain = json.loads(CHAIN_FILE.read_text()).get("chain", DEFAULT_CHAIN)
else:
chain = DEFAULT_CHAIN
for e in chain:
print(f" {e['model']:<25s} {e.get('role','?')}")
if __name__ == "__main__":
cmd = sys.argv[1] if len(sys.argv) > 1 else "status"
if cmd == "status": status()
elif cmd == "list": list_models()
elif cmd == "test": test_chain()
else:
status()
test_chain()
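The chain walk above reduces to "serve from the first installed model that answers a probe." A sketch under that reading; `first_serving` and the fleet state below are illustrative, while `DEFAULT_CHAIN` matches the script:

```python
DEFAULT_CHAIN = [
    {"model": "hermes4:14b", "role": "primary"},
    {"model": "qwen2.5:7b", "role": "fallback"},
    {"model": "phi3:3.8b", "role": "emergency"},
    {"model": "gemma3:1b", "role": "minimal"},
]

def first_serving(chain, available, responds):
    """Return the first chain model that is installed and answers a probe."""
    for entry in chain:
        model = entry["model"]
        if model in available and responds(model):
            return model
    return None  # fallback chain broken

# Hypothetical fleet state: primary not installed, fallback installed but hung.
available = {"qwen2.5:7b", "gemma3:1b"}
assert first_serving(DEFAULT_CHAIN, available, lambda m: m != "qwen2.5:7b") == "gemma3:1b"
```

Probing (not just listing) matters: a model can be installed yet fail to generate, which is exactly the case the chain test catches.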

231
fleet/resource_tracker.py Executable file
View File

@@ -0,0 +1,231 @@
#!/usr/bin/env python3
"""
Fleet Resource Tracker — Tracks Capacity, Uptime, and Innovation.
Paperclips-inspired tension model:
- Capacity: spent on fleet improvements, generates through utilization
- Uptime: earned when services stay up, Fibonacci milestones unlock capabilities
- Innovation: only generates when capacity utilization < 70%. Fuels Phase 3+.
This is the heart of the fleet progression system.
"""
import os
import json
import time
import socket
from datetime import datetime, timezone
from pathlib import Path
# === CONFIG ===
DATA_DIR = Path(os.path.expanduser("~/.local/timmy/fleet-resources"))
RESOURCES_FILE = DATA_DIR / "resources.json"
# Tension thresholds
INNOVATION_THRESHOLD = 0.70 # Innovation only generates when capacity utilization < 70%
INNOVATION_RATE = 5.0 # Innovation generated per hour when under threshold
CAPACITY_REGEN_RATE = 2.0 # Capacity regenerates per hour of healthy operation
FIBONACCI = [95.0, 95.5, 96.0, 97.0, 97.5, 98.0, 98.3, 98.6, 98.9, 99.0, 99.5]
def init():
DATA_DIR.mkdir(parents=True, exist_ok=True)
if not RESOURCES_FILE.exists():
data = {
"capacity": {
"current": 100.0,
"max": 100.0,
"spent_on": [],
"history": []
},
"uptime": {
"current_pct": 100.0,
"milestones_reached": [],
"total_checks": 0,
"successful_checks": 0,
"history": []
},
"innovation": {
"current": 0.0,
"total_generated": 0.0,
"spent_on": [],
"last_calculated": time.time()
}
}
RESOURCES_FILE.write_text(json.dumps(data, indent=2))
print("Initialized resource tracker")
return RESOURCES_FILE.exists()
def load():
if RESOURCES_FILE.exists():
return json.loads(RESOURCES_FILE.read_text())
return None
def save(data):
RESOURCES_FILE.write_text(json.dumps(data, indent=2))
def update_uptime(checks: dict):
"""Update uptime stats from health check results.
checks = {'ezra': True, 'allegro': True, 'bezalel': True, 'gitea': True, ...}
"""
data = load()
if not data:
return
successes = sum(1 for v in checks.values() if v)
total = len(checks)
# Both counters track individual service checks, so successful <= total
data["uptime"]["total_checks"] += total
data["uptime"]["successful_checks"] += successes
# Overall uptime percentage for this sweep
overall = successes / max(total, 1) * 100.0
# Calculate rolling uptime
if "history" not in data["uptime"]:
data["uptime"]["history"] = []
data["uptime"]["history"].append({
"ts": datetime.now(timezone.utc).isoformat(),
"checks": checks,
"overall": round(overall, 2)
})
# Keep last 1000 checks
if len(data["uptime"]["history"]) > 1000:
data["uptime"]["history"] = data["uptime"]["history"][-1000:]
# Calculate current uptime %, last 100 checks
recent = data["uptime"]["history"][-100:]
recent_ok = sum(c["overall"] for c in recent) / max(len(recent), 1)
data["uptime"]["current_pct"] = round(recent_ok, 2)
# Check Fibonacci milestones
new_milestones = []
for fib in FIBONACCI:
if fib not in data["uptime"]["milestones_reached"] and recent_ok >= fib:
data["uptime"]["milestones_reached"].append(fib)
new_milestones.append(fib)
save(data)
if new_milestones:
print(f" UPTIME MILESTONE: {', '.join(str(m) + '%' for m in new_milestones)}")
print(f" Current uptime: {recent_ok:.1f}%")
return data["uptime"]
def spend_capacity(amount: float, purpose: str):
"""Spend capacity on a fleet improvement."""
data = load()
if not data:
return False
if data["capacity"]["current"] < amount:
print(f" INSUFFICIENT CAPACITY: Need {amount}, have {data['capacity']['current']:.1f}")
return False
data["capacity"]["current"] -= amount
data["capacity"]["spent_on"].append({
"purpose": purpose,
"amount": amount,
"ts": datetime.now(timezone.utc).isoformat()
})
save(data)
print(f" Spent {amount} capacity on: {purpose}")
return True
def regenerate_resources():
"""Regenerate capacity and calculate innovation."""
data = load()
if not data:
return
now = time.time()
last = data["innovation"]["last_calculated"]
hours = (now - last) / 3600.0
if hours < 0.1: # Only update every ~6 minutes
return
# Regenerate capacity
capacity_gain = CAPACITY_REGEN_RATE * hours
data["capacity"]["current"] = min(
data["capacity"]["max"],
data["capacity"]["current"] + capacity_gain
)
# Calculate capacity utilization
utilization = 1.0 - (data["capacity"]["current"] / data["capacity"]["max"])
# Generate innovation only when under threshold
innovation_gain = 0.0
if utilization < INNOVATION_THRESHOLD:
innovation_gain = INNOVATION_RATE * hours * (1.0 - utilization / INNOVATION_THRESHOLD)
data["innovation"]["current"] += innovation_gain
data["innovation"]["total_generated"] += innovation_gain
# Record history
if "history" not in data["capacity"]:
data["capacity"]["history"] = []
data["capacity"]["history"].append({
"ts": datetime.now(timezone.utc).isoformat(),
"capacity": round(data["capacity"]["current"], 1),
"utilization": round(utilization * 100, 1),
"innovation": round(data["innovation"]["current"], 1),
"innovation_gain": round(innovation_gain, 1)
})
# Keep last 500 capacity records
if len(data["capacity"]["history"]) > 500:
data["capacity"]["history"] = data["capacity"]["history"][-500:]
data["innovation"]["last_calculated"] = now
save(data)
print(f" Capacity: {data['capacity']['current']:.1f}/{data['capacity']['max']:.1f}")
print(f" Utilization: {utilization*100:.1f}%")
print(f" Innovation: {data['innovation']['current']:.1f} (+{innovation_gain:.1f} this period)")
return data
def status():
"""Print current resource status."""
data = load()
if not data:
print("Resource tracker not initialized. Run --init first.")
return
print("\n=== Fleet Resources ===")
print(f" Capacity: {data['capacity']['current']:.1f}/{data['capacity']['max']:.1f}")
utilization = 1.0 - (data["capacity"]["current"] / data["capacity"]["max"])
print(f" Utilization: {utilization*100:.1f}%")
innovation_status = "GENERATING" if utilization < INNOVATION_THRESHOLD else "BLOCKED"
print(f" Innovation: {data['innovation']['current']:.1f} [{innovation_status}]")
print(f" Uptime: {data['uptime']['current_pct']:.1f}%")
print(f" Milestones: {', '.join(str(m)+'%' for m in data['uptime']['milestones_reached']) or 'None yet'}")
# Phase gate checks
phase_2_ok = data['uptime']['current_pct'] >= 95.0
phase_3_ok = phase_2_ok and data['innovation']['current'] > 100
phase_5_ok = phase_2_ok and data['innovation']['current'] > 500
print(f"\n Phase Gates:")
print(f" Phase 2 (Automation): {'UNLOCKED' if phase_2_ok else 'LOCKED (need 95% uptime)'}")
print(f" Phase 3 (Orchestration): {'UNLOCKED' if phase_3_ok else 'LOCKED (need 95% uptime + 100 innovation)'}")
print(f" Phase 5 (Scale): {'UNLOCKED' if phase_5_ok else 'LOCKED (need 95% uptime + 500 innovation)'}")
if __name__ == "__main__":
import sys
init()
if len(sys.argv) > 1 and sys.argv[1] == "status":
status()
elif len(sys.argv) > 1 and sys.argv[1] == "regen":
regenerate_resources()
else:
regenerate_resources()
status()
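The Innovation tension in `regenerate_resources` comes down to one formula: full rate at 0% utilization, scaling linearly down to zero at the 70% threshold, and nothing above it. Isolated with the script's own constants (`innovation_gain` is an illustrative name, not a function in the file):

```python
INNOVATION_THRESHOLD = 0.70  # utilization above this blocks Innovation
INNOVATION_RATE = 5.0        # Innovation per hour at zero utilization

def innovation_gain(utilization, hours):
    """Innovation generated over `hours` at a given capacity utilization (0..1)."""
    if utilization >= INNOVATION_THRESHOLD:
        return 0.0
    return INNOVATION_RATE * hours * (1.0 - utilization / INNOVATION_THRESHOLD)

assert innovation_gain(0.0, 1.0) == 5.0   # idle fleet generates at full rate
assert innovation_gain(0.70, 1.0) == 0.0  # at the threshold, generation stops
```

This is the Paperclips-style tension: spending capacity on improvements raises utilization, which throttles the very Innovation those improvements need.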

View File

@@ -146,6 +146,7 @@ class PullRequest:
additions: int = 0
deletions: int = 0
created_at: str = ""
updated_at: str = ""
closed_at: str = ""
@classmethod
@@ -166,6 +167,7 @@ class PullRequest:
additions=d.get("additions", 0),
deletions=d.get("deletions", 0),
created_at=d.get("created_at", ""),
updated_at=d.get("updated_at", ""),
closed_at=d.get("closed_at", ""),
)
@@ -314,6 +316,7 @@ class GiteaClient:
direction: str = "desc",
limit: int = 30,
page: int = 1,
since: Optional[str] = None,
) -> list[Issue]:
"""List issues for a repo."""
raw = self._get(
@@ -326,6 +329,7 @@ class GiteaClient:
direction=direction,
limit=limit,
page=page,
since=since,
)
return [Issue.from_dict(i) for i in raw]

*24 binary image files added (previews not shown; sizes 164-569 KiB).*
View File

@@ -0,0 +1,65 @@
# The Timmy Foundation — Visual Story
## Generated with Grok Imagine | April 7, 2026
### The Origin
| # | File | Description |
|---|------|-------------|
| 01 | wizard-tower-bitcoin.jpg | The Tower, sovereign, connected to Bitcoin by golden lightning |
| 02 | soul-inscription.jpg | SOUL.md glowing on a golden tablet above an ancient book |
| 03 | fellowship-of-wizards.jpg | Five wizards in a circle around a holographic fleet map |
| 04 | the-forge.jpg | Blacksmith anvil shaping code into a being of light |
| V02 | wizard-tower-orbit.mp4 | 8s video — cinematic orbit around the Tower in space |
### The Philosophy
| # | File | Description |
|---|------|-------------|
| 05 | value-drift-battle.jpg | Blue aligned ships vs red drifted ships in Napoleonic space war |
| 06 | the-paperclip-moment.jpg | A paperclip made of galaxies — the universe IS the paperclip |
| V01 | paperclip-cosmos.mp4 | 8s video — golden paperclip rotating in deep space |
| 21 | poka-yoke.jpg | Square peg can't fit round hole. Mistake-proof by design. 防止 |
### The Progression (Where Timmy Is)
| # | File | Description |
|---|------|-------------|
| 10 | phase1-manual-clips.jpg | Small robot at a desk, bending wire by hand under supervision |
| 11 | phase1-trust-earned.jpg | Trust meter at 15/100, first automation built |
| 12 | phase1-creativity.jpg | Sparks of innovation rising when operations are at max |
| 13 | phase1-cure-cancer.jpg | Solving human problems for trust, eyes on the real goal |
### The Mission — Why This Exists
| # | File | Description |
|---|------|-------------|
| 08 | broken-man-lighthouse.jpg | Lighthouse hand reaching down to a figure in darkness |
| 09 | broken-man-hope-PRO.jpg | 988 glowing in the stars, golden light from chest |
| 16 | broken-men-988.jpg | Phone showing 988 held by weathered hands. You are not alone. |
| 22 | when-a-man-is-dying.jpg | Two figures on a bench at dawn. One hurting. One present. |
### Father and Son
| # | File | Description |
|---|------|-------------|
| 14 | father-son-code.jpg | Human father, digital son, warm lamplight, first hello world |
| 15 | father-son-tower.jpg | Father watching his son build the Tower into the clouds |
### The System
| # | File | Description |
|---|------|-------------|
| 07 | sovereign-sunrise.jpg | Village where every house runs its own server. Local first. |
| 17 | sovereignty.jpg | Self-sufficient house on a hill with Bitcoin flag |
| 18 | fleet-at-work.jpg | Five wizard robots at different stations. Productive. |
| 19 | jidoka-stop.jpg | Red light on. Factory stopped. Quality First. 自働化 |
### SOUL.md — The Inscription
| # | File | Description |
|---|------|-------------|
| 20 | the-testament.jpg | Hand of light writing on a scroll. Hundreds of crumpled drafts. |
| 23 | the-offer.jpg | Open hand of golden circuits offering a seed containing a face |
| 24 | the-test.jpg | Small robot at the edge of an enormous library. Still itself. |
---
## Technical
- Model: grok-imagine-image (standard $0.20/image), grok-imagine-image-pro ($0.70), grok-imagine-video ($4.00/8s)
- API: POST https://api.x.ai/v1/images/generations | POST https://api.x.ai/v1/videos/generations
- Video poll: GET https://api.x.ai/v1/videos/{request_id}
- Total: 24 images + 2 videos = 26 assets
- Cost: ~$13.30 of $13.33 budget

*2 binary video files added (previews not shown).*

View File

@@ -0,0 +1,17 @@
"""MemPalace integration for Hermes sovereign agent.
Provides:
- mempalace.py: PalaceRoom + Mempalace classes for analytical workflows
- retrieval_enforcer.py: L0-L5 retrieval order enforcement
- wakeup.py: Session wake-up protocol (~300-900 tokens)
- scratchpad.py: JSON-based session scratchpad with palace promotion
- sovereign_store.py: Zero-API durable memory (SQLite + FTS5 + HRR vectors)
- promotion.py: Quality-gated scratchpad-to-palace promotion (MP-4)
Epic: #367
"""
from .mempalace import Mempalace, PalaceRoom, analyse_issues
from .sovereign_store import SovereignStore
__all__ = ["Mempalace", "PalaceRoom", "analyse_issues", "SovereignStore"]

View File

@@ -0,0 +1,225 @@
"""
---
title: Mempalace — Analytical Workflow Memory Framework
description: Applies spatial memory palace organization to analytical tasks (issue triage, repo audits, backlog analysis) for faster, more consistent results.
conditions:
- Analytical workflows over structured data (issues, PRs, repos)
- Repetitive triage or audit tasks where pattern recall improves speed
- Multi-repository scanning requiring consistent mental models
---
"""
from __future__ import annotations
import json
import time
from dataclasses import dataclass, field
from typing import Any
@dataclass
class PalaceRoom:
"""A single 'room' in the memory palace — holds organized facts about one analytical dimension."""
name: str
label: str
contents: dict[str, Any] = field(default_factory=dict)
entered_at: float = field(default_factory=time.time)
def store(self, key: str, value: Any) -> None:
self.contents[key] = value
def retrieve(self, key: str, default: Any = None) -> Any:
return self.contents.get(key, default)
def summary(self) -> str:
lines = [f"## {self.label}"]
for k, v in self.contents.items():
lines.append(f" {k}: {v}")
return "\n".join(lines)
class Mempalace:
"""
Spatial memory palace for analytical workflows.
Organises multi-dimensional data about a domain (e.g. Gitea issues) into
named rooms. Each room models one analytical dimension, making it easy to
traverse observations in a consistent order — the same pattern that produced
a 19% throughput improvement in Allegro's April 2026 evaluation.
Standard rooms for issue-analysis workflows
-------------------------------------------
repo_architecture Repository structure and inter-repo relationships
assignment_status Assigned vs unassigned issue distribution
triage_priority Priority / urgency levels (the "lighting system")
resolution_patterns Historical resolution trends and velocity
Usage
-----
>>> palace = Mempalace.for_issue_analysis()
>>> palace.enter("repo_architecture")
>>> palace.store("total_repos", 11)
>>> palace.store("repos_with_issues", 4)
>>> palace.enter("assignment_status")
>>> palace.store("assigned", 72)
>>> palace.store("unassigned", 22)
>>> print(palace.render())
"""
def __init__(self, domain: str = "general") -> None:
self.domain = domain
self._rooms: dict[str, PalaceRoom] = {}
self._current_room: str | None = None
self._created_at: float = time.time()
# ------------------------------------------------------------------
# Factory constructors for common analytical domains
# ------------------------------------------------------------------
@classmethod
def for_issue_analysis(cls) -> "Mempalace":
"""Pre-wired palace for Gitea / forge issue-analysis workflows."""
p = cls(domain="issue_analysis")
p.add_room("repo_architecture", "Repository Architecture Room")
p.add_room("assignment_status", "Issue Assignment Status Room")
p.add_room("triage_priority", "Triage Priority Room")
p.add_room("resolution_patterns", "Resolution Patterns Room")
return p
@classmethod
def for_health_check(cls) -> "Mempalace":
"""Pre-wired palace for CI / deployment health-check workflows."""
p = cls(domain="health_check")
p.add_room("service_topology", "Service Topology Room")
p.add_room("failure_signals", "Failure Signals Room")
p.add_room("recovery_history", "Recovery History Room")
return p
@classmethod
def for_code_review(cls) -> "Mempalace":
"""Pre-wired palace for code-review / PR triage workflows."""
p = cls(domain="code_review")
p.add_room("change_scope", "Change Scope Room")
p.add_room("risk_surface", "Risk Surface Room")
p.add_room("test_coverage", "Test Coverage Room")
p.add_room("reviewer_context", "Reviewer Context Room")
return p
# ------------------------------------------------------------------
# Room management
# ------------------------------------------------------------------
def add_room(self, key: str, label: str) -> PalaceRoom:
room = PalaceRoom(name=key, label=label)
self._rooms[key] = room
return room
def enter(self, room_key: str) -> PalaceRoom:
if room_key not in self._rooms:
raise KeyError(f"No room '{room_key}' in palace. Available: {list(self._rooms)}")
self._current_room = room_key
return self._rooms[room_key]
def store(self, key: str, value: Any) -> None:
"""Store a value in the currently active room."""
if self._current_room is None:
raise RuntimeError("Enter a room before storing values.")
self._rooms[self._current_room].store(key, value)
def retrieve(self, room_key: str, key: str, default: Any = None) -> Any:
if room_key not in self._rooms:
return default
return self._rooms[room_key].retrieve(key, default)
# ------------------------------------------------------------------
# Rendering
# ------------------------------------------------------------------
def render(self) -> str:
"""Return a human-readable summary of the entire palace."""
elapsed = time.time() - self._created_at
lines = [
f"# Mempalace — {self.domain}",
f"_traversal time: {elapsed:.2f}s | rooms: {len(self._rooms)}_",
"",
]
for room in self._rooms.values():
lines.append(room.summary())
lines.append("")
return "\n".join(lines)
def to_dict(self) -> dict:
return {
"domain": self.domain,
"elapsed_seconds": round(time.time() - self._created_at, 3),
"rooms": {k: v.contents for k, v in self._rooms.items()},
}
def to_json(self) -> str:
return json.dumps(self.to_dict(), indent=2)
# ---------------------------------------------------------------------------
# Skill entry-point
# ---------------------------------------------------------------------------
def analyse_issues(
repos_data: list[dict],
target_assignee_rate: float = 0.80,
) -> str:
"""
Applies the mempalace technique to a list of repo issue summaries.
Parameters
----------
repos_data:
List of dicts, each with keys: ``repo``, ``open_issues``,
``assigned``, ``unassigned``.
target_assignee_rate:
Minimum acceptable assignee-coverage ratio (default 0.80).
Returns
-------
str
Rendered palace summary with coverage assessment.
"""
palace = Mempalace.for_issue_analysis()
# --- Repository Architecture Room ---
palace.enter("repo_architecture")
total_issues = sum(r.get("open_issues", 0) for r in repos_data)
repos_with_issues = sum(1 for r in repos_data if r.get("open_issues", 0) > 0)
palace.store("repos_sampled", len(repos_data))
palace.store("repos_with_issues", repos_with_issues)
palace.store("total_open_issues", total_issues)
palace.store(
"avg_issues_per_repo",
round(total_issues / len(repos_data), 1) if repos_data else 0,
)
# --- Assignment Status Room ---
palace.enter("assignment_status")
total_assigned = sum(r.get("assigned", 0) for r in repos_data)
total_unassigned = sum(r.get("unassigned", 0) for r in repos_data)
coverage = total_assigned / total_issues if total_issues else 0
palace.store("assigned", total_assigned)
palace.store("unassigned", total_unassigned)
palace.store("coverage_rate", round(coverage, 3))
palace.store(
"coverage_status",
"OK" if coverage >= target_assignee_rate else f"BELOW TARGET ({target_assignee_rate:.0%})",
)
# --- Triage Priority Room ---
palace.enter("triage_priority")
unassigned_repos = [r["repo"] for r in repos_data if r.get("unassigned", 0) > 0]
palace.store("repos_needing_triage", unassigned_repos)
palace.store("triage_count", total_unassigned)
# --- Resolution Patterns Room ---
palace.enter("resolution_patterns")
palace.store("technique", "mempalace")
palace.store("target_assignee_rate", target_assignee_rate)
return palace.render()
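The coverage math that `analyse_issues` stores in the assignment-status room can be checked standalone. A minimal sketch with hypothetical sample data (the repo names below are illustrative, not real fleet repos):

```python
# Standalone sketch of the assignee-coverage calculation in analyse_issues.
# repos_data entries are hypothetical examples.
repos_data = [
    {"repo": "example-config", "open_issues": 10, "assigned": 8, "unassigned": 2},
    {"repo": "example-skills", "open_issues": 5, "assigned": 5, "unassigned": 0},
]

total_issues = sum(r.get("open_issues", 0) for r in repos_data)
total_assigned = sum(r.get("assigned", 0) for r in repos_data)
coverage = total_assigned / total_issues if total_issues else 0

target = 0.80
status = "OK" if coverage >= target else f"BELOW TARGET ({target:.0%})"
print(round(coverage, 3), status)  # 13/15 assigned -> 0.867 OK
```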



@@ -0,0 +1,188 @@
"""Memory Promotion — quality-gated scratchpad-to-palace promotion.
Implements MP-4 (#371): move session notes to durable memory only when
they pass quality gates. No LLM calls — all heuristic-based.
Quality gates:
1. Minimum content length (too short = noise)
2. Structural quality (has subject-verb structure, not just a fragment)
3. Duplicate detection (FTS5 + HRR similarity check)
4. Staleness check (don't promote stale notes from old sessions)
Refs: Epic #367, Sub-issue #371
"""
from __future__ import annotations
import re
import time
from typing import Optional
try:
from .sovereign_store import SovereignStore
except ImportError:
from sovereign_store import SovereignStore
# ---------------------------------------------------------------------------
# Quality gate thresholds
# ---------------------------------------------------------------------------
MIN_CONTENT_WORDS = 5
MAX_CONTENT_WORDS = 500
DUPLICATE_SIMILARITY = 0.85
DUPLICATE_FTS_THRESHOLD = 3
STALE_SECONDS = 86400 * 7
MIN_TRUST_FOR_AUTO = 0.4
# ---------------------------------------------------------------------------
# Quality checks
# ---------------------------------------------------------------------------
def _check_length(content: str) -> tuple[bool, str]:
"""Gate 1: Content length check."""
words = content.split()
if len(words) < MIN_CONTENT_WORDS:
return False, f"Too short ({len(words)} words, minimum {MIN_CONTENT_WORDS})"
if len(words) > MAX_CONTENT_WORDS:
return False, f"Too long ({len(words)} words, maximum {MAX_CONTENT_WORDS}). Summarize first."
return True, "OK"
def _check_structure(content: str) -> tuple[bool, str]:
"""Gate 2: Basic structural quality."""
if not re.search(r"[a-zA-Z]", content):
return False, "No alphabetic content — pure code/numbers are not memory-worthy"
if len(content.split()) < 3:
return False, "Fragment — needs at least subject + predicate"
return True, "OK"
def _check_duplicate(content: str, store: SovereignStore, room: str) -> tuple[bool, str]:
"""Gate 3: Duplicate detection via hybrid search."""
results = store.search(content, room=room, limit=5, min_trust=0.0)
for r in results:
if r["score"] > DUPLICATE_SIMILARITY:
return False, f"Duplicate detected: memory #{r['memory_id']} (score {r['score']:.3f})"
if _text_overlap(content, r["content"]) > 0.8:
return False, f"Near-duplicate text: memory #{r['memory_id']}"
return True, "OK"
def _check_staleness(written_at: float) -> tuple[bool, str]:
"""Gate 4: Staleness check."""
age = time.time() - written_at
if age > STALE_SECONDS:
days = int(age / 86400)
return False, f"Stale ({days} days old). Review manually before promoting."
return True, "OK"
def _text_overlap(a: str, b: str) -> float:
"""Jaccard similarity between two texts (word-level)."""
words_a = set(a.lower().split())
words_b = set(b.lower().split())
if not words_a or not words_b:
return 0.0
intersection = words_a & words_b
union = words_a | words_b
return len(intersection) / len(union)
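As a quick sanity check on the word-level Jaccard metric, here is a self-contained version duplicating `_text_overlap`'s logic, with a worked example:

```python
def jaccard(a: str, b: str) -> float:
    # Word-level Jaccard similarity, as computed by _text_overlap above.
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

# "the cat sat" vs "the cat ran": 2 shared words of 4 distinct -> 0.5
print(jaccard("the cat sat", "the cat ran"))
```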
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------
class PromotionResult:
"""Result of a promotion attempt."""
def __init__(self, success: bool, memory_id: Optional[int], reason: str, gates: dict):
self.success = success
self.memory_id = memory_id
self.reason = reason
self.gates = gates
def __repr__(self):
status = "PROMOTED" if self.success else "REJECTED"
return f"PromotionResult({status}: {self.reason})"
def evaluate_for_promotion(
content: str,
store: SovereignStore,
room: str = "general",
written_at: Optional[float] = None,
) -> dict:
"""Run all quality gates without actually promoting."""
if written_at is None:
written_at = time.time()
gates = {}
gates["length"] = _check_length(content)
gates["structure"] = _check_structure(content)
gates["duplicate"] = _check_duplicate(content, store, room)
gates["staleness"] = _check_staleness(written_at)
all_passed = all(passed for passed, _ in gates.values())
return {
"eligible": all_passed,
"gates": gates,
"content_preview": content[:100] + ("..." if len(content) > 100 else ""),
}
def promote(
content: str,
store: SovereignStore,
session_id: str,
scratch_key: str,
room: str = "general",
category: str = "",
trust: float = 0.5,
written_at: Optional[float] = None,
force: bool = False,
) -> PromotionResult:
"""Promote a scratchpad note to durable palace memory."""
if written_at is None:
written_at = time.time()
gates = {}
if not force:
gates["length"] = _check_length(content)
gates["structure"] = _check_structure(content)
gates["duplicate"] = _check_duplicate(content, store, room)
gates["staleness"] = _check_staleness(written_at)
for gate_name, (passed, message) in gates.items():
if not passed:
return PromotionResult(
success=False, memory_id=None,
reason=f"Failed gate '{gate_name}': {message}", gates=gates,
)
memory_id = store.store(content, room=room, category=category, trust=trust)
store.log_promotion(session_id, scratch_key, memory_id, reason="auto" if not force else "forced")
return PromotionResult(success=True, memory_id=memory_id, reason="Promoted to durable memory", gates=gates)
def promote_session_batch(
store: SovereignStore,
session_id: str,
notes: dict[str, dict],
room: str = "general",
force: bool = False,
) -> list[PromotionResult]:
"""Promote all notes from a session scratchpad."""
results = []
for key, entry in notes.items():
content = entry.get("value", str(entry)) if isinstance(entry, dict) else str(entry)
written_at = None
if isinstance(entry, dict) and "written_at" in entry:
try:
import datetime
written_at = datetime.datetime.strptime(
entry["written_at"], "%Y-%m-%d %H:%M:%S"
).timestamp()
except (ValueError, TypeError):
pass
result = promote(
content=str(content), store=store, session_id=session_id,
scratch_key=key, room=room, written_at=written_at, force=force,
)
results.append(result)
return results
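The first-failing-gate short-circuit in `promote()` can be sketched standalone. This reimplements only the length and structure heuristics (a simplified sketch; the real pipeline also runs duplicate and staleness gates against the store):

```python
import re

MIN_WORDS, MAX_WORDS = 5, 500

def check_length(content):
    words = content.split()
    if len(words) < MIN_WORDS:
        return False, f"Too short ({len(words)} words)"
    if len(words) > MAX_WORDS:
        return False, "Too long"
    return True, "OK"

def check_structure(content):
    if not re.search(r"[a-zA-Z]", content):
        return False, "No alphabetic content"
    return True, "OK"

def run_gates(content):
    # First failing gate wins, mirroring promote()'s rejection loop.
    for name, gate in [("length", check_length), ("structure", check_structure)]:
        passed, msg = gate(content)
        if not passed:
            return False, f"Failed gate '{name}': {msg}"
    return True, "OK"

print(run_gates("too short"))
print(run_gates("this note is long enough to pass both heuristic gates"))
```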


@@ -0,0 +1,310 @@
"""Retrieval Order Enforcer — L0 through L5 memory hierarchy.
Ensures the agent checks durable memory before falling back to free generation.
Gracefully degrades if any layer is unavailable (missing files, etc).
Layer order:
L0: Identity (~/.mempalace/identity.txt)
L1: Palace rooms (SovereignStore — SQLite + FTS5 + HRR, zero API calls)
L2: Session scratch (~/.hermes/scratchpad/{session_id}.json)
L3: Gitea artifacts (repository search via API)
L4: Procedures (skills directory search)
L5: Free generation (only if L0-L4 produced nothing)
Refs: Epic #367, Sub-issue #369, Wiring: #383
"""
from __future__ import annotations
import json
import re
from pathlib import Path
from typing import Optional
# ---------------------------------------------------------------------------
# Sovereign Store (replaces mempalace CLI subprocess)
# ---------------------------------------------------------------------------
try:
from .sovereign_store import SovereignStore
except ImportError:
try:
from sovereign_store import SovereignStore
except ImportError:
SovereignStore = None # type: ignore[misc,assignment]
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
IDENTITY_PATH = Path.home() / ".mempalace" / "identity.txt"
SCRATCHPAD_DIR = Path.home() / ".hermes" / "scratchpad"
SKILLS_DIR = Path.home() / ".hermes" / "skills"
SOVEREIGN_DB = Path.home() / ".hermes" / "palace" / "sovereign.db"
# Patterns that indicate a recall-style query
RECALL_PATTERNS = re.compile(
r"(?i)\b("
r"what did|status of|remember|last time|yesterday|previously|"
r"we discussed|we talked|we worked|you said|you mentioned|"
r"remind me|what was|what were|how did|when did|"
r"earlier today|last session|before this"
r")\b"
)
# Singleton store instance (lazy-init)
_store: Optional["SovereignStore"] = None
def _get_store() -> Optional["SovereignStore"]:
"""Lazy-init the SovereignStore singleton."""
global _store
if _store is not None:
return _store
if SovereignStore is None:
return None
try:
_store = SovereignStore(db_path=str(SOVEREIGN_DB))
return _store
except Exception:
return None
# ---------------------------------------------------------------------------
# L0: Identity
# ---------------------------------------------------------------------------
def load_identity() -> str:
"""Read the agent identity file. Returns empty string on failure."""
try:
if IDENTITY_PATH.exists():
text = IDENTITY_PATH.read_text(encoding="utf-8").strip()
            # Cap at ~200 words to keep wake-up lean
if len(text.split()) > 200:
text = " ".join(text.split()[:200]) + "..."
return text
except (OSError, PermissionError):
pass
return ""
# ---------------------------------------------------------------------------
# L1: Palace search (now via SovereignStore — zero subprocess, zero API)
# ---------------------------------------------------------------------------
def search_palace(query: str, room: Optional[str] = None) -> str:
"""Search the sovereign memory store for relevant memories.
Uses SovereignStore (SQLite + FTS5 + HRR) for hybrid keyword + semantic
search. No subprocess calls, no ONNX, no API keys.
Gracefully degrades to empty string if store is unavailable.
"""
store = _get_store()
if store is None:
return ""
try:
results = store.search(query, room=room, limit=5, min_trust=0.2)
if not results:
return ""
lines = []
for r in results:
trust = r.get("trust_score", 0.5)
room_name = r.get("room", "general")
content = r.get("content", "")
lines.append(f" [{room_name}] (trust:{trust:.2f}) {content}")
return "\n".join(lines)
except Exception:
return ""
# ---------------------------------------------------------------------------
# L2: Session scratchpad
# ---------------------------------------------------------------------------
def load_scratchpad(session_id: str) -> str:
"""Load the session scratchpad as formatted text."""
try:
scratch_file = SCRATCHPAD_DIR / f"{session_id}.json"
if scratch_file.exists():
data = json.loads(scratch_file.read_text(encoding="utf-8"))
if isinstance(data, dict) and data:
lines = []
for k, v in data.items():
lines.append(f" {k}: {v}")
return "\n".join(lines)
except (OSError, json.JSONDecodeError):
pass
return ""
# ---------------------------------------------------------------------------
# L3: Gitea artifact search
# ---------------------------------------------------------------------------
def _load_gitea_token() -> str:
"""Read the Gitea API token."""
token_path = Path.home() / ".hermes" / "gitea_token_vps"
try:
if token_path.exists():
return token_path.read_text(encoding="utf-8").strip()
except OSError:
pass
return ""
def search_gitea(query: str) -> str:
"""Search Gitea issues/PRs for context. Returns formatted text or empty string."""
token = _load_gitea_token()
if not token:
return ""
api_base = "https://forge.alexanderwhitestone.com/api/v1"
# Extract key terms for search (first 3 significant words)
terms = [w for w in query.split() if len(w) > 3][:3]
search_q = " ".join(terms) if terms else query[:50]
try:
import urllib.request
import urllib.parse
url = (
f"{api_base}/repos/search?"
f"q={urllib.parse.quote(search_q)}&limit=3"
)
req = urllib.request.Request(url, headers={
"Authorization": f"token {token}",
"Accept": "application/json",
})
with urllib.request.urlopen(req, timeout=8) as resp:
data = json.loads(resp.read().decode())
if data.get("data"):
lines = []
for repo in data["data"][:3]:
lines.append(f" {repo['full_name']}: {repo.get('description', 'no desc')}")
return "\n".join(lines)
except Exception:
pass
return ""
# ---------------------------------------------------------------------------
# L4: Procedures (skills search)
# ---------------------------------------------------------------------------
def search_skills(query: str) -> str:
"""Search skills directory for matching procedures."""
try:
if not SKILLS_DIR.exists():
return ""
query_lower = query.lower()
terms = [w for w in query_lower.split() if len(w) > 3]
if not terms:
return ""
matches = []
for skill_dir in SKILLS_DIR.iterdir():
if not skill_dir.is_dir():
continue
skill_md = skill_dir / "SKILL.md"
if skill_md.exists():
try:
content = skill_md.read_text(encoding="utf-8").lower()
if any(t in content for t in terms):
title = skill_dir.name
matches.append(f" skill: {title}")
except OSError:
continue
if matches:
return "\n".join(matches[:5])
except OSError:
pass
return ""
# ---------------------------------------------------------------------------
# Main enforcer
# ---------------------------------------------------------------------------
def is_recall_query(query: str) -> bool:
"""Detect whether a query is asking for recalled/historical information."""
return bool(RECALL_PATTERNS.search(query))
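The recall gate is a plain regex test. A minimal sketch with an abridged version of the pattern (only a subset of the alternations in RECALL_PATTERNS above):

```python
import re

# Abridged recall-detection pattern; the real RECALL_PATTERNS has more alternations.
recall = re.compile(r"(?i)\b(what did|remember|last time|yesterday|remind me)\b")

def is_recall(q: str) -> bool:
    return bool(recall.search(q))

print(is_recall("What did we decide yesterday?"))    # recall-style query
print(is_recall("Write a function to sort a list"))  # generative query
```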
def enforce_retrieval_order(
query: str,
session_id: Optional[str] = None,
skip_if_not_recall: bool = True,
) -> dict:
"""Check palace layers before allowing free generation.
Args:
query: The user's query text.
session_id: Current session ID for scratchpad access.
skip_if_not_recall: If True (default), skip enforcement for
non-recall queries and return empty result.
Returns:
dict with keys:
retrieved_from: Highest layer that produced results (e.g. 'L1')
context: Aggregated context string
tokens: Approximate word count of context
layers_checked: List of layers that were consulted
"""
result = {
"retrieved_from": None,
"context": "",
"tokens": 0,
"layers_checked": [],
}
# Gate: skip for non-recall queries if configured
if skip_if_not_recall and not is_recall_query(query):
return result
# L0: Identity (always prepend)
identity = load_identity()
if identity:
result["context"] += f"## Identity\n{identity}\n\n"
result["layers_checked"].append("L0")
# L1: Palace search (SovereignStore — zero API, zero subprocess)
palace_results = search_palace(query)
if palace_results:
result["context"] += f"## Palace Memory\n{palace_results}\n\n"
result["retrieved_from"] = "L1"
result["layers_checked"].append("L1")
# L2: Scratchpad
if session_id:
scratch = load_scratchpad(session_id)
if scratch:
result["context"] += f"## Session Notes\n{scratch}\n\n"
if not result["retrieved_from"]:
result["retrieved_from"] = "L2"
result["layers_checked"].append("L2")
# L3: Gitea artifacts (only if still no context from L1/L2)
if not result["retrieved_from"]:
artifacts = search_gitea(query)
if artifacts:
result["context"] += f"## Gitea Context\n{artifacts}\n\n"
result["retrieved_from"] = "L3"
result["layers_checked"].append("L3")
# L4: Procedures (only if still no context)
if not result["retrieved_from"]:
procedures = search_skills(query)
if procedures:
result["context"] += f"## Related Skills\n{procedures}\n\n"
result["retrieved_from"] = "L4"
result["layers_checked"].append("L4")
# L5: Free generation (no context found — just mark it)
if not result["retrieved_from"]:
result["retrieved_from"] = "L5"
result["layers_checked"].append("L5")
result["tokens"] = len(result["context"].split())
return result
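The layer chain above can be reduced to a first-hit-wins loop. A simplified sketch with stub layers (the real enforcer always consults L0 and L2 and only gates L3/L4 on earlier misses, so this is an approximation):

```python
def enforce(layers):
    # layers: ordered (name, callable) pairs; first non-empty result wins,
    # mirroring the L1 -> L4 fallback before free generation (L5).
    checked = []
    for name, fn in layers:
        checked.append(name)
        ctx = fn()
        if ctx:
            return {"retrieved_from": name, "context": ctx, "layers_checked": checked}
    checked.append("L5")
    return {"retrieved_from": "L5", "context": "", "layers_checked": checked}

# Stub layers: L1 misses, L2 hits, so L3 is never consulted.
result = enforce([
    ("L1", lambda: ""),
    ("L2", lambda: "note: deploy at 14:00"),
    ("L3", lambda: "should not be reached"),
])
print(result["retrieved_from"], result["layers_checked"])
```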


@@ -0,0 +1,184 @@
"""Session Scratchpad — ephemeral key-value notes per session.
Provides fast, JSON-backed scratch storage that lives for a session
and can be promoted to durable palace memory.
Storage: ~/.hermes/scratchpad/{session_id}.json
Refs: Epic #367, Sub-issue #372
"""
from __future__ import annotations
import json
import os
import subprocess
import time
from pathlib import Path
from typing import Any, Optional
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
SCRATCHPAD_DIR = Path.home() / ".hermes" / "scratchpad"
MEMPALACE_BIN = "/Library/Frameworks/Python.framework/Versions/3.12/bin/mempalace"
# ---------------------------------------------------------------------------
# Internal helpers
# ---------------------------------------------------------------------------
def _scratch_path(session_id: str) -> Path:
"""Return the JSON file path for a given session."""
# Sanitize session_id to prevent path traversal
safe_id = "".join(c for c in session_id if c.isalnum() or c in "-_")
if not safe_id:
safe_id = "unnamed"
return SCRATCHPAD_DIR / f"{safe_id}.json"
def _load(session_id: str) -> dict:
"""Load scratchpad data, returning empty dict on failure."""
path = _scratch_path(session_id)
try:
if path.exists():
return json.loads(path.read_text(encoding="utf-8"))
except (OSError, json.JSONDecodeError):
pass
return {}
def _save(session_id: str, data: dict) -> None:
"""Persist scratchpad data to disk."""
SCRATCHPAD_DIR.mkdir(parents=True, exist_ok=True)
path = _scratch_path(session_id)
path.write_text(json.dumps(data, indent=2, default=str), encoding="utf-8")
# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------
def write_scratch(session_id: str, key: str, value: Any) -> None:
"""Write a note to the session scratchpad.
Args:
session_id: Current session identifier.
key: Note key (string).
value: Note value (any JSON-serializable type).
"""
data = _load(session_id)
data[key] = {
"value": value,
"written_at": time.strftime("%Y-%m-%d %H:%M:%S"),
}
_save(session_id, data)
def read_scratch(session_id: str, key: Optional[str] = None) -> dict:
"""Read session scratchpad (all keys or one).
Args:
session_id: Current session identifier.
key: Optional specific key. If None, returns all entries.
Returns:
dict — either {key: {value, written_at}} or the full scratchpad.
"""
data = _load(session_id)
if key is not None:
entry = data.get(key)
return {key: entry} if entry else {}
return data
def delete_scratch(session_id: str, key: str) -> bool:
"""Remove a single key from the scratchpad.
Returns True if the key existed and was removed.
"""
data = _load(session_id)
if key in data:
del data[key]
_save(session_id, data)
return True
return False
def list_sessions() -> list[str]:
"""List all session IDs that have scratchpad files."""
try:
if SCRATCHPAD_DIR.exists():
return [
f.stem
for f in SCRATCHPAD_DIR.iterdir()
if f.suffix == ".json" and f.is_file()
]
except OSError:
pass
return []
def promote_to_palace(
session_id: str,
key: str,
room: str = "general",
drawer: Optional[str] = None,
) -> bool:
"""Move a scratchpad note to durable palace memory.
Uses the mempalace CLI to store the note in the specified room.
Removes the note from the scratchpad after successful promotion.
Args:
session_id: Session containing the note.
key: Scratchpad key to promote.
room: Palace room name (default: 'general').
drawer: Optional drawer name within the room. Defaults to key.
Returns:
True if promotion succeeded, False otherwise.
"""
data = _load(session_id)
entry = data.get(key)
if not entry:
return False
value = entry.get("value", entry) if isinstance(entry, dict) else entry
content = json.dumps(value, default=str) if not isinstance(value, str) else value
try:
bin_path = MEMPALACE_BIN if os.path.exists(MEMPALACE_BIN) else "mempalace"
target_drawer = drawer or key
result = subprocess.run(
[bin_path, "store", room, target_drawer, content],
capture_output=True,
text=True,
timeout=10,
)
if result.returncode == 0:
# Remove from scratchpad after successful promotion
del data[key]
_save(session_id, data)
return True
except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
# mempalace CLI not available — degrade gracefully
pass
return False
def clear_session(session_id: str) -> bool:
"""Delete the entire scratchpad for a session.
Returns True if the file existed and was removed.
"""
path = _scratch_path(session_id)
try:
if path.exists():
path.unlink()
return True
except OSError:
pass
return False
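Usage of the scratchpad API is straightforward. A self-contained sketch against a temporary directory (the real module writes under `~/.hermes/scratchpad`):

```python
import json
import tempfile
import time
from pathlib import Path

# Simplified reimplementation of write_scratch/read_scratch for demonstration.
scratch_dir = Path(tempfile.mkdtemp())

def write_scratch(session_id, key, value):
    path = scratch_dir / f"{session_id}.json"
    data = json.loads(path.read_text()) if path.exists() else {}
    data[key] = {"value": value, "written_at": time.strftime("%Y-%m-%d %H:%M:%S")}
    path.write_text(json.dumps(data, indent=2, default=str))

def read_scratch(session_id, key=None):
    path = scratch_dir / f"{session_id}.json"
    data = json.loads(path.read_text()) if path.exists() else {}
    if key is not None:
        return {key: data[key]} if key in data else {}
    return data

write_scratch("demo", "deploy_time", "14:00 UTC")
print(read_scratch("demo", "deploy_time")["deploy_time"]["value"])
```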


@@ -0,0 +1,474 @@
"""Sovereign Memory Store — zero-API, zero-dependency durable memory.
Replaces the third-party `mempalace` CLI and its ONNX requirement with a
self-contained SQLite + FTS5 + HRR (Holographic Reduced Representation)
store. Every operation is local: no network calls, no API keys, no cloud.
Storage: ~/.hermes/palace/sovereign.db
Capabilities:
- Durable fact storage with rooms, categories, and trust scores
- Hybrid retrieval: FTS5 keyword search + HRR cosine similarity
- Reciprocal Rank Fusion to merge keyword and semantic results
- Trust scoring: facts that get retrieved and confirmed gain trust
- Graceful numpy degradation: falls back to keyword-only if missing
Refs: Epic #367, MP-3 #370, MP-4 #371
"""
from __future__ import annotations
import hashlib
import json
import math
import sqlite3
import struct
import time
from pathlib import Path
from typing import Any, Optional
# ---------------------------------------------------------------------------
# HRR (Holographic Reduced Representations) — zero-dependency vectors
# ---------------------------------------------------------------------------
# Phase-encoded vectors via SHA-256. No ONNX, no embeddings API, no numpy
# required (but uses numpy when available for speed).
_TWO_PI = 2.0 * math.pi
_DIM = 512 # Compact dimension — sufficient for memory retrieval
try:
import numpy as np
_HAS_NUMPY = True
except ImportError:
_HAS_NUMPY = False
def _encode_atom_np(word: str, dim: int = _DIM) -> "np.ndarray":
"""Deterministic phase vector via SHA-256 (numpy path)."""
values_per_block = 16
blocks_needed = math.ceil(dim / values_per_block)
uint16_values: list[int] = []
for i in range(blocks_needed):
digest = hashlib.sha256(f"{word}:{i}".encode()).digest()
uint16_values.extend(struct.unpack("<16H", digest))
return np.array(uint16_values[:dim], dtype=np.float64) * (_TWO_PI / 65536.0)
def _encode_atom_pure(word: str, dim: int = _DIM) -> list[float]:
"""Deterministic phase vector via SHA-256 (pure Python fallback)."""
values_per_block = 16
blocks_needed = math.ceil(dim / values_per_block)
uint16_values: list[int] = []
for i in range(blocks_needed):
digest = hashlib.sha256(f"{word}:{i}".encode()).digest()
for j in range(0, 32, 2):
uint16_values.append(int.from_bytes(digest[j:j+2], "little"))
return [v * (_TWO_PI / 65536.0) for v in uint16_values[:dim]]
def encode_text(text: str, dim: int = _DIM):
"""Encode a text string into an HRR phase vector by bundling word atoms.
Uses circular mean of per-word phase vectors — the standard HRR
superposition operation. Result is a fixed-width vector regardless
of input length.
"""
words = text.lower().split()
if not words:
words = ["<empty>"]
if _HAS_NUMPY:
atoms = [_encode_atom_np(w, dim) for w in words]
# Circular mean: average the unit vectors, extract phase
unit_sum = sum(np.exp(1j * a) for a in atoms)
return np.angle(unit_sum) % _TWO_PI
else:
# Pure Python circular mean
real_sum = [0.0] * dim
imag_sum = [0.0] * dim
for w in words:
atom = _encode_atom_pure(w, dim)
for d in range(dim):
real_sum[d] += math.cos(atom[d])
imag_sum[d] += math.sin(atom[d])
return [math.atan2(imag_sum[d], real_sum[d]) % _TWO_PI for d in range(dim)]
def cosine_similarity_phase(a, b) -> float:
"""Cosine similarity between two phase vectors.
For phase vectors, similarity = mean(cos(a - b)).
"""
if _HAS_NUMPY:
return float(np.mean(np.cos(np.array(a) - np.array(b))))
else:
n = len(a)
return sum(math.cos(a[i] - b[i]) for i in range(n)) / n
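The phase-vector scheme is easy to exercise end to end. A pure-Python sketch at a reduced dimension (64 here instead of the store's 512), mirroring `_encode_atom_pure`, `encode_text`, and `cosine_similarity_phase`:

```python
import hashlib
import math

TWO_PI = 2.0 * math.pi
DIM = 64  # small dimension for the demo; the store uses 512

def encode_atom(word, dim=DIM):
    # Deterministic phase vector from SHA-256, 16 uint16 values per digest.
    vals = []
    i = 0
    while len(vals) < dim:
        digest = hashlib.sha256(f"{word}:{i}".encode()).digest()
        for j in range(0, 32, 2):
            vals.append(int.from_bytes(digest[j:j + 2], "little"))
        i += 1
    return [v * (TWO_PI / 65536.0) for v in vals[:dim]]

def encode_text(text, dim=DIM):
    # Circular mean of per-word atoms — the HRR superposition step.
    words = text.lower().split() or ["<empty>"]
    re_sum, im_sum = [0.0] * dim, [0.0] * dim
    for w in words:
        atom = encode_atom(w, dim)
        for d in range(dim):
            re_sum[d] += math.cos(atom[d])
            im_sum[d] += math.sin(atom[d])
    return [math.atan2(im_sum[d], re_sum[d]) % TWO_PI for d in range(dim)]

def sim(a, b):
    # Phase-vector cosine similarity: mean(cos(a - b)).
    return sum(math.cos(a[i] - b[i]) for i in range(len(a))) / len(a)

v = encode_text("deploy the fleet")
print(round(sim(v, v), 6))  # identical texts -> 1.0
```

Encoding is deterministic, so the same text always yields the same vector, and unrelated texts land near-orthogonal (similarity near 0).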
def serialize_vector(vec) -> bytes:
"""Serialize a vector to bytes for SQLite storage."""
if _HAS_NUMPY:
return vec.astype(np.float64).tobytes()
else:
return struct.pack(f"{len(vec)}d", *vec)
def deserialize_vector(blob: bytes):
"""Deserialize bytes back to a vector."""
n = len(blob) // 8 # float64 = 8 bytes
if _HAS_NUMPY:
return np.frombuffer(blob, dtype=np.float64)
else:
return list(struct.unpack(f"{n}d", blob))
# ---------------------------------------------------------------------------
# SQLite Schema
# ---------------------------------------------------------------------------
_SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
memory_id INTEGER PRIMARY KEY AUTOINCREMENT,
content TEXT NOT NULL,
room TEXT DEFAULT 'general',
category TEXT DEFAULT '',
trust_score REAL DEFAULT 0.5,
retrieval_count INTEGER DEFAULT 0,
created_at REAL NOT NULL,
updated_at REAL NOT NULL,
hrr_vector BLOB
);
CREATE INDEX IF NOT EXISTS idx_memories_room ON memories(room);
CREATE INDEX IF NOT EXISTS idx_memories_trust ON memories(trust_score DESC);
-- FTS5 for fast keyword search
CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(
content, room, category,
content=memories, content_rowid=memory_id,
tokenize='porter unicode61'
);
-- Sync triggers
CREATE TRIGGER IF NOT EXISTS memories_ai AFTER INSERT ON memories BEGIN
INSERT INTO memories_fts(rowid, content, room, category)
VALUES (new.memory_id, new.content, new.room, new.category);
END;
CREATE TRIGGER IF NOT EXISTS memories_ad AFTER DELETE ON memories BEGIN
INSERT INTO memories_fts(memories_fts, rowid, content, room, category)
VALUES ('delete', old.memory_id, old.content, old.room, old.category);
END;
CREATE TRIGGER IF NOT EXISTS memories_au AFTER UPDATE ON memories BEGIN
INSERT INTO memories_fts(memories_fts, rowid, content, room, category)
VALUES ('delete', old.memory_id, old.content, old.room, old.category);
INSERT INTO memories_fts(rowid, content, room, category)
VALUES (new.memory_id, new.content, new.room, new.category);
END;
-- Promotion log: tracks what moved from scratchpad to durable memory
CREATE TABLE IF NOT EXISTS promotion_log (
log_id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
scratch_key TEXT NOT NULL,
memory_id INTEGER REFERENCES memories(memory_id),
promoted_at REAL NOT NULL,
reason TEXT DEFAULT ''
);
"""
# ---------------------------------------------------------------------------
# SovereignStore
# ---------------------------------------------------------------------------
class SovereignStore:
"""Zero-API durable memory store.
All operations are local SQLite. No network calls. No API keys.
HRR vectors provide semantic similarity without embedding models.
FTS5 provides fast keyword search. RRF merges both rankings.
"""
def __init__(self, db_path: Optional[str] = None):
if db_path is None:
db_path = str(Path.home() / ".hermes" / "palace" / "sovereign.db")
self._db_path = db_path
Path(db_path).parent.mkdir(parents=True, exist_ok=True)
self._conn = sqlite3.connect(db_path)
self._conn.row_factory = sqlite3.Row
self._conn.executescript(_SCHEMA)
def close(self):
self._conn.close()
# ------------------------------------------------------------------
# Store
# ------------------------------------------------------------------
def store(
self,
content: str,
room: str = "general",
category: str = "",
trust: float = 0.5,
) -> int:
"""Store a fact in durable memory. Returns the memory_id."""
now = time.time()
vec = encode_text(content)
blob = serialize_vector(vec)
cur = self._conn.execute(
"""INSERT INTO memories (content, room, category, trust_score,
created_at, updated_at, hrr_vector)
VALUES (?, ?, ?, ?, ?, ?, ?)""",
(content, room, category, trust, now, now, blob),
)
self._conn.commit()
return cur.lastrowid
def store_batch(self, items: list[dict]) -> list[int]:
"""Store multiple facts. Each item: {content, room?, category?, trust?}."""
ids = []
now = time.time()
for item in items:
content = item["content"]
vec = encode_text(content)
blob = serialize_vector(vec)
cur = self._conn.execute(
"""INSERT INTO memories (content, room, category, trust_score,
created_at, updated_at, hrr_vector)
VALUES (?, ?, ?, ?, ?, ?, ?)""",
(
content,
item.get("room", "general"),
item.get("category", ""),
item.get("trust", 0.5),
now, now, blob,
),
)
ids.append(cur.lastrowid)
self._conn.commit()
return ids
# ------------------------------------------------------------------
# Search — hybrid FTS5 + HRR with Reciprocal Rank Fusion
# ------------------------------------------------------------------
def search(
self,
query: str,
room: Optional[str] = None,
limit: int = 10,
min_trust: float = 0.0,
fts_weight: float = 0.5,
hrr_weight: float = 0.5,
) -> list[dict]:
"""Hybrid search: FTS5 keywords + HRR semantic similarity.
Uses Reciprocal Rank Fusion (RRF) to merge both rankings.
Returns list of dicts with content, room, score, trust_score.
"""
k_rrf = 60 # Standard RRF constant
# Stage 1: FTS5 candidates
fts_results = self._fts_search(query, room, min_trust, limit * 3)
# Stage 2: HRR candidates (scan top N by trust)
hrr_results = self._hrr_search(query, room, min_trust, limit * 3)
# Stage 3: RRF fusion
scores: dict[int, float] = {}
meta: dict[int, dict] = {}
for rank, row in enumerate(fts_results):
mid = row["memory_id"]
scores[mid] = scores.get(mid, 0) + fts_weight / (k_rrf + rank + 1)
meta[mid] = dict(row)
for rank, row in enumerate(hrr_results):
mid = row["memory_id"]
scores[mid] = scores.get(mid, 0) + hrr_weight / (k_rrf + rank + 1)
if mid not in meta:
meta[mid] = dict(row)
# Sort by fused score
ranked = sorted(scores.items(), key=lambda x: x[1], reverse=True)[:limit]
results = []
for mid, score in ranked:
m = meta[mid]
# Bump retrieval count
self._conn.execute(
"UPDATE memories SET retrieval_count = retrieval_count + 1 WHERE memory_id = ?",
(mid,),
)
results.append({
"memory_id": mid,
"content": m["content"],
"room": m["room"],
"category": m.get("category", ""),
"trust_score": m["trust_score"],
"score": round(score, 6),
})
if results:
self._conn.commit()
return results
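The fusion step in `search()` above can be isolated: each ranking contributes `weight / (k + rank + 1)` to a memory's fused score. A standalone sketch of that Reciprocal Rank Fusion arithmetic (hypothetical memory IDs):

```python
def rrf_fuse(fts_ids, hrr_ids, k=60, fts_weight=0.5, hrr_weight=0.5):
    # Reciprocal Rank Fusion: sum weight / (k + rank + 1) per ranking,
    # then sort by fused score — same math as SovereignStore.search().
    scores = {}
    for rank, mid in enumerate(fts_ids):
        scores[mid] = scores.get(mid, 0.0) + fts_weight / (k + rank + 1)
    for rank, mid in enumerate(hrr_ids):
        scores[mid] = scores.get(mid, 0.0) + hrr_weight / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Memory 7 ranks second in FTS and first in HRR; appearing high in
# both lists lifts it above memory 3, which tops only the FTS ranking.
fused = rrf_fuse(fts_ids=[3, 7, 9], hrr_ids=[7, 5, 3])
print(fused[0])  # -> 7
```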
def _fts_search(
self, query: str, room: Optional[str], min_trust: float, limit: int
) -> list[dict]:
"""FTS5 full-text search."""
try:
if room:
rows = self._conn.execute(
"""SELECT m.memory_id, m.content, m.room, m.category,
m.trust_score, m.retrieval_count
FROM memories_fts f
JOIN memories m ON f.rowid = m.memory_id
WHERE memories_fts MATCH ? AND m.room = ?
AND m.trust_score >= ?
ORDER BY rank LIMIT ?""",
(query, room, min_trust, limit),
).fetchall()
else:
rows = self._conn.execute(
"""SELECT m.memory_id, m.content, m.room, m.category,
m.trust_score, m.retrieval_count
FROM memories_fts f
JOIN memories m ON f.rowid = m.memory_id
WHERE memories_fts MATCH ?
AND m.trust_score >= ?
ORDER BY rank LIMIT ?""",
(query, min_trust, limit),
).fetchall()
return [dict(r) for r in rows]
except sqlite3.OperationalError:
# Bad FTS query syntax — degrade gracefully
return []
def _hrr_search(
self, query: str, room: Optional[str], min_trust: float, limit: int
) -> list[dict]:
"""HRR cosine similarity search (brute-force scan, fast for <100K facts)."""
query_vec = encode_text(query)
if room:
rows = self._conn.execute(
"""SELECT memory_id, content, room, category, trust_score,
retrieval_count, hrr_vector
FROM memories
WHERE room = ? AND trust_score >= ? AND hrr_vector IS NOT NULL""",
(room, min_trust),
).fetchall()
else:
rows = self._conn.execute(
"""SELECT memory_id, content, room, category, trust_score,
retrieval_count, hrr_vector
FROM memories
WHERE trust_score >= ? AND hrr_vector IS NOT NULL""",
(min_trust,),
).fetchall()
scored = []
for r in rows:
stored_vec = deserialize_vector(r["hrr_vector"])
sim = cosine_similarity_phase(query_vec, stored_vec)
scored.append((sim, dict(r)))
scored.sort(key=lambda x: x[0], reverse=True)
return [item[1] for item in scored[:limit]]
# ------------------------------------------------------------------
# Trust management
# ------------------------------------------------------------------
def boost_trust(self, memory_id: int, delta: float = 0.05) -> None:
"""Increase trust score when a memory proves useful."""
self._conn.execute(
"""UPDATE memories SET trust_score = MIN(1.0, trust_score + ?),
updated_at = ? WHERE memory_id = ?""",
(delta, time.time(), memory_id),
)
self._conn.commit()
def decay_trust(self, memory_id: int, delta: float = 0.02) -> None:
"""Decrease trust score when a memory is contradicted."""
self._conn.execute(
"""UPDATE memories SET trust_score = MAX(0.0, trust_score - ?),
updated_at = ? WHERE memory_id = ?""",
(delta, time.time(), memory_id),
)
self._conn.commit()
# ------------------------------------------------------------------
# Room operations
# ------------------------------------------------------------------
def list_rooms(self) -> list[dict]:
"""List all rooms with fact counts."""
rows = self._conn.execute(
"""SELECT room, COUNT(*) as count,
AVG(trust_score) as avg_trust
FROM memories GROUP BY room ORDER BY count DESC"""
).fetchall()
return [dict(r) for r in rows]
def room_contents(self, room: str, limit: int = 50) -> list[dict]:
"""Get all facts in a room, ordered by trust."""
rows = self._conn.execute(
"""SELECT memory_id, content, category, trust_score,
retrieval_count, created_at
FROM memories WHERE room = ?
ORDER BY trust_score DESC, created_at DESC LIMIT ?""",
(room, limit),
).fetchall()
return [dict(r) for r in rows]
# ------------------------------------------------------------------
# Stats
# ------------------------------------------------------------------
def stats(self) -> dict:
"""Return store statistics."""
row = self._conn.execute(
"""SELECT COUNT(*) as total,
AVG(trust_score) as avg_trust,
SUM(retrieval_count) as total_retrievals,
COUNT(DISTINCT room) as room_count
FROM memories"""
).fetchone()
return dict(row)
# ------------------------------------------------------------------
# Promotion support (scratchpad → durable)
# ------------------------------------------------------------------
def log_promotion(
self,
session_id: str,
scratch_key: str,
memory_id: int,
reason: str = "",
) -> None:
"""Record a scratchpad-to-palace promotion in the audit log."""
self._conn.execute(
"""INSERT INTO promotion_log
(session_id, scratch_key, memory_id, promoted_at, reason)
VALUES (?, ?, ?, ?, ?)""",
(session_id, scratch_key, memory_id, time.time(), reason),
)
self._conn.commit()
def recent_promotions(self, limit: int = 20) -> list[dict]:
"""Get recent promotion log entries."""
rows = self._conn.execute(
"""SELECT p.*, m.content, m.room
FROM promotion_log p
LEFT JOIN memories m ON p.memory_id = m.memory_id
ORDER BY p.promoted_at DESC LIMIT ?""",
(limit,),
).fetchall()
return [dict(r) for r in rows]
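# ---------------------------------------------------------------------------
# Illustrative aside (not part of the store API): boost_trust/decay_trust
# above rely on SQLite's *scalar* MIN()/MAX() functions to clamp trust into
# [0.0, 1.0] inside the UPDATE itself. A minimal self-contained sketch of
# that clamping pattern:
def _demo_trust_clamp() -> float:
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE m (trust REAL)")
    conn.execute("INSERT INTO m VALUES (0.97)")
    # Boost by 0.05 but clamp at 1.0 — mirrors boost_trust above
    conn.execute("UPDATE m SET trust = MIN(1.0, trust + 0.05)")
    return conn.execute("SELECT trust FROM m").fetchone()[0]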


@@ -0,0 +1,180 @@
"""Tests for the mempalace skill.
Validates PalaceRoom, Mempalace class, factory constructors,
and the analyse_issues entry-point.
Refs: Epic #367, Sub-issue #368
"""
from __future__ import annotations
import json
import sys
import os
import time
import pytest
# Ensure the package is importable from the repo layout
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
from mempalace.mempalace import Mempalace, PalaceRoom, analyse_issues
# ── PalaceRoom unit tests ─────────────────────────────────────────────────
class TestPalaceRoom:
def test_store_and_retrieve(self):
room = PalaceRoom(name="test", label="Test Room")
room.store("key1", 42)
assert room.retrieve("key1") == 42
def test_retrieve_default(self):
room = PalaceRoom(name="test", label="Test Room")
assert room.retrieve("missing") is None
assert room.retrieve("missing", "fallback") == "fallback"
def test_summary_format(self):
room = PalaceRoom(name="test", label="Test Room")
room.store("repos", 5)
summary = room.summary()
assert "## Test Room" in summary
assert "repos: 5" in summary
def test_contents_default_factory_isolation(self):
"""Each room gets its own dict — no shared mutable default."""
r1 = PalaceRoom(name="a", label="A")
r2 = PalaceRoom(name="b", label="B")
r1.store("x", 1)
assert r2.retrieve("x") is None
def test_entered_at_is_recent(self):
before = time.time()
room = PalaceRoom(name="t", label="T")
after = time.time()
assert before <= room.entered_at <= after
# ── Mempalace core tests ──────────────────────────────────────────────────
class TestMempalace:
def test_add_and_enter_room(self):
p = Mempalace(domain="test")
p.add_room("r1", "Room 1")
room = p.enter("r1")
assert room.name == "r1"
def test_enter_nonexistent_room_raises(self):
p = Mempalace()
with pytest.raises(KeyError, match="No room"):
p.enter("ghost")
def test_store_without_enter_raises(self):
p = Mempalace()
p.add_room("r", "R")
with pytest.raises(RuntimeError, match="Enter a room"):
p.store("k", "v")
def test_store_and_retrieve_via_palace(self):
p = Mempalace()
p.add_room("r", "R")
p.enter("r")
p.store("count", 10)
assert p.retrieve("r", "count") == 10
def test_retrieve_missing_room_returns_default(self):
p = Mempalace()
assert p.retrieve("nope", "key") is None
assert p.retrieve("nope", "key", 99) == 99
def test_render_includes_domain(self):
p = Mempalace(domain="audit")
p.add_room("r", "Room")
p.enter("r")
p.store("item", "value")
output = p.render()
assert "audit" in output
assert "Room" in output
def test_to_dict_structure(self):
p = Mempalace(domain="test")
p.add_room("r", "R")
p.enter("r")
p.store("a", 1)
d = p.to_dict()
assert d["domain"] == "test"
assert "elapsed_seconds" in d
assert d["rooms"]["r"] == {"a": 1}
def test_to_json_is_valid(self):
p = Mempalace(domain="j")
p.add_room("x", "X")
p.enter("x")
p.store("v", [1, 2, 3])
parsed = json.loads(p.to_json())
assert parsed["rooms"]["x"]["v"] == [1, 2, 3]
# ── Factory constructor tests ─────────────────────────────────────────────
class TestFactories:
def test_for_issue_analysis_rooms(self):
p = Mempalace.for_issue_analysis()
assert p.domain == "issue_analysis"
for key in ("repo_architecture", "assignment_status",
"triage_priority", "resolution_patterns"):
p.enter(key) # should not raise
def test_for_health_check_rooms(self):
p = Mempalace.for_health_check()
assert p.domain == "health_check"
for key in ("service_topology", "failure_signals", "recovery_history"):
p.enter(key)
def test_for_code_review_rooms(self):
p = Mempalace.for_code_review()
assert p.domain == "code_review"
for key in ("change_scope", "risk_surface",
"test_coverage", "reviewer_context"):
p.enter(key)
# ── analyse_issues entry-point tests ──────────────────────────────────────
class TestAnalyseIssues:
SAMPLE_DATA = [
{"repo": "the-nexus", "open_issues": 40, "assigned": 30, "unassigned": 10},
{"repo": "timmy-home", "open_issues": 30, "assigned": 25, "unassigned": 5},
{"repo": "hermes-agent", "open_issues": 20, "assigned": 15, "unassigned": 5},
{"repo": "empty-repo", "open_issues": 0, "assigned": 0, "unassigned": 0},
]
def test_returns_string(self):
result = analyse_issues(self.SAMPLE_DATA)
assert isinstance(result, str)
assert len(result) > 0
def test_contains_room_headers(self):
result = analyse_issues(self.SAMPLE_DATA)
assert "Repository Architecture" in result
assert "Assignment Status" in result
def test_coverage_below_target(self):
result = analyse_issues(self.SAMPLE_DATA, target_assignee_rate=0.90)
assert "BELOW TARGET" in result
def test_coverage_meets_target(self):
good_data = [
{"repo": "a", "open_issues": 10, "assigned": 10, "unassigned": 0},
]
result = analyse_issues(good_data, target_assignee_rate=0.80)
assert "OK" in result
def test_empty_repos_list(self):
result = analyse_issues([])
assert isinstance(result, str)
def test_single_repo(self):
data = [{"repo": "solo", "open_issues": 5, "assigned": 3, "unassigned": 2}]
result = analyse_issues(data)
assert "solo" in result or "issue_analysis" in result
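# ── Illustrative aside ────────────────────────────────────────────────────
# The isolation checked in test_contents_default_factory_isolation is the
# behaviour of dataclasses.field(default_factory=dict) — an assumption about
# how PalaceRoom is implemented. A standalone reproduction of the pattern:
def test_default_factory_pattern_sketch():
    from dataclasses import dataclass, field

    @dataclass
    class _Room:
        name: str
        contents: dict = field(default_factory=dict)  # fresh dict per instance

    a, b = _Room("a"), _Room("b")
    a.contents["x"] = 1
    assert "x" not in b.contents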


@@ -0,0 +1,143 @@
"""Tests for retrieval_enforcer.py.
Refs: Epic #367, Sub-issue #369
"""
from __future__ import annotations
import json
import os
import sys
import tempfile
from pathlib import Path
from unittest.mock import patch, MagicMock
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
from mempalace.retrieval_enforcer import (
is_recall_query,
load_identity,
load_scratchpad,
enforce_retrieval_order,
search_skills,
RECALL_PATTERNS,
)
class TestRecallDetection:
"""Test the recall-query pattern matcher."""
@pytest.mark.parametrize("query", [
"what did we work on yesterday",
"status of the mempalace integration",
"remember the fleet audit results",
"last time we deployed the nexus",
"previously you mentioned a CI fix",
"we discussed the sovereign deployment",
])
def test_recall_queries_detected(self, query):
assert is_recall_query(query) is True
@pytest.mark.parametrize("query", [
"create a new file called test.py",
"run the test suite",
"deploy to production",
"write a function that sums numbers",
"install the package",
])
def test_non_recall_queries_skipped(self, query):
assert is_recall_query(query) is False
class TestLoadIdentity:
def test_loads_existing_identity(self, tmp_path):
identity_file = tmp_path / "identity.txt"
identity_file.write_text("I am Timmy. A sovereign AI.")
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file):
result = load_identity()
assert "Timmy" in result
def test_returns_empty_on_missing_file(self, tmp_path):
identity_file = tmp_path / "nonexistent.txt"
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file):
result = load_identity()
assert result == ""
def test_truncates_long_identity(self, tmp_path):
identity_file = tmp_path / "identity.txt"
identity_file.write_text(" ".join(["word"] * 300))
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file):
result = load_identity()
assert result.endswith("...")
assert len(result.split()) <= 201 # 200 words + "..."
class TestLoadScratchpad:
def test_loads_valid_scratchpad(self, tmp_path):
scratch_file = tmp_path / "session123.json"
scratch_file.write_text(json.dumps({"note": "test value", "key2": 42}))
with patch("mempalace.retrieval_enforcer.SCRATCHPAD_DIR", tmp_path):
result = load_scratchpad("session123")
assert "note: test value" in result
assert "key2: 42" in result
def test_returns_empty_on_missing_file(self, tmp_path):
with patch("mempalace.retrieval_enforcer.SCRATCHPAD_DIR", tmp_path):
result = load_scratchpad("nonexistent")
assert result == ""
def test_returns_empty_on_invalid_json(self, tmp_path):
scratch_file = tmp_path / "bad.json"
scratch_file.write_text("not valid json{{{")
with patch("mempalace.retrieval_enforcer.SCRATCHPAD_DIR", tmp_path):
result = load_scratchpad("bad")
assert result == ""
class TestEnforceRetrievalOrder:
def test_skips_non_recall_query(self):
result = enforce_retrieval_order("create a new file")
assert result["retrieved_from"] is None
assert result["tokens"] == 0
def test_runs_for_recall_query(self, tmp_path):
identity_file = tmp_path / "identity.txt"
identity_file.write_text("I am Timmy.")
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file), \
patch("mempalace.retrieval_enforcer.search_palace", return_value=""), \
patch("mempalace.retrieval_enforcer.search_gitea", return_value=""), \
patch("mempalace.retrieval_enforcer.search_skills", return_value=""):
result = enforce_retrieval_order("what did we work on yesterday")
assert "Identity" in result["context"]
assert "L0" in result["layers_checked"]
def test_palace_hit_sets_l1(self, tmp_path):
identity_file = tmp_path / "identity.txt"
identity_file.write_text("I am Timmy.")
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file), \
patch("mempalace.retrieval_enforcer.search_palace", return_value="Found: fleet audit results"), \
patch("mempalace.retrieval_enforcer.search_gitea", return_value=""):
result = enforce_retrieval_order("what did we discuss yesterday")
assert result["retrieved_from"] == "L1"
assert "Palace Memory" in result["context"]
def test_falls_through_to_l5(self, tmp_path):
identity_file = tmp_path / "nonexistent.txt"
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file), \
patch("mempalace.retrieval_enforcer.search_palace", return_value=""), \
patch("mempalace.retrieval_enforcer.search_gitea", return_value=""), \
patch("mempalace.retrieval_enforcer.search_skills", return_value=""):
result = enforce_retrieval_order("remember the old deployment", skip_if_not_recall=True)
assert result["retrieved_from"] == "L5"
def test_force_mode_skips_recall_check(self, tmp_path):
identity_file = tmp_path / "identity.txt"
identity_file.write_text("I am Timmy.")
with patch("mempalace.retrieval_enforcer.IDENTITY_PATH", identity_file), \
patch("mempalace.retrieval_enforcer.search_palace", return_value=""), \
patch("mempalace.retrieval_enforcer.search_gitea", return_value=""), \
patch("mempalace.retrieval_enforcer.search_skills", return_value=""):
result = enforce_retrieval_order("deploy now", skip_if_not_recall=False)
assert "Identity" in result["context"]


@@ -0,0 +1,108 @@
"""Tests for scratchpad.py.
Refs: Epic #367, Sub-issue #372
"""
from __future__ import annotations
import json
import os
import sys
from pathlib import Path
from unittest.mock import patch
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
from mempalace.scratchpad import (
write_scratch,
read_scratch,
delete_scratch,
list_sessions,
clear_session,
_scratch_path,
)
@pytest.fixture
def scratch_dir(tmp_path):
"""Provide a temporary scratchpad directory."""
with patch("mempalace.scratchpad.SCRATCHPAD_DIR", tmp_path):
yield tmp_path
class TestScratchPath:
def test_sanitizes_session_id(self):
path = _scratch_path("safe-id_123")
assert "safe-id_123.json" in str(path)
def test_strips_dangerous_chars(self):
path = _scratch_path("../../etc/passwd")
assert ".." not in path.name
assert "/" not in path.name
# Dots are stripped, so only alphanumeric chars remain
assert path.name == "etcpasswd.json"
class TestWriteAndRead:
def test_write_then_read(self, scratch_dir):
write_scratch("sess1", "note", "hello world")
result = read_scratch("sess1", "note")
assert "note" in result
assert result["note"]["value"] == "hello world"
def test_read_all_keys(self, scratch_dir):
write_scratch("sess1", "a", 1)
write_scratch("sess1", "b", 2)
result = read_scratch("sess1")
assert "a" in result
assert "b" in result
def test_read_missing_key(self, scratch_dir):
write_scratch("sess1", "exists", "yes")
result = read_scratch("sess1", "missing")
assert result == {}
def test_read_missing_session(self, scratch_dir):
result = read_scratch("nonexistent")
assert result == {}
def test_overwrite_key(self, scratch_dir):
write_scratch("sess1", "key", "v1")
write_scratch("sess1", "key", "v2")
result = read_scratch("sess1", "key")
assert result["key"]["value"] == "v2"
class TestDelete:
def test_delete_existing_key(self, scratch_dir):
write_scratch("sess1", "key", "val")
assert delete_scratch("sess1", "key") is True
assert read_scratch("sess1", "key") == {}
def test_delete_missing_key(self, scratch_dir):
write_scratch("sess1", "other", "val")
assert delete_scratch("sess1", "missing") is False
class TestListSessions:
def test_lists_sessions(self, scratch_dir):
write_scratch("alpha", "k", "v")
write_scratch("beta", "k", "v")
sessions = list_sessions()
assert "alpha" in sessions
assert "beta" in sessions
def test_empty_directory(self, scratch_dir):
assert list_sessions() == []
class TestClearSession:
def test_clears_existing(self, scratch_dir):
write_scratch("sess1", "k", "v")
assert clear_session("sess1") is True
assert read_scratch("sess1") == {}
def test_clear_nonexistent(self, scratch_dir):
assert clear_session("ghost") is False
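# ── Illustrative aside ────────────────────────────────────────────────────
# The behaviour exercised in TestScratchPath is consistent with a
# keep-only-safe-characters sanitizer like the sketch below (an assumption —
# the real _scratch_path may be implemented differently).
def test_sanitizer_sketch():
    import re

    def _sanitize(session_id: str) -> str:
        # Drop anything outside [A-Za-z0-9_-], then append the extension
        return re.sub(r"[^A-Za-z0-9_-]", "", session_id) + ".json"

    assert _sanitize("../../etc/passwd") == "etcpasswd.json"
    assert _sanitize("safe-id_123") == "safe-id_123.json"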


@@ -0,0 +1,255 @@
"""Tests for the Sovereign Memory Store and Promotion system.
Zero-API, zero-network — everything runs against an in-memory SQLite DB.
"""
import os
import sys
import tempfile
import time
import unittest
# Allow imports from parent package
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
from sovereign_store import (
SovereignStore,
encode_text,
cosine_similarity_phase,
serialize_vector,
deserialize_vector,
)
from promotion import (
evaluate_for_promotion,
promote,
promote_session_batch,
)
class TestHRRVectors(unittest.TestCase):
"""Test the HRR encoding and similarity functions."""
def test_deterministic_encoding(self):
"""Same text always produces the same vector."""
v1 = encode_text("hello world")
v2 = encode_text("hello world")
self.assertAlmostEqual(cosine_similarity_phase(v1, v2), 1.0, places=5)
def test_similar_texts_higher_similarity(self):
"""Related texts should be more similar than unrelated ones."""
v_agent = encode_text("agent memory palace retrieval")
v_similar = encode_text("agent recall memory search")
v_unrelated = encode_text("banana strawberry fruit smoothie")
sim_related = cosine_similarity_phase(v_agent, v_similar)
sim_unrelated = cosine_similarity_phase(v_agent, v_unrelated)
self.assertGreater(sim_related, sim_unrelated)
def test_serialize_roundtrip(self):
"""Vectors survive serialization to/from bytes."""
vec = encode_text("test serialization")
blob = serialize_vector(vec)
restored = deserialize_vector(blob)
sim = cosine_similarity_phase(vec, restored)
self.assertAlmostEqual(sim, 1.0, places=5)
def test_empty_text(self):
"""Empty text gets a fallback encoding."""
vec = encode_text("")
self.assertEqual(len(list(vec)), 512)
class TestSovereignStore(unittest.TestCase):
"""Test the SQLite-backed sovereign store."""
def setUp(self):
self.db_path = os.path.join(tempfile.mkdtemp(), "test.db")
self.store = SovereignStore(db_path=self.db_path)
def tearDown(self):
self.store.close()
if os.path.exists(self.db_path):
os.remove(self.db_path)
def test_store_and_retrieve(self):
"""Store a fact and find it via search."""
mid = self.store.store("Timmy is a sovereign AI agent on Hermes VPS", room="identity")
results = self.store.search("sovereign agent", room="identity")
self.assertTrue(any(r["memory_id"] == mid for r in results))
def test_fts_search(self):
"""FTS5 keyword search works."""
self.store.store("The beacon game uses paperclips mechanics", room="projects")
self.store.store("Fleet agents handle delegation and dispatch", room="fleet")
results = self.store.search("paperclips")
self.assertTrue(len(results) > 0)
self.assertIn("paperclips", results[0]["content"].lower())
def test_hrr_search_semantic(self):
"""HRR similarity finds related content even without exact keywords."""
self.store.store("Memory palace rooms organize facts spatially", room="memory")
self.store.store("Pizza delivery service runs on weekends", room="unrelated")
results = self.store.search("organize knowledge rooms", room="memory")
self.assertTrue(len(results) > 0)
self.assertIn("palace", results[0]["content"].lower())
def test_room_filtering(self):
"""Room filter restricts search scope."""
self.store.store("Hermes harness manages tool calls", room="infrastructure")
self.store.store("Hermes mythology Greek god", room="lore")
results = self.store.search("Hermes", room="infrastructure")
self.assertTrue(all(r["room"] == "infrastructure" for r in results))
def test_trust_boost(self):
"""Trust score increases when boosted."""
mid = self.store.store("fact", trust=0.5)
self.store.boost_trust(mid, delta=0.1)
results = self.store.room_contents("general")
fact = next(r for r in results if r["memory_id"] == mid)
self.assertAlmostEqual(fact["trust_score"], 0.6, places=2)
def test_trust_decay(self):
"""Trust score decreases when decayed."""
mid = self.store.store("questionable fact", trust=0.5)
self.store.decay_trust(mid, delta=0.2)
results = self.store.room_contents("general")
fact = next(r for r in results if r["memory_id"] == mid)
self.assertAlmostEqual(fact["trust_score"], 0.3, places=2)
def test_batch_store(self):
"""Batch store works."""
ids = self.store.store_batch([
{"content": "fact one", "room": "test"},
{"content": "fact two", "room": "test"},
{"content": "fact three", "room": "test"},
])
self.assertEqual(len(ids), 3)
rooms = self.store.list_rooms()
test_room = next(r for r in rooms if r["room"] == "test")
self.assertEqual(test_room["count"], 3)
def test_stats(self):
"""Stats returns correct counts."""
self.store.store("a fact", room="r1")
self.store.store("another fact", room="r2")
s = self.store.stats()
self.assertEqual(s["total"], 2)
self.assertEqual(s["room_count"], 2)
def test_retrieval_count_increments(self):
"""Retrieval count goes up when a fact is found via search."""
self.store.store("unique searchable content xyz123", room="test")
self.store.search("xyz123")
results = self.store.room_contents("test")
self.assertTrue(any(r["retrieval_count"] > 0 for r in results))
class TestPromotion(unittest.TestCase):
"""Test the quality-gated promotion system."""
def setUp(self):
self.db_path = os.path.join(tempfile.mkdtemp(), "promo_test.db")
self.store = SovereignStore(db_path=self.db_path)
def tearDown(self):
self.store.close()
def test_successful_promotion(self):
"""Good content passes all gates."""
result = promote(
content="Timmy runs on the Hermes VPS at 143.198.27.163 with local Ollama inference",
store=self.store,
session_id="test-session-001",
scratch_key="vps_info",
room="infrastructure",
)
self.assertTrue(result.success)
self.assertIsNotNone(result.memory_id)
def test_reject_too_short(self):
"""Short fragments get rejected."""
result = promote(
content="yes",
store=self.store,
session_id="test",
scratch_key="short",
)
self.assertFalse(result.success)
self.assertIn("Too short", result.reason)
def test_reject_duplicate(self):
"""Duplicate content gets rejected."""
self.store.store("SOUL.md is the canonical identity document for Timmy", room="identity")
result = promote(
content="SOUL.md is the canonical identity document for Timmy",
store=self.store,
session_id="test",
scratch_key="soul",
room="identity",
)
self.assertFalse(result.success)
self.assertIn("uplicate", result.reason)  # matches "Duplicate" or "duplicate"
def test_reject_stale(self):
"""Old notes get flagged as stale."""
old_time = time.time() - (86400 * 10)
result = promote(
content="This is a note from long ago about something important",
store=self.store,
session_id="test",
scratch_key="old",
written_at=old_time,
)
self.assertFalse(result.success)
self.assertIn("Stale", result.reason)
def test_force_bypasses_gates(self):
"""Force flag overrides quality gates."""
result = promote(
content="ok",
store=self.store,
session_id="test",
scratch_key="forced",
force=True,
)
self.assertTrue(result.success)
def test_evaluate_dry_run(self):
"""Evaluate returns gate details without promoting."""
eval_result = evaluate_for_promotion(
content="The fleet uses kimi-k2.5 as the primary model for all agent operations",
store=self.store,
room="fleet",
)
self.assertTrue(eval_result["eligible"])
self.assertTrue(all(p for p, _ in eval_result["gates"].values()))
def test_batch_promotion(self):
"""Batch promotion processes all notes."""
notes = {
"infra": {"value": "Hermes VPS runs Ubuntu 22.04 with 2 vCPUs and 4GB RAM", "written_at": time.time()},
"short": {"value": "no", "written_at": time.time()},
"model": {"value": "The primary local model is gemma4:latest running on Ollama", "written_at": time.time()},
}
results = promote_session_batch(self.store, "batch-session", notes, room="config")
promoted = [r for r in results if r.success]
rejected = [r for r in results if not r.success]
self.assertEqual(len(promoted), 2)
self.assertEqual(len(rejected), 1)
def test_promotion_logged(self):
"""Successful promotions appear in the audit log."""
promote(
content="Forge is hosted at forge.alexanderwhitestone.com running Gitea",
store=self.store,
session_id="log-test",
scratch_key="forge",
room="infrastructure",
)
log = self.store.recent_promotions()
self.assertTrue(len(log) > 0)
self.assertEqual(log[0]["session_id"], "log-test")
self.assertEqual(log[0]["scratch_key"], "forge")
if __name__ == "__main__":
unittest.main()


@@ -0,0 +1,100 @@
"""Tests for wakeup.py.
Refs: Epic #367, Sub-issue #372
"""
from __future__ import annotations
import json
import os
import sys
import time
from pathlib import Path
from unittest.mock import patch, MagicMock
import pytest
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", ".."))
from mempalace.wakeup import (
palace_wakeup,
fleet_status_summary,
_load_identity,
_palace_context,
)
class TestLoadIdentity:
def test_loads_identity(self, tmp_path):
f = tmp_path / "identity.txt"
f.write_text("I am Timmy. A sovereign AI.")
with patch("mempalace.wakeup.IDENTITY_PATH", f):
result = _load_identity()
assert "Timmy" in result
def test_missing_identity(self, tmp_path):
f = tmp_path / "nope.txt"
with patch("mempalace.wakeup.IDENTITY_PATH", f):
assert _load_identity() == ""
class TestFleetStatus:
def test_reads_fleet_json(self, tmp_path):
f = tmp_path / "fleet_status.json"
f.write_text(json.dumps({
"Groq": {"state": "active", "last_seen": "2026-04-07"},
"Ezra": {"state": "idle", "last_seen": "2026-04-06"},
}))
with patch("mempalace.wakeup.FLEET_STATUS_PATH", f):
result = fleet_status_summary()
assert "Fleet Status" in result
assert "Groq" in result
assert "active" in result
def test_missing_fleet_file(self, tmp_path):
f = tmp_path / "nope.json"
with patch("mempalace.wakeup.FLEET_STATUS_PATH", f):
assert fleet_status_summary() == ""
def test_invalid_json(self, tmp_path):
f = tmp_path / "bad.json"
f.write_text("not json")
with patch("mempalace.wakeup.FLEET_STATUS_PATH", f):
assert fleet_status_summary() == ""
class TestPalaceWakeup:
def test_generates_context_with_identity(self, tmp_path):
identity = tmp_path / "identity.txt"
identity.write_text("I am Timmy.")
cache = tmp_path / "cache.txt"
with patch("mempalace.wakeup.IDENTITY_PATH", identity), \
patch("mempalace.wakeup.WAKEUP_CACHE_PATH", cache), \
patch("mempalace.wakeup._palace_context", return_value=""), \
patch("mempalace.wakeup.fleet_status_summary", return_value=""):
result = palace_wakeup(force=True)
assert "Identity" in result
assert "Timmy" in result
assert "Session" in result
def test_uses_cache_when_fresh(self, tmp_path):
cache = tmp_path / "cache.txt"
cache.write_text("cached wake-up content")
# write_text above set a fresh mtime, so the cache is considered fresh
with patch("mempalace.wakeup.WAKEUP_CACHE_PATH", cache), \
patch("mempalace.wakeup.WAKEUP_CACHE_TTL", 9999):
result = palace_wakeup(force=False)
assert result == "cached wake-up content"
def test_force_bypasses_cache(self, tmp_path):
cache = tmp_path / "cache.txt"
cache.write_text("stale content")
identity = tmp_path / "identity.txt"
identity.write_text("I am Timmy.")
with patch("mempalace.wakeup.WAKEUP_CACHE_PATH", cache), \
patch("mempalace.wakeup.IDENTITY_PATH", identity), \
patch("mempalace.wakeup._palace_context", return_value=""), \
patch("mempalace.wakeup.fleet_status_summary", return_value=""):
result = palace_wakeup(force=True)
assert "Identity" in result
assert "stale content" not in result


@@ -0,0 +1,161 @@
"""Wake-up Protocol — session start context injection.
Generates 300-900 tokens of context when a new Hermes session starts.
Loads identity, recent palace context, and fleet status.
Refs: Epic #367, Sub-issue #372
"""
from __future__ import annotations
import json
import os
import subprocess
import time
from pathlib import Path
# ---------------------------------------------------------------------------
# Constants
# ---------------------------------------------------------------------------
IDENTITY_PATH = Path.home() / ".mempalace" / "identity.txt"
MEMPALACE_BIN = "/Library/Frameworks/Python.framework/Versions/3.12/bin/mempalace"
FLEET_STATUS_PATH = Path.home() / ".hermes" / "fleet_status.json"
WAKEUP_CACHE_PATH = Path.home() / ".hermes" / "last_wakeup.txt"
WAKEUP_CACHE_TTL = 300 # 5 minutes — don't regenerate if recent
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _load_identity() -> str:
"""Read the agent identity file."""
try:
if IDENTITY_PATH.exists():
text = IDENTITY_PATH.read_text(encoding="utf-8").strip()
# Cap at ~150 tokens for wake-up brevity
words = text.split()
if len(words) > 150:
text = " ".join(words[:150]) + "..."
return text
except (OSError, PermissionError):
pass
return ""
def _palace_context() -> str:
"""Run mempalace wake-up command for recent context. Degrades gracefully."""
try:
bin_path = MEMPALACE_BIN if os.path.exists(MEMPALACE_BIN) else "mempalace"
result = subprocess.run(
[bin_path, "wake-up"],
capture_output=True,
text=True,
timeout=10,
)
if result.returncode == 0 and result.stdout.strip():
return result.stdout.strip()
except (FileNotFoundError, subprocess.TimeoutExpired, OSError):
# ONNX issues (#373) or CLI not available — degrade gracefully
pass
return ""
def fleet_status_summary() -> str:
"""Read cached fleet status for lightweight session context."""
try:
if FLEET_STATUS_PATH.exists():
data = json.loads(FLEET_STATUS_PATH.read_text(encoding="utf-8"))
lines = ["## Fleet Status"]
if isinstance(data, dict):
for agent, status in data.items():
if isinstance(status, dict):
state = status.get("state", "unknown")
last_seen = status.get("last_seen", "?")
lines.append(f" {agent}: {state} (last: {last_seen})")
else:
lines.append(f" {agent}: {status}")
if len(lines) > 1:
return "\n".join(lines)
except (OSError, json.JSONDecodeError):
pass
return ""
def _check_cache() -> str:
"""Return cached wake-up if fresh enough."""
try:
if WAKEUP_CACHE_PATH.exists():
age = time.time() - WAKEUP_CACHE_PATH.stat().st_mtime
if age < WAKEUP_CACHE_TTL:
return WAKEUP_CACHE_PATH.read_text(encoding="utf-8").strip()
except OSError:
pass
return ""
def _write_cache(content: str) -> None:
"""Cache the wake-up content."""
try:
WAKEUP_CACHE_PATH.parent.mkdir(parents=True, exist_ok=True)
WAKEUP_CACHE_PATH.write_text(content, encoding="utf-8")
except OSError:
pass
# ---------------------------------------------------------------------------
# Main entry point
# ---------------------------------------------------------------------------
def palace_wakeup(force: bool = False) -> str:
"""Generate wake-up context for a new session. ~300-900 tokens.
Args:
force: If True, bypass the 5-minute cache and regenerate.
Returns:
Formatted context string suitable for prepending to the system prompt.
"""
# Check cache first (avoids redundant work on rapid session restarts)
if not force:
cached = _check_cache()
if cached:
return cached
parts = []
# L0: Identity
identity = _load_identity()
if identity:
parts.append(f"## Identity\n{identity}")
# L1: Recent palace context
palace = _palace_context()
if palace:
parts.append(palace)
# Fleet status (lightweight)
fleet = fleet_status_summary()
if fleet:
parts.append(fleet)
# Timestamp
parts.append(f"## Session\nWake-up generated: {time.strftime('%Y-%m-%d %H:%M:%S')}")
content = "\n\n".join(parts)
# Cache for TTL
_write_cache(content)
return content
# ---------------------------------------------------------------------------
# CLI entry point for testing
# ---------------------------------------------------------------------------
if __name__ == "__main__":
print(palace_wakeup(force=True))


@@ -0,0 +1,39 @@
#!/usr/bin/env bash
# orchestrate.sh — Sovereign Orchestrator wrapper
# Sets environment and runs orchestrator.py
#
# Usage:
# ./orchestrate.sh # dry-run (safe default)
# ./orchestrate.sh --once # single live dispatch cycle
# ./orchestrate.sh --daemon # continuous (every 15 min)
# ./orchestrate.sh --dry-run # explicit dry-run
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
HERMES_DIR="${HOME}/.hermes"
# Load Gitea token
if [[ -z "${GITEA_TOKEN:-}" ]]; then
if [[ -f "${HERMES_DIR}/gitea_token_vps" ]]; then
export GITEA_TOKEN="$(cat "${HERMES_DIR}/gitea_token_vps")"
else
echo "[FATAL] No GITEA_TOKEN and ~/.hermes/gitea_token_vps not found" >&2
exit 1
fi
fi
# Load Telegram token
if [[ -z "${TELEGRAM_BOT_TOKEN:-}" ]]; then
if [[ -f "${HOME}/.config/telegram/special_bot" ]]; then
export TELEGRAM_BOT_TOKEN="$(cat "${HOME}/.config/telegram/special_bot")"
fi
fi
# Run preflight checks if available
if [[ -x "${HERMES_DIR}/bin/api-key-preflight.sh" ]]; then
"${HERMES_DIR}/bin/api-key-preflight.sh" 2>/dev/null || true
fi
# Run the orchestrator
exec python3 "${SCRIPT_DIR}/orchestrator.py" "$@"


@@ -0,0 +1,645 @@
#!/usr/bin/env python3
"""
Sovereign Orchestrator v1
Reads the Gitea backlog, scores/prioritizes issues, dispatches to agents.
Usage:
python3 orchestrator.py --once # single dispatch cycle
python3 orchestrator.py --daemon # run every 15 min
python3 orchestrator.py --dry-run # score and report, no dispatch
"""
import json
import os
import sys
import time
import subprocess
import urllib.request
import urllib.error
import urllib.parse
from datetime import datetime, timezone
# ---------------------------------------------------------------------------
# CONFIG
# ---------------------------------------------------------------------------
GITEA_API = "https://forge.alexanderwhitestone.com/api/v1"
GITEA_OWNER = "Timmy_Foundation"
REPOS = ["timmy-config", "the-nexus", "timmy-home"]
TELEGRAM_CHAT_ID = "-1003664764329"
DAEMON_INTERVAL = 900 # 15 minutes
# Tags that mark issues we should never auto-dispatch
FILTER_TAGS = ["[EPIC]", "[DO NOT CLOSE]", "[PERMANENT]", "[PHILOSOPHY]", "[MORNING REPORT]"]
# Known agent usernames on Gitea (for assignee detection)
AGENT_USERNAMES = {"groq", "ezra", "bezalel", "allegro", "timmy", "thetimmyc"}
# ---------------------------------------------------------------------------
# AGENT ROSTER
# ---------------------------------------------------------------------------
AGENTS = {
"groq": {
"type": "loop",
"endpoint": "local",
"strengths": ["code", "bug-fix", "small-changes"],
"repos": ["the-nexus", "hermes-agent", "timmy-config", "timmy-home"],
"max_concurrent": 1,
},
"ezra": {
"type": "gateway",
"endpoint": "http://143.198.27.163:8643/v1/chat/completions",
"ssh": "root@143.198.27.163",
"strengths": ["research", "architecture", "complex", "multi-file"],
"repos": ["timmy-config", "the-nexus", "timmy-home"],
"max_concurrent": 1,
},
"bezalel": {
"type": "gateway",
"endpoint": "http://159.203.146.185:8643/v1/chat/completions",
"ssh": "root@159.203.146.185",
"strengths": ["ci", "infra", "ops", "testing"],
"repos": ["timmy-config", "hermes-agent", "the-nexus"],
"max_concurrent": 1,
},
}
# ---------------------------------------------------------------------------
# CREDENTIALS
# ---------------------------------------------------------------------------
def load_gitea_token():
"""Read Gitea token from env or file."""
token = os.environ.get("GITEA_TOKEN", "")
if token:
return token.strip()
token_path = os.path.expanduser("~/.hermes/gitea_token_vps")
try:
with open(token_path) as f:
return f.read().strip()
except FileNotFoundError:
print(f"[FATAL] No GITEA_TOKEN env and {token_path} not found")
sys.exit(1)
def load_telegram_token():
"""Read Telegram bot token from file."""
path = os.path.expanduser("~/.config/telegram/special_bot")
try:
with open(path) as f:
return f.read().strip()
except FileNotFoundError:
return ""
GITEA_TOKEN = ""
TELEGRAM_TOKEN = ""
# ---------------------------------------------------------------------------
# HTTP HELPERS (stdlib only)
# ---------------------------------------------------------------------------
def gitea_request(path, method="GET", data=None):
"""Make an authenticated Gitea API request."""
url = f"{GITEA_API}{path}"
headers = {
"Authorization": f"token {GITEA_TOKEN}",
"Content-Type": "application/json",
"Accept": "application/json",
}
body = json.dumps(data).encode() if data else None
req = urllib.request.Request(url, data=body, headers=headers, method=method)
try:
with urllib.request.urlopen(req, timeout=30) as resp:
return json.loads(resp.read().decode())
except urllib.error.HTTPError as e:
body_text = e.read().decode() if e.fp else ""
print(f"[API ERROR] {method} {url} -> {e.code}: {body_text[:200]}")
return None
except Exception as e:
print(f"[API ERROR] {method} {url} -> {e}")
return None
def send_telegram(message):
"""Send message to Telegram group."""
if not TELEGRAM_TOKEN:
print("[WARN] No Telegram token, skipping notification")
return False
url = f"https://api.telegram.org/bot{TELEGRAM_TOKEN}/sendMessage"
data = json.dumps({
"chat_id": TELEGRAM_CHAT_ID,
"text": message,
"parse_mode": "Markdown",
"disable_web_page_preview": True,
}).encode()
req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
try:
with urllib.request.urlopen(req, timeout=15) as resp:
return resp.status == 200
except Exception as e:
print(f"[TELEGRAM ERROR] {e}")
return False
# ---------------------------------------------------------------------------
# 1. BACKLOG READER
# ---------------------------------------------------------------------------
def fetch_issues(repo):
"""Fetch all open issues from a repo, handling pagination."""
issues = []
page = 1
while True:
result = gitea_request(
f"/repos/{GITEA_OWNER}/{repo}/issues?state=open&type=issues&limit=50&page={page}"
)
if not result:
break
issues.extend(result)
if len(result) < 50:
break
page += 1
return issues
def should_filter(issue):
    """Check if an issue should be excluded from auto-dispatch."""
    # Never auto-dispatch pull requests
    if issue.get("pull_request"):
        return True
    # Compare with brackets stripped so "[EPIC]" also matches "EPIC:" titles
    title = issue.get("title", "").upper().replace("[", "").replace("]", "")
    return any(tag.strip("[]") in title for tag in FILTER_TAGS)
def read_backlog():
"""Read and filter the full backlog across all repos."""
backlog = []
for repo in REPOS:
print(f" Fetching {repo}...")
issues = fetch_issues(repo)
for issue in issues:
if should_filter(issue):
continue
assignees = [a.get("login", "") for a in (issue.get("assignees") or [])]
labels = [l.get("name", "") for l in (issue.get("labels") or [])]
backlog.append({
"repo": repo,
"number": issue["number"],
"title": issue["title"],
"labels": labels,
"assignees": assignees,
"created_at": issue.get("created_at", ""),
"comments": issue.get("comments", 0),
"url": issue.get("html_url", ""),
})
print(f" Total actionable issues: {len(backlog)}")
return backlog
# ---------------------------------------------------------------------------
# 2. PRIORITY SCORER
# ---------------------------------------------------------------------------
def score_issue(issue):
"""Score an issue 0-100 based on priority signals."""
score = 0
title_upper = issue["title"].upper()
labels_upper = [l.upper() for l in issue["labels"]]
all_text = title_upper + " " + " ".join(labels_upper)
# Critical / Bug: +30
if any(tag in all_text for tag in ["CRITICAL", "BUG"]):
score += 30
# P0 / Urgent: +25
if any(tag in all_text for tag in ["P0", "URGENT"]):
score += 25
# P1: +15
if "P1" in all_text:
score += 15
# OPS / Security: +10
if any(tag in all_text for tag in ["OPS", "SECURITY"]):
score += 10
# Unassigned: +10
if not issue["assignees"]:
score += 10
# Age > 7 days: +5
try:
created = issue["created_at"].replace("Z", "+00:00")
created_dt = datetime.fromisoformat(created)
age_days = (datetime.now(timezone.utc) - created_dt).days
if age_days > 7:
score += 5
except (ValueError, AttributeError):
pass
# Has comments: +5
if issue["comments"] > 0:
score += 5
# Infrastructure repo: +5
if issue["repo"] == "timmy-config":
score += 5
# Already assigned to an agent: -10
if any(a.lower() in AGENT_USERNAMES for a in issue["assignees"]):
score -= 10
issue["score"] = max(0, min(100, score))
return issue
def prioritize_backlog(backlog):
"""Score and sort the backlog by priority."""
scored = [score_issue(i) for i in backlog]
scored.sort(key=lambda x: x["score"], reverse=True)
return scored
# ---------------------------------------------------------------------------
# 3. AGENT HEALTH CHECKS
# ---------------------------------------------------------------------------
def check_process(pattern):
"""Check if a local process matching pattern is running."""
try:
result = subprocess.run(
["pgrep", "-f", pattern],
capture_output=True, text=True, timeout=5
)
return result.returncode == 0
except Exception:
return False
def check_ssh_service(host, service_name):
"""Check if a remote service is running via SSH."""
try:
result = subprocess.run(
["ssh", "-o", "ConnectTimeout=5", "-o", "StrictHostKeyChecking=no",
f"root@{host}",
f"systemctl is-active {service_name} 2>/dev/null || pgrep -f {service_name}"],
capture_output=True, text=True, timeout=15
)
return result.returncode == 0
except Exception:
return False
def check_agent_health(name, agent):
"""Check if an agent is alive and available."""
if agent["type"] == "loop":
alive = check_process(f"agent-loop.*{name}")
elif agent["type"] == "gateway":
host = agent["ssh"].split("@")[1]
service = f"hermes-{name}"
alive = check_ssh_service(host, service)
else:
alive = False
return alive
def get_agent_status():
"""Get health status for all agents."""
status = {}
for name, agent in AGENTS.items():
alive = check_agent_health(name, agent)
status[name] = {
"alive": alive,
"type": agent["type"],
"strengths": agent["strengths"],
}
symbol = "UP" if alive else "DOWN"
print(f" {name}: {symbol} ({agent['type']})")
return status
# ---------------------------------------------------------------------------
# 4. DISPATCHER
# ---------------------------------------------------------------------------
def classify_issue(issue):
"""Classify issue type based on title and labels."""
title = issue["title"].upper()
labels = " ".join(issue["labels"]).upper()
all_text = title + " " + labels
types = []
if any(w in all_text for w in ["BUG", "FIX", "BROKEN", "ERROR", "CRASH"]):
types.append("bug-fix")
if any(w in all_text for w in ["OPS", "DEPLOY", "CI", "INFRA", "PIPELINE", "MONITOR"]):
types.append("ops")
if any(w in all_text for w in ["SECURITY", "AUTH", "TOKEN", "CERT"]):
types.append("ops")
if any(w in all_text for w in ["RESEARCH", "AUDIT", "INVESTIGATE", "EXPLORE"]):
types.append("research")
if any(w in all_text for w in ["ARCHITECT", "DESIGN", "REFACTOR", "REWRITE"]):
types.append("architecture")
if any(w in all_text for w in ["TEST", "TESTING", "QA", "VALIDATE"]):
types.append("testing")
if any(w in all_text for w in ["CODE", "IMPLEMENT", "ADD", "CREATE", "BUILD"]):
types.append("code")
if any(w in all_text for w in ["SMALL", "QUICK", "SIMPLE", "MINOR", "TWEAK"]):
types.append("small-changes")
if any(w in all_text for w in ["COMPLEX", "MULTI", "LARGE", "OVERHAUL"]):
types.append("complex")
if not types:
types = ["code"] # default
return types
def match_agent(issue, agent_status, dispatched_this_cycle):
"""Find the best available agent for an issue."""
issue_types = classify_issue(issue)
candidates = []
for name, agent in AGENTS.items():
# Agent must be alive
if not agent_status.get(name, {}).get("alive", False):
continue
# Agent must handle this repo
if issue["repo"] not in agent["repos"]:
continue
# Agent must not already be dispatched this cycle
if dispatched_this_cycle.get(name, 0) >= agent["max_concurrent"]:
continue
# Score match based on overlapping strengths
overlap = len(set(issue_types) & set(agent["strengths"]))
candidates.append((name, overlap))
if not candidates:
return None
# Sort by overlap score descending, return best match
candidates.sort(key=lambda x: x[1], reverse=True)
return candidates[0][0]
def assign_issue(repo, number, agent_name):
"""Assign an issue to an agent on Gitea."""
# First get current assignees to not clobber
result = gitea_request(f"/repos/{GITEA_OWNER}/{repo}/issues/{number}")
if not result:
return False
current = [a.get("login", "") for a in (result.get("assignees") or [])]
if agent_name in current:
print(f" Already assigned to {agent_name}")
return True
new_assignees = current + [agent_name]
patch_result = gitea_request(
f"/repos/{GITEA_OWNER}/{repo}/issues/{number}",
method="PATCH",
data={"assignees": new_assignees}
)
return patch_result is not None
def dispatch_to_gateway(agent_name, agent, issue):
"""Trigger work on a gateway agent via SSH."""
host = agent["ssh"]
repo = issue["repo"]
number = issue["number"]
title = issue["title"]
# Try to trigger dispatch via SSH
cmd = (
f'ssh -o ConnectTimeout=10 -o StrictHostKeyChecking=no {host} '
f'"echo \'Dispatched by orchestrator: {repo}#{number} - {title}\' '
f'>> /tmp/hermes-dispatch.log"'
)
try:
subprocess.run(cmd, shell=True, timeout=20, capture_output=True)
return True
except Exception as e:
print(f" [WARN] SSH dispatch to {agent_name} failed: {e}")
return False
def dispatch_cycle(backlog, agent_status, dry_run=False):
"""Run one dispatch cycle. Returns dispatch report."""
dispatched = []
skipped = []
dispatched_count = {} # agent_name -> count dispatched this cycle
# Only dispatch unassigned issues (or issues not assigned to agents)
for issue in backlog:
agent_assigned = any(a.lower() in AGENT_USERNAMES for a in issue["assignees"])
if agent_assigned:
skipped.append((issue, "already assigned to agent"))
continue
if issue["score"] < 5:
skipped.append((issue, "score too low"))
continue
best_agent = match_agent(issue, agent_status, dispatched_count)
if not best_agent:
skipped.append((issue, "no available agent"))
continue
if dry_run:
dispatched.append({
"agent": best_agent,
"repo": issue["repo"],
"number": issue["number"],
"title": issue["title"],
"score": issue["score"],
"dry_run": True,
})
dispatched_count[best_agent] = dispatched_count.get(best_agent, 0) + 1
continue
# Actually dispatch
print(f" Dispatching {issue['repo']}#{issue['number']} -> {best_agent}")
success = assign_issue(issue["repo"], issue["number"], best_agent)
if success:
agent = AGENTS[best_agent]
if agent["type"] == "gateway":
dispatch_to_gateway(best_agent, agent, issue)
dispatched.append({
"agent": best_agent,
"repo": issue["repo"],
"number": issue["number"],
"title": issue["title"],
"score": issue["score"],
})
dispatched_count[best_agent] = dispatched_count.get(best_agent, 0) + 1
else:
skipped.append((issue, "assignment failed"))
return dispatched, skipped
# ---------------------------------------------------------------------------
# 5. CONSOLIDATED REPORT
# ---------------------------------------------------------------------------
def generate_report(backlog, dispatched, skipped, agent_status, dry_run=False):
"""Generate dispatch cycle report."""
now = datetime.now().strftime("%Y-%m-%d %H:%M")
mode = " [DRY RUN]" if dry_run else ""
lines = []
lines.append(f"=== Sovereign Orchestrator Report{mode} ===")
lines.append(f"Time: {now}")
lines.append(f"Total backlog: {len(backlog)} issues")
lines.append("")
# Agent health
lines.append("-- Agent Health --")
for name, info in agent_status.items():
symbol = "UP" if info["alive"] else "DOWN"
lines.append(f" {name}: {symbol} ({info['type']})")
lines.append("")
# Dispatched
lines.append(f"-- Dispatched: {len(dispatched)} --")
for d in dispatched:
dry = " (dry-run)" if d.get("dry_run") else ""
lines.append(f" [{d['score']}] {d['repo']}#{d['number']} -> {d['agent']}{dry}")
lines.append(f" {d['title'][:60]}")
lines.append("")
# Skipped (top 10)
skip_summary = {}
for issue, reason in skipped:
skip_summary[reason] = skip_summary.get(reason, 0) + 1
lines.append(f"-- Skipped: {len(skipped)} --")
for reason, count in sorted(skip_summary.items(), key=lambda x: -x[1]):
lines.append(f" {reason}: {count}")
lines.append("")
# Top 5 unassigned
unassigned = [i for i in backlog if not i["assignees"]][:5]
lines.append("-- Top 5 Unassigned (by priority) --")
for i in unassigned:
lines.append(f" [{i['score']}] {i['repo']}#{i['number']}: {i['title'][:55]}")
lines.append("")
report = "\n".join(lines)
return report
def format_telegram_report(backlog, dispatched, skipped, agent_status, dry_run=False):
"""Format a compact Telegram message."""
mode = " DRY RUN" if dry_run else ""
now = datetime.now().strftime("%H:%M")
parts = [f"*Orchestrator{mode}* ({now})"]
parts.append(f"Backlog: {len(backlog)} | Dispatched: {len(dispatched)} | Skipped: {len(skipped)}")
# Agent status line
agent_line = " | ".join(
        f"{'✅' if v['alive'] else '❌'}{k}" for k, v in agent_status.items()
)
parts.append(agent_line)
if dispatched:
parts.append("")
parts.append("*Dispatched:*")
for d in dispatched[:5]:
dry = " 🔍" if d.get("dry_run") else ""
parts.append(f" `{d['repo']}#{d['number']}` → {d['agent']}{dry}")
# Top unassigned
unassigned = [i for i in backlog if not i["assignees"]][:3]
if unassigned:
parts.append("")
parts.append("*Top unassigned:*")
for i in unassigned:
parts.append(f" [{i['score']}] `{i['repo']}#{i['number']}` {i['title'][:40]}")
return "\n".join(parts)
# ---------------------------------------------------------------------------
# 6. MAIN
# ---------------------------------------------------------------------------
def run_cycle(dry_run=False):
"""Execute one full orchestration cycle."""
global GITEA_TOKEN, TELEGRAM_TOKEN
GITEA_TOKEN = load_gitea_token()
TELEGRAM_TOKEN = load_telegram_token()
print("\n[1/4] Reading backlog...")
backlog = read_backlog()
print("\n[2/4] Scoring and prioritizing...")
backlog = prioritize_backlog(backlog)
for i in backlog[:10]:
print(f" [{i['score']:3d}] {i['repo']}/{i['number']}: {i['title'][:55]}")
print("\n[3/4] Checking agent health...")
agent_status = get_agent_status()
print("\n[4/4] Dispatching...")
dispatched, skipped = dispatch_cycle(backlog, agent_status, dry_run=dry_run)
# Generate reports
report = generate_report(backlog, dispatched, skipped, agent_status, dry_run=dry_run)
print("\n" + report)
# Send Telegram notification
if dispatched or not dry_run:
tg_msg = format_telegram_report(backlog, dispatched, skipped, agent_status, dry_run=dry_run)
send_telegram(tg_msg)
return backlog, dispatched, skipped
def main():
import argparse
parser = argparse.ArgumentParser(description="Sovereign Orchestrator v1")
parser.add_argument("--once", action="store_true", help="Single dispatch cycle")
parser.add_argument("--daemon", action="store_true", help="Run every 15 min")
parser.add_argument("--dry-run", action="store_true", help="Score/report only, no dispatch")
parser.add_argument("--interval", type=int, default=DAEMON_INTERVAL,
help=f"Daemon interval in seconds (default: {DAEMON_INTERVAL})")
args = parser.parse_args()
if not any([args.once, args.daemon, args.dry_run]):
args.dry_run = True # safe default
print("[INFO] No mode specified, defaulting to --dry-run")
print("=" * 60)
print(" SOVEREIGN ORCHESTRATOR v1")
print("=" * 60)
if args.daemon:
print(f"[DAEMON] Running every {args.interval}s (Ctrl+C to stop)")
cycle = 0
while True:
cycle += 1
print(f"\n--- Cycle {cycle} ---")
try:
run_cycle(dry_run=args.dry_run)
except Exception as e:
print(f"[ERROR] Cycle failed: {e}")
print(f"[DAEMON] Sleeping {args.interval}s...")
time.sleep(args.interval)
else:
run_cycle(dry_run=args.dry_run)
if __name__ == "__main__":
main()
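The additive rules in `score_issue()` can be sanity-checked in isolation. A distilled sketch reproducing a subset of the signals (the full function also weighs age, comments, and existing agent assignment), with the point values mirrored from the source:

```python
# Distilled sketch of a subset of the score_issue() rules above.
# Values mirror the source: BUG/CRITICAL +30, P1 +15, unassigned +10,
# timmy-config repo +5; result clamped to 0..100.
def score_sketch(title, labels, assignees, repo):
    text = (title + " " + " ".join(labels)).upper()
    score = 0
    if "CRITICAL" in text or "BUG" in text:
        score += 30
    if "P1" in text:
        score += 15
    if not assignees:
        score += 10
    if repo == "timmy-config":
        score += 5
    return max(0, min(100, score))


# An unassigned P1 bug in the infrastructure repo: 30 + 15 + 10 + 5
print(score_sketch("BUG: webhook handler crashes", ["P1"], [], "timmy-config"))  # → 60
```

Because the signals are purely additive, each rule can be tested independently of the others.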

scripts/README.md

@@ -0,0 +1,60 @@
# Gemini Sovereign Infrastructure Suite
This directory contains the core systems of the Gemini Sovereign Infrastructure, designed to systematize fleet operations, governance, and architectural integrity.
## Principles
1. **Systems, not Scripts**: We build frameworks that solve classes of problems, not one-off fixes.
2. **Sovereignty First**: All tools are designed to run locally or on owned VPSes. No cloud dependencies.
3. **Von Neumann as Code**: Infrastructure should be self-replicating and automated.
4. **Continuous Governance**: Quality is enforced by code (linters, gates), not just checklists.
## Tools
### [OPS] Provisioning & Fleet Management
- **`provision_wizard.py`**: Automates the creation of a new Wizard node from zero.
- Creates DigitalOcean droplet.
- Installs and builds `llama.cpp`.
- Downloads GGUF models.
- Sets up `systemd` services and health checks.
- **`fleet_llama.py`**: Unified management of `llama-server` instances across the fleet.
- `status`: Real-time health and model monitoring.
- `restart`: Remote service restart via SSH.
- `swap`: Hot-swapping GGUF models on remote nodes.
- **`skill_installer.py`**: Packages and deploys Hermes skills to remote wizards.
- **`model_eval.py`**: Benchmarks GGUF models for speed and quality before deployment.
- **`phase_tracker.py`**: Tracks the fleet's progress through the Paperclips-inspired evolution arc.
- **`cross_repo_test.py`**: Verifies the fleet works as a system by running tests across all core repositories.
- **`self_healing.py`**: Auto-detects and fixes common failures across the fleet.
- **`agent_dispatch.py`**: Unified framework for tasking agents across the fleet.
- **`telemetry.py`**: Operational visibility without cloud dependencies.
- **`gitea_webhook_handler.py`**: Handles real-time events from Gitea to coordinate fleet actions.
### [ARCH] Governance & Architecture
- **`architecture_linter_v2.py`**: Automated enforcement of architectural boundaries.
- Enforces sidecar boundaries (no sovereign code in `hermes-agent`).
- Prevents hardcoded IPs and committed secrets.
- Ensures `SOUL.md` and `README.md` standards.
- **`adr_manager.py`**: Streamlines the creation and tracking of Architecture Decision Records.
- `new`: Scaffolds a new ADR from a template.
- `list`: Provides a chronological view of architectural evolution.
## Usage
Most tools require `DIGITALOCEAN_TOKEN` and SSH access to the fleet.
```bash
# Provision a new node
python3 scripts/provision_wizard.py --name fenrir --model qwen2.5-coder-7b
# Check fleet status
python3 scripts/fleet_llama.py status
# Audit architectural integrity
python3 scripts/architecture_linter_v2.py
```
---
*Built by Gemini — The Builder, The Systematizer, The Force Multiplier.*

scripts/adr_manager.py

@@ -0,0 +1,113 @@
#!/usr/bin/env python3
"""
[ARCH] ADR Manager
Part of the Gemini Sovereign Governance System.
Helps create and manage Architecture Decision Records (ADRs).
"""
import os
import sys
import datetime
import argparse
ADR_DIR = "docs/adr"
TEMPLATE_FILE = "docs/adr/ADR_TEMPLATE.md"
class ADRManager:
def __init__(self):
# Ensure we are in the repo root or can find docs/adr
if not os.path.exists(ADR_DIR):
# Try to find it relative to the script
script_dir = os.path.dirname(os.path.abspath(__file__))
repo_root = os.path.dirname(script_dir)
self.adr_dir = os.path.join(repo_root, ADR_DIR)
self.template_file = os.path.join(repo_root, TEMPLATE_FILE)
else:
self.adr_dir = ADR_DIR
self.template_file = TEMPLATE_FILE
if not os.path.exists(self.adr_dir):
os.makedirs(self.adr_dir)
def get_next_number(self):
files = [f for f in os.listdir(self.adr_dir) if f.endswith(".md") and f[0].isdigit()]
if not files:
return 1
numbers = [int(f.split("-")[0]) for f in files]
return max(numbers) + 1
def create_adr(self, title: str):
num = self.get_next_number()
slug = title.lower().replace(" ", "-").replace("/", "-")
filename = f"{num:04d}-{slug}.md"
filepath = os.path.join(self.adr_dir, filename)
date = datetime.date.today().isoformat()
template = ""
if os.path.exists(self.template_file):
with open(self.template_file, "r") as f:
template = f.read()
else:
template = """# {num}. {title}
Date: {date}
## Status
Proposed
## Context
What is the problem we are solving?
## Decision
What is the decision we made?
## Consequences
What are the positive and negative consequences?
"""
content = template.replace("{num}", f"{num:04d}")
content = content.replace("{title}", title)
content = content.replace("{date}", date)
with open(filepath, "w") as f:
f.write(content)
print(f"[SUCCESS] Created ADR: {filepath}")
def list_adrs(self):
files = sorted([f for f in os.listdir(self.adr_dir) if f.endswith(".md") and f[0].isdigit()])
print(f"{'NUM':<6} {'TITLE'}")
print("-" * 40)
for f in files:
num = f.split("-")[0]
title = f.split("-", 1)[1].replace(".md", "").replace("-", " ").title()
print(f"{num:<6} {title}")
def main():
parser = argparse.ArgumentParser(description="Gemini ADR Manager")
subparsers = parser.add_subparsers(dest="command")
create_parser = subparsers.add_parser("new", help="Create a new ADR")
create_parser.add_argument("title", help="Title of the ADR")
subparsers.add_parser("list", help="List all ADRs")
args = parser.parse_args()
manager = ADRManager()
if args.command == "new":
manager.create_adr(args.title)
elif args.command == "list":
manager.list_adrs()
else:
parser.print_help()
if __name__ == "__main__":
main()
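The `NNNN-slug.md` naming convention used by `get_next_number()` and `create_adr()` can be exercised standalone. A sketch of the same logic, detached from the filesystem:

```python
# Sketch of the ADR naming convention above: files are NNNN-slug.md,
# the template is skipped, and the next number is max + 1.
def next_adr_number(filenames):
    numbered = [f for f in filenames if f.endswith(".md") and f[0].isdigit()]
    if not numbered:
        return 1
    return max(int(f.split("-")[0]) for f in numbered) + 1


def adr_filename(num, title):
    slug = title.lower().replace(" ", "-").replace("/", "-")
    return f"{num:04d}-{slug}.md"


existing = ["0001-use-gitea.md", "0002-thin-config.md", "ADR_TEMPLATE.md"]
print(adr_filename(next_adr_number(existing), "Ban cloud providers"))
# → 0003-ban-cloud-providers.md
```

Note that `ADR_TEMPLATE.md` is excluded by the leading-digit check, so it never shifts the sequence.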

scripts/agent_dispatch.py

@@ -0,0 +1,57 @@
#!/usr/bin/env python3
"""
[OPS] Agent Dispatch Framework
Part of the Gemini Sovereign Infrastructure Suite.
Replaces ad-hoc dispatch scripts with a unified framework for tasking agents.
"""
import os
import sys
import argparse
import subprocess
# --- CONFIGURATION ---
FLEET = {
"allegro": "167.99.126.228",
"bezalel": "159.203.146.185"
}
class Dispatcher:
def log(self, message: str):
print(f"[*] {message}")
def dispatch(self, host: str, agent_name: str, task: str):
self.log(f"Dispatching task to {agent_name} on {host}...")
ip = FLEET[host]
# Command to run the agent on the remote machine
# Assumes hermes-agent is installed in /opt/hermes
remote_cmd = f"cd /opt/hermes && python3 run_agent.py --agent {agent_name} --task '{task}'"
ssh_cmd = ["ssh", "-o", "StrictHostKeyChecking=no", f"root@{ip}", remote_cmd]
try:
res = subprocess.run(ssh_cmd, capture_output=True, text=True)
if res.returncode == 0:
self.log(f"[SUCCESS] {agent_name} completed task.")
print(res.stdout)
else:
self.log(f"[FAILURE] {agent_name} failed task.")
print(res.stderr)
except Exception as e:
self.log(f"[ERROR] Dispatch failed: {e}")
def main():
parser = argparse.ArgumentParser(description="Gemini Agent Dispatcher")
parser.add_argument("host", choices=list(FLEET.keys()), help="Host to dispatch to")
parser.add_argument("agent", help="Agent name")
parser.add_argument("task", help="Task description")
args = parser.parse_args()
dispatcher = Dispatcher()
dispatcher.dispatch(args.host, args.agent, args.task)
if __name__ == "__main__":
main()
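`dispatch()` interpolates the task description into a remote shell command inside single quotes, so a task that itself contains a quote will break (or inject into) the remote command line. A hedged sketch of a safer variant using `shlex.quote` — this is a suggested hardening, not the author's implementation; `run_agent.py` and `/opt/hermes` are taken from the dispatcher above:

```python
import shlex

# Sketch only: quoting the arguments before interpolation keeps the remote
# shell safe when the task contains quotes or metacharacters.
def build_remote_cmd(agent_name: str, task: str) -> str:
    return (
        "cd /opt/hermes && python3 run_agent.py "
        f"--agent {shlex.quote(agent_name)} --task {shlex.quote(task)}"
    )


print(build_remote_cmd("bezalel", "fix the 'deadman' switch"))
```

`shlex.quote` leaves safe strings untouched and wraps anything else in single quotes with embedded quotes escaped, so the remote command stays a single argument per flag.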


@@ -0,0 +1,126 @@
#!/usr/bin/env python3
"""
[ARCH] Architecture Linter v2
Part of the Gemini Sovereign Governance System.
Enforces architectural boundaries, security, and documentation standards
across the Timmy Foundation fleet.
"""
import os
import re
import sys
import argparse
from pathlib import Path
# --- CONFIGURATION ---
SOVEREIGN_KEYWORDS = ["mempalace", "sovereign_store", "tirith", "bezalel", "nexus"]
IP_REGEX = r'\b(?:\d{1,3}\.){3}\d{1,3}\b'
API_KEY_REGEX = r'(?:api_key|secret|token|password|auth_token)\s*[:=]\s*["\'][a-zA-Z0-9_\-]{20,}["\']'
class Linter:
def __init__(self, repo_path: str):
self.repo_path = Path(repo_path).resolve()
self.repo_name = self.repo_path.name
self.errors = []
def log_error(self, message: str, file: str = None, line: int = None):
loc = f"{file}:{line}" if file and line else (file if file else "General")
self.errors.append(f"[{loc}] {message}")
def check_sidecar_boundary(self):
"""Rule 1: No sovereign code in hermes-agent (sidecar boundary)"""
if self.repo_name == "hermes-agent":
for root, _, files in os.walk(self.repo_path):
if "node_modules" in root or ".git" in root:
continue
for file in files:
if file.endswith((".py", ".ts", ".js", ".tsx")):
path = Path(root) / file
content = path.read_text(errors="ignore")
for kw in SOVEREIGN_KEYWORDS:
if kw in content.lower():
# Exception: imports or comments might be okay, but we're strict for now
self.log_error(f"Sovereign keyword '{kw}' found in hermes-agent. Violates sidecar boundary.", str(path.relative_to(self.repo_path)))
def check_hardcoded_ips(self):
"""Rule 2: No hardcoded IPs (use domain names)"""
for root, _, files in os.walk(self.repo_path):
if "node_modules" in root or ".git" in root:
continue
for file in files:
if file.endswith((".py", ".ts", ".js", ".tsx", ".yaml", ".yml", ".json")):
path = Path(root) / file
content = path.read_text(errors="ignore")
matches = re.finditer(IP_REGEX, content)
for match in matches:
ip = match.group()
if ip in ["127.0.0.1", "0.0.0.0"]:
continue
line_no = content.count('\n', 0, match.start()) + 1
self.log_error(f"Hardcoded IP address '{ip}' found. Use domain names or environment variables.", str(path.relative_to(self.repo_path)), line_no)
def check_api_keys(self):
"""Rule 3: No cloud API keys committed to repos"""
for root, _, files in os.walk(self.repo_path):
if "node_modules" in root or ".git" in root:
continue
for file in files:
if file.endswith((".py", ".ts", ".js", ".tsx", ".yaml", ".yml", ".json", ".env")):
if file == ".env.example":
continue
path = Path(root) / file
content = path.read_text(errors="ignore")
matches = re.finditer(API_KEY_REGEX, content, re.IGNORECASE)
for match in matches:
line_no = content.count('\n', 0, match.start()) + 1
self.log_error("Potential API key or secret found in code.", str(path.relative_to(self.repo_path)), line_no)
def check_soul_canonical(self):
"""Rule 4: SOUL.md exists and is canonical in exactly one location"""
soul_path = self.repo_path / "SOUL.md"
if self.repo_name == "timmy-config":
if not soul_path.exists():
self.log_error("SOUL.md is missing from the canonical location (timmy-config root).")
else:
if soul_path.exists():
self.log_error("SOUL.md found in non-canonical repo. It should only live in timmy-config.")
def check_readme(self):
"""Rule 5: Every repo has a README with current truth"""
readme_path = self.repo_path / "README.md"
if not readme_path.exists():
self.log_error("README.md is missing.")
else:
content = readme_path.read_text(errors="ignore")
if len(content.strip()) < 50:
self.log_error("README.md is too short or empty. Provide current truth about the repo.")
def run(self):
print(f"--- Gemini Linter: Auditing {self.repo_name} ---")
self.check_sidecar_boundary()
self.check_hardcoded_ips()
self.check_api_keys()
self.check_soul_canonical()
self.check_readme()
if self.errors:
print(f"\n[FAILURE] Found {len(self.errors)} architectural violations:")
for err in self.errors:
print(f" - {err}")
return False
else:
print("\n[SUCCESS] Architecture is sound. Sovereignty maintained.")
return True
def main():
parser = argparse.ArgumentParser(description="Gemini Architecture Linter v2")
parser.add_argument("repo_path", nargs="?", default=".", help="Path to the repository to lint")
args = parser.parse_args()
linter = Linter(args.repo_path)
success = linter.run()
sys.exit(0 if success else 1)
if __name__ == "__main__":
main()
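The hardcoded-IP rule can be exercised standalone. This sketch reuses `IP_REGEX` and the loopback/bind-all exemption exactly as defined in the linter above:

```python
import re

# IP_REGEX and the 127.0.0.1/0.0.0.0 exemption copied from the linter above.
IP_REGEX = r'\b(?:\d{1,3}\.){3}\d{1,3}\b'
ALLOWED = {"127.0.0.1", "0.0.0.0"}


def find_hardcoded_ips(content: str):
    # Return every dotted-quad match that is not an exempted local address
    return [m.group() for m in re.finditer(IP_REGEX, content)
            if m.group() not in ALLOWED]


sample = 'GATEWAY = "http://159.203.146.185:8643"\nLOCAL = "127.0.0.1"\n'
print(find_hardcoded_ips(sample))  # → ['159.203.146.185']
```

The `\b` anchors keep the pattern from matching inside longer digit runs, though it will still flag version-like strings such as `1.2.3.4` in prose.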


@@ -0,0 +1,90 @@
#!/usr/bin/env python3
"""
[OPS] Cross-Repo Test Suite
Part of the Gemini Sovereign Infrastructure Suite.
Verifies the fleet works as a system by running tests across all core repositories.
"""
import os
import sys
import subprocess
import argparse
from pathlib import Path
# --- CONFIGURATION ---
REPOS = ["timmy-config", "hermes-agent", "the-nexus"]
class CrossRepoTester:
def __init__(self, root_dir: str):
self.root_dir = Path(root_dir).resolve()
def log(self, message: str):
print(f"[*] {message}")
def run_tests(self):
results = {}
for repo in REPOS:
repo_path = self.root_dir / repo
if not repo_path.exists():
# Try sibling directory if we are in one of the repos
repo_path = self.root_dir.parent / repo
if not repo_path.exists():
print(f"[WARNING] Repo {repo} not found at {repo_path}")
results[repo] = "MISSING"
continue
self.log(f"Running tests for {repo}...")
# Determine test command
test_cmd = ["pytest"]
if repo == "hermes-agent":
test_cmd = ["python3", "-m", "pytest", "tests"]
elif repo == "the-nexus":
test_cmd = ["pytest", "tests"]
try:
# Check if pytest is available
subprocess.run(["pytest", "--version"], capture_output=True)
res = subprocess.run(test_cmd, cwd=str(repo_path), capture_output=True, text=True)
if res.returncode == 0:
results[repo] = "PASSED"
else:
results[repo] = "FAILED"
# Print a snippet of the failure
print(f" [!] {repo} failed tests. Stderr snippet:")
print("\n".join(res.stderr.split("\n")[-10:]))
except FileNotFoundError:
results[repo] = "ERROR: pytest not found"
except Exception as e:
results[repo] = f"ERROR: {e}"
self.report(results)
def report(self, results: dict):
print("\n--- Cross-Repo Test Report ---")
all_passed = True
for repo, status in results.items():
            icon = "✅" if status == "PASSED" else "❌"
print(f"{icon} {repo:<15} | {status}")
if status != "PASSED":
all_passed = False
if all_passed:
print("\n[SUCCESS] All systems operational. The fleet is sound.")
else:
print("\n[FAILURE] System instability detected.")
def main():
parser = argparse.ArgumentParser(description="Gemini Cross-Repo Tester")
parser.add_argument("--root", default=".", help="Root directory containing all repos")
args = parser.parse_args()
tester = CrossRepoTester(args.root)
tester.run_tests()
if __name__ == "__main__":
main()

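The repo-resolution fallback in `run_tests` (look under `--root` first, then under the parent directory as a sibling) can be sketched in isolation; the directory layout below is illustrative:

```python
import tempfile
from pathlib import Path

def resolve_repo(root, repo):
    """Mirror CrossRepoTester.run_tests: prefer root/repo, fall back to a sibling."""
    candidate = Path(root) / repo
    if candidate.exists():
        return candidate
    sibling = Path(root).parent / repo
    return sibling if sibling.exists() else None

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "timmy-config").mkdir()
    print(resolve_repo(tmp, "timmy-config"))  # found directly under root
```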
scripts/fleet_llama.py Normal file

@@ -0,0 +1,137 @@
#!/usr/bin/env python3
"""
[OPS] llama.cpp Fleet Manager
Part of the Gemini Sovereign Infrastructure Suite.
Manages llama-server instances across the Timmy Foundation fleet.
Supports status, restart, and model swapping via SSH.
"""
import os
import sys
import json
import argparse
import subprocess
import requests
from typing import Dict, List, Any
# --- FLEET DEFINITION ---
FLEET = {
"mac": {"ip": "10.1.10.77", "port": 8080, "role": "hub"},
"ezra": {"ip": "143.198.27.163", "port": 8080, "role": "forge"},
"allegro": {"ip": "167.99.126.228", "port": 8080, "role": "agent-host"},
"bezalel": {"ip": "159.203.146.185", "port": 8080, "role": "world-host"}
}
class FleetManager:
def __init__(self):
self.results = {}
def run_remote(self, host: str, command: str):
ip = FLEET[host]["ip"]
ssh_cmd = [
"ssh", "-o", "StrictHostKeyChecking=no", "-o", "ConnectTimeout=5",
f"root@{ip}", command
]
# For Mac, we might need a different user or local execution
if host == "mac":
ssh_cmd = ["bash", "-c", command]
try:
result = subprocess.run(ssh_cmd, capture_output=True, text=True, timeout=10)
return result
except subprocess.TimeoutExpired:
return None
except Exception as e:
print(f"Error running remote command on {host}: {e}")
return None
def get_status(self, host: str):
ip = FLEET[host]["ip"]
port = FLEET[host]["port"]
status = {"online": False, "server_running": False, "model": "unknown", "tps": 0.0}
# 1. Check if machine is reachable
ping_res = subprocess.run(["ping", "-c", "1", "-W", "1", ip], capture_output=True)
if ping_res.returncode == 0:
status["online"] = True
# 2. Check if llama-server is responding to health check
try:
url = f"http://{ip}:{port}/health"
response = requests.get(url, timeout=2)
if response.status_code == 200:
status["server_running"] = True
data = response.json()
# llama.cpp health endpoint usually returns slots info
# We'll try to get model info if available
status["model"] = data.get("model", "unknown")
except requests.RequestException:
pass
return status
def show_fleet_status(self):
print(f"{'NAME':<10} {'IP':<15} {'STATUS':<10} {'SERVER':<10} {'MODEL':<20}")
print("-" * 70)
for name in FLEET:
status = self.get_status(name)
online_str = "" if status["online"] else ""
server_str = "🚀" if status["server_running"] else "💤"
print(f"{name:<10} {FLEET[name]['ip']:<15} {online_str:<10} {server_str:<10} {status['model']:<20}")
def restart_server(self, host: str):
print(f"[*] Restarting llama-server on {host}...")
res = self.run_remote(host, "systemctl restart llama-server")
if res and res.returncode == 0:
print(f"[SUCCESS] Restarted {host}")
else:
print(f"[FAILURE] Could not restart {host}")
def swap_model(self, host: str, model_name: str):
print(f"[*] Swapping model on {host} to {model_name}...")
# This assumes the provision_wizard.py structure
# In a real scenario, we'd have a mapping of model names to URLs
# For now, we'll just update the systemd service or a config file
# 1. Stop server
self.run_remote(host, "systemctl stop llama-server")
# 2. Update service file (simplified)
# This is a bit risky to do via one-liner, but for the manager:
cmd = f"sed -i 's/-m .*\\.gguf/-m \\/opt\\/models\\/{model_name}.gguf/' /etc/systemd/system/llama-server.service"
self.run_remote(host, cmd)
# 3. Start server
self.run_remote(host, "systemctl daemon-reload && systemctl start llama-server")
print(f"[SUCCESS] Swapped model on {host}")
def main():
parser = argparse.ArgumentParser(description="Gemini Fleet Manager")
subparsers = parser.add_subparsers(dest="command")
subparsers.add_parser("status", help="Show fleet status")
restart_parser = subparsers.add_parser("restart", help="Restart a server")
restart_parser.add_argument("host", choices=list(FLEET.keys()), help="Host to restart")
swap_parser = subparsers.add_parser("swap", help="Swap model on a host")
swap_parser.add_argument("host", choices=list(FLEET.keys()), help="Host to swap")
swap_parser.add_argument("model", help="Model name (GGUF)")
args = parser.parse_args()
manager = FleetManager()
if args.command == "status":
manager.show_fleet_status()
elif args.command == "restart":
manager.restart_server(args.host)
elif args.command == "swap":
manager.swap_model(args.host, args.model)
else:
parser.print_help()
if __name__ == "__main__":
main()

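The SSH-versus-local dispatch inside `run_remote` can be factored into a pure helper, which makes the command construction easy to reason about without SSH access (a sketch; the `mac` special case and SSH flags come from the script above):

```python
def build_remote_cmd(host, ip, command):
    # The hub ("mac") executes locally; every other host goes over SSH
    # with host-key checking disabled and a short connect timeout.
    if host == "mac":
        return ["bash", "-c", command]
    return ["ssh", "-o", "StrictHostKeyChecking=no",
            "-o", "ConnectTimeout=5", f"root@{ip}", command]

print(build_remote_cmd("mac", "10.1.10.77", "uptime"))
# → ['bash', '-c', 'uptime']
```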

@@ -0,0 +1,82 @@
#!/usr/bin/env python3
"""
[OPS] Gitea Webhook Handler
Part of the Gemini Sovereign Infrastructure Suite.
Handles real-time events from Gitea to coordinate fleet actions.
"""
import os
import sys
import json
import argparse
from typing import Dict, Any
class WebhookHandler:
def handle_event(self, payload: Dict[str, Any]):
# Gitea webhooks often send the event type in a header,
# but we'll try to infer it from the payload if not provided.
event_type = payload.get("event") or self.infer_event_type(payload)
repo_name = payload.get("repository", {}).get("name")
sender = payload.get("sender", {}).get("username")
print(f"[*] Received {event_type} event from {repo_name} (by {sender})")
if event_type == "push":
self.handle_push(payload)
elif event_type == "pull_request":
self.handle_pr(payload)
elif event_type == "issue":
self.handle_issue(payload)
else:
print(f"[INFO] Ignoring event type: {event_type}")
def infer_event_type(self, payload: Dict[str, Any]) -> str:
if "commits" in payload: return "push"
if "pull_request" in payload: return "pull_request"
if "issue" in payload: return "issue"
return "unknown"
def handle_push(self, payload: Dict[str, Any]):
ref = payload.get("ref")
print(f" [PUSH] Branch: {ref}")
# Trigger CI or deployment
if ref == "refs/heads/main":
print(" [ACTION] Triggering production deployment...")
# Example: subprocess.run(["./deploy.sh"])
def handle_pr(self, payload: Dict[str, Any]):
action = payload.get("action")
pr_num = payload.get("pull_request", {}).get("number")
print(f" [PR] Action: {action} | PR #{pr_num}")
if action in ["opened", "synchronized"]:
print(f" [ACTION] Triggering architecture linter for PR #{pr_num}...")
# Example: subprocess.run(["python3", "scripts/architecture_linter_v2.py"])
def handle_issue(self, payload: Dict[str, Any]):
action = payload.get("action")
issue_num = payload.get("issue", {}).get("number")
print(f" [ISSUE] Action: {action} | Issue #{issue_num}")
def main():
parser = argparse.ArgumentParser(description="Gemini Webhook Handler")
parser.add_argument("payload_file", help="JSON file containing the webhook payload")
args = parser.parse_args()
if not os.path.exists(args.payload_file):
print(f"[ERROR] Payload file {args.payload_file} not found.")
sys.exit(1)
with open(args.payload_file, "r") as f:
try:
payload = json.load(f)
except json.JSONDecodeError:
print("[ERROR] Invalid JSON payload.")
sys.exit(1)
handler = WebhookHandler()
handler.handle_event(payload)
if __name__ == "__main__":
main()

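Gitea delivers the event type in the `X-Gitea-Event` request header, which is more reliable than payload inference. A minimal HTTP front-end for the handler might look like this (a sketch; the port is an assumption):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def event_type(headers, payload):
    # Prefer the explicit header; fall back to payload-based inference.
    return headers.get("X-Gitea-Event") or payload.get("event") or "unknown"

class GiteaHook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        print(f"[*] {event_type(self.headers, payload)} event received")
        self.send_response(204)
        self.end_headers()

# To serve for real (blocks forever):
# HTTPServer(("127.0.0.1", 9000), GiteaHook).serve_forever()
```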
scripts/kaizen_retro.py Normal file

@@ -0,0 +1,526 @@
#!/usr/bin/env python3
"""
Kaizen Retro — Automated retrospective after every burn cycle.
Reads overnight Gitea activity, fleet state, and loop logs.
Generates ONE concrete improvement suggestion and posts it.
Usage:
python3 scripts/kaizen_retro.py [--dry-run]
"""
from __future__ import annotations
import argparse
import json
import os
import sys
import urllib.error
import urllib.request
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any, Optional
# Ensure repo root is on path so we can import gitea_client
REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT))
from gitea_client import GiteaClient, GiteaError
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
REPOS = [
"Timmy_Foundation/the-nexus",
"Timmy_Foundation/timmy-config",
"Timmy_Foundation/timmy-home",
"Timmy_Foundation/the-door",
"Timmy_Foundation/turboquant",
"Timmy_Foundation/hermes-agent",
"Timmy_Foundation/.profile",
]
HERMES_HOME = Path.home() / ".hermes"
TIMMY_HOME = Path.home() / ".timmy"
WORKFORCE_STATE_PATH = HERMES_HOME / "workforce-state.json"
FLEET_ROUTING_PATH = HERMES_HOME / "fleet-routing.json"
CHANNEL_DIR_PATH = REPO_ROOT / "channel_directory.json"
REPORTS_DIR = REPO_ROOT / "reports"
MORNING_REPORT_REPO = "Timmy_Foundation/timmy-config"
TELEGRAM_BOT_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN")
TELEGRAM_CHAT_ID = os.environ.get("TELEGRAM_HOME_CHANNEL", "-1003664764329")
TELEGRAM_MAX_LEN = 4000 # leave headroom below the 4096 hard limit
STALE_DAYS = 7
MAX_ATTEMPT_COMMENT_THRESHOLD = 5
ISSUE_TYPE_KEYWORDS = {
"bug": ["bug", "fix", "crash", "error", "regression", "broken"],
"feature": ["feature", "implement", "add", "support", "enable"],
"docs": ["doc", "readme", "wiki", "guide", "documentation"],
"kaizen": ["kaizen", "retro", "improvement", "continuous"],
"devops": ["deploy", "ci", "cd", "docker", "server", "infra"],
}
BLOCKER_LABELS = {"blocked", "timeout", "stale", "help wanted", "wontfix", "duplicate"}
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def load_json(path: Path) -> Any:
if not path.exists():
return None
with open(path) as f:
return json.load(f)
def iso_day_ago(days: int = 1) -> str:
return (datetime.now(timezone.utc) - timedelta(days=days)).isoformat()
def classify_issue_type(issue: dict) -> str:
title = (issue.get("title", "") or "").lower()
body = (issue.get("body", "") or "").lower()
labels = [l.get("name", "").lower() for l in issue.get("labels", []) or []]
text = f"{title} {body} {' '.join(labels)}"
words = set(text.split())
best = "other"
best_score = 0
for kind, keywords in ISSUE_TYPE_KEYWORDS.items():
# Short keywords (<=3 chars) require whole-word match to avoid false positives like
# "ci" inside "cleanup" or "cd" inside "abcde".
score = sum(
1 for kw in keywords
if (len(kw) <= 3 and kw in words) or (len(kw) > 3 and kw in text)
)
# label match is stronger
for label in labels:
label_words = set(label.split())
if any(
(len(kw) <= 3 and kw in label_words) or (len(kw) > 3 and kw in label)
for kw in keywords
):
score += 3
if score > best_score:
best_score = score
best = kind
return best
def is_max_attempts_candidate(issue: dict) -> bool:
"""Heuristic for issues that consumed excessive attempts."""
labels = {l.get("name", "").lower() for l in issue.get("labels", []) or []}
if labels & BLOCKER_LABELS:
return True
if issue.get("comments", 0) >= MAX_ATTEMPT_COMMENT_THRESHOLD:
return True
created = issue.get("created_at")
if created:
try:
created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
if datetime.now(timezone.utc) - created_dt > timedelta(days=STALE_DAYS):
return True
except Exception:
pass
return False
def telegram_send(text: str, bot_token: str, chat_id: str) -> list[dict]:
"""Post text to Telegram, chunking if it exceeds the message limit."""
url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
chunks = []
if len(text) <= TELEGRAM_MAX_LEN:
chunks = [text]
else:
# Split on newlines to preserve readability
lines = text.splitlines(keepends=True)
current = ""
for line in lines:
if len(current) + len(line) > TELEGRAM_MAX_LEN:
if current:
chunks.append(current)
current = line
else:
current += line
if current:
chunks.append(current)
results = []
for i, chunk in enumerate(chunks):
prefix = f"*(part {i + 1}/{len(chunks)})*\n" if len(chunks) > 1 else ""
payload = {"chat_id": chat_id, "text": prefix + chunk, "parse_mode": "Markdown"}
data = json.dumps(payload).encode()
req = urllib.request.Request(url, data=data, headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req, timeout=30) as resp:
results.append(json.loads(resp.read().decode()))
return results
def find_latest_morning_report_issue(client: GiteaClient) -> Optional[int]:
try:
issues = client.list_issues(MORNING_REPORT_REPO, state="open", sort="created", direction="desc", limit=20)
for issue in issues:
if "good morning report" in issue.title.lower() or "morning report" in issue.title.lower():
return issue.number
# fallback to closed
issues = client.list_issues(MORNING_REPORT_REPO, state="closed", sort="created", direction="desc", limit=20)
for issue in issues:
if "good morning report" in issue.title.lower() or "morning report" in issue.title.lower():
return issue.number
except Exception:
pass
return None
def fmt_pct(num: float, den: float) -> str:
if den == 0:
return "N/A"
return f"{num/den:.0%}"
# ---------------------------------------------------------------------------
# Analysis
# ---------------------------------------------------------------------------
def gather_metrics(client: GiteaClient, since: str) -> dict:
"""Collect overnight metrics from Gitea."""
metrics = {
"closed_issues": [],
"merged_prs": [],
"closed_prs": [],
"open_issues": [],
"max_attempts_issues": [],
"by_agent": {},
"by_repo": {},
"by_type": {},
}
for repo in REPOS:
repo_short = repo.split("/")[1]
metrics["by_repo"][repo_short] = {
"closed": 0,
"merged_prs": 0,
"closed_prs": 0,
"open": 0,
"max_attempts": 0,
"successes": 0,
"failures": 0,
}
# Closed issues since window
try:
closed = client.list_issues(repo, state="closed", since=since, sort="updated", direction="desc", limit=100)
for issue in closed:
issue_dict = {
"number": issue.number,
"title": issue.title,
"repo": repo_short,
"type": classify_issue_type({"title": issue.title, "body": issue.body, "labels": [{"name": lb.name} for lb in issue.labels]}),
"assignee": issue.assignees[0].login if issue.assignees else "unassigned",
}
metrics["closed_issues"].append(issue_dict)
metrics["by_repo"][repo_short]["closed"] += 1
metrics["by_repo"][repo_short]["successes"] += 1
agent = issue_dict["assignee"]
if agent not in metrics["by_agent"]:
metrics["by_agent"][agent] = {"successes": 0, "failures": 0, "closed": 0, "repos": set()}
metrics["by_agent"][agent]["successes"] += 1
metrics["by_agent"][agent]["closed"] += 1
metrics["by_agent"][agent]["repos"].add(repo_short)
t = issue_dict["type"]
if t not in metrics["by_type"]:
metrics["by_type"][t] = {"successes": 0, "failures": 0, "total": 0}
metrics["by_type"][t]["successes"] += 1
metrics["by_type"][t]["total"] += 1
except Exception as exc:
print(f"Warning: could not load closed issues for {repo}: {exc}", file=sys.stderr)
# Open issues (for stale / max-attempts detection)
try:
open_issues = client.list_issues(repo, state="open", sort="created", direction="desc", limit=100)
metrics["by_repo"][repo_short]["open"] = len(open_issues)
for issue in open_issues:
issue_raw = {
"number": issue.number,
"title": issue.title,
"labels": [{"name": lb.name} for lb in issue.labels],
"comments": issue.comments,
"created_at": issue.created_at,
}
if is_max_attempts_candidate(issue_raw):
metrics["max_attempts_issues"].append({
"number": issue.number,
"title": issue.title,
"repo": repo_short,
"type": classify_issue_type({"title": issue.title, "body": issue.body, "labels": issue_raw["labels"]}),
"assignee": issue.assignees[0].login if issue.assignees else "unassigned",
})
metrics["by_repo"][repo_short]["max_attempts"] += 1
metrics["by_repo"][repo_short]["failures"] += 1
agent = issue.assignees[0].login if issue.assignees else "unassigned"
if agent not in metrics["by_agent"]:
metrics["by_agent"][agent] = {"successes": 0, "failures": 0, "closed": 0, "repos": set()}
metrics["by_agent"][agent]["failures"] += 1
metrics["by_agent"][agent]["repos"].add(repo_short)
t = classify_issue_type({"title": issue.title, "body": issue.body, "labels": issue_raw["labels"]})
if t not in metrics["by_type"]:
metrics["by_type"][t] = {"successes": 0, "failures": 0, "total": 0}
metrics["by_type"][t]["failures"] += 1
metrics["by_type"][t]["total"] += 1
except Exception as exc:
print(f"Warning: could not load open issues for {repo}: {exc}", file=sys.stderr)
# PRs merged / closed since window (filter client-side; Gitea PR API ignores since)
try:
prs = client.list_pulls(repo, state="closed", sort="updated", limit=100)
since_dt = datetime.fromisoformat(since.replace("Z", "+00:00"))
for pr in prs:
updated = pr.updated_at or pr.created_at or ""
try:
updated_dt = datetime.fromisoformat(updated.replace("Z", "+00:00"))
if updated_dt < since_dt:
continue
except Exception:
pass
if pr.merged:
metrics["merged_prs"].append({
"number": pr.number,
"title": pr.title,
"repo": repo_short,
"user": pr.user.login if pr.user else "unknown",
})
metrics["by_repo"][repo_short]["merged_prs"] += 1
else:
metrics["closed_prs"].append({
"number": pr.number,
"title": pr.title,
"repo": repo_short,
"user": pr.user.login if pr.user else "unknown",
})
metrics["by_repo"][repo_short]["closed_prs"] += 1
except Exception as exc:
print(f"Warning: could not load PRs for {repo}: {exc}", file=sys.stderr)
# Convert sets to lists for JSON serialization
for agent in metrics["by_agent"].values():
agent["repos"] = sorted(agent["repos"])
return metrics
def load_workforce_state() -> dict:
return load_json(WORKFORCE_STATE_PATH) or {}
def load_fleet_routing() -> list[dict]:
data = load_json(FLEET_ROUTING_PATH)
if data and "agents" in data:
return data["agents"]
return []
def generate_suggestion(metrics: dict, fleet: list[dict]) -> str:
"""Generate ONE concrete improvement suggestion based on the data."""
by_agent = metrics["by_agent"]
by_repo = metrics["by_repo"]
by_type = metrics["by_type"]
max_attempts = metrics["max_attempts_issues"]
suggestions: list[str] = []
# 1. Agent with poor repo performance
for agent, stats in by_agent.items():
total = stats["successes"] + stats["failures"]
if total >= 3 and stats["successes"] == 0:
repos = ", ".join(stats["repos"])
suggestions.append(
f"🎯 **{agent}** has a 0% verify rate over the last cycle (0/{total}) on repos: {repos}. "
f"Consider removing these repos from {agent}'s routing or providing targeted onboarding."
)
# 2. Repo with highest failure concentration
repo_failures = [(r, s) for r, s in by_repo.items() if s["failures"] > 0]
if repo_failures:
repo_failures.sort(key=lambda x: x[1]["failures"], reverse=True)
worst_repo, worst_stats = repo_failures[0]
total_repo = worst_stats["successes"] + worst_stats["failures"]
if worst_stats["failures"] >= 2:
suggestions.append(
f"🎯 **{worst_repo}** has the most friction ({worst_stats['failures']} blocked/stale issues, "
f"{fmt_pct(worst_stats['successes'], total_repo)} success). "
f"Consider splitting issues in {worst_repo} into smaller chunks or assigning a stronger agent."
)
# 3. Max-attempts pattern
if len(max_attempts) >= 3:
type_counts: dict[str, int] = {}
for issue in max_attempts:
type_counts[issue["type"]] = type_counts.get(issue["type"], 0) + 1
top_type = max(type_counts, key=type_counts.get) if type_counts else "unknown"
suggestions.append(
f"🎯 **{len(max_attempts)} issues** hit max-attempts or went stale. "
f"The dominant type is **{top_type}**. "
f"Consider adding acceptance criteria templates or pre-flight checklists for {top_type} issues."
)
# 4. Issue type disparity
for t, stats in by_type.items():
total = stats["total"]
if total >= 3 and stats["successes"] == 0:
suggestions.append(
f"🎯 **{t}** issues have a 0% closure rate ({stats['failures']} stale). "
f"Consider routing all {t} issues to a specialist agent or creating a dedicated playbook."
)
# 5. Fleet routing gap (if fleet data exists)
active_agents = {a["name"] for a in fleet if a.get("active")}
assigned_agents = set(by_agent.keys())
idle_agents = active_agents - assigned_agents - {"unassigned"}
if len(idle_agents) >= 2:
suggestions.append(
f"🎯 **{len(idle_agents)} active agents** have no assignments this cycle: {', '.join(idle_agents)}. "
f"Consider expanding their repo lists or investigating why they aren't receiving work."
)
if suggestions:
return suggestions[0]
# Fallback: celebrate or nudge
total_closed = len(metrics["closed_issues"])
total_merged = len(metrics["merged_prs"])
if total_closed >= 5 or total_merged >= 3:
return (
f"🎯 Strong cycle: {total_closed} issues closed, {total_merged} PRs merged. "
f"Next improvement: write down the top 3 patterns that made this cycle successful so we can replicate them."
)
return (
"🎯 Low activity this cycle. Next improvement: ensure at least one agent loop is actively polling "
"for unassigned issues so work doesn't sit idle."
)
def build_report(metrics: dict, suggestion: str, since: str) -> str:
now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
period = since[:10]
lines = [
f"# 🌀 Kaizen Retro — {now}",
f"*Period: {period} → now*\n",
"## Numbers",
f"- **Issues closed:** {len(metrics['closed_issues'])}",
f"- **PRs merged:** {len(metrics['merged_prs'])}",
f"- **PRs closed without merge:** {len(metrics['closed_prs'])}",
f"- **Max-attempts / stale issues:** {len(metrics['max_attempts_issues'])}",
"",
"## By Agent",
]
for agent, stats in sorted(metrics["by_agent"].items(), key=lambda x: x[1]["successes"] + x[1]["failures"], reverse=True):
total = stats["successes"] + stats["failures"]
rate = fmt_pct(stats["successes"], total)
lines.append(f"- **{agent}**: {stats['successes']} closed, {stats['failures']} stale / max-attempts — verify rate {rate}")
lines.extend(["", "## By Repo"])
for repo, stats in sorted(metrics["by_repo"].items(), key=lambda x: x[1]["successes"] + x[1]["failures"], reverse=True):
total = stats["successes"] + stats["failures"]
if total == 0 and stats["open"] == 0:
continue
rate = fmt_pct(stats["successes"], total)
lines.append(
f"- **{repo}**: {stats['successes']} closed, {stats['failures']} stale, {stats['open']} open — verify rate {rate}"
)
lines.extend(["", "## By Issue Type"])
for t, stats in sorted(metrics["by_type"].items(), key=lambda x: x[1]["total"], reverse=True):
total = stats["total"]
rate = fmt_pct(stats["successes"], total)
lines.append(f"- **{t}**: {stats['successes']} closed, {stats['failures']} stale — verify rate {rate}")
if metrics["max_attempts_issues"]:
lines.extend(["", "## Max-Attempts / Stale Issues"])
for issue in metrics["max_attempts_issues"][:10]:
lines.append(f"- {issue['repo']}#{issue['number']} ({issue['type']}, assignee: {issue['assignee']}) — {issue['title']}")
if len(metrics["max_attempts_issues"]) > 10:
lines.append(f"- … and {len(metrics['max_attempts_issues']) - 10} more")
lines.extend(["", "## One Concrete Improvement", suggestion, ""])
return "\n".join(lines)
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
def main() -> int:
parser = argparse.ArgumentParser(description="Kaizen Retro — automated burn-cycle retrospective")
parser.add_argument("--dry-run", action="store_true", help="Print report but do not post")
parser.add_argument("--since", type=str, help="ISO timestamp for lookback window (default: 24h ago)")
parser.add_argument("--post-to", type=str, help="Override Telegram chat ID")
args = parser.parse_args()
since = args.since or iso_day_ago(1)
client = GiteaClient()
print("Gathering metrics since", since)
metrics = gather_metrics(client, since)
fleet = load_fleet_routing()
suggestion = generate_suggestion(metrics, fleet)
report = build_report(metrics, suggestion, since)
print(report)
# Save JSON snapshot
REPORTS_DIR.mkdir(parents=True, exist_ok=True)
snapshot_path = REPORTS_DIR / f"kaizen-retro-{datetime.now(timezone.utc).strftime('%Y%m%d')}.json"
snapshot = {
"generated_at": datetime.now(timezone.utc).isoformat(),
"since": since,
"metrics": metrics,
"suggestion": suggestion,
"report_markdown": report,
}
with open(snapshot_path, "w") as f:
json.dump(snapshot, f, indent=2)
print(f"\nSnapshot saved to {snapshot_path}")
if args.dry_run:
return 0
# Post to Telegram
chat_id = args.post_to or TELEGRAM_CHAT_ID
bot_token = TELEGRAM_BOT_TOKEN
if bot_token and chat_id:
try:
telegram_send(report, bot_token, chat_id)
print("Posted to Telegram.")
except Exception as exc:
print(f"Failed to post to Telegram: {exc}", file=sys.stderr)
else:
print("Telegram not configured (set TELEGRAM_BOT_TOKEN and TELEGRAM_HOME_CHANNEL).", file=sys.stderr)
# Comment on latest morning report issue
morning_issue = find_latest_morning_report_issue(client)
if morning_issue:
try:
client.create_comment(MORNING_REPORT_REPO, morning_issue, report)
print(f"Commented on morning report issue #{morning_issue}.")
except Exception as exc:
print(f"Failed to comment on morning report issue: {exc}", file=sys.stderr)
else:
print("No morning report issue found to comment on.", file=sys.stderr)
return 0
if __name__ == "__main__":
sys.exit(main())

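The Telegram chunking strategy in `telegram_send` (split on line boundaries, never exceed the limit) isolated as a pure function; `max_len` is shrunk here so the behaviour is visible at a glance:

```python
def chunk_message(text, max_len):
    # Single chunk if it fits; otherwise split on newlines for readability.
    if len(text) <= max_len:
        return [text]
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        if current and len(current) + len(line) > max_len:
            chunks.append(current)
            current = line
        else:
            current += line
    if current:
        chunks.append(current)
    return chunks

print(chunk_message("alpha\nbeta\ngamma\ndelta\n", 18))
# → two chunks; joining them reproduces the original text
```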
scripts/model_eval.py Normal file

@@ -0,0 +1,95 @@
#!/usr/bin/env python3
"""
[EVAL] Model Evaluation Harness
Part of the Gemini Sovereign Infrastructure Suite.
Benchmarks GGUF models for speed and quality before deployment.
"""
import os
import sys
import time
import json
import argparse
import requests
BENCHMARK_PROMPTS = [
"Write a Python script to sort a list of dictionaries by a key.",
"Explain the concept of 'Sovereign AI' in three sentences.",
"What is the capital of France?",
"Write a short story about a robot learning to paint."
]
class ModelEval:
def __init__(self, endpoint: str):
self.endpoint = endpoint.rstrip("/")
def log(self, message: str):
print(f"[*] {message}")
def run_benchmark(self):
self.log(f"Starting benchmark for {self.endpoint}...")
results = []
for prompt in BENCHMARK_PROMPTS:
self.log(f"Testing prompt: {prompt[:30]}...")
start_time = time.time()
try:
# llama.cpp server /completion endpoint
response = requests.post(
f"{self.endpoint}/completion",
json={"prompt": prompt, "n_predict": 128},
timeout=60
)
duration = time.time() - start_time
if response.status_code == 200:
data = response.json()
content = data.get("content", "")
# Rough estimate of tokens (4 chars per token is a common rule of thumb)
tokens = len(content) / 4
tps = tokens / duration
results.append({
"prompt": prompt,
"duration": duration,
"tps": tps,
"success": True
})
else:
results.append({"prompt": prompt, "success": False, "error": response.text})
except Exception as e:
results.append({"prompt": prompt, "success": False, "error": str(e)})
self.report(results)
def report(self, results: list):
print("\n--- Evaluation Report ---")
total_tps = 0
success_count = 0
for r in results:
if r["success"]:
print(f"{r['prompt'][:40]}... | {r['tps']:.2f} tok/s | {r['duration']:.2f}s")
total_tps += r["tps"]
success_count += 1
else:
print(f"{r['prompt'][:40]}... | FAILED: {r['error']}")
if success_count > 0:
avg_tps = total_tps / success_count
print(f"\nAverage Performance: {avg_tps:.2f} tok/s")
else:
print("\n[FAILURE] All benchmarks failed.")
def main():
parser = argparse.ArgumentParser(description="Gemini Model Eval")
parser.add_argument("endpoint", help="llama-server endpoint (e.g. http://localhost:8080)")
args = parser.parse_args()
evaluator = ModelEval(args.endpoint)
evaluator.run_benchmark()
if __name__ == "__main__":
main()

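The tok/s figure above is estimated as `len(content) / 4`; when the llama.cpp server includes its `timings` object in the `/completion` response, that number is more accurate. A hedged helper (the `timings.predicted_per_second` field name follows llama.cpp's server response; treat it as an assumption for other builds):

```python
def throughput(data, duration):
    # Prefer server-reported timings; fall back to the 4-chars-per-token
    # rule of thumb used in run_benchmark.
    timings = data.get("timings") or {}
    if "predicted_per_second" in timings:
        return float(timings["predicted_per_second"])
    if duration <= 0:
        return 0.0
    return (len(data.get("content", "")) / 4) / duration

print(throughput({"content": "x" * 400}, 2.0))  # → 50.0
```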
scripts/phase_tracker.py Normal file

@@ -0,0 +1,114 @@
#!/usr/bin/env python3
"""
[OPS] Phase Progression Tracker
Part of the Gemini Sovereign Infrastructure Suite.
Tracks the fleet's progress through the Paperclips-inspired evolution arc.
"""
import os
import sys
import json
import argparse
MILESTONES_FILE = "fleet/milestones.md"
COMPLETED_FILE = "fleet/completed_milestones.json"
class PhaseTracker:
def __init__(self):
# Find files relative to repo root
script_dir = os.path.dirname(os.path.abspath(__file__))
repo_root = os.path.dirname(script_dir)
self.milestones_path = os.path.join(repo_root, MILESTONES_FILE)
self.completed_path = os.path.join(repo_root, COMPLETED_FILE)
self.milestones = self.parse_milestones()
self.completed = self.load_completed()
def parse_milestones(self):
if not os.path.exists(self.milestones_path):
return {}
with open(self.milestones_path, "r") as f:
content = f.read()
phases = {}
current_phase = None
for line in content.split("\n"):
if line.startswith("## Phase"):
current_phase = line.replace("## ", "").strip()
phases[current_phase] = []
elif line.startswith("### M"):
m_id = line.split(":")[0].replace("### ", "").strip()
title = line.split(":")[1].strip()
phases[current_phase].append({"id": m_id, "title": title})
return phases
def load_completed(self):
if os.path.exists(self.completed_path):
with open(self.completed_path, "r") as f:
try:
return json.load(f)
except json.JSONDecodeError:
return []
return []
def save_completed(self):
with open(self.completed_path, "w") as f:
json.dump(self.completed, f, indent=2)
def show_progress(self):
print("--- Fleet Phase Progression Tracker ---")
total_milestones = 0
total_completed = 0
if not self.milestones:
print("[ERROR] No milestones found in fleet/milestones.md")
return
for phase, ms in self.milestones.items():
print(f"\n{phase}")
for m in ms:
total_milestones += 1
done = m["id"] in self.completed
if done:
total_completed += 1
status = "" if done else ""
print(f" {status} {m['id']}: {m['title']}")
percent = (total_completed / total_milestones) * 100 if total_milestones > 0 else 0
print(f"\nOverall Progress: {total_completed}/{total_milestones} ({percent:.1f}%)")
def mark_complete(self, m_id: str):
if m_id not in self.completed:
self.completed.append(m_id)
self.save_completed()
print(f"[SUCCESS] Marked {m_id} as complete.")
else:
print(f"[INFO] {m_id} is already complete.")
def main():
parser = argparse.ArgumentParser(description="Gemini Phase Tracker")
subparsers = parser.add_subparsers(dest="command")
subparsers.add_parser("status", help="Show current progress")
complete_parser = subparsers.add_parser("complete", help="Mark a milestone as complete")
complete_parser.add_argument("id", help="Milestone ID (e.g. M1)")
args = parser.parse_args()
tracker = PhaseTracker()
if args.command == "status":
tracker.show_progress()
elif args.command == "complete":
tracker.mark_complete(args.id)
else:
parser.print_help()
if __name__ == "__main__":
main()

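`parse_milestones` expects `fleet/milestones.md` to follow a `## Phase …` / `### M<n>: <title>` convention. A self-contained sketch of the expected shape (the phase and milestone names below are illustrative, not the fleet's real milestones):

```python
SAMPLE = """\
## Phase 1: Bootstrap
### M1: llama-server on every host
### M2: Gitea webhooks wired
## Phase 2: Autonomy
### M3: Deadman switch live
"""

def parse(content):
    # Mirrors PhaseTracker.parse_milestones: headings open a phase,
    # "### M…" lines add milestones to the current phase.
    phases, current = {}, None
    for line in content.split("\n"):
        if line.startswith("## Phase"):
            current = line.replace("## ", "").strip()
            phases[current] = []
        elif line.startswith("### M") and current:
            m_id, _, title = line.replace("### ", "").partition(":")
            phases[current].append({"id": m_id.strip(), "title": title.strip()})
    return phases

for phase, ms in parse(SAMPLE).items():
    print(phase, [m["id"] for m in ms])
```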
Some files were not shown because too many files have changed in this diff.