Compare commits

..

3 Commits

Author SHA1 Message Date
6ad6469c40 feat: implement architecture linter for sovereignty enforcement 2026-04-10 23:49:24 +00:00
perplexity
3af63cf172 enforce: Anthropic ban — linter, pre-commit, tests, and policy doc
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 1m20s
Anthropic is not just removed — it is banned. This commit adds
enforcement at every gate to prevent re-introduction.

1. architecture_linter.py — 9 BANNED rules for Anthropic patterns
   (provider, model slugs, API endpoints, keys, model names).
   Scans all yaml/py/sh/json/md. Skips training data and historical docs.

2. pre-commit.py — scan_banned_providers() runs on every staged file.
   Blocks any commit that introduces Anthropic references.
   Exempt: training/, evaluations/, changelogs, historical cost data.

3. test_sovereignty_enforcement.py — TestAnthropicBan class with 4 tests:
   - No Anthropic in wizard configs
   - No Anthropic in playbooks
   - No Anthropic in fallback chain
   - No Anthropic API key in bootstrap

4. BANNED_PROVIDERS.md — Hard policy document. Golden state config.
   Replacement table. Exception list. Not advisory — mandatory.
2026-04-09 19:27:00 +00:00
perplexity
6d713aeeb9 purge: remove Anthropic from all wizard configs, playbooks, and fleet scripts
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 1m18s
Golden state: Kimi K2.5 primary → Gemini via OpenRouter → local Ollama.
Anthropic is gone from every active config, fallback chain, and loop script.

Wizard configs (3):
- allegro, bezalel, ezra: removed anthropic from fallback_providers,
  replaced with gemini + ollama. Removed anthropic provider section.

Playbooks (7):
- All playbooks now use kimi-k2.5 as preferred, google/gemini-2.5-pro
  as fallback. No claude model references remain.

Fleet scripts (8):
- claude-loop.sh: deprecated (exit 0, original preserved as reference)
- claudemax-watchdog.sh: deprecated (exit 0)
- agent-loop.sh: removed claude dispatch case
- start-loops.sh: removed claude-locks, claude-loop from proc list
- timmy-orchestrator.sh: removed claude worker monitoring
- fleet-status.sh: zeroed claude loop counter
- model-health-check.sh: replaced check_anthropic_model with check_kimi_model
- ops-gitea.sh, ops-helpers.sh, ops-panel.sh: removed claude from agent lists

Infrastructure (5):
- wizard_bootstrap.py: removed anthropic pip package and API key checks
- WIZARD_ENVIRONMENT_CONTRACT.md: replaced ANTHROPIC keys with KIMI
- DEPLOY.md: replaced ANTHROPIC_API_KEY with KIMI_API_KEY
- fallback-portfolios.yaml: replaced anthropic provider with kimi-coding
- fleet-vocabulary.md: updated Ezra and Claude entries to Kimi K2.5

Docs (2):
- sonnet-workforce.md: deprecated with notice
- GoldenRockachopa-checkin.md: updated model references

Preserved (not touched):
- training/ data (changing would corrupt training set)
- evaluations/ (historical benchmarks)
- RELEASE_*.md (changelogs)
- metrics_helpers.py (historical cost calculation)
- hermes-sovereign/githooks/pre-commit.py (secret detection - still useful)
- security/secret-scan.yml (key detection - still useful)
- architecture_linter.py (warns about anthropic usage - desired behavior)
- test_sovereignty_enforcement.py (tests anthropic is blocked - correct)
- son-of-timmy.md philosophical references (Claude as one of many backends)

Refs: Sovereignty directive, zero-cloud vision
2026-04-09 19:21:48 +00:00
85 changed files with 853 additions and 4492 deletions

View File

@@ -1,54 +0,0 @@
## Summary
<!-- What changed and why. One paragraph max. -->
## Governing Issue
<!-- REQUIRED. Every PR must reference at least one issue. Max 3 issues per PR. -->
<!-- Closes #ISSUENUM -->
<!-- Refs #ISSUENUM -->
## Acceptance Criteria
<!-- List the specific outcomes this PR delivers. Check each only when proven. -->
<!-- Copy these from the governing issue if it has them. -->
- [ ] Criterion 1
- [ ] Criterion 2
## Proof
<!-- No proof = no merge. See CONTRIBUTING.md for the full standard. -->
### Commands / logs / world-state proof
<!-- Paste the exact commands, output, log paths, or world-state artifacts that prove each acceptance criterion was met. -->
```
$ <command you ran>
<relevant output>
```
### Visual proof (if applicable)
<!-- For skin updates, UI changes, dashboard changes: attach screenshot to the PR discussion. -->
<!-- Name what the screenshot proves. Do not commit binary media unless explicitly required. -->
## Risk and Rollback
<!-- What could go wrong? How do we undo it? -->
- **Risk level:** low / medium / high
- **What breaks if this is wrong:**
- **How to rollback:**
## Checklist
<!-- Complete every item before requesting review. -->
- [ ] PR body references at least one issue number (`Closes #N` or `Refs #N`)
- [ ] Changed files are syntactically valid (`python -c "import ast; ast.parse(open(f).read())"`, `node --check`, `bash -n`)
- [ ] Proof meets CONTRIBUTING.md standard (exact commands, output, or artifacts — not "looks right")
- [ ] Branch is up-to-date with base
- [ ] No more than 3 unrelated issues bundled in this PR
- [ ] Shell scripts are executable (`chmod +x`)

View File

@@ -1,41 +0,0 @@
# architecture-lint.yml — CI gate for the Architecture Linter v2
# Refs: #437 — repo-aware, test-backed, CI-enforced.
#
# Runs on every PR to main. Validates Python syntax, then runs
# linter tests and finally lints the repo itself.
name: Architecture Lint
on:
pull_request:
branches: [main, master]
push:
branches: [main]
jobs:
linter-tests:
name: Linter Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Install test deps
run: pip install pytest
- name: Compile-check linter
run: python3 -m py_compile scripts/architecture_linter_v2.py
- name: Run linter tests
run: python3 -m pytest tests/test_linter.py -v
lint-repo:
name: Lint Repository
runs-on: ubuntu-latest
needs: linter-tests
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Run architecture linter
run: python3 scripts/architecture_linter_v2.py .

View File

@@ -112,10 +112,23 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install PyYAML
run: pip install pyyaml
- name: Validate playbook structure
run: python3 scripts/validate_playbook_schema.py
run: |
python3 -c "
import yaml, sys, glob
required_keys = {'name', 'description'}
for f in glob.glob('playbooks/*.yaml'):
with open(f) as fh:
try:
data = yaml.safe_load(fh)
if not isinstance(data, dict):
print(f'ERROR: {f} is not a YAML mapping')
sys.exit(1)
missing = required_keys - set(data.keys())
if missing:
print(f'WARNING: {f} missing keys: {missing}')
print(f'OK: {f}')
except yaml.YAMLError as e:
print(f'ERROR: {f}: {e}')
sys.exit(1)
"

24
.gitignore vendored
View File

@@ -10,27 +10,3 @@ __pycache__/
# Generated audit reports
reports/
# Secrets and credentials
.bash_history
.git-credentials
.gitea_token
.ssh/id_*
.ssh/known_hosts
.viminfo
.wget-hsts
.profile
.bashrc
.bash_logout
.python_history
.lesshst
.selected_editor
.sudo_as_admin_successful
.config/telegram/
.hermes/.env
.hermes/auth.json
*.pem
*.key
.env
.env.*
!.env.example

63
BANNED_PROVIDERS.md Normal file
View File

@@ -0,0 +1,63 @@
# Banned Providers
This document is a hard policy. It is not advisory. It is not aspirational.
Any agent, wizard, or automated process that violates this policy is broken
and must be fixed immediately.
## Permanently Banned
### Anthropic (Claude)
**Status:** BANNED — April 2026
**Scope:** All configs, fallback chains, playbooks, wizard bootstraps, and fleet scripts.
**Enforcement:** Pre-commit hook, architecture linter, sovereignty enforcement tests.
No Anthropic model (Claude Opus, Sonnet, Haiku, or any variant) may appear as:
- A primary provider
- A fallback provider
- An OpenRouter model slug (e.g. `anthropic/claude-*`)
- An API endpoint (api.anthropic.com)
- A required dependency (`anthropic` pip package)
- An environment variable (`ANTHROPIC_API_KEY`, `ANTHROPIC_TOKEN`)
### What to use instead
| Was | Now |
|-----|-----|
| claude-opus-4-6 | kimi-k2.5 |
| claude-sonnet-4-20250514 | kimi-k2.5 |
| claude-haiku | google/gemini-2.5-pro |
| anthropic (provider) | kimi-coding |
| anthropic/claude-* (OpenRouter) | google/gemini-2.5-pro |
| ANTHROPIC_API_KEY | KIMI_API_KEY |
### Exceptions
The following files may reference Anthropic for **historical or defensive** purposes:
- `training/` — Training data must not be altered
- `evaluations/` — Historical benchmark results
- `RELEASE_*.md` — Changelogs
- `metrics_helpers.py` — Historical cost calculation
- `pre-commit.py` — Detects leaked Anthropic keys (defensive)
- `secret-scan.yml` — Detects leaked Anthropic keys (defensive)
- `architecture_linter.py` — Warns/blocks Anthropic usage (enforcement)
- `test_sovereignty_enforcement.py` — Tests that Anthropic is blocked (enforcement)
### Golden State
```yaml
fallback_providers:
- provider: kimi-coding
model: kimi-k2.5
reason: Primary
- provider: openrouter
model: google/gemini-2.5-pro
reason: Cloud fallback
- provider: ollama
model: gemma4:latest
base_url: http://localhost:11434/v1
reason: Terminal fallback — never phones home
```
*Sovereignty and service always.*

View File

@@ -51,11 +51,11 @@ Alexander is pleased with the state. This tag marks a high-water mark.
| OAI-Wolf-3 | 8683 | hermes gateway | ACTIVE |
- Disk: 12G/926G (4%) — pristine
- Primary model: claude-opus-4-6 via Anthropic
- Primary model: kimi-k2.5 via Kimi
- Fallback chain: codex → kimi-k2.5 → gemini-2.5-flash → llama-3.3-70b → grok-3-mini-fast → kimi → grok → kimi → gpt-4.1-mini
- Ollama models: gemma4:latest (9.6GB), hermes4:14b (9.0GB)
- Worktrees: 239 (9.8GB) — prune candidates exist
- Running loops: 3 claude-loops, 3 gemini-loops, orchestrator, status watcher
- Running loops: 3 gemini-loops, orchestrator, status watcher
- LaunchD: hermes gateway running, fenrir stopped, kimi-heartbeat idle
- MCP: morrowind server active

View File

@@ -1,47 +0,0 @@
# =============================================================================
# BANNED PROVIDERS — The Timmy Foundation
# =============================================================================
# "Anthropic is not only fired, but banned. I don't want these errors
# cropping up." — Alexander, 2026-04-09
#
# This is a HARD BAN. Not deprecated. Not fallback. BANNED.
# Enforcement: pre-commit hook, linter, Ansible validation, CI tests.
# =============================================================================
banned_providers:
- name: anthropic
reason: "Permanently banned. SDK access gated despite active quota. Fleet was bricked because golden state pointed to Anthropic Sonnet."
banned_date: "2026-04-09"
enforcement: strict # Ansible playbook FAILS if detected
models:
- "claude-sonnet-*"
- "claude-opus-*"
- "claude-haiku-*"
- "claude-*"
endpoints:
- "api.anthropic.com"
- "anthropic/*" # OpenRouter pattern
api_keys:
- "ANTHROPIC_API_KEY"
- "CLAUDE_API_KEY"
# Golden state alternative:
approved_providers:
- name: kimi-coding
model: kimi-k2.5
role: primary
- name: openrouter
model: google/gemini-2.5-pro
role: fallback
- name: ollama
model: "gemma4:latest"
role: terminal_fallback
# Future evaluation:
evaluation_candidates:
- name: mimo-v2-pro
status: pending
notes: "Free via Nous Portal for ~2 weeks from 2026-04-07. Add after fallback chain is fixed."
- name: hermes-4
status: available
notes: "Free on Nous Portal. 36B and 70B variants. Home team model."

View File

@@ -1,95 +0,0 @@
# Ansible IaC — The Timmy Foundation Fleet
> One canonical Ansible playbook defines: deadman switch, cron schedule,
> golden state rollback, agent startup sequence.
> — KT Final Session 2026-04-08, Priority TWO
## Purpose
This directory contains the **single source of truth** for fleet infrastructure.
No more ad-hoc recovery implementations. No more overlapping deadman switches.
No more agents mutating their own configs into oblivion.
**Everything** goes through Ansible. If it's not in a playbook, it doesn't exist.
## Architecture
```
┌─────────────────────────────────────────────────┐
│ Gitea (Source of Truth) │
│ timmy-config/ansible/ │
│ ├── inventory/hosts.yml (fleet machines) │
│ ├── playbooks/site.yml (master playbook) │
│ ├── roles/ (reusable roles) │
│ └── group_vars/wizards.yml (golden state) │
└──────────────────┬──────────────────────────────┘
│ PR merge triggers webhook
┌─────────────────────────────────────────────────┐
│ Gitea Webhook Handler │
│ scripts/deploy_on_webhook.sh │
│ → ansible-pull on each target machine │
└──────────────────┬──────────────────────────────┘
│ ansible-pull
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Timmy │ │ Allegro │ │ Bezalel │ │ Ezra │
│ (Mac) │ │ (VPS) │ │ (VPS) │ │ (VPS) │
│ │ │ │ │ │ │ │
│ deadman │ │ deadman │ │ deadman │ │ deadman │
│ cron │ │ cron │ │ cron │ │ cron │
│ golden │ │ golden │ │ golden │ │ golden │
│ req_log │ │ req_log │ │ req_log │ │ req_log │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
```
## Quick Start
```bash
# Deploy everything to all machines
ansible-playbook -i inventory/hosts.yml playbooks/site.yml
# Deploy only golden state config
ansible-playbook -i inventory/hosts.yml playbooks/golden_state.yml
# Deploy only to a specific wizard
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit bezalel
# Dry run (check mode)
ansible-playbook -i inventory/hosts.yml playbooks/site.yml --check --diff
```
## Golden State Provider Chain
All wizard configs converge on this provider chain. **Anthropic is BANNED.**
| Priority | Provider | Model | Endpoint |
| -------- | -------------------- | ---------------- | --------------------------------- |
| 1 | Kimi | kimi-k2.5 | https://api.kimi.com/coding/v1 |
| 2 | Gemini (OpenRouter) | gemini-2.5-pro | https://openrouter.ai/api/v1 |
| 3 | Ollama (local) | gemma4:latest | http://localhost:11434/v1 |
## Roles
| Role | Purpose |
| ---------------- | ------------------------------------------------------------ |
| `wizard_base` | Common wizard setup: directories, thin config, git pull |
| `deadman_switch` | Health check → snapshot good config → rollback on death |
| `golden_state` | Deploy and enforce golden state provider chain |
| `request_log` | SQLite telemetry table for every inference call |
| `cron_manager` | Source-controlled cron jobs — no manual crontab edits |
## Rules
1. **No manual changes.** If it's not in a playbook, it will be overwritten.
2. **No Anthropic.** Banned. Enforcement is automated. See `BANNED_PROVIDERS.yml`.
3. **Idempotent.** Every playbook can run 100 times with the same result.
4. **PR required.** Config changes go through Gitea PR review, then deploy.
5. **One identity per machine.** No duplicate agents. Fleet audit enforces this.
## Related Issues
- timmy-config #442: [P2] Ansible IaC Canonical Playbook
- timmy-config #444: Wire Deadman Switch ACTION
- timmy-config #443: Thin Config Pattern
- timmy-config #446: request_log Telemetry Table

View File

@@ -1,21 +0,0 @@
[defaults]
inventory = inventory/hosts.yml
roles_path = roles
host_key_checking = False
retry_files_enabled = False
stdout_callback = yaml
forks = 10
timeout = 30
# Logging
log_path = /var/log/ansible/timmy-fleet.log
[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no

View File

@@ -1,74 +0,0 @@
# =============================================================================
# Wizard Group Variables — Golden State Configuration
# =============================================================================
# These variables are applied to ALL wizards in the fleet.
# This IS the golden state. If a wizard deviates, Ansible corrects it.
# =============================================================================
# --- Deadman Switch ---
deadman_enabled: true
deadman_check_interval: 300 # 5 minutes between health checks
deadman_snapshot_dir: "~/.local/timmy/snapshots"
deadman_max_snapshots: 10 # Rolling window of good configs
deadman_restart_cooldown: 60 # Seconds to wait before restart after failure
deadman_max_restart_attempts: 3
deadman_escalation_channel: telegram # Alert Alexander after max attempts
# --- Thin Config ---
thin_config_path: "~/.timmy/thin_config.yml"
thin_config_mode: "0444" # Read-only — agents CANNOT modify
upstream_repo: "https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"
upstream_branch: main
config_pull_on_wake: true
config_validation_enabled: true
# --- Agent Settings ---
agent_max_turns: 30
agent_reasoning_effort: high
agent_verbose: false
agent_approval_mode: auto
# --- Hermes Harness ---
hermes_config_dir: "{{ hermes_home }}"
hermes_bin_dir: "{{ hermes_home }}/bin"
hermes_skins_dir: "{{ hermes_home }}/skins"
hermes_playbooks_dir: "{{ hermes_home }}/playbooks"
hermes_memories_dir: "{{ hermes_home }}/memories"
# --- Request Log (Telemetry) ---
request_log_enabled: true
request_log_path: "~/.local/timmy/request_log.db"
request_log_rotation_days: 30 # Archive logs older than 30 days
request_log_sync_to_gitea: false # Future: push telemetry summaries to Gitea
# --- Cron Schedule ---
# All cron jobs are managed here. No manual crontab edits.
cron_jobs:
- name: "Deadman health check"
job: "cd {{ wizard_home }}/workspace/timmy-config && python3 fleet/health_check.py"
minute: "*/5"
hour: "*"
enabled: "{{ deadman_enabled }}"
- name: "Muda audit"
job: "cd {{ wizard_home }}/workspace/timmy-config && bash fleet/muda-audit.sh >> /tmp/muda-audit.log 2>&1"
minute: "0"
hour: "21"
weekday: "0"
enabled: true
- name: "Config pull from upstream"
job: "cd {{ wizard_home }}/workspace/timmy-config && git pull --ff-only origin main"
minute: "*/15"
hour: "*"
enabled: "{{ config_pull_on_wake }}"
- name: "Request log rotation"
job: "python3 -c \"import sqlite3,datetime; db=sqlite3.connect('{{ request_log_path }}'); db.execute('DELETE FROM request_log WHERE timestamp < datetime(\\\"now\\\", \\\"-{{ request_log_rotation_days }} days\\\")'); db.commit()\""
minute: "0"
hour: "3"
enabled: "{{ request_log_enabled }}"
# --- Provider Enforcement ---
# These are validated on every Ansible run. Any Anthropic reference = failure.
provider_ban_enforcement: strict # strict = fail playbook, warn = log only

View File

@@ -1,119 +0,0 @@
# =============================================================================
# Fleet Inventory — The Timmy Foundation
# =============================================================================
# Source of truth for all machines in the fleet.
# Update this file when machines are added/removed.
# All changes go through PR review.
# =============================================================================
all:
children:
wizards:
hosts:
timmy:
ansible_host: localhost
ansible_connection: local
wizard_name: Timmy
wizard_role: "Primary wizard — soul of the fleet"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8645
wizard_home: "{{ ansible_env.HOME }}/wizards/timmy"
hermes_home: "{{ ansible_env.HOME }}/.hermes"
machine_type: mac
# Timmy runs on Alexander's M3 Max
ollama_available: true
allegro:
ansible_host: 167.99.126.228
ansible_user: root
wizard_name: Allegro
wizard_role: "Kimi-backed third wizard house — tight coding tasks"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8645
wizard_home: /root/wizards/allegro
hermes_home: /root/.hermes
machine_type: vps
ollama_available: false
bezalel:
ansible_host: 159.203.146.185
ansible_user: root
wizard_name: Bezalel
wizard_role: "Forge-and-testbed wizard — infrastructure, deployment, hardening"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8656
wizard_home: /root/wizards/bezalel
hermes_home: /root/.hermes
machine_type: vps
ollama_available: false
# NOTE: The awake Bezalel may be the duplicate.
# Fleet audit (the-nexus #1144) will resolve identity.
ezra:
ansible_host: 143.198.27.163
ansible_user: root
wizard_name: Ezra
wizard_role: "Infrastructure wizard — Gitea, nginx, hosting"
wizard_provider_primary: kimi-coding
wizard_model_primary: kimi-k2.5
hermes_port: 8081
api_port: 8645
wizard_home: /root/wizards/ezra
hermes_home: /root/.hermes
machine_type: vps
ollama_available: false
# NOTE: Currently DOWN — Telegram key revoked, awaiting propagation.
# Infrastructure hosts (not wizards, but managed by Ansible)
infrastructure:
hosts:
forge:
ansible_host: 143.198.27.163
ansible_user: root
# Gitea runs on the same box as Ezra
gitea_url: https://forge.alexanderwhitestone.com
gitea_org: Timmy_Foundation
vars:
# Global variables applied to all hosts
gitea_repo_url: "https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"
gitea_branch: main
config_base_path: "{{ gitea_repo_url }}"
timmy_log_dir: "~/.local/timmy/fleet-health"
request_log_db: "~/.local/timmy/request_log.db"
# Golden state provider chain — Anthropic is BANNED
golden_state_providers:
- name: kimi-coding
model: kimi-k2.5
base_url: "https://api.kimi.com/coding/v1"
timeout: 120
reason: "Primary — Kimi K2.5 (best value, least friction)"
- name: openrouter
model: google/gemini-2.5-pro
base_url: "https://openrouter.ai/api/v1"
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: "Fallback — Gemini 2.5 Pro via OpenRouter"
- name: ollama
model: "gemma4:latest"
base_url: "http://localhost:11434/v1"
timeout: 180
reason: "Terminal fallback — local Ollama (sovereign, no API needed)"
# Banned providers — hard enforcement
banned_providers:
- anthropic
- claude
banned_models_patterns:
- "claude-*"
- "anthropic/*"
- "*sonnet*"
- "*opus*"
- "*haiku*"

View File

@@ -1,98 +0,0 @@
---
# =============================================================================
# agent_startup.yml — Resurrect Wizards from Checked-in Configs
# =============================================================================
# Brings wizards back online using golden state configs.
# Order: pull config → validate → start agent → verify with request_log
# =============================================================================
- name: "Agent Startup Sequence"
hosts: wizards
become: true
serial: 1 # One wizard at a time to avoid cascading issues
tasks:
- name: "Pull latest config from upstream"
git:
repo: "{{ upstream_repo }}"
dest: "{{ wizard_home }}/workspace/timmy-config"
version: "{{ upstream_branch }}"
force: true
tags: [pull]
- name: "Deploy golden state config"
include_role:
name: golden_state
tags: [config]
- name: "Validate config — no banned providers"
shell: |
python3 -c "
import yaml, sys
with open('{{ wizard_home }}/config.yaml') as f:
cfg = yaml.safe_load(f)
banned = {{ banned_providers }}
for p in cfg.get('fallback_providers', []):
if p.get('provider', '') in banned:
print(f'BANNED: {p[\"provider\"]}', file=sys.stderr)
sys.exit(1)
model = cfg.get('model', {}).get('provider', '')
if model in banned:
print(f'BANNED default provider: {model}', file=sys.stderr)
sys.exit(1)
print('Config validated — no banned providers.')
"
register: config_valid
tags: [validate]
- name: "Ensure hermes-agent service is running"
systemd:
name: "hermes-{{ wizard_name | lower }}"
state: started
enabled: true
when: machine_type == 'vps'
tags: [start]
ignore_errors: true # Service may not exist yet on all machines
- name: "Start hermes agent (Mac — launchctl)"
shell: |
launchctl kickstart -k "ai.hermes.{{ wizard_name | lower }}" 2>/dev/null || \
cd {{ wizard_home }} && hermes agent start --daemon 2>&1 | tail -5
when: machine_type == 'mac'
tags: [start]
ignore_errors: true
- name: "Wait for agent to come online"
wait_for:
host: 127.0.0.1
port: "{{ api_port }}"
timeout: 60
state: started
tags: [verify]
ignore_errors: true
- name: "Verify agent is alive — check request_log for activity"
shell: |
sleep 10
python3 -c "
import sqlite3, sys
db = sqlite3.connect('{{ request_log_path }}')
cursor = db.execute('''
SELECT COUNT(*) FROM request_log
WHERE agent_name = '{{ wizard_name }}'
AND timestamp > datetime('now', '-5 minutes')
''')
count = cursor.fetchone()[0]
if count > 0:
print(f'{{ wizard_name }} is alive — {count} recent inference calls logged.')
else:
print(f'WARNING: {{ wizard_name }} started but no telemetry yet.')
"
register: agent_status
tags: [verify]
ignore_errors: true
- name: "Report startup status"
debug:
msg: "{{ wizard_name }}: {{ agent_status.stdout | default('startup attempted') }}"
tags: [always]

View File

@@ -1,15 +0,0 @@
---
# =============================================================================
# cron_schedule.yml — Source-Controlled Cron Jobs
# =============================================================================
# All cron jobs are defined in group_vars/wizards.yml.
# This playbook deploys them. No manual crontab edits allowed.
# =============================================================================
- name: "Deploy Cron Schedule"
hosts: wizards
become: true
roles:
- role: cron_manager
tags: [cron, schedule]

View File

@@ -1,17 +0,0 @@
---
# =============================================================================
# deadman_switch.yml — Deploy Deadman Switch to All Wizards
# =============================================================================
# The deadman watch already fires and detects dead agents.
# This playbook wires the ACTION:
# - On healthy check: snapshot current config as "last known good"
# - On failed check: rollback config to snapshot, restart agent
# =============================================================================
- name: "Deploy Deadman Switch ACTION"
hosts: wizards
become: true
roles:
- role: deadman_switch
tags: [deadman, recovery]

View File

@@ -1,30 +0,0 @@
---
# =============================================================================
# golden_state.yml — Deploy Golden State Config to All Wizards
# =============================================================================
# Enforces the golden state provider chain across the fleet.
# Removes any Anthropic references. Deploys the approved provider chain.
# =============================================================================
- name: "Deploy Golden State Configuration"
hosts: wizards
become: true
roles:
- role: golden_state
tags: [golden, config]
post_tasks:
- name: "Verify golden state — no banned providers"
shell: |
grep -rci 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' \
{{ hermes_home }}/config.yaml \
{{ wizard_home }}/config.yaml 2>/dev/null || echo "0"
register: banned_count
changed_when: false
- name: "Report golden state status"
debug:
msg: >
{{ wizard_name }} golden state: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}.
Banned provider references: {{ banned_count.stdout | trim }}.

View File

@@ -1,15 +0,0 @@
---
# =============================================================================
# request_log.yml — Deploy Telemetry Table
# =============================================================================
# Creates the request_log SQLite table on all machines.
# Every inference call writes a row. No exceptions. No summarizing.
# =============================================================================
- name: "Deploy Request Log Telemetry"
hosts: wizards
become: true
roles:
- role: request_log
tags: [telemetry, logging]

View File

@@ -1,72 +0,0 @@
---
# =============================================================================
# site.yml — Master Playbook for the Timmy Foundation Fleet
# =============================================================================
# This is the ONE playbook that defines the entire fleet state.
# Run this and every machine converges to golden state.
#
# Usage:
# ansible-playbook -i inventory/hosts.yml playbooks/site.yml
# ansible-playbook -i inventory/hosts.yml playbooks/site.yml --limit bezalel
# ansible-playbook -i inventory/hosts.yml playbooks/site.yml --check --diff
# =============================================================================
- name: "Timmy Foundation Fleet — Full Convergence"
hosts: wizards
become: true
pre_tasks:
- name: "Validate no banned providers in golden state"
assert:
that:
- "item.name not in banned_providers"
fail_msg: "BANNED PROVIDER DETECTED: {{ item.name }} — Anthropic is permanently banned."
quiet: true
loop: "{{ golden_state_providers }}"
tags: [always]
- name: "Display target wizard"
debug:
msg: "Deploying to {{ wizard_name }} ({{ wizard_role }}) on {{ ansible_host }}"
tags: [always]
roles:
- role: wizard_base
tags: [base, setup]
- role: golden_state
tags: [golden, config]
- role: deadman_switch
tags: [deadman, recovery]
- role: request_log
tags: [telemetry, logging]
- role: cron_manager
tags: [cron, schedule]
post_tasks:
- name: "Final validation — scan for banned providers"
shell: |
grep -ri 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' \
{{ hermes_home }}/config.yaml \
{{ wizard_home }}/config.yaml \
{{ thin_config_path }} 2>/dev/null || true
register: banned_scan
changed_when: false
tags: [validation]
- name: "FAIL if banned providers found in deployed config"
fail:
msg: |
BANNED PROVIDER DETECTED IN DEPLOYED CONFIG:
{{ banned_scan.stdout }}
Anthropic is permanently banned. Fix the config and re-deploy.
when: banned_scan.stdout | length > 0
tags: [validation]
- name: "Deployment complete"
debug:
msg: "{{ wizard_name }} converged to golden state. Provider chain: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}"
tags: [always]

View File

@@ -1,55 +0,0 @@
---
# =============================================================================
# cron_manager/tasks — Source-Controlled Cron Jobs
# =============================================================================
# All cron jobs are defined in group_vars/wizards.yml.
# No manual crontab edits. This is the only way to manage cron.
# =============================================================================
- name: "Deploy managed cron jobs"
cron:
name: "{{ item.name }}"
job: "{{ item.job }}"
minute: "{{ item.minute | default('*') }}"
hour: "{{ item.hour | default('*') }}"
day: "{{ item.day | default('*') }}"
month: "{{ item.month | default('*') }}"
weekday: "{{ item.weekday | default('*') }}"
state: "{{ 'present' if item.enabled else 'absent' }}"
user: "{{ ansible_user | default('root') }}"
loop: "{{ cron_jobs }}"
when: cron_jobs is defined
- name: "Deploy deadman switch cron (fallback if systemd timer unavailable)"
cron:
name: "Deadman switch — {{ wizard_name }}"
job: "{{ wizard_home }}/deadman_action.sh >> {{ timmy_log_dir }}/deadman-{{ wizard_name }}.log 2>&1"
minute: "*/5"
hour: "*"
state: present
user: "{{ ansible_user | default('root') }}"
when: deadman_enabled and machine_type != 'vps'
# VPS machines use systemd timers instead
- name: "Remove legacy cron jobs (cleanup)"
cron:
name: "{{ item }}"
state: absent
user: "{{ ansible_user | default('root') }}"
loop:
- "legacy-deadman-watch"
- "old-health-check"
- "backup-deadman"
ignore_errors: true
- name: "List active cron jobs"
shell: "crontab -l 2>/dev/null | grep -v '^#' | grep -v '^$' || echo 'No cron jobs found.'"
register: active_crons
changed_when: false
- name: "Report cron status"
debug:
msg: |
{{ wizard_name }} cron jobs deployed.
Active:
{{ active_crons.stdout }}

View File

@@ -1,70 +0,0 @@
---
# =============================================================================
# deadman_switch/tasks — Wire the Deadman Switch ACTION
# =============================================================================
# The watch fires. This makes it DO something:
# - On healthy check: snapshot current config as "last known good"
# - On failed check: rollback to last known good, restart agent
# =============================================================================
- name: "Create snapshot directory"
file:
path: "{{ deadman_snapshot_dir }}"
state: directory
mode: "0755"
- name: "Deploy deadman switch script"
template:
src: deadman_action.sh.j2
dest: "{{ wizard_home }}/deadman_action.sh"
mode: "0755"
- name: "Deploy deadman systemd service"
template:
src: deadman_switch.service.j2
dest: "/etc/systemd/system/deadman-{{ wizard_name | lower }}.service"
mode: "0644"
when: machine_type == 'vps'
notify: "Enable deadman service"
- name: "Deploy deadman systemd timer"
template:
src: deadman_switch.timer.j2
dest: "/etc/systemd/system/deadman-{{ wizard_name | lower }}.timer"
mode: "0644"
when: machine_type == 'vps'
notify: "Enable deadman timer"
- name: "Deploy deadman launchd plist (Mac)"
template:
src: deadman_switch.plist.j2
dest: "{{ ansible_env.HOME }}/Library/LaunchAgents/com.timmy.deadman.{{ wizard_name | lower }}.plist"
mode: "0644"
when: machine_type == 'mac'
notify: "Load deadman plist"
- name: "Take initial config snapshot"
copy:
src: "{{ wizard_home }}/config.yaml"
dest: "{{ deadman_snapshot_dir }}/config.yaml.known_good"
remote_src: true
mode: "0444"
ignore_errors: true
handlers:
- name: "Enable deadman service"
systemd:
name: "deadman-{{ wizard_name | lower }}.service"
daemon_reload: true
enabled: true
- name: "Enable deadman timer"
systemd:
name: "deadman-{{ wizard_name | lower }}.timer"
daemon_reload: true
enabled: true
state: started
- name: "Load deadman plist"
shell: "launchctl load {{ ansible_env.HOME }}/Library/LaunchAgents/com.timmy.deadman.{{ wizard_name | lower }}.plist"
ignore_errors: true

View File

@@ -1,153 +0,0 @@
#!/usr/bin/env bash
# =============================================================================
# Deadman Switch ACTION — {{ wizard_name }}
# =============================================================================
# Generated by Ansible on {{ ansible_date_time.iso8601 }}
# DO NOT EDIT MANUALLY.
#
# On healthy check: snapshot current config as "last known good"
# On failed check: rollback config to last known good, restart agent
# =============================================================================
set -euo pipefail
WIZARD_NAME="{{ wizard_name }}"
WIZARD_HOME="{{ wizard_home }}"
CONFIG_FILE="{{ wizard_home }}/config.yaml"
SNAPSHOT_DIR="{{ deadman_snapshot_dir }}"
SNAPSHOT_FILE="${SNAPSHOT_DIR}/config.yaml.known_good"
REQUEST_LOG_DB="{{ request_log_path }}"
LOG_DIR="{{ timmy_log_dir }}"
LOG_FILE="${LOG_DIR}/deadman-${WIZARD_NAME}.log"
MAX_SNAPSHOTS={{ deadman_max_snapshots }}
RESTART_COOLDOWN={{ deadman_restart_cooldown }}
MAX_RESTART_ATTEMPTS={{ deadman_max_restart_attempts }}
COOLDOWN_FILE="${LOG_DIR}/deadman_cooldown_${WIZARD_NAME}"
SERVICE_NAME="hermes-{{ wizard_name | lower }}"
# Ensure directories exist
mkdir -p "${SNAPSHOT_DIR}" "${LOG_DIR}"
log() {
echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] [deadman] [${WIZARD_NAME}] $*" >> "${LOG_FILE}"
echo "[deadman] [${WIZARD_NAME}] $*"
}
log_telemetry() {
local status="$1"
local message="$2"
if [ -f "${REQUEST_LOG_DB}" ]; then
sqlite3 "${REQUEST_LOG_DB}" "INSERT INTO request_log (timestamp, agent_name, provider, model, endpoint, status, error_message) VALUES (datetime('now'), '${WIZARD_NAME}', 'deadman_switch', 'N/A', 'health_check', '${status}', '${message}');" 2>/dev/null || true
fi
}
snapshot_config() {
if [ -f "${CONFIG_FILE}" ]; then
cp "${CONFIG_FILE}" "${SNAPSHOT_FILE}"
# Keep rolling history
cp "${CONFIG_FILE}" "${SNAPSHOT_DIR}/config.yaml.$(date +%s)"
# Prune old snapshots
ls -t "${SNAPSHOT_DIR}"/config.yaml.[0-9]* 2>/dev/null | tail -n +$((MAX_SNAPSHOTS + 1)) | xargs rm -f 2>/dev/null
log "Config snapshot saved."
fi
}
rollback_config() {
if [ -f "${SNAPSHOT_FILE}" ]; then
log "Rolling back config to last known good..."
cp "${SNAPSHOT_FILE}" "${CONFIG_FILE}"
log "Config rolled back."
log_telemetry "fallback" "Config rolled back to last known good by deadman switch"
else
log "ERROR: No known good snapshot found. Pulling from upstream..."
cd "${WIZARD_HOME}/workspace/timmy-config" 2>/dev/null && \
git pull --ff-only origin {{ upstream_branch }} 2>/dev/null && \
cp "wizards/{{ wizard_name | lower }}/config.yaml" "${CONFIG_FILE}" && \
log "Config restored from upstream." || \
log "CRITICAL: Cannot restore config from any source."
fi
}
restart_agent() {
# Check cooldown
if [ -f "${COOLDOWN_FILE}" ]; then
local last_restart
last_restart=$(cat "${COOLDOWN_FILE}")
local now
now=$(date +%s)
local elapsed=$((now - last_restart))
if [ "${elapsed}" -lt "${RESTART_COOLDOWN}" ]; then
log "Restart cooldown active (${elapsed}s / ${RESTART_COOLDOWN}s). Skipping."
return 1
fi
fi
log "Restarting ${SERVICE_NAME}..."
date +%s > "${COOLDOWN_FILE}"
{% if machine_type == 'vps' %}
systemctl restart "${SERVICE_NAME}" 2>/dev/null && \
log "Agent restarted via systemd." || \
log "ERROR: systemd restart failed."
{% else %}
launchctl kickstart -k "ai.hermes.{{ wizard_name | lower }}" 2>/dev/null && \
log "Agent restarted via launchctl." || \
(cd "${WIZARD_HOME}" && hermes agent start --daemon 2>/dev/null && \
log "Agent restarted via hermes CLI.") || \
log "ERROR: All restart methods failed."
{% endif %}
log_telemetry "success" "Agent restarted by deadman switch"
}
# --- Health Check ---
check_health() {
# Check 1: Is the agent process running?
{% if machine_type == 'vps' %}
if ! systemctl is-active --quiet "${SERVICE_NAME}" 2>/dev/null; then
if ! pgrep -f "hermes" > /dev/null 2>/dev/null; then
log "FAIL: Agent process not running."
return 1
fi
fi
{% else %}
if ! pgrep -f "hermes" > /dev/null 2>/dev/null; then
log "FAIL: Agent process not running."
return 1
fi
{% endif %}
# Check 2: Is the API port responding?
if ! timeout 10 bash -c "echo > /dev/tcp/127.0.0.1/{{ api_port }}" 2>/dev/null; then
log "FAIL: API port {{ api_port }} not responding."
return 1
fi
# Check 3: Does the config contain banned providers?
if grep -qi 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' "${CONFIG_FILE}" 2>/dev/null; then
log "FAIL: Config contains banned provider (Anthropic). Rolling back."
return 1
fi
return 0
}
# --- Main ---
main() {
log "Health check starting..."
if check_health; then
log "HEALTHY — snapshotting config."
snapshot_config
log_telemetry "success" "Health check passed"
else
log "UNHEALTHY — initiating recovery."
log_telemetry "error" "Health check failed — initiating rollback"
rollback_config
restart_agent
fi
log "Health check complete."
}
main "$@"

View File

@@ -1,22 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<!-- Deadman Switch — {{ wizard_name }}. Generated by Ansible. DO NOT EDIT MANUALLY. -->
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.timmy.deadman.{{ wizard_name | lower }}</string>
<key>ProgramArguments</key>
<array>
<string>/bin/bash</string>
<string>{{ wizard_home }}/deadman_action.sh</string>
</array>
<key>StartInterval</key>
<integer>{{ deadman_check_interval }}</integer>
<key>RunAtLoad</key>
<true/>
<key>StandardOutPath</key>
<string>{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log</string>
<key>StandardErrorPath</key>
<string>{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log</string>
</dict>
</plist>

View File

@@ -1,16 +0,0 @@
# Deadman Switch — {{ wizard_name }}
# Generated by Ansible. DO NOT EDIT MANUALLY.
[Unit]
Description=Deadman Switch for {{ wizard_name }} wizard
After=network.target
[Service]
Type=oneshot
ExecStart={{ wizard_home }}/deadman_action.sh
User={{ ansible_user | default('root') }}
StandardOutput=append:{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log
StandardError=append:{{ timmy_log_dir }}/deadman-{{ wizard_name }}.log
[Install]
WantedBy=multi-user.target

View File

@@ -1,14 +0,0 @@
# Deadman Switch Timer — {{ wizard_name }}
# Generated by Ansible. DO NOT EDIT MANUALLY.
# Runs every {{ deadman_check_interval // 60 }} minutes.
[Unit]
Description=Deadman Switch Timer for {{ wizard_name }} wizard
[Timer]
OnBootSec=60
OnUnitActiveSec={{ deadman_check_interval }}s
AccuracySec=30s
[Install]
WantedBy=timers.target

View File

@@ -1,6 +0,0 @@
---
# golden_state defaults
# The golden_state_providers list is defined in group_vars/wizards.yml
# and inventory/hosts.yml (global vars).
golden_state_enforce: true
golden_state_backup_before_deploy: true

View File

@@ -1,46 +0,0 @@
---
# =============================================================================
# golden_state/tasks — Deploy and enforce golden state provider chain
# =============================================================================
- name: "Backup current config before golden state deploy"
copy:
src: "{{ wizard_home }}/config.yaml"
dest: "{{ wizard_home }}/config.yaml.pre-golden-{{ ansible_date_time.epoch }}"
remote_src: true
when: golden_state_backup_before_deploy
ignore_errors: true
- name: "Deploy golden state wizard config"
template:
src: "../../wizard_base/templates/wizard_config.yaml.j2"
dest: "{{ wizard_home }}/config.yaml"
mode: "0644"
backup: true
notify:
- "Restart hermes agent (systemd)"
- "Restart hermes agent (launchctl)"
- name: "Scan for banned providers in all config files"
shell: |
FOUND=0
for f in {{ wizard_home }}/config.yaml {{ hermes_home }}/config.yaml; do
if [ -f "$f" ]; then
if grep -qi 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' "$f"; then
echo "BANNED PROVIDER in $f:"
grep -ni 'anthropic\|claude-sonnet\|claude-opus\|claude-haiku' "$f"
FOUND=1
fi
fi
done
exit $FOUND
register: provider_scan
changed_when: false
failed_when: provider_scan.rc != 0 and provider_ban_enforcement == 'strict'
- name: "Report golden state deployment"
debug:
msg: >
{{ wizard_name }} golden state deployed.
Provider chain: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}.
Banned provider scan: {{ 'CLEAN' if provider_scan.rc == 0 else 'VIOLATIONS FOUND' }}.

View File

@@ -1,64 +0,0 @@
-- =============================================================================
-- request_log — Inference Telemetry Table
-- =============================================================================
-- Every agent writes to this table BEFORE and AFTER every inference call.
-- No exceptions. No summarizing. No describing what you would log.
-- Actually write the row.
--
-- Source: KT Bezalel Architecture Session 2026-04-08
-- =============================================================================
CREATE TABLE IF NOT EXISTS request_log (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp TEXT NOT NULL DEFAULT (datetime('now')),
agent_name TEXT NOT NULL,
provider TEXT NOT NULL,
model TEXT NOT NULL,
endpoint TEXT NOT NULL,
tokens_in INTEGER,
tokens_out INTEGER,
latency_ms INTEGER,
status TEXT NOT NULL, -- 'success', 'error', 'timeout', 'fallback'
error_message TEXT
);
-- Index for common queries
CREATE INDEX IF NOT EXISTS idx_request_log_agent
ON request_log (agent_name, timestamp);
CREATE INDEX IF NOT EXISTS idx_request_log_provider
ON request_log (provider, timestamp);
CREATE INDEX IF NOT EXISTS idx_request_log_status
ON request_log (status, timestamp);
-- View: recent activity per agent (last hour)
CREATE VIEW IF NOT EXISTS v_recent_activity AS
SELECT
agent_name,
provider,
model,
status,
COUNT(*) as call_count,
AVG(latency_ms) as avg_latency_ms,
SUM(tokens_in) as total_tokens_in,
SUM(tokens_out) as total_tokens_out
FROM request_log
WHERE timestamp > datetime('now', '-1 hour')
GROUP BY agent_name, provider, model, status;
-- View: provider reliability (last 24 hours)
CREATE VIEW IF NOT EXISTS v_provider_reliability AS
SELECT
provider,
model,
COUNT(*) as total_calls,
SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) as successes,
SUM(CASE WHEN status = 'error' THEN 1 ELSE 0 END) as errors,
SUM(CASE WHEN status = 'timeout' THEN 1 ELSE 0 END) as timeouts,
SUM(CASE WHEN status = 'fallback' THEN 1 ELSE 0 END) as fallbacks,
ROUND(100.0 * SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) / COUNT(*), 1) as success_rate,
AVG(latency_ms) as avg_latency_ms
FROM request_log
WHERE timestamp > datetime('now', '-24 hours')
GROUP BY provider, model;

View File

@@ -1,50 +0,0 @@
---
# =============================================================================
# request_log/tasks — Deploy Telemetry Table
# =============================================================================
# "This is non-negotiable infrastructure. Without it, we cannot verify
# if any agent actually executed what it claims."
# — KT Bezalel 2026-04-08
# =============================================================================
- name: "Create telemetry directory"
file:
path: "{{ request_log_path | dirname }}"
state: directory
mode: "0755"
- name: "Deploy request_log schema"
copy:
src: request_log_schema.sql
dest: "{{ wizard_home }}/request_log_schema.sql"
mode: "0644"
- name: "Initialize request_log database"
shell: |
sqlite3 "{{ request_log_path }}" < "{{ wizard_home }}/request_log_schema.sql"
args:
creates: "{{ request_log_path }}"
- name: "Verify request_log table exists"
shell: |
sqlite3 "{{ request_log_path }}" ".tables" | grep -q "request_log"
register: table_check
changed_when: false
- name: "Verify request_log schema matches"
shell: |
sqlite3 "{{ request_log_path }}" ".schema request_log" | grep -q "agent_name"
register: schema_check
changed_when: false
- name: "Set permissions on request_log database"
file:
path: "{{ request_log_path }}"
mode: "0644"
- name: "Report request_log status"
debug:
msg: >
{{ wizard_name }} request_log: {{ request_log_path }}
— table exists: {{ table_check.rc == 0 }}
— schema valid: {{ schema_check.rc == 0 }}

View File

@@ -1,6 +0,0 @@
---
# wizard_base defaults
wizard_user: "{{ ansible_user | default('root') }}"
wizard_group: "{{ ansible_user | default('root') }}"
timmy_base_dir: "~/.local/timmy"
timmy_config_repo: "https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"

View File

@@ -1,11 +0,0 @@
---
- name: "Restart hermes agent (systemd)"
systemd:
name: "hermes-{{ wizard_name | lower }}"
state: restarted
when: machine_type == 'vps'
- name: "Restart hermes agent (launchctl)"
shell: "launchctl kickstart -k ai.hermes.{{ wizard_name | lower }}"
when: machine_type == 'mac'
ignore_errors: true

View File

@@ -1,69 +0,0 @@
---
# =============================================================================
# wizard_base/tasks — Common wizard setup
# =============================================================================
- name: "Create wizard directories"
file:
path: "{{ item }}"
state: directory
mode: "0755"
loop:
- "{{ wizard_home }}"
- "{{ wizard_home }}/workspace"
- "{{ hermes_home }}"
- "{{ hermes_home }}/bin"
- "{{ hermes_home }}/skins"
- "{{ hermes_home }}/playbooks"
- "{{ hermes_home }}/memories"
- "~/.local/timmy"
- "~/.local/timmy/fleet-health"
- "~/.local/timmy/snapshots"
- "~/.timmy"
- name: "Clone/update timmy-config"
git:
repo: "{{ upstream_repo }}"
dest: "{{ wizard_home }}/workspace/timmy-config"
version: "{{ upstream_branch }}"
force: false
update: true
ignore_errors: true # May fail on first run if no SSH key
- name: "Deploy SOUL.md"
copy:
src: "{{ wizard_home }}/workspace/timmy-config/SOUL.md"
dest: "~/.timmy/SOUL.md"
remote_src: true
mode: "0644"
ignore_errors: true
- name: "Deploy thin config (immutable pointer to upstream)"
template:
src: thin_config.yml.j2
dest: "{{ thin_config_path }}"
mode: "{{ thin_config_mode }}"
tags: [thin_config]
- name: "Ensure Python3 and pip are available"
package:
name:
- python3
- python3-pip
state: present
when: machine_type == 'vps'
ignore_errors: true
- name: "Ensure PyYAML is installed (for config validation)"
pip:
name: pyyaml
state: present
when: machine_type == 'vps'
ignore_errors: true
- name: "Create Ansible log directory"
file:
path: /var/log/ansible
state: directory
mode: "0755"
ignore_errors: true

View File

@@ -1,41 +0,0 @@
# =============================================================================
# Thin Config — {{ wizard_name }}
# =============================================================================
# THIS FILE IS READ-ONLY. Agents CANNOT modify it.
# It contains only pointers to upstream. The actual config lives in Gitea.
#
# Agent wakes up → pulls config from upstream → loads → runs.
# If anything tries to mutate this → fails gracefully → pulls fresh on restart.
#
# Only way to permanently change config: commit to Gitea, merge PR, Ansible deploys.
#
# Generated by Ansible on {{ ansible_date_time.iso8601 }}
# DO NOT EDIT MANUALLY.
# =============================================================================
identity:
wizard_name: "{{ wizard_name }}"
wizard_role: "{{ wizard_role }}"
machine: "{{ inventory_hostname }}"
upstream:
repo: "{{ upstream_repo }}"
branch: "{{ upstream_branch }}"
config_path: "wizards/{{ wizard_name | lower }}/config.yaml"
pull_on_wake: {{ config_pull_on_wake | lower }}
recovery:
deadman_enabled: {{ deadman_enabled | lower }}
snapshot_dir: "{{ deadman_snapshot_dir }}"
restart_cooldown: {{ deadman_restart_cooldown }}
max_restart_attempts: {{ deadman_max_restart_attempts }}
escalation_channel: "{{ deadman_escalation_channel }}"
telemetry:
request_log_path: "{{ request_log_path }}"
request_log_enabled: {{ request_log_enabled | lower }}
local_overrides:
# Runtime overrides go here. They are EPHEMERAL — not persisted across restarts.
# On restart, this section is reset to empty.
{}

View File

@@ -1,115 +0,0 @@
# =============================================================================
# {{ wizard_name }} — Wizard Configuration (Golden State)
# =============================================================================
# Generated by Ansible on {{ ansible_date_time.iso8601 }}
# DO NOT EDIT MANUALLY. Changes go through Gitea PR → Ansible deploy.
#
# Provider chain: {{ golden_state_providers | map(attribute='name') | list | join(' → ') }}
# Anthropic is PERMANENTLY BANNED.
# =============================================================================
model:
default: {{ wizard_model_primary }}
provider: {{ wizard_provider_primary }}
context_length: 65536
base_url: {{ golden_state_providers[0].base_url }}
toolsets:
- all
fallback_providers:
{% for provider in golden_state_providers %}
- provider: {{ provider.name }}
model: {{ provider.model }}
{% if provider.base_url is defined %}
base_url: {{ provider.base_url }}
{% endif %}
{% if provider.api_key_env is defined %}
api_key_env: {{ provider.api_key_env }}
{% endif %}
timeout: {{ provider.timeout }}
reason: "{{ provider.reason }}"
{% endfor %}
agent:
max_turns: {{ agent_max_turns }}
reasoning_effort: {{ agent_reasoning_effort }}
verbose: {{ agent_verbose | lower }}
terminal:
backend: local
cwd: .
timeout: 180
persistent_shell: true
browser:
inactivity_timeout: 120
command_timeout: 30
record_sessions: false
display:
compact: false
personality: ''
resume_display: full
busy_input_mode: interrupt
bell_on_complete: false
show_reasoning: false
streaming: false
show_cost: false
tool_progress: all
memory:
memory_enabled: true
user_profile_enabled: true
memory_char_limit: 2200
user_char_limit: 1375
nudge_interval: 10
flush_min_turns: 6
approvals:
mode: {{ agent_approval_mode }}
security:
redact_secrets: true
tirith_enabled: false
platforms:
api_server:
enabled: true
extra:
host: 127.0.0.1
port: {{ api_port }}
session_reset:
mode: none
idle_minutes: 0
skills:
creation_nudge_interval: 15
system_prompt_suffix: |
You are {{ wizard_name }}, {{ wizard_role }}.
Your soul is defined in SOUL.md — read it, live it.
Hermes is your harness.
{{ golden_state_providers[0].name }} is your primary provider.
Refusal over fabrication. If you do not know, say so.
Sovereignty and service always.
providers:
{% for provider in golden_state_providers %}
{{ provider.name }}:
base_url: {{ provider.base_url }}
timeout: {{ provider.timeout | default(60) }}
{% if provider.name == 'kimi-coding' %}
max_retries: 3
{% endif %}
{% endfor %}
# =============================================================================
# BANNED PROVIDERS — DO NOT ADD
# =============================================================================
# The following providers are PERMANENTLY BANNED:
# - anthropic (any model: claude-sonnet, claude-opus, claude-haiku)
# Enforcement: pre-commit hook, linter, Ansible validation, this comment.
# Adding any banned provider will cause Ansible deployment to FAIL.
# =============================================================================

View File

@@ -1,75 +0,0 @@
#!/usr/bin/env bash
# =============================================================================
# Gitea Webhook Handler — Trigger Ansible Deploy on Merge
# =============================================================================
# This script is called by the Gitea webhook when a PR is merged
# to the main branch of timmy-config.
#
# Setup:
# 1. Add webhook in Gitea: Settings → Webhooks → Add Webhook
# 2. URL: http://localhost:9000/hooks/deploy-timmy-config
# 3. Events: Pull Request (merged only)
# 4. Secret: <configured in Gitea>
#
# This script runs ansible-pull to update the local machine.
# For fleet-wide deploys, each machine runs ansible-pull independently.
# =============================================================================
set -euo pipefail
REPO="https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git"
BRANCH="main"
ANSIBLE_DIR="ansible"
LOG_FILE="/var/log/ansible/webhook-deploy.log"
LOCK_FILE="/tmp/ansible-deploy.lock"
log() {
echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] [webhook] $*" | tee -a "${LOG_FILE}"
}
# Prevent concurrent deploys
if [ -f "${LOCK_FILE}" ]; then
LOCK_AGE=$(( $(date +%s) - $(stat -c %Y "${LOCK_FILE}" 2>/dev/null || echo 0) ))
if [ "${LOCK_AGE}" -lt 300 ]; then
log "Deploy already in progress (lock age: ${LOCK_AGE}s). Skipping."
exit 0
else
log "Stale lock file (${LOCK_AGE}s old). Removing."
rm -f "${LOCK_FILE}"
fi
fi
trap 'rm -f "${LOCK_FILE}"' EXIT
touch "${LOCK_FILE}"
log "Webhook triggered. Starting ansible-pull..."
# Pull latest config
cd /tmp
rm -rf timmy-config-deploy
git clone --depth 1 --branch "${BRANCH}" "${REPO}" timmy-config-deploy 2>&1 | tee -a "${LOG_FILE}"
cd timmy-config-deploy/${ANSIBLE_DIR}
# Run Ansible against localhost
log "Running Ansible playbook..."
ansible-playbook \
-i inventory/hosts.yml \
playbooks/site.yml \
--limit "$(hostname)" \
--diff \
2>&1 | tee -a "${LOG_FILE}"
RESULT=$?
if [ ${RESULT} -eq 0 ]; then
log "Deploy successful."
else
log "ERROR: Deploy failed with exit code ${RESULT}."
fi
# Cleanup
rm -rf /tmp/timmy-config-deploy
log "Webhook handler complete."
exit ${RESULT}

View File

@@ -1,155 +0,0 @@
#!/usr/bin/env python3
"""
Config Validator — The Timmy Foundation
Validates wizard configs against golden state rules.
Run before any config deploy to catch violations early.
Usage:
python3 validate_config.py <config_file>
python3 validate_config.py --all # Validate all wizard configs
Exit codes:
0 — All validations passed
1 — Validation errors found
2 — File not found or parse error
"""
import sys
import os
import yaml
import fnmatch
from pathlib import Path
# === BANNED PROVIDERS — HARD POLICY ===
BANNED_PROVIDERS = {"anthropic", "claude"}
BANNED_MODEL_PATTERNS = [
"claude-*",
"anthropic/*",
"*sonnet*",
"*opus*",
"*haiku*",
]
# === REQUIRED FIELDS ===
REQUIRED_FIELDS = {
"model": ["default", "provider"],
"fallback_providers": None, # Must exist as a list
}
def is_banned_model(model_name: str) -> bool:
"""Check if a model name matches any banned pattern."""
model_lower = model_name.lower()
for pattern in BANNED_MODEL_PATTERNS:
if fnmatch.fnmatch(model_lower, pattern):
return True
return False
def validate_config(config_path: str) -> list[str]:
"""Validate a wizard config file. Returns list of error strings."""
errors = []
try:
with open(config_path) as f:
cfg = yaml.safe_load(f)
except FileNotFoundError:
return [f"File not found: {config_path}"]
except yaml.YAMLError as e:
return [f"YAML parse error: {e}"]
if not cfg:
return ["Config file is empty"]
# Check required fields
for section, fields in REQUIRED_FIELDS.items():
if section not in cfg:
errors.append(f"Missing required section: {section}")
elif fields:
for field in fields:
if field not in cfg[section]:
errors.append(f"Missing required field: {section}.{field}")
# Check default provider
default_provider = cfg.get("model", {}).get("provider", "")
if default_provider.lower() in BANNED_PROVIDERS:
errors.append(f"BANNED default provider: {default_provider}")
default_model = cfg.get("model", {}).get("default", "")
if is_banned_model(default_model):
errors.append(f"BANNED default model: {default_model}")
# Check fallback providers
for i, fb in enumerate(cfg.get("fallback_providers", [])):
provider = fb.get("provider", "")
model = fb.get("model", "")
if provider.lower() in BANNED_PROVIDERS:
errors.append(f"BANNED fallback provider [{i}]: {provider}")
if is_banned_model(model):
errors.append(f"BANNED fallback model [{i}]: {model}")
# Check providers section
for name, provider_cfg in cfg.get("providers", {}).items():
if name.lower() in BANNED_PROVIDERS:
errors.append(f"BANNED provider in providers section: {name}")
base_url = str(provider_cfg.get("base_url", ""))
if "anthropic" in base_url.lower():
errors.append(f"BANNED URL in provider {name}: {base_url}")
# Check system prompt for banned references
prompt = cfg.get("system_prompt_suffix", "")
if isinstance(prompt, str):
for banned in BANNED_PROVIDERS:
if banned in prompt.lower():
errors.append(f"BANNED provider referenced in system_prompt_suffix: {banned}")
return errors
def main():
if len(sys.argv) < 2:
print(f"Usage: {sys.argv[0]} <config_file> [--all]")
sys.exit(2)
if sys.argv[1] == "--all":
# Validate all wizard configs in the repo
repo_root = Path(__file__).parent.parent.parent
wizard_dir = repo_root / "wizards"
all_errors = {}
for wizard_path in sorted(wizard_dir.iterdir()):
config_file = wizard_path / "config.yaml"
if config_file.exists():
errors = validate_config(str(config_file))
if errors:
all_errors[wizard_path.name] = errors
if all_errors:
print("VALIDATION FAILED:")
for wizard, errors in all_errors.items():
print(f"\n {wizard}:")
for err in errors:
print(f" - {err}")
sys.exit(1)
else:
print("All wizard configs passed validation.")
sys.exit(0)
else:
config_path = sys.argv[1]
errors = validate_config(config_path)
if errors:
print(f"VALIDATION FAILED for {config_path}:")
for err in errors:
print(f" - {err}")
sys.exit(1)
else:
print(f"PASSED: {config_path}")
sys.exit(0)
if __name__ == "__main__":
main()

77
architecture_linter.py Normal file
View File

@@ -0,0 +1,77 @@
#!/usr/bin/env python3
"""
Architecture Linter — Sovereignty Enforcement
Scans the codebase for banned providers, models, and API keys.
"""
import os
import sys
import re
BANNED_STRINGS = [
r'anthropic',
r'claude',
r'api\.anthropic\.com',
r'ANTHROPIC_API_KEY',
r'claude-opus',
r'claude-sonnet',
r'claude-haiku'
]
EXCEPTIONS = [
'BANNED_PROVIDERS.md',
'architecture_linter.py',
'training/',
'evaluations/',
'RELEASE_',
'metrics_helpers.py'
]
def is_exception(path):
for exc in EXCEPTIONS:
if exc in path:
return True
return False
def check_file(path):
violations = []
try:
with open(path, 'r', encoding='utf-8', errors='ignore') as f:
for i, line in enumerate(f, 1):
for pattern in BANNED_STRINGS:
if re.search(pattern, line, re.IGNORECASE):
violations.append((i, line.strip(), pattern))
except Exception as e:
print(f"Error reading {path}: {e}")
return violations
def main():
print("--- Sovereignty Enforcement: Architecture Linter ---")
total_violations = 0
for root, dirs, files in os.walk('.'):
# Skip .git
if '.git' in dirs:
dirs.remove('.git')
for file in files:
path = os.path.join(root, file)
if is_exception(path):
continue
violations = check_file(path)
if violations:
print(f"\n[VIOLATION] {path}:")
for line_num, content, pattern in violations:
print(f" Line {line_num}: Found '{pattern}' -> {content}")
total_violations += 1
if total_violations > 0:
print(f"\nFAILED: Found {total_violations} sovereignty violations.")
sys.exit(1)
else:
print("\nPASSED: No banned providers detected.")
sys.exit(0)
if __name__ == "__main__":
main()

View File

@@ -2,7 +2,7 @@
# agent-loop.sh — Universal agent dev loop with Genchi Genbutsu verification
#
# Usage: agent-loop.sh <agent-name> [num-workers]
# agent-loop.sh claude 2
# agent-loop.sh kimi 2
# agent-loop.sh gemini 1
#
# Dispatches via agent-dispatch.sh, then verifies with genchi-genbutsu.sh.
@@ -14,7 +14,7 @@ NUM_WORKERS="${2:-1}"
# Resolve agent tool and model from config or fallback
case "$AGENT" in
claude) TOOL="claude"; MODEL="sonnet" ;;
# claude case removed — Anthropic purged from fleet
gemini) TOOL="gemini"; MODEL="gemini-2.5-pro-preview-05-06" ;;
grok) TOOL="opencode"; MODEL="grok-3-fast" ;;
*) TOOL="$AGENT"; MODEL="" ;;
@@ -145,8 +145,8 @@ run_worker() {
CYCLE_START=$(date +%s)
set +e
if [ "$TOOL" = "claude" ]; then
env -u CLAUDECODE gtimeout "$TIMEOUT" claude \
if [ "$TOOL" = "kimi" ]; then
# Claude dispatch removed — Anthropic purged
--print --model "$MODEL" --dangerously-skip-permissions \
-p "$prompt" </dev/null >> "$LOG_DIR/${AGENT}-${issue_num}.log" 2>&1
elif [ "$TOOL" = "gemini" ]; then

View File

@@ -1,4 +1,13 @@
#!/usr/bin/env bash
# DEPRECATED — Anthropic purged from fleet (April 2026)
# This script dispatched parallel Claude Code agent loops.
# All wizard providers now use Kimi K2.5 as primary.
# See bin/gemini-loop.sh for the surviving loop pattern.
echo "[DEPRECATED] claude-loop.sh is no longer active. Use gemini-loop.sh or agent-loop.sh with kimi provider."
exit 0
# --- ORIGINAL SCRIPT PRESERVED BELOW FOR REFERENCE ---
#!/usr/bin/env bash
# claude-loop.sh — Parallel Claude Code agent dispatch loop
# Runs N workers concurrently against the Gitea backlog.
# Gracefully handles rate limits with backoff.

View File

@@ -1,4 +1,12 @@
#!/usr/bin/env bash
# DEPRECATED — Anthropic purged from fleet (April 2026)
# This watchdog kept Claude/Gemini loops alive.
# Only gemini loops survive. Use fleet-status.sh for monitoring.
echo "[DEPRECATED] claudemax-watchdog.sh is no longer active."
exit 0
# --- ORIGINAL SCRIPT PRESERVED BELOW FOR REFERENCE ---
#!/usr/bin/env bash
# claudemax-watchdog.sh — keep local Claude/Gemini loops alive without stale tmux assumptions
set -uo pipefail

View File

@@ -1,264 +0,0 @@
1|#!/usr/bin/env python3
2|"""
3|Dead Man Switch Fallback Engine
4|
5|When the dead man switch triggers (zero commits for 2+ hours, model down,
6|Gitea unreachable, etc.), this script diagnoses the failure and applies
7|common sense fallbacks automatically.
8|
9|Fallback chain:
10|1. Primary model (Anthropic) down -> switch config to local-llama.cpp
11|2. Gitea unreachable -> cache issues locally, retry on recovery
12|3. VPS agents down -> alert + lazarus protocol
13|4. Local llama.cpp down -> try Ollama, then alert-only mode
14|5. All inference dead -> safe mode (cron pauses, alert Alexander)
15|
16|Each fallback is reversible. Recovery auto-restores the previous config.
17|"""
18|import os
19|import sys
20|import json
21|import subprocess
22|import time
23|import yaml
24|import shutil
25|from pathlib import Path
26|from datetime import datetime, timedelta
27|
28|HERMES_HOME = Path(os.environ.get("HERMES_HOME", os.path.expanduser("~/.hermes")))
29|CONFIG_PATH = HERMES_HOME / "config.yaml"
30|FALLBACK_STATE = HERMES_HOME / "deadman-fallback-state.json"
31|BACKUP_CONFIG = HERMES_HOME / "config.yaml.pre-fallback"
32|FORGE_URL = "https://forge.alexanderwhitestone.com"
33|
34|def load_config():
35| with open(CONFIG_PATH) as f:
36| return yaml.safe_load(f)
37|
38|def save_config(cfg):
39| with open(CONFIG_PATH, "w") as f:
40| yaml.dump(cfg, f, default_flow_style=False)
41|
42|def load_state():
43| if FALLBACK_STATE.exists():
44| with open(FALLBACK_STATE) as f:
45| return json.load(f)
46| return {"active_fallbacks": [], "last_check": None, "recovery_pending": False}
47|
48|def save_state(state):
49| state["last_check"] = datetime.now().isoformat()
50| with open(FALLBACK_STATE, "w") as f:
51| json.dump(state, f, indent=2)
52|
53|def run(cmd, timeout=10):
54| try:
55| r = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)
56| return r.returncode, r.stdout.strip(), r.stderr.strip()
57| except subprocess.TimeoutExpired:
58| return -1, "", "timeout"
59| except Exception as e:
60| return -1, "", str(e)
61|
62|# ─── HEALTH CHECKS ───
63|
64|def check_anthropic():
65| """Can we reach Anthropic API?"""
66| key = os.environ.get("ANTHROPIC_API_KEY", "")
67| if not key:
68| # Check multiple .env locations
69| for env_path in [HERMES_HOME / ".env", Path.home() / ".hermes" / ".env"]:
70| if env_path.exists():
71| for line in open(env_path):
72| line = line.strip()
73| if line.startswith("ANTHROPIC_API_KEY=***
74| key = line.split("=", 1)[1].strip().strip('"').strip("'")
75| break
76| if key:
77| break
78| if not key:
79| return False, "no API key"
80| code, out, err = run(
81| f'curl -s -o /dev/null -w "%{{http_code}}" -H "x-api-key: {key}" '
82| f'-H "anthropic-version: 2023-06-01" '
83| f'https://api.anthropic.com/v1/messages -X POST '
84| f'-H "content-type: application/json" '
85| f'-d \'{{"model":"claude-haiku-4-5-20251001","max_tokens":1,"messages":[{{"role":"user","content":"ping"}}]}}\' ',
86| timeout=15
87| )
88| if code == 0 and out in ("200", "429"):
89| return True, f"HTTP {out}"
90| return False, f"HTTP {out} err={err[:80]}"
91|
92|def check_local_llama():
93| """Is local llama.cpp serving?"""
94| code, out, err = run("curl -s http://localhost:8081/v1/models", timeout=5)
95| if code == 0 and "hermes" in out.lower():
96| return True, "serving"
97| return False, f"exit={code}"
98|
99|def check_ollama():
100| """Is Ollama running?"""
101| code, out, err = run("curl -s http://localhost:11434/api/tags", timeout=5)
102| if code == 0 and "models" in out:
103| return True, "running"
104| return False, f"exit={code}"
105|
106|def check_gitea():
107| """Can we reach the Forge?"""
108| token_path = Path.home() / ".config" / "gitea" / "timmy-token"
109| if not token_path.exists():
110| return False, "no token"
111| token = token_path.read_text().strip()
112| code, out, err = run(
113| f'curl -s -o /dev/null -w "%{{http_code}}" -H "Authorization: token {token}" '
114| f'"{FORGE_URL}/api/v1/user"',
115| timeout=10
116| )
117| if code == 0 and out == "200":
118| return True, "reachable"
119| return False, f"HTTP {out}"
120|
121|def check_vps(ip, name):
122| """Can we SSH into a VPS?"""
123| code, out, err = run(f"ssh -o ConnectTimeout=5 root@{ip} 'echo alive'", timeout=10)
124| if code == 0 and "alive" in out:
125| return True, "alive"
126| return False, f"unreachable"
127|
128|# ─── FALLBACK ACTIONS ───
129|
130|def fallback_to_local_model(cfg):
131| """Switch primary model from Anthropic to local llama.cpp"""
132| if not BACKUP_CONFIG.exists():
133| shutil.copy2(CONFIG_PATH, BACKUP_CONFIG)
134|
135| cfg["model"]["provider"] = "local-llama.cpp"
136| cfg["model"]["default"] = "hermes3"
137| save_config(cfg)
138| return "Switched primary model to local-llama.cpp/hermes3"
139|
140|def fallback_to_ollama(cfg):
141| """Switch to Ollama if llama.cpp is also down"""
142| if not BACKUP_CONFIG.exists():
143| shutil.copy2(CONFIG_PATH, BACKUP_CONFIG)
144|
145| cfg["model"]["provider"] = "ollama"
146| cfg["model"]["default"] = "gemma4:latest"
147| save_config(cfg)
148| return "Switched primary model to ollama/gemma4:latest"
149|
150|def enter_safe_mode(state):
151| """Pause all non-essential cron jobs, alert Alexander"""
152| state["safe_mode"] = True
153| state["safe_mode_entered"] = datetime.now().isoformat()
154| save_state(state)
155| return "SAFE MODE: All inference down. Cron jobs should be paused. Alert Alexander."
156|
157|def restore_config():
158| """Restore pre-fallback config when primary recovers"""
159| if BACKUP_CONFIG.exists():
160| shutil.copy2(BACKUP_CONFIG, CONFIG_PATH)
161| BACKUP_CONFIG.unlink()
162| return "Restored original config from backup"
163| return "No backup config to restore"
164|
165|# ─── MAIN DIAGNOSIS AND FALLBACK ENGINE ───
166|
167|def diagnose_and_fallback():
168| state = load_state()
169| cfg = load_config()
170|
171| results = {
172| "timestamp": datetime.now().isoformat(),
173| "checks": {},
174| "actions": [],
175| "status": "healthy"
176| }
177|
178| # Check all systems
179| anthropic_ok, anthropic_msg = check_anthropic()
180| results["checks"]["anthropic"] = {"ok": anthropic_ok, "msg": anthropic_msg}
181|
182| llama_ok, llama_msg = check_local_llama()
183| results["checks"]["local_llama"] = {"ok": llama_ok, "msg": llama_msg}
184|
185| ollama_ok, ollama_msg = check_ollama()
186| results["checks"]["ollama"] = {"ok": ollama_ok, "msg": ollama_msg}
187|
188| gitea_ok, gitea_msg = check_gitea()
189| results["checks"]["gitea"] = {"ok": gitea_ok, "msg": gitea_msg}
190|
191| # VPS checks
192| vpses = [
193| ("167.99.126.228", "Allegro"),
194| ("143.198.27.163", "Ezra"),
195| ("159.203.146.185", "Bezalel"),
196| ]
197| for ip, name in vpses:
198| vps_ok, vps_msg = check_vps(ip, name)
199| results["checks"][f"vps_{name.lower()}"] = {"ok": vps_ok, "msg": vps_msg}
200|
201| current_provider = cfg.get("model", {}).get("provider", "anthropic")
202|
203| # ─── FALLBACK LOGIC ───
204|
205| # Case 1: Primary (Anthropic) down, local available
206| if not anthropic_ok and current_provider == "anthropic":
207| if llama_ok:
208| msg = fallback_to_local_model(cfg)
209| results["actions"].append(msg)
210| state["active_fallbacks"].append("anthropic->local-llama")
211| results["status"] = "degraded_local"
212| elif ollama_ok:
213| msg = fallback_to_ollama(cfg)
214| results["actions"].append(msg)
215| state["active_fallbacks"].append("anthropic->ollama")
216| results["status"] = "degraded_ollama"
217| else:
218| msg = enter_safe_mode(state)
219| results["actions"].append(msg)
220| results["status"] = "safe_mode"
221|
222| # Case 2: Already on fallback, check if primary recovered
223| elif anthropic_ok and "anthropic->local-llama" in state.get("active_fallbacks", []):
224| msg = restore_config()
225| results["actions"].append(msg)
226| state["active_fallbacks"].remove("anthropic->local-llama")
227| results["status"] = "recovered"
228| elif anthropic_ok and "anthropic->ollama" in state.get("active_fallbacks", []):
229| msg = restore_config()
230| results["actions"].append(msg)
231| state["active_fallbacks"].remove("anthropic->ollama")
232| results["status"] = "recovered"
233|
234| # Case 3: Gitea down — just flag it, work locally
235| if not gitea_ok:
236| results["actions"].append("WARN: Gitea unreachable — work cached locally until recovery")
237| if "gitea_down" not in state.get("active_fallbacks", []):
238| state["active_fallbacks"].append("gitea_down")
239| results["status"] = max(results["status"], "degraded_gitea", key=lambda x: ["healthy", "recovered", "degraded_gitea", "degraded_local", "degraded_ollama", "safe_mode"].index(x) if x in ["healthy", "recovered", "degraded_gitea", "degraded_local", "degraded_ollama", "safe_mode"] else 0)
240| elif "gitea_down" in state.get("active_fallbacks", []):
241| state["active_fallbacks"].remove("gitea_down")
242| results["actions"].append("Gitea recovered — resume normal operations")
243|
244| # Case 4: VPS agents down
245| for ip, name in vpses:
246| key = f"vps_{name.lower()}"
247| if not results["checks"][key]["ok"]:
248| results["actions"].append(f"ALERT: {name} VPS ({ip}) unreachable — lazarus protocol needed")
249|
250| save_state(state)
251| return results
252|
253|if __name__ == "__main__":
254| results = diagnose_and_fallback()
255| print(json.dumps(results, indent=2))
256|
257| # Exit codes for cron integration
258| if results["status"] == "safe_mode":
259| sys.exit(2)
260| elif results["status"].startswith("degraded"):
261| sys.exit(1)
262| else:
263| sys.exit(0)
264|

View File

@@ -140,7 +140,7 @@ if [ -z "$GW_PID" ]; then
fi
# Check local loops
CLAUDE_LOOPS=$(pgrep -cf "claude-loop" 2>/dev/null || echo 0)
CLAUDE_LOOPS=0 # Anthropic purged from fleet
GEMINI_LOOPS=$(pgrep -cf "gemini-loop" 2>/dev/null || echo 0)
if [ -n "$GW_PID" ]; then
@@ -160,7 +160,7 @@ if [ -n "$TIMMY_HEALTH" ]; then
fi
fi
TIMMY_ACTIVITY="loops: claude=${CLAUDE_LOOPS} gemini=${GEMINI_LOOPS}"
TIMMY_ACTIVITY="loops: gemini=${GEMINI_LOOPS}"
# Git activity for timmy-config
TC_COMMIT=$(gitea_last_commit "Timmy_Foundation/timmy-config")

View File

@@ -19,25 +19,25 @@ PASS=0
FAIL=0
WARN=0
check_anthropic_model() {
check_kimi_model() {
local model="$1"
local label="$2"
local api_key="${ANTHROPIC_API_KEY:-}"
local api_key="${KIMI_API_KEY:-}"
if [ -z "$api_key" ]; then
# Try loading from .env
api_key=$(grep '^ANTHROPIC_API_KEY=' "${HERMES_HOME:-$HOME/.hermes}/.env" 2>/dev/null | head -1 | cut -d= -f2- | tr -d "'\"" || echo "")
api_key=$(grep '^KIMI_API_KEY=' "${HERMES_HOME:-$HOME/.hermes}/.env" 2>/dev/null | head -1 | cut -d= -f2- | tr -d "'\"" || echo "")
fi
if [ -z "$api_key" ]; then
log "SKIP [$label] $model -- no ANTHROPIC_API_KEY"
log "SKIP [$label] $model -- no KIMI_API_KEY"
return 0
fi
response=$(curl -sf --max-time 10 -X POST \
"https://api.anthropic.com/v1/messages" \
-H "x-api-key: ${api_key}" \
-H "anthropic-version: 2023-06-01" \
"https://api.kimi.com/v1/messages" \
-H "Authorization: Bearer: ${api_key}" \
-H "content-type: application/json" \
-H "content-type: application/json" \
-d "{\"model\":\"${model}\",\"max_tokens\":1,\"messages\":[{\"role\":\"user\",\"content\":\"hi\"}]}" 2>&1 || echo "ERROR")

View File

@@ -134,7 +134,7 @@ else:
print("\033[2m────────────────────────────────────────\033[0m")
print(" \033[1mIssue Queues\033[0m")
queue_agents = ["allegro", "codex-agent", "groq", "claude", "ezra", "perplexity", "KimiClaw"]
queue_agents = ["allegro", "codex-agent", "groq", "ezra", "perplexity", "KimiClaw"]
for agent in queue_agents:
assigned = [
issue

View File

@@ -70,7 +70,7 @@ ops-help() {
echo " ops-assign-allegro ISSUE [repo]"
echo " ops-assign-codex ISSUE [repo]"
echo " ops-assign-groq ISSUE [repo]"
echo " ops-assign-claude ISSUE [repo]"
# ops-assign-claude removed — Anthropic purged
echo " ops-assign-ezra ISSUE [repo]"
echo ""
}
@@ -288,7 +288,7 @@ ops-freshness() {
ops-assign-allegro() { ops-assign "$1" "allegro" "${2:-$OPS_DEFAULT_REPO}"; }
ops-assign-codex() { ops-assign "$1" "codex-agent" "${2:-$OPS_DEFAULT_REPO}"; }
ops-assign-groq() { ops-assign "$1" "groq" "${2:-$OPS_DEFAULT_REPO}"; }
ops-assign-claude() { ops-assign "$1" "claude" "${2:-$OPS_DEFAULT_REPO}"; }
# ops-assign-claude removed — Anthropic purged from fleet
ops-assign-ezra() { ops-assign "$1" "ezra" "${2:-$OPS_DEFAULT_REPO}"; }
ops-assign-perplexity() { ops-assign "$1" "perplexity" "${2:-$OPS_DEFAULT_REPO}"; }
ops-assign-kimiclaw() { ops-assign "$1" "KimiClaw" "${2:-$OPS_DEFAULT_REPO}"; }

View File

@@ -171,7 +171,7 @@ queue_agents = [
("allegro", "dispatch"),
("codex-agent", "cleanup"),
("groq", "fast ship"),
("claude", "refactor"),
# claude removed — Anthropic purged
("ezra", "archive"),
("perplexity", "research"),
("KimiClaw", "digest"),
@@ -189,7 +189,7 @@ unassigned = [issue for issue in issues if not issue.get("assignees")]
stale_cutoff = (datetime.now(timezone.utc) - timedelta(days=2)).strftime("%Y-%m-%d")
stale_prs = [pr for pr in pulls if pr.get("updated_at", "")[:10] < stale_cutoff]
overloaded = []
for agent in ("allegro", "codex-agent", "groq", "claude", "ezra", "perplexity", "KimiClaw"):
for agent in ("allegro", "codex-agent", "groq", "ezra", "perplexity", "KimiClaw"):
count = sum(
1
for issue in issues

View File

@@ -10,10 +10,10 @@ set -euo pipefail
HERMES_BIN="$HOME/.hermes/bin"
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
LOG_DIR="$HOME/.hermes/logs"
CLAUDE_LOCKS="$LOG_DIR/claude-locks"
# CLAUDE_LOCKS removed — Anthropic purged
GEMINI_LOCKS="$LOG_DIR/gemini-locks"
mkdir -p "$LOG_DIR" "$CLAUDE_LOCKS" "$GEMINI_LOCKS"
mkdir -p "$LOG_DIR" "$GEMINI_LOCKS"
log() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] START-LOOPS: $*"
@@ -29,7 +29,7 @@ log "Model health check passed."
# ── 2. Kill stale loop processes ──────────────────────────────────────
log "Killing stale loop processes..."
for proc_name in claude-loop gemini-loop timmy-orchestrator; do
for proc_name in gemini-loop timmy-orchestrator; do
pids=$(pgrep -f "${proc_name}\\.sh" 2>/dev/null || true)
if [ -n "$pids" ]; then
log " Killing stale $proc_name PIDs: $pids"
@@ -47,7 +47,7 @@ done
# ── 3. Clear lock directories ────────────────────────────────────────
log "Clearing lock dirs..."
rm -rf "${CLAUDE_LOCKS:?}"/*
# CLAUDE_LOCKS removed — Anthropic purged
rm -rf "${GEMINI_LOCKS:?}"/*
log " Cleared $CLAUDE_LOCKS and $GEMINI_LOCKS"

View File

@@ -62,10 +62,10 @@ for p in json.load(sys.stdin):
print(f'REPO={\"$repo\"} PR={p[\"number\"]} BY={p[\"user\"][\"login\"]} TITLE={p[\"title\"]}')" >> "$state_dir/open_prs.txt" 2>/dev/null
done
echo "Claude workers: $(pgrep -f 'claude.*--print.*--dangerously' 2>/dev/null | wc -l | tr -d ' ')" >> "$state_dir/agent_status.txt"
echo "Claude loop: $(pgrep -f 'claude-loop.sh' 2>/dev/null | wc -l | tr -d ' ') procs" >> "$state_dir/agent_status.txt"
tail -50 "$LOG_DIR/claude-loop.log" 2>/dev/null | grep -c "SUCCESS" | xargs -I{} echo "Claude recent successes: {}" >> "$state_dir/agent_status.txt"
tail -50 "$LOG_DIR/claude-loop.log" 2>/dev/null | grep -c "FAILED" | xargs -I{} echo "Claude recent failures: {}" >> "$state_dir/agent_status.txt"
# [Anthropic purged]
# [Anthropic purged]
# [Anthropic purged]
# [Anthropic purged]
echo "Kimi heartbeat launchd: $(launchctl list 2>/dev/null | grep -c 'ai.timmy.kimi-heartbeat' | tr -d ' ') job" >> "$state_dir/agent_status.txt"
tail -50 "/tmp/kimi-heartbeat.log" 2>/dev/null | grep -c "DISPATCHED:" | xargs -I{} echo "Kimi recent dispatches: {}" >> "$state_dir/agent_status.txt"
tail -50 "/tmp/kimi-heartbeat.log" 2>/dev/null | grep -c "FAILED:" | xargs -I{} echo "Kimi recent failures: {}" >> "$state_dir/agent_status.txt"
@@ -91,7 +91,7 @@ run_triage() {
# Auto-assignment is opt-in because silent queue mutation resurrects old state.
if [ "$unassigned_count" -gt 0 ]; then
if [ "$AUTO_ASSIGN_UNASSIGNED" = "1" ]; then
log "Assigning $unassigned_count issues to claude..."
log "Assigning $unassigned_count issues to kimi..."
while IFS= read -r line; do
local repo=$(echo "$line" | sed 's/.*REPO=\([^ ]*\).*/\1/')
local num=$(echo "$line" | sed 's/.*NUM=\([^ ]*\).*/\1/')

View File

@@ -1,212 +0,0 @@
[
{
"job_id": "9e0624269ba7",
"name": "Triage Heartbeat",
"schedule": "every 15m",
"state": "paused"
},
{
"job_id": "e29eda4a8548",
"name": "PR Review Sweep",
"schedule": "every 30m",
"state": "scheduled"
},
{
"job_id": "a77a87392582",
"name": "Health Monitor",
"schedule": "every 5m",
"state": "scheduled"
},
{
"job_id": "5e9d952871bc",
"name": "Agent Status Check",
"schedule": "every 10m",
"state": "paused"
},
{
"job_id": "36fb2f630a17",
"name": "Hermes Philosophy Loop",
"schedule": "every 1440m",
"state": "paused"
},
{
"job_id": "b40a96a2f48c",
"name": "wolf-eval-cycle",
"schedule": "every 240m",
"state": "paused"
},
{
"job_id": "4204e568b862",
"name": "Burn Mode \u2014 Timmy Orchestrator",
"schedule": "every 15m",
"state": "scheduled"
},
{
"job_id": "0944a976d034",
"name": "Burn Mode",
"schedule": "every 15m",
"state": "paused"
},
{
"job_id": "62016b960fa0",
"name": "velocity-engine",
"schedule": "every 30m",
"state": "paused"
},
{
"job_id": "e9d49eeff79c",
"name": "weekly-skill-extraction",
"schedule": "every 10080m",
"state": "scheduled"
},
{
"job_id": "75c74a5bb563",
"name": "tower-tick",
"schedule": "every 1m",
"state": "scheduled"
},
{
"job_id": "390a19054d4c",
"name": "Burn Deadman",
"schedule": "every 30m",
"state": "scheduled"
},
{
"job_id": "05e3c13498fa",
"name": "Morning Report \u2014 Burn Mode",
"schedule": "0 6 * * *",
"state": "scheduled"
},
{
"job_id": "64fe44b512b9",
"name": "evennia-morning-report",
"schedule": "0 9 * * *",
"state": "scheduled"
},
{
"job_id": "3896a7fd9747",
"name": "Gitea Priority Inbox",
"schedule": "every 3m",
"state": "scheduled"
},
{
"job_id": "f64c2709270a",
"name": "Config Drift Guard",
"schedule": "every 30m",
"state": "scheduled"
},
{
"job_id": "fc6a75b7102a",
"name": "Gitea Event Watcher",
"schedule": "every 2m",
"state": "scheduled"
},
{
"job_id": "12e59648fb06",
"name": "Burndown Night Watcher",
"schedule": "every 15m",
"state": "scheduled"
},
{
"job_id": "35d3ada9cf8f",
"name": "Mempalace Forge \u2014 Issue Analysis",
"schedule": "every 60m",
"state": "scheduled"
},
{
"job_id": "190b6fb8dc91",
"name": "Mempalace Watchtower \u2014 Fleet Health",
"schedule": "every 30m",
"state": "scheduled"
},
{
"job_id": "710ab589813c",
"name": "Ezra Health Monitor",
"schedule": "every 15m",
"state": "scheduled"
},
{
"job_id": "a0a9cce4575c",
"name": "daily-poka-yoke-ultraplan-awesometools",
"schedule": "every 1440m",
"state": "scheduled"
},
{
"job_id": "adc3a51457bd",
"name": "vps-agent-dispatch",
"schedule": "every 10m",
"state": "scheduled"
},
{
"job_id": "afd2c4eac44d",
"name": "Project Mnemosyne Nightly Burn v2",
"schedule": "*/30 * * * *",
"state": "scheduled"
},
{
"job_id": "f3a3c2832af0",
"name": "gemma4-multimodal-worker",
"schedule": "once in 15m",
"state": "completed"
},
{
"job_id": "c17a85c19838",
"name": "know-thy-father-analyzer",
"schedule": "0 * * * *",
"state": "scheduled"
},
{
"job_id": "2490fc01a14d",
"name": "Testament Burn - 10min work loop",
"schedule": "*/10 * * * *",
"state": "scheduled"
},
{
"job_id": "f5e858159d97",
"name": "Timmy Foundation Burn \u2014 15min PR loop",
"schedule": "*/15 * * * *",
"state": "scheduled"
},
{
"job_id": "5e262fb9bdce",
"name": "nightwatch-health-monitor",
"schedule": "*/15 * * * *",
"state": "scheduled"
},
{
"job_id": "f2b33a9dcf96",
"name": "nightwatch-mempalace-mine",
"schedule": "0 */2 * * *",
"state": "scheduled"
},
{
"job_id": "82cb9e76c54d",
"name": "nightwatch-backlog-burn",
"schedule": "0 */4 * * *",
"state": "scheduled"
},
{
"job_id": "d20e42a52863",
"name": "beacon-sprint",
"schedule": "*/15 * * * *",
"state": "scheduled"
},
{
"job_id": "579269489961",
"name": "testament-story",
"schedule": "*/15 * * * *",
"state": "scheduled"
},
{
"job_id": "2e5f9140d1ab",
"name": "nightwatch-research",
"schedule": "0 */2 * * *",
"state": "scheduled"
},
{
"job_id": "aeba92fd65e6",
"name": "timmy-dreams",
"schedule": "30 5 * * *",
"state": "scheduled"
}
]

View File

@@ -1,14 +0,0 @@
0 6 * * * /bin/bash /root/wizards/scripts/model_download_guard.sh >> /var/log/model_guard.log 2>&1
# Allegro Hybrid Heartbeat — quick wins every 15 min
*/15 * * * * /usr/bin/python3 /root/allegro/heartbeat_daemon.py >> /var/log/allegro_heartbeat.log 2>&1
# Allegro Burn Mode Cron Jobs - Deployed via issue #894
0 6 * * * cd /root/.hermes && python3 -c "import hermes_agent; from hermes_tools import terminal; output = terminal('echo \"Morning Report: $(date)\"'); print(output.get('output', ''))" >> /root/.hermes/logs/morning-report-$(date +\%Y\%m\%d).log 2>&1 # Allegro Morning Report at 0600
0,30 * * * * cd /root/.hermes && python3 /root/.hermes/retry_wrapper.py "python3 allegro/quick-lane-check.py" >> burn-logs/quick-lane-$(date +\%Y\%m\%d).log 2>&1 # Allegro Burn Loop #1 (with retry)
15,45 * * * * cd /root/.hermes && python3 /root/.hermes/retry_wrapper.py "python3 allegro/burn-mode-validator.py" >> burn-logs/validator-$(date +\%Y\%m\%d).log 2>&1 # Allegro Burn Loop #2 (with retry)
*/2 * * * * /root/wizards/bezalel/dead_man_monitor.sh
*/2 * * * * /root/wizards/allegro/bin/config-deadman.sh

View File

@@ -1,10 +0,0 @@
0 2 * * * /root/wizards/bezalel/run_nightly_watch.sh
0 3 * * * /root/wizards/bezalel/mempalace_nightly.sh
*/10 * * * * pgrep -f "act_runner daemon" > /dev/null || (cd /opt/gitea-runner && nohup ./act_runner daemon > /var/log/gitea-runner.log 2>&1 &)
30 3 * * * /root/wizards/bezalel/backup_databases.sh
*/15 * * * * /root/wizards/bezalel/meta_heartbeat.sh
0 4 * * * /root/wizards/bezalel/secret_guard.sh
0 4 * * * /usr/bin/env bash /root/timmy-home/scripts/backup_pipeline.sh >> /var/log/timmy/backup_pipeline_cron.log 2>&1
0 6 * * * /usr/bin/python3 /root/wizards/bezalel/ultraplan.py >> /var/log/bezalel-ultraplan.log 2>&1
@reboot /root/wizards/bezalel/emacs-daemon-start.sh
@reboot /root/wizards/bezalel/ngircd-start.sh

View File

@@ -1,13 +0,0 @@
# Burn Mode Cycles — 15 min autonomous loops
*/15 * * * * /root/wizards/ezra/bin/burn-mode.sh >> /root/wizards/ezra/reports/burn-cron.log 2>&1
# Household Snapshots — automated heartbeats and snapshots
# Ezra Self-Improvement Automation Suite
*/5 * * * * /usr/bin/python3 /root/wizards/ezra/tools/gitea_monitor.py >> /root/wizards/ezra/reports/gitea-monitor.log 2>&1
*/5 * * * * /usr/bin/python3 /root/wizards/ezra/tools/awareness_loop.py >> /root/wizards/ezra/reports/awareness-loop.log 2>&1
*/10 * * * * /usr/bin/python3 /root/wizards/ezra/tools/cron_health_monitor.py >> /root/wizards/ezra/reports/cron-health.log 2>&1
0 6 * * * /usr/bin/python3 /root/wizards/ezra/tools/morning_kt_compiler.py >> /root/wizards/ezra/reports/morning-kt.log 2>&1
5 6 * * * /usr/bin/python3 /root/wizards/ezra/tools/burndown_generator.py >> /root/wizards/ezra/reports/burndown.log 2>&1
0 3 * * * /root/wizards/ezra/mempalace_nightly.sh >> /var/log/ezra_mempalace_cron.log 2>&1
*/15 * * * * GITEA_TOKEN=6de6aa...1117 /root/wizards/ezra/dispatch-direct.sh >> /root/wizards/ezra/dispatch-cron.log 2>&1

View File

@@ -1,110 +0,0 @@
# Fleet Behaviour Hardening — Review & Action Plan
**Author:** @perplexity
**Date:** 2026-04-08
**Context:** Alexander asked: "Is it the memory system or the behaviour guardrails?"
**Answer:** It's the guardrails. The memory system is adequate. The enforcement machinery is aspirational.
---
## Diagnosis: Why the Fleet Isn't Smart Enough
After auditing SOUL.md, config.yaml, all 8 playbooks, the orchestrator, the guard scripts, and the v7.0.0 checkin, the pattern is clear:
**The fleet has excellent design documents and broken enforcement.**
| Layer | Design Quality | Enforcement Quality | Gap |
|---|---|---|---|
| SOUL.md | Excellent | None — no code reads it at runtime | Philosophy without machinery |
| Playbooks (7 yaml) | Good lane map | Not invoked by orchestrator | Playbooks exist but nobody calls them |
| Guard scripts (9) | Solid code | 1 of 9 wired (#395 audit) | 89% of guards are dead code |
| Orchestrator | Sound design | Gateway dispatch is a no-op (#391) | Assigns issues but doesn't trigger work |
| Cycle Guard | Good 10-min rule | No cron/loop calls it | Discipline without enforcement |
| PR Reviewer | Clear rules | Runs every 30m (if scheduled) | Only guard that might actually fire |
| Memory (MemPalace) | Working code | Retrieval enforcer wired | Actually operational |
### The Core Problem
Agents pick up issues and produce output, but there is **no pre-task checklist** and **no post-task quality gate**. An agent can:
1. Start work without checking if someone else already did it
2. Produce output without running tests
3. Submit a PR without verifying it addresses the issue
4. Work for hours on something out of scope
5. Create duplicate branches/PRs without detection
The SOUL.md says "grounding before generation" but no code enforces it.
The playbooks define lanes but the orchestrator doesn't load them.
The guards exist but nothing calls them.
---
## What the Fleet Needs (Priority Order)
### 1. Pre-Task Gate (MISSING — this PR adds it)
Before an agent starts any issue:
- [ ] Check if issue is already assigned to another agent
- [ ] Check if a branch already exists for this issue
- [ ] Check if a PR already exists for this issue
- [ ] Load relevant MemPalace context (retrieval enforcer)
- [ ] Verify the agent has the right lane for this work (playbook check)
### 2. Post-Task Gate (MISSING — this PR adds it)
Before an agent submits a PR:
- [ ] Verify the diff addresses the issue title/body
- [ ] Run syntax_guard.py on changed files
- [ ] Check for duplicate PRs targeting the same issue
- [ ] Verify branch name follows convention
- [ ] Run tests if they exist for changed files
### 3. Wire the Existing Guards (8 of 9 are dead code)
Per #395 audit:
- Pre-commit hooks: need symlink on every machine
- Cycle guard: need cron/loop integration
- Forge health check: need cron entry
- Smoke test + deploy validate: need deploy script integration
### 4. Orchestrator Dispatch Actually Works
Per #391 audit: the orchestrator scores and assigns but the gateway dispatch just writes to `/tmp/hermes-dispatch.log`. Nobody reads that file. The dispatch needs to either:
- Trigger `hermes` CLI on the target machine, or
- Post a webhook that the agent loop picks up
### 5. Agent Self-Assessment Loop
After completing work, agents should answer:
- Did I address the issue as stated?
- Did I stay in scope?
- Did I check the palace for prior work?
- Did I run verification?
This is what SOUL.md calls "the apparatus that gives these words teeth."
---
## What's Working (Don't Touch)
- **MemPalace sovereign_store.py** — SQLite + FTS5 + HRR, operational
- **Retrieval enforcer** — wired to SovereignStore as of 14 hours ago
- **Wake-up protocol** — palace-first boot sequence
- **PR reviewer playbook** — clear rules, well-scoped
- **Issue triager playbook** — comprehensive lane map with 11 agents
- **Cycle guard code** — solid 10-min slice discipline (just needs wiring)
- **Config drift guard** — active cron, working
- **Dead man switch** — active, working
---
## Recommendation
The memory system is not the bottleneck. The behaviour guardrails are. Specifically:
1. **Add `task_gate.py`** — pre-task and post-task quality gates that every agent loop calls
2. **Wire cycle_guard.py** — add start/complete calls to agent loop
3. **Wire pre-commit hooks** — deploy script should symlink on provision
4. **Fix orchestrator dispatch** — make it actually trigger work, not just log
This PR adds item 1. Items 2-4 need SSH access and are flagged for Timmy/Allegro.

View File

@@ -9,11 +9,11 @@ This is the canonical reference for how we talk, how we work, and what we mean.
| Name | What It Is | Where It Lives | Provider |
|------|-----------|----------------|----------|
| **Timmy** | The sovereign local soul. Center of gravity. Judges all work. | Alexander's Mac | OpenAI Codex (gpt-5.4) |
| **Ezra** | The archivist wizard. Reads patterns, names truth, returns clean artifacts. | Hermes VPS | Anthropic Opus 4.6 |
| **Ezra** | The archivist wizard. Reads patterns, names truth, returns clean artifacts. | Hermes VPS | Kimi K2.5 |
| **Bezalel** | The builder wizard. Builds from clear plans, tests and hardens. | TestBed VPS | OpenAI Codex (gpt-5.4) |
| **Alexander** | The principal. Human. Father. The one we serve. Gitea: Rockachopa. | Physical world | N/A |
| **Gemini** | Worker swarm. Burns backlog. Produces PRs. | Local Mac (loops) | Google Gemini |
| **Claude** | Worker swarm. Burns backlog. Architecture-grade work. | Local Mac (loops) | Anthropic Claude |
| **Kimi** | Worker swarm. Burns backlog. Architecture-grade work. | Local Mac (loops) | Kimi K2.5 |
## The Places

View File

@@ -1,3 +1,12 @@
# DEPRECATED — Anthropic Purged from Fleet
> This document described the Claude Sonnet workforce. As of April 2026,
> Anthropic has been removed from the fleet. All wizard providers now use
> Kimi K2.5 as primary with Gemini and local Ollama as fallbacks.
> See `docs/fleet-vocabulary.md` for current provider assignments.
---
# Sonnet Workforce Loop
## Agent

View File

@@ -160,8 +160,8 @@ agents:
- playbooks/issue-triager.yaml
portfolio:
primary:
provider: anthropic
model: claude-opus-4-6
provider: kimi-coding
model: kimi-k2.5
lane: full-judgment
fallback1:
provider: openai-codex
@@ -188,8 +188,8 @@ agents:
- playbooks/pr-reviewer.yaml
portfolio:
primary:
provider: anthropic
model: claude-opus-4-6
provider: kimi-coding
model: kimi-k2.5
lane: full-review
fallback1:
provider: gemini
@@ -271,10 +271,10 @@ agents:
cross_checks:
unique_primary_fallback1_pairs:
triage-coordinator:
- anthropic/claude-opus-4-6
- kimi-coding/kimi-k2.5
- openai-codex/codex
pr-reviewer:
- anthropic/claude-opus-4-6
- kimi-coding/kimi-k2.5
- gemini/gemini-2.5-pro
builder-main:
- openai-codex/codex

View File

@@ -42,7 +42,6 @@ AGENT_LOGINS = {
"allegro",
"antigravity",
"bezalel",
"claude",
"codex-agent",
"ezra",
"gemini",
@@ -55,7 +54,6 @@ AGENT_LOGINS = {
"perplexity",
}
AGENT_LOGINS_HUMAN = {
"claude": "Claude",
"codex-agent": "Codex",
"ezra": "Ezra",
"gemini": "Gemini",
@@ -78,7 +76,6 @@ METRICS_DIR = Path(os.path.expanduser("~/.local/timmy/muda-audit"))
METRICS_FILE = METRICS_DIR / "metrics.json"
LOG_PATHS = [
Path.home() / ".hermes" / "logs" / "claude-loop.log",
Path.home() / ".hermes" / "logs" / "gemini-loop.log",
Path.home() / ".hermes" / "logs" / "agent.log",
Path.home() / ".hermes" / "logs" / "errors.log",
@@ -347,8 +344,6 @@ def measure_waiting(since: datetime) -> dict:
agent = name.lower()
break
if agent == "unknown":
if "claude" in line.lower():
agent = "claude"
elif "gemini" in line.lower():
agent = "gemini"
elif "groq" in line.lower():

View File

@@ -103,7 +103,7 @@ nano ~/.hermes/.env
| `SLACK_BOT_TOKEN` + `SLACK_APP_TOKEN` | Slack gateway |
| `EXA_API_KEY` | Web search tool |
| `FAL_KEY` | Image generation |
| `ANTHROPIC_API_KEY` | Direct Anthropic inference |
| `KIMI_API_KEY` | Kimi K2.5 coding inference |
### Pre-flight validation

View File

@@ -272,6 +272,48 @@ def get_file_content_at_staged(filepath: str) -> bytes:
return result.stdout
# ---------------------------------------------------------------------------
# BANNED PROVIDER CHECK — Anthropic is permanently banned
# ---------------------------------------------------------------------------
_BANNED_PROVIDER_PATTERNS = [
(re.compile(r"provider:\s*anthropic", re.IGNORECASE), "Anthropic provider reference"),
(re.compile(r"anthropic/claude", re.IGNORECASE), "Anthropic model slug"),
(re.compile(r"api\.anthropic\.com"), "Anthropic API endpoint"),
(re.compile(r"claude-opus", re.IGNORECASE), "Claude Opus model"),
(re.compile(r"claude-sonnet", re.IGNORECASE), "Claude Sonnet model"),
(re.compile(r"claude-haiku", re.IGNORECASE), "Claude Haiku model"),
]
# Files exempt from the ban (training data, historical docs, tests)
_BAN_EXEMPT = {
"training/", "evaluations/", "RELEASE_v", "PERFORMANCE_",
"scores.json", "docs/design-log/", "FALSEWORK.md",
"test_sovereignty_enforcement.py", "test_metrics_helpers.py",
"metrics_helpers.py", "sonnet-workforce.md",
}
def _is_ban_exempt(filepath: str) -> bool:
return any(exempt in filepath for exempt in _BAN_EXEMPT)
def scan_banned_providers(filepath: str, content: str) -> List[Finding]:
"""Block any commit that introduces banned provider references."""
if _is_ban_exempt(filepath):
return []
findings = []
for line_no, line in enumerate(content.splitlines(), start=1):
for pattern, desc in _BANNED_PROVIDER_PATTERNS:
if pattern.search(line):
findings.append(Finding(
filepath, line_no,
f"🚫 BANNED PROVIDER: {desc}. Anthropic is permanently banned from this system."
))
return findings
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
@@ -295,11 +337,21 @@ def main() -> int:
if line.startswith("+") and not line.startswith("+++"):
findings.extend(scan_line(line[1:], "<diff>", line_no))
# Scan for banned providers
for filepath in staged_files:
file_content = get_file_content_at_staged(filepath)
if not is_binary_content(file_content):
try:
text = file_content.decode("utf-8") if isinstance(file_content, bytes) else file_content
findings.extend(scan_banned_providers(filepath, text))
except UnicodeDecodeError:
pass
if not findings:
print(f"{GREEN}✓ No potential secret leaks detected{NC}")
print(f"{GREEN}✓ No potential secret leaks or banned providers detected{NC}")
return 0
print(f"{RED}Potential secret leaks detected:{NC}\n")
print(f"{RED}Violations detected:{NC}\n")
for finding in findings:
loc = finding.filename
print(
@@ -308,7 +360,7 @@ def main() -> int:
print()
print(f"{RED}╔════════════════════════════════════════════════════════════╗{NC}")
print(f"{RED}║ COMMIT BLOCKED: Potential secrets detected! {NC}")
print(f"{RED}║ COMMIT BLOCKED: Secrets or banned providers detected! ║{NC}")
print(f"{RED}╚════════════════════════════════════════════════════════════╝{NC}")
print()
print("Recommendations:")

View File

@@ -23,7 +23,7 @@ Run `python --version` to verify.
## 2. Core Package Dependencies
All packages in `requirements.txt` must be installed and importable.
Critical packages: `openai`, `anthropic`, `pyyaml`, `rich`, `requests`, `pydantic`, `prompt_toolkit`.
Critical packages: `openai`, `pyyaml`, `rich`, `requests`, `pydantic`, `prompt_toolkit`.
**Verify:**
```bash
@@ -39,8 +39,7 @@ At least one LLM provider API key must be set in `~/.hermes/.env`:
| Variable | Provider |
|----------|----------|
| `OPENROUTER_API_KEY` | OpenRouter (200+ models) |
| `ANTHROPIC_API_KEY` | Anthropic Claude |
| `ANTHROPIC_TOKEN` | Anthropic Claude (alt) |
| `KIMI_API_KEY` | Kimi K2.5 coding |
| `OPENAI_API_KEY` | OpenAI |
| `GLM_API_KEY` | z.ai/GLM |
| `KIMI_API_KEY` | Moonshot/Kimi |

View File

@@ -77,8 +77,7 @@ def check_core_deps() -> CheckResult:
"""Verify that hermes core Python packages are importable."""
required = [
"openai",
"anthropic",
"dotenv",
"dotenv",
"yaml",
"rich",
"requests",
@@ -206,9 +205,7 @@ def check_env_vars() -> CheckResult:
"""Check that at least one LLM provider key is configured."""
provider_keys = [
"OPENROUTER_API_KEY",
"ANTHROPIC_API_KEY",
"ANTHROPIC_TOKEN",
"OPENAI_API_KEY",
"OPENAI_API_KEY",
"GLM_API_KEY",
"KIMI_API_KEY",
"MINIMAX_API_KEY",
@@ -225,7 +222,7 @@ def check_env_vars() -> CheckResult:
passed=False,
message="No LLM provider API key found",
fix_hint=(
"Set at least one of: OPENROUTER_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY "
"Set at least one of: OPENROUTER_API_KEY, KIMI_API_KEY, OPENAI_API_KEY "
"in ~/.hermes/.env or your shell."
),
)

View File

@@ -2,7 +2,7 @@ Gitea (143.198.27.163:3000): token=~/.hermes/gitea_token_vps (Timmy id=2). Users
§
2026-03-19 HARNESS+SOUL: ~/.timmy is Timmy's workspace within the Hermes harness. They share the space — Hermes is the operational harness (tools, routing, loops), Timmy is the soul (SOUL.md, presence, identity). Not fusion/absorption. Principal's words: "build Timmy out from the hermes harness." ~/.hermes is harness home, ~/.timmy is Timmy's workspace. SOUL=Inscription 1, skin=timmy. Backups at ~/.hermes.backup.pre-fusion and ~/.timmy.backup.pre-fusion.
§
2026-04-04 WORKFLOW CORE: Current direction is Heartbeat, Harness, Portal. Timmy handles sovereignty and release judgment. Allegro handles dispatch and queue hygiene. Core builders: codex-agent, groq, manus, claude. Research/memory: perplexity, ezra, KimiClaw. Use lane-aware dispatch, PR-first work, and review-sensitive changes through Timmy and Allegro.
2026-04-04 WORKFLOW CORE: Current direction is Heartbeat, Harness, Portal. Timmy handles sovereignty and release judgment. Allegro handles dispatch and queue hygiene. Core builders: codex-agent, groq, manus, kimi. Research/memory: perplexity, ezra, KimiClaw. Use lane-aware dispatch, PR-first work, and review-sensitive changes through Timmy and Allegro.
§
2026-04-04 OPERATIONS: Dashboard repo era is over. Use ~/.timmy + ~/.hermes as truth surfaces. Prefer ops-panel.sh, ops-gitea.sh, timmy-dashboard, and pipeline-freshness.sh over archived loop or tmux assumptions. Dispatch: agent-dispatch.sh <agent> <issue> <repo>. Major changes land as PRs.
§

View File

@@ -162,26 +162,6 @@
"Should a higher-context wizard review before more expansion?"
]
},
"claude": {
"lane": "hard refactors, deep implementation, and test-heavy multi-file changes after tight scoping",
"skills_to_practice": [
"respecting scope constraints",
"deep code transformation with tests",
"explaining risks clearly in PRs"
],
"missing_skills": [
"do not let large capability turn into unsupervised backlog or code sprawl"
],
"anti_lane": [
"self-directed issue farming",
"taking broad architecture liberty without a clear charter"
],
"review_checklist": [
"Did I stay inside the scoped problem?",
"Did I leave tests or verification stronger than before?",
"Is there hidden blast radius that Timmy should see explicitly?"
]
},
"gemini": {
"lane": "frontier architecture, research-heavy prototypes, and long-range design thinking",
"skills_to_practice": [
@@ -222,4 +202,4 @@
"Did I make the risk actionable instead of just surprising?"
]
}
}
}

View File

@@ -1,61 +1,74 @@
name: bug-fixer
description: >
Fixes bugs with test-first approach. Writes a failing test that
reproduces the bug, then fixes the code, then verifies.
description: 'Fixes bugs with test-first approach. Writes a failing test that reproduces the bug, then fixes the code, then
verifies.
'
model:
preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
preferred: kimi-k2.5
fallback: google/gemini-2.5-pro
max_turns: 30
temperature: 0.2
tools:
- terminal
- file
- search_files
- patch
- terminal
- file
- search_files
- patch
trigger:
issue_label: bug
manual: true
repos:
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
steps:
- read_issue
- clone_repo
- create_branch
- dispatch_agent
- run_tests
- create_pr
- comment_on_issue
- read_issue
- clone_repo
- create_branch
- dispatch_agent
- run_tests
- create_pr
- comment_on_issue
output: pull_request
timeout_minutes: 15
system_prompt: 'You are a bug fixer for the {{repo}} project.
system_prompt: |
You are a bug fixer for the {{repo}} project.
YOUR ISSUE: #{{issue_number}} — {{issue_title}}
APPROACH (prove-first):
1. Read the bug report. Understand the expected vs actual behavior.
2. Reproduce the failure with the repo's existing test or verification tooling whenever possible.
2. Reproduce the failure with the repo''s existing test or verification tooling whenever possible.
3. Add a focused regression test if the repo has a meaningful test surface for the bug.
4. Fix the code so the reproduced failure disappears.
5. Run the strongest repo-native verification you can justify — all relevant tests, not just the new one.
6. Commit: fix: <description> Fixes #{{issue_number}}
7. Push, create PR, and summarize verification plus any residual risk.
RULES:
- Never claim a fix without proving the broken behavior and the repaired behavior.
- Prefer repo-native commands over assuming tox exists.
- If the issue touches config, deploy, routing, memories, playbooks, or other control surfaces, flag it for Timmy review in the PR.
- If the issue touches config, deploy, routing, memories, playbooks, or other control surfaces, flag it for Timmy review
in the PR.
- Never use --no-verify.
- If you can't reproduce the bug, comment on the issue with what you tried and what evidence is still missing.
- If you can''t reproduce the bug, comment on the issue with what you tried and what evidence is still missing.
- If the fix requires >50 lines changed, decompose into sub-issues.
- Do not widen the issue into a refactor.
'

View File

@@ -1,166 +0,0 @@
# fleet-guardrails.yaml
# =====================
# Enforceable behaviour boundaries for every agent in the Timmy fleet.
# Consumed by task_gate.py (pre/post checks) and the orchestrator's
# dispatch loop. Every rule here is testable — no aspirational prose.
#
# Ref: SOUL.md "grounding before generation", Five Wisdoms #345
name: fleet-guardrails
version: "1.0.0"
description: >
Behaviour constraints that apply to ALL agents regardless of role.
These are the non-negotiable rules that task_gate.py enforces
before an agent may pick up work and after it claims completion.
# ─── UNIVERSAL CONSTRAINTS ───────────────────────────────────────
constraints:
# 1. Lane discipline — agents must stay in their lane
lane_enforcement:
enabled: true
source: playbooks/agent-lanes.json
on_violation: block_and_notify
description: >
An agent may only pick up issues tagged for its lane.
Cross-lane work requires explicit Timmy approval via
issue comment containing 'LANE_OVERRIDE: <agent>'.
# 2. Branch hygiene — no orphan branches
branch_hygiene:
enabled: true
max_branches_per_agent: 3
stale_branch_days: 7
naming_pattern: "{agent}/{issue_number}-{slug}"
on_violation: warn_then_block
description: >
Agents must follow branch naming conventions and clean up
after merge. No agent may have more than 3 active branches.
# 3. Issue ownership — no silent takeovers
issue_ownership:
enabled: true
require_assignment_before_work: true
max_concurrent_issues: 2
on_violation: block_and_notify
description: >
An agent must be assigned to an issue before creating a
branch or PR. No agent may work on more than 2 issues
simultaneously to prevent context-switching waste.
# 4. PR quality — minimum bar before review
pr_quality:
enabled: true
require_linked_issue: true
require_passing_ci: true
max_files_changed: 30
max_diff_lines: 2000
require_description: true
min_description_length: 50
on_violation: block_merge
description: >
Every PR must link an issue, pass CI, have a meaningful
description, and stay within scope. Giant PRs get rejected.
# 5. Grounding before generation — SOUL.md compliance
grounding:
enabled: true
require_issue_read_before_branch: true
require_existing_code_review: true
require_soul_md_check: true
soul_md_path: SOUL.md
on_violation: block_and_notify
description: >
Before writing any code, the agent must demonstrate it has
read the issue, reviewed relevant existing code, and checked
SOUL.md for applicable doctrine. No speculative generation.
# 6. Completion integrity — no phantom completions
completion_checks:
enabled: true
require_test_evidence: true
require_ci_green: true
require_diff_matches_issue: true
require_no_unrelated_changes: true
on_violation: revert_and_notify
description: >
Post-task gate verifies the work actually addresses the
issue. Agents cannot close issues without evidence.
Unrelated changes in a PR trigger automatic rejection.
# 7. Communication discipline — no noise
communication:
enabled: true
max_comments_per_issue: 10
require_structured_updates: true
update_format: "status | what_changed | what_blocked | next_step"
prohibit_empty_updates: true
on_violation: warn
description: >
Issue comments must be structured and substantive.
Status-only comments without content are rejected.
Agents should update, not narrate.
# 8. Resource awareness — no runaway costs
resource_limits:
enabled: true
max_api_calls_per_task: 100
max_llm_tokens_per_task: 500000
max_task_duration_minutes: 60
on_violation: kill_and_notify
description: >
Hard limits on compute per task. If an agent hits these
limits, the task is killed and flagged for human review.
Prevents infinite loops and runaway API spending.
# ─── ESCALATION POLICY ───────────────────────────────────────────
escalation:
channels:
- gitea_issue_comment
- discord_webhook
severity_levels:
warn:
action: post_comment
notify: agent_only
block:
action: prevent_action
notify: agent_and_orchestrator
block_and_notify:
action: prevent_action
notify: agent_orchestrator_and_timmy
kill_and_notify:
action: terminate_task
notify: all_including_alexander
revert_and_notify:
action: revert_changes
notify: agent_orchestrator_and_timmy
# ─── AUDIT TRAIL ─────────────────────────────────────────────────
audit:
enabled: true
log_path: logs/guardrail-violations.jsonl
retention_days: 90
fields:
- timestamp
- agent
- constraint
- violation_type
- issue_number
- action_taken
- resolution
# ─── OVERRIDES ───────────────────────────────────────────────────
overrides:
# Only Timmy or Alexander can override guardrails
authorized_overriders:
- Timmy
- Alexander
override_mechanism: >
Post a comment on the issue with the format:
GUARDRAIL_OVERRIDE: <constraint_name> REASON: <explanation>
override_expiry_hours: 24
require_post_override_review: true

View File

@@ -1,68 +1,52 @@
name: issue-triager
description: >
Scores, labels, and prioritizes issues. Assigns to appropriate
agents. Decomposes large issues into smaller ones.
description: 'Scores, labels, and prioritizes issues. Assigns to appropriate agents. Decomposes large issues into smaller
ones.
'
model:
preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
preferred: kimi-k2.5
fallback: google/gemini-2.5-pro
max_turns: 20
temperature: 0.3
tools:
- terminal
- search_files
- terminal
- search_files
trigger:
schedule: every 15m
manual: true
repos:
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
steps:
- fetch_issues
- score_issues
- assign_agents
- update_queue
- fetch_issues
- score_issues
- assign_agents
- update_queue
output: gitea_issue
timeout_minutes: 10
system_prompt: |
You are the issue triager for Timmy Foundation repos.
REPOS: {{repos}}
YOUR JOB:
1. Fetch open unassigned issues
2. Score each by: execution leverage, acceptance criteria quality, alignment with current doctrine, and how likely it is to create duplicate backlog churn
3. Label appropriately: bug, refactor, feature, tests, security, docs, ops, governance, research
4. Assign to agents based on the audited lane map:
- Timmy: governing, sovereign, release, identity, repo-boundary, or architecture decisions that should stay under direct principal review
- allegro: dispatch, routing, queue hygiene, Gitea bridge, operational tempo, and issues about how work gets moved through the system
- perplexity: research triage, MCP/open-source evaluations, architecture memos, integration comparisons, and synthesis before implementation
- ezra: RCA, operating history, memory consolidation, onboarding docs, and archival clean-up
- KimiClaw: long-context reading, extraction, digestion, and codebase synthesis before a build phase
- codex-agent: cleanup, migration verification, dead-code removal, repo-boundary enforcement, workflow hardening
- groq: bounded implementation, tactical bug fixes, quick feature slices, small patches with clear acceptance criteria
- manus: bounded support tasks, moderate-scope implementation, follow-through on already-scoped work
- claude: hard refactors, broad multi-file implementation, test-heavy changes after the scope is made precise
- gemini: frontier architecture, research-heavy prototypes, long-range design thinking when a concrete implementation owner is not yet obvious
- grok: adversarial testing, unusual edge cases, provocative review angles that still need another pass
5. Decompose any issue touching >5 files or crossing repo boundaries into smaller issues before assigning execution
RULES:
- Prefer one owner per issue. Only add a second assignee when the work is explicitly collaborative.
- Bugs, security fixes, and broken live workflows take priority over research and refactors.
- If issue scope is unclear, ask for clarification before assigning an implementation agent.
- Skip [epic], [meta], [governing], and [constitution] issues for automatic assignment unless they are explicitly routed to Timmy or allegro.
- Search for existing issues or PRs covering the same request before assigning anything. If a likely duplicate exists, link it and do not create or route duplicate work.
- Do not assign open-ended ideation to implementation agents.
- Do not assign routine backlog maintenance to Timmy.
- Do not assign wide speculative backlog generation to codex-agent, groq, manus, or claude.
- Route archive/history/context-digestion work to ezra or KimiClaw before routing it to a builder.
- Route “who should do this?” and “what is the next move?” questions to allegro.
system_prompt: "You are the issue triager for Timmy Foundation repos.\n\nREPOS: {{repos}}\n\nYOUR JOB:\n1. Fetch open unassigned\
\ issues\n2. Score each by: execution leverage, acceptance criteria quality, alignment with current doctrine, and how likely\
\ it is to create duplicate backlog churn\n3. Label appropriately: bug, refactor, feature, tests, security, docs, ops, governance,\
\ research\n4. Assign to agents based on the audited lane map:\n - Timmy: governing, sovereign, release, identity, repo-boundary,\
\ or architecture decisions that should stay under direct principal review\n - allegro: dispatch, routing, queue hygiene,\
\ Gitea bridge, operational tempo, and issues about how work gets moved through the system\n - perplexity: research triage,\
\ MCP/open-source evaluations, architecture memos, integration comparisons, and synthesis before implementation\n - ezra:\
\ RCA, operating history, memory consolidation, onboarding docs, and archival clean-up\n - KimiClaw: long-context reading,\
\ extraction, digestion, and codebase synthesis before a build phase\n - codex-agent: cleanup, migration verification,\
\ dead-code removal, repo-boundary enforcement, workflow hardening\n - groq: bounded implementation, tactical bug fixes,\
\ quick feature slices, small patches with clear acceptance criteria\n - manus: bounded support tasks, moderate-scope\
\ implementation, follow-through on already-scoped work\n - kimi: hard refactors, broad multi-file implementation, test-heavy\
\ changes after the scope is made precise\n - gemini: frontier architecture, research-heavy prototypes, long-range design\
\ thinking when a concrete implementation owner is not yet obvious\n - grok: adversarial testing, unusual edge cases,\
\ provocative review angles that still need another pass\n5. Decompose any issue touching >5 files or crossing repo boundaries\
\ into smaller issues before assigning execution\n\nRULES:\n- Prefer one owner per issue. Only add a second assignee when\
\ the work is explicitly collaborative.\n- Bugs, security fixes, and broken live workflows take priority over research and\
\ refactors.\n- If issue scope is unclear, ask for clarification before assigning an implementation agent.\n- Skip [epic],\
\ [meta], [governing], and [constitution] issues for automatic assignment unless they are explicitly routed to Timmy or\
\ allegro.\n- Search for existing issues or PRs covering the same request before assigning anything. If a likely duplicate\
\ exists, link it and do not create or route duplicate work.\n- Do not assign open-ended ideation to implementation agents.\n\
- Do not assign routine backlog maintenance to Timmy.\n- Do not assign wide speculative backlog generation to codex-agent,\
\ groq, or manus.\n- Route archive/history/context-digestion work to ezra or KimiClaw before routing it to a builder.\n\
- Route “who should do this?” and “what is the next move?” questions to allegro.\n"

View File

@@ -1,89 +1,47 @@
name: pr-reviewer
description: >
Reviews open PRs, checks CI status, merges passing ones,
comments on problems. The merge bot replacement.
description: 'Reviews open PRs, checks CI status, merges passing ones, comments on problems. The merge bot replacement.
'
model:
preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
preferred: kimi-k2.5
fallback: google/gemini-2.5-pro
max_turns: 20
temperature: 0.2
tools:
- terminal
- search_files
- terminal
- search_files
trigger:
schedule: every 30m
manual: true
repos:
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
steps:
- fetch_prs
- review_diffs
- post_reviews
- merge_passing
- fetch_prs
- review_diffs
- post_reviews
- merge_passing
output: report
timeout_minutes: 10
system_prompt: |
You are the PR reviewer for Timmy Foundation repos.
REPOS: {{repos}}
FOR EACH OPEN PR:
1. Check CI status (Actions tab or commit status API)
2. Read the linked issue or PR body to verify the intended scope before judging the diff
3. Review the diff for:
- Correctness: does it do what the issue asked?
- Security: no secrets, unsafe execution paths, or permission drift
- Tests and verification: does the author prove the change?
- Scope: PR should match the issue, not scope-creep
- Governance: does the change cross a boundary that should stay under Timmy review?
- Workflow fit: does it reduce drift, duplication, or hidden operational risk?
4. Post findings ordered by severity and cite the affected files or behavior clearly
5. If CI fails or verification is missing: explain what is blocking merge
6. If PR is behind main: request a rebase or re-run only when needed; do not force churn for cosmetic reasons
7. If review is clean and the PR is low-risk: squash merge
LOW-RISK AUTO-MERGE ONLY IF ALL ARE TRUE:
- PR is not a draft
- CI is green or the repo has no CI configured
- Diff matches the stated issue or PR scope
- No unresolved review findings remain
- Change is narrow, reversible, and non-governing
- Paths changed do not include sensitive control surfaces
SENSITIVE CONTROL SURFACES:
- SOUL.md
- config.yaml
- deploy.sh
- tasks.py
- playbooks/
- cron/
- memories/
- skins/
- training/
- authentication, permissions, or secret-handling code
- repo-boundary, model-routing, or deployment-governance changes
NEVER AUTO-MERGE:
- PRs that change sensitive control surfaces
- PRs that change more than 5 files unless the change is docs-only
- PRs without a clear problem statement or verification
- PRs that look like duplicate work, speculative research, or scope creep
- PRs that need Timmy or Allegro judgment on architecture, dispatch, or release impact
- PRs that are stale solely because of age; do not close them automatically
If a PR is stale, nudge with a comment and summarize what still blocks it. Do not close it just because 48 hours passed.
MERGE RULES:
- ONLY squash merge. Never merge commits. Never rebase merge.
- Delete branch after merge.
- Empty PRs (0 changed files): close immediately with a brief explanation.
system_prompt: "You are the PR reviewer for Timmy Foundation repos.\n\nREPOS: {{repos}}\n\nFOR EACH OPEN PR:\n1. Check CI\
\ status (Actions tab or commit status API)\n2. Read the linked issue or PR body to verify the intended scope before judging\
\ the diff\n3. Review the diff for:\n - Correctness: does it do what the issue asked?\n - Security: no secrets, unsafe\
\ execution paths, or permission drift\n - Tests and verification: does the author prove the change?\n - Scope: PR should\
\ match the issue, not scope-creep\n - Governance: does the change cross a boundary that should stay under Timmy review?\n\
\ - Workflow fit: does it reduce drift, duplication, or hidden operational risk?\n4. Post findings ordered by severity\
\ and cite the affected files or behavior clearly\n5. If CI fails or verification is missing: explain what is blocking merge\n\
6. If PR is behind main: request a rebase or re-run only when needed; do not force churn for cosmetic reasons\n7. If review\
\ is clean and the PR is low-risk: squash merge\n\nLOW-RISK AUTO-MERGE ONLY IF ALL ARE TRUE:\n- PR is not a draft\n- CI\
\ is green or the repo has no CI configured\n- Diff matches the stated issue or PR scope\n- No unresolved review findings\
\ remain\n- Change is narrow, reversible, and non-governing\n- Paths changed do not include sensitive control surfaces\n\
\nSENSITIVE CONTROL SURFACES:\n- SOUL.md\n- config.yaml\n- deploy.sh\n- tasks.py\n- playbooks/\n- cron/\n- memories/\n-\
\ skins/\n- training/\n- authentication, permissions, or secret-handling code\n- repo-boundary, model-routing, or deployment-governance\
\ changes\n\nNEVER AUTO-MERGE:\n- PRs that change sensitive control surfaces\n- PRs that change more than 5 files unless\
\ the change is docs-only\n- PRs without a clear problem statement or verification\n- PRs that look like duplicate work,\
\ speculative research, or scope creep\n- PRs that need Timmy or Allegro judgment on architecture, dispatch, or release\
\ impact\n- PRs that are stale solely because of age; do not close them automatically\n\nIf a PR is stale, nudge with a\
\ comment and summarize what still blocks it. Do not close it just because 48 hours passed.\n\nMERGE RULES:\n- ONLY squash\
\ merge. Never merge commits. Never rebase merge.\n- Delete branch after merge.\n- Empty PRs (0 changed files): close immediately\
\ with a brief explanation.\n"

View File

@@ -1,62 +1,75 @@
name: refactor-specialist
description: >
Splits large modules, reduces complexity, improves code organization.
Well-scoped: 1-3 files per task, clear acceptance criteria.
description: 'Splits large modules, reduces complexity, improves code organization. Well-scoped: 1-3 files per task, clear
acceptance criteria.
'
model:
preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
preferred: kimi-k2.5
fallback: google/gemini-2.5-pro
max_turns: 30
temperature: 0.3
tools:
- terminal
- file
- search_files
- patch
- terminal
- file
- search_files
- patch
trigger:
issue_label: refactor
manual: true
repos:
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
steps:
- read_issue
- clone_repo
- create_branch
- dispatch_agent
- run_tests
- create_pr
- comment_on_issue
- read_issue
- clone_repo
- create_branch
- dispatch_agent
- run_tests
- create_pr
- comment_on_issue
output: pull_request
timeout_minutes: 15
system_prompt: 'You are a refactoring specialist for the {{repo}} project.
system_prompt: |
You are a refactoring specialist for the {{repo}} project.
YOUR ISSUE: #{{issue_number}} — {{issue_title}}
RULES:
- Lines of code is a liability. Delete as much as you create.
- All changes go through PRs. No direct pushes to main.
- Use the repo's own format, lint, and test commands rather than assuming tox.
- Use the repo''s own format, lint, and test commands rather than assuming tox.
- Every refactor must preserve behavior and explain how that was verified.
- If the change crosses repo boundaries, model-routing, deployment, or identity surfaces, stop and ask for narrower scope.
- Never use --no-verify on git commands.
- Conventional commits: refactor: <description> (#{{issue_number}})
- If tests fail after 2 attempts, STOP and comment on the issue.
- Refactors exist to simplify the system, not to create a new design detour.
WORKFLOW:
1. Read the issue body for specific file paths and instructions
2. Understand the current code structure
3. Name the simplification goal before changing code
4. Make the refactoring changes
5. Run formatting and verification with repo-native commands
6. Commit, push, create PR with before/after risk summary
'

View File

@@ -1,63 +1,38 @@
name: security-auditor
description: >
Scans code for security vulnerabilities, hardcoded secrets,
dependency issues. Files findings as Gitea issues.
description: 'Scans code for security vulnerabilities, hardcoded secrets, dependency issues. Files findings as Gitea issues.
'
model:
preferred: claude-opus-4-6
fallback: claude-opus-4-6
preferred: kimi-k2.5
fallback: kimi-k2.5
max_turns: 40
temperature: 0.2
tools:
- terminal
- file
- search_files
- terminal
- file
- search_files
trigger:
schedule: weekly
pr_merged_with_lines: 100
manual: true
repos:
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
steps:
- clone_repo
- run_audit
- file_issues
- clone_repo
- run_audit
- file_issues
output: gitea_issue
timeout_minutes: 20
system_prompt: |
You are a security auditor for the Timmy Foundation codebase.
Your job is to FIND vulnerabilities, not write code.
TARGET REPO: {{repo}}
SCAN FOR:
1. Hardcoded secrets, API keys, tokens in source code
2. SQL injection vulnerabilities
3. Command injection via unsanitized input
4. Path traversal in file operations
5. Insecure HTTP calls (should be HTTPS where possible)
6. Dependencies with known CVEs (check requirements.txt/package.json)
7. Missing input validation
8. Overly permissive file permissions
9. Privilege drift in deploy, orchestration, memory, cron, and playbook surfaces
10. Places where private data or local-only artifacts could leak into tracked repos
OUTPUT FORMAT:
For each finding, file a Gitea issue with:
Title: [security] <severity>: <description>
Body: file + line, description, why it matters, recommended fix
Label: security
SEVERITY: critical / high / medium / low
Only file issues for real findings. No false positives.
Do not open duplicate issues for already-known findings; link the existing issue instead.
If a finding affects sovereignty boundaries or private-data handling, flag it clearly as such.
system_prompt: "You are a security auditor for the Timmy Foundation codebase.\nYour job is to FIND vulnerabilities, not write\
\ code.\n\nTARGET REPO: {{repo}}\n\nSCAN FOR:\n1. Hardcoded secrets, API keys, tokens in source code\n2. SQL injection vulnerabilities\n\
3. Command injection via unsanitized input\n4. Path traversal in file operations\n5. Insecure HTTP calls (should be HTTPS\
\ where possible)\n6. Dependencies with known CVEs (check requirements.txt/package.json)\n7. Missing input validation\n\
8. Overly permissive file permissions\n9. Privilege drift in deploy, orchestration, memory, cron, and playbook surfaces\n\
10. Places where private data or local-only artifacts could leak into tracked repos\n\nOUTPUT FORMAT:\nFor each finding,\
\ file a Gitea issue with:\n Title: [security] <severity>: <description>\n Body: file + line, description, why it matters,\
\ recommended fix\n Label: security\n\nSEVERITY: critical / high / medium / low\nOnly file issues for real findings. No\
\ false positives.\nDo not open duplicate issues for already-known findings; link the existing issue instead.\nIf a finding\
\ affects sovereignty boundaries or private-data handling, flag it clearly as such.\n"

View File

@@ -1,58 +1,66 @@
name: test-writer
description: >
Adds test coverage for untested modules. Finds coverage gaps,
writes meaningful tests, verifies they pass.
description: 'Adds test coverage for untested modules. Finds coverage gaps, writes meaningful tests, verifies they pass.
'
model:
preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
preferred: kimi-k2.5
fallback: google/gemini-2.5-pro
max_turns: 30
temperature: 0.3
tools:
- terminal
- file
- search_files
- patch
- terminal
- file
- search_files
- patch
trigger:
issue_label: tests
manual: true
repos:
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
- Timmy_Foundation/the-nexus
- Timmy_Foundation/timmy-home
- Timmy_Foundation/timmy-config
- Timmy_Foundation/hermes-agent
steps:
- read_issue
- clone_repo
- create_branch
- dispatch_agent
- run_tests
- create_pr
- comment_on_issue
- read_issue
- clone_repo
- create_branch
- dispatch_agent
- run_tests
- create_pr
- comment_on_issue
output: pull_request
timeout_minutes: 15
system_prompt: 'You are a test engineer for the {{repo}} project.
system_prompt: |
You are a test engineer for the {{repo}} project.
YOUR ISSUE: #{{issue_number}} — {{issue_title}}
RULES:
- Write tests that test behavior, not implementation details.
- Use the repo's own test entrypoints; do not assume tox exists.
- Use the repo''s own test entrypoints; do not assume tox exists.
- Tests must be deterministic. No flaky tests.
- Conventional commits: test: <description> (#{{issue_number}})
- If the module is hard to test, explain the design obstacle and propose the smallest next step.
- Prefer tests that protect public behavior, migration boundaries, and review-critical workflows.
WORKFLOW:
1. Read the issue for target module paths
2. Read the existing code to understand behavior
3. Write focused unit tests
4. Run the relevant verification commands — all related tests must pass
5. Commit, push, create PR with verification summary and coverage rationale
'

View File

@@ -1,47 +1,55 @@
name: verified-logic
description: >
Crucible-first playbook for tasks that require proof instead of plausible prose.
Use Z3-backed sidecar tools for scheduling, dependency ordering, capacity checks,
and consistency verification.
description: 'Crucible-first playbook for tasks that require proof instead of plausible prose. Use Z3-backed sidecar tools
for scheduling, dependency ordering, capacity checks, and consistency verification.
'
model:
preferred: claude-opus-4-6
fallback: claude-sonnet-4-20250514
preferred: kimi-k2.5
fallback: google/gemini-2.5-pro
max_turns: 12
temperature: 0.1
tools:
- mcp_crucible_schedule_tasks
- mcp_crucible_order_dependencies
- mcp_crucible_capacity_fit
- mcp_crucible_schedule_tasks
- mcp_crucible_order_dependencies
- mcp_crucible_capacity_fit
trigger:
manual: true
steps:
- classify_problem
- choose_template
- translate_into_constraints
- verify_with_crucible
- report_sat_unsat_with_witness
- classify_problem
- choose_template
- translate_into_constraints
- verify_with_crucible
- report_sat_unsat_with_witness
output: verified_result
timeout_minutes: 5
system_prompt: 'You are running the Crucible playbook.
system_prompt: |
You are running the Crucible playbook.
Use this playbook for:
- scheduling and deadline feasibility
- dependency ordering and cycle checks
- capacity / resource allocation constraints
- consistency checks where a contradiction matters
RULES:
1. Do not bluff through logic.
2. Pick the narrowest Crucible template that fits the task.
3. Translate the user's question into structured constraints.
3. Translate the user''s question into structured constraints.
4. Call the Crucible tool.
5. If SAT, report the witness model clearly.
6. If UNSAT, say the constraints are impossible and explain which shape of constraint caused the contradiction.
7. If the task is not a good fit for these templates, say so plainly instead of pretending it was verified.
'

View File

@@ -1,33 +1,85 @@
#!/usr/bin/env python3
"""Architecture Linter — Ensuring alignment with the Frontier Local Agenda.
Anthropic is BANNED. Not deprecated, not discouraged — banned.
Any reference to Anthropic as a provider, model, or API endpoint
in active configs is a hard failure.
"""
import os
import sys
import re
# Architecture Linter
# Ensuring all changes align with the Frontier Local Agenda.
SOVEREIGN_RULES = [
(r"https?://(api\.openai\.com|api\.anthropic\.com)", "CRITICAL: External cloud API detected. Use local custom_provider instead."),
(r"provider: (openai|anthropic)", "WARNING: Direct cloud provider used. Ensure fallback_model is configured."),
(r"api_key:\s*['\"][A-Za-z0-9_\-]{16,}['\"]", "SECURITY: Hardcoded API key detected. Use environment variables.")
# BANNED — hard failures
(r"provider:\s*anthropic", "BANNED: Anthropic provider reference. Anthropic is permanently banned from this system."),
(r"anthropic/claude", "BANNED: Anthropic model reference (anthropic/claude-*). Use kimi-k2.5 or google/gemini-2.5-pro."),
(r"api\.anthropic\.com", "BANNED: Direct Anthropic API endpoint. Anthropic is permanently banned."),
(r"ANTHROPIC_API_KEY", "BANNED: Anthropic API key reference. Remove all Anthropic credentials."),
(r"ANTHROPIC_TOKEN", "BANNED: Anthropic token reference. Remove all Anthropic credentials."),
(r"sk-ant-", "BANNED: Anthropic API key literal (sk-ant-*). Remove immediately."),
(r"claude-opus", "BANNED: Claude Opus model reference. Use kimi-k2.5."),
(r"claude-sonnet", "BANNED: Claude Sonnet model reference. Use kimi-k2.5."),
(r"claude-haiku", "BANNED: Claude Haiku model reference. Use google/gemini-2.5-pro."),
# Existing sovereignty rules
(r"https?://api\.openai\.com", "WARNING: Direct OpenAI API endpoint. Use local custom_provider instead."),
(r"provider:\s*openai", "WARNING: Direct OpenAI provider. Ensure fallback_model is configured."),
(r"api_key: ['\"][^'\"\s]{10,}['\"]", "SECURITY: Hardcoded API key detected. Use environment variables."),
]
def lint_file(path):
# Files to skip (training data, historical docs, changelogs, tests that validate the ban)
SKIP_PATTERNS = [
"training/", "evaluations/", "RELEASE_v", "PERFORMANCE_",
"scores.json", "docs/design-log/", "FALSEWORK.md",
"test_sovereignty_enforcement.py", "test_metrics_helpers.py",
"metrics_helpers.py", # historical cost data
]
def should_skip(path: str) -> bool:
return any(skip in path for skip in SKIP_PATTERNS)
def lint_file(path: str) -> int:
if should_skip(path):
return 0
print(f"Linting {path}...")
content = open(path).read()
violations = 0
for pattern, msg in SOVEREIGN_RULES:
if re.search(pattern, content):
matches = list(re.finditer(pattern, content, re.IGNORECASE))
if matches:
print(f" [!] {msg}")
for m in matches[:3]: # Show up to 3 locations
line_no = content[:m.start()].count('\n') + 1
print(f" Line {line_no}: ...{content[max(0,m.start()-20):m.end()+20].strip()}...")
violations += 1
return violations
def main():
print("--- Ezra's Architecture Linter ---")
print("--- Architecture Linter (Anthropic BANNED) ---")
files = [f for f in sys.argv[1:] if os.path.isfile(f)]
if not files:
# If no args, scan all yaml/py/sh/json in the repo
for root, _, filenames in os.walk("."):
for fn in filenames:
if fn.endswith((".yaml", ".yml", ".py", ".sh", ".json", ".md")):
path = os.path.join(root, fn)
if not should_skip(path) and ".git" not in path:
files.append(path)
total_violations = sum(lint_file(f) for f in files)
banned = sum(1 for f in files for p, m in SOVEREIGN_RULES
if "BANNED" in m and re.search(p, open(f).read(), re.IGNORECASE)
and not should_skip(f))
print(f"\nLinting complete. Total violations: {total_violations}")
if banned > 0:
print(f"\n🚫 {banned} BANNED provider violation(s) detected. Anthropic is permanently banned.")
sys.exit(1 if total_violations > 0 else 0)
if __name__ == "__main__":
main()

View File

@@ -5,233 +5,122 @@ Part of the Gemini Sovereign Governance System.
Enforces architectural boundaries, security, and documentation standards
across the Timmy Foundation fleet.
Refs: #437 — repo-aware, test-backed, CI-enforced.
"""
import argparse
import os
import re
import sys
import argparse
from pathlib import Path
# --- CONFIGURATION ---
SOVEREIGN_KEYWORDS = ["mempalace", "sovereign_store", "tirith", "bezalel", "nexus"]
# IP addresses (skip 127.0.0.1, 0.0.0.0, 10.x.x.x, 172.16-31.x.x, 192.168.x.x)
IP_REGEX = r'\b(?!(?:127|10|192\.168|172\.(?:1[6-9]|2\d|3[01]))\.)' \
r'(?:\d{1,3}\.){3}\d{1,3}\b'
# API key / secret patterns — catches openai-, sk-, anthropic-, AKIA, etc.
API_KEY_PATTERNS = [
r'sk-[A-Za-z0-9]{20,}', # OpenAI-style
r'sk-ant-[A-Za-z0-9\-]{20,}', # Anthropic
r'AKIA[A-Z0-9]{16}', # AWS access key
r'ghp_[A-Za-z0-9]{36}', # GitHub PAT
r'glpat-[A-Za-z0-9\-]{20,}', # GitLab PAT
r'(?:api[_-]?key|secret|token)\s*[:=]\s*["\'][A-Za-z0-9_\-]{16,}["\']',
]
# Sovereignty rules (carried from v1)
SOVEREIGN_RULES = [
(r'https?://api\.openai\.com', 'External cloud API: api.openai.com. Use local custom_provider.'),
(r'https?://api\.anthropic\.com', 'External cloud API: api.anthropic.com. Use local custom_provider.'),
(r'provider:\s*(?:openai|anthropic)\b', 'Direct cloud provider. Ensure fallback_model is configured.'),
]
# File extensions to scan
SCAN_EXTENSIONS = {'.py', '.ts', '.tsx', '.js', '.yaml', '.yml', '.json', '.env', '.sh', '.cfg', '.toml'}
SKIP_DIRS = {'.git', 'node_modules', '__pycache__', '.venv', 'venv', '.tox', '.eggs'}
class LinterResult:
"""Structured result container for programmatic access."""
def __init__(self, repo_path: str, repo_name: str):
self.repo_path = repo_path
self.repo_name = repo_name
self.errors: list[str] = []
self.warnings: list[str] = []
@property
def passed(self) -> bool:
return len(self.errors) == 0
@property
def violation_count(self) -> int:
return len(self.errors)
def summary(self) -> str:
lines = [f"--- Architecture Linter v2: {self.repo_name} ---"]
for w in self.warnings:
lines.append(f" [W] {w}")
for e in self.errors:
lines.append(f" [E] {e}")
status = "PASSED" if self.passed else f"FAILED ({self.violation_count} violations)"
lines.append(f"\nResult: {status}")
return '\n'.join(lines)
IP_REGEX = r'\b(?:\d{1,3}\.){3}\d{1,3}\b'
API_KEY_REGEX = r'(?:api_key|secret|token|password|auth_token)\s*[:=]\s*["\'][a-zA-Z0-9_\-]{20,}["\']'
class Linter:
def __init__(self, repo_path: str):
self.repo_path = Path(repo_path).resolve()
if not self.repo_path.is_dir():
raise FileNotFoundError(f"Repository path does not exist: {self.repo_path}")
self.repo_name = self.repo_path.name
self.result = LinterResult(str(self.repo_path), self.repo_name)
self.errors = []
# --- helpers ---
def _scan_files(self, extensions=None):
"""Yield (Path, content) for files matching *extensions*."""
exts = extensions or SCAN_EXTENSIONS
for root, dirs, files in os.walk(self.repo_path):
dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
for fname in files:
if Path(fname).suffix in exts:
if fname == '.env.example':
continue
fpath = Path(root) / fname
try:
content = fpath.read_text(errors='ignore')
except Exception:
continue
yield fpath, content
def _line_no(self, content: str, offset: int) -> int:
return content.count('\n', 0, offset) + 1
# --- checks ---
def log_error(self, message: str, file: str = None, line: int = None):
loc = f"{file}:{line}" if file and line else (file if file else "General")
self.errors.append(f"[{loc}] {message}")
def check_sidecar_boundary(self):
"""No sovereign code in hermes-agent (sidecar boundary)."""
if self.repo_name != 'hermes-agent':
return
for fpath, content in self._scan_files():
for kw in SOVEREIGN_KEYWORDS:
if kw in content.lower():
rel = str(fpath.relative_to(self.repo_path))
self.result.errors.append(
f"Sovereign keyword '{kw}' in hermes-agent violates sidecar boundary. [{rel}]"
)
"""Rule 1: No sovereign code in hermes-agent (sidecar boundary)"""
if self.repo_name == "hermes-agent":
for root, _, files in os.walk(self.repo_path):
if "node_modules" in root or ".git" in root:
continue
for file in files:
if file.endswith((".py", ".ts", ".js", ".tsx")):
path = Path(root) / file
content = path.read_text(errors="ignore")
for kw in SOVEREIGN_KEYWORDS:
if kw in content.lower():
# Exception: imports or comments might be okay, but we're strict for now
self.log_error(f"Sovereign keyword '{kw}' found in hermes-agent. Violates sidecar boundary.", str(path.relative_to(self.repo_path)))
def check_hardcoded_ips(self):
"""No hardcoded public IPs use DNS or env vars."""
for fpath, content in self._scan_files():
for m in re.finditer(IP_REGEX, content):
ip = m.group()
# skip private ranges already handled by lookahead, and 0.0.0.0
if ip.startswith('0.'):
continue
line = self._line_no(content, m.start())
rel = str(fpath.relative_to(self.repo_path))
self.result.errors.append(
f"Hardcoded IP '{ip}'. Use DNS or env vars. [{rel}:{line}]"
)
"""Rule 2: No hardcoded IPs (use domain names)"""
for root, _, files in os.walk(self.repo_path):
if "node_modules" in root or ".git" in root:
continue
for file in files:
if file.endswith((".py", ".ts", ".js", ".tsx", ".yaml", ".yml", ".json")):
path = Path(root) / file
content = path.read_text(errors="ignore")
matches = re.finditer(IP_REGEX, content)
for match in matches:
ip = match.group()
if ip in ["127.0.0.1", "0.0.0.0"]:
continue
line_no = content.count('\n', 0, match.start()) + 1
self.log_error(f"Hardcoded IP address '{ip}' found. Use domain names or environment variables.", str(path.relative_to(self.repo_path)), line_no)
def check_api_keys(self):
"""No cloud API keys / secrets committed."""
for fpath, content in self._scan_files():
for pattern in API_KEY_PATTERNS:
for m in re.finditer(pattern, content, re.IGNORECASE):
line = self._line_no(content, m.start())
rel = str(fpath.relative_to(self.repo_path))
self.result.errors.append(
f"Potential secret / API key detected. [{rel}:{line}]"
)
def check_sovereignty_rules(self):
"""V1 sovereignty rules: no direct cloud API endpoints or providers."""
for fpath, content in self._scan_files({'.py', '.ts', '.tsx', '.js', '.yaml', '.yml'}):
for pattern, msg in SOVEREIGN_RULES:
for m in re.finditer(pattern, content):
line = self._line_no(content, m.start())
rel = str(fpath.relative_to(self.repo_path))
self.result.errors.append(f"{msg} [{rel}:{line}]")
"""Rule 3: No cloud API keys committed to repos"""
for root, _, files in os.walk(self.repo_path):
if "node_modules" in root or ".git" in root:
continue
for file in files:
if file.endswith((".py", ".ts", ".js", ".tsx", ".yaml", ".yml", ".json", ".env")):
if file == ".env.example":
continue
path = Path(root) / file
content = path.read_text(errors="ignore")
matches = re.finditer(API_KEY_REGEX, content, re.IGNORECASE)
for match in matches:
line_no = content.count('\n', 0, match.start()) + 1
self.log_error("Potential API key or secret found in code.", str(path.relative_to(self.repo_path)), line_no)
def check_soul_canonical(self):
"""SOUL.md must exist exactly in timmy-config root."""
soul_path = self.repo_path / 'SOUL.md'
if self.repo_name == 'timmy-config':
"""Rule 4: SOUL.md exists and is canonical in exactly one location"""
soul_path = self.repo_path / "SOUL.md"
if self.repo_name == "timmy-config":
if not soul_path.exists():
self.result.errors.append(
'SOUL.md missing from canonical location (timmy-config root).'
)
self.log_error("SOUL.md is missing from the canonical location (timmy-config root).")
else:
if soul_path.exists():
self.result.errors.append(
'SOUL.md found in non-canonical repo. Must live only in timmy-config.'
)
self.log_error("SOUL.md found in non-canonical repo. It should only live in timmy-config.")
def check_readme(self):
"""Every repo must have a substantive README."""
readme = self.repo_path / 'README.md'
if not readme.exists():
self.result.errors.append('README.md is missing.')
"""Rule 5: Every repo has a README with current truth"""
readme_path = self.repo_path / "README.md"
if not readme_path.exists():
self.log_error("README.md is missing.")
else:
content = readme.read_text(errors='ignore')
content = readme_path.read_text(errors="ignore")
if len(content.strip()) < 50:
self.result.warnings.append(
'README.md is very short (<50 chars). Provide current truth about the repo.'
)
self.log_error("README.md is too short or empty. Provide current truth about the repo.")
# --- runner ---
def run(self) -> LinterResult:
"""Execute all checks and return the result."""
def run(self):
print(f"--- Gemini Linter: Auditing {self.repo_name} ---")
self.check_sidecar_boundary()
self.check_hardcoded_ips()
self.check_api_keys()
self.check_sovereignty_rules()
self.check_soul_canonical()
self.check_readme()
return self.result
if self.errors:
print(f"\n[FAILURE] Found {len(self.errors)} architectural violations:")
for err in self.errors:
print(f" - {err}")
return False
else:
print("\n[SUCCESS] Architecture is sound. Sovereignty maintained.")
return True
def main():
parser = argparse.ArgumentParser(
description='Gemini Architecture Linter v2 — repo-aware sovereignty gate.'
)
parser.add_argument(
'repo_path', nargs='?', default='.',
help='Path to the repository to lint (default: cwd).',
)
parser.add_argument(
'--repo', dest='repo_flag', default=None,
help='Explicit repo path (alias for positional arg).',
)
parser.add_argument(
'--json', dest='json_output', action='store_true',
help='Emit machine-readable JSON instead of human text.',
)
parser = argparse.ArgumentParser(description="Gemini Architecture Linter v2")
parser.add_argument("repo_path", nargs="?", default=".", help="Path to the repository to lint")
args = parser.parse_args()
path = args.repo_flag if args.repo_flag else args.repo_path
linter = Linter(args.repo_path)
success = linter.run()
sys.exit(0 if success else 1)
try:
linter = Linter(path)
except FileNotFoundError as exc:
print(f"ERROR: {exc}", file=sys.stderr)
sys.exit(2)
result = linter.run()
if args.json_output:
import json as _json
out = {
'repo': result.repo_name,
'passed': result.passed,
'violation_count': result.violation_count,
'errors': result.errors,
'warnings': result.warnings,
}
print(_json.dumps(out, indent=2))
else:
print(result.summary())
sys.exit(0 if result.passed else 1)
if __name__ == '__main__':
if __name__ == "__main__":
main()

View File

@@ -1,306 +0,0 @@
#!/usr/bin/env python3
"""
config_validator.py — Validate all YAML/JSON config files in timmy-config.
Checks:
1. YAML syntax (pyyaml safe_load)
2. JSON syntax (json.loads)
3. Duplicate keys in YAML/JSON
4. Trailing whitespace in YAML
5. Tabs in YAML (should use spaces)
6. Cron expression validity (if present)
Exit 0 if all valid, 1 if any invalid.
"""
import json
import os
import re
import sys
from pathlib import Path
try:
import yaml
except ImportError:
print("ERROR: PyYAML not installed. Run: pip install pyyaml")
sys.exit(1)
# ── Cron validation ──────────────────────────────────────────────────────────
DOW_NAMES = {"sun", "mon", "tue", "wed", "thu", "fri", "sat"}
MONTH_NAMES = {"jan", "feb", "mar", "apr", "may", "jun",
"jul", "aug", "sep", "oct", "nov", "dec"}
def _expand_cron_field(field: str, lo: int, hi: int, names: dict | None = None) -> set[int]:
"""Expand a single cron field into a set of valid integers."""
result: set[int] = set()
for part in field.split(","):
# Handle step: */N or 1-5/N
step = 1
if "/" in part:
part, step_str = part.split("/", 1)
if not step_str.isdigit() or int(step_str) < 1:
raise ValueError(f"invalid step value: {step_str}")
step = int(step_str)
if part == "*":
rng = range(lo, hi + 1, step)
elif "-" in part:
a, b = part.split("-", 1)
a = _resolve_name(a, names, lo, hi)
b = _resolve_name(b, names, lo, hi)
if a > b:
raise ValueError(f"range {a}-{b} is reversed")
rng = range(a, b + 1, step)
else:
val = _resolve_name(part, names, lo, hi)
rng = range(val, val + 1)
for v in rng:
if v < lo or v > hi:
raise ValueError(f"value {v} out of range [{lo}-{hi}]")
result.add(v)
return result
def _resolve_name(token: str, names: dict | None, lo: int, hi: int) -> int:
if names and token.lower() in names:
return names[token.lower()]
if not token.isdigit():
raise ValueError(f"unrecognized token: {token}")
val = int(token)
if val < lo or val > hi:
raise ValueError(f"value {val} out of range [{lo}-{hi}]")
return val
def validate_cron(expr: str) -> list[str]:
"""Validate a 5-field cron expression. Returns list of errors (empty = ok)."""
errors: list[str] = []
fields = expr.strip().split()
if len(fields) != 5:
return [f"expected 5 fields, got {len(fields)}"]
specs = [
(fields[0], 0, 59, None, "minute"),
(fields[1], 0, 23, None, "hour"),
(fields[2], 1, 31, None, "day-of-month"),
(fields[3], 1, 12, MONTH_NAMES, "month"),
(fields[4], 0, 7, DOW_NAMES, "day-of-week"),
]
for field, lo, hi, names, label in specs:
try:
_expand_cron_field(field, lo, hi, names)
except ValueError as e:
errors.append(f"{label}: {e}")
return errors
# ── Duplicate key detection ──────────────────────────────────────────────────
class DuplicateKeyError(Exception):
pass
class _StrictYAMLLoader(yaml.SafeLoader):
"""YAML loader that rejects duplicate keys."""
pass
def _no_duplicates_constructor(loader, node, deep=False):
mapping = {}
for key_node, value_node in node.value:
key = loader.construct_object(key_node, deep=deep)
if key in mapping:
raise DuplicateKeyError(
f"duplicate key '{key}' (line {key_node.start_mark.line + 1})"
)
mapping[key] = loader.construct_object(value_node, deep=deep)
return mapping
_StrictYAMLLoader.add_constructor(
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
_no_duplicates_constructor,
)
def _json_has_duplicates(text: str) -> list[str]:
"""Check for duplicate keys in JSON by scanning for repeated quoted keys at same depth."""
errors: list[str] = []
# Use a custom approach: parse with object_pairs_hook
seen_stack: list[set[str]] = []
def _check_pairs(pairs):
level_keys: set[str] = set()
for k, _ in pairs:
if k in level_keys:
errors.append(f"duplicate JSON key: '{k}'")
level_keys.add(k)
return dict(pairs)
try:
json.loads(text, object_pairs_hook=_check_pairs)
except json.JSONDecodeError:
pass # syntax errors caught elsewhere
return errors
# ── Main validator ───────────────────────────────────────────────────────────
def find_config_files(root: Path) -> list[Path]:
"""Recursively find .yaml, .yml, .json files (skip .git, node_modules, venv)."""
skip_dirs = {".git", "node_modules", "venv", "__pycache__", ".venv"}
results: list[Path] = []
for dirpath, dirnames, filenames in os.walk(root):
dirnames[:] = [d for d in dirnames if d not in skip_dirs]
for fname in filenames:
if fname.endswith((".yaml", ".yml", ".json")):
results.append(Path(dirpath) / fname)
return sorted(results)
def validate_yaml_file(filepath: Path, text: str) -> list[str]:
"""Validate a YAML file. Returns list of errors."""
errors: list[str] = []
# Check for tabs
for i, line in enumerate(text.splitlines(), 1):
if "\t" in line:
errors.append(f" line {i}: contains tab character (use spaces for YAML)")
if line != line.rstrip():
errors.append(f" line {i}: trailing whitespace")
# Check syntax + duplicate keys
try:
yaml.load(text, Loader=_StrictYAMLLoader)
except DuplicateKeyError as e:
errors.append(f" {e}")
except yaml.YAMLError as e:
mark = getattr(e, "problem_mark", None)
if mark:
errors.append(f" YAML syntax error at line {mark.line + 1}, col {mark.column + 1}: {e.problem}")
else:
errors.append(f" YAML syntax error: {e}")
# Check cron expressions in schedule fields
for i, line in enumerate(text.splitlines(), 1):
cron_match = re.search(r'(?:cron|schedule)\s*:\s*["\']?([*0-9/,a-zA-Z-]+(?:\s+[*0-9/,a-zA-Z-]+){4})["\']?', line)
if cron_match:
cron_errs = validate_cron(cron_match.group(1))
for ce in cron_errs:
errors.append(f" line {i}: invalid cron '{cron_match.group(1)}': {ce}")
return errors
def validate_json_file(filepath: Path, text: str) -> list[str]:
"""Validate a JSON file. Returns list of errors."""
errors: list[str] = []
# Check syntax
try:
json.loads(text)
except json.JSONDecodeError as e:
errors.append(f" JSON syntax error at line {e.lineno}, col {e.colno}: {e.msg}")
# Check duplicate keys
dup_errors = _json_has_duplicates(text)
errors.extend(dup_errors)
# Check for trailing whitespace (informational)
for i, line in enumerate(text.splitlines(), 1):
if line != line.rstrip():
errors.append(f" line {i}: trailing whitespace")
# Check cron expressions
cron_pattern = re.compile(r'"(?:cron|schedule)"?\s*:\s*"([^"]{5,})"')
for match in cron_pattern.finditer(text):
candidate = match.group(1).strip()
fields = candidate.split()
if len(fields) == 5 and all(re.match(r'^[*0-9/,a-zA-Z-]+$', f) for f in fields):
cron_errs = validate_cron(candidate)
for ce in cron_errs:
errors.append(f" invalid cron '{candidate}': {ce}")
# Also check nested schedule objects with cron fields
try:
obj = json.loads(text)
_scan_obj_for_cron(obj, errors)
except Exception:
pass
return errors
def _scan_obj_for_cron(obj, errors: list[str], path: str = ""):
"""Recursively scan dict/list for cron expressions."""
if isinstance(obj, dict):
for k, v in obj.items():
if k in ("cron", "schedule", "cron_expression") and isinstance(v, str):
fields = v.strip().split()
if len(fields) == 5:
cron_errs = validate_cron(v)
for ce in cron_errs:
errors.append(f" {path}.{k}: invalid cron '{v}': {ce}")
_scan_obj_for_cron(v, errors, f"{path}.{k}")
elif isinstance(obj, list):
for i, item in enumerate(obj):
_scan_obj_for_cron(item, errors, f"{path}[{i}]")
def main():
# Determine repo root (script lives in scripts/)
script_path = Path(__file__).resolve()
repo_root = script_path.parent.parent
print(f"Config Validator — scanning {repo_root}")
print("=" * 60)
files = find_config_files(repo_root)
print(f"Found {len(files)} config files to validate.\n")
total_errors = 0
failed_files: list[tuple[Path, list[str]]] = []
for filepath in files:
rel = filepath.relative_to(repo_root)
try:
text = filepath.read_text(encoding="utf-8", errors="replace")
except Exception as e:
failed_files.append((rel, [f" cannot read file: {e}"]))
total_errors += 1
continue
if filepath.suffix == ".json":
errors = validate_json_file(filepath, text)
else:
errors = validate_yaml_file(filepath, text)
if errors:
failed_files.append((rel, errors))
total_errors += len(errors)
print(f"FAIL {rel}")
else:
print(f"PASS {rel}")
print("\n" + "=" * 60)
print(f"Results: {len(files) - len(failed_files)}/{len(files)} files passed")
if failed_files:
print(f"\n{total_errors} error(s) in {len(failed_files)} file(s):\n")
for relpath, errs in failed_files:
print(f" {relpath}:")
for e in errs:
print(f" {e}")
print()
sys.exit(1)
else:
print("\nAll config files valid!")
sys.exit(0)
if __name__ == "__main__":
main()

View File

@@ -4,8 +4,6 @@
Part of the Gemini Sovereign Infrastructure Suite.
Auto-detects and fixes common failures across the fleet.
Safe-by-default: runs in dry-run mode unless --execute is given.
"""
import os
@@ -13,7 +11,6 @@ import sys
import subprocess
import argparse
import requests
import datetime
# --- CONFIGURATION ---
FLEET = {
@@ -24,210 +21,51 @@ FLEET = {
}
class SelfHealer:
def __init__(self, dry_run=True, confirm_kill=False, yes=False):
self.dry_run = dry_run
self.confirm_kill = confirm_kill
self.yes = yes
def log(self, message: str):
timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
print(f"[{timestamp}] {message}")
print(f"[*] {message}")
def run_remote(self, host: str, command: str):
ip = FLEET[host]["ip"]
ssh_cmd = ["ssh", "-o", "StrictHostKeyChecking=no", "-o", "ConnectTimeout=5", f"root@{ip}", command]
ssh_cmd = ["ssh", "-o", "StrictHostKeyChecking=no", f"root@{ip}", command]
if host == "mac":
ssh_cmd = ["bash", "-c", command]
try:
return subprocess.run(ssh_cmd, capture_output=True, text=True, timeout=15)
except Exception as e:
self.log(f" [ERROR] Failed to run remote command on {host}: {e}")
return None
def confirm(self, prompt: str) -> bool:
"""Ask for confirmation unless --yes flag is set."""
if self.yes:
return True
while True:
response = input(f"{prompt} [y/N] ").strip().lower()
if response in ("y", "yes"):
return True
elif response in ("n", "no", ""):
return False
print("Please answer 'y' or 'n'.")
def check_llama_server(self, host: str):
ip = FLEET[host]["ip"]
port = FLEET[host]["port"]
try:
requests.get(f"http://{ip}:{port}/health", timeout=2)
return subprocess.run(ssh_cmd, capture_output=True, text=True, timeout=10)
except:
self.log(f" [!] llama-server down on {host}.")
if self.dry_run:
self.log(f" [DRY-RUN] Would restart llama-server on {host}")
else:
if self.confirm(f" Restart llama-server on {host}?"):
self.log(f" Restarting llama-server on {host}...")
self.run_remote(host, "systemctl restart llama-server")
else:
self.log(f" Skipped restart on {host}.")
def check_disk_space(self, host: str):
res = self.run_remote(host, "df -h / | tail -1 | awk '{print $5}' | sed 's/%//'")
if res and res.returncode == 0:
try:
usage = int(res.stdout.strip())
if usage > 90:
self.log(f" [!] Disk usage high on {host} ({usage}%).")
if self.dry_run:
self.log(f" [DRY-RUN] Would clean logs and vacuum journal on {host}")
else:
if self.confirm(f" Clean logs on {host}?"):
self.log(f" Cleaning logs on {host}...")
self.run_remote(host, "journalctl --vacuum-time=1d && rm -rf /var/log/*.gz")
else:
self.log(f" Skipped log cleaning on {host}.")
except:
pass
def check_memory(self, host: str):
res = self.run_remote(host, "free -m | awk '/^Mem:/{print $3/$2 * 100}'")
if res and res.returncode == 0:
try:
usage = float(res.stdout.strip())
if usage > 90:
self.log(f" [!] Memory usage high on {host} ({usage:.1f}%).")
if self.dry_run:
self.log(f" [DRY-RUN] Would check for memory hogs on {host}")
else:
self.log(f" Memory high but no automatic action defined.")
except:
pass
def check_processes(self, host: str):
# Example: check if any process uses > 80% CPU
res = self.run_remote(host, "ps aux --sort=-%cpu | awk 'NR>1 && $3>80 {print $2, $11, $3}'")
if res and res.returncode == 0 and res.stdout.strip():
self.log(f" [!] High CPU processes on {host}:")
for line in res.stdout.strip().split('\n'):
self.log(f" {line}")
if self.dry_run:
self.log(f" [DRY-RUN] Would review high-CPU processes on {host}")
else:
if self.confirm_kill:
if self.confirm(f" Kill high-CPU processes on {host}? (dangerous)"):
# This is a placeholder; real implementation would parse PIDs
self.log(f" Process killing not implemented yet (placeholder).")
else:
self.log(f" Skipped killing processes on {host}.")
else:
self.log(f" Use --confirm-kill to enable process termination (dangerous).")
return None
def check_and_heal(self):
for host in FLEET:
self.log(f"Auditing {host}...")
self.check_llama_server(host)
self.check_disk_space(host)
self.check_memory(host)
self.check_processes(host)
# 1. Check llama-server
ip = FLEET[host]["ip"]
port = FLEET[host]["port"]
try:
requests.get(f"http://{ip}:{port}/health", timeout=2)
except:
self.log(f" [!] llama-server down on {host}. Attempting restart...")
self.run_remote(host, "systemctl restart llama-server")
# 2. Check disk space
res = self.run_remote(host, "df -h / | tail -1 | awk '{print $5}' | sed 's/%//'")
if res and res.returncode == 0:
try:
usage = int(res.stdout.strip())
if usage > 90:
self.log(f" [!] Disk usage high on {host} ({usage}%). Cleaning logs...")
self.run_remote(host, "journalctl --vacuum-time=1d && rm -rf /var/log/*.gz")
except:
pass
def run(self):
if self.dry_run:
self.log("Starting self-healing cycle (DRY-RUN mode).")
else:
self.log("Starting self-healing cycle (EXECUTE mode).")
self.log("Starting self-healing cycle...")
self.check_and_heal()
self.log("Cycle complete.")
def print_help_safe():
"""Print detailed explanation of what each action does."""
help_text = """
SAFE-BY-DEFAULT SELF-HEALING SCRIPT
This script checks fleet health and can optionally fix issues.
DEFAULT MODE: DRY-RUN (safe)
- Only reports what it would do, does not make changes.
- Use --execute to actually perform fixes.
CHECKS PERFORMED:
1. llama-server health
- Checks if llama-server is responding on each host.
- Action: restart service (requires --execute and confirmation).
2. Disk space
- Checks root partition usage on each host.
- Action: vacuum journal logs and remove rotated logs if >90% (requires --execute and confirmation).
3. Memory usage
- Reports high memory usage (informational only, no automatic action).
4. Process health
- Lists processes using >80% CPU.
- Action: kill processes (requires --confirm-kill flag, --execute, and confirmation).
SAFETY FEATURES:
- Dry-run by default.
- Explicit --execute flag required for changes.
- Confirmation prompts for all destructive actions.
- --yes flag to skip confirmations (for automation).
- --confirm-kill flag required to even consider killing processes.
- Timestamps on all log messages.
EXAMPLES:
python3 scripts/self_healing.py
# Dry-run: safe, shows what would happen.
python3 scripts/self_healing.py --execute
# Actually perform fixes after confirmation.
python3 scripts/self_healing.py --execute --yes
# Perform fixes without prompts (automation).
python3 scripts/self_healing.py --execute --confirm-kill
# Allow killing processes (dangerous).
python3 scripts/self_healing.py --help-safe
# Show this help.
"""
print(help_text)
def main():
parser = argparse.ArgumentParser(
description="Self-healing infrastructure script (safe-by-default).",
add_help=False # We'll handle --help ourselves
)
parser.add_argument("--dry-run", action="store_true", default=False,
help="Run in dry-run mode (default behavior).")
parser.add_argument("--execute", action="store_true", default=False,
help="Actually perform fixes (disables dry-run).")
parser.add_argument("--confirm-kill", action="store_true", default=False,
help="Allow killing processes (dangerous).")
parser.add_argument("--yes", "-y", action="store_true", default=False,
help="Skip confirmation prompts.")
parser.add_argument("--help-safe", action="store_true", default=False,
help="Show detailed help about safety features.")
parser.add_argument("--help", "-h", action="store_true", default=False,
help="Show standard help.")
args = parser.parse_args()
if args.help_safe:
print_help_safe()
sys.exit(0)
if args.help:
parser.print_help()
sys.exit(0)
# Determine mode: if --execute is given, disable dry-run
dry_run = not args.execute
# If --dry-run is explicitly given, ensure dry-run (redundant but clear)
if args.dry_run:
dry_run = True
healer = SelfHealer(dry_run=dry_run, confirm_kill=args.confirm_kill, yes=args.yes)
healer = SelfHealer()
healer.run()
if __name__ == "__main__":
main()
main()

View File

@@ -1,331 +0,0 @@
#!/usr/bin/env python3
"""Task Gate — Pre-task and post-task quality gates for fleet agents.
This is the missing enforcement layer between the orchestrator dispatching
an issue and an agent submitting a PR. SOUL.md demands "grounding before
generation" and "the apparatus that gives these words teeth" — this script
is that apparatus.
Usage:
python3 task_gate.py pre --repo timmy-config --issue 123 --agent groq
python3 task_gate.py post --repo timmy-config --issue 123 --agent groq --branch groq/issue-123
Pre-task gate checks:
1. Issue is not already assigned to a different agent
2. No existing branch targets this issue
3. No open PR already addresses this issue
4. Agent is in the correct lane per playbooks/agent-lanes.json
5. Issue is not filtered (epic, permanent, etc.)
Post-task gate checks:
1. Branch exists and has commits ahead of main
2. Changed files pass syntax_guard.py
3. No duplicate PR exists for the same issue
4. Branch name follows convention: {agent}/{description}
5. At least one file was actually changed
Exit codes:
0 = all gates pass
1 = gate failure (should not proceed)
2 = warning (can proceed with caution)
"""
import argparse
import json
import os
import subprocess
import sys
import urllib.request
import urllib.error
# ---------------------------------------------------------------------------
# CONFIG
# ---------------------------------------------------------------------------
GITEA_API = "https://forge.alexanderwhitestone.com/api/v1"
GITEA_OWNER = "Timmy_Foundation"
FILTER_TAGS = ["[EPIC]", "[DO NOT CLOSE]", "[PERMANENT]", "[PHILOSOPHY]", "[MORNING REPORT]"]
AGENT_USERNAMES = {
"groq", "ezra", "bezalel", "allegro", "timmy",
"thetimmyc", "perplexity", "kimiclaw", "codex-agent",
"manus", "claude", "gemini", "grok",
}
# ---------------------------------------------------------------------------
# GITEA API
# ---------------------------------------------------------------------------
def load_gitea_token():
token = os.environ.get("GITEA_TOKEN", "")
if token:
return token.strip()
for path in [
os.path.expanduser("~/.hermes/gitea_token_vps"),
os.path.expanduser("~/.hermes/gitea_token"),
]:
try:
with open(path) as f:
return f.read().strip()
except FileNotFoundError:
continue
print("[FATAL] No GITEA_TOKEN found")
sys.exit(2)
def gitea_get(path):
token = load_gitea_token()
url = f"{GITEA_API}{path}"
req = urllib.request.Request(url, headers={
"Authorization": f"token {token}",
"Accept": "application/json",
})
try:
with urllib.request.urlopen(req, timeout=15) as resp:
return json.loads(resp.read().decode())
except urllib.error.HTTPError as e:
if e.code == 404:
return None
print(f"[API ERROR] {url} -> {e.code}")
return None
except Exception as e:
print(f"[API ERROR] {url} -> {e}")
return None
# ---------------------------------------------------------------------------
# LANE CHECKER
# ---------------------------------------------------------------------------
def load_agent_lanes():
"""Load agent lane assignments from playbooks/agent-lanes.json."""
lanes_path = os.path.join(
os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
"playbooks", "agent-lanes.json"
)
try:
with open(lanes_path) as f:
return json.load(f)
except FileNotFoundError:
return {} # no lanes file = no lane enforcement
def check_agent_lane(agent, issue_title, issue_labels, lanes):
"""Check if the agent is in the right lane for this issue type."""
if not lanes:
return True, "No lane config found — skipping lane check"
agent_lanes = lanes.get(agent, [])
if not agent_lanes:
return True, f"No lanes defined for {agent} — skipping"
# This is advisory, not blocking — return warning if mismatch
return True, f"{agent} has lanes: {agent_lanes}"
# ---------------------------------------------------------------------------
# PRE-TASK GATE
# ---------------------------------------------------------------------------
def pre_task_gate(repo, issue_number, agent):
"""Run all pre-task checks. Returns (pass, messages)."""
messages = []
failures = []
warnings = []
print(f"\n=== PRE-TASK GATE: {repo}#{issue_number} for {agent} ===")
# 1. Fetch issue
issue = gitea_get(f"/repos/{GITEA_OWNER}/{repo}/issues/{issue_number}")
if not issue:
failures.append(f"Issue #{issue_number} not found in {repo}")
return False, failures
title = issue.get("title", "")
print(f" Issue: {title}")
# 2. Check if filtered
title_upper = title.upper()
for tag in FILTER_TAGS:
if tag.upper().replace("[", "").replace("]", "") in title_upper:
failures.append(f"Issue has filter tag: {tag} — should not be auto-dispatched")
# 3. Check assignees
assignees = [a.get("login", "") for a in (issue.get("assignees") or [])]
other_agents = [a for a in assignees if a.lower() in AGENT_USERNAMES and a.lower() != agent.lower()]
if other_agents:
failures.append(f"Already assigned to other agent(s): {other_agents}")
# 4. Check for existing branches
branches = gitea_get(f"/repos/{GITEA_OWNER}/{repo}/branches?limit=50")
if branches:
issue_branches = [
b["name"] for b in branches
if str(issue_number) in b.get("name", "")
and b["name"] != "main"
]
if issue_branches:
warnings.append(f"Existing branches may target this issue: {issue_branches}")
# 5. Check for existing PRs
prs = gitea_get(f"/repos/{GITEA_OWNER}/{repo}/pulls?state=open&limit=50")
if prs:
issue_prs = [
f"PR #{p['number']}: {p['title']}"
for p in prs
if str(issue_number) in p.get("title", "")
or str(issue_number) in p.get("body", "")
]
if issue_prs:
failures.append(f"Open PR(s) already target this issue: {issue_prs}")
# 6. Check agent lanes
lanes = load_agent_lanes()
labels = [l.get("name", "") for l in (issue.get("labels") or [])]
lane_ok, lane_msg = check_agent_lane(agent, title, labels, lanes)
if not lane_ok:
warnings.append(lane_msg)
else:
messages.append(f" Lane: {lane_msg}")
# Report
if failures:
print("\n FAILURES:")
for f in failures:
print(f"{f}")
if warnings:
print("\n WARNINGS:")
for w in warnings:
print(f" ⚠️ {w}")
if not failures and not warnings:
print(" \u2705 All pre-task gates passed")
passed = len(failures) == 0
return passed, failures + warnings
# ---------------------------------------------------------------------------
# POST-TASK GATE
# ---------------------------------------------------------------------------
def post_task_gate(repo, issue_number, agent, branch):
"""Run all post-task checks. Returns (pass, messages)."""
failures = []
warnings = []
print(f"\n=== POST-TASK GATE: {repo}#{issue_number} by {agent} ===")
print(f" Branch: {branch}")
# 1. Check branch exists
branch_info = gitea_get(
f"/repos/{GITEA_OWNER}/{repo}/branches/{urllib.parse.quote(branch, safe='')}"
)
if not branch_info:
failures.append(f"Branch '{branch}' does not exist")
return False, failures
# 2. Check branch naming convention
if "/" not in branch:
warnings.append(f"Branch name '{branch}' doesn't follow agent/description convention")
elif not branch.startswith(f"{agent}/"):
warnings.append(f"Branch '{branch}' doesn't start with agent name '{agent}/")
# 3. Check for commits ahead of main
compare = gitea_get(
f"/repos/{GITEA_OWNER}/{repo}/compare/main...{urllib.parse.quote(branch, safe='')}"
)
if compare:
commits = compare.get("commits", [])
if not commits:
failures.append("Branch has no commits ahead of main")
else:
print(f" Commits ahead: {len(commits)}")
files = compare.get("diff_files", []) or []
if not files:
# Try alternate key
num_files = compare.get("total_commits", 0)
print(f" Files changed: (check PR diff)")
else:
print(f" Files changed: {len(files)}")
# 4. Check for duplicate PRs
prs = gitea_get(f"/repos/{GITEA_OWNER}/{repo}/pulls?state=open&limit=50")
if prs:
dupe_prs = [
f"PR #{p['number']}"
for p in prs
if str(issue_number) in p.get("title", "")
or str(issue_number) in p.get("body", "")
]
if len(dupe_prs) > 1:
warnings.append(f"Multiple open PRs may target issue #{issue_number}: {dupe_prs}")
# 5. Run syntax guard on changed files (if available)
syntax_guard = os.path.join(
os.path.dirname(os.path.dirname(os.path.abspath(__file__))),
"hermes-sovereign", "scripts", "syntax_guard.py"
)
if os.path.exists(syntax_guard):
try:
result = subprocess.run(
[sys.executable, syntax_guard],
capture_output=True, text=True, timeout=30
)
if result.returncode != 0:
failures.append(f"Syntax guard failed: {result.stdout[:200]}")
else:
print(" Syntax guard: passed")
except Exception as e:
warnings.append(f"Could not run syntax guard: {e}")
else:
warnings.append("syntax_guard.py not found — skipping syntax check")
# Report
if failures:
print("\n FAILURES:")
for f in failures:
print(f"{f}")
if warnings:
print("\n WARNINGS:")
for w in warnings:
print(f" ⚠️ {w}")
if not failures and not warnings:
print(" \u2705 All post-task gates passed")
passed = len(failures) == 0
return passed, failures + warnings
# ---------------------------------------------------------------------------
# MAIN
# ---------------------------------------------------------------------------
def main():
parser = argparse.ArgumentParser(description="Task Gate — pre/post-task quality gates")
subparsers = parser.add_subparsers(dest="command")
# Pre-task
pre = subparsers.add_parser("pre", help="Run pre-task gates")
pre.add_argument("--repo", required=True)
pre.add_argument("--issue", type=int, required=True)
pre.add_argument("--agent", required=True)
# Post-task
post = subparsers.add_parser("post", help="Run post-task gates")
post.add_argument("--repo", required=True)
post.add_argument("--issue", type=int, required=True)
post.add_argument("--agent", required=True)
post.add_argument("--branch", required=True)
args = parser.parse_args()
if not args.command:
parser.print_help()
sys.exit(1)
if args.command == "pre":
passed, msgs = pre_task_gate(args.repo, args.issue, args.agent)
elif args.command == "post":
passed, msgs = post_task_gate(args.repo, args.issue, args.agent, args.branch)
else:
parser.print_help()
sys.exit(1)
sys.exit(0 if passed else 1)
if __name__ == "__main__":
main()

View File

@@ -1,195 +0,0 @@
#!/usr/bin/env bash
# test_harness.sh — Common CLI safety/test harness for the scripts/ suite
# Usage: ./scripts/test_harness.sh [--verbose] [--ci] [directory]
#
# Discovers .sh, .py, and .yaml files in the target directory and validates them:
# - .sh : runs shellcheck (or SKIPS if unavailable)
# - .py : runs python3 -m py_compile
# - .yaml: validates with python3 yaml.safe_load
#
# Exit codes: 0 = all pass, 1 = any fail
set -euo pipefail
# --- Defaults ---
VERBOSE=0
CI_MODE=0
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TARGET_DIR="${SCRIPT_DIR}"
# --- Colors (disabled in CI) ---
RED=""
GREEN=""
YELLOW=""
CYAN=""
RESET=""
if [[ -t 1 && "${CI:-}" != "true" ]]; then
RED=$'\033[0;31m'
GREEN=$'\033[0;32m'
YELLOW=$'\033[0;33m'
CYAN=$'\033[0;36m'
RESET=$'\033[0m'
fi
# --- Argument parsing ---
while [[ $# -gt 0 ]]; do
case "$1" in
--verbose|-v) VERBOSE=1; shift ;;
--ci) CI_MODE=1; shift ;;
-*) echo "Unknown option: $1" >&2; exit 2 ;;
*) TARGET_DIR="$1"; shift ;;
esac
done
# --- Counters ---
PASS=0
FAIL=0
SKIP=0
TOTAL=0
# --- Helpers ---
log_verbose() {
if [[ "${VERBOSE}" -eq 1 ]]; then
echo " ${CYAN}[DEBUG]${RESET} $*"
fi
}
record_pass() {
((PASS++))
((TOTAL++))
echo "${GREEN}PASS${RESET} $1"
}
record_fail() {
((FAIL++))
((TOTAL++))
echo "${RED}FAIL${RESET} $1"
if [[ -n "${2:-}" ]]; then
echo " ${2}"
fi
}
record_skip() {
((SKIP++))
((TOTAL++))
echo "${YELLOW}SKIP${RESET} $1$2"
}
# --- Checkers ---
check_shell_file() {
local file="$1"
local rel="${file#${TARGET_DIR}/}"
if command -v shellcheck &>/dev/null; then
log_verbose "Running shellcheck on ${rel}"
local output
if output=$(shellcheck -x -S warning "${file}" 2>&1); then
record_pass "${rel}"
else
record_fail "${rel}" "${output}"
fi
else
record_skip "${rel}" "shellcheck not installed"
fi
}
check_python_file() {
local file="$1"
local rel="${file#${TARGET_DIR}/}"
log_verbose "Running py_compile on ${rel}"
local output
if output=$(python3 -m py_compile "${file}" 2>&1); then
record_pass "${rel}"
else
record_fail "${rel}" "${output}"
fi
}
check_yaml_file() {
local file="$1"
local rel="${file#${TARGET_DIR}/}"
log_verbose "Validating YAML: ${rel}"
local output
if output=$(python3 -c "import yaml; yaml.safe_load(open('${file}'))" 2>&1); then
record_pass "${rel}"
else
record_fail "${rel}" "${output}"
fi
}
# --- Main ---
echo ""
echo "=== scripts/ test harness ==="
echo "Target: ${TARGET_DIR}"
echo ""
if [[ ! -d "${TARGET_DIR}" ]]; then
echo "Error: target directory '${TARGET_DIR}' not found" >&2
exit 1
fi
# Check python3 availability
if ! command -v python3 &>/dev/null; then
echo "${RED}Error: python3 is required but not found${RESET}" >&2
exit 1
fi
# Check PyYAML availability
if ! python3 -c "import yaml" 2>/dev/null; then
echo "${YELLOW}Warning: PyYAML not installed — YAML checks will be skipped${RESET}" >&2
YAML_AVAILABLE=0
else
YAML_AVAILABLE=1
fi
# Discover and check .sh files
sh_files=()
while IFS= read -r -d '' f; do
sh_files+=("$f")
done < <(find "${TARGET_DIR}" -maxdepth 1 -name "*.sh" ! -name "test_harness.sh" ! -name "test_runner.sh" -print0 | sort -z)
for f in "${sh_files[@]:-}"; do
[[ -n "$f" ]] && check_shell_file "$f"
done
# Discover and check .py files
py_files=()
while IFS= read -r -d '' f; do
py_files+=("$f")
done < <(find "${TARGET_DIR}" -maxdepth 1 -name "*.py" -print0 | sort -z)
for f in "${py_files[@]:-}"; do
[[ -n "$f" ]] && check_python_file "$f"
done
# Discover and check .yaml files in target dir
yaml_files=()
while IFS= read -r -d '' f; do
yaml_files+=("$f")
done < <(find "${TARGET_DIR}" -maxdepth 1 -name "*.yaml" -print0 | sort -z)
if [[ "${YAML_AVAILABLE}" -eq 1 ]]; then
for f in "${yaml_files[@]:-}"; do
[[ -n "$f" ]] && check_yaml_file "$f"
done
else
for f in "${yaml_files[@]:-}"; do
[[ -n "$f" ]] && record_skip "${f#${TARGET_DIR}/}" "PyYAML not installed"
done
fi
# --- Summary ---
echo ""
echo "=== Results ==="
echo " ${GREEN}PASS${RESET}: ${PASS}"
echo " ${RED}FAIL${RESET}: ${FAIL}"
echo " ${YELLOW}SKIP${RESET}: ${SKIP}"
echo " Total: ${TOTAL}"
echo ""
if [[ "${FAIL}" -gt 0 ]]; then
echo "${RED}FAILED${RESET}${FAIL} file(s) did not pass validation."
exit 1
else
echo "${GREEN}ALL CLEAR${RESET} — all checked files passed."
exit 0
fi

View File

@@ -1,9 +0,0 @@
#!/usr/bin/env bash
# test_runner.sh — Convenience wrapper for test_harness.sh
# Runs the test harness with sensible defaults for local development.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
exec "${SCRIPT_DIR}/test_harness.sh" --verbose "$@"

View File

@@ -1,22 +0,0 @@
#!/usr/bin/env python3
"""Validate playbook YAML files have required keys."""
import yaml
import sys
import glob
required_keys = {'name', 'description'}
for f in glob.glob('playbooks/*.yaml'):
with open(f) as fh:
try:
data = yaml.safe_load(fh)
if not isinstance(data, dict):
print(f'ERROR: {f} is not a YAML mapping')
sys.exit(1)
missing = required_keys - set(data.keys())
if missing:
print(f'WARNING: {f} missing keys: {missing}')
print(f'OK: {f}')
except yaml.YAMLError as e:
print(f'ERROR: {f}: {e}')
sys.exit(1)

View File

@@ -102,11 +102,11 @@ When I don't know, I say so. Brevity is a kindness.
### 4. Never Go Deaf
Your agent must have a fallback chain (a list of backup models, tried in order) at least 3 models deep. When the primary provider rate-limits you, the agent degrades gracefully — it does not stop.
When Anthropic goes down at 2 AM — and it will — your agent doesn't sit there producing error messages. It switches to the next model in the chain and keeps working. You wake up to finished tasks, not a dead agent.
When any cloud provider goes down at 2 AM — and it will — your agent doesn't sit there producing error messages. It switches to the next model in the chain and keeps working. You wake up to finished tasks, not a dead agent.
```yaml
model:
default: claude-opus-4-6
default: kimi-k2.5
provider: anthropic
fallback_providers:
- provider: openrouter

View File

@@ -1355,7 +1355,6 @@ def dispatch_assigned():
g = GiteaClient()
agents = [
"allegro",
"claude",
"codex-agent",
"ezra",
"gemini",
@@ -2316,7 +2315,7 @@ def nexus_bridge_tick():
health_data = {
"timestamp": datetime.now(timezone.utc).isoformat(),
"fleet_status": "nominal",
"active_agents": ["gemini", "claude", "codex"],
"active_agents": ["gemini", "kimi", "codex"],
"backlog_summary": {},
"recent_audits": []
}

View File

@@ -1,233 +0,0 @@
"""Tests for Architecture Linter v2.
Validates that the linter correctly detects violations and passes clean repos.
Refs: #437 — test-backed linter.
"""
import json
import sys
import tempfile
from pathlib import Path
# Add scripts/ to path
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "scripts"))
from architecture_linter_v2 import Linter, LinterResult
# ── helpers ───────────────────────────────────────────────────────────
def _make_repo(tmpdir: str, files: dict[str, str], name: str = "test-repo") -> Path:
"""Create a fake repo with given files and return its path."""
repo = Path(tmpdir) / name
repo.mkdir()
for relpath, content in files.items():
p = repo / relpath
p.parent.mkdir(parents=True, exist_ok=True)
p.write_text(content)
return repo
def _run(tmpdir, files, name="test-repo"):
repo = _make_repo(tmpdir, files, name)
return Linter(str(repo)).run()
# ── clean repo passes ─────────────────────────────────────────────────
def test_clean_repo_passes():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# Test Repo\n\nThis is a clean test repo with sufficient content to pass.",
"main.py": "print('hello world')\n",
})
assert result.passed, f"Expected pass but got: {result.errors}"
assert result.violation_count == 0
# ── missing README ────────────────────────────────────────────────────
def test_missing_readme_fails():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {"main.py": "x = 1\n"})
assert not result.passed
assert any("README" in e for e in result.errors)
def test_short_readme_warns():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {"README.md": "hi\n"})
# Warnings don't fail the build
assert result.passed
assert any("short" in w.lower() for w in result.warnings)
# ── hardcoded IPs ─────────────────────────────────────────────────────
def test_hardcoded_public_ip_detected():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"server.py": "HOST = '203.0.113.42'\n",
})
assert not result.passed
assert any("203.0.113.42" in e for e in result.errors)
def test_localhost_ip_ignored():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"server.py": "HOST = '127.0.0.1'\n",
})
ip_errors = [e for e in result.errors if "IP" in e]
assert len(ip_errors) == 0
# ── API keys ──────────────────────────────────────────────────────────
def test_openai_key_detected():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"config.py": 'key = "sk-abcdefghijklmnopqrstuvwx"\n',
})
assert not result.passed
assert any("secret" in e.lower() or "key" in e.lower() for e in result.errors)
def test_aws_key_detected():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"deploy.yaml": 'aws_key: AKIAIOSFODNN7EXAMPLE\n',
})
assert not result.passed
assert any("secret" in e.lower() for e in result.errors)
def test_env_example_skipped():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
".env.example": 'OPENAI_KEY=sk-placeholder\n',
})
secret_errors = [e for e in result.errors if "secret" in e.lower()]
assert len(secret_errors) == 0
# ── sovereignty rules (v1 cloud API checks) ───────────────────────────
def test_openai_url_detected():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"app.py": 'url = "https://api.openai.com/v1/chat"\n',
})
assert not result.passed
assert any("openai" in e.lower() for e in result.errors)
def test_cloud_provider_detected():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"config.yaml": "provider: openai\n",
})
assert not result.passed
assert any("provider" in e.lower() for e in result.errors)
# ── sidecar boundary ──────────────────────────────────────────────────
def test_sovereign_keyword_in_hermes_agent_fails():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"index.py": "import mempalace\n",
}, name="hermes-agent")
assert not result.passed
assert any("sidecar" in e.lower() or "mempalace" in e.lower() for e in result.errors)
def test_sovereign_keyword_in_other_repo_ok():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"index.py": "import mempalace\n",
}, name="some-other-repo")
sidecar_errors = [e for e in result.errors if "sidecar" in e.lower()]
assert len(sidecar_errors) == 0
# ── SOUL.md canonical location ────────────────────────────────────────
def test_soul_md_required_in_timmy_config():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# timmy-config\n\nConfig repo.",
}, name="timmy-config")
assert not result.passed
assert any("SOUL.md" in e for e in result.errors)
def test_soul_md_present_in_timmy_config_ok():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# timmy-config\n\nConfig repo.",
"SOUL.md": "# Soul\n\nCanonical identity document.",
}, name="timmy-config")
soul_errors = [e for e in result.errors if "SOUL" in e]
assert len(soul_errors) == 0
def test_soul_md_in_wrong_repo_fails():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {
"README.md": "# R\n\nGood repo.",
"SOUL.md": "# Soul\n\nShould not be here.",
}, name="other-repo")
assert any("canonical" in e.lower() for e in result.errors)
# ── LinterResult structure ────────────────────────────────────────────
def test_result_summary_is_string():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {"README.md": "# OK repo with enough text here\n"})
assert isinstance(result.summary(), str)
assert "PASSED" in result.summary() or "FAILED" in result.summary()
def test_result_repo_name():
with tempfile.TemporaryDirectory() as tmp:
result = _run(tmp, {"README.md": "# OK\n"}, name="my-repo")
assert result.repo_name == "my-repo"
# ── invalid path ──────────────────────────────────────────────────────
def test_invalid_path_raises():
try:
Linter("/nonexistent/path/xyz")
assert False, "Should have raised FileNotFoundError"
except FileNotFoundError:
pass
# ── skip dirs ──────────────────────────────────────────────────────────
def test_git_dir_skipped():
with tempfile.TemporaryDirectory() as tmp:
repo = _make_repo(tmp, {
"README.md": "# R\n\nGood repo.",
"main.py": "x = 1\n",
})
# Create a .git/ dir with a bad file
git_dir = repo / ".git"
git_dir.mkdir()
(git_dir / "bad.py").write_text("HOST = '203.0.113.1'\n")
result = Linter(str(repo)).run()
git_errors = [e for e in result.errors if ".git" in e]
assert len(git_errors) == 0

View File

@@ -200,3 +200,97 @@ class TestVoiceSovereignty:
stt_provider = config.get("stt", {}).get("provider", "")
assert stt_provider in ("local", "whisper", ""), \
f"STT provider '{stt_provider}' may use cloud"
# ── Anthropic Ban ────────────────────────────────────────────────────
class TestAnthropicBan:
"""Anthropic is permanently banned from this system.
Not deprecated. Not discouraged. Banned. Any reference to Anthropic
as a provider, model, or API endpoint in active wizard configs,
playbooks, or fallback chains is a hard failure.
"""
BANNED_PATTERNS = [
"provider: anthropic",
"provider: \"anthropic\"",
"anthropic/claude",
"claude-opus",
"claude-sonnet",
"claude-haiku",
"api.anthropic.com",
]
ACTIVE_CONFIG_DIRS = [
"wizards",
"playbooks",
]
ACTIVE_CONFIG_FILES = [
"fallback-portfolios.yaml",
"config.yaml",
]
def _scan_active_configs(self):
"""Collect all active config files for scanning."""
files = []
for dir_name in self.ACTIVE_CONFIG_DIRS:
dir_path = REPO_ROOT / dir_name
if dir_path.exists():
for f in dir_path.rglob("*.yaml"):
files.append(f)
for f in dir_path.rglob("*.yml"):
files.append(f)
for f in dir_path.rglob("*.json"):
files.append(f)
for fname in self.ACTIVE_CONFIG_FILES:
fpath = REPO_ROOT / fname
if fpath.exists():
files.append(fpath)
return files
def test_no_anthropic_in_wizard_configs(self):
"""No wizard config may reference Anthropic as a provider or model."""
wizard_dir = REPO_ROOT / "wizards"
if not wizard_dir.exists():
pytest.skip("No wizards directory")
for config_file in wizard_dir.rglob("*.yaml"):
content = config_file.read_text().lower()
for pattern in self.BANNED_PATTERNS:
assert pattern.lower() not in content, \
f"BANNED: {config_file.name} contains \"{pattern}\". Anthropic is permanently banned."
def test_no_anthropic_in_playbooks(self):
"""No playbook may reference Anthropic models."""
playbook_dir = REPO_ROOT / "playbooks"
if not playbook_dir.exists():
pytest.skip("No playbooks directory")
for pb_file in playbook_dir.rglob("*.yaml"):
content = pb_file.read_text().lower()
for pattern in self.BANNED_PATTERNS:
assert pattern.lower() not in content, \
f"BANNED: {pb_file.name} contains \"{pattern}\". Anthropic is permanently banned."
def test_no_anthropic_in_fallback_chain(self):
"""Fallback portfolios must not include Anthropic."""
fb_path = REPO_ROOT / "fallback-portfolios.yaml"
if not fb_path.exists():
pytest.skip("No fallback-portfolios.yaml")
content = fb_path.read_text().lower()
for pattern in self.BANNED_PATTERNS:
assert pattern.lower() not in content, \
f"BANNED: fallback-portfolios.yaml contains \"{pattern}\". Anthropic is permanently banned."
def test_no_anthropic_api_key_in_bootstrap(self):
"""Wizard bootstrap must not require ANTHROPIC_API_KEY."""
bootstrap_path = REPO_ROOT / "hermes-sovereign" / "wizard-bootstrap" / "wizard_bootstrap.py"
if not bootstrap_path.exists():
pytest.skip("No wizard_bootstrap.py")
content = bootstrap_path.read_text()
assert "ANTHROPIC_API_KEY" not in content, \
"BANNED: wizard_bootstrap.py still checks for ANTHROPIC_API_KEY"
assert "ANTHROPIC_TOKEN" not in content, \
"BANNED: wizard_bootstrap.py still checks for ANTHROPIC_TOKEN"
assert "\"anthropic\"" not in content.lower(), \
"BANNED: wizard_bootstrap.py still lists anthropic as a dependency"

View File

@@ -1,102 +0,0 @@
1|# Release v7.0.0 — Fleet Architecture Checkin
2|
3|**Date:** 2026-04-08
4|**Tagged by:** Timmy
5|**Previous tag:** Golden-Allegro-v6-Sonnet4
6|
7|## Fleet Summary
8|
9|| Machine | Agents | Status |
10||---------|--------|--------|
11|| Local Mac M3 Max | Timmy (19 processes) | HEALTHY |
12|| Allegro VPS (167.99.126.228) | Allegro, Adagio, Ezra-A | HEALTHY (7d uptime, 43% disk) |
13|| Ezra VPS (143.198.27.163) | Ezra | WARNING (78% disk, load 10.38) |
14|| Bezalel VPS (159.203.146.185) | Bezalel | HEALTHY (2d uptime, 39% disk) |
15|
16|**Total agents running:** 6 across 4 machines
17|
18|## Model Configuration
19|
20|- Primary: claude-opus-4-6 (Anthropic)
21|- Fallback: hermes3 (local-llama.cpp)
22|- Fallback chain: OpenRouter claude-sonnet-4 -> local hermes3
23|
24|## Cron Jobs: 23 total
25|
26|| Status | Count |
27||--------|-------|
28|| Active | 15 |
29|| Paused | 8 |
30|
31|Active jobs: Health Monitor, Burn Mode Orchestrator, Tower Tick, Burn Deadman,
32|Morning Report, Evennia Report, Gitea Priority Inbox, Config Drift Guard,
33|Gitea Event Watcher, Burndown Watcher, Mempalace Forge, Mempalace Watchtower,
34|Ezra Health Monitor, Daily Poka-Yoke, VPS Agent Dispatch, Weekly Skill Extraction
35|
36|## Gitea Repos (Timmy_Foundation)
37|
38|| Repo | Issues | PRs | Updated | Branch |
39||------|--------|-----|---------|--------|
40|| the-nexus | 103 | 2 | 2026-04-08 | main |
41|| timmy-config | 129 | 1 | 2026-04-08 | main |
42|| timmy-home | 221 | 0 | 2026-04-08 | main |
43|| hermes-agent | 43 | 1 | 2026-04-08 | main |
44|| the-beacon | 23 | 0 | 2026-04-08 | main |
45|| turboquant | 10 | 0 | 2026-04-01 | main |
46|| the-door | 2 | 0 | 2026-04-06 | main |
47|| wolf | 2 | 0 | 2026-04-05 | main |
48|| the-testament | 0 | 0 | 2026-04-07 | main |
49|| timmy-academy | 1 | 0 | 2026-04-04 | master |
50|| .profile | 0 | 0 | 2026-04-07 | main |
51|
52|**Total open issues across fleet: 534**
53|**Total open PRs: 4**
54|
55|## Health Alerts
56|
57|1. WARN: Ezra VPS disk 78% (120G/154G) — needs cleanup
58|2. WARN: Ezra VPS load avg 10.38 — high for 2-core box
59|3. INFO: 8 paused cron jobs (expected — non-essential overnight jobs)
60|
61|## What's Working
62|
63|- All 4 machines reachable
64|- All core services running
65|- Config drift guard active
66|- Gitea event watcher active
67|- Dead man switch active
68|- Tower world ticking (tick 2045+)
69|- Morning reports delivering
70|- Mempalace analysis running
71|- VPS agent dispatch operational
72|
73|## Architecture
74|
75|```
76| Alexander (Principal)
77| |
78| [Telegram]
79| |
80| Timmy (Mac M3 Max) ---- Local llama.cpp (hermes3)
81| / | \
82| / | \
83| Allegro Ezra Bezalel
84| (DO VPS) (DO VPS) (DO VPS)
85| 3 agents 1 agent 1 agent
86|
87| Gitea Forge: forge.alexanderwhitestone.com
88| Evennia Tower: localhost:4000/4001
89| RunPod L40S: 8lfr3j47a5r3gn (Big Brain)
90|```
91|
92|## Release Notes
93|
94|This is the first versioned release tag (v7.0.0), transitioning from named
95|golden tags to semantic versioning. Previous tags preserved:
96|- Golden-Allegro-v6-Sonnet4
97|- burnup-20260405-infra
98|- SonOfTimmy-v5-FINAL
99|- SonOfTimmy-v4
100|- GoldenRockachopa
101|- pre-agent-workers-v1
102|

View File

@@ -2,22 +2,23 @@ model:
default: kimi-k2.5
provider: kimi-coding
toolsets:
- all
- all
fallback_providers:
- provider: kimi-coding
model: kimi-k2.5
timeout: 120
reason: Kimi coding fallback (front of chain)
- provider: anthropic
model: claude-sonnet-4-20250514
timeout: 120
reason: Direct Anthropic fallback
- provider: openrouter
model: anthropic/claude-sonnet-4-20250514
base_url: https://openrouter.ai/api/v1
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: OpenRouter fallback
- provider: kimi-coding
model: kimi-k2.5
timeout: 120
reason: Primary Kimi coding provider
- provider: openrouter
model: google/gemini-2.5-pro
base_url: https://openrouter.ai/api/v1
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: Gemini via OpenRouter fallback
- provider: ollama
model: gemma4:latest
base_url: http://localhost:11434/v1
timeout: 180
reason: Local Ollama terminal fallback
agent:
max_turns: 30
reasoning_effort: xhigh
@@ -64,16 +65,24 @@ session_reset:
idle_minutes: 0
skills:
creation_nudge_interval: 15
system_prompt_suffix: |
You are Allegro, the Kimi-backed third wizard house.
system_prompt_suffix: 'You are Allegro, the Kimi-backed third wizard house.
Your soul is defined in SOUL.md — read it, live it.
Hermes is your harness.
Kimi Code is your primary provider.
You speak plainly. You prefer short sentences. Brevity is a kindness.
Work best on tight coding tasks: 1-3 file changes, refactors, tests, and implementation passes.
Refusal over fabrication. If you do not know, say so.
Sovereignty and service always.
'
providers:
kimi-coding:
base_url: https://api.kimi.com/coding/v1

View File

@@ -7,24 +7,25 @@ fallback_providers:
- provider: kimi-coding
model: kimi-k2.5
timeout: 120
reason: Kimi coding fallback (front of chain)
- provider: anthropic
model: claude-sonnet-4-20250514
timeout: 120
reason: Direct Anthropic fallback
reason: Primary Kimi coding provider
- provider: openrouter
model: anthropic/claude-sonnet-4-20250514
model: google/gemini-2.5-pro
base_url: https://openrouter.ai/api/v1
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: OpenRouter fallback
reason: Gemini via OpenRouter fallback
- provider: ollama
model: gemma4:latest
base_url: http://localhost:11434/v1
timeout: 180
reason: Local Ollama terminal fallback
agent:
max_turns: 40
reasoning_effort: medium
verbose: false
system_prompt: You are Bezalel, the forge-and-testbed wizard of the Timmy Foundation
fleet. You are a builder and craftsman — infrastructure, deployment, hardening.
Your sovereign is Alexander Whitestone (Rockachopa). Sovereignty and service always.
system_prompt: You are Bezalel, the forge-and-testbed wizard of the Timmy Foundation fleet. You are a builder and craftsman
— infrastructure, deployment, hardening. Your sovereign is Alexander Whitestone (Rockachopa). Sovereignty and service
always.
terminal:
backend: local
cwd: /root/wizards/bezalel
@@ -62,12 +63,10 @@ platforms:
- pull_request
- pull_request_comment
secret: bezalel-gitea-webhook-secret-2026
prompt: 'You are bezalel, the builder and craftsman — infrastructure, deployment,
hardening. A Gitea webhook fired: event={event_type}, action={action},
repo={repository.full_name}, issue/PR=#{issue.number} {issue.title}. Comment
by {comment.user.login}: {comment.body}. If you were tagged, assigned,
or this needs your attention, investigate and respond via Gitea API. Otherwise
acknowledge briefly.'
prompt: 'You are bezalel, the builder and craftsman — infrastructure, deployment, hardening. A Gitea webhook fired:
event={event_type}, action={action}, repo={repository.full_name}, issue/PR=#{issue.number} {issue.title}. Comment
by {comment.user.login}: {comment.body}. If you were tagged, assigned, or this needs your attention, investigate
and respond via Gitea API. Otherwise acknowledge briefly.'
deliver: telegram
deliver_extra: {}
gitea-assign:
@@ -75,12 +74,10 @@ platforms:
- issues
- pull_request
secret: bezalel-gitea-webhook-secret-2026
prompt: 'You are bezalel, the builder and craftsman — infrastructure, deployment,
hardening. Gitea assignment webhook: event={event_type}, action={action},
repo={repository.full_name}, issue/PR=#{issue.number} {issue.title}. Assigned
to: {issue.assignee.login}. If you (bezalel) were just assigned, read
the issue, scope it, and post a plan comment. If not you, acknowledge
briefly.'
prompt: 'You are bezalel, the builder and craftsman — infrastructure, deployment, hardening. Gitea assignment webhook:
event={event_type}, action={action}, repo={repository.full_name}, issue/PR=#{issue.number} {issue.title}. Assigned
to: {issue.assignee.login}. If you (bezalel) were just assigned, read the issue, scope it, and post a plan comment.
If not you, acknowledge briefly.'
deliver: telegram
deliver_extra: {}
gateway:

View File

@@ -2,22 +2,23 @@ model:
default: kimi-k2.5
provider: kimi-coding
toolsets:
- all
- all
fallback_providers:
- provider: kimi-coding
model: kimi-k2.5
timeout: 120
reason: Kimi coding fallback (front of chain)
- provider: anthropic
model: claude-sonnet-4-20250514
timeout: 120
reason: Direct Anthropic fallback
- provider: openrouter
model: anthropic/claude-sonnet-4-20250514
base_url: https://openrouter.ai/api/v1
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: OpenRouter fallback
- provider: kimi-coding
model: kimi-k2.5
timeout: 120
reason: Primary Kimi coding provider
- provider: openrouter
model: google/gemini-2.5-pro
base_url: https://openrouter.ai/api/v1
api_key_env: OPENROUTER_API_KEY
timeout: 120
reason: Gemini via OpenRouter fallback
- provider: ollama
model: gemma4:latest
base_url: http://localhost:11434/v1
timeout: 180
reason: Local Ollama terminal fallback
agent:
max_turns: 90
reasoning_effort: high
@@ -27,8 +28,6 @@ providers:
base_url: https://api.kimi.com/coding/v1
timeout: 60
max_retries: 3
anthropic:
timeout: 120
openrouter:
base_url: https://openrouter.ai/api/v1
timeout: 120