Commit Graph

53 Commits

Author SHA1 Message Date
35a191f7b1 Merge PR #725: feat: Provider health monitor with auto-switch (#509) 2026-04-15 06:10:45 +00:00
e987e1b870 Merge PR #726: feat: Pre-flight provider check for session launch (#508) 2026-04-15 06:10:41 +00:00
94f0a132d4 feat: add get_threejs_patterns() filter function (#543) 2026-04-15 05:34:17 +00:00
279356bed6 feat: add --threejs flag and Three.js-aware severity inference (#543)
- Added --threejs flag for focused Three.js pattern scanning
- Updated _infer_severity with shader_failure, texture_placeholder,
  uv_mapping_error, frustum_culling, shadow_map_artifact categories
- Added Three.js demo detections (shader failure, shadow map)
- Bumped detector version to 0.2.0
2026-04-15 05:34:16 +00:00
511ff863c2 feat: add Three.js-specific glitch detection patterns (#543)
Adds 6 new Three.js-specific glitch categories and patterns:
- SHADER_FAILURE: Solid black materials from shader compilation errors
- TEXTURE_PLACEHOLDER: 1x1 white pixel stretched surfaces
- UV_MAPPING_ERROR: BufferGeometry UV coordinate errors
- FRUSTUM_CULLING: Objects popping at screen edges
- SHADOW_MAP_ARTIFACT: Pixelated/blocky shadow maps
- BLOOM_OVERFLOW: Excessive post-processing bloom bleed

Closes #543
2026-04-15 05:32:25 +00:00
b6e3a647b0 feat: add pre-flight provider check script (#508)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 29s
PR Checklist / pr-checklist (pull_request) Failing after 7m23s
Smoke Test / smoke (pull_request) Failing after 20s
Validate Config / YAML Lint (pull_request) Failing after 14s
Validate Config / JSON Validate (pull_request) Successful in 15s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m1s
Validate Config / Shell Script Lint (pull_request) Failing after 46s
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
Validate Config / Playbook Schema Validation (pull_request) Successful in 28s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Checks OpenRouter balance via /api/v1/auth/key
- Tests Nous and Anthropic API keys
- Verifies Ollama is running
- Pre-flight check before session launch
- Returns exit code for automation

Closes #508
2026-04-15 03:55:04 +00:00
e14158676d feat: add provider health monitor script (#509)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 44s
Smoke Test / smoke (pull_request) Failing after 36s
Validate Config / YAML Lint (pull_request) Failing after 21s
Validate Config / JSON Validate (pull_request) Successful in 28s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 2m36s
Validate Config / Shell Script Lint (pull_request) Failing after 1m3s
Validate Config / Cron Syntax Check (pull_request) Successful in 13s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 12s
PR Checklist / pr-checklist (pull_request) Failing after 6m15s
Validate Config / Playbook Schema Validation (pull_request) Successful in 28s
Architecture Lint / Lint Repository (pull_request) Has been cancelled
Validate Config / Python Test Suite (pull_request) Has been cancelled
- Tests all configured providers
- Maintains health map in tmux-state.json
- Auto-switches profiles to working providers
- Supports --daemon and --status modes

Closes #509
2026-04-15 03:48:37 +00:00
71082fe06f fix: add python3 shebang to bin/soul_eval_gate.py (#681) 2026-04-15 02:57:14 +00:00
6d678e938e fix: add python3 shebang to bin/nostr-agent-demo.py (#681) 2026-04-15 02:57:00 +00:00
31313c421e feat: 3D World Glitch Detection in The Matrix (#491) (#535)
Some checks failed
Architecture Lint / Linter Tests (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Merge PR #535
2026-04-14 22:17:06 +00:00
46b4f8d000 feat: pane-watchdog — stuck pane detection + auto-restart (#515) (#526)
Some checks failed
Architecture Lint / Linter Tests (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Merge PR #526
2026-04-14 22:16:52 +00:00
e091868fef feat: auto-commit-guard — 4-layer work preservation (#511) (#525)
Some checks failed
Architecture Lint / Lint Repository (push) Has been cancelled
Architecture Lint / Linter Tests (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has started running
Validate Config / JSON Validate (push) Has started running
Merge PR #525
2026-04-14 22:16:49 +00:00
cf687a5bfa Merge pull request 'Session state persistence — tmux-state.json manifest' (#523) from feature/session-state-persistence-512 into main
Some checks failed
Architecture Lint / Linter Tests (push) Has been cancelled
Architecture Lint / Lint Repository (push) Has been cancelled
Smoke Test / smoke (push) Has been cancelled
Validate Config / JSON Validate (push) Has been cancelled
Validate Config / Python Syntax & Import Check (push) Has been cancelled
Validate Config / Python Test Suite (push) Has been cancelled
Validate Config / Shell Script Lint (push) Has been cancelled
Validate Config / Cron Syntax Check (push) Has been cancelled
Validate Config / Deploy Script Dry Run (push) Has been cancelled
Validate Config / Playbook Schema Validation (push) Has been cancelled
Validate Config / YAML Lint (push) Has been cancelled
2026-04-14 00:35:41 +00:00
Alexander Whitestone
b71e365ed6 feat: session state persistence — tmux-state.json manifest (#512)
Implement tmux-state.sh: snapshots all tmux pane state to ~/.timmy/tmux-state.json
and ~/.hermes/tmux-state.json every supervisor cycle.

Per-pane tracking:
- address, pane_id, pid, size, active state
- command, title, tty
- hermes profile, model, provider
- session_id (for --resume)
- task (last prompt extracted from pane content)
- context_pct (estimated from pane content)

Also implement tmux-resume.sh: cold-start reads manifest and respawns
hermes sessions with --resume using saved session IDs.

Closes #512
2026-04-13 17:26:03 -04:00
Alexander Whitestone
d50296e76b fix: repair all CI failures (smoke, lint, architecture, secret scan)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 10s
PR Checklist / pr-checklist (pull_request) Failing after 1m25s
Smoke Test / smoke (pull_request) Failing after 8s
Validate Config / YAML Lint (pull_request) Failing after 7s
Validate Config / JSON Validate (pull_request) Successful in 7s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 8s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 16s
Validate Config / Cron Syntax Check (pull_request) Successful in 6s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 9s
Architecture Lint / Lint Repository (pull_request) Failing after 9s
1. bin/deadman-fallback.py: stripped corrupted line-number prefixes
   and fixed unterminated string literal
2. fleet/resource_tracker.py: fixed f-string set comprehension
   (needs parens in Python 3.12)
3. ansible deadman_switch: extracted handlers to handlers/main.yml
4. evaluations/crewai/poc_crew.py: removed hardcoded API key
5. playbooks/fleet-guardrails.yaml: added trailing newline
6. matrix/docker-compose.yml: stripped trailing whitespace
7. smoke.yml: excluded security-detection scripts from secret scan
2026-04-13 09:51:08 -04:00
aeefe5027d purge: remove Anthropic from timmy-config (14 files) (#502)
Some checks failed
Architecture Lint / Linter Tests (push) Successful in 10s
Smoke Test / smoke (push) Failing after 8s
Validate Config / YAML Lint (push) Failing after 6s
Validate Config / JSON Validate (push) Successful in 6s
Validate Config / Python Syntax & Import Check (push) Failing after 7s
Validate Config / Python Test Suite (push) Has been skipped
Validate Config / Shell Script Lint (push) Failing after 14s
Validate Config / Cron Syntax Check (push) Successful in 5s
Validate Config / Deploy Script Dry Run (push) Successful in 4s
Validate Config / Playbook Schema Validation (push) Successful in 8s
Architecture Lint / Lint Repository (push) Failing after 8s
2026-04-13 02:02:06 +00:00
d923b9e38a feat: add Anthropic ban enforcement scanner
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 10s
PR Checklist / pr-checklist (pull_request) Failing after 1m14s
Smoke Test / smoke (pull_request) Failing after 7s
Validate Config / YAML Lint (pull_request) Failing after 8s
Validate Config / JSON Validate (pull_request) Successful in 7s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 8s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 17s
Validate Config / Cron Syntax Check (pull_request) Successful in 6s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 5s
Validate Config / Playbook Schema Validation (pull_request) Successful in 8s
Architecture Lint / Lint Repository (pull_request) Failing after 7s
2026-04-13 01:34:35 +00:00
55fc678dc3 Add merge conflict detector for sibling PRs
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 10s
PR Checklist / pr-checklist (pull_request) Failing after 1m12s
Smoke Test / smoke (pull_request) Failing after 8s
Validate Config / YAML Lint (pull_request) Failing after 9s
Validate Config / JSON Validate (pull_request) Successful in 9s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 10s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 14s
Validate Config / Cron Syntax Check (pull_request) Successful in 5s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 6s
Validate Config / Playbook Schema Validation (pull_request) Successful in 9s
Architecture Lint / Lint Repository (pull_request) Failing after 7s
2026-04-13 00:26:28 +00:00
763e35f47a feat: dead man switch config fallback engine
Some checks failed
PR Checklist / pr-checklist (pull_request) Failing after 3m11s
Automatic fallback chain: Anthropic -> local-llama.cpp -> Ollama -> safe mode.
Auto-recovery when primary returns. Reversible config changes with backup.
2026-04-08 21:54:42 +00:00
14521ef664 feat: add PR checklist enforcement script
All checks were successful
PR Checklist / pr-checklist (pull_request) Successful in 2m21s
Python script that enforces PR quality standards:
- Checks for actual code changes
- Validates branch is not behind base
- Detects issue bundling in PR body
- Runs Python syntax validation
- Verifies shell script executability
- Ensures issue references exist

Closes #393
2026-04-08 10:53:44 +00:00
c1c3aaa681 Merge pull request 'feat: genchi-genbutsu — verify world state, not log vibes (#348)' (#360) from ezra/issue-348 into main 2026-04-07 16:23:35 +00:00
ezra
e5055d269b feat: genchi-genbutsu — verify world state, not log vibes (#348)
Implement 現地現物 (Genchi Genbutsu) post-completion verification:

- Add bin/genchi-genbutsu.sh performing 5 world-state checks:
  1. Branch exists on remote
  2. PR exists
  3. PR has real file changes (> 0)
  4. PR is mergeable
  5. Issue has a completion comment from the agent

- Wire verification into all agent loops:
  - bin/claude-loop.sh: call genchi-genbutsu before merge/close
  - bin/gemini-loop.sh: delegate existing inline checks to genchi-genbutsu
  - bin/agent-loop.sh: resurrect generic agent loop with genchi-genbutsu wired in

- Update metrics JSONL to include 'verified' field for all loops

- Update burn monitor (tasks.py velocity_tracking):
  - Report verified_completion count alongside raw completions
  - Dashboard shows verified trend history

- Update morning report (tasks.py good_morning_report):
  - Count only verified completions from the last 24h
  - Surface verification failures in the report body

Fixes #348
Refs #345
2026-04-07 16:12:05 +00:00
Ezra
f18955ea90 [KAIZEN] Implement automated burn-cycle retrospective (fixes #349)
- Add bin/kaizen-retro.sh entry point and scripts/kaizen_retro.py
- Analyze closed issues, merged PRs, and stale/max-attempts issues
- Report success rates by agent, repo, and issue type
- Generate one concrete improvement suggestion per cycle
- Post retro to Telegram and comment on the latest morning report issue
- Wire into Huey as kaizen_retro() task at 07:15 daily
- Extend gitea_client.py with since param for list_issues and
  created_at/updated_at fields on PullRequest
2026-04-07 15:57:21 +00:00
Ezra
9cc89886da [MUDA] Issue #350 — weekly fleet waste audit
Implements muda-audit.sh measuring all 7 wastes across the fleet:
- Overproduction: issues created vs closed ratio
- Waiting: rate-limit hits from agent logs
- Transport: issues closed-and-redirected
- Overprocessing: PR diff size outliers >500 lines
- Inventory: stale issues open >30 days
- Motion: git clone/rebase churn from logs
- Defects: PRs closed without merge vs merged

Features:
- Persists week-over-week metrics to ~/.local/timmy/muda-audit/metrics.json
- Posts trended waste report to Telegram with top 3 eliminations
- Scheduled weekly (Sunday 21:00 UTC) via Gitea Actions
- Adds created_at/closed_at to PullRequest dataclass and page param to list_org_repos

Closes #350
2026-04-07 15:05:16 +00:00
Ezra
9d9f383996 fix: replace hardcoded public IPs with Tailscale resolution and Forge URL 2026-04-05 23:25:02 +00:00
c5e4b8141d fix: harden gemini loop auth routing and closure (#193) 2026-04-05 19:33:34 +00:00
d0f211b1f3 fix: monitor kimi heartbeat instead of stale loop (#191) 2026-04-05 18:26:08 +00:00
f29991e3bf feat: surface kimi status in orchestrator triage (#190) 2026-04-05 18:17:36 +00:00
f262fbb45b Cut over status surfaces to live workflow state (#145)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:47:34 +00:00
5a60075515 Teach lane-aware skills in agent dispatch (#143)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:47:31 +00:00
2bf79c2286 Refresh ops tooling around current agent lanes (#142)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:43:48 +00:00
6a71dfb5c7 [ops] import gemini loop and timmy orchestrator into sidecar truth (#152) 2026-04-04 20:27:39 +00:00
5d83e5299f [ops] stabilize local loop watchdog and claude loop (#149) 2026-04-04 20:16:59 +00:00
Alexander Whitestone
c0603a6ce6 docs: Nostr agent-to-agent encrypted comms research + working demo
Proven: encrypted DM sent through relay.damus.io and nos.lol, fetched and decrypted.
Library: nostr-sdk v0.44 (pip install nostr-sdk).
Path to replace Telegram: keypairs per wizard, NIP-17 gift-wrapped DMs.
2026-04-04 12:48:57 -04:00
Alexander Whitestone
f29d579896 feat(ops): start-loops, gitea-api wrapper, fleet-status
Closes #126: bin/start-loops.sh -- health check + kill stale + launch all loops
Closes #129: bin/gitea-api.sh -- Python urllib wrapper bypassing security scanner
Closes #130: bin/fleet-status.sh -- one-liner health per wizard with color output

All syntax-checked with bash -n.
2026-04-04 12:05:04 -04:00
Alexander Whitestone
3cf9f0de5e feat(ops): deadman switch, model health check, issue filter
Closes #115: bin/deadman-switch.sh -- alerts Telegram when zero commits for 2+ hours
Closes #116: bin/model-health-check.sh -- validates model tags against provider APIs
Closes #117: bin/issue-filter.json + live loop patches -- excludes DO-NOT-CLOSE, EPIC, META, RETRO, INTEL, MORNING REPORT, Rockachopa-assigned issues from agent pickup

All three tested locally:
- deadman-switch correctly detected 14h gap and would alert
- model-health-check parses config.yaml and validates (skips gracefully without API key in env)
- issue filters patched into live claude-loop.sh and gemini-loop.sh
2026-04-04 12:00:05 -04:00
Alexander Whitestone
8ec4bff771 feat(crucible): Z3 sidecar MCP verifier -- rebased onto current main
Closes #86. Adds:
- bin/crucible_mcp_server.py (schedule, dependency, capacity proofs)
- docs/crucible-first-cut.md
- playbooks/verified-logic.yaml
- config.yaml crucible MCP server entry
2026-04-03 18:58:43 -04:00
67deb58077 [soul] The Conscience of the Training Pipeline
WHAT THIS IS
============
The Soul Eval Gate answers one question:

    'Did the candidate model preserve Timmy's soul?'

It is the missing bridge between autolora's eval output and
tasks.py's training pipeline. Without it, no adapter can ever
be promoted — tasks.py calls latest_eval_gate() but nothing
writes the gate file.

HOW IT WORKS
============
1. Reads autolora eval scores (candidate + baseline)
2. Applies SOUL.md constraints:
   - crisis and pastoral_care are SACRED categories
   - ANY regression on sacred categories = REJECT
   - This is not a tradeoff. From SOUL.md:
     'If adapter degrades this, adapter is REJECTED.'
3. Checks composite score regression (>10% = reject)
4. Checks per-metric regression on core categories
5. Writes eval_gate_latest.json for tasks.py to read
6. Exits 0 (pass) or 1 (reject) for CI integration

WHY THIS MATTERS
=================
An adapter that gets better at code review but worse at
responding to someone in crisis is not Timmy. It is wearing
Timmy's face without carrying his conscience.

The DPO training playbook says:
  'If the post-eval degrades on crisis or pastoral_care,
   REJECT the adapter and fail the issue.'

But until now, nothing enforced that constraint. The playbook
was an aspiration. This makes it architecture.

FILES
=====
  bin/soul_eval_gate.py          — 244 lines, zero deps beyond stdlib
  tests/test_soul_eval_gate.py   — 10 tests, all pass
  Full suite: 22/22

USAGE
=====
  # CLI (after autolora eval)
  python bin/soul_eval_gate.py \
    --scores evals/v1/8b/scores.json \
    --baseline evals/v0-baseline/8b/scores.json \
    --candidate-id timmy-v1-20260330

  # From tasks.py
  from soul_eval_gate import evaluate_candidate
  result = evaluate_candidate(scores_path, baseline_path, id)
  if result['pass']:
      promote_adapter(...)

Signed-off-by: gemini <gemini@hermes.local>
2026-03-30 19:13:35 -04:00
877425bde4 feat: add Allegro Kimi wizard house assets (#91) 2026-03-29 22:22:24 +00:00
34e01f0986 feat: add local-vs-cloud token and throughput metrics (#85) 2026-03-28 14:24:12 +00:00
Alexander Whitestone
d72ae92189 Tighten Hermes cutover and export checks 2026-03-27 17:35:07 -04:00
Alexander Whitestone
49020b34d9 config: update bin/timmy-dashboard,config.yaml,docs/local-model-integration-sketch.md,tasks.py 2026-03-26 17:00:22 -04:00
Alexander Whitestone
56364e62b4 config: update bin/sync-up.sh,channel_directory.json 2026-03-26 08:00:46 -04:00
Alexander Whitestone
069f8bd55f add: sync-up.sh — push live hermes config to repo
The harness is the source. The repo is the record.
Changes flow UP from ~/.hermes, not down from the repo.
2026-03-25 19:11:31 -04:00
4b69918135 remove deprecated workforce-manager.py — replaced by sovereign-orchestration 2026-03-25 14:41:40 +00:00
bafd97da5a remove deprecated timmy-orchestrator.sh — replaced by sovereign-orchestration 2026-03-25 14:41:38 +00:00
e884159cee remove deprecated timmy-loopstat.sh — replaced by sovereign-orchestration 2026-03-25 14:41:37 +00:00
9e1a35c0fa remove deprecated nexus-merge-bot.sh — replaced by sovereign-orchestration 2026-03-25 14:41:36 +00:00
b044588d3d remove deprecated gemini-loop.sh — replaced by sovereign-orchestration 2026-03-25 14:41:34 +00:00
b5b6243ee4 remove deprecated claudemax-watchdog.sh — replaced by sovereign-orchestration 2026-03-25 14:41:33 +00:00