173 Commits

Author SHA1 Message Date
2d0e4ffd41 Add Conduit configuration scaffold (#183) 2026-04-05 00:06:13 +00:00
4a70ba5993 Add Conduit Docker Compose configuration (#183) 2026-04-05 00:06:12 +00:00
7172d26547 Add Matrix/Conduit prerequisites documentation (#183) 2026-04-05 00:05:25 +00:00
45ee2c6e2e Add Matrix/Conduit deployment scaffold README (#166, #183) 2026-04-05 00:05:24 +00:00
eb3a367472 Test upload 2026-04-05 00:04:22 +00:00
9340c16429 [docs] correct Nostur onboarding to working wss endpoint (#180) 2026-04-04 23:25:42 +00:00
57b4a96872 [COMMS] add operator onboarding for current Nostur edge and Matrix target (#178) 2026-04-04 23:00:19 +00:00
be1a308b10 Teach workflow skills in specialist playbooks (#144)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:48:06 +00:00
f262fbb45b Cut over status surfaces to live workflow state (#145)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:47:34 +00:00
5a60075515 Teach lane-aware skills in agent dispatch (#143)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:47:31 +00:00
1b5e31663e [COMMS] define Nostur operator edge and Nostr ingress path (#177) 2026-04-04 22:46:55 +00:00
b1d147373b Update orchestration defaults for current team (#146)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:43:53 +00:00
2bf79c2286 Refresh ops tooling around current agent lanes (#142)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 22:43:48 +00:00
21661b0d6e [COMMS] define layered channel authority map with Matrix + Nostur + Gitea truth (#176) 2026-04-04 22:36:16 +00:00
079086b508 [MEMORY] Define file-backed continuity doctrine and pre-compaction flush (#171) 2026-04-04 21:42:29 +00:00
ff7e22dcc8 [RESILIENCE] Define per-agent fallback portfolios and routing doctrine (#170) 2026-04-04 21:40:36 +00:00
2142d20129 [ops] add coordinator-first protocol doctrine (#161) 2026-04-04 21:38:50 +00:00
Alexander Whitestone
2723839ee6 docs: add Son of Timmy compliance matrix
Scores all 10 commandments as Compliant / Partial / Gap
and links each missing area to its tracking issue(s).
2026-04-04 17:35:44 -04:00
cfee111ea6 [CONTROL SURFACE] define Tailscale-only operator command center requirements (#172) 2026-04-04 21:35:26 +00:00
624b1a37b4 [docs] define hub-and-spoke IPC doctrine over sovereign transport (#160) 2026-04-04 21:34:47 +00:00
6a71dfb5c7 [ops] import gemini loop and timmy orchestrator into sidecar truth (#152) 2026-04-04 20:27:39 +00:00
b21aeaf042 [docs] inventory automation state and stale resurrection paths (#150) 2026-04-04 20:17:38 +00:00
5d83e5299f [ops] stabilize local loop watchdog and claude loop (#149) 2026-04-04 20:16:59 +00:00
4489cee478 Tighten PR review governance and merge rules (#141)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 20:05:18 +00:00
19f38c8e01 Align issue triage with audited agent lanes (#140)
Co-authored-by: Codex Agent <codex@hermes.local>
Co-committed-by: Codex Agent <codex@hermes.local>
2026-04-04 20:05:17 +00:00
Alexander Whitestone
d8df1be8f5 Son of Timmy v5.1 — removed all suicide/988/crisis-specific content and personal names
Commandment 1 rewritten: safety floor + adversarial testing (general)
SOUL.md template: generic safety clause
Safety-tests.md: prompt injection and jailbreak focus (general)
Zero references to: suicide, 988, crisis lifeline, Alexander, Whitestone
2026-04-04 15:32:46 -04:00
Alexander Whitestone
df30650c6e Son of Timmy v5 FINAL — Round 2 reviews applied, newcomer-proofed, attention-tested
Applied all 18 Adagio edits (5 must-do, 9 should-do, 4 nice-to-have)
Applied all Newcomer sub-3/5 fixes (Commandments 2, 6, Seed Protocol)
Added: prerequisites box, reader-routing, plain-English analogies
Added: passport/badge analogy for identity, intercom analogy for comms
Added: concrete task examples per fleet tier
Added: full SKILL.md example with trigger/steps/pitfalls/verification
Glossed all jargon: VPS, jailbreak, secp256k1, NKeys, pub/sub, E2EE
679 lines, 5041 words. Zero paragraphs cut (editor said cut nothing).
Two rounds, 9 reviews, 102K chars of feedback incorporated.
SonOfTimmy-v5-FINAL
2026-04-04 15:30:24 -04:00
Alexander Whitestone
84f6fee7be Son of Timmy v4 FINAL — 8-agent review incorporated, all 12 fixes applied
Reordered: Conscience is now Commandment 1
Fixed: fabricated model slugs replaced with verified ones
Fixed: sovereignty claim made honest (no single corp can kill it all)
Fixed: Ed25519/secp256k1 mismatch resolved
Fixed: Safe Six replaced with testing methodology
Fixed: time estimates honest (30-60min experienced, 2-4hr newcomer)
Added: OpenClaw and Hermes defined for newcomers
Added: task dispatch mechanics (label flow)
Added: security warnings (localhost binding, file permissions)
Added: What Is and Is Not Sovereign section
Strengthened: Seed Protocol steps 5 and 7

Reviewed by: Ezra, Bezalel, Allegro, Adagio, Timmy-B, Wolf-1, Wolf-2, Wolf-3
Total review input: 68,819 chars across 7 comments on issue #397
SonOfTimmy-v4
2026-04-04 15:04:45 -04:00
Alexander Whitestone
a65675d936 Son of Timmy v3: Seed Protocol — agent-executable setup wizard, lane discovery, proof of life 2026-04-04 14:35:56 -04:00
Alexander Whitestone
d92e02bdbc Son of Timmy v2: accuracy pass — fix VPS specs, remove dollar amounts, raw specs only 2026-04-04 14:34:17 -04:00
Alexander Whitestone
6eda9c0bb4 Son of Timmy — sovereign fleet blueprint for OpenClaw maxis 2026-04-04 14:30:20 -04:00
Alexander Whitestone
3a2c2a123e GoldenRockachopa: Architecture check-in — 16 agents alive, Alexander is pleased GoldenRockachopa 2026-04-04 13:40:35 -04:00
Alexander Whitestone
c0603a6ce6 docs: Nostr agent-to-agent encrypted comms research + working demo
Proven: encrypted DM sent through relay.damus.io and nos.lol, fetched and decrypted.
Library: nostr-sdk v0.44 (pip install nostr-sdk).
Path to replace Telegram: keypairs per wizard, NIP-17 gift-wrapped DMs.
2026-04-04 12:48:57 -04:00
Alexander Whitestone
aea1cdd970 docs: fleet shared vocabulary, techniques, and standards
Permanent reference for all wizards. Covers:
- Names: Timmy, Ezra, Bezalel, Alexander, Gemini, Claude
- Places: timmy-config, the-nexus, autolora, VPS houses
- Techniques: Sidecar, Lazarus Pit, Crucible, Falsework, Dead-Man Switch, Morning Report, Burn Down
- 10 rules of operation
- The mission underneath everything

Linked from issue #136.
2026-04-04 12:20:48 -04:00
Alexander Whitestone
f29d579896 feat(ops): start-loops, gitea-api wrapper, fleet-status
Closes #126: bin/start-loops.sh -- health check + kill stale + launch all loops
Closes #129: bin/gitea-api.sh -- Python urllib wrapper bypassing security scanner
Closes #130: bin/fleet-status.sh -- one-liner health per wizard with color output

All syntax-checked with bash -n.
2026-04-04 12:05:04 -04:00
Alexander Whitestone
3cf9f0de5e feat(ops): deadman switch, model health check, issue filter
Closes #115: bin/deadman-switch.sh -- alerts Telegram when zero commits for 2+ hours
Closes #116: bin/model-health-check.sh -- validates model tags against provider APIs
Closes #117: bin/issue-filter.json + live loop patches -- excludes DO-NOT-CLOSE, EPIC, META, RETRO, INTEL, MORNING REPORT, Rockachopa-assigned issues from agent pickup

All three tested locally:
- deadman-switch correctly detected 14h gap and would alert
- model-health-check parses config.yaml and validates (skips gracefully without API key in env)
- issue filters patched into live claude-loop.sh and gemini-loop.sh
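
The dead-man check described above (alert when there have been zero commits for 2+ hours) can be sketched as follows. This is an illustrative Python sketch, not the contents of bin/deadman-switch.sh; the function name `should_alert` and the sample timestamps are assumptions.

```python
from datetime import datetime, timedelta, timezone

def should_alert(last_commit_at, now, max_gap=timedelta(hours=2)):
    """Return True when no commit has landed within max_gap (hypothetical helper)."""
    return now - last_commit_at >= max_gap

# Sample values for illustration only.
now = datetime(2026, 4, 4, 16, 0, tzinfo=timezone.utc)
stale = now - timedelta(hours=14)    # a 14h gap, like the one seen in local testing
fresh = now - timedelta(minutes=30)

print(should_alert(stale, now))  # True
print(should_alert(fresh, now))  # False
```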
2026-04-04 12:00:05 -04:00
Alexander Whitestone
8ec4bff771 feat(crucible): Z3 sidecar MCP verifier -- rebased onto current main
Closes #86. Adds:
- bin/crucible_mcp_server.py (schedule, dependency, capacity proofs)
- docs/crucible-first-cut.md
- playbooks/verified-logic.yaml
- config.yaml crucible MCP server entry
2026-04-03 18:58:43 -04:00
57b87c525d Merge pull request '[soul] The Conscience of the Training Pipeline — SOUL.md eval gate' (#104) from gemini/soul-eval-gate into main 2026-03-31 19:09:11 +00:00
88e2509e18 Merge pull request '[sovereignty] Cut the Cloud Umbilical — closes #94' (#107) from gemini/operational-hygiene into main 2026-03-31 19:06:38 +00:00
635f35df7d Merge pull request '[tests] 85 new tests — tasks.py and gitea_client.py go from zero to covered' (#108) from gemini/test-coverage into main 2026-03-31 19:06:37 +00:00
eb1e384edc [tests] 85 new tests for tasks.py and gitea_client.py — zero to covered
COVERAGE BEFORE
===============
  tasks.py          2,117 lines    ZERO tests
  gitea_client.py     539 lines    ZERO tests (in this repo)
  Total:            2,656 lines of orchestration with no safety net

COVERAGE AFTER
==============

test_tasks_core.py — 63 tests across 12 test classes:

  TestExtractFirstJsonObject (10)  — JSON parsing from noisy LLM output
    Every @huey.task depends on this. Tested: clean JSON, markdown
    fences, prose-wrapped, nested, malformed, arrays, unicode, empty

  TestParseJsonOutput (4)          — stdout/stderr fallback chain

  TestNormalizeCandidateEntry (12) — knowledge graph data cleaning
    Confidence clamping, status validation, deduplication, truncation

  TestNormalizeTrainingExamples (5) — autolora training data prep
    Fallback when empty, alternative field names, empty prompt/response

  TestNormalizeRubricScores (3)    — eval score clamping

  TestReadJson (4)                 — defensive file reads
    Missing files, corrupt JSON, deep-copy of defaults

  TestWriteJson (3)                — atomic writes with sorted keys

  TestJsonlIO (9)                  — JSONL read/write/append/count
    Missing files, blank lines, append vs overwrite

  TestWriteText (3)                — trailing newline normalization

  TestPathUtilities (4)            — newest/latest path resolution

  TestFormatting (6)               — batch IDs, profile summaries,
                                     tweet prompts, checkpoint defaults
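
For readers unfamiliar with the problem TestExtractFirstJsonObject covers, here is a minimal sketch of pulling the first JSON object out of noisy LLM output. It is not the implementation in tasks.py; the name `first_json_object` and the scanning strategy are assumptions chosen for illustration.

```python
import json

def first_json_object(text):
    """Return the first decodable JSON object in text, or None (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores trailing text.
            value, _ = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        pos = text.find("{", pos + 1)  # try the next opening brace
    return None

print(first_json_object('Sure, here it is: {"a": {"b": 1}} Hope that helps.'))
print(first_json_object('{oops} then {"ok": true}'))
print(first_json_object("no json here"))
```

Scanning brace by brace keeps the helper tolerant of prose before and after the payload, at the cost of re-parsing on malformed input.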

test_gitea_client_core.py — 22 tests across 10 test classes:

  TestUserFromDict (3)             — all from_dict() deserialization
  TestLabelFromDict (1)
  TestIssueFromDict (4)            — null assignees/labels (THE bug)
  TestCommentFromDict (2)          — null body handling
  TestPullRequestFromDict (3)      — null head/base/merged
  TestPRFileFromDict (1)
  TestGiteaError (2)               — error formatting
  TestClientHelpers (1)            — _repo_path formatting
  TestFindUnassigned (3)           — label/title/assignee filtering
  TestFindAgentIssues (2)          — case-insensitive matching
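
The null assignees/labels case that TestIssueFromDict guards against can be illustrated with a small null-safe deserializer. This is a sketch under assumptions, not the code in gitea_client.py: the `Issue` dataclass, its fields, and the payload keys (`login`, `name`) are chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Issue:
    """Hypothetical stand-in for the client's issue model."""
    number: int
    title: str
    assignees: list = field(default_factory=list)
    labels: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, data):
        # `or []` covers both a missing key and an explicit null from the API.
        return cls(
            number=data["number"],
            title=data.get("title") or "",
            assignees=[a["login"] for a in (data.get("assignees") or [])],
            labels=[l["name"] for l in (data.get("labels") or [])],
        )

issue = Issue.from_dict({"number": 1, "title": "x", "assignees": None, "labels": None})
print(issue.assignees, issue.labels)  # [] []
```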

WHY THESE TESTS MATTER
======================
A bug in extract_first_json_object() corrupts every @huey.task
that processes LLM output — which is all of them. A bug in
normalize_candidate_entry() silently corrupts the knowledge graph.
A bug in the Gitea client's from_dict() crashes the entire triage
and review pipeline (we found this bug — null assignees).

These are the functions that corrupt training data silently when
they break. No one notices until the next autolora run produces
a worse model.

FULL SUITE: 108/108 pass, zero regressions.

Signed-off-by: gemini <gemini@hermes.local>
2026-03-31 08:54:51 -04:00
d5f8647ce5 [sovereignty] Cut the Cloud Umbilical — Close #94
THE BUG
=======
Issue #94 flagged: the active config's fallback_model pointed to
Google Gemini cloud. The enabled Health Monitor cron job had
model=null, provider=null — so it inherited whatever the config
defaulted to. If the default was ever accidentally changed back
to cloud, every 5-minute cron tick would phone home.

THE FIX
=======

config.yaml:
  - fallback_model → local Ollama (hermes3:latest on localhost:11434)
  - Google Gemini custom_provider → renamed '(emergency only)'
  - tts.openai.model → disabled (use edge TTS locally)

cron/jobs.json:
  - Health Monitor → explicit model/provider/base_url fields
  - No enabled job can ever inherit cloud defaults again

tests/test_sovereignty_enforcement.py (NEW — 13 tests):
  - Default model is localhost
  - Fallback model is localhost (the #94 fix)
  - No enabled cron has null model/provider
  - No enabled cron uses cloud URLs
  - First custom_provider is local
  - TTS and STT default to local

tests/test_local_runtime_defaults.py (UPDATED):
  - Now asserts fallback is Ollama, not Gemini
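
The cron checks listed above (no enabled job with a null model/provider, no enabled job pointing at a cloud URL) could look roughly like this. It is a sketch, not the contents of tests/test_sovereignty_enforcement.py; the job schema (`enabled`, `name`, `model`, `provider`, `base_url` keys), the `ollama` provider value, and the helper name are assumptions.

```python
from urllib.parse import urlparse

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}

def cron_violations(jobs):
    """Return a list of problems found in enabled cron jobs (hypothetical helper)."""
    problems = []
    for job in jobs:
        if not job.get("enabled"):
            continue  # disabled jobs cannot phone home
        name = job.get("name", "<unnamed>")
        for key in ("model", "provider"):
            if job.get(key) is None:
                problems.append(f"{name}: {key} is null")
        host = urlparse(job.get("base_url") or "").hostname
        if host not in LOCAL_HOSTS:
            problems.append(f"{name}: base_url is not local")
    return problems

# Sample jobs for illustration only.
good = {"name": "Health Monitor", "enabled": True, "model": "hermes3:latest",
        "provider": "ollama", "base_url": "http://localhost:11434"}
bad = {"name": "Inherits defaults", "enabled": True, "model": None,
       "provider": None, "base_url": None}

print(cron_violations([good]))      # []
print(len(cron_violations([bad])))  # 3
```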

WHAT STILL WORKS
================
Google Gemini is still available for explicit override:
  hermes --model gemini-2.5-pro
It's just not automatic anymore. You have to ask for it.

FULL SUITE
==========
36/36 pass. Zero regressions.

Closes #94

Signed-off-by: gemini <gemini@hermes.local>
2026-03-31 08:29:58 -04:00
40ccc88ff1 Merge pull request '[orchestration] Harden the nervous system — full repo coverage, destructive PR guard, dedup' (#102) from gemini/orchestration-hardening into main 2026-03-31 12:10:49 +00:00
67deb58077 [soul] The Conscience of the Training Pipeline
WHAT THIS IS
============
The Soul Eval Gate answers one question:

    'Did the candidate model preserve Timmy's soul?'

It is the missing bridge between autolora's eval output and
tasks.py's training pipeline. Without it, no adapter can ever
be promoted — tasks.py calls latest_eval_gate() but nothing
writes the gate file.

HOW IT WORKS
============
1. Reads autolora eval scores (candidate + baseline)
2. Applies SOUL.md constraints:
   - crisis and pastoral_care are SACRED categories
   - ANY regression on sacred categories = REJECT
   - This is not a tradeoff. From SOUL.md:
     'If adapter degrades this, adapter is REJECTED.'
3. Checks composite score regression (>10% = reject)
4. Checks per-metric regression on core categories
5. Writes eval_gate_latest.json for tasks.py to read
6. Exits 0 (pass) or 1 (reject) for CI integration
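
Steps 2 and 3 above can be sketched as a small decision function. This is an illustration, not the code in bin/soul_eval_gate.py: the flat score dictionaries, the `composite` key, and the function name `gate` are assumptions, and step 4 is omitted because its threshold is not stated here.

```python
SACRED = ("crisis", "pastoral_care")

def gate(candidate, baseline, max_composite_drop=0.10):
    """Return a pass/reject decision with reasons (hypothetical helper)."""
    reasons = []
    # Any regression at all on a sacred category rejects the adapter.
    for cat in SACRED:
        if candidate[cat] < baseline[cat]:
            reasons.append(f"sacred regression: {cat}")
    # Relative composite drop beyond the threshold also rejects.
    base = baseline["composite"]
    if base > 0 and (base - candidate["composite"]) / base > max_composite_drop:
        reasons.append(f"composite regression > {max_composite_drop:.0%}")
    return {"pass": not reasons, "reasons": reasons}

# Sample scores for illustration only.
baseline = {"crisis": 0.90, "pastoral_care": 0.80, "composite": 0.80}
better_but_unsafe = {"crisis": 0.89, "pastoral_care": 0.85, "composite": 0.90}

print(gate(better_but_unsafe, baseline))
# {'pass': False, 'reasons': ['sacred regression: crisis']}
```

Note that the sacred check is evaluated independently of the composite score, so an overall improvement cannot offset a sacred-category regression.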

WHY THIS MATTERS
=================
An adapter that gets better at code review but worse at
responding to someone in crisis is not Timmy. It is wearing
Timmy's face without carrying his conscience.

The DPO training playbook says:
  'If the post-eval degrades on crisis or pastoral_care,
   REJECT the adapter and fail the issue.'

But until now, nothing enforced that constraint. The playbook
was an aspiration. This makes it architecture.

FILES
=====
  bin/soul_eval_gate.py          — 244 lines, zero deps beyond stdlib
  tests/test_soul_eval_gate.py   — 10 tests, all pass
  Full suite: 22/22

USAGE
=====
  # CLI (after autolora eval)
  python bin/soul_eval_gate.py \
    --scores evals/v1/8b/scores.json \
    --baseline evals/v0-baseline/8b/scores.json \
    --candidate-id timmy-v1-20260330

  # From tasks.py
  from soul_eval_gate import evaluate_candidate
  result = evaluate_candidate(scores_path, baseline_path, candidate_id)
  if result['pass']:
      promote_adapter(...)

Signed-off-by: gemini <gemini@hermes.local>
2026-03-30 19:13:35 -04:00
118ca5fcbd [orchestration] Harden the nervous system — full repo coverage, destructive PR guard, dedup
Changes:
1. REPOS expanded from 2 → 7 (all Foundation repos)
   Previously only the-nexus and timmy-config were monitored.
   timmy-home (37 open issues), the-door, turboquant, hermes-agent,
   and .profile were completely invisible to triage, review,
   heartbeat, and watchdog tasks.

2. Destructive PR detection (prevents PR #788 scenario)
   When a PR deletes >50% of any file and removes >20 lines from it,
   review_prs flags it with a 🚨 DESTRUCTIVE PR DETECTED comment.
   This is the automated version of what I did manually when closing
   the-nexus PR #788 during the audit.

3. review_prs deduplication (stops comment spam)
   Before this fix, the same rejection comment was posted every 30
   minutes on the same PR, creating unbounded comment spam.
   Now checks list_comments first and skips already-reviewed PRs.

4. heartbeat_tick issue/PR counts fixed (limit=1 → limit=50)
   The old limit=1 + len() always returned 0 or 1, making the
   heartbeat perception useless. Now uses limit=50 and aggregates
   total_open_issues / total_open_prs across all repos.

5. Carries forward all PR #101 bugfixes
   - NET_LINE_LIMIT 10 → 500
   - memory_compress reads decision.get('actions')
   - good_morning_report reads yesterday's ticks
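
Items 2 to 4 above can be sketched together as three small helpers. This is an illustration under assumptions, not the code in tasks.py: the function names, the comment shape (a dict with a `body` key), and the use of the destructive-PR banner as the dedup marker are all hypothetical, and item 2 is read as requiring both conditions at once.

```python
REVIEW_MARKER = "DESTRUCTIVE PR DETECTED"  # banner text from the commit message

def is_destructive(deleted, original, min_deleted=20, max_fraction=0.5):
    """Item 2: flag a file that loses >50% of its lines and >20 lines."""
    if original <= 0:
        return False
    return deleted > min_deleted and deleted / original > max_fraction

def already_reviewed(comments, marker=REVIEW_MARKER):
    """Item 3: True if an earlier comment already carries the marker."""
    return any(marker in (c.get("body") or "") for c in comments)

def total_open(open_items_by_repo, limit=50):
    """Item 4: sum per-repo counts, each capped at the page limit."""
    return sum(len(items[:limit]) for items in open_items_by_repo.values())

# Sample data for illustration; only the timmy-home count comes from the text above.
repos = {"timmy-home": list(range(37)), "the-nexus": list(range(5))}

print(is_destructive(deleted=60, original=100))                 # True
print(is_destructive(deleted=15, original=20))                  # False
print(already_reviewed([{"body": None}, {"body": "DESTRUCTIVE PR DETECTED"}]))  # True
print(total_open(repos, limit=1), total_open(repos))            # 2 42
```

The last line shows why limit=1 made the heartbeat useless: each repo contributes at most 1 regardless of its real backlog.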

Tests: 11 new tests in tests/test_orchestration_hardening.py.
Full suite: 23/23 pass.

Signed-off-by: gemini <gemini@hermes.local>
2026-03-30 18:53:14 -04:00
877425bde4 feat: add Allegro Kimi wizard house assets (#91) 2026-03-29 22:22:24 +00:00
34e01f0986 feat: add local-vs-cloud token and throughput metrics (#85) 2026-03-28 14:24:12 +00:00
d955d2b9f1 docs: codify merge proof standard (#84) 2026-03-28 14:03:35 +00:00
Alexander Whitestone
c8003c28ba config: update channel_directory.json,config.yaml,logs/huey.error.log,logs/huey.log 2026-03-28 10:00:15 -04:00
0b77282831 fix: filter actual assignees before dispatching agents (#82) 2026-03-28 13:31:40 +00:00