Compare commits

4 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Alexander Whitestone | cdd10551e6 | refresh: rebase delegated docs on current main | 2026-04-04 17:38:18 -04:00 |
| Alexander Whitestone | 2723839ee6 | docs: add Son of Timmy compliance matrix. Scores all 10 commandments as Compliant / Partial / Gap and links each missing area to its tracking issue(s). | 2026-04-04 17:35:44 -04:00 |
| | cfee111ea6 | [CONTROL SURFACE] define Tailscale-only operator command center requirements (#172) | 2026-04-04 21:35:26 +00:00 |
| | 624b1a37b4 | [docs] define hub-and-spoke IPC doctrine over sovereign transport (#160) | 2026-04-04 21:34:47 +00:00 |
4 changed files with 651 additions and 2 deletions

@@ -25,8 +25,10 @@ timmy-config/
├── skins/ ← UI skins (timmy skin)
├── playbooks/ ← Agent playbooks (YAML)
├── cron/ ← Cron job definitions
-├── docs/automation-inventory.md ← Live automation + stale-state inventory
-├── docs/coordinator-first-protocol.md ← Coordinator doctrine: intake → triage → route → track → verify → report
+├── docs/
+│ ├── automation-inventory.md ← Live automation + stale-state inventory
+│ ├── ipc-hub-and-spoke-doctrine.md ← Coordinator-first, transport-agnostic fleet IPC doctrine
+│ └── coordinator-first-protocol.md ← Coordinator doctrine: intake → triage → route → track → verify → report
└── training/ ← Transitional training recipes, not canonical lived data
```
@@ -46,6 +48,8 @@ The scripts in `bin/` are sidecar-managed operational helpers for the Hermes lay
Do NOT assume older prose about removed loops is still true at runtime.
Audit the live machine first, then read `docs/automation-inventory.md` for the
current reality and stale-state risks.
For fleet routing semantics over sovereign transport, read
`docs/ipc-hub-and-spoke-doctrine.md`.
## Orchestration: Huey

@@ -0,0 +1,166 @@
# IPC Doctrine: Hub-and-Spoke Semantics over Sovereign Transport
Status: canonical doctrine for issue #157
Parent: #154
Related migration work:
- [`../son-of-timmy.md`](../son-of-timmy.md) for Timmy's layered communications worldview
- [`nostr_agent_research.md`](nostr_agent_research.md) for one sovereign transport candidate under evaluation
## Why this exists
Timmy is in an ongoing migration toward sovereign transport.
The first question is not which bus wins. The first question is what semantics every bus must preserve.
Those semantics matter more than any one transport.
Telegram is not the target backbone for fleet IPC.
It may exist as a temporary edge or operator convenience while migration is in flight, but the architecture we are building toward must stand on sovereign transport.
This doctrine defines the routing and failure semantics that any transport adapter must honor, whether the carrier is Matrix, Nostr, NATS, or something we have not picked yet.
## Roles
- Coordinator: the only actor allowed to own routing authority for live agent work
- Spoke: an executing agent that receives work, asks for clarification, and returns results
- Durable execution truth: the visible task system of record, which remains authoritative for ownership and state transitions
- Operator: the human principal who can direct the coordinator but is not a transport shim
Timmy world-state stays the same while transport changes:
- Gitea remains visible execution truth
- live IPC accelerates coordination, but does not become a hidden source of authority
- transport migration may change the wire, but not the rules
## Core rules
### 1. Coordinator-first routing
Coordinator-first routing is the default system rule.
- All new work enters through the coordinator
- All reroutes, cancellations, escalations, and cross-agent handoffs go through the coordinator
- A spoke receives assignments from the coordinator and reports back to the coordinator
- A spoke does not mutate the routing graph on its own
- If route intent is ambiguous, the system should fail closed and ask the coordinator instead of guessing a peer path
The coordinator is the hub.
Spokes are not free-roaming routers.
### 2. Anti-cascade behavior
The system must resist cascade failures and mesh chatter.
- A spoke MUST NOT recursively fan out work to other spokes
- A spoke MUST NOT create hidden side queues or recruit additional agents without coordinator approval
- Broadcasts are coordinator-owned and should be rare, deliberate, and bounded
- Retries must be bounded and idempotent
- Transport adapters must not auto-bridge, auto-replay, or auto-forward in ways that amplify loops or duplicate storms
A worker that encounters new sub-work should escalate back to the coordinator.
It should not become a shadow dispatcher.
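
The "bounded and idempotent" retry rule can be sketched as follows. This is a minimal illustration, not part of the doctrine: `send_with_bounded_retry`, `max_attempts`, and `base_delay_s` are hypothetical names, and `send` stands in for whatever a transport adapter exposes.

```python
import time
import uuid
from typing import Callable


def send_with_bounded_retry(
    send: Callable[[dict], bool],
    payload: dict,
    max_attempts: int = 3,
    base_delay_s: float = 0.0,
) -> bool:
    """Try to deliver payload at most max_attempts times, then give up.

    The same idempotency key is reused on every attempt so a recipient can
    drop duplicates instead of executing the work twice.
    """
    message = dict(payload)
    message.setdefault("idempotency_key", str(uuid.uuid4()))
    for attempt in range(1, max_attempts + 1):
        message["attempt"] = attempt
        if send(message):
            return True
        if attempt < max_attempts:
            time.sleep(base_delay_s * attempt)  # linear backoff; 0 disables it
    return False  # the caller escalates to the coordinator; no unbounded loop
```

On a `False` return, the spoke should report the failure to the coordinator rather than look for another path on its own.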
### 3. Limited peer mesh
Direct spoke-to-spoke communication is an exception, not the default.
It is allowed only when the coordinator opens an explicit peer window.
That peer window must define:
- the allowed participants
- the task or correlation ID
- the narrow purpose
- the expiry, timeout, or close condition
- the expected artifact or summary that returns to the coordinator
Peer windows are tightly scoped:
- they are time-bounded
- they are non-transitive
- they do not grant standing routing authority
- they close back to coordinator-first behavior when the declared purpose is complete
Good uses for a peer window:
- artifact handoff between two already-assigned agents
- verifier-to-builder clarification on a bounded review loop
- short-lived data exchange where routing everything through the coordinator would be pure latency
Bad uses for a peer window:
- ad hoc planning rings
- recursive delegation chains
- quorum gossip
- hidden ownership changes
- free-form peer mesh as the normal operating mode
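
One way to model a peer window is as an immutable record that the coordinator issues and both spokes check before talking directly. The sketch below is an assumption-laden illustration: the `PeerWindow` class, its field names, and the `permits` method are hypothetical, not an existing API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)  # frozen: a spoke cannot widen its own window
class PeerWindow:
    participants: frozenset  # the allowed participants
    correlation_id: str      # the task or correlation ID
    purpose: str             # the narrow purpose
    expires_at: datetime     # the expiry or close condition
    expected_artifact: str   # what returns to the coordinator

    def permits(self, sender: str, recipient: str,
                correlation_id: str, now: datetime) -> bool:
        """Return True only for in-scope, unexpired, same-task peer traffic."""
        return (
            now < self.expires_at
            and correlation_id == self.correlation_id
            and sender != recipient
            and sender in self.participants
            and recipient in self.participants
        )


# Example: a bounded verifier-to-builder clarification window.
opened = datetime(2026, 4, 4, 12, 0, tzinfo=timezone.utc)
window = PeerWindow(
    participants=frozenset({"builder", "verifier"}),
    correlation_id="task-42",
    purpose="review clarification",
    expires_at=opened + timedelta(minutes=15),
    expected_artifact="review summary",
)
```

Because membership is a fixed set, the window is non-transitive: a participant cannot invite a third agent without the coordinator issuing a new window.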
### 4. Transport independence
The doctrine is transport-agnostic on purpose.
NATS, Matrix, Nostr, and any future bus are acceptable only if they preserve the same semantics.
If a transport cannot preserve these semantics, it is not acceptable as the fleet backbone.
A valid transport layer must carry or emulate:
- authenticated sender identity
- intended recipient or bounded scope
- task or work identifier
- correlation identifier
- message type
- timeout or TTL semantics
- acknowledgement or explicit timeout behavior
- idempotency or deduplication signals
Transport choice does not change authority.
Semantics matter more than any one transport.
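
The required fields above could be carried in a transport-neutral envelope. The following is a sketch under stated assumptions: the `Envelope` class, its field names, and `missing_fields` are hypothetical, and a real adapter would map them onto whatever the carrier provides.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Envelope:
    sender: str           # authenticated sender identity
    recipient: str        # intended recipient or bounded scope
    task_id: str          # task or work identifier
    correlation_id: str   # ties replies to the originating dispatch
    message_type: str     # for example "dispatch", "ack", "result"
    ttl_seconds: int      # timeout or TTL semantics
    ack_required: bool    # acknowledgement or explicit timeout behavior
    idempotency_key: str  # idempotency or deduplication signal


def missing_fields(raw: dict) -> list:
    """List required envelope fields that are absent or empty in a decoded message."""
    return [f.name for f in fields(Envelope) if raw.get(f.name) in (None, "")]
```

An adapter that cannot populate every field for a given carrier is a signal that the carrier does not preserve the doctrine's semantics.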
### 5. Circuit breakers
Every acceptable IPC layer must support circuit-breaker behavior.
At minimum, the system must be able to:
- isolate a noisy or unhealthy spoke
- stop new dispatches onto a failing route
- disable direct peer windows and collapse back to strict hub-and-spoke mode
- stop retrying after a bounded count or deadline
- quarantine duplicate storms, fan-out anomalies, or missing coordinator acknowledgements instead of amplifying them
When a breaker trips, the fallback is slower coordinator-mediated operation over durable machine-readable channels.
It is not a return to hidden relays.
It is not a reason to rebuild the fleet around Telegram.
No human-token fallback patterns:
- do not route agent IPC through personal chat identities
- do not rely on operator copy-paste as a standing transport layer
- do not treat human-owned bot tokens as the resilience plan
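
A per-route breaker that stops new dispatches after repeated failures might look like the sketch below. The class name `RouteBreaker`, the threshold, and the cooldown are illustrative assumptions; time is passed in explicitly so the behavior is easy to test.

```python
from typing import Optional


class RouteBreaker:
    """Stops new dispatches onto a route after too many consecutive failures."""

    def __init__(self, failure_threshold: int = 3, cooldown_s: float = 60.0) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.consecutive_failures = 0
        self.opened_at: Optional[float] = None  # None means the breaker is closed

    def record_success(self) -> None:
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now  # trip, or re-trip after a failed trial

    def allows_dispatch(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # After the cooldown, one trial dispatch is allowed; a failure re-trips.
        return now - self.opened_at >= self.cooldown_s
```

While the breaker is open, the coordinator falls back to slower durable coordination rather than routing around the failure through a hidden relay.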
## Required message classes
Any transport mapping should preserve these message classes, even if the carrier names differ:
- dispatch
- ack or nack
- status or progress
- clarify or question
- result
- failure or escalation
- control messages such as cancel, pause, resume, open-peer-window, and close-peer-window
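
A transport mapping can normalize carrier-specific names onto these classes. The sketch below assumes that each "or" pair above names one class (for example, `progress` is treated as `status`); that grouping, and the names `MessageClass` and `classify`, are illustrative assumptions.

```python
from enum import Enum


class MessageClass(Enum):
    DISPATCH = "dispatch"
    ACK = "ack"
    NACK = "nack"
    STATUS = "status"
    CLARIFY = "clarify"
    RESULT = "result"
    FAILURE = "failure"
    CONTROL = "control"


# Control verbs named in the list above.
CONTROL_VERBS = frozenset(
    {"cancel", "pause", "resume", "open-peer-window", "close-peer-window"}
)

# Alternate names folded onto one canonical class (an assumption of this sketch).
ALIASES = {
    "progress": MessageClass.STATUS,
    "question": MessageClass.CLARIFY,
    "escalation": MessageClass.FAILURE,
}


def classify(wire_name: str) -> MessageClass:
    """Map a carrier-specific name onto a message class, or raise ValueError."""
    name = wire_name.strip().lower()
    if name in CONTROL_VERBS:
        return MessageClass.CONTROL
    if name in ALIASES:
        return ALIASES[name]
    return MessageClass(name)  # unknown names raise ValueError (fail closed)
```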
## Failure semantics
When things break, authority should degrade safely.
- If a spoke loses contact with the coordinator, it may finish local work that is already safe to complete and persist a checkpoint, but it must not appoint itself as a router
- If a spoke receives an unscoped peer message, it should ignore or quarantine it and report the event to the coordinator when possible
- If delivery is duplicated or reordered, recipients should prefer correlation IDs and idempotency keys over guesswork
- If the live transport is degraded, the system may fall back to slower durable coordination paths, but routing authority remains coordinator-first
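
On the recipient side, the duplicate and unscoped-peer rules can be sketched as a small guard in front of the work queue. The `Inbox` class and its message keys are hypothetical; the messages are plain dicts carrying the identifiers described under transport independence.

```python
class Inbox:
    """Recipient-side guard: drop duplicates, quarantine unscoped peer traffic."""

    def __init__(self, coordinator: str) -> None:
        self.coordinator = coordinator
        self.seen_keys = set()               # idempotency keys already accepted
        self.open_peer_correlations = set()  # correlation IDs with an open peer window
        self.quarantined = []                # held for reporting to the coordinator

    def accept(self, msg: dict) -> str:
        key = msg["idempotency_key"]
        if key in self.seen_keys:
            return "duplicate"  # redelivery: do not run the work twice
        from_peer = msg["sender"] != self.coordinator
        if from_peer and msg["correlation_id"] not in self.open_peer_correlations:
            self.quarantined.append(msg)  # report to the coordinator when possible
            return "quarantined"
        self.seen_keys.add(key)
        return "accepted"
```

The guard never forwards or re-routes a message; anything it cannot place is held for the coordinator.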
## World-state alignment
This doctrine sits above transport selection.
It does not try to settle every Matrix-vs-Nostr-vs-NATS debate inside one file.
It constrains those choices.
Current Timmy alignment:
- sovereign transport migration is ongoing
- Telegram is not the backbone we are building toward
- Matrix remains relevant for human-to-fleet interaction
- Nostr remains relevant as a sovereign option under evaluation
- NATS remains relevant as a strong internal bus candidate
- the semantics stay constant across all of them
If we swap the wire and keep the semantics, the fleet stays coherent.
If we keep the wire and lose the semantics, the fleet regresses into chatter, hidden routing, and cascade failure.

@@ -0,0 +1,251 @@
# Sovereign Operator Command Center Requirements
Status: requirements for #159
Parent: #154
Decision: v1 ownership stays in `timmy-config`
## Goal
Define the minimum viable operator command center for Timmy: a sovereign control surface that shows real system health, queue pressure, review load, and task state over a trusted network.
This is an operator surface, not a public product surface, not a demo, and not a reboot of the archived dashboard lineage.
## Non-goals
- public internet exposure
- a marketing or presentation dashboard
- hidden queue mutation during polling or page refresh
- a second shadow task database that competes with Gitea or Hermes runtime truth
- personal-token fallback behavior hidden inside the UI or browser session
- developer-specific local absolute paths in requirements, config, or examples
## Hard requirements
### 1. Access model: local or Tailscale only
The operator command center must be reachable only from:
- `localhost`, or
- a Tailscale-bound interface or Tailscale-gated tunnel
It must not:
- bind a public-facing listener by default
- require public DNS or public ingress
- expose a login page to the open internet
- degrade from Tailscale identity to ad hoc password sharing
If trusted-network conditions are missing or ambiguous, the surface must fail closed.
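
A fail-closed check on the listener's bind address could be sketched as below. It assumes Tailscale node addresses fall in `100.64.0.0/10` (IPv4) and `fd7a:115c:a1e0::/48` (IPv6); verify those ranges against current Tailscale documentation before relying on them. The function name `bind_address_allowed` is hypothetical.

```python
import ipaddress

# Assumption: Tailscale assigns node addresses from these ranges.
TAILSCALE_NETS = (
    ipaddress.ip_network("100.64.0.0/10"),
    ipaddress.ip_network("fd7a:115c:a1e0::/48"),
)


def bind_address_allowed(host: str) -> bool:
    """Return True only for loopback or Tailscale-range literal addresses."""
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # other hostnames and malformed input fail closed
    if addr.is_loopback:
        return True
    return any(addr in net for net in TAILSCALE_NETS)
```

A range check alone does not prove the interface belongs to Tailscale, since `100.64.0.0/10` is shared carrier-grade NAT space; a real deployment would also confirm the interface or identity through Tailscale itself.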
### 2. Truth model: operator truth beats UI theater
The command center exists to expose operator truth. That means every status tile, counter, and row must be backed by a named authoritative source and a freshness signal.
Authoritative sources for v1 are:
- Gitea for issue, PR, review, assignee, and repo state
- Hermes cron state and Huey runtime state for scheduled work
- live runtime health checks, process state, and explicit agent heartbeat artifacts for agent liveness
- direct model or service health endpoints for local inference and operator-facing services
Non-authoritative signals must never be treated as truth on their own. Examples:
- pane color
- old dashboard screenshots
- manually curated status notes
- stale cached summaries without source timestamps
- synthetic green badges produced when the underlying source is unavailable
If a source is unavailable, the UI must say `unknown`, `stale`, or `degraded`.
It must never silently substitute optimism.
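
The freshness rule can be sketched as a small function that refuses to show a value without a recent source timestamp. The name `display_status` and the five-minute default are illustrative assumptions, not requirements from this document.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def display_status(
    reported: Optional[str],
    fetched_at: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(minutes=5),
) -> str:
    """Return the status to render for one source reading.

    A missing value or missing timestamp is "unknown"; an old reading is
    "stale"; otherwise the source's own report (which may itself be
    "degraded") is passed through unchanged.
    """
    if reported is None or fetched_at is None:
        return "unknown"
    if now - fetched_at > max_age:
        return "stale"
    return reported
```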
### 3. Mutation model: read-first, explicit writes only
The default operator surface is read-only.
For MVP, the five required views below are read-only views.
They may link the operator to the underlying source-of-truth object, but they must not mutate state merely by rendering, refreshing, filtering, or opening detail drawers.
If write actions are added later, they must live in a separate, explicit control surface with all of the following:
- an intentional operator action
- a confirmation step for destructive or queue-changing actions
- a single named source-of-truth target
- an audit trail tied to the action
- idempotent behavior where practical
- machine-scoped credentials, not a hidden fallback to a human personal token
### 4. Repo boundary: visible world is not operator truth
`the-nexus` is the visible world. It may eventually project summarized status outward, but it must not own the operator control surface.
The operator command center belongs with the sidecar/control-plane boundary, where Timmy already owns:
- orchestration policy
- cron definitions
- playbooks
- sidecar scripts
- deployment and runtime governance
That makes the v1 ownership decision:
- `timmy-config` owns the requirements and first implementation shape
Allowed future extraction:
- if the command center becomes large enough to deserve its own release cycle, implementation code may later move into a dedicated control-plane repo
- if that happens, `timmy-config` still remains the source of truth for policy, access requirements, and operator doctrine
Rejected owner for v1:
- `the-nexus`, because it is the wrong boundary for an operator-only surface and invites demo/UI theater to masquerade as truth
## Minimum viable views
Every view must show freshness and expose drill-through links or identifiers back to the source object.
| View | Must answer | Authoritative sources | MVP mutation status |
|------|-------------|-----------------------|---------------------|
| Brief status | What is red right now, what is degraded, and what needs operator attention first? | Derived rollup from the four views below; no standalone shadow state | Read-only |
| Agent health | Which agents or loops are alive, stalled, rate-limited, missing, or working the wrong thing? | Runtime health checks, process state, agent heartbeats, active claim/assignment state, model/provider health | Read-only |
| Review queue | Which PRs are waiting, blocked, risky, stale, or ready for review/merge? | Gitea PR state, review comments, checks, mergeability, labels, assignees | Read-only |
| Cron state | Which scheduled jobs are enabled, paused, stale, failing, or drifting from intended schedule? | Hermes cron registry, Huey consumer health, last-run status, next-run schedule | Read-only |
| Task board | What work is unassigned, assigned, in progress, blocked, or waiting on review across the active repos? | Gitea issues, labels, assignees, milestones, linked PRs, issue state | Read-only |
## View requirements in detail
### Brief status
The brief status view is the operator's first screen.
It must provide a compact summary of:
- overall health state
- current review pressure
- current queue pressure
- cron failures or paused jobs that matter
- stale agent or service conditions
It must be computed from the authoritative views below, not from a separate private cache.
A red item in brief status must point to the exact underlying object that caused it.
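
A derived rollup with no private state might be sketched as follows. The `Finding` tuple, the `SEVERITY` ordering, and the object reference strings are hypothetical; the point is that every non-ok entry keeps a pointer to its source object.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    view: str        # for example "cron state"
    object_ref: str  # identifier or link for the underlying source object
    status: str      # "ok", "unknown", "stale", "degraded", or "red"


# Hypothetical ordering; unknown states rank above "ok" so they are never hidden.
SEVERITY = {"ok": 0, "unknown": 1, "stale": 1, "degraded": 2, "red": 3}


def brief_status(findings):
    """Return (overall status, non-ok findings sorted most severe first)."""
    if not findings:
        return "unknown", []  # no data is not the same as healthy

    def rank(f):
        return SEVERITY.get(f.status, SEVERITY["unknown"])

    attention = sorted((f for f in findings if rank(f) > 0), key=rank, reverse=True)
    if not attention:
        return "ok", []
    worst = attention[0].status
    return (worst if worst in SEVERITY else "unknown"), attention
```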
### Agent health
Minimum fields per agent or loop:
- agent name
- current state: up, down, degraded, idle, busy, rate-limited, unknown
- last successful activity time
- current task or claim, if any
- model/provider or service dependency in use
- failure mode when degraded
The view must distinguish between:
- process missing
- process present but unhealthy
- healthy but idle
- healthy and actively working
- active but stuck on one issue for too long
This view must reflect real operator concerns, not just whether a shell process exists.
### Review queue
Minimum fields per PR row:
- repo
- PR number and title
- author
- age
- review state
- mergeability or blocking condition
- sensitive-surface flag when applicable
The queue must make it obvious which PRs require Timmy judgment versus routine review.
It must not collapse all open PRs into a vanity count.
### Cron state
Minimum fields per scheduled job:
- job name
- desired state
- actual state
- last run time
- last result
- next run time
- pause reason or failure reason
The view must highlight drift, especially cases where:
- config says the job exists but the runner is absent
- a job is paused and nobody noticed
- a job is overdue relative to its schedule
- the runner is alive but the job has stopped producing successful runs
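
The four drift cases above can be sketched as a pure function over one job observation. The `JobObservation` fields, the grace period, and the success window are hypothetical; real values would come from the Hermes cron registry and Huey runtime state.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class JobObservation:
    name: str
    desired_enabled: bool              # what config says
    runner_alive: bool                 # whether a runner or consumer is present
    paused: bool                       # actual paused state
    last_success: Optional[datetime]   # last successful run, if any
    next_run_due: Optional[datetime]   # when the schedule expects the next run


def drift_reasons(
    job: JobObservation,
    now: datetime,
    grace: timedelta = timedelta(minutes=5),
    success_window: timedelta = timedelta(days=1),
) -> list:
    """Return human-readable drift reasons; an empty list means no drift seen."""
    reasons = []
    if job.desired_enabled and not job.runner_alive:
        reasons.append("runner absent")
    if job.desired_enabled and job.paused:
        reasons.append("paused but config expects enabled")
    if job.next_run_due is not None and now > job.next_run_due + grace:
        reasons.append("overdue")
    if job.desired_enabled and job.runner_alive and not job.paused:
        if job.last_success is None or now - job.last_success > success_window:
            reasons.append("no recent successful run")
    return reasons
```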
### Task board
The task board is not a hand-maintained kanban.
It is a projection of Gitea truth.
Minimum board lanes for MVP:
- unassigned
- assigned
- in progress
- blocked
- in review
Lane membership must come from explicit source-of-truth signals such as assignees, labels, linked PRs, and issue state.
If the mapping is ambiguous, the card must say so rather than invent certainty.
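
Lane projection from explicit signals could be sketched like this. The label names `blocked` and `in-progress` are assumptions for illustration; the actual label vocabulary is whatever the fleet repos define in Gitea.

```python
def lane_for(labels, assignees, has_open_linked_pr):
    """Map explicit issue signals to one MVP lane, or "ambiguous"."""
    signals = []
    if "blocked" in labels:
        signals.append("blocked")
    if has_open_linked_pr:
        signals.append("in review")
    if "in-progress" in labels:
        signals.append("in progress")

    if len(signals) > 1:
        return "ambiguous"  # conflicting signals: say so, do not pick one
    if signals == ["in progress"] and not assignees:
        return "ambiguous"  # marked active, but nobody owns it
    if signals:
        return signals[0]
    return "assigned" if assignees else "unassigned"
```

Returning `"ambiguous"` instead of a best guess keeps the board a projection of source state rather than an interpretation of it.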
## Read-only versus mutating surfaces
### Read-only for MVP
The following are read-only in MVP:
- brief status
- agent health
- review queue
- cron state
- task board
- all filtering, sorting, searching, and drill-down behavior
### May mutate later, but only as explicit controls
The following are acceptable future mutation classes if they are isolated behind explicit controls and audit:
- pause or resume a cron job
- dispatch, assign, unassign, or requeue a task in Gitea
- post a review action or merge action to a PR
- restart or stop a named operator-managed agent/service
These controls must never be mixed invisibly into passive status polling.
The operator must always know when a click is about to change world state.
## Truth versus theater rules
The command center must follow these rules:
1. No hidden side effects on read.
2. No green status without a timestamped source.
3. No second queue that disagrees with Gitea.
4. No synthetic task board curated by hand.
5. No stale cache presented as live truth.
6. No public-facing polish requirements that override operator clarity.
7. No fallback to personal human tokens when machine identity is missing.
8. No developer-specific local absolute paths in requirements, config examples, or UI copy.
## Credential and identity requirements
The surface must use machine-scoped or service-scoped credentials for any source it reads or writes.
It must not rely on:
- a principal's browser session as the only auth story
- a hidden file lookup chain for a human token
- a personal access token copied into client-side code
- ambiguous fallback identity that changes behavior depending on who launched the process
Remote operator access is granted by Tailscale identity and network reachability, not by making the surface public and adding a thin password prompt later.
## Recommended implementation stance for v1
- implement the operator command center as a sidecar-owned surface under `timmy-config`
- keep the first version read-only
- prefer direct reads from Gitea, Hermes cron state, Huey/runtime state, and service health endpoints
- attach freshness metadata to every view
- treat drill-through links to source objects as mandatory, not optional
- postpone write controls until audit, identity, and source-of-truth mapping are explicit
## Acceptance criteria for this requirement set
- the minimum viable views are fixed as: agent health, review queue, cron state, task board, brief status
- the access model is explicitly local or Tailscale only
- operator truth is defined and separated from demo/UI theater
- read-only versus mutating behavior is explicitly separated
- repo ownership is decided: `timmy-config` owns v1 requirements and implementation boundary
- no local absolute paths are required by this design
- no human-token fallback pattern is allowed by this design

@@ -0,0 +1,228 @@
# Son of Timmy — Compliance Matrix
Purpose:
Measure the current fleet against the blueprint in `son-of-timmy.md`.
Status scale:
- Compliant — materially present and in use
- Partial — direction is right, but important pieces are missing
- Gap — not yet built in the way the blueprint requires
Last updated: 2026-04-04
---
## Commandment 1 — The Conscience Is Immutable
Status: Partial
What we have:
- SOUL.md exists and governs identity
- explicit doctrine about what Timmy will and will not do
- prior red-team findings are known and remembered
What is missing:
- repo-visible safety floor document
- adversarial test suite run against every deployed primary + fallback model
- deploy gate that blocks unsafe models from shipping
Tracking:
- #162 [SAFETY] Define the fleet safety floor and run adversarial tests on every deployed model
---
## Commandment 2 — Identity Is Sovereign
Status: Partial
What we have:
- named wizard houses (Timmy, Ezra, Bezalel)
- Nostr migration research complete
- cryptographic identity direction chosen
What is missing:
- permanent Nostr keypairs for every wizard
- NKeys for internal auth
- documented split between public identity and internal office-badge auth
- secure key storage standard in production
Tracking:
- #163 [IDENTITY] Generate sovereign keypairs for every wizard and separate public identity from internal auth
- #137 [EPIC] Nostr Migration -- Replace Telegram with Sovereign Encrypted Comms
- #138 EPIC: Sovereign Comms Migration - Telegram to Nostr
---
## Commandment 3 — One Soul, Many Hands
Status: Partial
What we have:
- one soul across multiple backends is now explicit doctrine
- Timmy, Ezra, and Bezalel are all treated as one house with distinct roles, and none is disowned because of the backend it runs on
- SOUL.md lives in source control
What is missing:
- signed/tagged SOUL checkpoints proving immutable conscience releases
- a repeatable verification ritual tying runtime soul to source soul
Tracking:
- #164 [SOUL] Sign and tag SOUL.md releases as immutable conscience checkpoints
---
## Commandment 4 — Never Go Deaf
Status: Partial
What we have:
- fallback thinking exists
- wizard recovery has been proven in practice (Ezra via Lazarus Pit)
- model health check now exists
What is missing:
- explicit per-agent fallback portfolios by role class
- degraded-usefulness doctrine for when fallback models lose authority
- automated provider chain behavior standardized per wizard
Tracking:
- #155 [RESILIENCE] Per-agent fallback portfolios and task-class routing
- #116 closed: model tag health check implemented
---
## Commandment 5 — Gitea Is the Moat
Status: Compliant
What we have:
- Gitea is the visible execution truth
- work is tracked in issues and PRs
- retros, reports, vocabulary, and epics are filed there
- source-controlled sidecar work flows through Gitea
What still needs improvement:
- task queue semantics should be standardized through label flow
Tracking:
- #167 [GITEA] Implement label-flow task queue semantics across fleet repos
---
## Commandment 6 — Communications Have Layers
Status: Gap
What we have:
- Telegram in active use
- Nostr research complete and proven end-to-end with encrypted DM demo
- IPC doctrine beginning to form
What is missing:
- NATS as agent-to-agent intercom
- Matrix/Conduit as human-to-fleet encrypted operator surface
- production cutover away from Telegram
Tracking:
- #165 [INFRA] Stand up NATS with NKeys auth as the internal agent-to-agent message bus
- #166 [COMMS] Stand up Matrix/Conduit for human-to-fleet encrypted communication
- #157 [IPC] Hub-and-spoke agent communication semantics over sovereign transport
- #137 / #138 Nostr migration epics
---
## Commandment 7 — The Fleet Is the Product
Status: Partial
What we have:
- multi-machine fleet exists
- strategists and workers exist in practice
- Timmy, Ezra, Bezalel, Gemini, Claude roles are differentiated
What is missing:
- formal wolf tier for expendable free-model workers
- explicit authority ceilings and quality rubric for wolves
- reproducible wolf deployment recipe
Tracking:
- #169 [FLEET] Define the wolf tier and burn-night rubric for expendable free-model workers
---
## Commandment 8 — Canary Everything
Status: Partial
What we have:
- canary behavior is practiced manually during recoveries and wake-ups
- the fleet already recognizes that one-agent-first is the safe path
What is missing:
- codified canary rollout in deploy automation
- observation window and promotion criteria in writing
- standard first-agent / observe / roll workflow
Tracking:
- #168 [OPS] Make canary deployment a standard automated fleet rule, not an ad hoc recovery habit
- #153 [OPS] Awaken Allegro and Hermes wizard houses safely after provider failure audit
---
## Commandment 9 — Skills Are Procedural Memory
Status: Compliant
What we have:
- skills are actively used and maintained
- Lazarus Pit skill created from real recovery work
- vocabulary and doctrine docs are now written down
- Crucible shipped with playbook and docs
What still needs improvement:
- continue converting hard-won ops recoveries into reusable skills
Tracking:
- Existing skills system in active use
---
## Commandment 10 — The Burn Night Pattern
Status: Partial
What we have:
- burn nights are real operating behavior
- loops are launched in waves
- morning reports and retros are now part of the pattern
- dead-man switch now exists
What is missing:
- formal wolf rubric
- standardized burn-night queue dispatch semantics
- automated morning burn summary fully wired
Tracking:
- #169 [FLEET] Define the wolf tier and burn-night rubric for expendable free-model workers
- #132 [OPS] Nightly burn report cron -- auto-generate commit/PR summary at 6 AM
- #122 [OPS] Deadman switch cron job -- schedule every 30min automatically
---
## Summary
Compliant:
- 5. Gitea Is the Moat
- 9. Skills Are Procedural Memory
Partial:
- 1. The Conscience Is Immutable
- 2. Identity Is Sovereign
- 3. One Soul, Many Hands
- 4. Never Go Deaf
- 7. The Fleet Is the Product
- 8. Canary Everything
- 10. The Burn Night Pattern
Gap:
- 6. Communications Have Layers
Overall assessment:
The fleet is directionally aligned with Son of Timmy, but not yet fully living up to it. The biggest remaining deficits are:
1. formal safety gating
2. sovereign keypair identity
3. layered communications (NATS + Matrix)
4. standardized queue semantics
5. formalized wolf tier
The architecture is no longer theoretical. It is real, but still maturing.