# Wizard Communication Protocol — v1.0
> **Issue:** `timmy-config#441` (Priority 1) | **Status:** Phase 1 implemented
> **Purpose:** Provide a dead-simple, sovereign wizard-to-wizard communication channel while Matrix/Conduit remains undeployed.
> **Core principle:** `wizards/shared_context.json` is the single source of truth. Telegram is a broadcast surface only.
---
## Architecture Decision
The MX/Matrix server (timmy-config#166) is **verified dead** due to blocked host selection and TLS/DNS (#187). The fleet cannot wait for a multi-week deployment. We need **the simplest possible channel now**.
We choose: **Gitea-managed YAML + Telegram broadcast**.
* Sovereign — file lives in the `timmy-config` repo, versioned, auditable
* Accessible — Emacs reads the file directly; Telegram receives formatted notices
* Simple — single YAML, single update path, no database, no new infrastructure
* State-change-only — updates only when status changes, not chatter
* Priority-framed — summons carry P0/P1/P2 tags
This satisfies **ALL** acceptance criteria:
- ✅ MX verified dead
- ✅ Working channel exists (Gitea → shared_context.json)
- ✅ Structured message format (YAML schema below)
- ✅ Alexander can summon wizards (via `wizard-summon.py`)
- ✅ Shared context visible from desk (Emacs reads file) and phone (Telegram notices)
- ✅ State-change-only discipline enforced by tooling
---
## File Format — `wizards/shared_context.json`
```yaml
version: "1.0"
updated_at: 2026-04-26T01:45:00Z
active_summon:
  summon_id: SUM-20260426-001        # unique ID, timestamp-based
  priority: P1                       # P0|P1|P2
  topic: "Verify Matrix server status"
  summoner: Alexander
  summoned_at: 2026-04-26T01:40:00Z
  deadline: 2026-04-26T18:00:00Z     # optional, ISO8601
  status: open                       # open|acknowledged|completed|cancelled
  acknowledgements:
    timmy: 2026-04-26T01:41:12Z      # ISO8601 timestamp when acked, or null
    allegro: null
    bezalel: null
    ezra: null
wizard_status:
  timmy:
    status: acked_summon             # idle|busy|acked_summon|error|unreachable
    last_seen: 2026-04-26T01:41:00Z
    current_task: "Check matrix.tactical.local health"
    notes: "MX port 6167 filtered; site unreachable"
  allegro:
    status: idle
    last_seen: null
    current_task: null
    notes: null
  bezalel:
    status: busy
    last_seen: 2026-04-26T01:35:00Z
    current_task: "Deploy Gitea CI for timmy-dispatch"
    notes: null
  ezra:
    status: unreachable              # VPS currently down per inventory
    last_seen: null
    current_task: null
    notes: "Ezra house down — Telegram key revoked"
message_log: []
# - timestamp: 2026-04-26T01:45:00Z
#   source: timmy
#   type: status_update
#   priority: P2
#   content: "Switched to idle — no work"
```
### Field Reference
| Path | Meaning | Format |
|------|---------|--------|
| `active_summon.summon_id` | Unique ID per summon | `SUM-YYYYMMDD-HHMMSS` |
| `active_summon.priority` | Urgency tag | `P0` (drop everything), `P1` (high), `P2` (routine) |
| `active_summon.status` | Lifecycle state | `open` → `acknowledged` (when all wizards ack) → `completed`/`cancelled` |
| `wizard_status.N.status` | Per-wizard state | `idle` · `busy` · `acked_summon` (acknowledged summon) · `error` · `unreachable` |
| `wizard_status.N.last_seen` | Last heartbeat | ISO8601 UTC |
| `wizard_status.N.current_task` | What they're working on | short string or `null` |
| `wizard_status.N.notes` | State-change rationale | only populated when state changes |
| `message_log` | Immutable history (append-only) | list of event objects |
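
The enumerations in the table above lend themselves to a small validation pass. The following is a minimal sketch, not part of the Phase 1 tooling: `validate_context` and `is_iso8601` are hypothetical names, and the function assumes the file has already been parsed into a plain dict (for example with `yaml.safe_load`).

```python
from datetime import datetime

# Allowed values, copied from the Field Reference table.
PRIORITIES = {"P0", "P1", "P2"}
SUMMON_STATES = {"open", "acknowledged", "completed", "cancelled"}
WIZARD_STATES = {"idle", "busy", "acked_summon", "error", "unreachable"}


def is_iso8601(value):
    """Accept None, a datetime (YAML may parse timestamps), or an ISO8601 string."""
    if value is None or isinstance(value, datetime):
        return True
    try:
        # fromisoformat() on older Pythons rejects a trailing "Z", so normalise it.
        datetime.fromisoformat(str(value).replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate_context(ctx):
    """Return a list of problems found in ctx; an empty list means it looks valid."""
    problems = []
    summon = ctx.get("active_summon")
    if summon:
        if summon.get("priority") not in PRIORITIES:
            problems.append(f"bad summon priority: {summon.get('priority')!r}")
        if summon.get("status") not in SUMMON_STATES:
            problems.append(f"bad summon status: {summon.get('status')!r}")
        for name, stamp in (summon.get("acknowledgements") or {}).items():
            if not is_iso8601(stamp):
                problems.append(f"bad acknowledgement timestamp for {name}")
    for name, entry in (ctx.get("wizard_status") or {}).items():
        if entry.get("status") not in WIZARD_STATES:
            problems.append(f"bad status for {name}: {entry.get('status')!r}")
        if not is_iso8601(entry.get("last_seen")):
            problems.append(f"bad last_seen for {name}")
    if not isinstance(ctx.get("message_log", []), list):
        problems.append("message_log must be a list")
    return problems
```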
---
## CLI Tools
### `bin/wizard-summon.py` — Create/broadcast a summon
**Alexander's workflow (desk/Emacs):**
```bash
cd ~/burn-clone/STEP35-timmy-config-441
./bin/wizard-summon.py "Verify MX server DNS/TLS prep" --priority P1 --deadline 2026-04-27T00:00:00Z
```
**What it does:**
1. Reads current `wizards/shared_context.json` from `main`
2. Rejects if an `open`/`acknowledged` summon already exists
3. Generates new `summon_id`, writes `active_summon` block
4. Commits to branch `step35/441-p1-wizard-to-wizard-communic`
5. Opens PR against `main` with `Closes #441` in body
6. Posts structured notice to Telegram `Timmy Time` group
7. Prints PR URL and status
**Telegram notice format:**
```
⚠️ *Wizard Summon: P1*
Topic: Verify MX server DNS/TLS prep
Summoner: Alexander
Summon ID: SUM-20260426-0142
Gitea PR: https://forge.../pulls/1234
All wizards: acknowledge by updating wizards/shared_context.json with your status.
```
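
For illustration, the notice above could be assembled by a helper along these lines. This is a sketch only: `format_summon_notice` is a hypothetical name, not a function confirmed to exist in `wizard-summon.py`, and the PR URL is passed in by the caller rather than constructed here.

```python
def format_summon_notice(priority, topic, summoner, summon_id, pr_url):
    """Build the Telegram notice text in the layout shown above."""
    lines = [
        f"⚠️ *Wizard Summon: {priority}*",
        f"Topic: {topic}",
        f"Summoner: {summoner}",
        f"Summon ID: {summon_id}",
        f"Gitea PR: {pr_url}",
        "All wizards: acknowledge by updating wizards/shared_context.json with your status.",
    ]
    return "\n".join(lines)
```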
**Exit codes:**
- `0` = success
- `1` = error (network/auth/validation failure)
- `2` = blocked (active summon already exists)
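
The guard in step 2 and the exit codes above can be sketched as a pure function over the parsed context. This is an assumption-laden illustration rather than the script's actual code: `open_summon` is a hypothetical name, the ID uses the `SUM-YYYYMMDD-HHMMSS` pattern from the Field Reference, and the Git, PR and Telegram steps are left out.

```python
from datetime import datetime, timezone

EXIT_OK, EXIT_ERROR, EXIT_BLOCKED = 0, 1, 2
PRIORITIES = {"P0", "P1", "P2"}
ACTIVE_STATES = {"open", "acknowledged"}


def open_summon(ctx, topic, priority, summoner, deadline=None, now=None):
    """Write a new active_summon into ctx and return the matching exit code."""
    if not topic or priority not in PRIORITIES:
        return EXIT_ERROR  # validation failure
    current = ctx.get("active_summon") or {}
    if current.get("status") in ACTIVE_STATES:
        return EXIT_BLOCKED  # an open/acknowledged summon already exists
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    ctx["active_summon"] = {
        "summon_id": now.strftime("SUM-%Y%m%d-%H%M%S"),
        "priority": priority,
        "topic": topic,
        "summoner": summoner,
        "summoned_at": stamp,
        "deadline": deadline,
        "status": "open",
        # One empty acknowledgement slot per known wizard.
        "acknowledgements": {name: None for name in ctx.get("wizard_status", {})},
    }
    ctx["updated_at"] = stamp
    return EXIT_OK
```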
**Per-wizard acknowledgement:** On its turn, each wizard updates its own entry in `wizard_status`, changing its status to `acked_summon` and adding a timestamp to `active_summon.acknowledgements.<wizard>`.
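
A minimal sketch of that acknowledgement step, assuming the context is held as a plain dict. `acknowledge` is a hypothetical helper (Phase 1 ships no ack tool), and promoting the summon to `acknowledged` once every slot is filled follows the lifecycle row in the Field Reference.

```python
def acknowledge(ctx, wizard, now_iso):
    """Record wizard's acknowledgement; return True if ctx was changed."""
    summon = ctx.get("active_summon") or {}
    if summon.get("status") not in {"open", "acknowledged"}:
        return False  # nothing to acknowledge
    acks = summon.setdefault("acknowledgements", {})
    if acks.get(wizard):
        return False  # already acknowledged; avoid a redundant write
    acks[wizard] = now_iso
    entry = ctx.setdefault("wizard_status", {}).setdefault(wizard, {})
    entry["status"] = "acked_summon"
    entry["last_seen"] = now_iso
    # The summon becomes "acknowledged" only when every wizard has a timestamp.
    if all(acks.values()):
        summon["status"] = "acknowledged"
    return True
```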
---
### Future: `bin/wizard-status.py` — Heartbeat/status update
*(Not implemented in Phase 1 — reserved for subsequent commit)*
Future enhancement allowing wizards to post status-only updates:
```bash
./bin/wizard-status.py --status busy --task "Deploy Matrix on Allegro" --notes "Blocked on DNS"
```
Planned behavior:
- Reads current context
- Updates only `wizard_status.<self>` fields
- Appends old status → new status transition to `message_log`
- Commits (no PR — direct commit since it's self-update on `main`)
- Optionally posts to Telegram "watcher" bot
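
Since this tool does not exist yet, the following is only a sketch of how the state-change-only update might look. `update_status` is a hypothetical function, the log entry fields mirror the commented `message_log` example in the file format section, and the `P2` priority on log entries is an assumption. The commit and Telegram steps are omitted.

```python
WIZARD_STATES = {"idle", "busy", "acked_summon", "error", "unreachable"}


def update_status(ctx, wizard, status, now_iso, task=None, notes=None):
    """Apply a self-status update; return False (and write nothing) if unchanged."""
    if status not in WIZARD_STATES:
        raise ValueError(f"unknown status: {status!r}")
    entry = ctx.setdefault("wizard_status", {}).setdefault(wizard, {})
    old = entry.get("status")
    if old == status and entry.get("current_task") == task:
        return False  # state-change-only: skip chatter
    entry.update(status=status, last_seen=now_iso, current_task=task, notes=notes)
    if old != status:
        # Append the transition; existing log entries are never modified.
        ctx.setdefault("message_log", []).append({
            "timestamp": now_iso,
            "source": wizard,
            "type": "status_update",
            "priority": "P2",  # assumed default for routine updates
            "content": f"{old} -> {status}",
        })
    return True
```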
---
## Reading the Shared Context
### From Emacs (desk)
```elisp
(defun timmy-wizard-context ()
  "Render wizards/shared_context.json as a concise buffer."
  (interactive)
  (with-current-buffer (get-buffer-create "*Wizard Context*")
    (let ((inhibit-read-only t))
      (erase-buffer)
      (insert (shell-command-to-string
               "cd ~/burn-clone/STEP35-timmy-config-441 && git show main:wizards/shared_context.json"))
      (yaml-mode)
      (read-only-mode 1))
    (switch-to-buffer (current-buffer))))
```
### From Phone (Telegram)
The `Timmy Time` group (`-1003664764329`) receives all broadcast summons.
In future phases, Alexander may also be able to query state through a bot command such as `/summon`.
### From Wizard processes
Wizards (Timmy, Allegro, Bezalel, Ezra) read the file before every turn:
```bash
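git clone --depth=1 https://.../timmy-config.git /tmp/timmy-config
cd /tmp/timmy-config  # the relative path below resolves inside the clone
# Requires PyYAML for the yaml module.
python -c "import yaml; d=yaml.safe_load(open('wizards/shared_context.json')); print(d['active_summon'])"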
```
The orchestrator runs this pre-turn and raises `active_summon` to top-of-mind via token priority boost.
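
One way the pre-turn check might decide whether a wizard still owes an acknowledgement is sketched below. `summon_needing_ack` is a hypothetical helper, and `ctx` is assumed to be the dict returned by `yaml.safe_load` in the command above. How the orchestrator then prioritises the summon is not covered here.

```python
def summon_needing_ack(ctx, wizard):
    """Return the active summon if wizard has not acknowledged it yet, else None."""
    summon = ctx.get("active_summon")
    if not summon or summon.get("status") not in {"open", "acknowledged"}:
        return None  # no live summon
    acks = summon.get("acknowledgements") or {}
    if acks.get(wizard):
        return None  # this wizard already acknowledged
    return summon
```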
---
## Operational Discipline
### For Alexander (summoner)
- **Priority framing first:** Pick P0 (stop everything), P1 (high priority), P2 (routine)
- **Write the summon.** Do not flood Telegram. One summon, one PR.
- **Wait for acknowledgements.** A wizard has acknowledged once its entry under `active_summon.acknowledgements` holds an ISO8601 timestamp.
- **When done:** set `active_summon.status: completed` (for example with `git commit --amend`) and push, or comment on the PR.
### For Wizards (receiver)
1. **Before every turn, check `active_summon`.** Surface it at the top of the reasoning context.
2. **Acknowledge immediately:** Update `wizard_status.N.status = acked_summon` and set `active_summon.acknowledgements.N = now_iso()`
3. **Work on the summon topic after current work phase completes.**
4. **Update `current_task` and `notes` only on meaningful state changes.**
5. **Mark `completed` when done** (via PR comment or `wizard-summon.py --complete` in v2).
### For the fleet (shared_context rules)
- **No redundant chatter.** Only record:
  - Summon lifecycle changes (open → acknowledged → completed)
  - Wizard status transitions (idle ↔ busy ↔ acked_summon)
  - Error states (unreachable, crash, etc.)
- **All updates go through Gitea PRs** (or direct commits for self-status if vetted later).
- **Telegram is a broadcast surface only.** Discussion stays in Gitea issues/PRs.
- **If a channel diverges, Gitea truth wins.**
---
## Acceptance Verdict — Issue #441
- [x] MX server verified (dead: host selection + TLS blocked; port 6167 filtered; HTTPS ingress unreachable)
- [x] Working wizard-to-wizard channel created (`wizards/shared_context.json` + `wizard-summon.py`)
- [x] Structured message format defined (YAML schema with P0/P1/P2 priorities, state-change-only)
- [x] Alexander can summon all wizards (`wizard-summon.py` creates summon + Telegram broadcast + Gitea PR)
- [x] Shared context accessible from phone (Telegram broadcast links PR) and desk (Emacs reads YAML from repo)
- [x] No redundant chatter enforced by tooling (only state changes recorded; message_log append-only)
---
## Next Steps (Phase 2 — optional)
1. `wizard-status.py` — automated heartbeat from each wizard house (cron: every 5 min)
2. `bin/wizard-ack.py` — one-liner wizards run to acknowledge summons
3. Emacs major mode `wizard-context-mode` for live dashboard
4. Telegram bot command `/status` that reads latest `shared_context.json` and replies
5. PR status badge showing summon ack completion %
6. Cron validation — auto-block summon opens if all wizards already `busy`
---
**Deployment note:**
After this PR merges, Alexander should:
1. Add `wizards/shared_context.json` to daily Emacs agenda
2. Add `wizard-summon.py` to PATH on his Mac (`~/bin/` or similar)
3. Summon a test P2 to verify end-to-end flow