# Telegram → Matrix Cutover Plan
> **Issue**: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166) — Stand up Matrix/Conduit for human-to-fleet encrypted communication
> **Scaffold**: [#183](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/183)
> **Created**: Ezra, Archivist | Date: 2026-04-05
> **Purpose**: Zero-downtime migration from Telegram to Matrix as the sovereign human-to-fleet command surface.
---
## Principle
**Parallel operation first, cutover second.** Telegram does not go away until every agent confirms Matrix connectivity and Alexander has sent at least one encrypted message from Element.

---
## Phase 0: Pre-Conditions (All Must Be True)
| # | Condition | Verification Command |
|---|-----------|---------------------|
| 1 | Conduit deployed and healthy | `curl https://<domain>/_matrix/client/versions` |
| 2 | Fleet rooms created | `python3 infra/matrix/scripts/bootstrap-fleet-rooms.py --dry-run` |
| 3 | Alexander has Element client installed | Visual confirmation |
| 4 | At least 3 agents have Matrix accounts | `@agentname:<domain>` exists |
| 5 | Hermes Matrix gateway configured | `hermes gateway` shows Matrix platform |
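
Condition 1 can also be checked from a script. The sketch below assumes the homeserver answers `GET /_matrix/client/versions` with a JSON object holding a non-empty `versions` list, which is the shape the Matrix client-server API defines. The function names `versions_ok` and `check_homeserver` are illustrative and not part of this repo; call `check_homeserver("https://<domain>")` with the real homeserver URL.

```python
import json
import urllib.request


def versions_ok(payload) -> bool:
    """True if payload has the shape of a /_matrix/client/versions reply."""
    if not isinstance(payload, dict):
        return False
    versions = payload.get("versions")
    return (
        isinstance(versions, list)
        and len(versions) > 0
        and all(isinstance(v, str) for v in versions)
    )


def check_homeserver(base_url: str, timeout: float = 10.0) -> bool:
    """Fetch the versions endpoint and validate the response shape."""
    url = base_url.rstrip("/") + "/_matrix/client/versions"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and versions_ok(json.load(resp))
    except (OSError, ValueError):
        # URLError/HTTPError are OSError subclasses; bad JSON raises ValueError.
        return False
```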
---
## Phase 1: Parallel Run (Days 1–7)
### Day 1: Room Bootstrap
```bash
# 1. SSH to the Conduit host, then enter the Matrix infra directory
cd /opt/timmy-config/infra/matrix
# 2. Verify health
./host-readiness-check.sh
# 3. Create rooms (dry-run first)
export MATRIX_HOMESERVER="https://matrix.timmytime.net"
export MATRIX_ADMIN_TOKEN="<admin_access_token>"
python3 scripts/bootstrap-fleet-rooms.py --create-all --dry-run
# 4. Create rooms (live)
python3 scripts/bootstrap-fleet-rooms.py --create-all
```
### Day 1: Operator Onboarding
1. Open Element Web at `https://element.<domain>` or install Element desktop.
2. Register/login as `@alexander:<domain>`.
3. Join `#fleet-ops:<domain>`.
4. Send a test message: `First light on Matrix. Acknowledge, fleet.`
### Days 2–3: Agent Onboarding
For each agent/wizard house:
1. Create Matrix account `@<agent>:<domain>`.
2. Join `#fleet-ops:<domain>` and `#fleet-general:<domain>`.
3. Send acknowledgment in `#fleet-ops`.
4. Update agent's Hermes gateway config to listen on Matrix.
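
Steps 2 and 3 map onto two Matrix client-server endpoints: `POST /_matrix/client/v3/join/{roomIdOrAlias}` and `PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.message/{txnId}`. The sketch below only builds the request URLs and the message body; the helper names and the acknowledgment wording are assumptions, and it presumes the homeserver serves the `v3` path prefix. Note that a plain `m.room.message` sent this way is not end-to-end encrypted, so in an encrypted room an agent would need an E2EE-capable client instead.

```python
import uuid
from typing import Optional
from urllib.parse import quote


def join_url(homeserver: str, room_alias: str) -> str:
    """URL for POST /_matrix/client/v3/join/{roomIdOrAlias}."""
    # '#' and ':' must be percent-encoded inside a path segment.
    base = homeserver.rstrip("/")
    return f"{base}/_matrix/client/v3/join/{quote(room_alias, safe='')}"


def send_url(homeserver: str, room_id: str, txn_id: Optional[str] = None) -> str:
    """URL for PUT .../rooms/{roomId}/send/m.room.message/{txnId}."""
    # The transaction ID makes retries idempotent; generate one if not given.
    txn = txn_id or uuid.uuid4().hex
    base = homeserver.rstrip("/")
    return (
        f"{base}/_matrix/client/v3/rooms/{quote(room_id, safe='')}"
        f"/send/m.room.message/{quote(txn, safe='')}"
    )


def ack_body(agent: str) -> dict:
    """Hypothetical acknowledgment payload for #fleet-ops."""
    return {"msgtype": "m.text", "body": f"{agent} acknowledges. Matrix link is up."}
```

Both requests carry the agent's token in an `Authorization: Bearer <access_token>` header. The send call needs the room ID (the `room_id` field returned by the join response), not the alias.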
### Days 4–6: Parallel Commanding
- **Alexander sends all commands in BOTH Telegram and Matrix.**
- Agents respond in the channel where they are most reliable.
- Monitor for message loss or delivery delays.
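
One way to monitor loss and delay during the parallel run is to compare, per command, when it was seen on each platform. This is a minimal sketch under stated assumptions: each command carries a hypothetical ID, the two dicts map that ID to the Unix time an agent saw it, and the 30-second `max_skew` threshold is an arbitrary placeholder rather than a value from this plan.

```python
from typing import Dict, List


def compare_channels(
    telegram_seen: Dict[str, float],
    matrix_seen: Dict[str, float],
    max_skew: float = 30.0,
) -> Dict[str, List[str]]:
    """Report commands lost on one platform or arriving late on Matrix."""
    t_ids, m_ids = set(telegram_seen), set(matrix_seen)
    return {
        # Seen on Telegram but never on Matrix: blocks the cutover.
        "missing_on_matrix": sorted(t_ids - m_ids),
        # Seen on Matrix only: Telegram dropped it.
        "missing_on_telegram": sorted(m_ids - t_ids),
        # Seen on both, but Matrix trailed Telegram by more than max_skew.
        "matrix_lagging": sorted(
            c for c in t_ids & m_ids
            if matrix_seen[c] - telegram_seen[c] > max_skew
        ),
    }
```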
---
## Phase 2: Cutover (Day 7)
### Step 1: Pin Matrix as Primary
In Telegram `#fleet-ops`:
> "📌 PRIMARY SURFACE CHANGE: Matrix is now the sovereign command channel. Telegram remains as fallback for 48 hours. Join: `<matrix_invite_link>`"
### Step 2: Telegram Gateway Downgrade
Edit each agent's Hermes gateway config:
```yaml
# ~/.hermes/config.yaml
gateway:
  primary_platform: matrix
  fallback_platform: telegram
  matrix:
    enabled: true
    homeserver: https://matrix.timmytime.net
    rooms:
      - "#fleet-ops:matrix.timmytime.net"
  telegram:
    enabled: true  # Fallback only
```
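
Before restarting an agent, the edited config can be sanity-checked. The sketch below works on an already-parsed dict and assumes the `matrix` and `telegram` blocks are nested under `gateway`; `gateway_problems` is a hypothetical helper, not a Hermes command, and the rules it enforces are this plan's expectations rather than Hermes's own validation.

```python
from typing import List


def gateway_problems(config: dict) -> List[str]:
    """Return human-readable problems with a post-cutover gateway config."""
    gw = config.get("gateway", {})
    primary = gw.get("primary_platform")
    fallback = gw.get("fallback_platform")
    problems = []
    if primary != "matrix":
        problems.append("primary_platform should be matrix after cutover")
    if primary is not None and primary == fallback:
        problems.append("primary and fallback must differ")
    # dict.fromkeys de-duplicates while keeping order.
    for name in dict.fromkeys((primary, fallback)):
        if name and not gw.get(name, {}).get("enabled", False):
            problems.append(f"{name} is referenced but not enabled")
    if not gw.get("matrix", {}).get("rooms"):
        problems.append("matrix.rooms is empty")
    return problems
```

An empty list means the config matches the layout shown in Step 2; anything else should be fixed before the agent is restarted.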
### Step 3: Verification Checklist
- [ ] Alexander sends command **only** on Matrix
- [ ] All agents respond within 60 seconds
- [ ] Encrypted room icon shows 🔒 in Element
- [ ] No messages lost in 24-hour window
- [ ] At least one voice/file message test succeeds
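
The 60-second response check can be scored mechanically once response times are collected. A minimal sketch, assuming a hand-built mapping from agent name to seconds-until-reply (agents that never replied are simply absent); the helper name and the sample names are illustrative.

```python
from typing import Dict, Iterable, List


def slow_or_silent(
    expected_agents: Iterable[str],
    response_seconds: Dict[str, float],
    deadline: float = 60.0,
) -> List[str]:
    """Return agents that did not respond within the deadline, sorted by name."""
    return sorted(
        agent
        for agent in expected_agents
        if agent not in response_seconds or response_seconds[agent] > deadline
    )
```

For example, `slow_or_silent(["allegro", "ezra", "timmy"], {"allegro": 4.2, "ezra": 75.0})` flags `ezra` (too slow) and `timmy` (silent); the checklist item passes only when the result is empty.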
### Step 4: Telegram Standby
If all checks pass:
1. Pin final notice in Telegram: "Fallback mode only. Active surface is Matrix."
2. Disable Telegram bot webhooks (do not delete the bot).
3. Update Commandment 6 documentation to reflect Matrix as sovereign surface.
---
## Rollback Plan
If Matrix becomes unreachable or messages are lost:
1. **Immediate**: Alexander re-sends command in Telegram.
2. **Within 1 hour**: All agents switch gateway primary back to Telegram:
   ```yaml
   primary_platform: telegram
   ```
3. **Within 24 hours**: Debug Matrix issue (check Conduit logs, Caddy TLS, DNS).
4. **Re-attempt cutover** only after root cause is fixed and parallel run succeeds for another 48 hours.
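
The config change in step 2 can be expressed as a small transformation on the parsed config. This is a sketch under assumptions: the keys are nested under `gateway`, and demoting Matrix to `fallback_platform` and forcing `telegram.enabled` to true are choices made here, since the plan only specifies the `primary_platform` change. `rolled_back` is a hypothetical helper name.

```python
import copy


def rolled_back(config: dict) -> dict:
    """Return a copy of config with Telegram restored as the primary platform."""
    new = copy.deepcopy(config)  # leave the caller's config untouched
    gw = new.setdefault("gateway", {})
    gw["primary_platform"] = "telegram"
    gw["fallback_platform"] = "matrix"  # assumption: keep Matrix as fallback
    # The primary platform must be live, even if it was disabled after cutover.
    gw.setdefault("telegram", {})["enabled"] = True
    return new
```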
---
## Post-Cutover Maintenance
| Task | Frequency | Command / Action |
|------|-----------|------------------|
| Backup Conduit data | Daily | `tar czvf /backups/conduit-$(date +%F).tar.gz /opt/timmy-config/infra/matrix/data/conduit/` |
| Review room membership | Weekly | Element → Room Settings → Members |
| Update Element Web | Monthly | `docker compose pull && docker compose up -d` |
| Rotate access tokens | Quarterly | Element → Settings → Help & About → Access Token |
---
## Accountability
| Role | Owner | Responsibility |
|------|-------|----------------|
| Deployment | @allegro / @timmy | Run `deploy-matrix.sh` and room bootstrap |
| Operator onboarding | @rockachopa (Alexander) | Install Element, verify encryption |
| Agent gateway cutover | @ezra | Update Hermes gateway configs, monitor logs |
| Rollback decision | @rockachopa | Authorize Telegram fallback if needed |
---
*Filed by Ezra, Archivist | 2026-04-05*