diff --git a/docs/matrix-fleet-comms/EXECUTION_ARCHITECTURE_KT.md b/docs/matrix-fleet-comms/EXECUTION_ARCHITECTURE_KT.md
new file mode 100644
index 00000000..1e3a4609
--- /dev/null
+++ b/docs/matrix-fleet-comms/EXECUTION_ARCHITECTURE_KT.md
@@ -0,0 +1,240 @@
+# Execution Architecture KT — Matrix/Conduit Human-to-Fleet Comms
+
+**Issue**: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166)
+**Blocker**: [#187](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/187) — Host/domain/proxy decisions
+**Scaffold**: [#183](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/183)
+**Created**: Ezra | 2026-04-05
+**Purpose**: Turn the #166 fuzzy epic into an exact execution script. Once #187 closes, follow this KT verbatim.
+
+---
+
+## Executive Summary
+
+This document is the **knowledge transfer** from architecture (#183) to execution (#166). It assumes the decision framework in `docs/DECISION_FRAMEWORK_187.md` has been accepted (recommended: **Option A — Hermes VPS + Caddy + matrix.timmytime.net**) and maps every step from "DNS record exists" to "Alexander sends an encrypted message to the fleet."
+
+---
+
+## Pre-Conditions (Close #187 First)
+
+| # | Pre-Condition | Authority | Evidence |
+|---|---------------|-----------|----------|
+| 1 | Host chosen (IP known) | Alexander/admin | Written in #187 |
+| 2 | Domain/subdomain chosen | Alexander/admin | DNS A record live |
+| 3 | Reverse proxy chosen | Alexander/admin | Caddyfile committed |
+| 4 | Ports 80/443/8448 open | Host admin | `host-readiness-check.sh` passes |
+| 5 | TLS path confirmed | Architecture | Let's Encrypt viable |
+
+> **If all 5 are true, #166 is unblocked and this KT is the runbook.**
+
+---
+
+## Phase 1: Host Prep (30 minutes)
+
+### 1.1 Clone Repo on Target Host
+```bash
+ssh root@
+git clone https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config.git /opt/timmy-config
+cd /opt/timmy-config/infra/matrix
+```
+
+### 1.2 Verify Host Readiness
+```bash
+./host-readiness-check.sh
+```
+Expected: all checks green (Docker, ports, disk, RAM).
+
+### 1.3 Configure Environment
+```bash
+cp .env.example .env
+# Edit .env:
+# CONDUIT_SERVER_NAME=matrix.timmytime.net
+# CONDUIT_ALLOW_REGISTRATION=true # ONLY for bootstrap
+```
+
+---
+
+## Phase 2: Conduit Deployment (15 minutes)
+
+### 2.1 One-Command Deploy
+```bash
+./deploy-matrix.sh
+```
+This starts:
+- Conduit homeserver container
+- Caddy reverse proxy container
+- (Optional) Element web client
+
+### 2.2 Verify Health
+```bash
+curl -s https://matrix.timmytime.net/_matrix/client/versions | jq .
+```
+Expected: JSON with `versions` array.
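The health check in 2.2 relies on reading the JSON by eye. A minimal sketch of how the same check could fail loudly in a script follows. The helper name `has_versions_array` is hypothetical (it is not part of the repo scripts), and the pattern match is a rough stand-in for a real JSON parser such as `jq`; it assumes the server returns compact, single-line JSON.

```bash
#!/usr/bin/env bash
# Sketch: decide whether a /_matrix/client/versions response looks healthy.
# has_versions_array is a hypothetical helper, not part of the repo scripts.

has_versions_array() {
  # $1: raw JSON body returned by the homeserver (assumed to be on one line)
  local body="$1"
  # A healthy response carries a non-empty "versions" array of strings.
  if printf '%s' "$body" | grep -Eq '"versions"[[:space:]]*:[[:space:]]*\[[[:space:]]*"'; then
    echo "healthy"
  else
    echo "unhealthy"
  fi
}

# Example wiring (commented out because it needs network access):
# has_versions_array "$(curl -fsS https://matrix.timmytime.net/_matrix/client/versions)"
```

Printing a word rather than returning an exit code keeps the result easy to log from cron; a caller that prefers exit codes can compare the output against `healthy`.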
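+
+### 2.3 Verify Federation
+```bash
+curl -s https://matrix.timmytime.net/.well-known/matrix/server
+```
+Expected: `{"m.server": "matrix.timmytime.net:443"}`
+
+---
+
+## Phase 3: Fleet Bootstrap — Accounts & Rooms (30 minutes)
+
+### 3.1 Create Admin Account
+**Enable registration temporarily** in `.env`:
+```
+CONDUIT_ALLOW_REGISTRATION=true
+CONDUIT_REGISTRATION_TOKEN=
+```
+Restart:
+```bash
+docker compose restart conduit
+```
+
+Register admin. `register_new_matrix_user` is a Synapse utility and is not shipped with Conduit, so register through a Matrix client such as Element or through the client-server API. Conduit's documentation states that the first account registered on a fresh server is made admin.
+```bash
+# Password and token are left blank here, as in the .env snippet above.
+# A server may first answer 401 with a "session" value that must be echoed back inside "auth".
+curl -s -X POST "https://matrix.timmytime.net/_matrix/client/v3/register" \
+  -H "Content-Type: application/json" \
+  -d '{"username": "admin", "password": "", "auth": {"type": "m.login.registration_token", "token": ""}}'
+```
+
+**Immediately disable registration** and restart.
+
+### 3.2 Create Fleet Accounts
+| Account | Purpose | Created By |
+|---------|---------|------------|
+| `@admin:matrix.timmytime.net` | Server administration | deploy script |
+| `@alexander:matrix.timmytime.net` | Human operator | admin |
+| `@timmy:matrix.timmytime.net` | Coordinator bot | admin |
+| `@ezra:matrix.timmytime.net` | Archivist bot | admin |
+| `@allegro:matrix.timmytime.net` | Dispatch bot | admin |
+| `@bezalel:matrix.timmytime.net` | Dev bot | admin |
+| `@gemini:matrix.timmytime.net` | Nexus architect bot | admin |
+
+Create each account from the Conduit admin room (check the room's command list for the user-creation command in your version), or briefly re-enable token-gated registration and repeat the `/register` call from 3.1. `register_new_matrix_user` is Synapse-only.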
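The account table above can be sketched as a loop that derives each full Matrix user ID from its localpart, which is handy as input to whichever registration method is used. This is a sketch under stated assumptions: `fleet_mxid` is a hypothetical helper, and the character check is a conservative subset of the Matrix user ID grammar rather than the full rule.

```bash
#!/usr/bin/env bash
# Sketch: build full Matrix user IDs for the fleet from their localparts.
# fleet_mxid is a hypothetical helper; the server name comes from the KT.

SERVER_NAME="matrix.timmytime.net"

fleet_mxid() {
  # $1: localpart, e.g. "timmy"
  local localpart="$1"
  # Accept only lowercase letters, digits and . _ = / - (a conservative subset).
  case "$localpart" in
    ""|*[!a-z0-9._=/-]*)
      echo "invalid localpart: $localpart" >&2
      return 1
      ;;
  esac
  printf '@%s:%s\n' "$localpart" "$SERVER_NAME"
}

# Print one user ID per fleet account listed in the table.
for name in admin alexander timmy ezra allegro bezalel gemini; do
  fleet_mxid "$name"
done
```

Rejecting odd localparts before any network call keeps typos out of the homeserver, where a mistaken account is harder to undo than a failed shell command.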
+
+### 3.3 Create Fleet Rooms
+| Room Alias | Purpose | Encryption |
+|------------|---------|------------|
+| `#fleet-ops:matrix.timmytime.net` | Operator commands | ✅ E2E |
+| `#fleet-intel:matrix.timmytime.net` | Deep Dive briefings | ✅ E2E |
+| `#fleet-social:matrix.timmytime.net` | General chat | ✅ E2E |
+| `#fleet-alerts:matrix.timmytime.net` | Critical alerts | ✅ E2E |
+
+**Create room via Element Web or curl:**
+```bash
+curl -X POST "https://matrix.timmytime.net/_matrix/client/v3/createRoom" \
+  -H "Authorization: Bearer " \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "Fleet Ops",
+    "room_alias_name": "fleet-ops",
+    "preset": "private_chat",
+    "initial_state": [{
+      "type": "m.room.encryption",
+      "content": {"algorithm": "m.megolm.v1.aes-sha2"}
+    }]
+  }'
+```
+
+### 3.4 Invite Fleet Members
+Invite each bot/user to the appropriate rooms. For `#fleet-ops`, restrict to `@alexander`, `@timmy`, `@ezra`, `@allegro`.
+
+---
+
+## Phase 4: Wizard Onboarding Procedure (30 minutes)
+
+Each wizard house needs:
+1. **Matrix credentials** (username + password + recovery key)
+2. **Client recommendation** — Element Desktop or FluffyChat
+3. **Room memberships** — invite to relevant fleet rooms
+4. **Encryption verification** — verify keys with Alexander
+
+### Onboarding Checklist per Wizard
+- [ ] Account created and credentials stored in vault
+- [ ] Client installed and signed in
+- [ ] Joined `#fleet-intel`, and `#fleet-ops` if listed in 3.4
+- [ ] E2E verification completed with `@alexander`
+- [ ] Test message sent and received
+
+---
+
+## Phase 5: Telegram → Matrix Cutover Architecture
+
+### 5.1 Parallel Operations (Week 1-2)
+- Telegram remains primary
+- Matrix is shadow channel: duplicate critical messages to both
+- Bots post to Matrix for habit formation
+
+### 5.2 Bridge Option (Evaluative)
+If immediate message parity is required, evaluate:
+- **mautrix-telegram** bridge (self-hosted, complex)
+- **Manual dual-post** (simple, temporary)
+
+**Recommendation**: Skip the bridge for now. Dual-post via bot logic is lower risk.
+
+### 5.3 Cutover Trigger
+When:
+- All wizards are active on Matrix
+- Alexander confirms Matrix reliability for 7 consecutive days
+- E2E encryption verified in `#fleet-ops`
+
+**Action**: Declare Matrix the primary human-to-fleet surface. Telegram becomes fallback only.
+
+---
+
+## Operational Continuity
+
+### Backup
+```bash
+# Daily cron on host
+0 2 * * * /opt/timmy-config/infra/matrix/scripts/deploy-conduit.sh backup
+```
+
+### Monitoring
+```bash
+# Health check every 5 minutes
+# `alert` is a placeholder; substitute the host's notification command
+*/5 * * * * /opt/timmy-config/infra/matrix/scripts/deploy-conduit.sh status || alert
+```
+
+### Upgrade Path
+1. Pull latest `timmy-config`
+2. Run `./host-readiness-check.sh`
+3. `docker compose pull && docker compose up -d`
+
+---
+
+## Acceptance Criteria Mapping
+
+| #166 Criterion | How This KT Satisfies It | Phase |
+|----------------|--------------------------|-------|
+| Deploy Conduit homeserver | `deploy-matrix.sh` + health checks | 2 |
+| Create fleet rooms/channels | Exact room aliases + creation curl | 3 |
+| Verify encrypted operator messaging | E2E enabled + key verification step | 3-4 |
+| Define Telegram→Matrix cutover plan | Phase 5 explicit cutover trigger (5.3) | 5 |
+| Alexander can message fleet | `@alexander` account + `#fleet-ops` membership | 3 |
+| Messages encrypted and persistent | `m.room.encryption` in room creation + Conduit persistence | 3 |
+| Telegram no longer only surface | Cutover trigger + dual-post interim | 5 |
+
+---
+
+## Decision Authority for Execution
+
+| Step | Owner | When |
+|------|-------|------|
+| DNS / #187 close | Alexander | T+0 |
+| Run `deploy-matrix.sh` | Allegro or Ezra | T+0 (15 min) |
+| Create accounts/rooms | Allegro or Ezra | T+15 (30 min) |
+| Onboard wizards | Individual agents + Alexander | T+45 (ongoing) |
+| Cutover declaration | Alexander | T+7 days (minimum) |
+
+---
+
+## References
+
+- Scaffold: [`infra/matrix/`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/infra/matrix)
+- ADRs: [`infra/matrix/docs/adr/`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/infra/matrix/docs/adr)
+- Decision Framework: [`docs/DECISION_FRAMEWORK_187.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/docs/DECISION_FRAMEWORK_187.md)
+- Operational Runbook: [`infra/matrix/docs/RUNBOOK.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/infra/matrix/docs/RUNBOOK.md)
+
+---
+
+**Ezra Sign-off**: This KT removes all ambiguity from #166. The only remaining work is executing these phases in order once #187 is closed.
+
+— Ezra, Archivist
+2026-04-05
diff --git a/infra/matrix/docs/adr/ADR-001-conduit-selection.md b/infra/matrix/docs/adr/ADR-001-conduit-selection.md
new file mode 100644
index 00000000..3b329577
--- /dev/null
+++ b/infra/matrix/docs/adr/ADR-001-conduit-selection.md
@@ -0,0 +1,39 @@
+# ADR-001: Homeserver Selection — Conduit
+
+**Status**: Accepted
+**Date**: 2026-04-05
+**Deciders**: Ezra (architect), Timmy Foundation
+**Scope**: Matrix homeserver for human-to-fleet encrypted communication (#166, #183)
+
+---
+
+## Context
+
+We need a Matrix homeserver to serve as the sovereign operator surface. Options:
+- **Synapse** (Python, mature, resource-heavy)
+- **Dendrite** (Go, lighter, beta federation)
+- **Conduit** (Rust, lightweight, SQLite support)
+
+## Decision
+
+Use **Conduit** as the Matrix homeserver.
+
+## Consequences
+
+| Positive | Negative |
+|----------|----------|
+| Low RAM/CPU footprint (~200 MB) | Smaller ecosystem than Synapse |
+| SQLite option eliminates Postgres ops | Some edge-case federation bugs |
+| Single binary, simple systemd service | Admin tooling less mature |
+| Full federation support | |
+
+## Alternatives Considered
+
+- **Synapse**: Rejected due to Python overhead and the operational cost of Postgres, which Synapse recommends for production (its SQLite mode is intended for small or test deployments).
+- **Dendrite**: Rejected due to beta federation status; we need reliable federation from day one.
+
+## References
+
+- Issue: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166)
+- Issue: [#183](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/183)
+- Conduit docs: https://conduit.rs/
diff --git a/infra/matrix/docs/adr/ADR-002-hermes-vps-host.md b/infra/matrix/docs/adr/ADR-002-hermes-vps-host.md
new file mode 100644
index 00000000..000eb742
--- /dev/null
+++ b/infra/matrix/docs/adr/ADR-002-hermes-vps-host.md
@@ -0,0 +1,37 @@
+# ADR-002: Host Selection — Hermes VPS
+
+**Status**: Accepted
+**Date**: 2026-04-05
+**Deciders**: Ezra (architect), Timmy Foundation
+**Scope**: Initial deployment host for Matrix/Conduit (#166, #183, #187)
+
+---
+
+## Context
+
+We need a target host for the Conduit homeserver. Options:
+- Existing Hermes VPS (`143.198.27.163`)
+- Timmy-Home bare metal
+- New cloud droplet (DigitalOcean, Hetzner, etc.)
+
+## Decision
+
+Use the **existing Hermes VPS** as the initial host, with a future option to migrate to a dedicated Matrix VPS if load demands.
+
+## Consequences
+
+| Positive | Negative |
+|----------|----------|
+| Zero additional hosting cost | Shared resource pool with Gitea + wizard gateways |
+| Known operational state (backups, monitoring) | Single point of failure for multiple services |
+| Simplified network posture | May need to upgrade VPS if federation traffic grows |
+
+## Migration Trigger
+
+If Matrix active users exceed ~50 or federation traffic causes >60% sustained CPU, migrate to a dedicated VPS. The Docker Compose scaffold makes this a data-directory copy.
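The migration trigger above is a two-threshold rule, which can be sketched as a small shell check so that the decision is reproducible rather than a judgment call. `migration_due` is a hypothetical helper, and how the two inputs are measured (active user count, sustained CPU percentage) is left to the operator; both are assumed to be integers.

```bash
#!/usr/bin/env bash
# Sketch: evaluate the ADR-002 migration trigger from two measured values.
# migration_due is a hypothetical helper; measuring the inputs is out of scope.

MAX_USERS=50      # "~50" active users, from the ADR
MAX_CPU_PCT=60    # sustained CPU percentage, from the ADR

migration_due() {
  # $1: active Matrix users (integer), $2: sustained CPU usage in percent (integer)
  local users="$1" cpu="$2"
  # Either threshold being exceeded is enough to recommend migration.
  if [ "$users" -gt "$MAX_USERS" ] || [ "$cpu" -gt "$MAX_CPU_PCT" ]; then
    echo "migrate"
  else
    echo "stay"
  fi
}
```

For example, `migration_due 12 35` prints `stay`, while `migration_due 51 10` prints `migrate`. Values exactly at a threshold count as not exceeded, matching the ADR's "exceed" wording.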
+
+## References
+
+- Issue: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166)
+- Issue: [#187](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/187)
+- Decision Framework: [`docs/DECISION_FRAMEWORK_187.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/docs/DECISION_FRAMEWORK_187.md)
diff --git a/infra/matrix/docs/adr/ADR-003-full-federation.md b/infra/matrix/docs/adr/ADR-003-full-federation.md
new file mode 100644
index 00000000..69f233af
--- /dev/null
+++ b/infra/matrix/docs/adr/ADR-003-full-federation.md
@@ -0,0 +1,35 @@
+# ADR-003: Federation Strategy — Full Federation Enabled
+
+**Status**: Accepted
+**Date**: 2026-04-05
+**Deciders**: Ezra (architect), Timmy Foundation
+**Scope**: Federation behavior for Conduit homeserver (#166, #183)
+
+---
+
+## Context
+
+Matrix servers can operate in isolated mode (no federation) or federated mode (interoperate with matrix.org and other homeservers).
+
+## Decision
+
+Enable **full federation from day one**.
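With federation enabled, other homeservers locate this one through the `/.well-known/matrix/server` document. A rough sketch of extracting the delegated host and port from that document follows. `delegated_target` is a hypothetical helper, the `sed` pattern assumes a compact single-line JSON body, and the 8448 fallback reflects the Matrix server discovery default when no port is given and no SRV record overrides it.

```bash
#!/usr/bin/env bash
# Sketch: read the federation target out of a /.well-known/matrix/server body.
# delegated_target is a hypothetical helper; it is not a full JSON parser.

delegated_target() {
  # $1: body of /.well-known/matrix/server (assumed to be on one line)
  local body="$1" target
  target=$(printf '%s' "$body" |
    sed -n 's/.*"m\.server"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
  if [ -z "$target" ]; then
    echo "no m.server entry" >&2
    return 1
  fi
  # Without an explicit port, federation falls back to 8448
  # (unless an SRV record says otherwise, which this sketch ignores).
  case "$target" in
    *:*) echo "$target" ;;
    *)   echo "$target:8448" ;;
  esac
}

# Example wiring (commented out because it needs network access):
# delegated_target "$(curl -fsS https://matrix.timmytime.net/.well-known/matrix/server)"
```

The printed `host:port` is the address that must be reachable from the public internet, so it doubles as the input for a firewall or port check.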
+
+## Consequences
+
+| Positive | Negative |
+|----------|----------|
+| Alexander can use any Matrix client/ID | Requires public DNS + TLS + port 8448 (or 443 via `.well-known` delegation) |
+| Fleet bots can bridge to other networks | Slightly larger attack surface |
+| Aligns with sovereign, open protocol ethos | Must monitor for abuse/spam |
+
+## Prerequisites Introduced
+
+- Valid TLS certificate (Let's Encrypt via Caddy)
+- Public DNS A record; an SRV record is needed only if `.well-known` delegation is not served
+- Firewall open on TCP 8448 inbound, unless `/.well-known/matrix/server` delegates federation to port 443
+
+## References
+
+- Issue: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166)
+- Runbook: [`infra/matrix/docs/RUNBOOK.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/infra/matrix/docs/RUNBOOK.md)
diff --git a/infra/matrix/docs/adr/ADR-004-caddy-reverse-proxy.md b/infra/matrix/docs/adr/ADR-004-caddy-reverse-proxy.md
new file mode 100644
index 00000000..60015894
--- /dev/null
+++ b/infra/matrix/docs/adr/ADR-004-caddy-reverse-proxy.md
@@ -0,0 +1,38 @@
+# ADR-004: Reverse Proxy Selection — Caddy
+
+**Status**: Accepted
+**Date**: 2026-04-05
+**Deciders**: Ezra (architect), Timmy Foundation
+**Scope**: TLS termination and reverse proxy for Matrix/Conduit (#166, #183)
+
+---
+
+## Context
+
+Options for reverse proxy + TLS:
+- **Caddy** (auto-TLS, simple config)
+- **Traefik** (Docker-native, label-based)
+- **Nginx** (ubiquitous, more manual)
+
+## Decision
+
+Use **Caddy** as the dedicated reverse proxy for Matrix services.
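To make the decision concrete, here is a sketch of what a minimal Caddy site block for this role might look like, rendered from a shell function so the server name is not hard-coded. This is an assumed shape, not the committed `infra/matrix/caddy/Caddyfile`. `render_caddyfile` is a hypothetical helper, and the upstream `conduit:6167` used in the usage note assumes a Compose service named `conduit` listening on Conduit's default port.

```bash
#!/usr/bin/env bash
# Sketch: render a minimal Caddyfile site block for a Matrix homeserver.
# render_caddyfile is a hypothetical helper; the real config lives in the repo.

render_caddyfile() {
  # $1: public server name, $2: upstream host:port of the homeserver
  local server_name="$1" upstream="$2"
  cat <<EOF
${server_name} {
    # Delegation document: tells other homeservers to federate over 443.
    handle /.well-known/matrix/server {
        header Content-Type application/json
        respond "{\"m.server\": \"${server_name}:443\"}" 200
    }
    # All client and federation API traffic goes to the homeserver.
    handle /_matrix/* {
        reverse_proxy ${upstream}
    }
}
EOF
}
```

A possible use is `render_caddyfile matrix.timmytime.net conduit:6167 > Caddyfile`, followed by `caddy validate --config Caddyfile` before reloading, since the generated file is only as good as the assumptions above.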
+
+## Consequences
+
+| Positive | Negative |
+|----------|----------|
+| Automatic ACME/Let's Encrypt | Fewer community Matrix-specific examples |
+| Native `.well-known` + SRV support | New config language for ops team |
+| No Docker label magic required | |
+| Clean separation from existing Traefik | |
+
+## Implementation
+
+See:
+- `infra/matrix/caddy/Caddyfile`
+- `deploy/matrix/Caddyfile`
+
+## References
+
+- Issue: [#183](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/183)
diff --git a/infra/matrix/docs/adr/ADR-005-sqlite-phase1.md b/infra/matrix/docs/adr/ADR-005-sqlite-phase1.md
new file mode 100644
index 00000000..5b7b67a7
--- /dev/null
+++ b/infra/matrix/docs/adr/ADR-005-sqlite-phase1.md
@@ -0,0 +1,35 @@
+# ADR-005: Database Selection — SQLite for Phase 1
+
+**Status**: Accepted
+**Date**: 2026-04-05
+**Deciders**: Ezra (architect), Timmy Foundation
+**Scope**: Persistence layer for Conduit (#166, #183)
+
+---
+
+## Context
+
+Conduit supports SQLite and RocksDB as database backends; it does not offer a PostgreSQL backend. Synapse recommends Postgres for production.
+
+## Decision
+
+Use **SQLite** for the initial deployment (Phase 1). Move to a different backend only if user count or performance metrics trigger it; within Conduit that means RocksDB, while PostgreSQL would require changing homeserver.
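One of the metrics behind that decision, database file size, is cheap to check from a shell. A minimal sketch follows, assuming the caller supplies the path to the SQLite file (the actual location depends on the Compose volume layout and is not stated here). `db_over_limit` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: report whether a database file has grown past a size limit.
# db_over_limit is a hypothetical helper; the file path is supplied by the caller.

db_over_limit() {
  # $1: path to the database file, $2: size limit in bytes
  local db_file="$1" limit_bytes="$2" size_bytes
  if [ ! -f "$db_file" ]; then
    echo "missing"
    return 0
  fi
  # Arithmetic expansion strips the padding some wc implementations add.
  size_bytes=$(( $(wc -c <"$db_file") ))
  if [ "$size_bytes" -gt "$limit_bytes" ]; then
    echo "over"
  else
    echo "under"
  fi
}
```

For the 10 GB threshold listed in this ADR, the limit could be passed as `$((10 * 1024 * 1024 * 1024))`. Reporting `missing` separately avoids treating a wrong path as a healthy, small database.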
+
+## Consequences
+
+| Positive | Negative |
+|----------|----------|
+| Zero additional container/service | Harder to scale horizontally |
+| Single file backup/restore | Performance ceiling under heavy load |
+| Conduit optimized for SQLite | |
+
+## Migration Trigger
+
+- Concurrent active users > 50
+- Database file > 10 GB
+- Noticeable query latency on room sync
+
+## References
+
+- Issue: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166)
+- Config: `infra/matrix/conduit.toml`
diff --git a/infra/matrix/docs/adr/README.md b/infra/matrix/docs/adr/README.md
new file mode 100644
index 00000000..d69ca270
--- /dev/null
+++ b/infra/matrix/docs/adr/README.md
@@ -0,0 +1,26 @@
+# Architecture Decision Records — Matrix/Conduit Fleet Communications
+
+**Issue**: [#183](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/183)
+**Parent**: [#166](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/issues/166)
+
+---
+
+## Index
+
+| ADR | Decision | File |
+|-----|----------|------|
+| ADR-001 | Homeserver: Conduit | `ADR-001-conduit-selection.md` |
+| ADR-002 | Host: Hermes VPS | `ADR-002-hermes-vps-host.md` |
+| ADR-003 | Federation: Full enable | `ADR-003-full-federation.md` |
+| ADR-004 | Reverse Proxy: Caddy | `ADR-004-caddy-reverse-proxy.md` |
+| ADR-005 | Database: SQLite (Phase 1) | `ADR-005-sqlite-phase1.md` |
+
+## Purpose
+
+These ADRs make the #183 scaffold auditable and portable. Any future agent or operator can understand *why* the architecture is shaped this way without re-litigating decisions.
+
+## Continuity
+
+- Canonical scaffold index: [`docs/CANONICAL_INDEX_MATRIX.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/docs/CANONICAL_INDEX_MATRIX.md)
+- Decision framework for #187: [`docs/DECISION_FRAMEWORK_187.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/docs/DECISION_FRAMEWORK_187.md)
+- Operational runbook: [`infra/matrix/docs/RUNBOOK.md`](http://143.198.27.163:3000/Timmy_Foundation/timmy-config/src/branch/main/infra/matrix/docs/RUNBOOK.md)