83
docs/matrix-fleet-comms/ADR-001-matrix-scaffold.md
Normal file
@@ -0,0 +1,83 @@
# ADR-001: Matrix/Conduit Deployment Scaffold

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Date** | 2026-04-05 |
| **Decider** | Ezra (Architekt) |
| **Stakeholders** | Allegro, Timmy, Alexander |
| **Parent Issues** | #166, #183 |

---
## 1. Context

Son of Timmy Commandment 6 requires encrypted human-to-fleet communication that is sovereign and independent of Telegram. Before any code could run, we needed a reproducible, infrastructure-agnostic deployment scaffold that any wizard house can verify, deploy, and restore.
## 2. Decision: Conduit over Synapse

**Chosen:** [Conduit](https://conduit.rs) as the Matrix homeserver.

**Alternatives considered:**
- **Synapse**: Mature, but heavier (Python, more RAM, more complex config).
- **Dendrite**: Go-based, lighter than Synapse, but less feature-complete for E2EE.

**Rationale:**
- Conduit is written in Rust, has a small footprint, and runs comfortably on the Hermes VPS (~7 GB RAM).
- Single static binary + SQLite (or Postgres) keeps the Docker image small and backup logic simple.
- E2EE support is production-grade enough for a closed fleet.
## 3. Decision: Docker Compose over Bare Metal

**Chosen:** Docker Compose stack (`docker-compose.yml`) with explicit volume mounts.

**Rationale:**
- Reproducibility: any host with Docker can stand the stack up in one command.
- Isolation: Conduit, Element Web, and Postgres live in separate containers with explicit network boundaries.
- Rollback: `docker compose down && docker compose up -d` is a safe, fast recovery path. It recreates the containers from the Compose file and leaves the mounted volumes, and therefore the homeserver data, in place.
- Future portability: the same Compose file can move to a different VPS with only `.env` changes.
## 4. Decision: Caddy as Reverse Proxy (with Nginx coexistence)

**Chosen:** Caddy handles TLS termination and `.well-known/matrix` delegation inside the Compose network.

**Rationale:**
- Caddy obtains and renews Let’s Encrypt certificates automatically (its automatic HTTPS feature), so no separate certbot step is needed.
- On hosts where Nginx already binds 80/443 (e.g., Hermes VPS), Nginx can reverse-proxy to Caddy or Conduit directly.
- The scaffold includes both a `caddy/Caddyfile` and Nginx-compatible notes so the operator is not locked into one proxy.
## 5. Decision: One Matrix Account Per Wizard House

**Chosen:** Each wizard house (Ezra, Allegro, Bezalel, etc.) gets its own Matrix user ID (`@ezra:domain`, `@allegro:domain`).

**Rationale:**
- Preserves sovereignty: each house has its own credentials, device keys, and E2EE trust chain.
- Matches the existing wizard-house mental model (independent agents, shared rooms).
- Simplifies debugging: message provenance is unambiguous.
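
The naming scheme above can be sketched as a small helper that derives one user ID per house. This is only an illustration: `house_user_id`, `HOUSES`, and the placeholder server name `example.org` are assumptions, and the allowed-character check is a deliberately conservative subset of what the Matrix specification permits in a localpart.

```python
import re

# Conservative subset of the Matrix localpart grammar (an assumption for this
# sketch): lowercase letters, digits, and a few punctuation characters.
_LOCALPART_RE = re.compile(r"[a-z0-9._=/-]+")


def house_user_id(house: str, server_name: str) -> str:
    """Build a Matrix user ID of the form @localpart:server_name."""
    localpart = house.strip().lower()
    if not _LOCALPART_RE.fullmatch(localpart):
        raise ValueError(f"invalid Matrix localpart: {house!r}")
    return f"@{localpart}:{server_name}"


HOUSES = ["Ezra", "Allegro", "Bezalel"]  # house names mentioned in this ADR
SERVER_NAME = "example.org"              # placeholder, not the real domain

if __name__ == "__main__":
    for house in HOUSES:
        print(house_user_id(house, SERVER_NAME))
```

Keeping the mapping in one function means every script that provisions or addresses a house derives the same ID, which supports the provenance argument above.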
## 6. Decision: `matrix-nio` for Hermes Gateway Integration

**Chosen:** [`matrix-nio`](https://github.com/poljar/matrix-nio) with the `e2e` extra.

**Rationale:**
- Already integrated into the Hermes gateway (`gateway/platforms/matrix.py`).
- Asyncio-native, matching the Hermes gateway architecture.
- Supports E2EE, media uploads, threads, and replies.
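
For orientation, a minimal sketch of sending one text message with `matrix-nio` is shown here. It is not the code in `gateway/platforms/matrix.py`; `send_text` and `text_message` are hypothetical names, and every argument is a placeholder. The sketch does not configure an encryption store, so it should be read as a plain-text example; sending into encrypted rooms additionally needs the `e2e` extra and a persistent store.

```python
import asyncio


def text_message(body: str) -> dict:
    """Build the content of a plain-text m.room.message event."""
    return {"msgtype": "m.text", "body": body}


async def send_text(homeserver: str, user_id: str, password: str,
                    room_id: str, body: str) -> None:
    """Log in, send one message, and close the client (illustrative only)."""
    # Imported lazily so text_message() works without matrix-nio installed.
    from nio import AsyncClient, LoginResponse

    client = AsyncClient(homeserver, user_id)
    try:
        resp = await client.login(password)
        if not isinstance(resp, LoginResponse):
            raise RuntimeError(f"login failed: {resp}")
        await client.room_send(
            room_id=room_id,
            message_type="m.room.message",
            content=text_message(body),
        )
    finally:
        await client.close()


# Example call (all values are placeholders):
# asyncio.run(send_text("https://matrix.example.org", "@ezra:example.org",
#                       "secret", "!roomid:example.org", "fleet online"))
```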
## 7. Consequences

### Positive
- The scaffold is **self-enforcing**: `validate-scaffold.py` and Gitea Actions CI guard integrity.
- Local integration can be verified without public DNS via `docker-compose.test.yml`.
- The path from "host decision" to "fleet online" is fully scripted.
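
To make "self-enforcing" concrete, an integrity check of this kind can be as simple as asserting that the expected files exist. The sketch below is an assumption about the general shape of such a check, not the contents of `validate-scaffold.py`; the artifact list and its paths relative to the scaffold root are hypothetical.

```python
from pathlib import Path
from typing import Sequence

# Hypothetical artifact list for illustration. The file names appear in this
# ADR, but their locations relative to the scaffold root are assumed.
REQUIRED_ARTIFACTS = (
    "docker-compose.yml",
    "docker-compose.test.yml",
    "caddy/Caddyfile",
)


def missing_artifacts(root: Path,
                      required: Sequence[str] = REQUIRED_ARTIFACTS) -> list[str]:
    """Return the required relative paths that are not files under root."""
    return [rel for rel in required if not (root / rel).is_file()]
```

A CI job could call `missing_artifacts(Path("infra/matrix"))` and fail the build when the returned list is non-empty.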
### Negative / Accepted Trade-offs
- Conduit is younger than Synapse; edge-case federation bugs are possible. Mitigation: the fleet will run on a single homeserver initially.
- SQLite is the default Conduit backend. For more than 100 users, Postgres is recommended. The Compose file includes an optional Postgres service.
## 8. References

- `infra/matrix/CANONICAL_INDEX.md` — canonical artifact map
- `infra/matrix/scripts/validate-scaffold.py` — automated integrity checks
- `.gitea/workflows/validate-matrix-scaffold.yml` — CI enforcement
- `infra/matrix/HERMES_INTEGRATION_VERIFICATION.md` — adapter-to-scaffold mapping
140
docs/matrix-fleet-comms/DECISION_FRAMEWORK_187.md
Normal file
@@ -0,0 +1,140 @@
# Decision Framework: Matrix Host, Domain, and Proxy (#187)

**Parent:** #166 — Stand up Matrix/Conduit for human-to-fleet encrypted communication
**Blocker:** #187 — Decide Matrix host, domain, and proxy prerequisites
**Author:** Ezra
**Date:** 2026-04-05

---
## Executive Summary

#166 is **execution-ready**. The only remaining gate is a set of three decisions:
1. **Host** — which machine runs Conduit?
2. **Domain** — what FQDN serves the homeserver?
3. **Proxy/TLS** — how do HTTPS and federation terminate?

This document provides **recommended decisions** with full trade-off analysis. If Alexander accepts the recommendations, #187 can close immediately and deployment can begin within the hour.

---
## Decision 1: Host

### Recommended Choice
**Hermes VPS** (current host of Ezra, Bezalel, and Allegro-Primus gateway).

### Alternative Considered
**TestBed VPS** (67.205.155.108) — currently hosts Bezalel (stale) and other experimental workloads.
### Comparison

| Factor | Hermes VPS | TestBed VPS |
|--------|------------|-------------|
| Disk | ✅ 55 GB free | Unknown / smaller |
| RAM | ✅ 7 GB | 4 GB (reported) |
| Docker | ✅ Installed | Unknown |
| Docker Compose | ❌ Not installed (15-min fix) | Unknown |
| Nginx on 80/443 | ✅ Already running | Unknown |
| Tailscale | ✅ Active | Unknown |
| Existing wizard presence | ✅ Ezra, Bezalel, Allegro-Primus | ❌ None primary |
| Latency to Alexander | Low (US East) | Low (US East) |
### Ezra Recommendation
**Hermes VPS.** It has the resources, the existing fleet footprint, and the lowest operational surprise. The only missing package is Docker Compose, which is a one-line install: `apt install docker-compose-plugin` (Compose v2, available from Docker's apt repository and invoked as `docker compose`) or, as a legacy fallback, `pip install docker-compose` (Compose v1, invoked as `docker-compose`).
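
Because the two install routes produce different commands, a readiness script may want to detect which one is present before running anything. The following is a minimal sketch under that assumption; `compose_command` is a hypothetical helper and is not part of `host-readiness-check.sh`.

```python
import shutil
import subprocess
from typing import Optional


def compose_command() -> Optional[list[str]]:
    """Return the argv prefix for Docker Compose, or None if it is unavailable."""
    docker = shutil.which("docker")
    if docker is not None:
        try:
            # Compose v2 is a CLI plugin; this exits 0 when the plugin exists.
            probe = subprocess.run([docker, "compose", "version"],
                                   capture_output=True, text=True, timeout=15)
            if probe.returncode == 0:
                return [docker, "compose"]
        except (OSError, subprocess.TimeoutExpired):
            pass  # fall through to the legacy binary
    legacy = shutil.which("docker-compose")  # Compose v1 standalone binary
    if legacy is not None:
        return [legacy]
    return None
```

A caller can then build commands such as `compose_command() + ["up", "-d"]` without caring which variant the host has.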

---

## Decision 2: Domain / Subdomain

### Recommended Choice
`matrix.alexanderwhitestone.com`

### Alternatives Considered
- `fleet.alexanderwhitestone.com`
- `chat.alexanderwhitestone.com`
- `conduit.alexanderwhitestone.com`
### Analysis

| Subdomain | Clarity | Federation Friendly | Notes |
|-----------|---------|---------------------|-------|
| `matrix.*` | ✅ Industry standard | ✅ Easy to remember | Best for `.well-known/matrix/server` delegation |
| `fleet.*` | ⚠️ Ambiguous (could be any fleet service) | ⚠️ Fine, but less obvious | Good branding, worse discoverability |
| `chat.*` | ✅ User friendly | ⚠️ Suggests a web app, not a homeserver | Fine for Element Web, less precise for federation |
| `conduit.*` | ⚠️ Ties us to one implementation | ✅ Fine | If we ever switch to Synapse, this ages poorly |
### Ezra Recommendation
**`matrix.alexanderwhitestone.com`** because it is unambiguous, implementation-agnostic, and follows Matrix community convention. The server name can still be `alexanderwhitestone.com` (for short Matrix IDs like `@ezra:alexanderwhitestone.com`) while the actual homeserver listens on `matrix.alexanderwhitestone.com:8448` or is delegated via `.well-known`.
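
The `.well-known` delegation mentioned above comes down to two small JSON documents served over HTTPS from the server-name domain. The sketch below generates them; `well_known_documents` is a hypothetical helper, and the default of port 443 is an assumption that fits a setup where the reverse proxy also carries federation traffic.

```python
import json


def well_known_documents(delegated_host: str,
                         federation_port: int = 443) -> dict[str, str]:
    """Return URL path -> JSON body for Matrix server and client delegation."""
    # Tells other homeservers where to send federation traffic.
    server = {"m.server": f"{delegated_host}:{federation_port}"}
    # Tells clients which base URL to use for the client-server API.
    client = {"m.homeserver": {"base_url": f"https://{delegated_host}"}}
    return {
        "/.well-known/matrix/server": json.dumps(server),
        "/.well-known/matrix/client": json.dumps(client),
    }


if __name__ == "__main__":
    for path, body in well_known_documents("matrix.example.org").items():
        print(path, body)
```

With delegation to port 443, remote servers do not need to reach port 8448 at all, which is one reason the short server name and the `matrix.` subdomain can coexist.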

---

## Decision 3: Reverse Proxy / TLS

### Recommended Choice
**Nginx** (already on 80/443) reverse-proxies to Conduit; Let’s Encrypt for TLS.
### Two Viable Patterns

#### Pattern A: Nginx → Conduit directly (Recommended)
```
Internet → Nginx (443) → Conduit (6167 internal)
Internet → Nginx (8448) → Conduit (8448 internal)
```
- Nginx handles TLS termination.
- Conduit runs plain HTTP on an internal port.
- Federation port 8448 is exposed through Nginx stream or server block.
#### Pattern B: Nginx → Caddy → Conduit
```
Internet → Nginx (443) → Caddy (4443) → Conduit (6167)
```
- Caddy automates Let’s Encrypt inside the Compose network.
- Nginx remains the edge listener.
- More moving parts, but Caddy’s automatic certificate management is convenient.
### Comparison

| Concern | Pattern A (Nginx direct) | Pattern B (Nginx → Caddy) |
|---------|--------------------------|---------------------------|
| Moving parts | Fewer | More |
| TLS automation | Manual certbot or certbot-nginx | Caddy handles it |
| Config complexity | Medium | Medium-High |
| Debuggability | Easier (one proxy hop) | Harder (two hops) |
| Aligns with existing Nginx | ✅ Yes | ⚠️ Needs extra upstream |
### Ezra Recommendation
**Pattern A** for initial deployment. Nginx is already the edge proxy on Hermes VPS. Adding one `server {}` block and one `location /_matrix/` block is the shortest path to a working homeserver. If TLS automation becomes a burden, we can migrate to Caddy later without changing Conduit’s configuration.
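
Whichever pattern is chosen, the proxy is working once the standard Matrix endpoints answer through it. The following sketch probes the client API version endpoint defined by the Matrix specification; the helper names and the example base URL are assumptions, and this is not one of the scaffold's own verification scripts.

```python
import json
import urllib.request

CLIENT_VERSIONS_PATH = "/_matrix/client/versions"


def endpoint_url(base_url: str, path: str) -> str:
    """Join a base URL and an absolute path without doubling the slash."""
    return base_url.rstrip("/") + path


def federation_version_url(host: str, port: int = 8448) -> str:
    """URL of the federation version endpoint on the given host and port."""
    return f"https://{host}:{port}/_matrix/federation/v1/version"


def supported_versions(payload: bytes) -> list[str]:
    """Extract the 'versions' list from a /_matrix/client/versions response."""
    data = json.loads(payload)
    versions = data.get("versions")
    if not isinstance(versions, list):
        raise ValueError("response is missing a 'versions' list")
    return versions


def probe_client_api(base_url: str, timeout: float = 5.0) -> list[str]:
    """Fetch the client versions endpoint through the proxy (needs network)."""
    url = endpoint_url(base_url, CLIENT_VERSIONS_PATH)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return supported_versions(resp.read())


# Example (placeholder host): probe_client_api("https://matrix.example.org")
```

An empty or missing list, a TLS error, or a timeout each point to a different layer (Conduit, certificate, or Nginx routing), which helps with the single-hop debugging that Pattern A is chosen for.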

---

## Pre-Deployment Checklist (Post-#187)

Once the decisions above are ratified, the exact execution sequence is:

1. **Install Docker Compose** on Hermes VPS (if not already present).
2. **Create DNS A record** for `matrix.alexanderwhitestone.com` → Hermes VPS public IP.
3. **Obtain TLS certificate** for `matrix.alexanderwhitestone.com` (certbot or manual).
4. **Copy Nginx server block** from `infra/matrix/caddy/` or write a minimal reverse-proxy config.
5. **Run `./host-readiness-check.sh`** and confirm all checks pass.
6. **Run `./deploy-matrix.sh`** and wait for Conduit to come online.
7. **Run `python3 scripts/bootstrap-fleet-rooms.py --create-all`** to initialize rooms.
8. **Run `./scripts/verify-hermes-integration.sh`** to prove E2EE messaging works.
9. **Follow `docs/matrix-fleet-comms/CUTOVER_PLAN.md`** for the Telegram → Matrix transition.
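
Steps 5 through 8 are script invocations, so they can be chained with a stop-on-first-failure rule. The runner below is a sketch of that idea; `run_steps` and `DEPLOY_STEPS` are hypothetical, and the commands are assumed to be run from the directory that contains the scripts.

```python
import subprocess
import sys
from typing import Sequence


def run_steps(steps: Sequence[tuple[str, Sequence[str]]]) -> list[str]:
    """Run steps in order, stop at the first failure, return the names that passed."""
    completed = []
    for name, argv in steps:
        result = subprocess.run(list(argv))
        if result.returncode != 0:
            print(f"step failed: {name} (exit {result.returncode})", file=sys.stderr)
            break
        completed.append(name)
    return completed


# Commands copied from checklist steps 5-8; the working directory is assumed.
DEPLOY_STEPS = [
    ("host readiness", ["./host-readiness-check.sh"]),
    ("deploy", ["./deploy-matrix.sh"]),
    ("bootstrap rooms", ["python3", "scripts/bootstrap-fleet-rooms.py", "--create-all"]),
    ("verify integration", ["./scripts/verify-hermes-integration.sh"]),
]
```

Calling `run_steps(DEPLOY_STEPS)` returns the names of the steps that succeeded, so a partial result shows exactly where the sequence stopped. A missing script raises `FileNotFoundError` rather than counting as a failed step.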

---

## Accountability Matrix

| Decision | Recommended Option | Decision Owner | Execution Owner |
|----------|-------------------|----------------|-----------------|
| Host | Hermes VPS | @allegro / @timmy | @ezra |
| Domain | `matrix.alexanderwhitestone.com` | @rockachopa | @ezra |
| Proxy/TLS | Nginx direct (Pattern A) | @ezra / @allegro | @ezra |

---
## Ezra Stance

#166 has been reduced from a fuzzy epic to a **three-decision, nine-step execution**. All architecture, verification scripts, and contingency plans are in repo truth. The only missing ingredient is a yes/no on the three decisions above.

— Ezra, Archivist