[ezra] Operational runbook for Matrix infrastructure (#166, #183)

2026-04-05 05:10:56 +00:00
parent fde5db2802
commit 9687975a1b


@@ -0,0 +1,119 @@
# Matrix/Conduit Operational Runbook
This document contains operational procedures for the Timmy Foundation Matrix infrastructure.
## Quick Reference
| Task | Command |
|------|---------|
| Start server | `cd infra/matrix/conduit && docker compose up -d` |
| View logs | `cd infra/matrix/conduit && docker compose logs -f` |
| Create admin account | `./scripts/deploy-conduit.sh admin` |
| Backup data | `./scripts/deploy-conduit.sh backup` |
| Check status | `./scripts/deploy-conduit.sh status` |
## Initial Setup Checklist
- [ ] DNS A record pointing to host IP (matrix.yourdomain.com → host)
- [ ] DNS SRV record for federation (_matrix._tcp → matrix.yourdomain.com:443)
- [ ] Docker and Docker Compose installed
- [ ] `.env` file configured with real values
- [ ] Ports 80, 443, 8448 open in firewall
- [ ] Run `./scripts/deploy-conduit.sh install`
- [ ] Run `./scripts/deploy-conduit.sh start`
- [ ] Create admin account immediately
- [ ] Disable registration in `.env` and restart
- [ ] Test with Element Web or other client
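The tooling items in the checklist above can be verified before running the install step. The following is a minimal sketch, not part of `deploy-conduit.sh`; the helper names `require_cmds` and `require_env_file` are hypothetical.

```bash
# Hypothetical preflight helpers; deploy-conduit.sh may already do similar checks.

# Succeeds only if every named command is on PATH; reports each missing one.
require_cmds() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Succeeds only if the given env file exists and is non-empty.
require_env_file() {
  [ -s "$1" ]
}
```

A possible invocation is `require_cmds docker curl && require_env_file .env`, followed by `docker compose version` to confirm that the Compose plugin is installed.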
## Account Creation (One-Time)
**IMPORTANT**: Only enable registration during initial admin account creation.
1. Set `CONDUIT_ALLOW_REGISTRATION=true` in `.env`
2. Set `CONDUIT_REGISTRATION_TOKEN` to a random secret
3. Restart: `./scripts/deploy-conduit.sh restart`
4. Create account:
```bash
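./scripts/deploy-conduit.sh admin
# Inside container:
# NOTE: register_new_matrix_user is a Synapse utility and may be absent from the
# Conduit image. If it is, register through a client with the registration token;
# Conduit grants admin rights to the first account created.
register_new_matrix_user -c /var/lib/matrix-conduit -u admin -p YOUR_PASS -a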
```
5. Set `CONDUIT_ALLOW_REGISTRATION=false` and restart
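Step 2 asks for a random secret without saying how to produce one. One option, sketched here with a hypothetical helper name `gen_registration_token`, is to read 32 bytes from `/dev/urandom` and print them as hex:

```bash
# Print a 64-character lowercase hex secret built from 32 random bytes.
gen_registration_token() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

The printed value can then be pasted into `.env` as the value of `CONDUIT_REGISTRATION_TOKEN`.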
## Federation Troubleshooting
Federation allows your server to communicate with other Matrix servers, such as matrix.org.
### Verify Federation Works
```bash
# Unauthenticated endpoint; most other federation endpoints (including query/directory) require signed requests
curl "https://matrix.yourdomain.com/_matrix/federation/v1/version"
```
### Required
- DNS SRV: `_matrix._tcp.yourdomain.com. IN SRV 10 0 443 matrix.yourdomain.com.`
- Or `.well-known/matrix/server` served over HTTPS (port 443) on `yourdomain.com`
- Port 8448 reachable (Caddy handles this); this is the default federation port when no delegation is in place
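For the `.well-known` option, the delegation document is a small JSON object with a single `m.server` key. The sketch below writes it under a web root; the function name and the web root argument are assumptions, since how Caddy serves static files in this deployment is not described here.

```bash
# Write a .well-known/matrix/server delegation document.
# $1 = web root directory (hypothetical; depends on the Caddy setup)
# $2 = delegated federation endpoint as host:port
write_well_known_server() {
  mkdir -p "$1/.well-known/matrix" &&
    printf '{"m.server": "%s"}\n' "$2" > "$1/.well-known/matrix/server"
}
```

For example, `write_well_known_server /srv/www matrix.yourdomain.com:443` (with `/srv/www` as a placeholder path) produces a file that remote servers would fetch from `https://yourdomain.com/.well-known/matrix/server`.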
## Backup and Recovery
### Automated Daily Backup (cron)
```bash
0 2 * * * /path/to/timmy-config/infra/matrix/scripts/deploy-conduit.sh backup
```
### Restore from Backup
```bash
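# Run from infra/matrix, the directory containing scripts/ and conduit/
./scripts/deploy-conduit.sh stop
# Replace the data directory contents in a subshell so the working directory is unchanged
(cd conduit && rm -rf data/* && tar xzf /path/to/backup.tar.gz)
./scripts/deploy-conduit.sh start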
```
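Because the restore procedure deletes the existing data before extracting, it may be worth confirming that the archive is readable first. A minimal sketch, with a hypothetical helper name `verify_backup`:

```bash
# Succeeds only if the file exists, is non-empty, and lists cleanly
# as a gzip-compressed tar archive.
verify_backup() {
  [ -s "$1" ] && tar tzf "$1" >/dev/null 2>&1
}
```

Running `verify_backup /path/to/backup.tar.gz || exit 1` before the stop step aborts the restore while the current data is still in place.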
## Monitoring
### Health Endpoint
```bash
curl http://localhost:6167/_matrix/client/versions
```
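After a start or restart the server may take a few seconds to answer. The health check can be wrapped in a polling loop; this is a sketch with a hypothetical function name and arbitrary default values for attempts and delay.

```bash
# Poll a URL until it returns an HTTP success status, or give up.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_http http://localhost:6167/_matrix/client/versions` returns 0 once the endpoint responds and 1 if it never does within the attempt limit.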
### Prometheus Metrics
Enable in `.env`: `CONDUIT_ENABLE_METRICS=true`
Metrics available at: `http://localhost:6167/_matrix/metrics`
## Disabling Federation
If you don't need federation (standalone server), set `CONDUIT_ALLOW_FEDERATION=false` in `.env` and restart.
## Matrix Client Configuration
### Element Web (Self-Hosted)
Create `element-config.json`:
```json
{
  "default_server_config": {
    "m.homeserver": {
      "base_url": "https://matrix.yourdomain.com",
      "server_name": "yourdomain.com"
    }
  }
}
```
### Element Desktop/Mobile
- Homeserver URL: `https://matrix.yourdomain.com`
- User ID: `@username:yourdomain.com`
## Security Hardening
- [ ] Fail2ban on SSH and HTTP
- [ ] Keep Docker images updated: `docker compose pull && docker compose up -d`
- [ ] Review Caddy logs for abuse
- [ ] Disable registration after admin creation
- [ ] Use strong admin password
- [ ] Store backups encrypted
## Related Issues
- Epic: timmy-config#166
- Scaffold: timmy-config#183
- Parent Epic: timmy-config#173 (Unified Comms)