# Timmy VPS Setup Guide

Complete guide for provisioning a sovereign Timmy wizard VPS.

## Prerequisites

- Fresh Ubuntu 22.04 or 24.04 VPS
- Root SSH access
- At least 4GB RAM, 20GB disk
- Internet connection

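A quick preflight check against these requirements can be sketched as follows; this is a hypothetical helper, not part of the provisioning script:

```shell
# Hypothetical preflight check mirroring the prerequisites above
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)    # total RAM in KiB
avail_kb=$(df -k --output=avail / | tail -n 1)          # free disk on / in KiB
[ "$mem_kb" -ge $((4 * 1024 * 1024)) ]   && echo "RAM: OK"  || echo "RAM: below 4GB"
[ "$avail_kb" -ge $((20 * 1024 * 1024)) ] && echo "Disk: OK" || echo "Disk: below 20GB"
```
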
## Quick Start

```bash
# Download and run provisioning script
curl -sL https://raw.githubusercontent.com/Timmy_Foundation/timmy-home/main/scripts/provision-timmy-vps.sh | bash
```

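Piping a remote script straight into `bash` executes it sight unseen. A more cautious variant (same URL as above) downloads the script first so it can be reviewed:

```shell
# Download the provisioning script, inspect it, then run it
url="https://raw.githubusercontent.com/Timmy_Foundation/timmy-home/main/scripts/provision-timmy-vps.sh"
curl -sL -o provision-timmy-vps.sh "$url"
less provision-timmy-vps.sh   # review before executing
bash provision-timmy-vps.sh
```
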
## What Gets Installed

| Component | Purpose | Port |
|-----------|---------|------|
| llama.cpp | Local inference | 8081 (localhost only) |
| Python venv | Agent environment | - |
| timmy-home | Agent scripts | - |
| Syncthing | File sync | 22000 |
| UFW | Firewall | - |

## Directory Structure

```
~/timmy/
├── models/      # AI model weights
├── soul/        # Conscience files (SOUL.md)
├── scripts/     # Operational scripts
├── logs/        # Agent logs
├── shared/      # Syncthing shared folder
├── configs/     # Configuration files
└── timmy-home/  # Repository clone
```

## Services

### llama-server

Local inference server (CPU-only with OpenBLAS)

```bash
systemctl status llama-server
systemctl restart llama-server
journalctl -u llama-server -f
```
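For reference, a minimal unit file behind these commands might look like the sketch below. The paths and flags are assumptions, not the script's actual unit (`-m`, `--host`, and `--port` are standard `llama-server` flags):

```
[Unit]
Description=llama.cpp inference server
After=network.target

[Service]
ExecStart=/root/timmy/llama-server -m /root/timmy/models/model.gguf --host 127.0.0.1 --port 8081
Restart=on-failure

[Install]
WantedBy=multi-user.target
```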

### timmy-agent

Agent harness that calls the local inference server

```bash
|
||
|
|
systemctl status timmy-agent
|
||
|
|
systemctl restart timmy-agent
|
||
|
|
```

### syncthing

File synchronization between VPS nodes

```bash
|
||
|
|
systemctl status syncthing@root
|
||
|
|
```

## Testing

### Check Inference

```bash
|
||
|
|
curl http://127.0.0.1:8081/health
|
||
|
|
```
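Right after a (re)start the model may still be loading into RAM, so a single probe can fail. A small wait loop (hypothetical, giving the server up to about a minute) is more robust:

```shell
# Poll /health until the server reports ready (or give up after 30 tries)
for i in $(seq 1 30); do
  curl -sf http://127.0.0.1:8081/health >/dev/null && { echo "ready"; break; }
  sleep 2
done
```
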

### Test Completion

```bash
|
||
|
|
curl -X POST http://127.0.0.1:8081/completion \
|
||
|
|
-H "Content-Type: application/json" \
|
||
|
|
-d '{"prompt": "Hello, I am", "max_tokens": 10}'
|
||
|
|
```
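The response is JSON with the generated text in a `content` field. To print just that text, using only Python's standard library:

```shell
# Extract only the generated text from the /completion JSON response
curl -s -X POST http://127.0.0.1:8081/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, I am", "n_predict": 10}' \
  | python3 -c 'import sys, json; print(json.load(sys.stdin)["content"])'
```
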

### System Status

```bash
|
||
|
|
~/timmy/scripts/status.sh
|
||
|
|
```

## Security

- **Port 8081** (inference): localhost only, never exposed
- **Port 22000** (Syncthing): open for P2P sync
- **Port 22** (SSH): standard access
- **UFW**: all other ports blocked by default

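A UFW rule set matching this policy can be sketched as follows; these are assumed commands, and the provisioning script's actual rules may differ:

```shell
ufw default deny incoming   # block everything not explicitly allowed
ufw allow 22/tcp            # SSH
ufw allow 22000/tcp         # Syncthing P2P sync
ufw enable                  # port 8081 stays closed: inference is localhost-only
```
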
## Troubleshooting

| Issue | Solution |
|-------|----------|
| llama-server won't start | Check model exists: `ls ~/timmy/models/` |
| Out of memory | Use smaller GGUF (Q4_K_S instead of Q4_K_M) |
| Syncthing not syncing | Check firewall: `ufw status` |
| Slow inference | Ensure OpenBLAS is working: `ldd ~/timmy/llama-server \| grep blas` |

## Manual Model Download

If automatic download fails:

```bash
|
||
|
|
cd ~/timmy/models
|
||
|
|
wget https://huggingface.co/TheBloke/Hermes-3-Llama-3.1-8B-GGUF/resolve/main/hermes-3-llama-3.1-8b.Q4_K_M.gguf
|
||
|
|
systemctl restart llama-server
|
||
|
|
```
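A failed or redirected download often leaves a small HTML error page where the model should be. Two quick sanity checks before restarting the server (filename as above; GGUF files start with the 4-byte magic `GGUF`):

```shell
f=~/timmy/models/hermes-3-llama-3.1-8b.Q4_K_M.gguf
head -c 4 "$f"    # a valid model prints "GGUF"
stat -c %s "$f"   # a Q4_K_M 8B quant should be several GB, not a few KB
```
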

## Uninstall

```bash
|
||
|
|
systemctl stop llama-server timmy-agent syncthing@root
|
||
|
|
systemctl disable llama-server timmy-agent syncthing@root
|
||
|
|
rm -rf ~/timmy
|
||
|
|
```
|