ezra-environment/the-nexus/deepdive/docs/OPERATIONS.md
Ezra 9f010ad044 [BURN] Deep Dive scaffold: 5-phase sovereign NotebookLM (#830)
Complete production-ready scaffold for automated daily AI intelligence briefings:

- Phase 1: Source aggregation (arXiv + lab blogs)
- Phase 2: Relevance ranking (keyword + source authority scoring)
- Phase 3: LLM synthesis (Hermes-context briefing generation)
- Phase 4: TTS audio (edge-tts/OpenAI/ElevenLabs)
- Phase 5: Telegram delivery (voice message)

Deliverables:
- docs/ARCHITECTURE.md (9000+ lines) - system design
- docs/OPERATIONS.md - runbook and troubleshooting
- 5 executable phase scripts (bin/)
- Full pipeline orchestrator (run_full_pipeline.py)
- requirements.txt, README.md

Addresses all 9 acceptance criteria from #830.
Ready for host selection, credential config, and cron activation.

Author: Ezra | Burn mode | 2026-04-05

# Deep Dive Operations Runbook
**Issue**: the-nexus#830
**Maintainer**: Operations team (post-deployment)
---
## Quick Start
```bash
# 1. Install dependencies
cd deepdive && pip install -r requirements.txt
# 2. Configure environment
cp config/.env.example config/.env
# Edit config/.env with your API keys
# 3. Test full pipeline
./bin/run_full_pipeline.py --date=$(date +%Y-%m-%d) --dry-run
# 4. Run for real
./bin/run_full_pipeline.py
```
---
## Daily Operations
### Manual Run (On-Demand)
```bash
# Run full pipeline for today
./bin/run_full_pipeline.py
# Run specific phases
./bin/run_full_pipeline.py --phases 1,2 # Just aggregate and rank
./bin/run_full_pipeline.py --phase3-only # Regenerate briefing
```
### Cron Setup (Scheduled)
```bash
# Edit crontab
crontab -e
# Run daily at 6 AM (assumes the server clock is set to EST)
0 6 * * * /opt/deepdive/bin/run_full_pipeline.py >> /var/log/deepdive.log 2>&1
```
Systemd timer alternative:
```bash
sudo cp config/deepdive.service /etc/systemd/system/
sudo cp config/deepdive.timer /etc/systemd/system/
sudo systemctl enable deepdive.timer
sudo systemctl start deepdive.timer
```
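The shipped `config/deepdive.service` and `config/deepdive.timer` files are not reproduced in this runbook; as a reference point, a minimal timer unit for the same daily 6 AM schedule might look like this (sketch only, unit contents assumed):

```ini
# deepdive.timer (sketch) -- pairs with a deepdive.service that runs
# /opt/deepdive/bin/run_full_pipeline.py
[Unit]
Description=Daily Deep Dive briefing

[Timer]
OnCalendar=*-*-* 06:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

`Persistent=true` tells systemd to fire a missed run at the next boot, which plain cron does not do.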
---
## Monitoring
### Check Today's Run
```bash
# View logs
tail -f /var/log/deepdive.log
# Check data directories
ls -la data/sources/$(date +%Y-%m-%d)/
ls -la data/briefings/
ls -la data/audio/
# Verify Telegram delivery
curl -s "https://api.telegram.org/bot${TOKEN}/getUpdates" | jq '.result[-1]'
```
### Common Issues
| Issue | Cause | Fix |
|-------|-------|-----|
| No sources aggregated | arXiv API down | Wait and retry; check https://status.arxiv.org |
| Empty briefing | No relevant sources | Lower relevance threshold in config |
| TTS fails | No API credits | Switch to `edge-tts` (free) |
| Telegram not delivering | Bot token invalid | Regenerate bot token via @BotFather |
| Audio too long | Briefing too verbose | Reduce `max_chars` in the phase-4 config |
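
Transient failures (the "arXiv API down" row above) are worth retrying automatically rather than losing a day's run. A generic exponential-backoff sketch, not the pipeline's actual code:

```python
import time

def fetch_with_retry(fetch, attempts=4, base_delay=2.0):
    """Call `fetch` (a zero-arg callable), retrying with exponential
    backoff on failure; re-raises the last error if all attempts fail."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:  # narrow to e.g. urllib.error.URLError in real use
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```

Wrapping the phase-1 HTTP calls this way turns a brief arXiv outage into a delay instead of an empty `data/sources/` directory.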
---
## Configuration
### Source Management
Edit `config/sources.yaml`:
```yaml
sources:
  arxiv:
    categories:
      - cs.AI
      - cs.CL
      - cs.LG
    max_items: 50
  blogs:
    openai: https://openai.com/blog/rss.xml
    anthropic: https://www.anthropic.com/news.atom
    deepmind: https://deepmind.google/blog/rss.xml
    max_items_per_source: 10
  newsletters:
    - name: "Import AI"
      email_filter: "importai@jack-clark.net"
```
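For the arXiv entry, the aggregator presumably turns `categories` and `max_items` into an arXiv API query. A stdlib-only sketch of that URL construction (the real phase-1 code may differ; `arxiv_query_url` is a hypothetical helper):

```python
from urllib.parse import urlencode

def arxiv_query_url(categories, max_items):
    """Build an arXiv API query URL from the sources.yaml fields above."""
    # OR the categories together, newest submissions first.
    search = " OR ".join(f"cat:{c}" for c in categories)
    params = urlencode({
        "search_query": search,
        "max_results": max_items,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    })
    return f"http://export.arxiv.org/api/query?{params}"
```

For example, `arxiv_query_url(["cs.AI", "cs.CL", "cs.LG"], 50)` yields a single request covering all three configured categories.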
### Relevance Tuning
Edit `config/relevance.yaml`:
```yaml
keywords:
  hermes: 3.0   # Boost Hermes mentions
  agent: 1.5
  mcp: 2.0
thresholds:
  min_score: 2.0   # Drop items below this
  max_items: 20    # Top N to keep
```
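The scoring these settings drive can be sketched as keyword-weight summation plus a threshold cut. This mirrors `relevance.yaml` but is illustrative only; the real ranker also factors in source authority:

```python
def relevance_score(text, keywords):
    """Sum the weights of every configured keyword found in the text."""
    lowered = text.lower()
    return sum(w for kw, w in keywords.items() if kw in lowered)

def rank(items, keywords, min_score=2.0, max_items=20):
    """Score items, drop those below min_score, keep the top max_items."""
    scored = sorted(
        ((relevance_score(t, keywords), t) for t in items),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [t for s, t in scored if s >= min_score][:max_items]
```

Raising `min_score` trims marginal items; raising a keyword weight pulls matching items toward the top of the briefing.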
### LLM Selection
Environment variable:
```bash
export DEEPDIVE_LLM_MODEL="openai/gpt-4o-mini"
# or
export DEEPDIVE_LLM_MODEL="anthropic/claude-3-haiku"
# or
export DEEPDIVE_LLM_MODEL="hermes/local"
```
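The `provider/model` convention is inferred from the examples above; splitting it for dispatch might look like this (`llm_selection` is a hypothetical helper, not a scaffold API):

```python
import os

def llm_selection(default="openai/gpt-4o-mini"):
    """Split DEEPDIVE_LLM_MODEL ("provider/model") into its two parts."""
    value = os.environ.get("DEEPDIVE_LLM_MODEL", default)
    provider, _, model = value.partition("/")
    return provider, model
```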
### TTS Selection
Environment variable:
```bash
export DEEPDIVE_TTS_PROVIDER="edge-tts" # Free, recommended
# or
export DEEPDIVE_TTS_PROVIDER="openai" # Requires OPENAI_API_KEY
# or
export DEEPDIVE_TTS_PROVIDER="elevenlabs" # Best quality
```
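A reasonable phase-4 behavior is to honor `DEEPDIVE_TTS_PROVIDER` when its credentials work, and otherwise fall back down the quality/cost ladder to the free `edge-tts`. A sketch (hypothetical helper; the shipped script may behave differently):

```python
import os

# Descending preference; edge-tts is the free fallback of last resort.
_TTS_FALLBACK = ["elevenlabs", "openai", "edge-tts"]

def pick_tts_provider(available):
    """Return the configured provider if usable, else the best available
    fallback. `available` is the set of providers with working credentials."""
    configured = os.environ.get("DEEPDIVE_TTS_PROVIDER", "edge-tts")
    if configured in available:
        return configured
    for provider in _TTS_FALLBACK:
        if provider in available:
            return provider
    raise RuntimeError("no TTS provider available")
```

This matches the "TTS fails / no API credits" row in the troubleshooting table: the run degrades to `edge-tts` instead of failing outright.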
---
## Telegram Bot Setup
1. **Create Bot**: Message @BotFather, create a new bot, and save the token it returns
2. **Get Chat ID**: Message bot, then:
```bash
curl https://api.telegram.org/bot<TOKEN>/getUpdates
```
3. **Configure**:
```bash
export DEEPDIVE_TELEGRAM_BOT_TOKEN="<token>"
export DEEPDIVE_TELEGRAM_CHAT_ID="<chat_id>"
```
---
## Maintenance
### Weekly
- [ ] Check disk space in `data/` directory
- [ ] Review log for errors: `grep ERROR /var/log/deepdive.log`
- [ ] Verify cron/timer is running: `systemctl status deepdive.timer`
### Monthly
- [ ] Archive old audio: `find data/audio -mtime +30 -exec gzip {} \;`
- [ ] Review source quality: are rankings accurate?
- [ ] Update API keys if approaching limits
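
If you prefer to drive the audio-archiving step from Python (e.g. inside a maintenance script) rather than the `find` one-liner above, the same effect (gzip files older than 30 days, then delete the originals) looks like:

```python
import gzip
import shutil
import time
from pathlib import Path

def archive_old_audio(audio_dir, max_age_days=30):
    """Gzip .mp3 files older than max_age_days and remove the originals."""
    cutoff = time.time() - max_age_days * 86400
    for mp3 in Path(audio_dir).glob("*.mp3"):
        if mp3.stat().st_mtime < cutoff:
            with mp3.open("rb") as src, gzip.open(f"{mp3}.gz", "wb") as dst:
                shutil.copyfileobj(src, dst)
            mp3.unlink()  # keep only the compressed copy
```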
---
## Troubleshooting
### Debug Mode
Run phases individually with verbose output:
```bash
# Phase 1 with verbose
python -c "
import asyncio
from bin.phase1_aggregate import SourceAggregator
from pathlib import Path
agg = SourceAggregator(Path('data'), '2026-04-05')
asyncio.run(agg.run())
"
```
### Reset State
Delete and regenerate:
```bash
rm -rf data/sources/2026-04-*
rm -rf data/ranked/*.json
rm -rf data/briefings/*.md
rm -rf data/audio/*.mp3
```
### Test Telegram
```bash
curl -X POST \
https://api.telegram.org/bot<TOKEN>/sendMessage \
-d chat_id=<CHAT_ID> \
-d text="Deep Dive test message"
```
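The same check from Python, stdlib only, using the Bot API's `sendMessage` method (`build_send_message` is an illustrative helper, not part of the scaffold):

```python
import json
from urllib import parse, request

def build_send_message(token, chat_id, text):
    """Build the sendMessage request; returns (url, form-encoded body)."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    return url, body

def send(token, chat_id, text):
    """POST the message and return Telegram's JSON response."""
    url, body = build_send_message(token, chat_id, text)
    with request.urlopen(request.Request(url, data=body)) as resp:
        return json.load(resp)
```

A successful `send(...)` returns a JSON object with `"ok": true`; anything else points at the token or chat ID.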
---
## Security
- API keys stored in `config/.env` (gitignored)
- `.env` file permissions: `chmod 600 config/.env`
- Telegram bot token: regenerate if compromised
- LLM API usage: monitor for unexpected spend
---
**Issue Ref**: #830
**Last Updated**: 2026-04-05 by Ezra