feat: Timmy Voice Batch 03 — 1K prompt→response pairs (#583)
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 26s
Smoke Test / smoke (pull_request) Failing after 23s
Validate Config / YAML Lint (pull_request) Failing after 17s
Validate Config / JSON Validate (pull_request) Successful in 21s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 1m11s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 1m11s
Validate Config / Cron Syntax Check (pull_request) Successful in 15s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 15s
Validate Training Data / validate (pull_request) Successful in 24s
Validate Config / Playbook Schema Validation (pull_request) Successful in 28s
Architecture Lint / Lint Repository (pull_request) Failing after 25s
PR Checklist / pr-checklist (pull_request) Successful in 6m4s

- Add training-data/timmy-voice-batch03.jsonl (1,000 pairs, ShareGPT format)
- Add training-data/generate_timmy_voice_batch03.py (deterministic generator)
- Add training-data/validate_timmy_voice.py (SOUL.md compliance checker)
- Add training-data/README-batch03.md (batch documentation)

All pairs quality score ≥ 0.80, avg 0.83.
Categories: hermes (427), sovereignty (464), crisis (109).
Crisis prompts include 988 protocol.

Closes #583
This commit is contained in:
Alexander Whitestone
2026-04-22 02:58:23 -04:00
parent b711b0e0b6
commit 16902d05b2
4 changed files with 1667 additions and 0 deletions

View File

@@ -0,0 +1,43 @@
# Timmy Voice: Batch 03 — 1K Prompt→Response Pairs
Training Factory — Timmy Voice Worker 3/10 (#583)
## Files
| File | Description |
|------|-------------|
| `timmy-voice-batch03.jsonl` | 1,000 prompt→response pairs in ShareGPT format |
| `generate_timmy_voice_batch03.py` | Generation script with quality filtering |
| `validate_timmy_voice.py` | Validation script for SOUL.md compliance |
## Stats
- **Total pairs:** 1,000
- **Quality threshold:** ≥0.80
- **Average quality score:** 0.83
- **Format:** ShareGPT (`system` / `human` / `gpt`)
- **System prompt:** Timmy identity with SOUL.md voice rules
## Category Breakdown
| Category | Count | Description |
|----------|-------|-------------|
| hermes | 427 | Hermes/Timmy Foundation operational questions |
| sovereignty | 464 | Local-first, open-source, AI ethics |
| crisis | 109 | Crisis-adjacent prompts with 988 protocol |
## Voice Rules Applied (from SOUL.md)
- Speak plainly. Short sentences.
- Answer the question asked before answering the question that wasn't.
- I don't know is better than a confident guess.
- Brevity is a kindness.
- Sovereignty and service always.
## Validation
```bash
python3 training-data/validate_timmy_voice.py training-data/timmy-voice-batch03.jsonl
```
All 1,000 entries pass quality and SOUL.md compliance checks.