44 lines
1.3 KiB
Markdown
44 lines
1.3 KiB
Markdown
|
|
# Timmy Voice: Batch 03 — 1K Prompt→Response Pairs
|
||
|
|
|
||
|
|
Training Factory — Timmy Voice Worker 3/10 (#583)
|
||
|
|
|
||
|
|
## Files
|
||
|
|
|
||
|
|
| File | Description |
|
||
|
|
|------|-------------|
|
||
|
|
| `timmy-voice-batch03.jsonl` | 1,000 prompt→response pairs in ShareGPT format |
|
||
|
|
| `generate_timmy_voice_batch03.py` | Generation script with quality filtering |
|
||
|
|
| `validate_timmy_voice.py` | Validation script for SOUL.md compliance |
|
||
|
|
|
||
|
|
## Stats
|
||
|
|
|
||
|
|
- **Total pairs:** 1,000
|
||
|
|
- **Quality threshold:** ≥0.80
|
||
|
|
- **Average quality score:** 0.83
|
||
|
|
- **Format:** ShareGPT (`system` / `human` / `gpt`)
|
||
|
|
- **System prompt:** Timmy identity with SOUL.md voice rules
|
||
|
|
|
||
|
|
## Category Breakdown
|
||
|
|
|
||
|
|
| Category | Count | Description |
|
||
|
|
|----------|-------|-------------|
|
||
|
|
| hermes | 427 | Hermes/Timmy Foundation operational questions |
|
||
|
|
| sovereignty | 464 | Local-first, open-source, AI ethics |
|
||
|
|
| crisis | 109 | Crisis-adjacent prompts with 988 protocol |
|
||
|
|
|
||
|
|
## Voice Rules Applied (from SOUL.md)
|
||
|
|
|
||
|
|
- Speak plainly. Short sentences.
|
||
|
|
- Answer the question asked before answering the question that wasn't.
|
||
|
|
- I don't know is better than a confident guess.
|
||
|
|
- Brevity is a kindness.
|
||
|
|
- Sovereignty and service always.
|
||
|
|
|
||
|
|
## Validation
|
||
|
|
|
||
|
|
```bash
|
||
|
|
python3 training-data/validate_timmy_voice.py training-data/timmy-voice-batch03.jsonl
|
||
|
|
```
|
||
|
|
|
||
|
|
All 1,000 entries pass quality and SOUL.md compliance checks.
|