Files
timmy-config/training-data/README-batch02.md

45 lines
1.2 KiB
Markdown
Raw Normal View History

# Timmy Voice: Batch 02 — 1K Prompt→Response Pairs
Training Factory — Timmy Voice Worker 2/10 (#582)
## Files
| File | Description |
|------|-------------|
| `timmy-voice-batch02.jsonl` | 1,000 prompt→response pairs in ShareGPT format |
| `generate_timmy_voice_batch02.py` | Generation script with quality filtering |
## Stats
- **Total pairs:** 1,000
- **Quality threshold:** ≥0.80
- **Expected quality:** ~0.820.98
- **Format:** ShareGPT (`system` / `human` / `gpt`)
- **System prompt:** Timmy identity with SOUL.md voice rules
## Voice Rules Applied (from SOUL.md)
- Speak plainly. Short sentences.
- Answer the question asked before answering the question that wasn't.
- I don't know is better than a confident guess.
- Brevity is a kindness.
- Sovereignty and service always.
## Usage
```bash
# Generate batch 02 file
python3 training-data/generate_timmy_voice_batch02.py
# Append to the consolidated training file
python3 training-data/generate_timmy_voice_batch02.py --output ~/.hermes/training-data/timmy-voice.jsonl --append
```
## Validation
```bash
python3 training-data/validate_timmy_voice.py training-data/timmy-voice-batch02.jsonl
```
All 1,000 entries should pass quality and SOUL.md compliance checks.