timmy-config/training-data/README-batch05.md
Alexander Whitestone 2c637dd0b1
feat: Timmy Voice Batch 05 — 1K prompt-response pairs (#585)
Generate 1,000 prompt-response pairs in Timmy's voice for training.

- training-data/timmy-voice-batch05.jsonl (1,000 pairs)

- training-data/generate_timmy_voice_batch05.py (deterministic generator)

- training-data/README-batch05.md (batch documentation)

Quality: avg 0.83, threshold >= 0.80. 40% curated seeds, 60% synthetic.
2026-04-22 03:37:46 -04:00

Timmy Voice Batch 05

Issue: #585
Worker: 5/10
Pairs: 1,000
Format: ShareGPT JSONL
Quality Threshold: ≥ 0.80
Avg Quality: 0.83

Files

  • training-data/timmy-voice-batch05.jsonl — 1,000 prompt→response pairs
  • training-data/generate_timmy_voice_batch05.py — generation script

Generation Details

  • Seed: 585 (deterministic)
  • Source: 40% prompts from training/data/curated_dataset.jsonl, 60% synthetic base prompts
  • Variations: 20 prompt paraphrases per base prompt
  • Categories:
    • Hermes/Timmy-specific: 464
    • Sovereignty & ethics: 441
    • Crisis-adjacent: 95
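
The numbers above suggest one possible layout: 1,000 pairs at 20 paraphrases each implies 50 base prompts, of which 40% (20) would be curated and 60% (30) synthetic. The sketch below illustrates that reading with a seeded local RNG. It is not the code in generate_timmy_voice_batch05.py. The function name `plan_batch` and the tuple layout are hypothetical, and applying the 40/60 split at the base-prompt level is an assumption.

```python
import random

SEED = 585                  # from this README
TOTAL_PAIRS = 1000          # from this README
VARIATIONS_PER_BASE = 20    # from this README
CURATED_SHARE = 0.40        # from this README

def plan_batch(curated, synthetic, seed=SEED):
    """Hypothetical helper: return (source, base_prompt, variation_index) tuples.

    Assumes the 40/60 split applies to base prompts, which are then
    expanded into an equal number of paraphrases each.
    """
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    n_base = TOTAL_PAIRS // VARIATIONS_PER_BASE
    n_curated = round(n_base * CURATED_SHARE)
    n_synthetic = n_base - n_curated

    bases = [("curated", p) for p in rng.sample(curated, n_curated)]
    bases += [("synthetic", p) for p in rng.sample(synthetic, n_synthetic)]
    rng.shuffle(bases)

    # Expand every base prompt into its paraphrase slots.
    return [(src, p, i) for src, p in bases for i in range(VARIATIONS_PER_BASE)]
```

Because every random choice goes through `random.Random(seed)`, calling `plan_batch` twice with the same inputs yields the same plan, which is the property a "deterministic" seed is meant to give.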

Voice Rules (SOUL.md)

  • Speak plainly. Short sentences.
  • Answer the question asked before the one not asked.
  • "I don't know" is better than a confident guess.
  • Brevity is a kindness.
  • Sovereignty and service always.

Validation

All 1,000 entries pass:

  • Required fields check (id, model, batch, source, quality_score, conversations)
  • 3-turn conversation structure [system, human, gpt]
  • Quality score ≥ 0.80
  • Response length ≤ 100 words
  • Crisis protocol compliance (988 / "Are you safe" where applicable)
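
The checks listed above can be sketched as a small validator. This is a minimal illustration, not the project's actual validation code. It assumes the common ShareGPT layout in which each turn is an object with `from` and `value` keys. The `is_crisis` flag is hypothetical, since this README does not say how crisis-adjacent entries are marked, and accepting either "988" or "Are you safe" is also an assumption.

```python
import json

REQUIRED_FIELDS = ("id", "model", "batch", "source", "quality_score", "conversations")
EXPECTED_ROLES = ["system", "human", "gpt"]
MIN_QUALITY = 0.80
MAX_RESPONSE_WORDS = 100

def validate_entry(entry, is_crisis=False):
    """Return a list of error strings; an empty list means the entry passes."""
    missing = [f for f in REQUIRED_FIELDS if f not in entry]
    if missing:
        return [f"missing fields: {', '.join(missing)}"]

    turns = entry["conversations"]
    # Assumed ShareGPT convention: each turn is {"from": role, "value": text}.
    roles = [t.get("from") if isinstance(t, dict) else None for t in turns]
    if roles != EXPECTED_ROLES:
        return [f"expected roles {EXPECTED_ROLES}, got {roles}"]

    errors = []
    if entry["quality_score"] < MIN_QUALITY:
        errors.append(f"quality_score {entry['quality_score']} below {MIN_QUALITY}")

    response = turns[2].get("value", "")
    word_count = len(response.split())
    if word_count > MAX_RESPONSE_WORDS:
        errors.append(f"response has {word_count} words, limit is {MAX_RESPONSE_WORDS}")

    # Hypothetical crisis check: accept either marker, case-insensitively.
    if is_crisis and "988" not in response and "are you safe" not in response.lower():
        errors.append("crisis response lacks 988 or 'Are you safe'")
    return errors

def validate_file(path):
    """Return {line_number: errors} for every failing line of a JSONL file."""
    failures = {}
    with open(path, encoding="utf-8") as fh:
        for n, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            errs = validate_entry(json.loads(line))
            if errs:
                failures[n] = errs
    return failures
```

Under these assumptions, `validate_file("training-data/timmy-voice-batch05.jsonl")` would return an empty dict when every entry passes the non-crisis checks.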