docs: verify issue #600 visual scenes dataset is present on main
Some checks failed
Architecture Lint / Linter Tests (pull_request) Successful in 14s
Smoke Test / smoke (pull_request) Failing after 20s
Validate Config / YAML Lint (pull_request) Failing after 21s
Validate Config / JSON Validate (pull_request) Successful in 20s
Validate Config / Python Syntax & Import Check (pull_request) Failing after 59s
Validate Config / Python Test Suite (pull_request) Has been skipped
Validate Config / Shell Script Lint (pull_request) Failing after 52s
Validate Config / Cron Syntax Check (pull_request) Successful in 9s
Validate Config / Deploy Script Dry Run (pull_request) Successful in 10s
PR Checklist / pr-checklist (pull_request) Failing after 3m41s
Validate Config / Playbook Schema Validation (pull_request) Successful in 20s
Architecture Lint / Lint Repository (pull_request) Failing after 28s

Add regression test confirming visual-scenes-500.jsonl satisfies issue #600:
- 500 valid JSONL records
- Required fields (terse, rich, domain) all present and non-empty
- Domain equals "visual scenes" for every record
- Full-record uniqueness

This closes the loop on Training Factory Worker 1/6 (visual scenes).

The dataset was originally added via PR #731 (merged to main).

Closes #600.
Rockachopa
2026-04-29 23:36:36 -04:00
parent aae8b5957f
commit 5e7982a477
2 changed files with 119 additions and 0 deletions


@@ -0,0 +1,28 @@
# Issue #600 Verification: Visual Scenes Prompt Enhancement

**Status:** ✅ Complete — dataset present on main
**Issue:** [Timmy_Foundation/timmy-config#600](https://forge.alexanderwhitestone.com/Timmy_Foundation/timmy-config/issues/600)
**Dataset:** `training/data/prompt-enhancement/visual-scenes-500.jsonl`
**Records:** 500
**Domain:** `visual scenes` (all records)

## Validation

| Check | Result |
|-------|--------|
| File exists | ✅ |
| 500 JSONL records | ✅ |
| Valid JSON per line | ✅ |
| Required fields (terse, rich, domain) | ✅ |
| Domain equals "visual scenes" | ✅ |
| Non-empty terse and rich strings | ✅ |
| Full-record uniqueness | ✅ |
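
The checks in the table above could be expressed roughly as follows. This is a minimal sketch, not the contents of the regression test in this PR: the function names `validate_records` and `validate_file` and the wording of the problem messages are hypothetical, and only the field names, the record count, and the domain value come from this document.

```python
import json
from pathlib import Path

# Field names taken from the validation table; everything else is illustrative.
REQUIRED_FIELDS = ("terse", "rich", "domain")


def validate_records(lines, expected_count=500, expected_domain="visual scenes"):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    records = []
    for lineno, line in enumerate(lines, start=1):
        # Check: valid JSON per line.
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {lineno}: not a JSON object")
            continue
        # Check: required fields present and non-empty strings.
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"line {lineno}: missing or empty {field!r}")
        # Check: domain matches exactly.
        if record.get("domain") != expected_domain:
            problems.append(f"line {lineno}: unexpected domain")
        records.append(record)

    # Check: record count.
    if len(records) != expected_count:
        problems.append(f"expected {expected_count} records, found {len(records)}")

    # Check: full-record uniqueness, compared on a canonical serialization
    # so that key order does not matter.
    seen = set()
    for index, record in enumerate(records, start=1):
        key = json.dumps(record, sort_keys=True)
        if key in seen:
            problems.append(f"record {index}: duplicate of an earlier record")
        seen.add(key)
    return problems


def validate_file(path):
    """Run the checks against a JSONL file on disk."""
    return validate_records(Path(path).read_text(encoding="utf-8").splitlines())
```

Under these assumptions, `validate_file("training/data/prompt-enhancement/visual-scenes-500.jsonl")` would return an empty list when every row of the table holds.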

## Notes

- 65 terse prompts appear more than once, each time with a different rich expansion. This is acceptable: the dataset contract requires unique *pairs*, not unique terse prompts.
- The dataset file was added via PR #731 (`feat: 500 visual scene prompt enhancement pairs (#600)`), which has been merged to main.
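
The distinction between repeated terse prompts and repeated pairs can be illustrated with a short sketch. The helper names `repeated_terse_prompts` and `repeated_pairs` are hypothetical, and the code assumes every line is already a valid record with `terse` and `rich` fields.

```python
import json
from collections import Counter


def repeated_terse_prompts(lines):
    """Map each terse prompt that occurs more than once to its occurrence count."""
    counts = Counter(json.loads(line)["terse"] for line in lines)
    return {terse: n for terse, n in counts.items() if n > 1}


def repeated_pairs(lines):
    """Map each (terse, rich) pair that occurs more than once to its count.

    A non-empty result would violate the unique-pairs contract.
    """
    counts = Counter(
        (record["terse"], record["rich"])
        for record in (json.loads(line) for line in lines)
    )
    return {pair: n for pair, n in counts.items() if n > 1}
```

If the first note holds, `len(repeated_terse_prompts(lines))` would be 65 for the dataset file while `repeated_pairs(lines)` would be empty.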

## Files Added in This PR

- `tests/test_prompt_enhancement_visual_scenes.py` — regression test validating the dataset meets issue requirements.