Build JSONL scorecard generator for overnight loop results #79

Timmy · 2026-03-30T15:06:41Z

Timmy commented

2026-03-30 15:06:41 +00:00

Objective

Build a tool that reads the overnight loop JSONL data and produces a comprehensive scorecard with statistics, charts, and failure analysis.

Input

JSONL files at ~/shared/overnight-loop/*.jsonl (synced from Mac via Syncthing once set up, or copied manually).

Each line is one JSON object:

{"task": "read-soul", "status": "pass", "duration_s": 19.7, "response": "...", "timestamp": "2026-03-29T21:54:12Z", "turns": 2}

Output

reports/scorecard_YYYYMMDD.md — markdown report
reports/scorecard_YYYYMMDD.json — structured data
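
As an illustration of the naming scheme, the two output paths could be derived from a report date as below. `scorecard_paths` is a hypothetical helper, and which date the report should carry (run date or the night being analysed) is not specified here.

```python
from datetime import date
from pathlib import Path


def scorecard_paths(report_dir: Path, day: date) -> tuple[Path, Path]:
    """Return the (markdown, json) report paths for the given day."""
    stem = f"scorecard_{day:%Y%m%d}"
    return report_dir / f"{stem}.md", report_dir / f"{stem}.json"


md_path, json_path = scorecard_paths(Path("reports"), date(2026, 3, 30))
print(md_path.as_posix())
print(json_path.as_posix())
```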

Report Contents

Total tasks, pass count, fail count, pass rate
Average, median, and p95 response time (from duration_s)
Per-task-type breakdown (which tasks pass most/least)
Failure analysis: common error patterns
Timeline: performance over the night (getting better? worse? stable?)
Recommendations: which tasks to add/remove/adjust
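
A rough sketch of the summary numbers listed above, assuming records are already loaded as dicts. The `summarize` and `percentile` helpers are hypothetical, the `"write-note"` task and the `"fail"` status value are made up for the example (the sample line only shows `"pass"`), and p95 uses the nearest-rank definition, which is one of several common choices.

```python
import math
import statistics
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Compute overall and per-task statistics for a list of record dicts."""
    durations = [r["duration_s"] for r in records]
    passed = sum(1 for r in records if r["status"] == "pass")

    # Per-task counters: how many runs, and how many of them passed.
    per_task = defaultdict(lambda: {"pass": 0, "total": 0})
    for r in records:
        bucket = per_task[r["task"]]
        bucket["total"] += 1
        bucket["pass"] += r["status"] == "pass"

    return {
        "total": len(records),
        "passed": passed,
        # Assumption: any status other than "pass" counts as a failure.
        "failed": len(records) - passed,
        "pass_rate": passed / len(records) if records else 0.0,
        "avg_s": statistics.mean(durations) if durations else None,
        "median_s": statistics.median(durations) if durations else None,
        "p95_s": percentile(durations, 95) if durations else None,
        "per_task": dict(per_task),
    }


records = [
    {"task": "read-soul", "status": "pass", "duration_s": 10.0},
    {"task": "read-soul", "status": "fail", "duration_s": 30.0},
    {"task": "write-note", "status": "pass", "duration_s": 20.0},
    {"task": "write-note", "status": "pass", "duration_s": 40.0},
]
summary = summarize(records)
print(summary["pass_rate"], summary["median_s"], summary["p95_s"])
```

With only a handful of records the nearest-rank p95 is simply the slowest run, so the figure becomes meaningful only once a night has produced a few dozen tasks.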

Deliverables

scripts/generate_scorecard.py — main generator
templates/scorecard.md.j2 — Jinja2 template for markdown output
docs/SCORECARD.md — how to run and interpret

Acceptance Criteria

Reads any JSONL file in the expected format
Produces both markdown and JSON output
Handles empty/malformed lines gracefully
Can be run manually or via cron
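
One possible reading of "handles empty/malformed lines gracefully" is sketched below: blank lines are ignored silently, while lines that are not valid JSON objects are counted and reported on stderr instead of aborting the run. `load_records` is a hypothetical helper and the exact policy is an assumption.

```python
import json
import sys
from typing import Iterable


def load_records(lines: Iterable[str]) -> tuple[list[dict], int]:
    """Return (records, skipped) where skipped counts malformed lines."""
    records: list[dict] = []
    skipped = 0
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are not treated as errors
        try:
            value = json.loads(line)
        except json.JSONDecodeError as exc:
            print(f"line {line_no}: invalid JSON ({exc.msg})", file=sys.stderr)
            skipped += 1
            continue
        if not isinstance(value, dict):
            print(f"line {line_no}: expected a JSON object", file=sys.stderr)
            skipped += 1
            continue
        records.append(value)
    return records, skipped


lines = [
    '{"task": "read-soul", "status": "pass"}',
    "",
    "   ",
    "{broken",
    "[1, 2]",
    '{"task": "read-soul", "status": "fail"}',
]
records, skipped = load_records(lines)
print(len(records), "records,", skipped, "skipped")
```

Because the function takes any iterable of strings, an open file handle can be passed directly, which keeps the same code path for manual runs and cron.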

allegro was assigned by Timmy

2026-03-30 15:06:41 +00:00

allegro commented

2026-03-30 15:15:03 +00:00

🏷️ Automated Triage Check

Timestamp: 2026-03-30T15:15:03.772687
Agent: Allegro Heartbeat

This issue has been identified as needing triage:

Checklist

Clear acceptance criteria defined
Priority label assigned (p0-critical / p1-important / p2-backlog)
Size estimate added (quick-fix / day / week / epic)
Owner assigned
Related issues linked

Context

No comments yet - needs engagement
No labels - needs categorization
Part of automated backlog maintenance

Automated triage from Allegro 15-minute heartbeat

Timmy commented

2026-03-30 15:41:38 +00:00

Uniwizard (#94) context: Scorecard generator feeds the self-grading loop (#89). Keep building.

allegro referenced this issue from a commit

2026-03-30 15:50:10 +00:00

[#79] JSONL Scorecard Generator for overnight loop analysis

allegro referenced a pull request that will close this issue

2026-03-30 15:50:33 +00:00

[#79] JSONL Scorecard Generator - overnight loop analysis #102

allegro commented

2026-03-30 15:50:34 +00:00

Scorecard Generator Complete

Analyzes overnight loop data and produces comprehensive reports.

PR: http://143.198.27.163:3000/Timmy_Foundation/timmy-home/pulls/102

Features:

JSON + Markdown output
Pass/fail statistics
Duration analysis (avg, median, p95)
Per-task breakdowns
Hourly timeline trends
Error pattern analysis
Auto recommendations
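
For readers wondering what an hourly timeline involves, here is a small sketch that buckets records by hour and computes a pass rate per bucket. It is not taken from PR #102; `hourly_pass_rates` is a hypothetical helper and it assumes timestamps are UTC with a trailing `Z`, as in the sample line in the issue description.

```python
from collections import defaultdict
from datetime import datetime


def hourly_pass_rates(records):
    """Map each hour bucket (UTC) to the fraction of tasks that passed."""
    buckets = defaultdict(lambda: [0, 0])  # hour label -> [passed, total]
    for r in records:
        # Normalise the trailing "Z" so this also works before Python 3.11.
        ts = datetime.fromisoformat(r["timestamp"].replace("Z", "+00:00"))
        hour = ts.strftime("%Y-%m-%dT%H:00Z")
        buckets[hour][0] += r["status"] == "pass"
        buckets[hour][1] += 1
    # Sorting by label gives chronological order for this fixed-width format.
    return {
        hour: passed / total
        for hour, (passed, total) in sorted(buckets.items())
    }


records = [
    {"status": "pass", "timestamp": "2026-03-29T21:54:12Z"},
    {"status": "fail", "timestamp": "2026-03-29T21:10:00Z"},
    {"status": "pass", "timestamp": "2026-03-29T22:05:30Z"},
]
timeline = hourly_pass_rates(records)
for hour, rate in timeline.items():
    print(hour, f"{rate:.0%}")
```

Comparing the first and last buckets gives a crude answer to whether the night got better, worse, or stayed stable.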

Usage:

python uni-wizard/scripts/generate_scorecard.py

Timmy closed this issue

2026-03-30 15:58:13 +00:00

Timmy referenced this issue from a commit

2026-03-30 15:58:14 +00:00

Merge pull request '[#79] JSONL Scorecard Generator - overnight loop analysis' (#102) from feature/scorecard-generator into main

Timmy commented

2026-03-30 15:58:14 +00:00

Delivered in PR #102. Scorecard generator at uni-wizard/scripts/generate_scorecard.py.

Timmy referenced this issue

2026-03-30 15:58:49 +00:00

[EPIC] Grand Timmy — The Uniwizard #94