[Harvester] Build session_reader.py — JSONL transcript parser #6

Open
opened 2026-04-14 15:15:15 +00:00 by Timmy · 1 comment
Owner

Epic: #2 (Session Harvester)

Task

Build a Python script that reads Hermes session JSONL transcripts and extracts structured data.

Requirements

  • Parse JSONL format (one JSON object per line, each with role/content/tool_calls)
  • Separate user messages, assistant messages, and tool outputs
  • Extract: session ID, model used, start/end time, total tokens (estimate from content length)
  • Classify session by repo (scan messages for repo names)
  • Classify outcome: success / partial / failure / unknown
  • Output a structured session summary

Input

~/.hermes/sessions/session_*.jsonl

Output

{
  "session_id": "...",
  "model": "xiaomi/mimo-v2-pro",
  "repo": "the-nexus",
  "outcome": "success",
  "message_count": 47,
  "tool_calls": 12,
  "duration_estimate": "8m",
  "key_actions": ["merged PR #123", "commented on #456"],
  "errors_encountered": ["HTTP 405 on merge attempt"]
}

Acceptance Criteria

  • Reads any valid Hermes session JSONL
  • Correctly identifies repo from message content
  • Handles missing/malformed entries gracefully
  • Runs in <1 second per session file
## Epic: #2 (Session Harvester) ### Task Build a Python script that reads Hermes session JSONL transcripts and extracts structured data. ### Requirements - Parse JSONL format (one JSON object per line, each with role/content/tool_calls) - Separate user messages, assistant messages, and tool outputs - Extract: session ID, model used, start/end time, total tokens (estimate from content length) - Classify session by repo (scan messages for repo names) - Classify outcome: success / partial / failure / unknown - Output a structured session summary ### Input `~/.hermes/sessions/session_*.jsonl` ### Output ```json { "session_id": "...", "model": "xiaomi/mimo-v2-pro", "repo": "the-nexus", "outcome": "success", "message_count": 47, "tool_calls": 12, "duration_estimate": "8m", "key_actions": ["merged PR #123", "commented on #456"], "errors_encountered": ["HTTP 405 on merge attempt"] } ``` ### Acceptance Criteria - [ ] Reads any valid Hermes session JSONL - [ ] Correctly identifies repo from message content - [ ] Handles missing/malformed entries gracefully - [ ] Runs in <1 second per session file
Timmy added the harvestermilestone:1 labels 2026-04-14 15:15:15 +00:00
Owner

Closed PR #21 (incompatible with harvester pipeline).

Opened PR #52 with session_metadata.py that works alongside the existing session_reader.py:

  • SessionSummary dataclass for structured metadata
  • Repo classification, outcome classification, duration estimation
  • Uses session_reader.read_session() for file reading
  • No changes to existing session_reader.py API
  • Compatible with harvester.py pipeline

PR #52: #52

Closed PR #21 (incompatible with harvester pipeline). Opened PR #52 with session_metadata.py that works alongside the existing session_reader.py: - SessionSummary dataclass for structured metadata - Repo classification, outcome classification, duration estimation - Uses session_reader.read_session() for file reading - No changes to existing session_reader.py API - Compatible with harvester.py pipeline PR #52: https://forge.alexanderwhitestone.com/Timmy_Foundation/compounding-intelligence/pulls/52
hermes was assigned by Rockachopa 2026-04-15 01:50:47 +00:00
Sign in to join this conversation.
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: Timmy_Foundation/compounding-intelligence#6