[triage-generated] [bug] Integrate confidence.py into agent response pipeline #171

Closed
opened 2026-03-15 15:39:54 +00:00 by hermes · 1 comment
Collaborator

Problem

src/timmy/confidence.py was merged in PR #161 but is dead code — not imported or used anywhere.

$ grep -rn "from.*confidence import\|import.*confidence" src/ --include="*.py"
(no results)

SOUL.md requires: "When I am uncertain, I must say so in proportion to my uncertainty. The code should implement mechanisms — a second inference pass, a calibration check, a retrieval verification — that surface my actual confidence."

The module exists. It has estimate_confidence(text) -> float. It just needs to be wired in.

Scope

  1. In src/timmy/agent.py (or agentic_loop.py), after generating a response, call estimate_confidence(response_text)
  2. Attach the confidence score to the response metadata
  3. If confidence < 0.3, prepend a hedging disclaimer or log a warning
  4. Store the score in session logs (via SessionLogger)
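The four steps above could be wired together roughly as follows. This is a minimal sketch, not the actual change: `AgentResponse`, `attach_confidence`, the disclaimer wording, and the plain-list session log are hypothetical names for illustration, and the `estimate_confidence` here is a crude stand-in because the real logic in `src/timmy/confidence.py` is not shown in this issue.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("timmy.agent")

# Threshold taken from step 3 of the scope.
LOW_CONFIDENCE_THRESHOLD = 0.3

# Hypothetical hedge phrases, used only by the stand-in estimator below.
HEDGE_WORDS = ("maybe", "perhaps", "not sure", "i think", "might")


def estimate_confidence(text: str) -> float:
    """Stand-in for timmy.confidence.estimate_confidence (real logic unknown)."""
    lowered = text.lower()
    hits = sum(lowered.count(word) for word in HEDGE_WORDS)
    return max(0.0, 1.0 - 0.4 * hits)


@dataclass
class AgentResponse:
    """Hypothetical response container with a metadata dict."""
    text: str
    metadata: dict = field(default_factory=dict)


def attach_confidence(response: AgentResponse, session_log: list) -> AgentResponse:
    """Score a response, flag it if confidence is low, and record the score."""
    score = estimate_confidence(response.text)       # step 1
    response.metadata["confidence"] = score          # step 2
    if score < LOW_CONFIDENCE_THRESHOLD:             # step 3
        logger.warning("Low-confidence response (%.2f)", score)
        response.text = "I'm not certain about this: " + response.text
    # Step 4: a plain list stands in for SessionLogger here.
    session_log.append({"response": response.text, "confidence": score})
    return response
```

In the real pipeline the import would presumably be `from timmy.confidence import estimate_confidence`, and the log write would go through `SessionLogger` instead of a list.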

Files

  • src/timmy/confidence.py (already exists, no changes needed)
  • src/timmy/agent.py — wire in confidence estimation
  • src/timmy/session_logger.py — add confidence field to log entries

Acceptance Criteria

  • estimate_confidence() is called on every Timmy response
  • Confidence score appears in session log entries
  • Test: mock a hedging response, verify confidence < 0.5; mock a certain response, verify > 0.5
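The first two criteria can be checked without the real estimator by injecting a mock. The sketch below assumes a hypothetical `respond` helper that takes the generator, the estimator, and the log as arguments; the actual agent code may be structured differently.

```python
from unittest.mock import Mock


def respond(generate, estimator, log):
    """Hypothetical pipeline step: generate text, score it, log both."""
    text = generate()
    score = estimator(text)
    log.append({"response": text, "confidence": score})
    return text, score


def test_estimator_called_once_per_response():
    estimator = Mock(return_value=0.8)
    log = []

    respond(lambda: "Paris is the capital of France.", estimator, log)
    respond(lambda: "It might be Lyon.", estimator, log)

    # Criterion 1: the estimator runs once for every response.
    assert estimator.call_count == 2
    # Criterion 2: every log entry carries a confidence score.
    assert all("confidence" in entry for entry in log)
```

The third criterion (hedging text scores below 0.5, certain text above 0.5) has to run against the real `estimate_confidence`, since a mocked estimator would only return whatever value the test configured.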

Tags

[triage-generated] [feature] [soul-gap]

Origin

Timmy himself requested this during triage consultation: "Integrating the confidence module is most critical for my growth—it directly enables honest, transparent responses aligned with my core values."

Author
Collaborator

Superseded by PR #191, which implements the full audit trail, including confidence integration. Closing.


Reference: Rockachopa/Timmy-time-dashboard#171