enhancement: log tool hallucination statistics to metrics #1109

Closed
Rockachopa wants to merge 187 commits from step35/853-enhancement-log-tool-halluci into main

Summary

Implements persistent logging of tool hallucination statistics detected by the poka-yoke validator.

Key Changes:

  • Added metrics persistence layer writing to ~/.hermes/metrics/hallucination_stats.json
  • Track cumulative failures per tool (persists across restarts)
  • Track first/last event timestamps and total event counts
  • New get_hallucination_metrics() function for aggregated statistics
  • New hermes hallucination CLI command to view metrics
  • Enhanced reset_circuit_breaker() to also clear cumulative state and persisted file

Files:

  • tools/tool_pokayoke.py — Core tracking logic (+ cumulative tracking, persistence hooks)
  • hermes_cli/hallucination.py — New CLI command (NEW file)
  • hermes_cli/main.py — Command registration

CLI Usage:

hermes hallucination              # Show summary with top tools
hermes hallucination --json       # Raw JSON output
hermes hallucination --top 20     # Show top 20 tools
hermes hallucination --reset      # Reset all counters (with confirmation)
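
For readers who want to see how these flags fit together, here is a minimal sketch of an argument parser for the subcommand. It is not the code in hermes_cli/hallucination.py: `build_parser` and `confirm_reset` are hypothetical names, the choice of `argparse` is an assumption, and the default of 10 for `--top` is assumed because the description does not state one.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed under CLI Usage."""
    parser = argparse.ArgumentParser(
        prog="hermes hallucination",
        description="Show tool hallucination metrics.",
    )
    # --json switches from the human-readable summary to raw JSON output.
    parser.add_argument("--json", action="store_true", dest="as_json",
                        help="print raw JSON instead of a summary")
    # Default of 10 is an assumption; the PR does not state the default.
    parser.add_argument("--top", type=int, default=10, metavar="N",
                        help="number of tools to show, ranked by failures")
    # --reset is destructive, so the caller should confirm before acting.
    parser.add_argument("--reset", action="store_true",
                        help="reset all counters (asks for confirmation)")
    return parser


def confirm_reset(answer: str) -> bool:
    """Treat only an explicit 'y' or 'yes' as confirmation."""
    return answer.strip().lower() in {"y", "yes"}


if __name__ == "__main__":
    args = build_parser().parse_args(["--top", "20"])
    print(args.top, args.as_json, args.reset)
```

Accepting only an explicit yes keeps an accidental Enter keypress from wiping historical counters.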

Metrics file format (~/.hermes/metrics/hallucination_stats.json):

{
  "version": "1.0",
  "cumulative_failures": {"tool_name": count, ...},
  "tools_affected": ["tool1", "tool2", ...],
  "first_event": 1698765432.0,
  "last_event": 1698766000.0,
  "total_events": 42
}
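
As an illustration of how a file with this shape could be maintained, the sketch below updates the counters on each failure and rewrites the file atomically. The helper names (`empty_stats`, `load_stats`, `record_failure`) are hypothetical and are not taken from tools/tool_pokayoke.py. The write-to-temp-then-rename step and the use of `null` for timestamps before the first event are assumptions about a reasonable design, not a description of the PR.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Optional


def empty_stats() -> dict:
    """Fresh document using the keys from the metrics file format above."""
    return {
        "version": "1.0",
        "cumulative_failures": {},
        "tools_affected": [],
        "first_event": None,  # assumption: null until the first event
        "last_event": None,
        "total_events": 0,
    }


def load_stats(path: Path) -> dict:
    """Read the metrics file, falling back to empty stats if missing or corrupt."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return empty_stats()


def record_failure(path: Path, tool_name: str, now: Optional[float] = None) -> dict:
    """Increment the counter for tool_name and persist the updated stats."""
    now = time.time() if now is None else now
    stats = load_stats(path)
    counts = stats["cumulative_failures"]
    counts[tool_name] = counts.get(tool_name, 0) + 1
    stats["tools_affected"] = sorted(counts)
    if stats["first_event"] is None:
        stats["first_event"] = now
    stats["last_event"] = now
    stats["total_events"] += 1

    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write cannot leave a truncated metrics file behind.
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(stats, fh, indent=2)
    os.replace(tmp_name, path)
    return stats
```

Calling `record_failure(Path.home() / ".hermes" / "metrics" / "hallucination_stats.json", "some_tool")` would create the directory on first use and bump that tool's count by one.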

Acceptance Criteria (from #853):

  • Hallucination events logged to structured file
  • Statistics aggregated by tool name
  • CLI command to view metrics (hermes hallucination)

Closes #853

Rockachopa added 1 commit 2026-04-29 04:22:55 +00:00
enhancement: log tool hallucination statistics to metrics
All checks were successful
Lint / lint (pull_request) Successful in 10s
52865209ac
- Add persistent metrics storage (~/.hermes/metrics/hallucination_stats.json)
- Track cumulative failures per tool (survives restarts)
- Track first/last event timestamps and total event count
- Add `get_hallucination_metrics()` API for aggregated stats
- Add `hermes hallucination` CLI command to view metrics
- Enhance `reset_circuit_breaker()` to also clear cumulative state and file
- Increment cumulative counter on each failure in ToolCallValidator._record_failure

Resolves: #853

Acceptance criteria:
- [x] Hallucination events logged to structured file (JSON at ~/.hermes/metrics/hallucination_stats.json)
- [x] Statistics aggregated by tool name (cumulative_failures dict + by_tool breakdown)
- [x] Dashboard integration or CLI command to view (new `hermes hallucination` command)

The implementation is minimal: it extends ToolCallValidator with in-memory cumulative tracking
and hooks _record_failure to persist on each failure. The CLI command reads both in-memory
(current session) and persisted (historical) data for a full picture.
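
A rough sketch of how the two data sources might be presented together follows. `build_report` is a hypothetical helper, not the PR's `get_hallucination_metrics()`, and the output keys are invented for illustration. Because the commit message does not say how session and historical counts are combined, the sketch keeps them side by side rather than summing them, which avoids double-counting failures that have already been persisted.

```python
from collections import Counter
from typing import Any, Dict, Mapping


def build_report(
    session_counts: Mapping[str, int],
    persisted: Mapping[str, Any],
    top: int = 10,
) -> Dict[str, Any]:
    """Combine current-session counts with the persisted metrics document.

    `persisted` is expected to use the keys of hallucination_stats.json;
    missing keys fall back to empty values.
    """
    historical = Counter(persisted.get("cumulative_failures", {}))
    return {
        # Current-session view, reported as-is.
        "session": dict(session_counts),
        # Historical view: the `top` tools ranked by cumulative failures.
        "top_tools": historical.most_common(top),
        "total_events": persisted.get("total_events", 0),
        "first_event": persisted.get("first_event"),
        "last_event": persisted.get("last_event"),
    }
```

`Counter.most_common` handles the ranking that a `--top N` option would need, so no manual sort is required.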
Rockachopa closed this pull request 2026-04-29 11:44:09 +00:00
Rockachopa deleted branch step35/853-enhancement-log-tool-halluci 2026-04-29 11:44:10 +00:00

Reference: Timmy_Foundation/hermes-agent#1109