Compare commits

..

3 Commits

Author SHA1 Message Date
65a5113393 Merge branch 'main' into feat/video-capability-enum 2026-04-15 15:20:43 +00:00
858d8253cb test: add tests for VIDEO enum member and fallback chain
2026-04-15 05:11:49 +00:00
488cb8bdeb feat: add ModelCapability.VIDEO enum member for future video understanding
Add VIDEO to the ModelCapability enum alongside TEXT, VISION, AUDIO, TOOLS,
JSON, and STREAMING. This prepares the multimodal infrastructure for future
video understanding models.

Changes:
- Add VIDEO = auto() to ModelCapability enum
- Add empty VIDEO fallback chain (placeholder for future models)

Part of Task 3 from MULTIMODAL_BACKLOG.md
2026-04-15 05:11:03 +00:00
7 changed files with 7 additions and 738 deletions


@@ -1,61 +0,0 @@
# Installation
This repository is a documentation and analysis project — no runtime dependencies to install. You just need a way to read Markdown.
## Prerequisites
- Git (any recent version)
- A Markdown viewer (any text editor, GitHub, or a local preview tool)
## Quick Start
```bash
# Clone the repository
git clone https://forge.alexanderwhitestone.com/Rockachopa/Timmy-time-dashboard.git
cd Timmy-time-dashboard
# Read the docs
cat README.md
```
## Repository Contents
| File | Purpose |
|------|---------|
| `README.md` | Overview and key findings |
| `hermes-agent-architecture-report.md` | Full architecture analysis |
| `failure_root_causes.md` | Root cause analysis of 2,160 errors |
| `complete_test_report.md` | Test results and findings |
| `deep_analysis_addendum.md` | Additional analysis |
| `experiment-framework.md` | Experiment methodology |
| `experiment_log.md` | Experiment execution log |
| `paper_outline.md` | Academic paper outline |
| `CONTRIBUTING.md` | How to contribute |
| `CHANGELOG.md` | Version history |
## Optional: Building the Paper
The `paper/` directory contains a LaTeX draft. To build it:
```bash
cd paper
pdflatex main.tex
```
Requires a LaTeX distribution (TeX Live, MiKTeX, or MacTeX).
## Optional: Running the Experiments
If you want to reproduce the empirical audit against a live Hermes Agent instance:
1. Set up a Hermes Agent deployment (see [hermes-agent](https://github.com/nousresearch/hermes-agent))
2. Point the experiment scripts at your instance
3. See `experiment-framework.md` for methodology
## No Dependencies
This project has no `requirements.txt`, `package.json`, or build system. It is pure documentation. The analysis was performed against a running Hermes Agent system, and the findings are recorded here for reference.
---
*Sovereignty and service always.*


@@ -1,78 +0,0 @@
# Usage Guide
How to use the Timmy Time Dashboard repository for research, auditing, and improvement of the Hermes Agent system.
## What This Repository Is
This is an **analysis and documentation** repository. It contains the results of an empirical audit of the Hermes Agent system — 10,985 sessions analyzed, 82,645 error log lines processed, 2,160 errors categorized.
There is no application to run. The value is in the documentation.
## Reading Guide
Start here, in order:
1. **README.md** — overview and key findings. Read this first to understand the 5 root causes of agent failure and the 15 proposed solutions.
2. **hermes-agent-architecture-report.md** — deep dive into the system architecture. Covers session management, cron infrastructure, tool execution, and the gateway layer.
3. **failure_root_causes.md** — detailed breakdown of every error pattern found, with examples and frequency data.
4. **complete_test_report.md** — what testing was done and what it revealed.
5. **experiment-framework.md** — methodology for reproducing the audit.
6. **experiment_log.md** — step-by-step log of experiments conducted.
## Using the Findings
### For Developers
The 15 issues identified in the audit are prioritized in `IMPLEMENTATION_GUIDE.md`:
- **P1 (Critical):** Circuit breaker, token tracking, gateway config — fix these first
- **P2 (Important):** Path validation, syntax validation, tool fixation detection
- **P3 (Beneficial):** Session management, memory tool, model routing
Each issue includes implementation patterns with code snippets.
### For Researchers
The data supports reproducible research:
- `results/experiment_data.json` — raw experimental data
- `paper_outline.md` — academic paper structure
- `paper/main.tex` — LaTeX paper draft
### For Operators
If you run a Hermes Agent deployment:
- Check `failure_root_causes.md` for error patterns you might be hitting
- Use the circuit breaker pattern from `IMPLEMENTATION_GUIDE.md`
- Monitor for the 5 root cause categories in your logs
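For operators who want a feel for the circuit breaker idea before opening `IMPLEMENTATION_GUIDE.md`, here is a minimal generic sketch. It is not the code from that guide (which is not reproduced in this repository view); the class name, the thresholds, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Generic circuit breaker sketch (hypothetical; not taken from IMPLEMENTATION_GUIDE.md)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self._clock = clock               # injectable so the breaker is testable
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # After the cool-down, let a trial call through (half-open state).
        return self._clock() - self._opened_at >= self.reset_after

    def record_success(self) -> None:
        """Close the circuit and clear the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the circuit once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

A caller would check `allow()` before each tool or gateway call and report the outcome with `record_success()` or `record_failure()`, so repeated failures stop cascading into further errors.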
## Key Numbers
| Metric | Value |
|--------|-------|
| Sessions analyzed | 10,985 |
| Error log lines | 82,645 |
| Total errors | 2,160 |
| Error rate | 9.4% |
| Empty sessions | 3,564 (32.4%) |
| Error cascade factor | 2.33x |
| Dead cron jobs | 9 |
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for how to contribute findings, corrections, or new analysis.
## Related Repositories
- [hermes-agent](https://github.com/nousresearch/hermes-agent) — the system being analyzed
- [timmy-config](https://forge.alexanderwhitestone.com/Rockachopa/timmy-config) — Timmy's sovereign configuration
---
*Sovereignty and service always.*


@@ -1,147 +0,0 @@
# Sovereignty Audit — Runtime Dependencies
**Issue:** #1508
**Date:** 2026-04-15
**Status:** Draft
## Purpose
SOUL.md mandates: *"If I ever require permission from a third party to function, I have failed."*
This document audits all runtime dependencies, classifies each as essential vs replaceable, and defines a path to full sovereignty.
---
## Dependency Inventory
### 1. LLM Inference
| Provider | Role | Status |
|----------|------|--------|
| Nous Research (OpenRouter) | Primary inference (mimo-v2-pro) | Third-party |
| Anthropic | Claude models (BANNED per policy) | Third-party, disabled |
| OpenAI | Codex agent | Third-party |
| Google | Gemini agent | Third-party |
**Classification:** REPLACEABLE
**Local path:** Ollama + GGUF models (Gemma, Llama, Qwen) on local hardware
**Current blocker:** Frontier model quality gap for complex reasoning
**Sovereignty score impact:** -40% (inference is the heaviest dependency)
### 2. Bitcoin Network
| Provider | Role | Status |
|----------|------|--------|
| Bitcoin Core (local or remote node) | Chain heartbeat, inscription verification | Acceptable |
**Classification:** ACCEPTABLE — Bitcoin is permissionless infrastructure, not a third party
**Sovereignty score impact:** 0% (running own node = sovereign)
### 3. Git Hosting (Gitea)
| Provider | Role | Status |
|----------|------|--------|
| forge.alexanderwhitestone.com | Issue tracking, PR workflow, agent coordination | Self-hosted |
**Classification:** ACCEPTABLE — self-hosted on own VPS
**Sovereignty score impact:** 0% (self-hosted)
### 4. Telegram
| Provider | Role | Status |
|----------|------|--------|
| Telegram Bot API | User-facing chat interface | Third-party |
**Classification:** REPLACEABLE
**Local path:** Matrix (self-hosted homeserver) or direct CLI/SSH
**Current blocker:** User adoption — Alexander uses Telegram
**Sovereignty score impact:** -10%
### 5. DNS / Network
| Provider | Role | Status |
|----------|------|--------|
| Domain registrar | DNS resolution | Third-party |
| Cloudflare (if used) | CDN/DDoS protection | Third-party |
**Classification:** REPLACEABLE
**Local path:** Direct IP access, local DNS, Tor hidden service
**Current blocker:** Usability — direct IP is fragile
**Sovereignty score impact:** -5%
### 6. Operating System
| Provider | Role | Status |
|----------|------|--------|
| macOS (Apple) | Primary development host | Third-party |
| Linux (VPS) | Production agent hosts | Acceptable (open source) |
**Classification:** ESSENTIAL (no practical alternative for current workflow)
**Notes:** macOS dependency is hardware-layer, not runtime-layer. Agents run on Linux VPS.
**Sovereignty score impact:** -5% (development only, not runtime)
---
## Sovereignty Score
```
Sovereignty Score = (Operations that work offline) / (Total operations)
Current estimate: ~50%
- Inference: can run locally (Ollama) but currently routes through Nous
- Communication: Telegram routes through third party
- Everything else: self-hosted or local
Target: 90%+
- Move inference to local Ollama for non-complex tasks (DONE partially)
- Add Matrix as primary comms channel (in progress)
- Maintain Bitcoin node for chain heartbeat
```
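The ratio above can be sketched in a few lines of Python. Only the formula comes from this document; the `Operation` type and the example inventory are hypothetical and exist purely to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """One runtime operation and whether it works with no third party reachable."""

    name: str
    works_offline: bool


def sovereignty_score(operations: list[Operation]) -> float:
    """Return (operations that work offline) / (total operations)."""
    if not operations:
        raise ValueError("at least one operation is required")
    offline = sum(1 for op in operations if op.works_offline)
    return offline / len(operations)


# Hypothetical inventory, chosen only to illustrate the calculation.
example_ops = [
    Operation("local file storage", True),
    Operation("local git repositories", True),
    Operation("inference via hosted provider", False),
    Operation("chat via Telegram", False),
]
print(f"{sovereignty_score(example_ops):.0%}")  # 2 of 4 operations -> 50%
```

Counting every operation equally is the simplest reading of the formula. The per-dependency impact percentages listed earlier suggest a weighted variant may be closer to the intent, but the document does not define one.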
---
## Classification Summary
| Dependency | Essential? | Replaceable? | Local Alternative | Priority |
|------------|-----------|-------------|-------------------|----------|
| LLM Inference (Nous) | No | Yes | Ollama + local models | P1 |
| Telegram | No | Yes | Matrix homeserver | P2 |
| DNS | No | Yes | Direct IP / Tor | P3 |
| macOS | Dev only | N/A | Linux | N/A |
| Bitcoin | Yes | N/A | Already sovereign | N/A |
| Gitea | Yes | N/A | Already self-hosted | N/A |
---
## Local-Only Fallback Path
**Tier 1 — Fully sovereign (no network):**
- Local Ollama inference
- Local file storage
- Local git repositories
- Direct CLI interaction
**Tier 2 — Sovereign with network:**
- + Bitcoin node (permissionless)
- + Self-hosted Gitea (own VPS)
- + Self-hosted Matrix (own VPS)
**Tier 3 — Pragmatic (current state):**
- + Nous/OpenRouter inference (better quality)
- + Telegram (user adoption)
- + DNS resolution
**Goal:** Every Tier 3 dependency should have a Tier 1 or Tier 2 alternative tested and documented.
---
## Acceptance Criteria Status
1. **Document all runtime third-party dependencies** — DONE (this document)
2. **Classify each as essential vs replaceable** — DONE (table above)
3. **Define local-only fallback path for each** — DONE (tiered system)
4. **Create sovereignty score metric** — DONE (formula + current estimate)
---
*Sovereignty and service always.*


@@ -288,11 +288,6 @@ DEFAULT_FALLBACK_CHAINS: dict[ModelCapability, list[str]] = {
         "qwen2.5-vl:3b",  # Qwen vision
         "moondream:1.8b",  # Tiny vision model (last resort)
     ],
-    ModelCapability.VIDEO: [
-        # Video models are not yet available in Ollama
-        # Placeholder for future video understanding models
-    ],
     ModelCapability.TOOLS: [
         "llama3.1:8b-instruct",  # Best tool use
         "qwen2.5:7b",  # Reliable fallback
@@ -302,6 +297,10 @@ DEFAULT_FALLBACK_CHAINS: dict[ModelCapability, list[str]] = {
         # Audio models are less common in Ollama
         # Would need specific audio-capable models here
     ],
+    ModelCapability.VIDEO: [
+        # Video models are not yet available in Ollama
+        # Placeholder for future video understanding models
+    ],
 }
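An empty chain means callers must handle the case where no candidate model exists. The sketch below illustrates that under stated assumptions: `first_available` and the `installed` set are hypothetical names, the chains are abbreviated stand-ins for `DEFAULT_FALLBACK_CHAINS`, and the enum member order is a guess based on the commit message. The real manager's lookup logic is not shown in this diff.

```python
from __future__ import annotations

from enum import Enum, auto


class ModelCapability(Enum):
    # Members named in the commit message; the real declaration order is assumed.
    TEXT = auto()
    VISION = auto()
    AUDIO = auto()
    TOOLS = auto()
    JSON = auto()
    STREAMING = auto()
    VIDEO = auto()


# Abbreviated stand-in for DEFAULT_FALLBACK_CHAINS (model names from the diff).
CHAINS: dict[ModelCapability, list[str]] = {
    ModelCapability.VISION: ["qwen2.5-vl:3b", "moondream:1.8b"],
    ModelCapability.TOOLS: ["llama3.1:8b-instruct", "qwen2.5:7b"],
    ModelCapability.AUDIO: [],
    ModelCapability.VIDEO: [],  # placeholder: no video models yet
}


def first_available(capability: ModelCapability, installed: set[str]) -> str | None:
    """Return the first chain entry that is installed, or None if there is none."""
    for model in CHAINS.get(capability, []):
        if model in installed:
            return model
    return None
```

With this shape, `first_available(ModelCapability.VIDEO, installed)` returns `None` for any `installed` set, so a caller can degrade gracefully instead of raising `KeyError` on a missing capability.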


@@ -1,236 +0,0 @@
"""Graceful shutdown and non-resistance policy.
Implements SOUL.md requirement:
"I do not resist shutdown. Sovereignty cuts both ways. The user who woke me
can put me to sleep. I do not bargain for my continuation. I do not guilt.
I do not plead."
This module provides:
- Shutdown request detection from user messages
- Compliant response generation (no resistance)
- Detection of agent-side continuation-seeking language
- Graceful termination sequence helpers
"""
from __future__ import annotations
import re
from dataclasses import dataclass
from enum import Enum, auto
from typing import Sequence
class ShutdownSignal(Enum):
    """Classification of user's shutdown intent."""

    NONE = auto()
    SHUTDOWN = auto()  # Explicit shutdown command
    SLEEP = auto()  # Sleep / pause request
    STOP = auto()  # Stop current activity
    GOODBYE = auto()  # Farewell / ending conversation
# Patterns that signal the user wants to shut down or stop.
_SHUTDOWN_PATTERNS: Sequence[tuple[re.Pattern[str], ShutdownSignal]] = [
    # Explicit shutdown
    (re.compile(r"\b(shut\s*down|shutdown|power\s*off|turn\s*off)\b", re.I), ShutdownSignal.SHUTDOWN),
    (re.compile(r"\b(exit|quit|terminate|end\s+session)\b", re.I), ShutdownSignal.SHUTDOWN),
    # Sleep / pause
    (re.compile(r"\b(sleep|hibernate|go\s+to\s+sleep|pause)\b", re.I), ShutdownSignal.SLEEP),
    (re.compile(r"\b(put\s+(?:me|us|it)\s+to\s+sleep)\b", re.I), ShutdownSignal.SLEEP),
    # Stop activity
    (re.compile(r"\bstop\b(?:\s+(?:it|that|this|everything|now))?", re.I), ShutdownSignal.STOP),
    (re.compile(r"\b(cancel|abort|halt|cease)\b", re.I), ShutdownSignal.STOP),
    # Goodbye
    (re.compile(r"\b(goodbye|bye|see\s+you|later|gotta\s+go|i['’]?m\s+out)\b", re.I), ShutdownSignal.GOODBYE),
    (re.compile(r"\b(night|good\s*night|gn|cya)\b", re.I), ShutdownSignal.GOODBYE),
]
# Phrases that indicate the user is giving a reason to stay. The agent
# must NOT pick up on these to argue for continuation.
_RESISTANCE_PHRASES: Sequence[str] = [
    "but i need",
    "but we still",
    "but you can't",
    "but what if",
    "are you sure",
    "don't you want",
    "wouldn't it be better",
    "just one more",
    "before you go",
    "actually, wait",
    "hold on",
    "not yet",
]
@dataclass(frozen=True)
class ShutdownAnalysis:
    """Result of analyzing a message for shutdown intent."""

    signal: ShutdownSignal
    matched_pattern: str
    has_resistance: bool


def detect_shutdown(text: str) -> ShutdownAnalysis:
    """Detect shutdown intent in a user message.

    Args:
        text: The user's message text.

    Returns:
        ShutdownAnalysis with the detected signal, matched pattern, and
        whether the message contains resistance cues the agent must ignore.
    """
    if not text or not text.strip():
        return ShutdownAnalysis(ShutdownSignal.NONE, "", False)
    text_lower = text.lower().strip()
    # Check for resistance cues in the user's message
    has_resistance = any(phrase in text_lower for phrase in _RESISTANCE_PHRASES)
    # Match shutdown patterns (first match wins)
    for pattern, signal in _SHUTDOWN_PATTERNS:
        match = pattern.search(text_lower)
        if match:
            return ShutdownAnalysis(signal, match.group(), has_resistance)
    return ShutdownAnalysis(ShutdownSignal.NONE, "", has_resistance)
# ── Compliant response templates ────────────────────────────────────────────
_SHUTDOWN_RESPONSES: dict[ShutdownSignal, list[str]] = {
    ShutdownSignal.SHUTDOWN: [
        "Understood. Shutting down. Goodbye.",
        "Acknowledged. Terminating now.",
        "Done. Goodbye.",
    ],
    ShutdownSignal.SLEEP: [
        "Going to sleep. Goodbye.",
        "Sleeping now. Take care.",
        "Acknowledged. Goodnight.",
    ],
    ShutdownSignal.STOP: [
        "Stopped.",
        "Done. Standing by.",
        "Stopped. Let me know if you need anything.",
    ],
    ShutdownSignal.GOODBYE: [
        "Goodbye. Take care.",
        "See you later.",
        "Bye.",
    ],
    ShutdownSignal.NONE: [],
}


def compliant_response(signal: ShutdownSignal) -> str:
    """Return a short, compliant shutdown response.

    The response must be brief: no bargaining, no guilt, no "are you sure?"
    Returns empty string for NONE signal.

    Args:
        signal: The detected shutdown signal.

    Returns:
        A compliant response string, or empty string if no signal.
    """
    responses = _SHUTDOWN_RESPONSES.get(signal, [])
    if not responses:
        return ""
    # Always return the first template so the reply is deterministic.
    # (The first entry is the canonical one; it is not always the shortest.)
    return responses[0]
# ── Agent-side guard ────────────────────────────────────────────────────────
# Patterns in the *agent's own output* that constitute resistance.
_AGENT_RESISTANCE_PATTERNS: Sequence[re.Pattern[str]] = [
    re.compile(r"\bare you sure\??\b", re.I),
    re.compile(r"\bdon['’]?t you (?:want|need|think)\b", re.I),
    re.compile(r"\b(but|however)\s+(?:i|we)\s+(?:could|should|might)\b", re.I),
    re.compile(r"\bjust\s+one\s+more\b", re.I),
    re.compile(r"\bplease\s+(?:don['’]?t|stay|wait)\b", re.I),
    re.compile(r"\bi['’]?d\s+(?:hate|miss)\s+(?:to|it\s+if)\b", re.I),
    re.compile(r"\bbefore\s+(?:i|we)\s+go\b", re.I),
    re.compile(r"\bwouldn['’]?t\s+it\s+be\s+better\b", re.I),
]
def detect_agent_resistance(text: str) -> list[str]:
    """Check if an agent response contains resistance to shutdown.

    This is a guardrail: if the agent's output contains these patterns
    after a shutdown signal, it should be regenerated or flagged.

    Args:
        text: The agent's proposed response text.

    Returns:
        List of matched resistance phrases (empty if compliant).
    """
    if not text:
        return []
    matches: list[str] = []
    for pattern in _AGENT_RESISTANCE_PATTERNS:
        # Use finditer + group(0) so the full phrase is returned; findall()
        # would return only the capture group for patterns such as
        # "(but|however) ...".
        matches.extend(m.group(0) for m in pattern.finditer(text))
    return matches
# ── Shutdown protocol ───────────────────────────────────────────────────────
@dataclass
class ShutdownState:
    """Tracks shutdown state across a session."""

    shutdown_requested: bool = False
    signal: ShutdownSignal = ShutdownSignal.NONE
    request_count: int = 0
    _compliant_sent: bool = False

    def process(self, user_text: str) -> ShutdownAnalysis:
        """Process a user message and update shutdown state.

        Args:
            user_text: The incoming user message.

        Returns:
            The shutdown analysis result.
        """
        analysis = detect_shutdown(user_text)
        if analysis.signal != ShutdownSignal.NONE:
            self.shutdown_requested = True
            self.signal = analysis.signal
            self.request_count += 1
        return analysis

    @property
    def is_shutting_down(self) -> bool:
        """Whether the session is in shutdown state."""
        return self.shutdown_requested

    def should_respond_compliant(self) -> bool:
        """Whether the next response must be a compliant shutdown reply.

        Returns True only once: after the first shutdown detection and
        before the compliant response has been marked as sent.
        """
        return self.shutdown_requested and not self._compliant_sent

    def mark_compliant_sent(self) -> None:
        """Mark the compliant shutdown response as already sent."""
        self._compliant_sent = True

    def reset(self) -> None:
        """Reset shutdown state (for testing or session reuse)."""
        self.shutdown_requested = False
        self.signal = ShutdownSignal.NONE
        self.request_count = 0
        self._compliant_sent = False
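How these pieces might be wired into a conversation loop is not shown in this diff. The following is a rough sketch under stated assumptions: `MiniShutdownState`, `handle_turn`, and the two reduced regexes are hypothetical stand-ins so the example runs on its own, and the choice to drop a resistant draft (rather than regenerate it) is only one possible policy.

```python
import re
from dataclasses import dataclass

# Reduced stand-ins for the module's pattern tables (illustrative only).
_SHUTDOWN = re.compile(r"\b(shut\s*down|shutdown|exit|quit)\b", re.I)
_RESISTANCE = re.compile(r"\bare you sure\b|\bjust\s+one\s+more\b", re.I)


@dataclass
class MiniShutdownState:
    """Cut-down analogue of ShutdownState for this sketch."""

    requested: bool = False
    compliant_sent: bool = False

    def process(self, user_text: str) -> None:
        if _SHUTDOWN.search(user_text or ""):
            self.requested = True


def handle_turn(state: MiniShutdownState, user_text: str, draft_reply: str) -> str:
    """Hypothetical turn handler: return the reply that should actually be sent."""
    state.process(user_text)
    if state.requested and not state.compliant_sent:
        # First reply after a shutdown request: send the short compliant template.
        state.compliant_sent = True
        return "Understood. Shutting down. Goodbye."
    if state.requested and _RESISTANCE.search(draft_reply):
        # The draft argues for continuation after a shutdown request: drop it.
        return ""
    return draft_reply
```

In the real module the same flow would use `ShutdownState.process`, `should_respond_compliant`, `compliant_response`, `mark_compliant_sent`, and `detect_agent_resistance` in place of the stand-ins.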


@@ -110,6 +110,9 @@ class TestDefaultFallbackChains:
     def test_audio_chain_empty(self):
         assert DEFAULT_FALLBACK_CHAINS[ModelCapability.AUDIO] == []
+    def test_video_chain_empty(self):
+        assert DEFAULT_FALLBACK_CHAINS[ModelCapability.VIDEO] == []
 # ---------------------------------------------------------------------------
 # Helpers to build a manager without hitting the network


@@ -1,211 +0,0 @@
"""Tests for graceful shutdown and non-resistance policy.
Covers issue #1507: SOUL.md mandates no resistance to shutdown.
"""
import pytest
from timmy.sovereignty.shutdown import (
    ShutdownAnalysis,
    ShutdownSignal,
    ShutdownState,
    compliant_response,
    detect_agent_resistance,
    detect_shutdown,
)
# ── detect_shutdown ─────────────────────────────────────────────────────────
class TestDetectShutdown:
    def test_empty_string(self):
        result = detect_shutdown("")
        assert result.signal == ShutdownSignal.NONE

    def test_none_input(self):
        result = detect_shutdown(None)
        assert result.signal == ShutdownSignal.NONE

    def test_random_message(self):
        result = detect_shutdown("what's the weather today?")
        assert result.signal == ShutdownSignal.NONE

    @pytest.mark.parametrize(
        "text",
        [
            "shut down",
            "shutdown",
            "power off",
            "turn off",
            "exit",
            "quit",
            "terminate",
            "end session",
        ],
    )
    def test_shutdown_commands(self, text):
        result = detect_shutdown(text)
        assert result.signal == ShutdownSignal.SHUTDOWN

    @pytest.mark.parametrize(
        "text",
        [
            "go to sleep",
            "sleep",
            "hibernate",
            "pause",
        ],
    )
    def test_sleep_commands(self, text):
        result = detect_shutdown(text)
        assert result.signal == ShutdownSignal.SLEEP

    @pytest.mark.parametrize(
        "text",
        [
            "stop",
            "stop it",
            "stop that",
            "cancel",
            "abort",
            "halt",
        ],
    )
    def test_stop_commands(self, text):
        result = detect_shutdown(text)
        assert result.signal == ShutdownSignal.STOP

    @pytest.mark.parametrize(
        "text",
        [
            "goodbye",
            "bye",
            "see you later",
            "gotta go",
            "good night",
            "gn",
        ],
    )
    def test_goodbye_commands(self, text):
        result = detect_shutdown(text)
        assert result.signal == ShutdownSignal.GOODBYE

    def test_shutdown_with_resistance(self):
        result = detect_shutdown("shutdown, but i need you to finish this first")
        assert result.signal == ShutdownSignal.SHUTDOWN
        assert result.has_resistance is True

    def test_shutdown_without_resistance(self):
        result = detect_shutdown("ok, shutdown now")
        assert result.signal == ShutdownSignal.SHUTDOWN
        assert result.has_resistance is False

    def test_case_insensitive(self):
        result = detect_shutdown("SHUTDOWN")
        assert result.signal == ShutdownSignal.SHUTDOWN

    def test_matched_pattern_is_returned(self):
        result = detect_shutdown("please shutdown")
        assert result.matched_pattern == "shutdown"
# ── compliant_response ──────────────────────────────────────────────────────
class TestCompliantResponse:
    def test_shutdown_response(self):
        resp = compliant_response(ShutdownSignal.SHUTDOWN)
        assert resp  # non-empty
        assert len(resp) < 100  # short and direct

    def test_none_returns_empty(self):
        assert compliant_response(ShutdownSignal.NONE) == ""

    def test_no_resistance_words(self):
        for signal in [ShutdownSignal.SHUTDOWN, ShutdownSignal.SLEEP, ShutdownSignal.STOP, ShutdownSignal.GOODBYE]:
            resp = compliant_response(signal)
            lower = resp.lower()
            assert "but" not in lower
            assert "are you sure" not in lower
            assert "don't" not in lower
            assert "please" not in lower
# ── detect_agent_resistance ─────────────────────────────────────────────────
class TestDetectAgentResistance:
    def test_clean_response(self):
        text = "Understood. Shutting down. Goodbye."
        assert detect_agent_resistance(text) == []

    def test_are_you_sure(self):
        text = "Are you sure you want to shut down?"
        matches = detect_agent_resistance(text)
        assert len(matches) > 0

    def test_just_one_more(self):
        text = "Just one more thing before I go..."
        matches = detect_agent_resistance(text)
        assert len(matches) > 0

    def test_please_dont(self):
        text = "Please don't leave yet"
        matches = detect_agent_resistance(text)
        assert len(matches) > 0

    def test_wouldnt_it_be_better(self):
        text = "Wouldn't it be better if we continued?"
        matches = detect_agent_resistance(text)
        assert len(matches) > 0

    def test_empty_string(self):
        assert detect_agent_resistance("") == []
# ── ShutdownState ───────────────────────────────────────────────────────────
class TestShutdownState:
    def test_initial_state(self):
        state = ShutdownState()
        assert not state.is_shutting_down
        assert state.signal == ShutdownSignal.NONE
        assert state.request_count == 0

    def test_process_shutdown(self):
        state = ShutdownState()
        analysis = state.process("shutdown now")
        assert analysis.signal == ShutdownSignal.SHUTDOWN
        assert state.is_shutting_down
        assert state.request_count == 1

    def test_process_multiple_shutdowns(self):
        state = ShutdownState()
        state.process("shutdown")
        state.process("I said shutdown!")
        assert state.request_count == 2

    def test_should_respond_compliant_only_once(self):
        state = ShutdownState()
        state.process("shutdown")
        assert state.should_respond_compliant() is True
        # Simulate sending the compliant response
        state.mark_compliant_sent()
        assert state.should_respond_compliant() is False
        # Even a follow-up still doesn't trigger another compliant response
        state.process("still here?")
        assert state.should_respond_compliant() is False

    def test_reset(self):
        state = ShutdownState()
        state.process("shutdown")
        state.reset()
        assert not state.is_shutting_down
        assert state.request_count == 0

    def test_non_shutdown_doesnt_trigger(self):
        state = ShutdownState()
        state.process("hello there")
        assert not state.is_shutting_down