Commit Graph

365 Commits

Author SHA1 Message Date
09fcf956ec Merge pull request '[loop-cycle-1] feat: tool allowlist for autonomous operation (#69)' (#88) from fix/tool-allowlist-autonomous into main 2026-03-14 17:41:56 -04:00
d28e2f4a7e [loop-cycle-1] feat: tool allowlist for autonomous operation (#69)
Add config/allowlist.yaml — YAML-driven gate that auto-approves bounded
tool calls when no human is present.

When Timmy runs with --autonomous or stdin is not a terminal, tool calls
are checked against allowlist: matched → auto-approved, else → rejected.

Changes:
  - config/allowlist.yaml: shell prefixes, deny patterns, path rules
  - tool_safety.py: is_allowlisted() checks tools against YAML rules
  - cli.py: --autonomous flag, _is_interactive() detection
  - 44 new allowlist tests, 8 updated CLI tests

Closes #69
2026-03-14 17:39:48 -04:00
0b0251f702 Merge pull request '[loop-cycle-13] fix: configurable model fallback chains (#53)' (#76) from fix/configurable-fallback-models into main 2026-03-14 17:28:34 -04:00
94cd1a9840 fix: make model fallback chains configurable (#53)
Move hardcoded model fallback lists from module-level constants into
settings.fallback_models and settings.vision_fallback_models (pydantic
Settings fields). Can now be overridden via env vars
FALLBACK_MODELS / VISION_FALLBACK_MODELS or config/providers.yaml.

Removed:
- OLLAMA_MODEL_PRIMARY / OLLAMA_MODEL_FALLBACK from config.py
- DEFAULT_MODEL_FALLBACKS / VISION_MODEL_FALLBACKS from agent.py

get_effective_ollama_model() and _resolve_model_with_fallback() now
walk the configurable chains instead of hardcoded constants.

5 new tests guard the configurable behavior and prevent regression
to hardcoded constants.
2026-03-14 17:26:47 -04:00
f097784de8 Merge pull request '[loop-cycle-12] fix: brevity tuning — Timmy speaks plainly (#71)' (#75) from fix/brevity-tuning into main 2026-03-14 17:18:06 -04:00
061c8f6628 fix: brevity tuning — plain text prompts, markdown=False, front-loaded brevity
Closes #71: Timmy was responding with elaborate markdown formatting
(tables, headers, emoji, bullet lists) for simple questions.

Root causes fixed:
1. Agno Agent markdown=True flag explicitly told the model to format
   responses as markdown. Set to False in both agent.py and agents/base.py.
2. SYSTEM_PROMPT_FULL used ## and ### markdown headers, bold (**), and
   numbered lists — teaching by example that markdown is expected.
   Rewritten to plain text with labeled sections.
3. Brevity instructions were buried at the bottom of the full prompt.
   Moved to immediately after the opening line as 'VOICE AND BREVITY'
   with explicit override priority.
4. Orchestrator prompt in agents.yaml was silent on response style.
   Added 'Voice: brief, plain, direct' with concrete examples.

The full prompt is now 41 lines shorter (124 → 83). The prompt itself
practices the brevity it preaches.

SOUL.md alignment:
- 'Brevity is a kindness' — now front-loaded in both base and agent prompt
- 'I do not fill silence with noise' — explicit in both tiers
- 'I speak plainly. I prefer short sentences.' — structural enforcement

4 new tests guard against regression:
- test_full_prompt_brevity_first: brevity section before tools/memory
- test_full_prompt_no_markdown_headers: no ## or ### in prompt text
- test_full_prompt_plain_text_brevity: 'plain text' instruction present
- test_lite_prompt_brevity: lite tier also instructs brevity
2026-03-14 17:15:56 -04:00
3c671de446 Merge pull request '[loop-cycle-9] fix: thinking engine skips MCP tools to avoid cancel-scope errors (#72)' (#74) from fix/thinking-mcp-cancel-scope into main 2026-03-14 16:51:07 -04:00
rockachopa
927e25cc40 Merge pull request 'fix: replace print() with proper logging (#29, #51)' (#59) from fix/print-to-logging into main 2026-03-14 16:50:04 -04:00
rockachopa
2d2b566e58 Merge pull request 'fix: replace print() with proper logging (#29, #51)' (#59) from fix/print-to-logging into main 2026-03-14 16:34:48 -04:00
64fd1d9829 voice: reinforce brevity at top of system prompt 2026-03-14 16:32:47 -04:00
f0b0e2f202 fix: WebSocket 403 spam and missing /swarm endpoints
- CSRF middleware now skips WebSocket upgrade requests (they don't carry tokens)
- Added /swarm/live WebSocket endpoint wired to ws_manager singleton
- Added /swarm/agents/sidebar HTMX partial (was 404 on every dashboard poll)

Stops hundreds of 403 Forbidden + 404 log lines per minute.
2026-03-14 16:29:59 -04:00
b30b5c6b57 [loop-cycle-6] Break thinking rumination loop — semantic dedup (#38)
Add post-generation similarity check to ThinkingEngine.think_once().

Problem: Timmy's thinking engine generates repetitive thoughts because
small local models ignore 'don't repeat' instructions in the prompt.
The same observation ('still no chat messages', 'Alexander's name is in
profile') would appear 14+ times in a single day's journal.

Fix: After generating a thought, compare it against the last 5 thoughts
using SequenceMatcher. If similarity >= 0.6, retry with a new seed up to
2 times. If all retries produce repetitive content, discard rather than
store. Uses stdlib difflib — no new dependencies.

Changes:
- thinking.py: Add _is_too_similar() method with SequenceMatcher
- thinking.py: Wrap generation in retry loop with dedup check
- test_thinking.py: 7 new tests covering exact match, near match,
  different thoughts, retry behavior, and max-retry discard

+96/-20 lines in thinking.py, +87 lines in tests.
2026-03-14 16:21:16 -04:00
rockachopa
0d61b709da Merge pull request '[loop-cycle-5] Persist chat history in SQLite (#46)' (#63) from fix/issue-46-chat-persistence into main 2026-03-14 16:10:55 -04:00
79edfd1106 feat: persist chat history in SQLite — survives server restarts
Replace in-memory MessageLog with SQLite-backed implementation.
Same API surface (append/all/clear/len) so zero caller changes needed.

- data/chat.db stores messages with role, content, timestamp, source
- Lazy DB connection (opened on first use, not at import time)
- Retention policy: oldest messages pruned when count > 500
- New .recent(limit) method for efficient last-N queries
- Thread-safe with explicit locking
- WAL mode for concurrent read performance
- Test isolation: conftest redirects DB to tmp_path per test
- 8 new tests: persistence, retention, concurrency, source field

Closes #46
2026-03-14 16:09:26 -04:00
rockachopa
013a2cc330 Merge pull request 'feat: add --session-id to timmy chat CLI' (#62) from fix/cli-session-id into main 2026-03-14 16:06:16 -04:00
f426df5b42 feat: add --session-id option to timmy chat CLI
Allows specifying a named session for conversation persistence.
Use cases:
- Autonomous loops can have their own session (e.g. --session-id loop)
- Multiple users/agents can maintain separate conversations
- Testing different conversation threads without polluting the default

Precedence: --session-id > --new > default 'cli' session
2026-03-14 16:05:00 -04:00
rockachopa
bef4fc1024 Merge pull request '[loop-cycle-4] Push event system coverage to ≥80% on all modules' (#61) from fix/issue-45-event-coverage into main 2026-03-14 16:02:27 -04:00
9535dd86de test: push event system coverage to ≥80% on all three modules
Add 3 targeted tests for infrastructure/error_capture.py:
- test_stale_entries_pruned: exercises dedup cache pruning (line 61)
- test_git_context_fallback_on_failure: exercises exception path (lines 90-91)
- test_returns_none_when_feedback_disabled: exercises early return (line 112)

Coverage results (63 tests, all passing):
- error_capture.py: 75.6% → 80.0%
- broadcaster.py: 93.9% (unchanged)
- bus.py: 92.9% (unchanged)
- Total: 88.1% → 89.4%

Closes #45
2026-03-14 16:01:05 -04:00
70d5dc5ce1 fix: replace eval() with AST-walking safe evaluator in calculator
Fixes #52

- Replace eval() in calculator() with _safe_eval() that walks the AST
  and only permits: numeric constants, arithmetic ops (+,-,*,/,//,%,**),
  unary +/-, math module access, and whitelisted builtins (abs, round,
  min, max)
- Reject all other syntax: imports, attribute access on non-math objects,
  lambdas, comprehensions, string literals, etc.
- Add 39 tests covering arithmetic, precedence, math functions,
  allowed builtins, error handling, and 14 injection prevention cases
2026-03-14 15:51:35 -04:00
rockachopa
122d07471e Merge pull request 'fix: sanitize dynamic innerHTML in HTML templates (#47)' (#58) from fix/xss-sanitize into main 2026-03-14 15:45:11 -04:00
rockachopa
3d110098d1 Merge pull request 'feat: Add Kimi agent workspace with development scaffolding' (#44) from kimi/agent-workspace-init into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/44
2026-03-14 15:09:04 -04:00
db129bbe16 fix: replace print() with proper logging (#29, #51) 2026-03-14 15:07:07 -04:00
591954891a fix: sanitize dynamic innerHTML in templates (#47) 2026-03-14 15:07:00 -04:00
bb287b2c73 fix: sanitize WebSocket data in HTML templates (XSS #47) 2026-03-14 15:01:48 -04:00
efb1feafc9 fix: replace print() with proper logging (#29, #51) 2026-03-14 15:01:34 -04:00
6233a8ccd6 feat: Add Kimi agent workspace with development scaffolding
Create the Kimi (Moonshot AI) agent workspace per AGENTS.md conventions:

Workspace Structure:
- .kimi/AGENTS.md - Workspace guide and conventions
- .kimi/README.md - Quick reference documentation
- .kimi/CHECKPOINT.md - Session state tracking
- .kimi/TODO.md - Task list for upcoming work
- .kimi/notes/ - Working notes directory
- .kimi/plans/ - Plan documents
- .kimi/worktrees/ - Git worktrees (reserved)

Development Scripts:
- scripts/bootstrap.sh - One-time workspace setup (venv, deps, .env)
- scripts/resume.sh - Quick status check + resume prompt
- scripts/dev.sh - Development helpers (status, test, lint, format, clean, nuke)

Features:
- Validates Python 3.11+, venv, deps, .env, git config
- Provides quick status on git, tests, Ollama, dashboard
- Commands for testing, linting, formatting, cleaning

Per AGENTS.md:
- Kimi is Build Tier for large-context feature drops
- Follows existing project patterns
- No changes to source code - workspace only
2026-03-14 14:30:38 -04:00
fa838b0063 fix: clean shutdown — silence MCP async-generator teardown noise
Swallow anyio cancel-scope RuntimeError and BaseExceptionGroup
from MCP stdio_client generators during GC on voice loop exit.
Custom unraisablehook + loop exception handler + warnings filter.
2026-03-14 14:12:05 -04:00
782218aa2c fix: voice loop — persistent event loop, markdown stripping, MCP noise
Three fixes from real-world testing:

1. Event loop: replaced asyncio.run() with a persistent loop so
   Agno's MCP sessions survive across conversation turns. No more
   'Event loop is closed' errors on turn 2+.

2. Markdown stripping: voice preamble tells Timmy to respond in
   natural spoken language, plus _strip_markdown() as a safety net
   removes **bold**, *italic*, bullets, headers, code fences, etc.
   TTS no longer reads 'asterisk asterisk'.

3. MCP noise: _suppress_mcp_noise() quiets mcp/agno/httpx loggers
   during voice mode so the terminal shows clean transcript only.

32 tests (12 new for markdown stripping + persistent loop).
2026-03-14 14:05:24 -04:00
dbadfc425d feat: sovereign voice loop — timmy voice command
Adds fully local listen-think-speak voice interface.
STT: Whisper, LLM: Ollama, TTS: Piper. No cloud, no network.

- src/timmy/voice_loop.py: VoiceLoop with VAD, Whisper, Piper
- src/timmy/cli.py: new voice command
- pyproject.toml: voice extras updated
- 20 new tests
2026-03-14 13:58:56 -04:00
d770d66150 Merge pull request 'fix: fact distillation — block garbage and secrets, improve dedup' (#43) from fix/fact-distillation into main hermes/v0.1 2026-03-14 13:00:59 -04:00
8ecc0b1780 fix: fact distillation — block garbage and secrets, improve dedup
- Rewrite distillation prompt with explicit GOOD/BAD examples
  Good: user preferences, project decisions, learned knowledge
  Bad: meta-observations, internal state, credentials
- Add security filter: block facts containing token/password/secret/key patterns
- Add meta-observation filter: block self-referential 'my thinking' facts
- Lower dedup threshold 0.9 -> 0.75 to catch paraphrased duplicates

Ref #40
2026-03-14 13:00:30 -04:00
60631a7ad1 Merge pull request 'fix: persistent event loop in CLI interview — no more Event loop is closed' (#42) from fix/cli-event-loop into main 2026-03-14 12:58:46 -04:00
b222b28856 fix: use persistent event loop in interview command
Replace repeated asyncio.run() calls with a single event loop that
persists across all interview questions. The old approach created and
destroyed loops per question, orphaning MCP stdio transports and
causing 'Event loop is closed' errors on ~50% of questions.

Also adds clean shutdown: closes MCP sessions before closing the loop.

Ref #36
2026-03-14 12:58:11 -04:00
f19b52a4dc Merge pull request 'fix: corrupted memory state + regex bug in update_user_profile' (#41) from fix/corrupted-memory-state into main 2026-03-14 12:56:52 -04:00
58ddf55282 fix: regex corruption in update_user_profile + hot memory write guards
- memory_system.py: fix regex replacement in update_user_profile()
  Used lambda instead of raw replacement string to prevent corruption
- memory_system.py: add guards to update_section() for empty/oversized writes

Ref #39
2026-03-14 12:55:02 -04:00
rockachopa
d062b0a890 Merge pull request 'cleanup: delete ~8,000 lines of dead code + sovereignty fix' (#33) from cleanup/code-review-issues into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/33
2026-03-14 09:54:17 -04:00
2f623826bd cleanup: delete dead modules — ~7,900 lines removed
Closes #22, Closes #23

Deleted: brain/, swarm/, openfang/, paperclip/, cascade_adapter,
memory_migrate, agents/timmy.py, dead routes + all corresponding tests.

Updated pyproject.toml, app.py, loop_qa.py for removed imports.
2026-03-14 09:49:24 -04:00
rockachopa
c7221e27cc Merge pull request 'refactor: YAML-driven agent config — kill hardcoded personas' (#21) from refactor/yaml-driven-agents into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/21
2026-03-14 08:44:04 -04:00
Trip T
0e89caa830 test: update delegation tests for YAML-driven agent IDs
Old hardcoded IDs (seer, forge, echo, helm, quill) replaced with
YAML-defined IDs (orchestrator, researcher, coder, writer, memory,
experimenter). Added test that old names are explicitly rejected.
2026-03-14 08:40:24 -04:00
rockachopa
dc380860ba Merge pull request 'fix: MCP integration — StdioServerParameters + smoke-tested' (#20) from claude/sharp-mcnulty into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/20
2026-03-12 22:06:55 -04:00
Trip T
bd1aa55904 fix: use StdioServerParameters to bypass Agno executable whitelist
Agno's MCPTools has an undocumented executable whitelist that blocks
gitea-mcp (Go binary). Switch to server_params=StdioServerParameters()
which bypasses this restriction. Also fixes:

- Use tools.session.call_tool() for standalone invocation (MCPTools
  doesn't expose call_tool() directly)
- Use close() instead of disconnect() for cleanup
- Resolve gitea-mcp path via ~/go/bin fallback when not on PATH
- Stub mcp.client.stdio in test conftest

Smoke-tested end-to-end against real Gitea: connect, list_issues,
create issue, close issue, create_gitea_issue_via_mcp — all pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 22:03:45 -04:00
Trip T
8aef55ac07 fix: correct MCP tool names, timeout kwarg, and make mcp a core dep
- Fix tool names to match gitea-mcp server: issue_write, issue_read,
  list_issues, pull_request_write, etc. (old names didn't exist)
- Fix timeout → timeout_seconds (MCPTools API)
- Move mcp from optional to core dependency (required for agent)
- Add PR tools (pull_request_write/read, list_pull_requests)
- Fix create_gitea_issue_via_mcp to use issue_write with method="create"
- Update tool_safety.py and tests for corrected names
- Regenerate poetry.lock

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:52:28 -04:00
rockachopa
dc13b368a5 Merge pull request 'feat: replace custom Gitea with MCP servers' (#14) from claude/sharp-mcnulty into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/14
2026-03-12 21:45:55 -04:00
Trip T
78167675f2 feat: replace custom Gitea client with MCP servers
Replace the bespoke GiteaHand httpx client and tools_gitea.py wrappers
with official MCP tool servers (gitea-mcp + filesystem MCP), wired into
Agno via MCPTools. Switch all session functions to async (arun/acontinue_run)
so MCP tools auto-connect. Delete ~1070 lines of custom Gitea code.

- Create src/timmy/mcp_tools.py with MCP factories + standalone issue bridge
- Wire MCPTools into agent.py tool list (Gitea + filesystem)
- Switch session.py chat/chat_with_tools/continue_chat to async
- Update all callers (dashboard routes, Discord vendor, CLI, thinking engine)
- Add gitea_token fallback from ~/.config/gitea/token
- Add MCP session cleanup to app shutdown hook
- Update tool_safety.py for MCP tool names
- 11 new tests, all 1417 passing, coverage 74.2%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:40:32 -04:00
rockachopa
e1c6fdc3fd Merge pull request 'claude/sharp-mcnulty' (#13) from claude/sharp-mcnulty into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/13
2026-03-12 20:57:46 -04:00
Trip T
41d6ebaf6a feat: CLI session persistence + tool confirmation gate
- Chat sessions persist across `timmy chat` invocations via Agno SQLite
  (session_id="cli"), fixing context amnesia between turns
- Dangerous tools (shell, write_file, etc.) now prompt for approval in CLI
  instead of silently exiting — uses typer.confirm() + Agno continue_run
- --new flag starts a fresh conversation when needed
- Improved _maybe_file_issues prompt for engineer-quality issue bodies
  (what's happening, expected behavior, suggested fix, acceptance criteria)
- think/status commands also pass session_id for continuity

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:55:56 -04:00
Trip T
350e6f54ff fix: prevent "Event loop is closed" on repeated Gitea API calls
The httpx AsyncClient was cached across asyncio.run() boundaries.
Each asyncio.run() creates and closes a new event loop, leaving the
cached client's connections on a dead loop.  Second+ calls would fail
with "Event loop is closed".

Fix: create a fresh client per request and close it in a finally block.
No more cross-loop client reuse.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 20:40:39 -04:00
rockachopa
4ca15de1e7 Merge pull request 'feat: add Gitea issue creation — Timmy's self-improvement channel' (#9) from claude/sharp-mcnulty into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/9
2026-03-12 18:39:46 -04:00
Trip T
7163b15300 feat: add Gitea issue creation — Timmy's self-improvement channel
Give Timmy the ability to file Gitea issues when he notices bugs,
stale state, or improvement opportunities in his own codebase.

Components:
- GiteaHand async API client (infrastructure/hands/gitea.py)
  - Token auth with ~/.config/gitea/token fallback
  - Create/list/close issues, dedup by title similarity
  - Graceful degradation when Gitea unreachable
- Tool functions (timmy/tools_gitea.py)
  - create_gitea_issue: file issues with dedup + work order bridge
  - list_gitea_issues: check existing backlog
  - Classified as SAFE (no confirmation needed)
- Thinking post-hook (_maybe_file_issues in thinking.py)
  - Every 20 thoughts, LLM classifies recent thoughts for actionable items
  - Auto-files bugs/improvements to Gitea with dedup
  - Bridges to local work order system for dashboard tracking
- Config: gitea_url, gitea_token, gitea_repo, gitea_enabled,
  gitea_timeout, thinking_issue_every

All 1426 tests pass, 74.17% coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 18:36:06 -04:00
rockachopa
faa743131f Merge pull request 'feat: consolidate memory into unified memory.db with 4-type model' (#8) from claude/sharp-mcnulty into main
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/8
2026-03-12 11:28:51 -04:00