Timmy-time-dashboard

Archived

forked from Rockachopa/Timmy-time-dashboard

Author	SHA1	Message	Date
hermes	d1a8b16cd7	Merge pull request '[loop-cycle-5] test: skip voice_loop tests when numpy missing (#48 )' (#94 ) from fix/skip-voice-tests-no-numpy into main	2026-03-14 18:26:40 -04:00
Kimi Agent	bf30d26dd1	test: skip voice_loop tests gracefully when numpy unavailable Wrap numpy and voice_loop imports in try/except with pytestmark skipif. Tests skip cleanly instead of ImportError when numpy not in dev deps. Closes #48	2026-03-14 18:24:56 -04:00
Kimi Agent	b3a1e0ce36	fix: prune dead web_search tool — ddgs never installed (#87 ) Remove DuckDuckGoTools import, all web_search registrations across 4 toolkit factories, catalog entry, safety classification, prompt references, and session regex. Total: -41 lines of dead code. consult_grok is functional (grok_enabled=True, API key set) and opt-in, so it stays — but Timmy never calls it autonomously, which is correct sovereign behavior (no cloud calls unless user permits). Closes #87	2026-03-14 18:13:51 -04:00
Kimi Agent	7132b42ff3	fix: model introspection uses exact match, queries /api/ps first _get_ollama_model() used prefix match (startswith) on /api/tags, causing qwen3:30b to match qwen3.5:latest. Now: 1. Queries /api/ps (loaded models) first — most accurate 2. Falls back to /api/tags with exact name match 3. Reports actual running model, not just configured one Updated test_get_system_info_contains_model to not assume model==config. Fixes #77. 5 regression tests added.	2026-03-14 18:03:59 -04:00
hermes	74e426c63b	[loop-cycle-2] fix: suppress confirmation tool WARNING spam (#79 ) (#89 )	2026-03-14 17:54:58 -04:00
hermes	09fcf956ec	Merge pull request '[loop-cycle-1] feat: tool allowlist for autonomous operation (#69 )' (#88 ) from fix/tool-allowlist-autonomous into main	2026-03-14 17:41:56 -04:00
Kimi Agent	d28e2f4a7e	[loop-cycle-1] feat: tool allowlist for autonomous operation (#69 ) Add config/allowlist.yaml — YAML-driven gate that auto-approves bounded tool calls when no human is present. When Timmy runs with --autonomous or stdin is not a terminal, tool calls are checked against allowlist: matched → auto-approved, else → rejected. Changes: - config/allowlist.yaml: shell prefixes, deny patterns, path rules - tool_safety.py: is_allowlisted() checks tools against YAML rules - cli.py: --autonomous flag, _is_interactive() detection - 44 new allowlist tests, 8 updated CLI tests Closes #69	2026-03-14 17:39:48 -04:00
Kimi Agent	94cd1a9840	fix: make model fallback chains configurable (#53 ) Move hardcoded model fallback lists from module-level constants into settings.fallback_models and settings.vision_fallback_models (pydantic Settings fields). Can now be overridden via env vars FALLBACK_MODELS / VISION_FALLBACK_MODELS or config/providers.yaml. Removed: - OLLAMA_MODEL_PRIMARY / OLLAMA_MODEL_FALLBACK from config.py - DEFAULT_MODEL_FALLBACKS / VISION_MODEL_FALLBACKS from agent.py get_effective_ollama_model() and _resolve_model_with_fallback() now walk the configurable chains instead of hardcoded constants. 5 new tests guard the configurable behavior and prevent regression to hardcoded constants.	2026-03-14 17:26:47 -04:00
Kimi Agent	061c8f6628	fix: brevity tuning — plain text prompts, markdown=False, front-loaded brevity Closes #71: Timmy was responding with elaborate markdown formatting (tables, headers, emoji, bullet lists) for simple questions. Root causes fixed: 1. Agno Agent markdown=True flag explicitly told the model to format responses as markdown. Set to False in both agent.py and agents/base.py. 2. SYSTEM_PROMPT_FULL used ## and ### markdown headers, bold (**), and numbered lists — teaching by example that markdown is expected. Rewritten to plain text with labeled sections. 3. Brevity instructions were buried at the bottom of the full prompt. Moved to immediately after the opening line as 'VOICE AND BREVITY' with explicit override priority. 4. Orchestrator prompt in agents.yaml was silent on response style. Added 'Voice: brief, plain, direct' with concrete examples. The full prompt is now 41 lines shorter (124 → 83). The prompt itself practices the brevity it preaches. SOUL.md alignment: - 'Brevity is a kindness' — now front-loaded in both base and agent prompt - 'I do not fill silence with noise' — explicit in both tiers - 'I speak plainly. I prefer short sentences.' — structural enforcement 4 new tests guard against regression: - test_full_prompt_brevity_first: brevity section before tools/memory - test_full_prompt_no_markdown_headers: no ## or ### in prompt text - test_full_prompt_plain_text_brevity: 'plain text' instruction present - test_lite_prompt_brevity: lite tier also instructs brevity	2026-03-14 17:15:56 -04:00
rockachopa	927e25cc40	Merge pull request 'fix: replace print() with proper logging (#29 , #51 )' (#59 ) from fix/print-to-logging into main	2026-03-14 16:50:04 -04:00
Kimi Agent	b30b5c6b57	[loop-cycle-6] Break thinking rumination loop — semantic dedup (#38 ) Add post-generation similarity check to ThinkingEngine.think_once(). Problem: Timmy's thinking engine generates repetitive thoughts because small local models ignore 'don't repeat' instructions in the prompt. The same observation ('still no chat messages', 'Alexander's name is in profile') would appear 14+ times in a single day's journal. Fix: After generating a thought, compare it against the last 5 thoughts using SequenceMatcher. If similarity >= 0.6, retry with a new seed up to 2 times. If all retries produce repetitive content, discard rather than store. Uses stdlib difflib — no new dependencies. Changes: - thinking.py: Add _is_too_similar() method with SequenceMatcher - thinking.py: Wrap generation in retry loop with dedup check - test_thinking.py: 7 new tests covering exact match, near match, different thoughts, retry behavior, and max-retry discard +96/-20 lines in thinking.py, +87 lines in tests.	2026-03-14 16:21:16 -04:00
Kimi Agent	70d5dc5ce1	fix: replace eval() with AST-walking safe evaluator in calculator Fixes #52 - Replace eval() in calculator() with _safe_eval() that walks the AST and only permits: numeric constants, arithmetic ops (+,-,,/,//,%,*), unary +/-, math module access, and whitelisted builtins (abs, round, min, max) - Reject all other syntax: imports, attribute access on non-math objects, lambdas, comprehensions, string literals, etc. - Add 39 tests covering arithmetic, precedence, math functions, allowed builtins, error handling, and 14 injection prevention cases	2026-03-14 15:51:35 -04:00
Hermes Agent	782218aa2c	fix: voice loop — persistent event loop, markdown stripping, MCP noise Three fixes from real-world testing: 1. Event loop: replaced asyncio.run() with a persistent loop so Agno's MCP sessions survive across conversation turns. No more 'Event loop is closed' errors on turn 2+. 2. Markdown stripping: voice preamble tells Timmy to respond in natural spoken language, plus _strip_markdown() as a safety net removes bold, italic, bullets, headers, code fences, etc. TTS no longer reads 'asterisk asterisk'. 3. MCP noise: _suppress_mcp_noise() quiets mcp/agno/httpx loggers during voice mode so the terminal shows clean transcript only. 32 tests (12 new for markdown stripping + persistent loop).	2026-03-14 14:05:24 -04:00
Hermes Agent	dbadfc425d	feat: sovereign voice loop — timmy voice command Adds fully local listen-think-speak voice interface. STT: Whisper, LLM: Ollama, TTS: Piper. No cloud, no network. - src/timmy/voice_loop.py: VoiceLoop with VAD, Whisper, Piper - src/timmy/cli.py: new voice command - pyproject.toml: voice extras updated - 20 new tests	2026-03-14 13:58:56 -04:00
Hermes Agent	2f623826bd	cleanup: delete dead modules — ~7,900 lines removed Closes #22, Closes #23 Deleted: brain/, swarm/, openfang/, paperclip/, cascade_adapter, memory_migrate, agents/timmy.py, dead routes + all corresponding tests. Updated pyproject.toml, app.py, loop_qa.py for removed imports.	2026-03-14 09:49:24 -04:00
Trip T	78167675f2	feat: replace custom Gitea client with MCP servers Replace the bespoke GiteaHand httpx client and tools_gitea.py wrappers with official MCP tool servers (gitea-mcp + filesystem MCP), wired into Agno via MCPTools. Switch all session functions to async (arun/acontinue_run) so MCP tools auto-connect. Delete ~1070 lines of custom Gitea code. - Create src/timmy/mcp_tools.py with MCP factories + standalone issue bridge - Wire MCPTools into agent.py tool list (Gitea + filesystem) - Switch session.py chat/chat_with_tools/continue_chat to async - Update all callers (dashboard routes, Discord vendor, CLI, thinking engine) - Add gitea_token fallback from ~/.config/gitea/token - Add MCP session cleanup to app shutdown hook - Update tool_safety.py for MCP tool names - 11 new tests, all 1417 passing, coverage 74.2% Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 21:40:32 -04:00
Trip T	41d6ebaf6a	feat: CLI session persistence + tool confirmation gate - Chat sessions persist across `timmy chat` invocations via Agno SQLite (session_id="cli"), fixing context amnesia between turns - Dangerous tools (shell, write_file, etc.) now prompt for approval in CLI instead of silently exiting — uses typer.confirm() + Agno continue_run - --new flag starts a fresh conversation when needed - Improved _maybe_file_issues prompt for engineer-quality issue bodies (what's happening, expected behavior, suggested fix, acceptance criteria) - think/status commands also pass session_id for continuity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 20:55:56 -04:00
Trip T	7163b15300	feat: add Gitea issue creation — Timmy's self-improvement channel Give Timmy the ability to file Gitea issues when he notices bugs, stale state, or improvement opportunities in his own codebase. Components: - GiteaHand async API client (infrastructure/hands/gitea.py) - Token auth with ~/.config/gitea/token fallback - Create/list/close issues, dedup by title similarity - Graceful degradation when Gitea unreachable - Tool functions (timmy/tools_gitea.py) - create_gitea_issue: file issues with dedup + work order bridge - list_gitea_issues: check existing backlog - Classified as SAFE (no confirmation needed) - Thinking post-hook (_maybe_file_issues in thinking.py) - Every 20 thoughts, LLM classifies recent thoughts for actionable items - Auto-files bugs/improvements to Gitea with dedup - Bridges to local work order system for dashboard tracking - Config: gitea_url, gitea_token, gitea_repo, gitea_enabled, gitea_timeout, thinking_issue_every All 1426 tests pass, 74.17% coverage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 18:36:06 -04:00
Trip T	d42c574d26	feat: add Loop QA self-testing framework Structured self-test framework that probes 6 capabilities (tool use, multistep planning, memory read/write, self-coding, lightning econ) in round-robin. Reuses existing infra: event_log for persistence, create_task() for upgrade proposals, capture_error() for crash handling, and in-memory circuit breaker for failure tracking. - src/timmy/loop_qa.py: Capability enum, 6 async probes, orchestrator - src/dashboard/routes/loop_qa.py: JSON + HTMX health endpoints - HTMX partial polls every 30s on the health panel - Background scheduler in app.py lifespan - 25 tests covering probes, orchestrator, health snapshot, routes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 22:33:16 -04:00
Trip T	f1e909b1e3	feat: enrich thinking engine — anti-loop, anti-confabulation, grounding Rewrite _THINKING_PROMPT with strict rules: 2-3 sentence limit, anti-confabulation (only reference real data), anti-repetition. - Add _pick_seed_type() with recent-type dedup (excludes last 3) - Add _gather_system_snapshot() for real-time grounding (time, thought count, chat activity, task queue) - Improve _build_continuity_context() with anti-repetition header and 100-char truncation - Fix journal + memory timestamps to include local timezone - 12 new TDD tests covering all improvements Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 21:47:28 -04:00
Trip T	f8dadeec59	feat: tick prompt arg + fix name extraction learning verbs as names Add optional prompt argument to `timmy tick` so custom journal prompts can be passed from the CLI (seed_type="prompted"). Fix extract_user_name() learning verbs as names (e.g. "Serving"). Now requires the candidate word to start with a capital letter in the original message, rejects common verb suffixes (-ing, -tion, etc.), and deduplicates the naive regex in TimmyWithMemory to use the fixed ConversationManager.extract_user_name() instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 21:11:53 -04:00
Trip T	6a7875e05f	feat: heartbeat memory hooks — pre-recall and post-update Wire MEMORY.md + soul.md into the thinking loop so each heartbeat is grounded in identity and recent context, breaking repetitive loops. Pre-hook: _load_memory_context() reads hot memory first (changes each cycle) then soul.md (stable identity), truncated to 1500 chars. Post-hook: _update_memory() writes a "Last Reflection" section to MEMORY.md after each thought so the next cycle has fresh context. soul.md is read-only from the heartbeat — never modified by it. All hooks degrade gracefully and never crash the heartbeat. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 20:54:13 -04:00
Trip T	f6a6c0f62e	feat: upgrade to qwen3.5, self-hosted Gitea CI, optimize Docker image Model upgrade: - qwen2.5:14b → qwen3.5:latest across config, tools, and docs - Added qwen3.5 to multimodal model registry Self-hosted Gitea CI: - .gitea/workflows/tests.yml: lint + test jobs via act_runner - Unified Dockerfile: pre-baked deps from poetry.lock for fast CI - sitepackages=true in tox for ~2s dep resolution (was ~40s) - OLLAMA_URL set to dead port in CI to prevent real LLM calls Test isolation fixes: - Smoke test fixture mocks create_timmy (was hitting real Ollama) - WebSocket sends initial_state before joining broadcast pool (race fix) - Tests use settings.ollama_model/url instead of hardcoded values - skip_ci marker for Ollama-dependent tests, excluded in CI tox envs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 18:36:42 -04:00
Alexander Whitestone	36fc10097f	Claude/angry cerf (#173 ) * feat: set qwen3.5:latest as default model - Make qwen3.5:latest the primary default model for faster inference - Move llama3.1:8b-instruct to fallback chain - Update text fallback chain to prioritize qwen3.5:latest Retains full backward compatibility via cascade fallback. * test: remove ~55 brittle, duplicate, and useless tests Audit of all 100 test files identified tests that provided no real regression protection. Removed: - 4 files deleted entirely: test_setup_script (always skipped), test_csrf_bypass (tautological assertions), test_input_validation (accepts 200-500 status codes), test_security_regression (fragile source-pattern checks redundant with rendering tests) - Duplicate test classes (TestToolTracking, TestCalculatorExtended) - Mock-only tests that just verify mock wiring, not behavior - Structurally broken tests (TestCreateToolFunctions patches after import) - Empty/pass-body tests and meaningless assertions (len > 20) - Flaky subprocess tests (aider tool calling real binary) All 1328 remaining tests pass. Net: -699 lines, zero coverage loss. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent test pollution from autoresearch_enabled mutation test_autoresearch_perplexity.py was setting settings.autoresearch_enabled = True but never restoring it in the finally block — polluting subsequent tests. When pytest-randomly ordered it before test_experiments_page_shows_disabled_when_off, the victim test saw enabled=True and failed to find "Disabled" in the page. Fix both sides: - Restore autoresearch_enabled in the finally block (root cause) - Mock settings explicitly in the victim test (defense in depth) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 16:55:27 -04:00
Alexander Whitestone	9d78eb31d1	ruff (#169 ) * polish: streamline nav, extract inline styles, improve tablet UX - Restructure desktop nav from 8+ flat links + overflow dropdown into 5 grouped dropdowns (Core, Agents, Intel, System, More) matching the mobile menu structure to reduce decision fatigue - Extract all inline styles from mission_control.html and base.html notification elements into mission-control.css with semantic classes - Replace JS-built innerHTML with secure DOM construction in notification loader and chat history - Add CONNECTING state to connection indicator (amber) instead of showing OFFLINE before WebSocket connects - Add tablet breakpoint (1024px) with larger touch targets for Apple Pencil / stylus use and safe-area padding for iPad toolbar - Add active-link highlighting in desktop dropdown menus - Rename "Mission Control" page title to "System Overview" to disambiguate from the chat home page - Add "Home — Timmy Time" page title to index.html https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h * fix(security): move auth-gate credentials to environment variables Hardcoded username, password, and HMAC secret in auth-gate.py replaced with os.environ lookups. Startup now refuses to run if any variable is unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example. https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h * refactor(tooling): migrate from black+isort+bandit to ruff Replace three separate linting/formatting tools with a single ruff invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs), .pre-commit-config.yaml, and CI workflow. Fixes all ruff errors including unused imports, missing raise-from, and undefined names. Ruff config maps existing bandit skips to equivalent S-rules. https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-11 12:23:35 -04:00
Alexander Whitestone	4a4c9be1eb	fix: repair broken test patch target and add interview transcript (#156 ) - Fix test_autoresearch_perplexity: patch target was dashboard.routes.experiments.get_experiment_history but the function is imported locally inside the route handler, so patch the source module timmy.autoresearch.get_experiment_history instead. - Add tests for src/timmy/interview.py (previously 0% coverage): question structure, run_interview flow, error handling, formatting. - Produce interview transcript document from structured Timmy interview. https://claude.ai/code/session_01EXDzXqgsC2ohS8qreF1fBo Co-authored-by: Claude <noreply@anthropic.com>	2026-03-10 15:26:29 -04:00
Alexander Whitestone	904a7c564e	feat: migrate to Agno native HITL tool confirmation flow (#158 ) Replace the homebrew regex-based tool extraction and manual dispatch (tool_executor.py) with Agno's built-in Human-In-The-Loop confirmation: - Toolkit(requires_confirmation_tools=...) marks dangerous tools - agent.run() returns RunOutput with status=paused when confirmation needed - RunRequirement.confirm()/reject() + agent.continue_run() resumes execution Dashboard and Discord vendor both use the native flow. DuckDuckGo import isolated so its absence doesn't kill all tools. Test stubs cleaned up (agno is a real dependency, only truly optional packages stubbed). 1384 tests pass in parallel (~14s). Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 21:54:04 -04:00
Alexander Whitestone	574031a55c	fix: remove invalid show_tool_calls kwarg crashing Agent init (#157 ) * fix: remove invalid show_tool_calls kwarg crashing Agent init (regression) show_tool_calls was removed in `f95c960` (Feb 26) because agno 2.5.x doesn't accept it, then reintroduced in `fd0ede0` (Mar 8) without runtime testing — mocked tests hid the breakage. Replace the bogus assertion with a regression guard and an allowlist test that catches unknown kwargs before they reach production. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: auto-install git hooks, add black/isort to dev deps - Add .githooks/ with portable pre-commit hook (macOS + Linux) - make install now auto-activates hooks via core.hooksPath - Add black and isort to poetry dev group (were only in CI via raw pip) - Fix black formatting on 2 files flagged by CI - Fix test_autoresearch_perplexity patching wrong module path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 15:01:00 -04:00
Alexander Whitestone	21e2ae427a	Add test plan for autoresearch with perplexity metric (#154 )	2026-03-09 09:36:26 -04:00
Alexander Whitestone	ae3bb1cc21	feat: code quality audit + autoresearch integration + infra hardening (#150 )	2026-03-08 12:50:44 -04:00
Alexander Whitestone	fd0ede0d51	feat: auto-escalation system + agentic loop fixes (#149 ) (#149 ) Wire up automatic error-to-task escalation and fix the agentic loop stopping after the first tool call. Auto-escalation: - Add swarm.task_queue.models with create_task() bridge to existing task queue SQLite DB - Add swarm.event_log with EventType enum, log_event(), and SQLite persistence + WebSocket broadcast - Wire capture_error() into request logging middleware so unhandled HTTP exceptions auto-create [BUG] tasks with stack traces, git context, and push notifications (5-min dedup window) Agentic loop (Round 11 Bug #1): - Wrap agent_chat() in asyncio.to_thread() to stop blocking the event loop (fixes Discord heartbeat warnings) - Enable Agno's native multi-turn tool chaining via show_tool_calls and tool_call_limit on the Agent config - Strengthen multi-step continuation prompts with explicit examples Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 03:11:14 -04:00
Alexander Whitestone	7792ae745f	feat: agentic loop for multi-step tasks + regression fixes (#148 ) * fix: name extraction blocklist, memory preview escaping, and gitignore cleanup - Add _NAME_BLOCKLIST to extract_user_name() to reject gerunds and UI-state words like "Sending" that were incorrectly captured as user names - Collapse whitespace in get_memory_status() preview so newlines survive JSON serialization without showing raw \n escape sequences - Broaden .gitignore from specific memory/self/user_profile.md to memory/self/ and untrack memory/self/methodology.md (runtime-edited file) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: catch Ollama connection errors in session.py + add 71 smoke tests - Wrap agent.run() in session.py with try/except so Ollama connection failures return a graceful fallback message instead of dumping raw tracebacks to Docker logs - Add tests/test_smoke.py with 71 tests covering every GET route: core pages, feature pages, JSON APIs, and a parametrized no-500 sweep — catches import errors, template failures, and schema mismatches that unit tests miss Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: agentic loop for multi-step tasks + Round 10 regression fixes Agentic loop (Parts 1-4): - Add multi-step chaining instructions to system prompt - New agentic_loop.py with plan→execute→adapt→summarize flow - Register plan_and_execute tool for background task execution - Add max_agent_steps config setting (default: 10) - Discord fix: 300s timeout, typing indicator, send error handling - 16 new unit + e2e tests for agentic loop Round 10 regressions (R1-R5, P1): - R1: Fix literal \n escape sequences in tool responses - R2: Chat timeout/error feedback in agent panel - R3: /hands infinite spinner → static empty states - R4: /self-coding infinite spinner → static stats + journal - R5: /grok/status raw JSON → HTML dashboard template - P1: VETO confirmation dialog on task cards Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: briefing route 500 in CI when agno is MagicMock stub _call_agent() returned a MagicMock instead of a string when agno is stubbed in tests, causing SQLite "Error binding parameter 4" on save. Ensure the return value is always an actual string. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: briefing route 500 in CI — graceful degradation at route level When agno is stubbed with MagicMock in CI, agent.run() returns a MagicMock instead of raising — so the exception handler never fires and a MagicMock propagates as the summary to SQLite, which can't bind it. Fix: catch at the route level and return a fallback Briefing object. This follows the project's graceful degradation pattern — the briefing page always renders, even when the backend is completely unavailable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 01:46:29 -05:00
Alexander Whitestone	b8e0f4539f	fix: Discord memory bug — add session continuity + 6 memory system fixes (#147 ) Discord created a new agent per message with no conversation history, causing Timmy to lose context between messages (the "yes" bug). Now uses a singleton agent with per-channel/thread session_id, matching the dashboard's session.py pattern. Also applies _clean_response() to strip hallucinated tool-call JSON from Discord output. Additional fixes: - get_system_context() no longer clears the handoff file (was destroying session context on every agent creation) - Orchestrator uses HotMemory.read() to auto-create MEMORY.md if missing - vector_store DB_PATH anchored to __file__ instead of relative CWD - brain/schema.py: removed invalid .load dot-commands from INIT_SQL - tools_intro: fixed wrong table name 'vectors' → 'chunks' in tier3 check Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 00:20:38 -05:00
Alexander Whitestone	248af9ed03	fix: dashboard bugs and clean up build artifacts (#145 ) * chore: stop tracking runtime-generated self-modify reports These 65 files in data/self_modify_reports/ are auto-generated at runtime and already listed in .gitignore. Tracking them caused conflicts when pulling from main. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve 8 dashboard bugs from Round 4 testing report - Fix Ollama timeout regression: request_timeout → timeout (agno API) - Add Bootstrap JS to base.html (fixes creative UI tab switching) - Send initial_state on Swarm Live WebSocket connect - Add /api/queue/status endpoint (stops 404 log spam from chat panel) - Populate agent tools from registry on /tools page - Add notification bell dropdown with /api/notifications endpoint - All 1157 tests pass Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 23:44:56 -05:00
Alexander Whitestone	e36a1dc939	fix: resolve 6 dashboard bugs and rebuild Task Queue + Work Orders (#144 ) (#144 ) Round 2+3 bug fix batch: 1. Ollama timeout: Add request_timeout=300 to prevent socket read errors on complex 30-60s prompts (production crash fix) 2. Memory API: Create missing HTMX partial templates (memory_facts.html, memory_results.html) so Save/Search buttons work 3. CALM page: Add create_tables() call so SQLAlchemy tables exist on first request (was returning HTTP 500) 4. Task Queue: Full SQLite-backed rebuild with CRUD endpoints, HTMX partials, and action buttons (approve/veto/pause/cancel/retry) 5. Work Orders: Full SQLite-backed rebuild with submit/approve/reject/ execute pipeline and HTMX polling partials 6. Memory READ tool: Add memory_read function so Timmy stops calling read_file when trying to recall stored facts Also: Close GitHub issues #115, #114, #112, #110 as won't-fix. Comment on #107 confirming prune_memories() already wired to startup. Tests: 33 new tests across 4 test files, all passing. Full suite: 1155 passed, 2 pre-existing failures (hands_shell). Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 23:21:30 -05:00
Alexander Whitestone	b8164e46b0	fix: remove dead swarm imports, add memory_write tool, and auto-prune on startup (#143 ) - Replace dead `from swarm` imports in tools_delegation and tools_intro with working implementations sourced from _PERSONAS - Add `memory_write` tool so the agent can actually persist memories when users ask it to remember something - Enhance `memory_search` to search both vault files AND the runtime vector store for cross-channel recall (Discord/web/Telegram) - Add memory management config: memory_prune_days, memory_prune_keep_facts, memory_vault_max_mb - Auto-prune old vector store entries and warn on vault size at startup - Update tests for new delegation agent list (mace removed) Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 22:34:30 -05:00
Alexander Whitestone	3f06e7231d	Improve test coverage from 63.6% to 73.4% and fix test infrastructure (#137 )	2026-03-06 13:21:05 -05:00
Alexander Whitestone	2b97da9e9c	Add pre-commit hook enforcing 30s test suite time limit (#132 )	2026-03-05 19:45:38 -05:00
Alexander Whitestone	425e7da380	Claude/remove persona system f vgt m (#126 ) * Remove persona system, identity, and all Timmy references Strip the codebase to pure orchestration logic: - Delete TIMMY_IDENTITY.md and memory/self/identity.md - Gut brain/identity.py to no-op stubs (empty returns) - Remove all system prompts reinforcing Timmy's character, faith, sovereignty, sign-off ("Sir, affirmative"), and agent roster - Replace identity-laden prompts with generic local-AI-assistant prompts - Remove "You work for Timmy" from all sub-agent system prompts - Rename PersonaTools → AgentTools, PERSONA_TOOLKITS → AGENT_TOOLKITS - Replace "timmy" agent ID with "orchestrator" across routes, marketplace, tools catalog, and orchestrator class - Strip Timmy references from config comments, templates, telegram bot, chat API, and dashboard UI - Delete tests/brain/test_identity.py entirely - Fix all test assertions that checked for persona identity content 729 tests pass (2 pre-existing failures in test_calm.py unrelated). https://claude.ai/code/session_01LjQGUE6nk9W9674zaxrYxy * Add Taskosaur (PM + AI task execution) to docker-compose Spins up Taskosaur alongside the dashboard on `docker compose up`: - postgres:16-alpine (port 5432, Taskosaur DB) - redis:7-alpine (Bull queue backend) - taskosaur (ports 3000 API / 3001 UI) - dashboard now depends_on taskosaur healthy - TASKOSAUR_API_URL injected into dashboard environment Dashboard can reach Taskosaur at http://taskosaur:3000/api on the internal network. Frontend UI accessible at http://localhost:3001. https://claude.ai/code/session_01LjQGUE6nk9W9674zaxrYxy --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-04 12:00:49 -05:00
Alexander Whitestone	eb2b34876b	security: implement rate limiting for chat API and fix calm tests (#122 )	2026-03-03 08:16:36 -05:00
Alexander Whitestone	584eeb679e	Operation Darling Purge: slim to wealth core (-33,783 lines) (#121 )	2026-03-02 13:17:38 -05:00
Alexander Whitestone	62ef1120a4	Memory Unification + Canonical Identity: -11,074 lines of homebrew (#119 )	2026-03-02 09:58:07 -05:00
Alexander Whitestone	89cfe1be0d	fix: Docker-first test suite, UX improvements, and bug fixes (#100 ) Dashboard UX: - Restructure nav from 22 flat links to 6 core + MORE dropdown - Add mobile nav section labels (Core, Intelligence, Agents, System, Commerce) - Defer marked.js and dompurify.js loading, consolidate CDN to jsdelivr - Optimize font weights (drop unused 300/500), bump style.css cache buster - Remove duplicate HTMX load triggers from sidebar and health panels Bug fixes: - Fix Timmy showing OFFLINE by registering after swarm recovery sweep - Fix ThinkingEngine await bug with asyncio.run_coroutine_threadsafe - Fix chat auto-scroll by calling scrollChat() after history partial loads - Add missing /voice/button page and /voice/command endpoint - Fix Grok api_key="" treated as falsy falling through to env key - Fix self_modify PROJECT_ROOT using settings.repo_root instead of __file__ Docker test infrastructure: - Bind-mount hands/, docker/, Dockerfiles, and compose files into test container - Add fontconfig + fonts-dejavu-core for creative/assembler TextClip tests - Initialize minimal git repo in Dockerfile.test for GitSafety compatibility - Fix introspection and path resolution tests for Docker /app context All 1863 tests pass in Docker (0 failures, 77 skipped). Co-authored-by: Alexander Payne <apayne@MM.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-28 22:14:37 -05:00
Alexander Whitestone	d7d7a5a80a	audit: clean Docker architecture, consolidate test fixtures, add containerized test runner (#94 )	2026-02-28 16:11:58 -05:00
Alexander Whitestone	ca0c42398b	feat: migrate to Poetry, fix Docker build, and resolve 6 UI/backend bugs (#92 ) Migrate from Hatchling to Poetry for dependency management, fixing the Docker build failure caused by .dockerignore excluding README.md that Hatchling needed for metadata. Poetry export strategy bypasses this entirely. Creative extras removed from main build (separate service). Docker changes: - Multi-stage builds with poetry export → pip install - BuildKit cache mounts for faster rebuilds - All 3 Dockerfiles updated (root, dashboard, agent) Bug fixes from tester audit: - TaskStatus/TaskPriority case-insensitive enum parsing - scrollChat() upgraded to requestAnimationFrame, removed duplicate - Desktop/mobile nav items synced in base.html - HTMX pointed to direct htmx.min.js URL - Removed unused highlight.js and bootstrap.bundle.min.js - Registered missing escalation/external task handlers in app.py Co-authored-by: Alexander Payne <apayne@MM.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-28 13:12:14 -05:00
Alexander Whitestone	e5190b248a	CI/CD Optimization: Guard Rails, Pre-commit Checks, and Test Fixes (#90 ) * CI/CD Optimization: Guard Rails, Black Linting, and Pre-commit Hooks - Fixed all test collection errors (Selenium imports, fixture paths, syntax) - Implemented pre-commit hooks with Black formatting and isort - Created comprehensive Makefile with test targets (unit, integration, functional, e2e) - Added pytest.ini with marker definitions for test categorization - Established guard rails to prevent future collection errors - Wrapped optional dependencies (Selenium, MoviePy) in try-except blocks - Added conftest_markers for automatic test categorization This ensures a smooth development stream with: - Fast feedback loops (pre-commit checks before push) - Consistent code formatting (Black) - Reliable CI/CD (no collection errors, proper test isolation) - Clear test organization (unit, integration, functional, E2E) * Fix CI/CD test failures: - Export templates from dashboard.app - Fix model name assertion in test_agent.py - Fix platform-agnostic path resolution in test_path_resolution.py - Skip Docker tests in test_docker_deployment.py if docker not available - Fix test_model_fallback_chain logic in test_ollama_integration.py * Add preventative pre-commit checks and Docker test skipif decorators: - Create pre_commit_checks.py script for common CI failures - Add skipif decorators to Docker tests - Improve test robustness for CI environments	2026-02-28 11:36:50 -05:00
Alexander Whitestone	ab014dc5c6	feat: add `timmy interview` command for structured agent initialization (#87 )	2026-02-28 09:35:44 -05:00
Alexander Whitestone	849b5b1a8d	feat: add default thinking thread — Timmy always ponders (#75 )	2026-02-27 01:00:11 -05:00
Alexander Whitestone	a975a845c5	feat: Timmy system introspection, delegation, and session logging (#74 ) * test: remove hardcoded sleeps, add pytest-timeout - Replace fixed time.sleep() calls with intelligent polling or WebDriverWait - Add pytest-timeout dependency and --timeout=30 to prevent hangs - Fixes test flakiness and improves test suite speed * feat: add Aider AI tool to Forge's toolkit - Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist - Register tool in Forge's code toolkit - Add functional tests for the Aider tool * config: add opencode.json with local Ollama provider for sovereign AI * feat: Timmy fixes and improvements ## Bug Fixes - Fix read_file path resolution: add ~ expansion, proper relative path handling - Add repo_root to config.py with auto-detection from .git location - Fix hardcoded llama3.2 - now dynamic from settings.ollama_model ## Timmy's Requests - Add communication protocol to AGENTS.md (read context first, explain changes) - Create DECISIONS.md for architectural decision documentation - Add reasoning guidance to system prompts (step-by-step, state uncertainty) - Update tests to reflect correct model name (llama3.1:8b-instruct) ## Testing - All 177 dashboard tests pass - All 32 prompt/tool tests pass * feat: Timmy system introspection, delegation, and session logging ## System Introspection (Sovereign Self-Knowledge) - Add get_system_info() tool - Timmy can now query his runtime environment - Add check_ollama_health() - verify Ollama status - Add get_memory_status() - check memory tier status - True introspection vs hardcoded prompts ## Path Resolution Fix - Fix all toolkits to use settings.repo_root consistently - Now uses Path(settings.repo_root) instead of Path.cwd() ## Inter-Agent Delegation - Add delegate_task() tool - Timmy can dispatch to Seer, Forge, Echo, etc. - Add list_swarm_agents() - query available agents ## Session Logging - Add SessionLogger for comprehensive interaction logging - Records messages, tool calls, errors, decisions - Writes to /logs/session_{date}.jsonl ## Tests - Add tests for introspection tools - Add tests for delegation tools - Add tests for session logging - Add tests for path resolution - All 18 new tests pass - All 177 dashboard tests pass --------- Co-authored-by: Alexander Payne <apayne@MM.local>	2026-02-27 00:11:53 -05:00
Alexander Whitestone	18ed6232f9	feat: Timmy fixes and improvements (#72 ) * test: remove hardcoded sleeps, add pytest-timeout - Replace fixed time.sleep() calls with intelligent polling or WebDriverWait - Add pytest-timeout dependency and --timeout=30 to prevent hangs - Fixes test flakiness and improves test suite speed * feat: add Aider AI tool to Forge's toolkit - Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist - Register tool in Forge's code toolkit - Add functional tests for the Aider tool * config: add opencode.json with local Ollama provider for sovereign AI * feat: Timmy fixes and improvements ## Bug Fixes - Fix read_file path resolution: add ~ expansion, proper relative path handling - Add repo_root to config.py with auto-detection from .git location - Fix hardcoded llama3.2 - now dynamic from settings.ollama_model ## Timmy's Requests - Add communication protocol to AGENTS.md (read context first, explain changes) - Create DECISIONS.md for architectural decision documentation - Add reasoning guidance to system prompts (step-by-step, state uncertainty) - Update tests to reflect correct model name (llama3.1:8b-instruct) ## Testing - All 177 dashboard tests pass - All 32 prompt/tool tests pass --------- Co-authored-by: Alexander Payne <apayne@MM.local>	2026-02-26 23:39:13 -05:00

1 2

54 Commits