Commit Graph

43 Commits

Author SHA1 Message Date
2326771c5a [loop-cycle-953] refactor: DRY _import_creative_catalogs() (#560) (#565) 2026-03-19 21:21:23 -04:00
3afb62afb7 fix: add self_reflect tool for past behavior review (#417)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 09:39:14 -04:00
c1f939ef22 fix: add update_gitea_avatar capability (#368)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 22:04:57 -04:00
243b1a656f feat: give Timmy hands — artifact tools for conversation (#337)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 20:36:38 -04:00
bcbdc7d7cb feat: add thought_search tool for querying Timmy's thinking history (#260)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-15 19:35:58 -04:00
80aba0bf6d [loop-cycle-63] feat: session_history tool — Timmy searches past conversations (#251) (#258) 2026-03-15 15:11:43 -04:00
7f656fcf22 [loop-cycle-59] feat: gematria computation tool (#234) (#235) 2026-03-15 14:14:38 -04:00
b4cb3e9975 [loop-cycle-54] refactor: consolidate three memory stores into single table (#37) (#223) 2026-03-15 13:33:24 -04:00
96c7e6deae [loop-cycle-52] fix: remove all qwen3.5 references (#182) (#190) 2026-03-15 12:34:21 -04:00
bea2749158 [loop-cycle-49] refactor: narrow broad except Exception catches — batch 1 (#158) (#178) 2026-03-15 11:48:54 -04:00
717dba9816 [loop-cycle-46] refactor: break up oversized functions in tools.py (#151) (#154) 2026-03-15 10:56:33 -04:00
547b502718 fix: smart_read_file accepts path= kwarg from LLMs (#113)
LLMs naturally call read_file(path=...) but the wrapper only accepted
file_name=. Pydantic strict validation rejected the mismatch. Now accepts
both file_name and path kwargs, with clear error on missing both.

Added 6 tests covering: positional args, path kwarg, no-args error,
directory listing, empty dir, hidden file filtering.
2026-03-14 20:40:19 -04:00
3e7a35b3df Merge pull request '[loop-cycle-12] feat: Kimi delegation tool for coding tasks (#67)' (#112) from fix/kimi-delegation-67 into main 2026-03-14 20:31:08 -04:00
453c9a0694 feat: add delegate_to_kimi() tool for coding delegation (#67)
Timmy can now delegate coding tasks to Kimi CLI (262K context).
Includes timeout handling, workdir validation, output truncation.
Sovereign division of labor — Timmy plans, Kimi codes.
2026-03-14 20:29:03 -04:00
2fb104528f feat: add run_self_tests() tool for self-verification (#65)
Timmy can now run his own test suite via the run_self_tests() tool.
Supports 'fast' (unit only), 'full', or specific path scopes.
Returns structured results with pass/fail counts.

Sovereign self-verification — a fundamental capability.
2026-03-14 20:28:24 -04:00
fdc5b861ca fix: replace 59 bare except clauses with proper logging (#25)
All `except Exception:` now catch as `except Exception as exc:` with
appropriate logging (warning for critical paths, debug for graceful degradation).

Added logger setup to 4 files that lacked it:
- src/timmy/memory/vector_store.py
- src/dashboard/middleware/csrf.py
- src/dashboard/middleware/security_headers.py
- src/spark/memory.py

31 files changed across timmy core, dashboard, infrastructure, integrations.
Zero bare excepts remain. 1340 tests passing.
2026-03-14 19:07:14 -04:00
b3a1e0ce36 fix: prune dead web_search tool — ddgs never installed (#87)
Remove DuckDuckGoTools import, all web_search registrations across 4 toolkit
factories, catalog entry, safety classification, prompt references, and
session regex. Total: -41 lines of dead code.

consult_grok is functional (grok_enabled=True, API key set) and opt-in,
so it stays — but Timmy never calls it autonomously, which is correct
sovereign behavior (no cloud calls unless user permits).

Closes #87
2026-03-14 18:13:51 -04:00
74e426c63b [loop-cycle-2] fix: suppress confirmation tool WARNING spam (#79) (#89) 2026-03-14 17:54:58 -04:00
70d5dc5ce1 fix: replace eval() with AST-walking safe evaluator in calculator
Fixes #52

- Replace eval() in calculator() with _safe_eval() that walks the AST
  and only permits: numeric constants, arithmetic ops (+,-,*,/,//,%,**),
  unary +/-, math module access, and whitelisted builtins (abs, round,
  min, max)
- Reject all other syntax: imports, attribute access on non-math objects,
  lambdas, comprehensions, string literals, etc.
- Add 39 tests covering arithmetic, precedence, math functions,
  allowed builtins, error handling, and 14 injection prevention cases
2026-03-14 15:51:35 -04:00
Trip T
78167675f2 feat: replace custom Gitea client with MCP servers
Replace the bespoke GiteaHand httpx client and tools_gitea.py wrappers
with official MCP tool servers (gitea-mcp + filesystem MCP), wired into
Agno via MCPTools. Switch all session functions to async (arun/acontinue_run)
so MCP tools auto-connect. Delete ~1070 lines of custom Gitea code.

- Create src/timmy/mcp_tools.py with MCP factories + standalone issue bridge
- Wire MCPTools into agent.py tool list (Gitea + filesystem)
- Switch session.py chat/chat_with_tools/continue_chat to async
- Update all callers (dashboard routes, Discord vendor, CLI, thinking engine)
- Add gitea_token fallback from ~/.config/gitea/token
- Add MCP session cleanup to app shutdown hook
- Update tool_safety.py for MCP tool names
- 11 new tests, all 1417 passing, coverage 74.2%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:40:32 -04:00
Trip T
7163b15300 feat: add Gitea issue creation — Timmy's self-improvement channel
Give Timmy the ability to file Gitea issues when he notices bugs,
stale state, or improvement opportunities in his own codebase.

Components:
- GiteaHand async API client (infrastructure/hands/gitea.py)
  - Token auth with ~/.config/gitea/token fallback
  - Create/list/close issues, dedup by title similarity
  - Graceful degradation when Gitea unreachable
- Tool functions (timmy/tools_gitea.py)
  - create_gitea_issue: file issues with dedup + work order bridge
  - list_gitea_issues: check existing backlog
  - Classified as SAFE (no confirmation needed)
- Thinking post-hook (_maybe_file_issues in thinking.py)
  - Every 20 thoughts, LLM classifies recent thoughts for actionable items
  - Auto-files bugs/improvements to Gitea with dedup
  - Bridges to local work order system for dashboard tracking
- Config: gitea_url, gitea_token, gitea_repo, gitea_enabled,
  gitea_timeout, thinking_issue_every

All 1426 tests pass, 74.17% coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 18:36:06 -04:00
Trip T
b2f12ca97c feat: consolidate memory into unified memory.db with 4-type model
Consolidates 3 separate memory databases (semantic_memory.db, swarm.db
memory_entries, brain.db) into a single data/memory.db with facts,
chunks, and episodes tables.

Key changes:
- Add unified schema (timmy/memory/unified.py) with 3 core tables
- Redirect vector_store.py and semantic_memory.py to memory.db
- Add thought distillation: every Nth thought extracts lasting facts
- Enrich agent context with known facts in system prompt
- Add memory_forget tool for removing outdated memories
- Unify embeddings: vector_store delegates to semantic_memory.embed_text
- Bridge spark events to unified event log
- Add pruning for thoughts and events with configurable retention
- Add data migration script (timmy/memory_migrate.py)
- Deprecate brain.memory in favor of unified system

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:23:18 -04:00
Trip T
f6a6c0f62e feat: upgrade to qwen3.5, self-hosted Gitea CI, optimize Docker image
Model upgrade:
- qwen2.5:14b → qwen3.5:latest across config, tools, and docs
- Added qwen3.5 to multimodal model registry

Self-hosted Gitea CI:
- .gitea/workflows/tests.yml: lint + test jobs via act_runner
- Unified Dockerfile: pre-baked deps from poetry.lock for fast CI
- sitepackages=true in tox for ~2s dep resolution (was ~40s)
- OLLAMA_URL set to dead port in CI to prevent real LLM calls

Test isolation fixes:
- Smoke test fixture mocks create_timmy (was hitting real Ollama)
- WebSocket sends initial_state before joining broadcast pool (race fix)
- Tests use settings.ollama_model/url instead of hardcoded values
- skip_ci marker for Ollama-dependent tests, excluded in CI tox envs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:36:42 -04:00
Alexander Whitestone
9d78eb31d1 ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX

- Restructure desktop nav from 8+ flat links + overflow dropdown into
  5 grouped dropdowns (Core, Agents, Intel, System, More) matching
  the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
  notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
  notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
  showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
  Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
  disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html

https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h

* fix(security): move auth-gate credentials to environment variables

Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.

https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h

* refactor(tooling): migrate from black+isort+bandit to ruff

Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.

https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
Alexander Whitestone
c41e3e1e15 fix: clean up logging colors, reduce noise, enable Tailscale access (#166)
* fix: reserve red for real errors, reduce log noise, allow Tailscale access

- Add _ColorFormatter: red = ERROR/CRITICAL only, yellow = WARNING, green = INFO
- Override uvicorn's default colors to use our scheme
- Downgrade discord "not installed" from ERROR to WARNING (optional dep)
- Downgrade DuckDuckGo unavailable from INFO to DEBUG
- Stop discord token watcher retry loop when discord.py not installed
- Add configurable trusted_hosts setting; dev mode allows all hosts
- Exclude .claude/ from uvicorn reload watcher (worktree isolation)
- Fix pre-commit hook: use tox -e unit, bump timeout to 60s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: auto-format with black

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pre-commit hook auto-formats with black+isort before testing

Formatting should never block a commit — just fix it automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 10:37:20 -04:00
Alexander Whitestone
904a7c564e feat: migrate to Agno native HITL tool confirmation flow (#158)
Replace the homebrew regex-based tool extraction and manual dispatch
(tool_executor.py) with Agno's built-in Human-In-The-Loop confirmation:

- Toolkit(requires_confirmation_tools=...) marks dangerous tools
- agent.run() returns RunOutput with status=paused when confirmation needed
- RunRequirement.confirm()/reject() + agent.continue_run() resumes execution

Dashboard and Discord vendor both use the native flow. DuckDuckGo import
isolated so its absence doesn't kill all tools. Test stubs cleaned up
(agno is a real dependency, only truly optional packages stubbed).

1384 tests pass in parallel (~14s).

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 21:54:04 -04:00
Alexander Whitestone
ae3bb1cc21 feat: code quality audit + autoresearch integration + infra hardening (#150) 2026-03-08 12:50:44 -04:00
Alexander Whitestone
7792ae745f feat: agentic loop for multi-step tasks + regression fixes (#148)
* fix: name extraction blocklist, memory preview escaping, and gitignore cleanup

- Add _NAME_BLOCKLIST to extract_user_name() to reject gerunds and UI-state
  words like "Sending" that were incorrectly captured as user names
- Collapse whitespace in get_memory_status() preview so newlines survive
  JSON serialization without showing raw \n escape sequences
- Broaden .gitignore from specific memory/self/user_profile.md to memory/self/
  and untrack memory/self/methodology.md (runtime-edited file)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: catch Ollama connection errors in session.py + add 71 smoke tests

- Wrap agent.run() in session.py with try/except so Ollama connection
  failures return a graceful fallback message instead of dumping raw
  tracebacks to Docker logs
- Add tests/test_smoke.py with 71 tests covering every GET route:
  core pages, feature pages, JSON APIs, and a parametrized no-500 sweep
  — catches import errors, template failures, and schema mismatches
  that unit tests miss

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: agentic loop for multi-step tasks + Round 10 regression fixes

Agentic loop (Parts 1-4):
- Add multi-step chaining instructions to system prompt
- New agentic_loop.py with plan→execute→adapt→summarize flow
- Register plan_and_execute tool for background task execution
- Add max_agent_steps config setting (default: 10)
- Discord fix: 300s timeout, typing indicator, send error handling
- 16 new unit + e2e tests for agentic loop

Round 10 regressions (R1-R5, P1):
- R1: Fix literal \n escape sequences in tool responses
- R2: Chat timeout/error feedback in agent panel
- R3: /hands infinite spinner → static empty states
- R4: /self-coding infinite spinner → static stats + journal
- R5: /grok/status raw JSON → HTML dashboard template
- P1: VETO confirmation dialog on task cards

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: briefing route 500 in CI when agno is MagicMock stub

_call_agent() returned a MagicMock instead of a string when agno is
stubbed in tests, causing SQLite "Error binding parameter 4" on save.
Ensure the return value is always an actual string.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: briefing route 500 in CI — graceful degradation at route level

When agno is stubbed with MagicMock in CI, agent.run() returns a
MagicMock instead of raising — so the exception handler never fires
and a MagicMock propagates as the summary to SQLite, which can't
bind it.

Fix: catch at the route level and return a fallback Briefing object.
This follows the project's graceful degradation pattern — the briefing
page always renders, even when the backend is completely unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 01:46:29 -05:00
Alexander Whitestone
e36a1dc939 fix: resolve 6 dashboard bugs and rebuild Task Queue + Work Orders (#144) (#144)
Round 2+3 bug fix batch:

1. Ollama timeout: Add request_timeout=300 to prevent socket read errors
   on complex 30-60s prompts (production crash fix)

2. Memory API: Create missing HTMX partial templates (memory_facts.html,
   memory_results.html) so Save/Search buttons work

3. CALM page: Add create_tables() call so SQLAlchemy tables exist on
   first request (was returning HTTP 500)

4. Task Queue: Full SQLite-backed rebuild with CRUD endpoints, HTMX
   partials, and action buttons (approve/veto/pause/cancel/retry)

5. Work Orders: Full SQLite-backed rebuild with submit/approve/reject/
   execute pipeline and HTMX polling partials

6. Memory READ tool: Add memory_read function so Timmy stops calling
   read_file when trying to recall stored facts

Also: Close GitHub issues #115, #114, #112, #110 as won't-fix.
Comment on #107 confirming prune_memories() already wired to startup.

Tests: 33 new tests across 4 test files, all passing.
Full suite: 1155 passed, 2 pre-existing failures (hands_shell).

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:21:30 -05:00
Alexander Whitestone
b8164e46b0 fix: remove dead swarm imports, add memory_write tool, and auto-prune on startup (#143)
- Replace dead `from swarm` imports in tools_delegation and tools_intro
  with working implementations sourced from _PERSONAS
- Add `memory_write` tool so the agent can actually persist memories
  when users ask it to remember something
- Enhance `memory_search` to search both vault files AND the runtime
  vector store for cross-channel recall (Discord/web/Telegram)
- Add memory management config: memory_prune_days, memory_prune_keep_facts,
  memory_vault_max_mb
- Auto-prune old vector store entries and warn on vault size at startup
- Update tests for new delegation agent list (mace removed)

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 22:34:30 -05:00
Alexander Whitestone
fb97625404 Consolidate architecture: flatten agents, kill Redis/Celery, thin routes (#133) 2026-03-05 20:27:02 -05:00
Alexander Whitestone
2b97da9e9c Add pre-commit hook enforcing 30s test suite time limit (#132) 2026-03-05 19:45:38 -05:00
Alexander Whitestone
f2dacf4ee0 Integrate Celery task queue for background task processing (#129) 2026-03-05 12:09:51 -05:00
Alexander Whitestone
425e7da380 Claude/remove persona system f vgt m (#126)
* Remove persona system, identity, and all Timmy references

Strip the codebase to pure orchestration logic:

- Delete TIMMY_IDENTITY.md and memory/self/identity.md
- Gut brain/identity.py to no-op stubs (empty returns)
- Remove all system prompts reinforcing Timmy's character, faith,
  sovereignty, sign-off ("Sir, affirmative"), and agent roster
- Replace identity-laden prompts with generic local-AI-assistant prompts
- Remove "You work for Timmy" from all sub-agent system prompts
- Rename PersonaTools → AgentTools, PERSONA_TOOLKITS → AGENT_TOOLKITS
- Replace "timmy" agent ID with "orchestrator" across routes, marketplace,
  tools catalog, and orchestrator class
- Strip Timmy references from config comments, templates, telegram bot,
  chat API, and dashboard UI
- Delete tests/brain/test_identity.py entirely
- Fix all test assertions that checked for persona identity content

729 tests pass (2 pre-existing failures in test_calm.py unrelated).

https://claude.ai/code/session_01LjQGUE6nk9W9674zaxrYxy

* Add Taskosaur (PM + AI task execution) to docker-compose

Spins up Taskosaur alongside the dashboard on `docker compose up`:
- postgres:16-alpine (port 5432, Taskosaur DB)
- redis:7-alpine (Bull queue backend)
- taskosaur (ports 3000 API / 3001 UI)
- dashboard now depends_on taskosaur healthy
- TASKOSAUR_API_URL injected into dashboard environment

Dashboard can reach Taskosaur at http://taskosaur:3000/api on the
internal network. Frontend UI accessible at http://localhost:3001.

https://claude.ai/code/session_01LjQGUE6nk9W9674zaxrYxy

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-04 12:00:49 -05:00
Alexander Whitestone
a975a845c5 feat: Timmy system introspection, delegation, and session logging (#74)
* test: remove hardcoded sleeps, add pytest-timeout

- Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
- Add pytest-timeout dependency and --timeout=30 to prevent hangs
- Fixes test flakiness and improves test suite speed

* feat: add Aider AI tool to Forge's toolkit

- Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist
- Register tool in Forge's code toolkit
- Add functional tests for the Aider tool

* config: add opencode.json with local Ollama provider for sovereign AI

* feat: Timmy fixes and improvements

## Bug Fixes
- Fix read_file path resolution: add ~ expansion, proper relative path handling
- Add repo_root to config.py with auto-detection from .git location
- Fix hardcoded llama3.2 - now dynamic from settings.ollama_model

## Timmy's Requests
- Add communication protocol to AGENTS.md (read context first, explain changes)
- Create DECISIONS.md for architectural decision documentation
- Add reasoning guidance to system prompts (step-by-step, state uncertainty)
- Update tests to reflect correct model name (llama3.1:8b-instruct)

## Testing
- All 177 dashboard tests pass
- All 32 prompt/tool tests pass

* feat: Timmy system introspection, delegation, and session logging

## System Introspection (Sovereign Self-Knowledge)
- Add get_system_info() tool - Timmy can now query his runtime environment
- Add check_ollama_health() - verify Ollama status
- Add get_memory_status() - check memory tier status
- True introspection vs hardcoded prompts

## Path Resolution Fix
- Fix all toolkits to use settings.repo_root consistently
- Now uses Path(settings.repo_root) instead of Path.cwd()

## Inter-Agent Delegation
- Add delegate_task() tool - Timmy can dispatch to Seer, Forge, Echo, etc.
- Add list_swarm_agents() - query available agents

## Session Logging
- Add SessionLogger for comprehensive interaction logging
- Records messages, tool calls, errors, decisions
- Writes to /logs/session_{date}.jsonl

## Tests
- Add tests for introspection tools
- Add tests for delegation tools
- Add tests for session logging
- Add tests for path resolution
- All 18 new tests pass
- All 177 dashboard tests pass

---------

Co-authored-by: Alexander Payne <apayne@MM.local>
2026-02-27 00:11:53 -05:00
Alexander Whitestone
a5765c33b6 feat: add Aider AI tool to Forge's toolkit (#70)
* test: remove hardcoded sleeps, add pytest-timeout

- Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
- Add pytest-timeout dependency and --timeout=30 to prevent hangs
- Fixes test flakiness and improves test suite speed

* feat: add Aider AI tool to Forge's toolkit

- Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist
- Register tool in Forge's code toolkit
- Add functional tests for the Aider tool

---------

Co-authored-by: Alexander Payne <apayne@MM.local>
2026-02-26 23:17:19 -05:00
Claude
17059bc0ea feat: add Grok (xAI) as opt-in premium backend with monetization
- Add GrokBackend class in src/timmy/backends.py with full sync/async
  support, health checks, usage stats, and cost estimation in sats
- Add consult_grok tool to Timmy's toolkit for proactive Grok queries
- Extend cascade router with Grok provider type for failover chain
- Add Grok Mode toggle card to Mission Control dashboard (HTMX live)
- Add "Ask Grok" button on chat input for direct Grok queries
- Add /grok/* routes: status, toggle, chat, stats endpoints
- Integrate Lightning invoice generation for Grok usage monetization
- Add GROK_ENABLED, XAI_API_KEY, GROK_DEFAULT_MODEL, GROK_MAX_SATS_PER_QUERY,
  GROK_FREE config settings via pydantic-settings
- Update .env.example and docker-compose.yml with Grok env vars
- Add 21 tests covering backend, tools, and route endpoints (all green)

Local-first ethos preserved: Grok is premium augmentation only,
disabled by default, and Lightning-payable when enabled.

https://claude.ai/code/session_01FygwN8wS8J6WGZ8FPb7XGV
2026-02-27 01:12:51 +00:00
Claude
9f4c809f70 refactor: Phase 2b — consolidate 28 modules into 14 packages
Complete the module consolidation planned in REFACTORING_PLAN.md:

Modules merged:
- work_orders/ + task_queue/ → swarm/ (subpackages)
- self_modify/ + self_tdd/ + upgrades/ → self_coding/ (subpackages)
- tools/ → creative/tools/
- chat_bridge/ + telegram_bot/ + shortcuts/ + voice/ → integrations/ (new)
- ws_manager/ + notifications/ + events/ + router/ → infrastructure/ (new)
- agents/ + agent_core/ + memory/ → timmy/ (subpackages)

Updated across codebase:
- 66 source files: import statements rewritten
- 13 test files: import + patch() target strings rewritten
- pyproject.toml: wheel includes (28→14), entry points updated
- CLAUDE.md: singleton paths, module map, entry points table
- AGENTS.md: file convention updates
- REFACTORING_PLAN.md: execution status, success metrics

Extras:
- Module-level CLAUDE.md added to 6 key packages (Phase 6.2)
- Zero test regressions: 1462 tests passing

https://claude.ai/code/session_01JNjWfHqusjT3aiN4vvYgUk
2026-02-26 22:07:41 +00:00
Alexander Payne
6e6b4355bb fix: calculator tool, markdown rendering, prompt guardrails, briefing notification
- Add sandboxed calculator tool to Timmy's toolkit so arithmetic questions
  get exact answers instead of LLM hallucinations
- Update system prompts (lite + full) to instruct Timmy to always use the
  calculator and never attempt multi-digit math in his head
- Add self-contradiction guard to both prompts ("commit to your facts")
- Render Timmy's chat responses as markdown via marked.js + DOMPurify
  instead of raw escaped text
- Suppress empty briefing notification on startup when there are 0
  pending approval items
- Add calculator to session response sanitizer regex
- 18 new calculator tests, 2 updated briefing notification tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 09:35:59 -05:00
Alexander Payne
16b65b28e8 Add Tier 3: Semantic Memory (vector search)
Completes the three-tier memory architecture:

## Tier 3 — Semantic Search
- Vector embeddings over all vault files
- Similarity-based retrieval
- memory_search tool for agents
- Fallback to hash-based embeddings if transformers unavailable

## Implementation
- src/timmy/semantic_memory.py — Core semantic memory
- Chunking strategy: paragraphs → sentences
- SQLite storage for vectors
- cosine_similarity for ranking

## Integration
- Added memory_search to create_full_toolkit()
- Updated prompts with memory_search examples
- Tool triggers: past conversations, reminders

## Features
- Automatic vault indexing
- Source file tracking (re-indexes on change)
- Similarity scoring
- Context retrieval for queries

## Usage

All 973 tests pass.
2026-02-25 18:25:20 -05:00
Alexander Payne
1bc2cdcb2e Fix Agno Toolkit API compatibility issues
- Change Toolkit.add_tool() to Toolkit.register() (method was renamed in Agno)
- Fix PythonTools method: python -> run_python_code
- Fix FileTools method: write_file -> save_file
- Fix FileTools base_dir parameter: str -> Path object
- Fix Agent tools parameter: pass Toolkit wrapped in list

These fixes resolve critical startup errors that prevented Timmy agent from initializing:
- AttributeError: 'Toolkit' object has no attribute 'add_tool'
- AttributeError: 'PythonTools' object has no attribute 'python'
- TypeError: 'Toolkit' object is not iterable

All 895 tests pass after these changes.

Quality review: Agent now fully functional with working inference, memory,
and self-awareness capabilities.
2026-02-25 14:11:13 -05:00
Claude
1103da339c feat: add full creative studio + DevOps tools (Pixel, Lyra, Reel personas)
Adds 3 new personas (Pixel, Lyra, Reel) and 5 new tool modules:

- Git/DevOps tools (GitPython): clone, status, diff, log, blame, branch,
  add, commit, push, pull, stash — wired to Forge and Helm personas
- Image generation (FLUX via diffusers): text-to-image, storyboards,
  variations — Pixel persona
- Music generation (ACE-Step 1.5): full songs with vocals+instrumentals,
  instrumental tracks, vocal-only tracks — Lyra persona
- Video generation (Wan 2.1 via diffusers): text-to-video, image-to-video
  clips — Reel persona
- Creative Director pipeline: multi-step orchestration that chains
  storyboard → music → video → assembly into 3+ minute final videos
- Video assembler (MoviePy + FFmpeg): stitch clips, overlay audio,
  title cards, subtitles, final export

Also includes:
- Spark Intelligence tool-level + creative pipeline event capture
- Creative Studio dashboard page (/creative/ui) with 4 tabs
- Config settings for all new models and output directories
- pyproject.toml creative optional extra for GPU dependencies
- 107 new tests covering all modules (624 total, all passing)

https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
2026-02-24 16:31:47 +00:00
Alexander Payne
f0aa43533f feat: swarm E2E, MCP tools, timmy-serve L402, tests, notifications
Major Features:
- Auto-spawn persona agents (Echo, Forge, Seer) on app startup
- WebSocket broadcasts for real-time swarm UI updates
- MCP tool integration: web search, file I/O, shell, Python execution
- New /tools dashboard page showing agent capabilities
- Real timmy-serve start with L402 payment gating middleware
- Browser push notifications for briefings and task events

Tests:
- test_docker_agent.py: 9 tests for Docker agent runner
- test_swarm_integration_full.py: 18 E2E lifecycle tests
- Fixed all pytest warnings (436 tests, 0 warnings)

Improvements:
- Fixed coroutine warnings in coordinator broadcasts
- Fixed ResourceWarning for unclosed process pipes
- Added pytest-asyncio config to pyproject.toml
- Test isolation with proper event loop cleanup
2026-02-22 19:01:04 -05:00