1
0
Commit Graph

60 Commits

Author SHA1 Message Date
Alexander Whitestone
c41e3e1e15 fix: clean up logging colors, reduce noise, enable Tailscale access (#166)
* fix: reserve red for real errors, reduce log noise, allow Tailscale access

- Add _ColorFormatter: red = ERROR/CRITICAL only, yellow = WARNING, green = INFO
- Override uvicorn's default colors to use our scheme
- Downgrade discord "not installed" from ERROR to WARNING (optional dep)
- Downgrade DuckDuckGo unavailable from INFO to DEBUG
- Stop discord token watcher retry loop when discord.py not installed
- Add configurable trusted_hosts setting; dev mode allows all hosts
- Exclude .claude/ from uvicorn reload watcher (worktree isolation)
- Fix pre-commit hook: use tox -e unit, bump timeout to 60s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: auto-format with black

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pre-commit hook auto-formats with black+isort before testing

Formatting should never block a commit — just fix it automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 10:37:20 -04:00
Alexander Whitestone
07f2c1b41e fix: wire up tick engine scheduler + add journal + systemd timer (#163) 2026-03-11 08:47:57 -04:00
Alexander Whitestone
82fb2417e3 feat: enable SQLite WAL mode for all databases (AGI ticket #1) (#153) 2026-03-08 16:07:02 -04:00
Alexander Whitestone
ae3bb1cc21 feat: code quality audit + autoresearch integration + infra hardening (#150) 2026-03-08 12:50:44 -04:00
Alexander Whitestone
e36a1dc939 fix: resolve 6 dashboard bugs and rebuild Task Queue + Work Orders (#144) (#144)
Round 2+3 bug fix batch:

1. Ollama timeout: Add request_timeout=300 to prevent socket read errors
   on complex 30-60s prompts (production crash fix)

2. Memory API: Create missing HTMX partial templates (memory_facts.html,
   memory_results.html) so Save/Search buttons work

3. CALM page: Add create_tables() call so SQLAlchemy tables exist on
   first request (was returning HTTP 500)

4. Task Queue: Full SQLite-backed rebuild with CRUD endpoints, HTMX
   partials, and action buttons (approve/veto/pause/cancel/retry)

5. Work Orders: Full SQLite-backed rebuild with submit/approve/reject/
   execute pipeline and HTMX polling partials

6. Memory READ tool: Add memory_read function so Timmy stops calling
   read_file when trying to recall stored facts

Also: Close GitHub issues #115, #114, #112, #110 as won't-fix.
Comment on #107 confirming prune_memories() already wired to startup.

Tests: 33 new tests across 4 test files, all passing.
Full suite: 1155 passed, 2 pre-existing failures (hands_shell).

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 23:21:30 -05:00
Alexander Whitestone
b8164e46b0 fix: remove dead swarm imports, add memory_write tool, and auto-prune on startup (#143)
- Replace dead `from swarm` imports in tools_delegation and tools_intro
  with working implementations sourced from _PERSONAS
- Add `memory_write` tool so the agent can actually persist memories
  when users ask it to remember something
- Enhance `memory_search` to search both vault files AND the runtime
  vector store for cross-channel recall (Discord/web/Telegram)
- Add memory management config: memory_prune_days, memory_prune_keep_facts,
  memory_vault_max_mb
- Auto-prune old vector store entries and warn on vault size at startup
- Update tests for new delegation agent list (mace removed)

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 22:34:30 -05:00
Alexander Whitestone
b615595100 refactor: centralize config & harden security (#141)
* feat: upgrade primary model from llama3.1:8b to qwen2.5:14b

- Swap OLLAMA_MODEL_PRIMARY to qwen2.5:14b for better reasoning
- llama3.1:8b-instruct becomes fallback
- Update .env default and README quick start
- Fix hardcoded model assertions in tests

qwen2.5:14b provides significantly better multi-step reasoning
and tool calling reliability while still running locally on
modest hardware. The 8B model remains as automatic fallback.

* security: centralize config, harden uploads, fix silent exceptions

- Add 9 pydantic Settings fields (skip_embeddings, disable_csrf,
  rqlite_url, brain_source, brain_db_path, csrf_cookie_secure,
  chat_api_max_body_bytes, timmy_test_mode) to centralize env-var access
- Migrate 8 os.environ.get() calls across 5 source files to use
  `from config import settings` per project convention
- Add path traversal defense-in-depth to file upload endpoint
- Add 1MB request body size limit to chat API
- Make CSRF cookie secure flag configurable via settings
- Replace 2 silent `except: pass` blocks with debug logging in session.py
- Remove unused `import os` from brain/memory.py and csrf.py
- Update 5 CSRF test fixtures to patch settings instead of os.environ

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 18:49:37 -05:00
Alexander Whitestone
87dc5eadfe Wire orchestrator pipe into task runner + pipe-verifying integration tests (#134) 2026-03-06 01:20:14 -05:00
Alexander Whitestone
fb97625404 Consolidate architecture: flatten agents, kill Redis/Celery, thin routes (#133) 2026-03-05 20:27:02 -05:00
Alexander Whitestone
2b97da9e9c Add pre-commit hook enforcing 30s test suite time limit (#132) 2026-03-05 19:45:38 -05:00
Alexander Whitestone
aff3edb06a Audit cleanup: security fixes, code reduction, test hygiene (#131) 2026-03-05 18:56:52 -05:00
Alexander Whitestone
f2dacf4ee0 Integrate Celery task queue for background task processing (#129) 2026-03-05 12:09:51 -05:00
AlexanderWhitestone
5e8766cef0 Fix build issues, implement missing routes, and stabilize e2e tests for production readiness 2026-03-04 17:15:46 -05:00
Alexander Whitestone
15c7ee5d1e Refactor: migrate inline middleware to dedicated classes (#123) 2026-03-04 07:58:39 -05:00
Alexander Whitestone
584eeb679e Operation Darling Purge: slim to wealth core (-33,783 lines) (#121) 2026-03-02 13:17:38 -05:00
AlexanderWhitestone
d080e67faf feat: Implement Minimum Viable Calm (MVC) feature and initial tests 2026-03-02 11:46:40 -05:00
Alexander Whitestone
62ef1120a4 Memory Unification + Canonical Identity: -11,074 lines of homebrew (#119) 2026-03-02 09:58:07 -05:00
Alexander Whitestone
b1552b9f64 Fix font resolution in creative module and add security headers to dashboard (#103) 2026-03-01 11:50:34 -05:00
Alexander Whitestone
6eefcabc97 feat: Phase 1 autonomy upgrades — introspection, heartbeat, source tagging, Discord auto-detect (#101)
UC-01: Live System Introspection Tool
- Add get_task_queue_status(), get_agent_roster(), get_live_system_status()
  to timmy/tools_intro with graceful degradation
- Enhanced get_memory_status() with line counts, section headers, vault
  directory listing, semantic memory row count, self-coding journal stats
- Register system_status MCP tool (creative/tools/system_status.py)
- Add system_status to Timmy's tool list + Hard Rule #7

UC-02: Fix Offline Status Bug
- Add registry.heartbeat() calls in task_processor run_loop() and
  process_single_task() so health endpoint reflects actual agent status
- health.py now consults swarm registry instead of Ollama connectivity

UC-03: Message Source Tagging
- Add source field to Message dataclass (default "browser")
- Tag all message_log.append() calls: browser, api, system
- Include source in /api/chat/history response

UC-04: Discord Token Auto-Detection & Docker Fix
- Add _discord_token_watcher() background coroutine that polls every 30s
  for DISCORD_TOKEN in env vars, .env file, or state file
- Add --extras discord to all three Dockerfiles (main, dashboard, test)

All 26 Phase 1 tests pass in Docker (make test-docker).
Full suite: 1889 passed, 77 skipped, 0 failed.

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 22:49:24 -05:00
Alexander Whitestone
89cfe1be0d fix: Docker-first test suite, UX improvements, and bug fixes (#100)
Dashboard UX:
- Restructure nav from 22 flat links to 6 core + MORE dropdown
- Add mobile nav section labels (Core, Intelligence, Agents, System, Commerce)
- Defer marked.js and dompurify.js loading, consolidate CDN to jsdelivr
- Optimize font weights (drop unused 300/500), bump style.css cache buster
- Remove duplicate HTMX load triggers from sidebar and health panels

Bug fixes:
- Fix Timmy showing OFFLINE by registering after swarm recovery sweep
- Fix ThinkingEngine await bug with asyncio.run_coroutine_threadsafe
- Fix chat auto-scroll by calling scrollChat() after history partial loads
- Add missing /voice/button page and /voice/command endpoint
- Fix Grok api_key="" treated as falsy falling through to env key
- Fix self_modify PROJECT_ROOT using settings.repo_root instead of __file__

Docker test infrastructure:
- Bind-mount hands/, docker/, Dockerfiles, and compose files into test container
- Add fontconfig + fonts-dejavu-core for creative/assembler TextClip tests
- Initialize minimal git repo in Dockerfile.test for GitSafety compatibility
- Fix introspection and path resolution tests for Docker /app context

All 1863 tests pass in Docker (0 failures, 77 skipped).

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 22:14:37 -05:00
Alexander Whitestone
6e67c3b421 feat: add bug report ingestion pipeline with Forge dispatch (#99)
Replace the stub `handle_bug_report` handler with a real implementation
that logs a decision trail and dispatches code_fix tasks to Forge for
automated fixing. Add `POST /api/bugs/submit` endpoint and `timmy
ingest-report` CLI command so AI test runners (Comet) can submit
structured bug reports without manual copy-paste.

- POST /api/bugs/submit: accepts JSON reports, creates bug_report tasks
- timmy ingest-report: CLI for file/stdin JSON ingestion with --dry-run
- handle_bug_report: logs decision trail to event_log, dispatches
  code_fix task to Forge with parent_task_id linking back to the bug
- 18 TDD tests covering endpoint, handler, and CLI

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 21:15:53 -05:00
Alexander Whitestone
2e92838033 fix: restore real-time chat responses via WebSocket (#98)
The chat WebSocket return path was broken by two bugs that prevented
Timmy's responses from appearing in the live chat feed:

1. Frontend checked msg.type instead of msg.event for 'timmy_response'
   events — the WSEvent dataclass uses 'event' as the field name.
2. Frontend accessed msg.response instead of msg.data.response — the
   response payload is nested in the data field.

Additional fixes:
- Queue acknowledgment ("Message queued...") no longer logged as an
  agent message in chat history; the real response is logged by the
  task processor when it completes, eliminating duplicate messages.
- Chat message template now carries data-task-id so the WS handler
  can find and replace the placeholder with the actual response.
- appendMessage() uses DOM APIs (textContent) instead of innerHTML
  for safer content insertion before markdown rendering.
- Fixed chat_message.html script targeting when queue-status div is
  present between the agent message and the inline script.

https://claude.ai/code/session_011cJfexqBBuGhSRQU8qwKcR

Co-authored-by: Claude <noreply@anthropic.com>
2026-02-28 20:22:47 -05:00
Alexander Whitestone
d4acaefee9 fix: resolve WebSocket crashes from websockets 16.0 incompatibility (#97)
The /ws redirect handler crashed with AttributeError because websockets
16.0 removed the legacy transfer_data_task attribute. The /swarm/live
endpoint could also error on early client disconnects during accept.

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 20:09:03 -05:00
Alexander Whitestone
b7c89d1101 feat: dockerize OpenFang as vendored tool runtime sidecar (#96) 2026-02-28 19:27:48 -05:00
Alexander Whitestone
1e19164379 fix: resolve portal startup hangs with non-blocking init (#93)
* fix: resolve portal startup hangs with non-blocking init

- Add socket_connect_timeout/socket_timeout (3s) to Redis connection in
  SwarmComms to prevent infinite hangs when Redis is unreachable
- Defer reconcile_on_startup() from SwarmCoordinator.__init__() to an
  explicit initialize() call during app lifespan, unblocking the
  module-level singleton creation
- Make Ollama health checks non-blocking via asyncio.to_thread() so they
  don't freeze the event loop for 2s per call
- Fix _check_redis() to reuse coordinator's SwarmComms singleton instead
  of creating a new connection on every health check
- Move discord bot platform registration from lifespan critical path
  into background task to avoid heavy import before yield
- Increase Docker healthcheck start_period from 10s/15s to 30s to give
  the app adequate time to complete startup

https://claude.ai/code/session_016t5jNBYsUAQuyoR7sXe7Ux

* fix: disable commit signing in git_tools test fixture

The git_repo fixture inherits global gpgsign config, causing git_commit
to fail when the signing server rejects unsigned source context.
Disable signing in the temp repo's local config.

https://claude.ai/code/session_016t5jNBYsUAQuyoR7sXe7Ux

* fix: add dev extras for pip-based CI install

The CI workflow runs `pip install -e ".[dev]"` but after the Poetry
migration there was no `dev` extra defined — only a Poetry dev group.
This caused pytest to not be installed, resulting in exit code 127
(command not found) on every CI run.

Add a pip-compatible `dev` extra that mirrors the Poetry dev group
so both `pip install -e ".[dev]"` and `poetry install` work.

https://claude.ai/code/session_016t5jNBYsUAQuyoR7sXe7Ux

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-02-28 15:01:48 -05:00
Alexander Whitestone
ca0c42398b feat: migrate to Poetry, fix Docker build, and resolve 6 UI/backend bugs (#92)
Migrate from Hatchling to Poetry for dependency management, fixing the
Docker build failure caused by .dockerignore excluding README.md that
Hatchling needed for metadata. Poetry export strategy bypasses this
entirely. Creative extras removed from main build (separate service).

Docker changes:
- Multi-stage builds with poetry export → pip install
- BuildKit cache mounts for faster rebuilds
- All 3 Dockerfiles updated (root, dashboard, agent)

Bug fixes from tester audit:
- TaskStatus/TaskPriority case-insensitive enum parsing
- scrollChat() upgraded to requestAnimationFrame, removed duplicate
- Desktop/mobile nav items synced in base.html
- HTMX pointed to direct htmx.min.js URL
- Removed unused highlight.js and bootstrap.bundle.min.js
- Registered missing escalation/external task handlers in app.py

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:12:14 -05:00
Alexander Whitestone
7b967d84b2 fix: restore task processor pipeline and eliminate /ws 403 spam (#91)
The microservices refactoring (PR #88) accidentally dropped handler
registration, zombie reconciliation, and startup drain from app.py.
Every task entering the queue was immediately backlogged with
"No handler for task type" because self._handlers stayed empty.

Restores the three critical blocks from app_backup.py:
- Register handlers for chat_response, thought, internal, bug_report,
  task_request
- Reconcile zombie RUNNING tasks from previous crashes
- Drain all pending tasks on startup before entering steady-state loop
- Re-approve tasks that were backlogged due to missing handlers

Also adds a /ws WebSocket catch-all that accepts stale connections and
closes with code 1008 instead of spamming 403 on every retry, and a
`make fresh` target for clean container rebuilds with no cached state.

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 12:18:18 -05:00
Alexander Whitestone
e5190b248a CI/CD Optimization: Guard Rails, Pre-commit Checks, and Test Fixes (#90)
* CI/CD Optimization: Guard Rails, Black Linting, and Pre-commit Hooks

- Fixed all test collection errors (Selenium imports, fixture paths, syntax)
- Implemented pre-commit hooks with Black formatting and isort
- Created comprehensive Makefile with test targets (unit, integration, functional, e2e)
- Added pytest.ini with marker definitions for test categorization
- Established guard rails to prevent future collection errors
- Wrapped optional dependencies (Selenium, MoviePy) in try-except blocks
- Added conftest_markers for automatic test categorization

This ensures a smooth development stream with:
- Fast feedback loops (pre-commit checks before push)
- Consistent code formatting (Black)
- Reliable CI/CD (no collection errors, proper test isolation)
- Clear test organization (unit, integration, functional, E2E)

* Fix CI/CD test failures:
- Export templates from dashboard.app
- Fix model name assertion in test_agent.py
- Fix platform-agnostic path resolution in test_path_resolution.py
- Skip Docker tests in test_docker_deployment.py if docker not available
- Fix test_model_fallback_chain logic in test_ollama_integration.py

* Add preventative pre-commit checks and Docker test skipif decorators:
- Create pre_commit_checks.py script for common CI failures
- Add skipif decorators to Docker tests
- Improve test robustness for CI environments
2026-02-28 11:36:50 -05:00
Alexander Whitestone
a5fd680428 feat: microservices refactoring with TDD and Docker optimization (#88)
## Summary
Complete refactoring of Timmy Time from monolithic architecture to microservices
using Test-Driven Development (TDD) and optimized Docker builds.

## Changes

### Core Improvements
- Optimized dashboard startup: moved blocking tasks to async background processes
- Fixed model fallback logic in agent configuration
- Enhanced test fixtures with comprehensive conftest.py

### Microservices Architecture
- Created separate Dockerfiles for dashboard, Ollama, and agent services
- Implemented docker-compose.microservices.yml for service orchestration
- Added health checks and non-root user execution for security
- Multi-stage Docker builds for lean, fast images

### Testing
- Added E2E tests for dashboard responsiveness
- Added E2E tests for Ollama integration
- Added E2E tests for microservices architecture validation
- All 36 tests passing, 8 skipped (environment-specific)

### Documentation
- Created comprehensive final report
- Generated issue resolution plan
- Added interview transcript demonstrating core agent functionality

### New Modules
- skill_absorption.py: Dynamic skill loading and integration system for Timmy

## Test Results
 36 passed, 8 skipped, 6 warnings
 All microservices tests passing
 Dashboard responsiveness verified
 Ollama integration validated

## Files Added/Modified
- docker/: Multi-stage Dockerfiles for all services
- tests/e2e/: Comprehensive E2E test suite
- src/timmy/skill_absorption.py: Skill absorption system
- src/dashboard/app.py: Optimized startup logic
- tests/conftest.py: Enhanced test fixtures
- docker-compose.microservices.yml: Service orchestration

## Breaking Changes
None - all changes are backward compatible

## Next Steps
- Integrate skill absorption system into agent workflow
- Test with microservices-tdd-refactor skill
- Deploy to production with docker-compose orchestration
2026-02-28 11:07:19 -05:00
Alexander Whitestone
add3f7a07a fix: register task_request handler and fix Docker 403 errors on macOS (#86) 2026-02-28 07:39:31 -05:00
Alexander Whitestone
3426761894 fix: unblock task queue — auto-approve all tasks, recycle zombie runners (#85)
The task queue was completely stuck: 82 tasks trapped in pending_approval,
4 zombie tasks frozen in running, and the worker loop unable to process
anything. This removes the approval gate as the default and adds startup
recovery for orphaned tasks.

- Auto-approve all tasks by default; only task_type="escalation" requires
  human review (and escalations never block the processor)
- Add reconcile_zombie_tasks() to reset RUNNING→APPROVED on startup
- Use in-memory _current_task for concurrency check instead of DB status
  so stale RUNNING rows from a crash can't block new work
- Update get_next_pending_task to only query APPROVED tasks
- Update all callsites (chat route, API, form) to match new defaults

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 06:57:51 -05:00
Alexander Whitestone
aa3263bc3b feat: automatic error feedback loop with bug report tracker (#80)
Errors and uncaught exceptions are now automatically captured, deduplicated,
persisted to a rotating log file, and filed as bug report tasks in the
existing task queue — giving Timmy a sovereign, local issue tracker with
zero new dependencies.

- Add RotatingFileHandler writing errors to logs/errors.log (5MB rotate, 5 backups)
- Add error capture module with stack-trace hashing and 5-min dedup window
- Add FastAPI exception middleware + global exception handler
- Instrument all background loops (briefing, thinking, task processor) with capture_error()
- Extend task queue with bug_report task type and auto-approve rule
- Fix auto-approve type matching (was ignoring task_type field entirely)
- Add /bugs dashboard page and /api/bugs JSON endpoints
- Add ERROR_CAPTURED and BUG_REPORT_CREATED event types for real-time feed
- Add BUGS nav link to desktop and mobile navigation
- Add 16 tests covering error capture, deduplication, and bug report routes

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 19:51:37 -05:00
Alexander Whitestone
5b6d33e05a feat: task queue system with startup drain and backlogging (#76)
* feat: add task queue system for Timmy - all work goes through the queue

- Add queue position tracking to task_queue models with task_type field
- Add TaskProcessor class that consumes tasks from queue one at a time
- Modify chat route to queue all messages for async processing
- Chat responses get 'high' priority to jump ahead of thought tasks
- Add queue status API endpoints for position polling
- Update UI to show queue position (x/y) and current task banner
- Replace thinking loop with task-based approach - thoughts are queued tasks
- Push responses to user via WebSocket instead of immediate HTTP response
- Add database migrations for existing tables

* feat: Timmy drains task queue on startup, backlogs unhandleable tasks

On spin-up, Timmy now iterates through all pending/approved tasks
immediately instead of waiting for the polling loop. Tasks without a
registered handler or with permanent errors are moved to a new
BACKLOGGED status with a reason, keeping the queue clear for work
Timmy can actually do.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Alexander Payne <apayne@MM.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 01:52:42 -05:00
Alexander Whitestone
849b5b1a8d feat: add default thinking thread — Timmy always ponders (#75) 2026-02-27 01:00:11 -05:00
Alexander Whitestone
5e60a6453b feat: wire mobile app to real Timmy backend via JSON REST API (#73)
Add /api/chat, /api/upload, and /api/chat/history endpoints to the
FastAPI dashboard so the Expo mobile app talks directly to Timmy's
brain (Ollama) instead of a non-existent Node.js server.

Backend:
- New src/dashboard/routes/chat_api.py with 4 endpoints
- Mount /uploads/ for serving chat attachments
- Same context injection and session management as HTMX chat

Mobile app fixes:
- Point API base URL at port 8000 (FastAPI) instead of 3000
- Create lib/_core/theme.ts (was referenced but never created)
- Fix shared/types.ts (remove broken drizzle/errors re-exports)
- Remove broken server/chat.ts and 1,235-line template README
- Clean package.json (remove express, mysql2, drizzle, tRPC deps)
- Remove debug console.log from theme-provider

Tests: 13 new tests covering all API endpoints (all passing).

https://claude.ai/code/session_01XqErDoh2rVsPY8oTj21Lz2

Co-authored-by: Claude <noreply@anthropic.com>
2026-02-26 23:58:53 -05:00
Claude
211c54bc8c feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:

1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
   registry for GGUF, safetensors, HF checkpoint, and Ollama models with
   role-based lookups (general, reward, teacher, judge).

2. Per-agent model assignment — each swarm persona can use a different model
   instead of sharing the global default. Resolved via registry assignment >
   persona default > global default.

3. Runtime model management API (/api/v1/models) — REST endpoints to register,
   list, assign, enable/disable, and remove custom models without restart.
   Includes a dashboard page at /models.

4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
   agent outputs using a configurable reward model. Scores persist in SQLite
   and feed into the swarm learner.

New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.

54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.

https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:27:53 +00:00
Claude
17059bc0ea feat: add Grok (xAI) as opt-in premium backend with monetization
- Add GrokBackend class in src/timmy/backends.py with full sync/async
  support, health checks, usage stats, and cost estimation in sats
- Add consult_grok tool to Timmy's toolkit for proactive Grok queries
- Extend cascade router with Grok provider type for failover chain
- Add Grok Mode toggle card to Mission Control dashboard (HTMX live)
- Add "Ask Grok" button on chat input for direct Grok queries
- Add /grok/* routes: status, toggle, chat, stats endpoints
- Integrate Lightning invoice generation for Grok usage monetization
- Add GROK_ENABLED, XAI_API_KEY, GROK_DEFAULT_MODEL, GROK_MAX_SATS_PER_QUERY,
  GROK_FREE config settings via pydantic-settings
- Update .env.example and docker-compose.yml with Grok env vars
- Add 21 tests covering backend, tools, and route endpoints (all green)

Local-first ethos preserved: Grok is premium augmentation only,
disabled by default, and Lightning-payable when enabled.

https://claude.ai/code/session_01FygwN8wS8J6WGZ8FPb7XGV
2026-02-27 01:12:51 +00:00
Claude
bc2c09d3f8 feat: replace GitHub page with embedded Timmy chat interface
Replaces the marketing landing page with a minimal, full-screen chat
interface that connects to a running Timmy instance. Mobile-first design
with single vertical scroll direction, looping scroll, no zoom, no
buttons — just type and press Enter to talk to Timmy.

- docs/index.html: full rewrite as a clean chat UI with dark terminal
  theme, looping infinite scroll, markdown rendering, connection status,
  and /connect, /clear, /help slash commands
- src/dashboard/app.py: add CORS middleware so the GitHub Pages site can
  reach a local Timmy server cross-origin
- src/config.py: add cors_origins setting (defaults to ["*"])

https://claude.ai/code/session_01AWLxg6KDWsfCATiuvsRMGr
2026-02-27 00:35:33 +00:00
Claude
9f4c809f70 refactor: Phase 2b — consolidate 28 modules into 14 packages
Complete the module consolidation planned in REFACTORING_PLAN.md:

Modules merged:
- work_orders/ + task_queue/ → swarm/ (subpackages)
- self_modify/ + self_tdd/ + upgrades/ → self_coding/ (subpackages)
- tools/ → creative/tools/
- chat_bridge/ + telegram_bot/ + shortcuts/ + voice/ → integrations/ (new)
- ws_manager/ + notifications/ + events/ + router/ → infrastructure/ (new)
- agents/ + agent_core/ + memory/ → timmy/ (subpackages)

Updated across codebase:
- 66 source files: import statements rewritten
- 13 test files: import + patch() target strings rewritten
- pyproject.toml: wheel includes (28→14), entry points updated
- CLAUDE.md: singleton paths, module map, entry points table
- AGENTS.md: file convention updates
- REFACTORING_PLAN.md: execution status, success metrics

Extras:
- Module-level CLAUDE.md added to 6 key packages (Phase 6.2)
- Zero test regressions: 1462 tests passing

https://claude.ai/code/session_01JNjWfHqusjT3aiN4vvYgUk
2026-02-26 22:07:41 +00:00
Claude
d2c80fbf4c refactor: Phase 2a — consolidate dashboard routes (27→22 files)
Merge related route files to reduce sprawl:
- voice.py ← voice_enhanced.py (enhanced pipeline merged in)
- swarm.py ← swarm_internal.py + swarm_ws.py (internal API + WebSocket)
- self_coding.py ← self_modify.py (self-modify endpoints merged in)
- Delete mobile_test.py route + template (test-only page, not for prod)
- Delete test_xss_prevention.py (tested the deleted mobile_test page)

Update app.py to use consolidated imports.
Update test_voice_enhanced.py patch paths.
Remove mobile_test.py from coverage omit (file deleted).

27 route files → 22. Tests: 1502 passed (1 removed with deleted page).

https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 21:30:39 +00:00
Alexander Payne
d7aaae74d5 feat: Hands Dashboard Routes and UI (Phase 3.6)
Add dashboard for managing autonomous Hands:

Routes (src/dashboard/routes/hands.py):
- GET /api/hands - List all Hands with status
- GET /api/hands/{name} - Get Hand details
- POST /api/hands/{name}/trigger - Manual trigger
- POST /api/hands/{name}/pause - Pause scheduled Hand
- POST /api/hands/{name}/resume - Resume paused Hand
- GET /api/approvals - List pending approvals
- POST /api/approvals/{id}/approve - Approve request
- POST /api/approvals/{id}/reject - Reject request
- GET /api/executions - List execution history

Templates:
- hands.html - Main dashboard page
- partials/hands_list.html - Active Hands list
- partials/approvals_list.html - Pending approvals
- partials/hand_executions.html - Execution history

Integration:
- Wired up in app.py
- Navigation links in base.html
2026-02-26 12:46:48 -05:00
Alexander Payne
62365cc9b2 feat: Wire up Self-Coding Dashboard
Integrate self-coding routes into dashboard:

Changes:
- Add import for self_coding_router in app.py
- Include self_coding_router in FastAPI app
- Add SELF-CODING link to desktop navigation
- Add SELF-CODING link to mobile navigation

The self-coding dashboard is now accessible at /self-coding
2026-02-26 12:28:30 -05:00
Claude
63bbe2a288 feat: add sovereign biblical text integration module (scripture)
Implement the core scripture module for local-first ESV text storage,
verse retrieval, reference parsing, original language support,
cross-referencing, topical mapping, and automated meditation workflows.

Architecture:
- scripture/constants.py: 66-book Protestant canon with aliases and metadata
- scripture/models.py: Pydantic models with integer-encoded verse IDs
- scripture/parser.py: Regex-based reference extraction and formatting
- scripture/store.py: SQLite-backed verse/xref/topic/Strong's storage
- scripture/memory.py: Tripartite memory (working/long-term/associative)
- scripture/meditation.py: Sequential/thematic/lectionary meditation scheduler
- dashboard/routes/scripture.py: REST endpoints for all scripture operations
- config.py: scripture_enabled, translation, meditation settings
- 95 comprehensive tests covering all modules and routes

https://claude.ai/code/session_015wv7FM6BFsgZ35Us6WeY7H
2026-02-26 17:06:00 +00:00
Alexander Payne
5f9bbb8435 feat: add task queue with human-in-the-loop approval + work orders + UI bug fixes
Task Queue system:
- New /tasks page with three-column layout (Pending/Active/Completed)
- Full CRUD API at /api/tasks with approve/veto/modify/pause/cancel/retry
- SQLite persistence in task_queue table
- WebSocket live updates via ws_manager
- Create task modal with agent assignment and priority
- Auto-approve rules for low-risk tasks
- HTMX polling for real-time column updates
- HOME TASK buttons now link to task queue with agent pre-selected
- MARKET HIRE buttons link to task queue with agent pre-selected

Work Order system:
- External submission API for agents/users (POST /work-orders/submit)
- Risk scoring and configurable auto-execution thresholds
- Dashboard at /work-orders/queue with approve/reject/execute flow
- Integration with swarm task system for execution

UI & Dashboard bug fixes:
- EVENTS: add startup event so page is never empty
- LEDGER: fix empty filter params in URL
- MISSION CONTROL: LLM backend and model now read from /health
- MISSION CONTROL: agent count fallback to /swarm/agents
- SWARM: HTMX fallback loads initial data if WebSocket is slow
- MEMORY: add edit/delete buttons for personal facts
- UPGRADES: add empty state guidance with links
- BRIEFING: add regenerate button and POST /briefing/regenerate endpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 10:27:08 -05:00
Alexander Payne
f95c9606f1 fix: Timmy startup crashes and clean initialization
- Remove show_tool_calls kwarg (not in Agno 2.5.3), which crashed Agent.__init__
- Guard memory_search against top_k=None from model, return formatted string
- Skip Telegram/Discord startup silently when no token configured
- Replace placeholder MEMORY.md with proper structured hot memory document

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 09:11:48 -05:00
Alexander Payne
d8d976aa60 feat: complete Event Log, Ledger, Memory, Cascade Router, Upgrade Queue, Activity Feed
This commit implements six major features:

1. Event Log System (src/swarm/event_log.py)
   - SQLite-based audit trail for all swarm events
   - Task lifecycle tracking (created, assigned, completed, failed)
   - Agent lifecycle tracking (joined, left, status changes)
   - Integrated with coordinator for automatic logging
   - Dashboard page at /swarm/events

2. Lightning Ledger (src/lightning/ledger.py)
   - Transaction tracking for Lightning Network payments
   - Balance calculations (incoming, outgoing, net, available)
   - Integrated with payment_handler for automatic logging
   - Dashboard page at /lightning/ledger

3. Semantic Memory / Vector Store (src/memory/vector_store.py)
   - Embedding-based similarity search for Echo agent
   - Fallback to keyword matching if sentence-transformers unavailable
   - Personal facts storage and retrieval
   - Dashboard page at /memory

4. Cascade Router Integration (src/timmy/cascade_adapter.py)
   - Automatic LLM failover between providers (Ollama → AirLLM → API)
   - Circuit breaker pattern for failing providers
   - Metrics tracking per provider (latency, error rates)
   - Dashboard status page at /router/status

5. Self-Upgrade Approval Queue (src/upgrades/)
   - State machine for self-modifications: proposed → approved/rejected → applied/failed
   - Human approval required before applying changes
   - Git integration for branch management
   - Dashboard queue at /self-modify/queue

6. Real-Time Activity Feed (src/events/broadcaster.py)
   - WebSocket-based live activity streaming
   - Bridges event_log to dashboard clients
   - Activity panel on /swarm/live

Tests:
- 101 unit tests passing
- 4 new E2E test files for Selenium testing
- Run with: SELENIUM_UI=1 pytest tests/functional/ -v --headed

Documentation:
- 6 ADRs (017-022) documenting architecture decisions
- Implementation summary in docs/IMPLEMENTATION_SUMMARY.md
- Architecture diagram in docs/architecture-v2.md
2026-02-26 08:01:01 -05:00
Alexander Payne
56437751d3 Phase 4: Tool Registry Auto-Discovery
- @mcp_tool decorator for marking functions as tools
- ToolDiscovery class for introspecting modules and packages
- Automatic JSON schema generation from type hints
- AST-based discovery for files (without importing)
- Auto-bootstrap on startup (packages=['tools'] by default)
- Support for tags, categories, and metadata
- Updated registry with register_tool() convenience method
- Environment variable MCP_AUTO_BOOTSTRAP to disable
- 39 tests with proper isolation and cleanup

Files Added:
- src/mcp/discovery.py: Tool discovery and introspection
- src/mcp/bootstrap.py: Auto-bootstrap functionality
- tests/test_mcp_discovery.py: 26 tests
- tests/test_mcp_bootstrap.py: 13 tests

Files Modified:
- src/mcp/registry.py: Added tags, source_module, auto_discovered fields
- src/mcp/__init__.py: Export discovery and bootstrap modules
- src/dashboard/app.py: Auto-bootstrap on startup
2026-02-25 19:59:42 -05:00
Alexander Payne
c658ca829c Phase 3: Cascade LLM Router with automatic failover
- YAML-based provider configuration (config/providers.yaml)
- Priority-ordered provider routing
- Circuit breaker pattern for failing providers
- Health check and availability monitoring
- Metrics tracking (latency, errors, success rates)
- Support for Ollama, OpenAI, Anthropic, AirLLM providers
- Automatic failover on rate limits or errors
- REST API endpoints for monitoring and control
- 41 comprehensive tests

API Endpoints:
- POST /api/v1/router/complete - Chat completion with failover
- GET /api/v1/router/status - Provider health status
- GET /api/v1/router/metrics - Detailed metrics
- GET /api/v1/router/providers - List all providers
- POST /api/v1/router/providers/{name}/control - Enable/disable/reset
- POST /api/v1/router/health-check - Run health checks
- GET /api/v1/router/config - View configuration
2026-02-25 19:43:43 -05:00
Alexander Payne
8fec9c41a5 feat: autonomous self-modifying agent with multi-backend LLM support
Adds SelfModifyLoop — an edit→validate→test→commit cycle that can read
its own failure reports, diagnose root causes, and restart autonomously.

Key capabilities:
- Multi-backend LLM: Anthropic Claude API, Ollama, or auto-detect
- Syntax validation via compile() before writing to disk
- Autonomous self-correction loop with configurable max cycles
- XML-based output format to avoid triple-quote delimiter conflicts
- Branch creation skipped by default to prevent container restarts
- CLI: self-modify run "instruction" --backend auto --autonomous
- 939 tests passing, 30 skipped

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 17:23:47 -05:00
Claude
15596ca325 feat: add Discord integration with chat_bridge abstraction layer
Introduces a vendor-agnostic chat platform architecture:

- chat_bridge/base.py: ChatPlatform ABC, ChatMessage, ChatThread
- chat_bridge/registry.py: PlatformRegistry singleton
- chat_bridge/invite_parser.py: QR + Ollama vision invite extraction
- chat_bridge/vendors/discord.py: DiscordVendor with native threads

Workflow: paste a screenshot of a Discord invite or QR code at
POST /discord/join → Timmy extracts the invite automatically.

Every Discord conversation gets its own thread, keeping channels clean.
Bot responds to @mentions and DMs, routes through Timmy agent.

43 new tests (base classes, registry, invite parser, vendor, routes).

https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
2026-02-25 01:11:14 +00:00