Adds a `nuke` target that kills stale processes on port 8000 and stops
Docker containers. `make dev` now runs `nuke` first, eliminating the
errno 48 (address already in use) error on restart.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
selenium depends on websocket-client which installs a top-level
`websocket` package that shadows our src/websocket/ module on CI.
Renaming to ws_manager eliminates the conflict entirely — no more
sys.path hacks needed in conftest or Selenium tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
selenium depends on websocket-client which installs a top-level
`websocket` package that shadows our src/websocket/ module. Ensure
src/ is inserted at the front of sys.path in conftest so the project
module wins the import race. Fixes collection errors for
test_websocket.py and test_websocket_extended.py on GitHub Actions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add an ollama service (behind --profile ollama) to the test compose stack
and a new test suite that verifies real LLM inference end-to-end:
- docker-compose.test.yml: add ollama/ollama service with health check,
make OLLAMA_URL and OLLAMA_MODEL configurable via env vars
- tests/functional/test_ollama_chat.py: session-scoped fixture that
brings up Ollama + dashboard, pulls qwen2.5:0.5b (~400MB, CPU-only),
and runs chat/history/multi-turn tests against the live stack
- Makefile: add `make test-ollama` target
Run with: make test-ollama (or FUNCTIONAL_DOCKER=1 pytest tests/functional/test_ollama_chat.py -v)
https://claude.ai/code/session_01NTEzfRHSZQCfkfypxgyHKk
Introduces a vendor-agnostic chat platform architecture:
- chat_bridge/base.py: ChatPlatform ABC, ChatMessage, ChatThread
- chat_bridge/registry.py: PlatformRegistry singleton
- chat_bridge/invite_parser.py: QR + Ollama vision invite extraction
- chat_bridge/vendors/discord.py: DiscordVendor with native threads
Workflow: paste a screenshot of a Discord invite or QR code at
POST /discord/join → Timmy extracts the invite automatically.
Every Discord conversation gets its own thread, keeping channels clean.
Bot responds to @mentions and DMs, routes through Timmy agent.
43 new tests (base classes, registry, invite parser, vendor, routes).
https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
Three-tier functional test infrastructure:
- CLI tests via Typer CliRunner (timmy, timmy-serve, self-tdd)
- Dashboard integration tests with real TestClient, real SQLite, real
coordinator (no patch/mock — Ollama offline = graceful degradation)
- Docker compose container-level tests (gated by FUNCTIONAL_DOCKER=1)
- End-to-end L402 payment flow with real mock-lightning backend
42 new tests (8 Docker tests skipped without FUNCTIONAL_DOCKER=1).
All 849 tests pass.
https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
Comprehensive reference covering project structure, architecture patterns,
testing conventions, development workflows, and key configuration for AI
assistants working in this repository.
https://claude.ai/code/session_01Y77ZMumHHk5t9wT8ASrpwZ
Add complete production deployment stack so Timmy can be deployed to any
cloud provider (DigitalOcean, AWS, Hetzner, etc.) with a single command.
New files:
- docker-compose.prod.yml: production stack (Caddy auto-HTTPS, Ollama LLM,
Dashboard, Timmy agent, Watchtower auto-updates)
- deploy/Caddyfile: reverse proxy with security headers and WebSocket support
- deploy/setup.sh: interactive one-click setup script for any Ubuntu/Debian server
- deploy/cloud-init.yaml: paste as User Data when creating a cloud VM
- deploy/timmy.service: systemd unit for auto-start on boot
- deploy/digitalocean/create-droplet.sh: create a DO droplet via doctl CLI
Updated:
- Dockerfile: non-root user, healthcheck, missing deps (GitPython, moviepy, redis)
- Makefile: cloud-deploy, cloud-up/down/logs/status/update/scale targets
- .env.example: DOMAIN setting for HTTPS
- .dockerignore: exclude deploy configs from image
https://claude.ai/code/session_018CduUZoEJzFynBwMsxaP8T
Incorporates findings from deep-dive audits of all 5 subsystems:
- Swarm auction timing bug (sleep(0) instead of 15s)
- Docker agent HTTP API partially wired
- L402 macaroons are HMAC-only (no caveats/delegation)
- Agent sats are bid-only, no settlement occurs
- CLI test coverage gap (2 tests for 3 commands)
- agent_core persist_memory/communicate are stubs
https://claude.ai/code/session_01SZTwAkTg6v4ybv8g9NLxqN
- Fix /serve/chat AttributeError: split Request and ChatRequest params
so auth headers are read from HTTP request, not Pydantic body
- Add regression tests for the serve_chat endpoint bug
- Add agent_core and lightning to pyproject.toml wheel includes
- Replace Apache 2.0 LICENSE with MIT to match pyproject.toml
- Update test count from "228" to "600+" across README, docs, AGENTS.md
- Add 5 missing subsystems to README table (Spark, Creative, Tools,
Telegram, agent_core/lightning)
- Update AGENTS.md project structure with 6 missing modules
- Mark completed v2 roadmap items (personas, MCP tools) in AGENTS.md
https://claude.ai/code/session_01GMiccXbo77GkV3TA69x6KS
Update GitHub URLs, clone commands, CI badge links, GitHub Pages URL,
agent team name, and hardcoded macOS paths in handoff scripts to reflect
the new GitHub username. Handoff scripts now use relative paths instead
of hardcoded /Users/apayne paths.
https://claude.ai/code/session_01GMiccXbo77GkV3TA69x6KS
Build real PNG, WAV, and MP4 fixtures (no AI models) and exercise the
full assembler and Creative Director pipeline end-to-end. Fix MoviePy v2
crossfade API (vfx.CrossFadeIn) and font resolution (DejaVu-Sans).
14 new integration tests — 638 total, all passing.
https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
Adds 3 new personas (Pixel, Lyra, Reel) and 5 new tool modules:
- Git/DevOps tools (GitPython): clone, status, diff, log, blame, branch,
add, commit, push, pull, stash — wired to Forge and Helm personas
- Image generation (FLUX via diffusers): text-to-image, storyboards,
variations — Pixel persona
- Music generation (ACE-Step 1.5): full songs with vocals+instrumentals,
instrumental tracks, vocal-only tracks — Lyra persona
- Video generation (Wan 2.1 via diffusers): text-to-video, image-to-video
clips — Reel persona
- Creative Director pipeline: multi-step orchestration that chains
storyboard → music → video → assembly into 3+ minute final videos
- Video assembler (MoviePy + FFmpeg): stitch clips, overlay audio,
title cards, subtitles, final export
Also includes:
- Spark Intelligence tool-level + creative pipeline event capture
- Creative Studio dashboard page (/creative/ui) with 4 tabs
- Config settings for all new models and output directories
- pyproject.toml creative optional extra for GPU dependencies
- 107 new tests covering all modules (624 total, all passing)
https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
Add full pytest-cov configuration with fail_under=60% threshold,
HTML/XML report targets, and proper exclude_lines. Fix websocket
history test to use public broadcast() API instead of manually
manipulating internals. Audit confirmed 491 tests at 71.2% coverage.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add bootstrap.sh and checkpoint files for 2-hour handoff cycles:
- CONTINUE.md - Quick start guide
- CHECKPOINT.md - Current state (updated by Kimi)
- TODO.md - Remaining tasks
- bootstrap.sh - One-command status check
Introduce a feedback loop where task outcomes (win/loss, success/failure)
feed back into agent bidding strategy. Borrows the "learn from outcomes"
concept from Spark Intelligence but builds it natively on Timmy's existing
SQLite + swarm architecture.
New module: src/swarm/learner.py
- Records every bid outcome with task description context
- Computes per-agent metrics: win rate, success rate, keyword performance
- suggest_bid() adjusts bids based on historical performance
- learned_keywords() discovers what task types agents actually excel at
Changes:
- persona_node: _compute_bid() now consults learner for adaptive adjustments
- coordinator: complete_task/fail_task feed results into learner
- coordinator: run_auction_and_assign records all bid outcomes
- routes/swarm: add /swarm/insights and /swarm/insights/{agent_id} endpoints
- routes/swarm: add POST /swarm/tasks/{task_id}/fail endpoint
All 413 tests pass (23 new + 390 existing).
https://claude.ai/code/session_01E5jhTCwSUnJk9p9zrTMVUJ
Assess practical usefulness of spark.vibeship.co as an
integration into Timmy's agent system. Conclusion: low value
due to fundamental purpose mismatch (developer meta-tool vs
autonomous agent system), redundant persistence/orchestration
layers, and non-trivial integration effort for unclear benefit.
https://claude.ai/code/session_01E5jhTCwSUnJk9p9zrTMVUJ
Add infrastructure for running swarm agents as isolated Docker
containers with HTTP-based coordination, startup recovery, and
enhanced dashboard UI for agent management.
- Dockerfile and docker-compose.yml for multi-service orchestration
- DockerAgentRunner for programmatic container lifecycle management
- Internal HTTP API for container agents to poll tasks and submit bids
- Startup recovery system to reconcile orphaned tasks and stale agents
- Enhanced UI partials for agent panels, chat, and task assignment
- Timmy docker entry point with heartbeat and task polling
- New Makefile targets for Docker workflows
- Tests for swarm recovery
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>