Commit Graph

96 Commits

Author SHA1 Message Date
Alexander Payne
1bc2cdcb2e Fix Agno Toolkit API compatibility issues
- Change Toolkit.add_tool() to Toolkit.register() (method was renamed in Agno)
- Fix PythonTools method: python -> run_python_code
- Fix FileTools method: write_file -> save_file
- Fix FileTools base_dir parameter: str -> Path object
- Fix Agent tools parameter: pass Toolkit wrapped in list

These fixes resolve critical startup errors that prevented the Timmy agent from initializing:
- AttributeError: 'Toolkit' object has no attribute 'add_tool'
- AttributeError: 'PythonTools' object has no attribute 'python'
- TypeError: 'Toolkit' object is not iterable

All 895 tests pass after these changes.

Quality review: Agent now fully functional with working inference, memory,
and self-awareness capabilities.
2026-02-25 14:11:13 -05:00
Alexander Whitestone
eac149e9ab Merge pull request #32 from AlexanderWhitestone/claude/review-mac-deployment-siJ6f 2026-02-25 13:20:29 -05:00
Claude
2e7f3d1b29 feat: centralize L402 config, automate Metal install, fix watchdog cleanup
- config.py: add L402_HMAC_SECRET, L402_MACAROON_SECRET, LIGHTNING_BACKEND
  to pydantic-settings with startup warnings for default secrets
- l402_proxy.py, mock_backend.py, factory.py: migrate from os.environ.get()
  to `from config import settings` per project convention
- Makefile: `make install-creative` now auto-installs PyTorch nightly with
  Metal (MPS) support on Apple Silicon instead of just printing a note
- activate_self_tdd.sh: add PID file (.watchdog.pid) and EXIT trap so
  Ctrl-C cleanly stops both the dashboard and the watchdog process
- .gitignore: add .watchdog.pid

https://claude.ai/code/session_01A81E5HMxZEPxzv2acNo35u
2026-02-25 18:19:22 +00:00
Claude
c0ca166d43 fix: improve macOS deployment compatibility and Docker build hygiene
- .gitignore: add missing macOS artifacts (.AppleDouble, .Spotlight-V100, etc.)
- Makefile: fix `make ip` to detect network interfaces on both macOS and Linux
  (adds `ip` command fallback, guards macOS-only `ipconfig` behind uname check)
- Makefile: add `make install-creative` target with Apple Silicon Metal guidance
- Dockerfile: install deps from pyproject.toml instead of duplicating the list,
  eliminating drift between Dockerfile and pyproject.toml
- docker-compose.yml: document data/ directory prerequisite for bind-mount volume

https://claude.ai/code/session_01A81E5HMxZEPxzv2acNo35u
2026-02-25 18:08:57 +00:00
Alexander Whitestone
c41185a588 Merge pull request #31 from AlexanderWhitestone/claude/heuristic-wu
fix: auto-clean port 8000 and containers before `make dev`
2026-02-25 11:03:19 -05:00
Alexander Payne
127011b898 fix: auto-clean port 8000 and containers before make dev
Adds a `nuke` target that kills stale processes on port 8000 and stops
Docker containers. `make dev` now runs `nuke` first, eliminating the
errno 48 (address already in use) error on restart.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 11:01:11 -05:00
Alexander Whitestone
c430f8002c Merge pull request #29 from AlexanderWhitestone/fix/xss-prevention-mobile-test
Security: XSS Prevention in Mobile Test Page
2026-02-25 08:01:05 -05:00
Alexander Whitestone
cc7e151a73 Merge pull request #30 from AlexanderWhitestone/claude/keen-taussig
Single-command Docker startup, fix UI bugs, add Selenium tests
2026-02-25 08:00:02 -05:00
Alexander Payne
3463f4e4a4 fix: rename src/websocket to src/ws_manager to avoid websocket-client clash
selenium depends on websocket-client which installs a top-level
`websocket` package that shadows our src/websocket/ module on CI.
Renaming to ws_manager eliminates the conflict entirely — no more
sys.path hacks needed in conftest or Selenium tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 07:57:28 -05:00
Alexander Payne
e483748816 fix: resolve websocket-client shadowing src/websocket on CI
selenium depends on websocket-client which installs a top-level
`websocket` package that shadows our src/websocket/ module.  Ensure
src/ is inserted at the front of sys.path in conftest so the project
module wins the import race.  Fixes collection errors for
test_websocket.py and test_websocket_extended.py on GitHub Actions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 07:32:57 -05:00
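
The conftest change described in e483748816 amounts to putting the project's src/ directory ahead of site-packages on sys.path. A minimal sketch of that idea follows; the helper name `prefer_project_src` and the assumption that src/ sits next to conftest.py are illustrative and not taken from the commit.

```python
import sys
from pathlib import Path


def prefer_project_src(src_dir: Path) -> None:
    """Move src_dir to the front of sys.path so that its packages are found
    before same-named packages installed in site-packages."""
    src = str(src_dir.resolve())
    # Drop any existing entry first so the path is not listed twice.
    sys.path[:] = [p for p in sys.path if p != src]
    sys.path.insert(0, src)


# Hypothetical usage at the top of conftest.py:
# prefer_project_src(Path(__file__).parent / "src")
```

This only helps if it runs before anything imports `websocket`, because an already imported module stays cached in sys.modules. The rename to ws_manager in 3463f4e4a4 removes the need for the workaround.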
Alexander Payne
29292cfb84 feat: single-command Docker startup, fix UI bugs, add Selenium tests
- Add `make up` / `make up DEV=1` for one-command Docker startup with
  optional hot-reload via docker-compose.dev.yml overlay
- Add `timmy up --dev` / `timmy down` CLI commands
- Fix cross-platform font resolution in creative assembler (7 test failures)
- Fix Ollama host URL not passed to Agno model (container connectivity)
- Fix task panel route shadowing by reordering literal routes before
  parameterized routes in swarm.py
- Fix chat input not clearing after send (hx-on::after-request)
- Fix chat scroll overflow (CSS min-height: 0 on flex children)
- Add Selenium UI smoke tests (17 tests, gated behind SELENIUM_UI=1)
- Install fonts-dejavu-core in Dockerfile for container font support
- Remove obsolete docker-compose version key
- Bump CSS cache-bust to v4

833 unit tests pass, 15 Selenium tests pass (2 skipped).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 07:20:56 -05:00
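
The route-shadowing fix in 29292cfb84 relies on routers that return the first matching pattern. The sketch below simulates that behaviour with the standard library only; the paths and handler names are hypothetical and are not the actual routes in swarm.py.

```python
import re
from typing import List, Optional, Tuple

Route = Tuple[str, str]  # (regex pattern, handler name)


def resolve(routes: List[Route], path: str) -> Optional[str]:
    """Return the handler of the first route whose pattern matches path."""
    for pattern, handler in routes:
        if re.fullmatch(pattern, path):
            return handler
    return None


# Parameterized route first: it also matches the literal path and shadows it.
shadowed = [
    (r"/swarm/tasks/(?P<task_id>[^/]+)", "task_detail"),
    (r"/swarm/tasks/panel", "task_panel"),
]

# Literal route first: the literal path reaches its own handler.
reordered = [
    (r"/swarm/tasks/panel", "task_panel"),
    (r"/swarm/tasks/(?P<task_id>[^/]+)", "task_detail"),
]
```

With the shadowed ordering a request for the literal path is handled as if "panel" were a task id, which is the class of bug the reordering avoids.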
AlexanderWhitestone
bc1be23e23 security: prevent XSS in mobile-test by using textContent 2026-02-25 02:08:02 -05:00
Alexander Whitestone
6bae3af268 Merge pull request #28 from AlexanderWhitestone/claude/discord-bot-setup-exwPO 2026-02-24 22:05:48 -05:00
Claude
78cf91697c feat: add functional Ollama chat tests with containerised LLM
Add an ollama service (behind --profile ollama) to the test compose stack
and a new test suite that verifies real LLM inference end-to-end:

- docker-compose.test.yml: add ollama/ollama service with health check,
  make OLLAMA_URL and OLLAMA_MODEL configurable via env vars
- tests/functional/test_ollama_chat.py: session-scoped fixture that
  brings up Ollama + dashboard, pulls qwen2.5:0.5b (~400MB, CPU-only),
  and runs chat/history/multi-turn tests against the live stack
- Makefile: add `make test-ollama` target

Run with: make test-ollama (or FUNCTIONAL_DOCKER=1 pytest tests/functional/test_ollama_chat.py -v)

https://claude.ai/code/session_01NTEzfRHSZQCfkfypxgyHKk
2026-02-25 02:44:36 +00:00
Alexander Whitestone
df222f7d7e Merge pull request #27 from AlexanderWhitestone/claude/analyze-test-coverage-KBlkN 2026-02-24 21:16:01 -05:00
Claude
548319cb10 chore: gitignore discord_state.json (contains bot token)
https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
2026-02-25 01:12:03 +00:00
Claude
15596ca325 feat: add Discord integration with chat_bridge abstraction layer
Introduces a vendor-agnostic chat platform architecture:

- chat_bridge/base.py: ChatPlatform ABC, ChatMessage, ChatThread
- chat_bridge/registry.py: PlatformRegistry singleton
- chat_bridge/invite_parser.py: QR + Ollama vision invite extraction
- chat_bridge/vendors/discord.py: DiscordVendor with native threads

Workflow: paste a screenshot of a Discord invite or QR code at
POST /discord/join → Timmy extracts the invite automatically.

Every Discord conversation gets its own thread, keeping channels clean.
Bot responds to @mentions and DMs, routes through Timmy agent.

43 new tests (base classes, registry, invite parser, vendor, routes).

https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
2026-02-25 01:11:14 +00:00
Claude
2c419a777d fix: skip Docker tests gracefully when daemon is unavailable
The docker_stack fixture now checks `docker info` before attempting
`compose up`. If the daemon isn't reachable, tests skip instead of
erroring with pytest.fail.

https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
2026-02-25 00:49:06 +00:00
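
The daemon check described in 2c419a777d can be sketched as a small helper that runs `docker info` and reports whether it succeeded. The function name `docker_available` and the timeout value are assumptions made for illustration; the actual fixture code is not shown in the commit message.

```python
import shutil
import subprocess


def docker_available(timeout: float = 10.0) -> bool:
    """Return True only if the docker CLI exists and the daemon answers."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

A fixture could then call `pytest.skip("Docker daemon not reachable")` when the helper returns False, so the tests are reported as skipped rather than failed.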
Claude
c91e02e7c5 test: add functional test suite with real fixtures, no mocking
Three-tier functional test infrastructure:
- CLI tests via Typer CliRunner (timmy, timmy-serve, self-tdd)
- Dashboard integration tests with real TestClient, real SQLite, real
  coordinator (no patch/mock — Ollama offline = graceful degradation)
- Docker compose container-level tests (gated by FUNCTIONAL_DOCKER=1)
- End-to-end L402 payment flow with real mock-lightning backend

42 new tests (8 Docker tests skipped without FUNCTIONAL_DOCKER=1).
All 849 tests pass.

https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
2026-02-25 00:46:22 +00:00
Claude
3e51434b4b test: add 157 functional tests covering 8 low-coverage modules
Analyze test coverage (75.3% → 85.4%) and add functional test suites
for the major gaps identified:

- test_agent_core.py: Full coverage for agent_core/interface.py (0→100%)
  and agent_core/ollama_adapter.py (0→100%) — data classes, factories,
  abstract enforcement, perceive/reason/act/recall workflow, effect logging

- test_docker_runner.py: Full coverage for swarm/docker_runner.py (0→100%)
  — container spawn/stop/list lifecycle with mocked subprocess

- test_timmy_tools.py: Tool usage tracking, persona toolkit mapping,
  catalog generation, graceful degradation without Agno

- test_routes_tools.py: /tools page, API stats endpoint, and WebSocket
  /swarm/live connect/disconnect/send lifecycle (41→82%)

- test_voice_tts_functional.py: VoiceTTS init, speak, volume clamping,
  voice listing, graceful degradation (41→94%)

- test_watchdog_functional.py: _run_tests, watch loop state transitions,
  regression detection, KeyboardInterrupt (47→97%)

- test_lnd_backend.py: LND init from params/env, grpc stub enforcement,
  method-level BackendNotAvailableError, settle returns False (25→61%)

- test_swarm_routes_functional.py: Agent spawn/stop, task CRUD, auction,
  insights, UI partials, error paths (63→92%)

https://claude.ai/code/session_01WU4h3cQQiouMwmgYmAgkMM
2026-02-24 23:36:50 +00:00
Alexander Whitestone
72d9e316f4 Merge pull request #26 from AlexanderWhitestone/claude/add-claude-documentation-CUvev 2026-02-24 18:19:25 -05:00
Claude
48255ead3d docs: add CLAUDE.md with codebase guide for AI assistants
Comprehensive reference covering project structure, architecture patterns,
testing conventions, development workflows, and key configuration for AI
assistants working in this repository.

https://claude.ai/code/session_01Y77ZMumHHk5t9wT8ASrpwZ
2026-02-24 23:17:49 +00:00
Alexander Whitestone
28bb92e35b Merge pull request #25 from AlexanderWhitestone/claude/fix-iphone-ui-NWwmk 2026-02-24 17:54:36 -05:00
Claude
65a278dbee fix: comprehensive iPhone UI overhaul — glassmorphism, responsive layouts, theme unification
- base.html: add missing {% block extra_styles %}, mobile hamburger menu with
  slide-out nav, interactive-widget viewport meta, -webkit-text-size-adjust
- style.css: define 15+ missing CSS variables (--bg-secondary, --text-muted,
  --accent, --success, --danger, etc.), add missing utility classes (.grid,
  .stat, .agent-card, .agent-avatar, .form-group), glassmorphism card effects,
  iPhone breakpoints (768px, 390px), 44pt min touch targets, smooth animations
- mobile.html: rewrite with proper theme variables, glass cards, touch-friendly
  quick actions grid, chat with proper message bubbles
- swarm_live.html: replace undefined CSS vars, use mc-panel theme cards
- marketplace.html: responsive agent cards that stack on iPhone, themed pricing
- voice_button.html & voice_enhanced.html: proper theme integration, touch-sized
  buttons, themed result containers
- create_task.html: mobile-friendly forms with 16px font (prevents iOS zoom)
- tools.html & creative.html: themed headers, responsive column stacking
- spark.html: replace all hardcoded blue (#00d4ff) colors with theme purple/orange
- briefing.html: replace hardcoded bootstrap colors with theme variables

Fixes: header nav overflow on iPhone (7 links in single row), missing
extra_styles block silently dropping child template styles, undefined CSS
variables breaking mobile/swarm/marketplace/voice pages, sub-44pt touch
targets, missing -webkit-text-size-adjust, inconsistent color themes.

97 UI tests pass (91 UI-specific + 6 creative route).

https://claude.ai/code/session_01JiyhGyee2zoMN4p8xWYqEe
2026-02-24 22:25:04 +00:00
Alexander Whitestone
d96b7593fc Merge pull request #24 from AlexanderWhitestone/claude/cloud-ready-deployment-KxS0u 2026-02-24 16:28:20 -05:00
Claude
b7cfb3b097 feat: one-click cloud deployment — Caddy HTTPS, Ollama, systemd, cloud-init
Add complete production deployment stack so Timmy can be deployed to any
cloud provider (DigitalOcean, AWS, Hetzner, etc.) with a single command.

New files:
- docker-compose.prod.yml: production stack (Caddy auto-HTTPS, Ollama LLM,
  Dashboard, Timmy agent, Watchtower auto-updates)
- deploy/Caddyfile: reverse proxy with security headers and WebSocket support
- deploy/setup.sh: interactive one-click setup script for any Ubuntu/Debian server
- deploy/cloud-init.yaml: paste as User Data when creating a cloud VM
- deploy/timmy.service: systemd unit for auto-start on boot
- deploy/digitalocean/create-droplet.sh: create a DO droplet via doctl CLI

Updated:
- Dockerfile: non-root user, healthcheck, missing deps (GitPython, moviepy, redis)
- Makefile: cloud-deploy, cloud-up/down/logs/status/update/scale targets
- .env.example: DOMAIN setting for HTTPS
- .dockerignore: exclude deploy configs from image

https://claude.ai/code/session_018CduUZoEJzFynBwMsxaP8T
2026-02-24 21:22:56 +00:00
Alexander Whitestone
7018a756b3 Merge pull request #22 from AlexanderWhitestone/claude/audit-timmy-dashboard-ft27r 2026-02-24 14:18:29 -05:00
Claude
96c9f1b02f fix: address audit low-hanging fruit — docs accuracy, auction timing, stubs, tests
- Docs: "No Cloud" → "No Cloud AI" (frontend uses CDN for Bootstrap/HTMX/fonts)
- Docs: "600+" → "640+" tests, "20+" → "58" endpoints (actual counts)
- Docs: LND described as "scaffolded" not "gRPC-ready"; remove "agents earn sats"
- Fix auction timing: coordinator sleep(0) → sleep(AUCTION_DURATION_SECONDS)
- agent_core: implement remember() with dedup/eviction, communicate() via swarm comms
- Tests: add CLI tests for chat, think, and backend/model-size forwarding (647 passing)

https://claude.ai/code/session_01SZTwAkTg6v4ybv8g9NLxqN
2026-02-24 18:29:21 +00:00
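
The auction timing fix in 96c9f1b02f can be illustrated with a small asyncio sketch: if the coordinator sleeps for 0 seconds, the auction closes before any bidder has had time to respond. The bidder tuples, the shortened duration, and the "lowest bid wins" rule are assumptions for this sketch and are not confirmed by the commit.

```python
import asyncio

# Shortened for the sketch; the audit in 0367fe3649 mentions a 15s window.
AUCTION_DURATION_SECONDS = 0.2


async def run_auction(bidders, duration):
    """Collect bids for `duration` seconds, then pick the lowest bid.

    bidders is a list of (name, delay_seconds, amount) tuples (hypothetical).
    Returns the winning name, or None if no bid arrived in time.
    """
    bids = {}

    async def bidder(name, delay, amount):
        await asyncio.sleep(delay)  # simulated thinking time
        bids[name] = amount

    tasks = [asyncio.create_task(bidder(*b)) for b in bidders]
    await asyncio.sleep(duration)   # the bidding window
    snapshot = dict(bids)           # auction closes here
    for task in tasks:
        task.cancel()               # late bidders are dropped
    await asyncio.gather(*tasks, return_exceptions=True)
    if not snapshot:
        return None
    return min(snapshot, key=snapshot.get)
```

Called with a duration of 0, the function returns None for any bidder that needs even a few milliseconds, which mirrors the sleep(0) bug.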
Alexander Whitestone
03ff505c4b Merge pull request #23 from AlexanderWhitestone/security/macaroon-forgery-and-xss-1771955896 2026-02-24 13:00:52 -05:00
AlexanderWhitestone
4daf382819 security: fix L402 macaroon forgery and XSS in templates 2026-02-24 12:58:19 -05:00
Claude
0367fe3649 audit: add detailed findings from parallel subsystem audits
Incorporates findings from deep-dive audits of all 5 subsystems:
- Swarm auction timing bug (sleep(0) instead of 15s)
- Docker agent HTTP API partially wired
- L402 macaroons are HMAC-only (no caveats/delegation)
- Agent sats are bid-only, no settlement occurs
- CLI test coverage gap (2 tests for 3 commands)
- agent_core persist_memory/communicate are stubs

https://claude.ai/code/session_01SZTwAkTg6v4ybv8g9NLxqN
2026-02-24 17:36:10 +00:00
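
To make the "HMAC-only" finding in 0367fe3649 concrete, the sketch below shows what a token authenticated by a single HMAC looks like: it proves the identifier was issued by the holder of the secret, but carries no caveats and cannot be delegated or attenuated. The token layout and the function names are hypothetical and are not the project's actual macaroon format.

```python
import hashlib
import hmac


def mint_token(secret: bytes, identifier: str) -> str:
    """Return 'identifier:signature' where signature is HMAC-SHA256."""
    sig = hmac.new(secret, identifier.encode(), hashlib.sha256).hexdigest()
    return f"{identifier}:{sig}"


def verify_token(secret: bytes, token: str) -> bool:
    """Check the signature in constant time; reject malformed tokens."""
    identifier, sep, sig = token.rpartition(":")
    if not sep:
        return False
    expected = hmac.new(secret, identifier.encode(), hashlib.sha256).hexdigest()
    # Compare as bytes so non-ASCII input cannot raise a TypeError.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Using `hmac.compare_digest` rather than `==` avoids leaking how many leading characters of a forged signature were correct.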
Claude
dd28595dbd audit: comprehensive feature verification against documentation claims
Audits all 15+ subsystems against claims in docs/index.html and README.md.
643 tests pass (not "600+"), 58 endpoints exist (not "20+"). Identifies
three false claims: "0 Cloud Calls" (CDN deps in templates), "LND gRPC-ready"
(every method raises NotImplementedError), and "agents earn sats autonomously"
(unimplemented v3 feature presented as current).

https://claude.ai/code/session_01SZTwAkTg6v4ybv8g9NLxqN
2026-02-24 17:34:04 +00:00
Alexander Whitestone
1e9e0748a9 Merge pull request #21 from AlexanderWhitestone/claude/cleanup-github-references-LdfDa 2026-02-24 12:20:16 -05:00
Claude
832478f0d0 fix: serve_chat endpoint bug, stale docs, and license mismatch
- Fix /serve/chat AttributeError: split Request and ChatRequest params
  so auth headers are read from HTTP request, not Pydantic body
- Add regression tests for the serve_chat endpoint bug
- Add agent_core and lightning to pyproject.toml wheel includes
- Replace Apache 2.0 LICENSE with MIT to match pyproject.toml
- Update test count from "228" to "600+" across README, docs, AGENTS.md
- Add 5 missing subsystems to README table (Spark, Creative, Tools,
  Telegram, agent_core/lightning)
- Update AGENTS.md project structure with 6 missing modules
- Mark completed v2 roadmap items (personas, MCP tools) in AGENTS.md

https://claude.ai/code/session_01GMiccXbo77GkV3TA69x6KS
2026-02-24 17:18:29 +00:00
Claude
d7cd686341 chore: replace all Alexspayne/Payne references with AlexanderWhitestone
Update GitHub URLs, clone commands, CI badge links, GitHub Pages URL,
agent team name, and hardcoded macOS paths in handoff scripts to reflect
the new GitHub username. Handoff scripts now use relative paths instead
of hardcoded /Users/apayne paths.

https://claude.ai/code/session_01GMiccXbo77GkV3TA69x6KS
2026-02-24 17:05:16 +00:00
Alexander Whitestone
1c8edfc52b Merge pull request #20 from AlexanderWhitestone/claude/integrate-spark-timmy-e5D1i 2026-02-24 11:50:26 -05:00
Claude
b098b00959 test: add integration tests with real media for music video pipeline
Build real PNG, WAV, and MP4 fixtures (no AI models) and exercise the
full assembler and Creative Director pipeline end-to-end.  Fix MoviePy v2
crossfade API (vfx.CrossFadeIn) and font resolution (DejaVu-Sans).

14 new integration tests — 638 total, all passing.

https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
2026-02-24 16:48:14 +00:00
Claude
1103da339c feat: add full creative studio + DevOps tools (Pixel, Lyra, Reel personas)
Adds 3 new personas (Pixel, Lyra, Reel) and 5 new tool modules:

- Git/DevOps tools (GitPython): clone, status, diff, log, blame, branch,
  add, commit, push, pull, stash — wired to Forge and Helm personas
- Image generation (FLUX via diffusers): text-to-image, storyboards,
  variations — Pixel persona
- Music generation (ACE-Step 1.5): full songs with vocals+instrumentals,
  instrumental tracks, vocal-only tracks — Lyra persona
- Video generation (Wan 2.1 via diffusers): text-to-video, image-to-video
  clips — Reel persona
- Creative Director pipeline: multi-step orchestration that chains
  storyboard → music → video → assembly into 3+ minute final videos
- Video assembler (MoviePy + FFmpeg): stitch clips, overlay audio,
  title cards, subtitles, final export

Also includes:
- Spark Intelligence tool-level + creative pipeline event capture
- Creative Studio dashboard page (/creative/ui) with 4 tabs
- Config settings for all new models and output directories
- pyproject.toml creative optional extra for GPU dependencies
- 107 new tests covering all modules (624 total, all passing)

https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
2026-02-24 16:31:47 +00:00
Claude
1ab26d30ad feat: integrate Spark Intelligence into Timmy swarm system
Adds a self-evolving cognitive layer inspired by vibeship-spark-intelligence,
adapted for Timmy's agent architecture. Spark captures swarm events, runs
EIDOS prediction-evaluation loops, consolidates memories, and generates
advisory recommendations — all backed by SQLite consistent with existing
patterns.

New modules:
- spark/memory.py — event capture with importance scoring + memory consolidation
- spark/eidos.py — EIDOS cognitive loop (predict → observe → evaluate → learn)
- spark/advisor.py — ranked advisory generation from accumulated intelligence
- spark/engine.py — top-level API wiring all subsystems together

Dashboard:
- /spark/ui — full Spark Intelligence dashboard (3-column: status/advisories,
  predictions/memories, event timeline) with HTMX auto-refresh
- /spark — JSON API for programmatic access
- SPARK link added to navigation header

Integration:
- Coordinator hooks emit Spark events on task post, bid, assign, complete, fail
- EIDOS predictions generated when tasks are posted, evaluated on completion
- Memory consolidation triggers when agents accumulate enough outcomes
- SPARK_ENABLED config toggle (default: true)

Tests: 47 new tests covering all Spark subsystems + dashboard routes.
Full suite: 538 tests passing.

https://claude.ai/code/session_01KJm6jQkNi3aA3yoQJn636c
2026-02-24 15:51:15 +00:00
Alexander Whitestone
4554891674 Merge pull request #19 from AlexanderWhitestone/claude/nostalgic-cori
feat: pytest-cov setup and test suite audit
2026-02-22 20:44:42 -05:00
Alexander Payne
ca60483268 feat: pytest-cov configuration and test audit cleanup
Add full pytest-cov configuration with fail_under=60% threshold,
HTML/XML report targets, and proper exclude_lines. Fix websocket
history test to use public broadcast() API instead of manually
manipulating internals. Audit confirmed 491 tests at 71.2% coverage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 20:42:58 -05:00
Alexander Payne
14072f9bb5 feat: MCP tools integration for swarm agents
ToolExecutor:
- Persona-specific toolkit selection (forge gets code tools, echo gets search)
- Tool inference from task keywords (search→web_search, code→python)
- LLM-powered reasoning about tool selection
- Graceful degradation when Agno unavailable

PersonaNode Updates:
- Subscribe to swarm:events for task assignments
- Execute tasks using ToolExecutor when assigned
- Complete tasks via comms.complete_task()
- Track current_task for status monitoring

Tests:
- 19 new tests for tool execution
- All 6 personas covered
- Tool inference verification
- Edge cases (no toolkit, unknown tasks)

Total: 491 tests passing
2026-02-22 20:33:26 -05:00
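
The keyword-based tool inference mentioned in 14072f9bb5 (search→web_search, code→python) can be sketched as a lookup over the words of the task description. Only those two mappings come from the commit message; the function name `infer_tools` and the whole-word matching rule are assumptions.

```python
import re
from typing import List

# The two mappings named in the commit message.
KEYWORD_TO_TOOL = {
    "search": "web_search",
    "code": "python",
}


def infer_tools(task_description: str) -> List[str]:
    """Return the tools whose keyword appears as a whole word in the task."""
    words = set(re.findall(r"[a-z]+", task_description.lower()))
    # Iterate over the mapping so the result order is stable.
    return [tool for keyword, tool in KEYWORD_TO_TOOL.items() if keyword in words]
```

Matching whole words rather than substrings keeps a task such as "decode this file" from being routed to the Python tool.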
Alexander Payne
c5df954d44 feat: Lightning interface, swarm routing, sovereignty audit, embodiment prep
Lightning Backend Interface:
- Abstract LightningBackend with pluggable implementations
- MockBackend for development (auto-settle invoices)
- LndBackend stub with gRPC integration path documented
- Backend factory for runtime selection via LIGHTNING_BACKEND env

Intelligent Swarm Routing:
- CapabilityManifest for agent skill declarations
- Task scoring based on keywords + capabilities + bid price
- RoutingDecision audit logging to SQLite
- Agent stats tracking (wins, consideration rate)

Sovereignty Audit:
- Comprehensive audit report (docs/SOVEREIGNTY_AUDIT.md)
- 9.2/10 sovereignty score
- Documented all external dependencies and local alternatives

Substrate-Agnostic Agent Interface:
- TimAgent abstract base class
- Perception/Action/Memory/Communication types
- OllamaAdapter implementation
- Foundation for future embodiment (robot, VR)

Tests:
- 36 new tests for Lightning and routing
- 472 total tests passing
- Maintained zero-warning policy
2026-02-22 20:20:11 -05:00
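
The routing score described in c5df954d44 combines keywords, capabilities, and bid price. The sketch below shows one way such a score could be computed. The class name CapabilityManifest comes from the commit message, but its fields, the weights (0.4 / 0.4 / 0.2), and the price normalisation are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CapabilityManifest:
    agent_id: str
    keywords: FrozenSet[str]       # hypothetical field
    capabilities: FrozenSet[str]   # hypothetical field


def score_agent(
    manifest: CapabilityManifest,
    task_words: FrozenSet[str],
    required_caps: FrozenSet[str],
    bid_sats: int,
    max_bid_sats: int = 100,
) -> float:
    """Return a score in [0, 1]; higher means a better fit for the task."""
    keyword_score = (
        len(manifest.keywords & task_words) / len(task_words) if task_words else 0.0
    )
    cap_score = (
        len(manifest.capabilities & required_caps) / len(required_caps)
        if required_caps
        else 1.0
    )
    # Cheaper bids score higher; bids above the cap score zero on price.
    price_score = 1.0 - min(bid_sats, max_bid_sats) / max_bid_sats
    return 0.4 * keyword_score + 0.4 * cap_score + 0.2 * price_score
```

Keeping each component in the range 0 to 1 before weighting makes the final score easy to log and compare across agents.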
Alexander Payne
82ce8a31cf chore: add resume.sh one-liner for handoff 2026-02-22 19:37:15 -05:00
Alexander Whitestone
feaac7ce38 Merge pull request #18 from AlexanderWhitestone/kimi/sprint-v2-swarm-tools-serve
Sprint v2: Swarm E2E, MCP Tools, timmy-serve L402, Tests, Notifications
2026-02-22 19:34:50 -05:00
Alexander Payne
90fa7f55cf docs: update checkpoint with handoff system info 2026-02-22 19:33:44 -05:00
Alexander Payne
bd0030f536 chore: add handoff system for long-running sessions
Add bootstrap.sh and checkpoint files for 2-hour handoff cycles:
- CONTINUE.md - Quick start guide
- CHECKPOINT.md - Current state (updated by Kimi)
- TODO.md - Remaining tasks
- bootstrap.sh - One-command status check
2026-02-22 19:16:09 -05:00
Alexander Payne
f0aa43533f feat: swarm E2E, MCP tools, timmy-serve L402, tests, notifications
Major Features:
- Auto-spawn persona agents (Echo, Forge, Seer) on app startup
- WebSocket broadcasts for real-time swarm UI updates
- MCP tool integration: web search, file I/O, shell, Python execution
- New /tools dashboard page showing agent capabilities
- Real timmy-serve start with L402 payment gating middleware
- Browser push notifications for briefings and task events

Tests:
- test_docker_agent.py: 9 tests for Docker agent runner
- test_swarm_integration_full.py: 18 E2E lifecycle tests
- Fixed all pytest warnings (436 tests, 0 warnings)

Improvements:
- Fixed coroutine warnings in coordinator broadcasts
- Fixed ResourceWarning for unclosed process pipes
- Added pytest-asyncio config to pyproject.toml
- Test isolation with proper event loop cleanup
2026-02-22 19:01:04 -05:00
Alexander Whitestone
c5f86b8960 Merge pull request #17 from AlexanderWhitestone/claude/evaluate-integration-usefulness-FyYSl
Claude/evaluate integration usefulness fy y sl
2026-02-22 17:10:21 -05:00
Claude
167fd0a7b4 Add outcome-based learning system for swarm agents
Introduce a feedback loop where task outcomes (win/loss, success/failure)
feed back into agent bidding strategy. Borrows the "learn from outcomes"
concept from Spark Intelligence but builds it natively on Timmy's existing
SQLite + swarm architecture.

New module: src/swarm/learner.py
- Records every bid outcome with task description context
- Computes per-agent metrics: win rate, success rate, keyword performance
- suggest_bid() adjusts bids based on historical performance
- learned_keywords() discovers what task types agents actually excel at

Changes:
- persona_node: _compute_bid() now consults learner for adaptive adjustments
- coordinator: complete_task/fail_task feed results into learner
- coordinator: run_auction_and_assign records all bid outcomes
- routes/swarm: add /swarm/insights and /swarm/insights/{agent_id} endpoints
- routes/swarm: add POST /swarm/tasks/{task_id}/fail endpoint

All 413 tests pass (23 new + 390 existing).

https://claude.ai/code/session_01E5jhTCwSUnJk9p9zrTMVUJ
2026-02-22 22:04:37 +00:00
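
The adaptive bidding described in 167fd0a7b4 can be sketched as a function that nudges a base bid up or down from recorded outcomes. The thresholds, the 20% step, the minimum sample size, and the assumption that lower bids win auctions are illustrative only; the real suggest_bid() in src/swarm/learner.py may use a different signature and different rules.

```python
def suggest_bid(
    base_bid: int,
    wins: int,
    bids: int,
    successes: int,
    min_samples: int = 5,
) -> int:
    """Adjust base_bid using an agent's win rate and success rate.

    wins: auctions won; bids: auctions entered; successes: won tasks completed.
    """
    if bids < min_samples:
        return base_bid  # not enough history to learn from

    win_rate = wins / bids
    success_rate = successes / wins if wins else 0.0

    adjust_pct = 0
    if win_rate < 0.2:
        adjust_pct = -20  # rarely winning: bid lower to be more competitive
    elif win_rate > 0.6 and success_rate > 0.8:
        adjust_pct = 20   # winning and delivering: can afford to ask for more

    # Integer arithmetic avoids floating-point rounding surprises.
    return max(1, base_bid * (100 + adjust_pct) // 100)
```

Returning the base bid until a minimum number of samples exists keeps a new agent from swinging its price on one or two outcomes.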