Commit Graph

200 Commits

Author SHA1 Message Date
Alexander Whitestone
5e60a6453b feat: wire mobile app to real Timmy backend via JSON REST API (#73)
Add /api/chat, /api/upload, and /api/chat/history endpoints to the
FastAPI dashboard so the Expo mobile app talks directly to Timmy's
brain (Ollama) instead of a non-existent Node.js server.

Backend:
- New src/dashboard/routes/chat_api.py with 4 endpoints
- Mount /uploads/ for serving chat attachments
- Same context injection and session management as HTMX chat

Mobile app fixes:
- Point API base URL at port 8000 (FastAPI) instead of 3000
- Create lib/_core/theme.ts (was referenced but never created)
- Fix shared/types.ts (remove broken drizzle/errors re-exports)
- Remove broken server/chat.ts and 1,235-line template README
- Clean package.json (remove express, mysql2, drizzle, tRPC deps)
- Remove debug console.log from theme-provider

Tests: 13 new tests covering all API endpoints (all passing).

https://claude.ai/code/session_01XqErDoh2rVsPY8oTj21Lz2

Co-authored-by: Claude <noreply@anthropic.com>
2026-02-26 23:58:53 -05:00
Alexander Whitestone
18ed6232f9 feat: Timmy fixes and improvements (#72)
* test: remove hardcoded sleeps, add pytest-timeout

- Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
- Add pytest-timeout dependency and --timeout=30 to prevent hangs
- Fixes test flakiness and improves test suite speed

* feat: add Aider AI tool to Forge's toolkit

- Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist
- Register tool in Forge's code toolkit
- Add functional tests for the Aider tool

* config: add opencode.json with local Ollama provider for sovereign AI

* feat: Timmy fixes and improvements

## Bug Fixes
- Fix read_file path resolution: add ~ expansion, proper relative path handling
- Add repo_root to config.py with auto-detection from .git location
- Fix hardcoded llama3.2 - now dynamic from settings.ollama_model

## Timmy's Requests
- Add communication protocol to AGENTS.md (read context first, explain changes)
- Create DECISIONS.md for architectural decision documentation
- Add reasoning guidance to system prompts (step-by-step, state uncertainty)
- Update tests to reflect correct model name (llama3.1:8b-instruct)

## Testing
- All 177 dashboard tests pass
- All 32 prompt/tool tests pass

---------

Co-authored-by: Alexander Payne <apayne@MM.local>
2026-02-26 23:39:13 -05:00
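Note on the read_file fix in 18ed6232f9: the commit describes adding ~ expansion and proper relative path handling. A minimal sketch of that idea follows, assuming relative paths are resolved against the repo root. The helper name `resolve_path` and its `repo_root` parameter are hypothetical and not taken from the repository.

```python
from pathlib import Path


def resolve_path(raw: str, repo_root: Path) -> Path:
    """Hypothetical helper: expand ~ and anchor relative paths at repo_root."""
    path = Path(raw).expanduser()  # "~/notes.txt" -> "/home/<user>/notes.txt"
    if not path.is_absolute():
        # Assumption: relative paths are meant relative to the repository root,
        # not to whatever directory the process happens to be running in.
        path = repo_root / path
    return path.resolve()
```

Resolving against an explicit root keeps the result independent of the current working directory, which is the failure mode the commit message appears to describe.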
Alexander Whitestone
4ba272eb4f config: add opencode.json with local Ollama provider (#71)
* test: remove hardcoded sleeps, add pytest-timeout

- Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
- Add pytest-timeout dependency and --timeout=30 to prevent hangs
- Fixes test flakiness and improves test suite speed

* feat: add Aider AI tool to Forge's toolkit

- Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist
- Register tool in Forge's code toolkit
- Add functional tests for the Aider tool

* config: add opencode.json with local Ollama provider for sovereign AI

---------

Co-authored-by: Alexander Payne <apayne@MM.local>
2026-02-26 23:21:43 -05:00
Alexander Whitestone
a5765c33b6 feat: add Aider AI tool to Forge's toolkit (#70)
* test: remove hardcoded sleeps, add pytest-timeout

- Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
- Add pytest-timeout dependency and --timeout=30 to prevent hangs
- Fixes test flakiness and improves test suite speed

* feat: add Aider AI tool to Forge's toolkit

- Add Aider tool that calls local Ollama (qwen2.5:14b) for AI coding assist
- Register tool in Forge's code toolkit
- Add functional tests for the Aider tool

---------

Co-authored-by: Alexander Payne <apayne@MM.local>
2026-02-26 23:17:19 -05:00
Alexander Whitestone
51140fb7f0 test: remove hardcoded sleeps, add pytest-timeout (#69)
- Replace fixed time.sleep() calls with intelligent polling or WebDriverWait
- Add pytest-timeout dependency and --timeout=30 to prevent hangs
- Fixes test flakiness and improves test suite speed

Co-authored-by: Alexander Payne <apayne@MM.local>
2026-02-26 22:52:36 -05:00
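Note on 51140fb7f0: replacing a fixed time.sleep() with polling usually means waiting on a condition with a deadline. The sketch below shows one way to do that with the standard library only. The name `wait_until` and its default values are assumptions for illustration; the commit does not show the actual helper.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll condition() until it is true or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True  # return as soon as the condition holds
        time.sleep(interval)  # short pause between checks, not a fixed wait
    return condition()  # one final check at the deadline
```

A test then asserts on `wait_until(...)` instead of sleeping for a worst-case duration, so it finishes as soon as the condition holds and still fails within a bounded time.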
Alexander Whitestone
bf0e388d2a Merge pull request #57 from AlexanderWhitestone/feature/model-upgrade-llama3.1
feat: Multi-modal LLM support with automatic model fallback
2026-02-26 22:35:19 -05:00
Alexander Payne
72a58f1f49 feat: Multi-modal support with automatic model fallback
- Add MultiModalManager with capability detection for vision/audio/tools
- Define fallback chains: vision (llama3.2:3b -> llava:7b -> moondream)
                       tools (llama3.1:8b-instruct -> qwen2.5:7b)
- Update CascadeRouter to detect content type and select appropriate models
- Add model pulling with automatic fallback in agent creation
- Update providers.yaml with multi-modal model configurations
- Update OllamaAdapter to use model resolution with vision support

Tests: All 96 infrastructure tests pass
2026-02-26 22:29:44 -05:00
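Note on 72a58f1f49: the fallback chains listed in the commit can be read as "take the first model in the chain that is actually available". A rough sketch under that assumption is below. The chain contents come from the commit message; the dict layout and the function name `pick_model` are hypothetical, not the MultiModalManager API.

```python
from typing import Iterable, Optional

# Chains as listed in the commit message; the data structure is an assumption.
FALLBACK_CHAINS = {
    "vision": ["llama3.2:3b", "llava:7b", "moondream"],
    "tools": ["llama3.1:8b-instruct", "qwen2.5:7b"],
}


def pick_model(capability: str, available: Iterable[str]) -> Optional[str]:
    """Return the first model in the capability's chain that is available."""
    installed = set(available)
    for model in FALLBACK_CHAINS.get(capability, []):
        if model in installed:
            return model
    return None  # unknown capability, or nothing in the chain is available
```

For example, `pick_model("vision", ["llava:7b", "moondream"])` skips the missing first entry and returns the second.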
Alexander Payne
a85661274c Merge main into feature/model-upgrade-llama3.1 with conflict resolution 2026-02-26 22:19:44 -05:00
Alexander Whitestone
024e6a4318 Merge pull request #68 from AlexanderWhitestone/feature/timmy-chat-mobile-app
feat: Timmy Chat Mobile App (Expo/React Native)
2026-02-26 21:59:00 -05:00
Manus AI
b4b508ff5a feat: add Timmy Chat mobile app (Expo/React Native)
- Single-screen chat interface with Timmy's sovereign AI personality
- Text messaging with real-time AI responses via server chat API
- Voice recording and playback with waveform visualization
- Image sharing (camera + photo library) with full-screen viewer
- File attachments via document picker
- Dark arcane theme matching the Timmy Time dashboard
- Custom app icon with glowing T circuit design
- Timmy system prompt ported from dashboard prompts.py
- Unit tests for chat utilities and message types
2026-02-26 21:55:41 -05:00
Alexander Whitestone
031a106e65 Merge pull request #67 from AlexanderWhitestone/claude/fix-build-zLt4o 2026-02-26 21:08:05 -05:00
Claude
eb501c43da fix: resolve 8 test failures from missing requests stub and wrong python path
- Add `requests` to conftest.py module stubs so patch("requests.post") works
  in reward scoring tests without the package installed
- Use sys.executable instead of bare "python" in git safety tests so the
  subprocess finds pytest from the venv rather than system python

https://claude.ai/code/session_012Ye9nyFEiw2QQfx4bZeDmn
2026-02-27 02:06:45 +00:00
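Note on eb501c43da: using sys.executable makes a child process run under the same interpreter (and therefore the same venv) as the test process. A small sketch of the pattern follows; the wrapper name `run_in_current_interpreter` is hypothetical.

```python
import subprocess
import sys


def run_in_current_interpreter(code: str) -> str:
    """Run `code` in a child process using the caller's own interpreter."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # not a bare "python" looked up on PATH
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout.strip()
```

The same list form works for `[sys.executable, "-m", "pytest", ...]`, which is presumably the shape the git safety tests need.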
Alexander Whitestone
9c444959df Merge pull request #66 from AlexanderWhitestone/fix/missing-requests-dependency 2026-02-26 21:05:12 -05:00
AlexanderWhitestone
7198b6b173 fix: add missing requests dependency to pyproject.toml 2026-02-26 21:01:33 -05:00
Alexander Whitestone
c006609094 Merge pull request #65 from AlexanderWhitestone/claude/finish-and-submit-pr-fmxyC 2026-02-26 20:55:04 -05:00
Claude
21846f3897 fix: disable gpg signing in test git fixtures and skip root-only permission test
Test fixtures that create temporary git repos now set commit.gpgsign=false
to avoid failures in environments with global commit signing configured.
The permission error test is skipped when running as root since file
permissions don't apply to the root user.

https://claude.ai/code/session_018u1fAx2GihSGctYS64tD4H
2026-02-27 01:52:47 +00:00
Claude
211c54bc8c feat: add custom weights, model registry, per-agent models, and reward scoring
Inspired by OpenClaw-RL's multi-model orchestration, this adds four
features for custom model management:

1. Custom model registry (infrastructure/models/registry.py) — SQLite-backed
   registry for GGUF, safetensors, HF checkpoint, and Ollama models with
   role-based lookups (general, reward, teacher, judge).

2. Per-agent model assignment — each swarm persona can use a different model
   instead of sharing the global default. Resolved via registry assignment >
   persona default > global default.

3. Runtime model management API (/api/v1/models) — REST endpoints to register,
   list, assign, enable/disable, and remove custom models without restart.
   Includes a dashboard page at /models.

4. Reward model scoring (PRM-style) — majority-vote quality evaluation of
   agent outputs using a configurable reward model. Scores persist in SQLite
   and feed into the swarm learner.

New config settings: custom_weights_dir, reward_model_enabled,
reward_model_name, reward_model_votes.

54 new tests covering registry CRUD, API endpoints, agent assignments,
role lookups, and reward scoring.

https://claude.ai/code/session_01V4iTozMwcE2gjfnCJdCugC
2026-02-27 01:27:53 +00:00
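Note on 211c54bc8c: the per-agent resolution order (registry assignment > persona default > global default) amounts to a three-step lookup. The sketch below assumes plain mappings keyed by agent id; the function name `resolve_model` and its parameters are hypothetical, and the real registry is SQLite-backed according to the commit.

```python
from typing import Mapping


def resolve_model(
    agent_id: str,
    registry_assignments: Mapping[str, str],
    persona_defaults: Mapping[str, str],
    global_default: str,
) -> str:
    """Pick a model name using the precedence stated in the commit message."""
    return (
        registry_assignments.get(agent_id)  # 1. explicit registry assignment
        or persona_defaults.get(agent_id)   # 2. persona's own default
        or global_default                   # 3. global default
    )
```

An agent with no assignment and no persona default falls through to the global default, so every agent always resolves to some model.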
Alexander Whitestone
e4d5ec5ed4 Merge pull request #62 from AlexanderWhitestone/claude/grok-backend-monetization-iVc5i 2026-02-26 20:26:15 -05:00
Claude
17059bc0ea feat: add Grok (xAI) as opt-in premium backend with monetization
- Add GrokBackend class in src/timmy/backends.py with full sync/async
  support, health checks, usage stats, and cost estimation in sats
- Add consult_grok tool to Timmy's toolkit for proactive Grok queries
- Extend cascade router with Grok provider type for failover chain
- Add Grok Mode toggle card to Mission Control dashboard (HTMX live)
- Add "Ask Grok" button on chat input for direct Grok queries
- Add /grok/* routes: status, toggle, chat, stats endpoints
- Integrate Lightning invoice generation for Grok usage monetization
- Add GROK_ENABLED, XAI_API_KEY, GROK_DEFAULT_MODEL, GROK_MAX_SATS_PER_QUERY,
  GROK_FREE config settings via pydantic-settings
- Update .env.example and docker-compose.yml with Grok env vars
- Add 21 tests covering backend, tools, and route endpoints (all green)

Local-first ethos preserved: Grok is premium augmentation only,
disabled by default, and Lightning-payable when enabled.

https://claude.ai/code/session_01FygwN8wS8J6WGZ8FPb7XGV
2026-02-27 01:12:51 +00:00
Alexander Whitestone
bb31f322e5 Merge pull request #61 from AlexanderWhitestone/claude/add-github-chat-interface-iZ0yN 2026-02-26 19:41:00 -05:00
Claude
bc2c09d3f8 feat: replace GitHub page with embedded Timmy chat interface
Replaces the marketing landing page with a minimal, full-screen chat
interface that connects to a running Timmy instance. Mobile-first design
with single vertical scroll direction, looping scroll, no zoom, no
buttons — just type and press Enter to talk to Timmy.

- docs/index.html: full rewrite as a clean chat UI with dark terminal
  theme, looping infinite scroll, markdown rendering, connection status,
  and /connect, /clear, /help slash commands
- src/dashboard/app.py: add CORS middleware so the GitHub Pages site can
  reach a local Timmy server cross-origin
- src/config.py: add cors_origins setting (defaults to ["*"])

https://claude.ai/code/session_01AWLxg6KDWsfCATiuvsRMGr
2026-02-27 00:35:33 +00:00
Alexander Whitestone
e0e2a2b9d8 Merge pull request #60 from AlexanderWhitestone/claude/local-models-iphone-EwXtC 2026-02-26 19:24:32 -05:00
Claude
3b7fcc5ebc feat: add in-browser local model support for iPhone via WebLLM
Enable Timmy to run directly on iPhone by loading a small LLM into
the browser via WebGPU (Safari 26+ / iOS 26+). No server connection
required — fully sovereign, fully offline.

New files:
- static/local_llm.js: WebLLM wrapper with model catalogue, WebGPU
  detection, streaming chat, and progress callbacks
- templates/mobile_local.html: Mobile-optimized UI with model
  selector, download progress, LOCAL/SERVER badge, and chat
- tests/dashboard/test_local_models.py: 31 tests covering routes,
  config, template UX, JS asset, and XSS prevention

Changes:
- config.py: browser_model_enabled, browser_model_id,
  browser_model_fallback settings
- routes/mobile.py: /mobile/local page, /mobile/local-models API
- base.html: LOCAL AI nav link

Supported models: SmolLM2-360M (~200MB), Qwen2.5-0.5B (~350MB),
SmolLM2-1.7B (~1GB), Llama-3.2-1B (~700MB). Falls back to
server-side Ollama when local model is unavailable.

https://claude.ai/code/session_01Cqkvr4sZbED7T3iDu1rwSD
2026-02-27 00:03:05 +00:00
Alexander Whitestone
528c86298a Merge pull request #59 from AlexanderWhitestone/claude/refactoring-phase-two-lhBGv 2026-02-26 18:37:24 -05:00
Claude
3adc18c208 chore: gitignore src/data/ (test runtime artifacts)
Test runs generate src/data/swarm.db and src/data/self_modify_reports/
which should not be tracked.

https://claude.ai/code/session_01JNjWfHqusjT3aiN4vvYgUk
2026-02-26 22:09:04 +00:00
Claude
89e677e5cc chore: remove accidentally tracked self_modify_reports
These test artifacts are already in .gitignore (data/self_modify_reports/)
but were included because they landed in src/data/ during test runs.

https://claude.ai/code/session_01JNjWfHqusjT3aiN4vvYgUk
2026-02-26 22:07:59 +00:00
Claude
9f4c809f70 refactor: Phase 2b — consolidate 28 modules into 14 packages
Complete the module consolidation planned in REFACTORING_PLAN.md:

Modules merged:
- work_orders/ + task_queue/ → swarm/ (subpackages)
- self_modify/ + self_tdd/ + upgrades/ → self_coding/ (subpackages)
- tools/ → creative/tools/
- chat_bridge/ + telegram_bot/ + shortcuts/ + voice/ → integrations/ (new)
- ws_manager/ + notifications/ + events/ + router/ → infrastructure/ (new)
- agents/ + agent_core/ + memory/ → timmy/ (subpackages)

Updated across codebase:
- 66 source files: import statements rewritten
- 13 test files: import + patch() target strings rewritten
- pyproject.toml: wheel includes (28→14), entry points updated
- CLAUDE.md: singleton paths, module map, entry points table
- AGENTS.md: file convention updates
- REFACTORING_PLAN.md: execution status, success metrics

Extras:
- Module-level CLAUDE.md added to 6 key packages (Phase 6.2)
- Zero test regressions: 1462 tests passing

https://claude.ai/code/session_01JNjWfHqusjT3aiN4vvYgUk
2026-02-26 22:07:41 +00:00
Alexander Whitestone
24c3d33c3b Merge pull request #58 from AlexanderWhitestone/claude/plan-repo-refactoring-hgskF 2026-02-26 16:33:11 -05:00
Claude
f15559482b docs: Update REFACTORING_PLAN.md with execution status
Mark completed phases (1, 2a, 3, 4, 6) and document remaining work
(full module consolidation, package extraction) with guidance on
incremental execution approach.

https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 21:32:18 +00:00
Claude
d2c80fbf4c refactor: Phase 2a — consolidate dashboard routes (27→22 files)
Merge related route files to reduce sprawl:
- voice.py ← voice_enhanced.py (enhanced pipeline merged in)
- swarm.py ← swarm_internal.py + swarm_ws.py (internal API + WebSocket)
- self_coding.py ← self_modify.py (self-modify endpoints merged in)
- Delete mobile_test.py route + template (test-only page, not for prod)
- Delete test_xss_prevention.py (tested the deleted mobile_test page)

Update app.py to use consolidated imports.
Update test_voice_enhanced.py patch paths.
Remove mobile_test.py from coverage omit (file deleted).

27 route files → 22. Tests: 1502 passed (1 removed with deleted page).

https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 21:30:39 +00:00
Claude
4e11dd2490 refactor: Phase 3 — reorganize tests into module-mirroring subdirectories
Move 97 test files from flat tests/ into 13 subdirectories:
  tests/dashboard/   (8 files — routes, mobile, mission control)
  tests/swarm/       (17 files — coordinator, docker, routing, tasks)
  tests/timmy/       (12 files — agent, backends, CLI, tools)
  tests/self_coding/  (14 files — git safety, indexer, self-modify)
  tests/lightning/   (3 files — L402, LND, interface)
  tests/creative/    (8 files — assembler, director, image/music/video)
  tests/integrations/ (10 files — chat bridge, telegram, voice, websocket)
  tests/mcp/         (4 files — bootstrap, discovery, executor)
  tests/spark/       (3 files — engine, tools, events)
  tests/hands/       (3 files — registry, oracle, phase5)
  tests/scripture/   (1 file)
  tests/infrastructure/ (3 files — router cascade, API)
  tests/security/    (3 files — XSS, regression)

Fix Path(__file__) reference in test_mobile_scenarios.py for new depth.
Add __init__.py to all test subdirectories.

Tests: 1503 passed, 9 failed (pre-existing), 53 errors (pre-existing)

https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 21:21:28 +00:00
Claude
6045077144 refactor: Phase 1/4/6 — doc cleanup, config fix, token optimization
Phase 1 — Documentation cleanup:
- Slim README 303→93 lines (remove duplicated architecture, config tables)
- Slim CLAUDE.md 267→80 lines (remove project layout, env vars, CI section)
- Slim AGENTS.md 342→72 lines (remove duplicated patterns, running locally)
- Delete MEMORY.md, WORKSET_PLAN.md, WORKSET_PLAN_PHASE2.md (session docs)
- Archive PLAN.md, IMPLEMENTATION_SUMMARY.md to docs/
- Move QUALITY_ANALYSIS.md, QUALITY_REVIEW_REPORT.md to docs/
- Move apply_security_fixes.py, activate_self_tdd.sh to scripts/

Phase 4 — Config & build cleanup:
- Fix wheel build: add 11 missing modules to pyproject.toml include list
- Add pytest markers (unit, integration, dashboard, swarm, slow)
- Add data/self_modify_reports/ and .handoff/ to .gitignore

Phase 6 — Token optimization:
- Add docstrings to 15 __init__.py files that were empty
- Create __init__.py for events/, memory/, upgrades/ modules

Root markdown: 87KB → ~18KB (79% reduction)

https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 21:03:15 +00:00
Claude
31760682f6 docs: Add comprehensive architectural refactoring plan
Full VP-engineering-level review of the codebase identifying 8 problems
(monolith sprawl, dashboard gravity well, doc entropy, test skeleton
bloat, unclear project boundaries, broken wheel build, dashboard
coupling, overscoped conftest) and proposing 6 phases of incremental
refactoring from low-risk doc cleanup to potential package extraction.

Key findings:
- 28 modules in src/, 11 missing from wheel build
- 87KB of root markdown with massive duplication
- 61 of 97 test files are empty skeletons (0 test functions)
- Dashboard routes: 27 files, 4,562 lines (gravity well)
- 4 autouse fixtures run on every test regardless of need

https://claude.ai/code/session_019oMFNvD8uSGSSmBMGkBfQN
2026-02-26 20:42:02 +00:00
Alexander Payne
d9e556d4c1 fix: Upgrade model to llama3.1:8b-instruct + fix git tool cwd
Change 1: Model Upgrade (Primary Fix)
- Changed default model from llama3.2 to llama3.1:8b-instruct
- llama3.1:8b-instruct is fine-tuned for reliable tool/function calling
- llama3.2 (3B) consistently hallucinated tool output in testing
- Added fallback to qwen2.5:14b if primary unavailable

Change 2: Structured Output Foundation
- Enhanced session init to load real data on first message
- Preparation for JSON schema enforcement

Change 3: Git Tool Working Directory Fix
- Rewrote git_tools.py to use subprocess with cwd=REPO_ROOT
- REPO_ROOT auto-detected at module load time
- All git commands now run from correct directory

Change 4: Session Init with Git Log
- _session_init() reads git log --oneline -15 on first message
- Recent commits prepended to system prompt
- Timmy can now answer 'what's new?' from actual commit data

Change 5: Documentation
- Updated README with new model requirement
- Added CHANGELOG_2025-02-27.md

User must run: ollama pull llama3.1:8b-instruct

All 18 git tool tests pass.
2026-02-26 13:42:36 -05:00
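Note on d9e556d4c1: detecting the repo root and running git with an explicit cwd can be sketched as below. This assumes the root is found by walking up from a starting path until a .git entry appears; the names `find_repo_root` and `run_git` are hypothetical and the actual git_tools.py may differ.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk up from `start` and return the first directory containing .git."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / ".git").exists():
            return candidate
    return None  # not inside a git checkout


def run_git(args: List[str], repo_root: Path) -> str:
    """Run a git command from repo_root regardless of the process cwd."""
    result = subprocess.run(
        ["git", *args],
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With these, `run_git(["log", "--oneline", "-15"], root)` corresponds to the command the commit says `_session_init()` reads on the first message.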
Alexander Whitestone
f403d69bc1 Merge pull request #56 from AlexanderWhitestone/feature/hands-infrastructure-phase3
feat: Hands Infrastructure + 6 Autonomous Agents
2026-02-26 13:10:38 -05:00
Alexander Payne
9edcc627ea docs: Update README with all 6 Hands
Update Hands documentation to include Phase 5 additions:
- Scout: hourly OSINT monitoring
- Scribe: daily content production
- Ledger: 6-hour treasury tracking
- Weaver: weekly creative pipeline

Total: 6 autonomous Hands using existing agent framework.
2026-02-26 13:09:03 -05:00
Alexander Payne
7b26922339 test: Phase 5 Hands tests
Add comprehensive tests for new Hands:

TestScoutHand:
- Directory structure, TOML validity, SYSTEM.md
- Registry loading

TestScribeHand:
- Same validation pattern

TestLedgerHand:
- Same validation pattern

TestWeaverHand:
- Same validation pattern

TestPhase5Schedules:
- Scout: hourly (0 * * * *)
- Scribe: daily 9am (0 9 * * *)
- Ledger: every 6 hours (0 */6 * * *)
- Weaver: Sunday 10am (0 10 * * 0)

TestPhase5ApprovalGates:
- All 4 Hands have approval gates

TestAllHandsLoad:
- All 6 Hands load together

25 tests total, all passing.
2026-02-26 13:08:48 -05:00
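Note on 7b26922339: the four cron expressions in TestPhase5Schedules can be checked against concrete timestamps with a very small matcher. The sketch below is a simplified illustration and not a full cron implementation: it supports only `*`, `*/n`, and single integers, and the step form is only correct for fields that start at 0 (minute and hour). The names `cron_matches` and `SCHEDULES` are hypothetical; the project itself uses APScheduler according to the log.

```python
from datetime import datetime

# Expressions copied from the commit message; the dict itself is an assumption.
SCHEDULES = {
    "scout": "0 * * * *",     # hourly
    "scribe": "0 9 * * *",    # daily at 9am
    "ledger": "0 */6 * * *",  # every 6 hours
    "weaver": "0 10 * * 0",   # Sunday at 10am
}


def _field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', or a single integer only."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value == int(field)


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if the five-field cron expression fires at `when`."""
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )
```

For instance, `cron_matches(SCHEDULES["weaver"], datetime(2026, 3, 1, 10, 0))` checks the Sunday 10am schedule against a Sunday timestamp.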
Alexander Payne
a8f44c159e feat: Phase 5 Additional Hands (Scout, Scribe, Ledger, Weaver)
Add 4 new autonomous Hands using existing agent framework:

Scout Hand (hands/scout/):
- OSINT monitoring every hour
- Monitors: HN, Reddit, RSS for Bitcoin/sovereign AI topics
- Uses: web_search, rss_fetch, sentiment analysis

Scribe Hand (hands/scribe/):
- Content production daily at 9am
- Produces: blog posts, docs, changelog
- Uses: file ops, git tools, codebase indexer

Ledger Hand (hands/ledger/):
- Treasury tracking every 6 hours
- Monitors: on-chain, Lightning balances, payment flows
- Uses: lightning_balance, onchain_balance, payment_audit

Weaver Hand (hands/weaver/):
- Creative pipeline weekly on Sundays
- Orchestrates: Pixel + Lyra + Reel for video production
- Uses: creative_director, project management tools

All Hands configured with:
- HAND.toml manifests with schedules
- SYSTEM.md prompts
- Approval gates for write actions
- Dashboard + Telegram output
2026-02-26 13:07:43 -05:00
Alexander Payne
b884884bad docs: Hands documentation in README (Phase 4)
Update README with Hands subsystem documentation:

- Add Hands to 'What's built' table
- New 'Hands — Autonomous Agents' section with:
  - Built-in Hands reference (Oracle, Sentinel)
  - Dashboard URL
  - HAND.toml example
- Update project layout to include src/hands/ and hands/
- Update roadmap: Exodus now includes Hands

Complete documentation for Phase 3-4 Hands infrastructure.
2026-02-26 12:58:21 -05:00
Alexander Payne
7508ef13c1 test: Oracle and Sentinel Hands tests (Phase 4)
Add validation tests for the first two autonomous Hands:

TestOracleHand:
- Directory structure exists
- HAND.toml is valid TOML with correct config
- SYSTEM.md exists with proper content
- Skills directory populated
- Loads correctly in HandRegistry

TestSentinelHand:
- Same validation pattern as Oracle

TestHandSchedules:
- Oracle runs twice daily (7am, 7pm UTC)
- Sentinel runs every 15 minutes

TestHandApprovalGates:
- Both Hands have approval gates configured
- Safety model enforced

14 tests total, all passing.
2026-02-26 12:57:41 -05:00
Alexander Payne
1ba03e4ce2 feat: Oracle and Sentinel Hands (Phase 4)
Add the first two autonomous Hands to validate infrastructure:

Oracle Hand (hands/oracle/):
- Bitcoin intelligence briefing, 2x daily (7am, 7pm)
- Monitors: price action, on-chain metrics, macro context
- Tools: mempool_fetch, fee_estimate, price_fetch, whale_alert
- Output: Dashboard + Telegram, markdown format
- Safety: Broadcast requires approval (5min auto)

Sentinel Hand (hands/sentinel/):
- System health monitoring, every 15 minutes
- Monitors: dashboard, agents, database, disk, memory
- Tools: system_stats, db_health, agent_status, disk_check
- Output: Dashboard + Telegram, JSON format
- Safety: Service restart requires approval (1min auto)

Both include:
- HAND.toml configuration with schedules
- SYSTEM.md with complete prompts
- skills/ directory with specialized knowledge
- Approval gates for write actions
2026-02-26 12:57:07 -05:00
Alexander Whitestone
536a371d48 Merge pull request #55 from AlexanderWhitestone/feature/hands-infrastructure-phase3
Feature/hands infrastructure phase3
2026-02-26 12:55:28 -05:00
Alexander Payne
a1d00da2de test: Hands infrastructure tests (Phase 3)
Add comprehensive test suite for Hands framework:

TestHandRegistry:
- Load all Hands from directory
- Get Hand by name (with not-found handling)
- Get scheduled vs all Hands
- State management (status updates)
- Approval queue operations

TestHandScheduler:
- Scheduler initialization
- Schedule Hand with cron
- Get scheduled jobs list
- Manual trigger execution

TestHandRunner:
- Load system prompts from SYSTEM.md
- Load skills from skills/ directory
- Build execution prompts

TestHandConfig:
- HandConfig creation and validation
- Cron schedule validation

TestHandModels:
- HandStatus enum values
- HandState serialization to dict

17 tests total, all passing.
2026-02-26 12:49:06 -05:00
Alexander Payne
d7aaae74d5 feat: Hands Dashboard Routes and UI (Phase 3.6)
Add dashboard for managing autonomous Hands:

Routes (src/dashboard/routes/hands.py):
- GET /api/hands - List all Hands with status
- GET /api/hands/{name} - Get Hand details
- POST /api/hands/{name}/trigger - Manual trigger
- POST /api/hands/{name}/pause - Pause scheduled Hand
- POST /api/hands/{name}/resume - Resume paused Hand
- GET /api/approvals - List pending approvals
- POST /api/approvals/{id}/approve - Approve request
- POST /api/approvals/{id}/reject - Reject request
- GET /api/executions - List execution history

Templates:
- hands.html - Main dashboard page
- partials/hands_list.html - Active Hands list
- partials/approvals_list.html - Pending approvals
- partials/hand_executions.html - Execution history

Integration:
- Wired up in app.py
- Navigation links in base.html
2026-02-26 12:46:48 -05:00
Alexander Payne
73cf780656 feat: HandRunner and hands module init (Phase 3.5)
Add HandRunner for executing Hands:

- hands/runner.py: Hand execution engine
  - Load SYSTEM.md and SKILL.md files
  - Inject domain expertise into LLM context
  - Check and handle approval gates
  - Execute tool loop with LLM
  - Deliver output to dashboard/channel/file
  - Log execution records

- hands/__init__.py: Module exports
  - Export all public classes and models
  - Usage documentation

The HandRunner completes the core Hands infrastructure.
2026-02-26 12:43:40 -05:00
Alexander Payne
8a952f6818 feat: Hands Infrastructure - Models, Registry, Scheduler (Phase 3.1-3.3)
Add core Hands infrastructure:

- hands/models.py: Pydantic models for HAND.toml schema
  - HandConfig: Complete hand configuration
  - HandState: Runtime state tracking
  - HandExecution: Execution records
  - ApprovalRequest: Approval queue entries

- hands/registry.py: HandRegistry for loading and indexing
  - Load Hands from hands/ directory
  - Parse HAND.toml manifests
  - SQLite indexing for fast lookup
  - Approval queue management
  - Execution history logging

- hands/scheduler.py: APScheduler-based scheduling
  - Cron and interval triggers
  - Job management (schedule, pause, resume, unschedule)
  - Hand execution wrapper
  - Manual trigger support
2026-02-26 12:41:52 -05:00
Alexander Whitestone
7de9db32ea Merge pull request #54 from AlexanderWhitestone/feature/self-coding-rebased
Feature/self coding rebased
2026-02-26 12:36:37 -05:00
Alexander Payne
4d3995012a test: Self-Coding Dashboard Tests
Add tests for dashboard routes:

- Page routes (main page, journal partial, stats partial, execute form)
- API routes (journal list/detail, stats, codebase summary/reindex)
- Execute endpoints (API and HTMX)
- Navigation integration (link in header)

Tests verify endpoints return correct status codes and content types.
2026-02-26 12:28:30 -05:00
Alexander Payne
62365cc9b2 feat: Wire up Self-Coding Dashboard
Integrate self-coding routes into dashboard:

Changes:
- Add import for self_coding_router in app.py
- Include self_coding_router in FastAPI app
- Add SELF-CODING link to desktop navigation
- Add SELF-CODING link to mobile navigation

The self-coding dashboard is now accessible at /self-coding
2026-02-26 12:28:30 -05:00
Alexander Payne
e81be8aed7 feat: Self-Coding Dashboard HTMX Templates
Add complete UI for self-coding dashboard:

Templates:
- self_coding.html - Main dashboard page with layout
- partials/self_coding_stats.html - Stats cards (total, success rate, etc.)
- partials/journal_entries.html - List of modification attempts
- partials/journal_entry_detail.html - Expanded view of single attempt
- partials/execute_form.html - Task execution form
- partials/execute_result.html - Execution result display
- partials/error.html - Error message display

Features:
- HTMX-powered dynamic updates
- Real-time journal filtering (all/success/failure)
- Modal dialog for task execution
- Responsive Bootstrap 5 styling
- Automatic refresh after successful execution
2026-02-26 12:28:05 -05:00