Compare commits

..

86 Commits

Author SHA1 Message Date
f634886f9b feat: FastAPI Morrowind harness + SOUL.md framework (#821, #854)
## FastAPI Morrowind Harness (#821)
- GET /api/v1/morrowind/perception — reads perception.json, validates against PerceptionOutput
- POST /api/v1/morrowind/command — validates CommandInput, logs via CommandLogger, stubs bridge
- GET /api/v1/morrowind/status — connection state, last perception, queue depth, vitals
- Router registered in dashboard app.py

## SOUL.md Framework (#854)
- Template, authoring guide, and role extensions docs in docs/soul-framework/
- SoulLoader: parse SOUL.md files into structured SoulDocument
- SoulValidator: check required sections, detect contradictions, validate structure
- SoulVersioner: hash-based change detection and JSONL history tracking
- 39 tests covering all endpoints and framework components

Depends on #864

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 22:49:56 +00:00
215329146a feat(morrowind): add Perception/Command protocol + SQLite command log (#859, #855)
Implement two foundational infrastructure pieces for the Morrowind integration:

1. Perception/Command Protocol (Issue #859):
   - Formal spec document with JSON schemas, API contracts, versioning strategy
   - Engine-agnostic design following the Falsework Rule
   - Pydantic v2 models (PerceptionOutput, CommandInput) for runtime validation

2. SQLite Command Log + Training Pipeline (Issue #855):
   - SQLAlchemy model for command_log table with full indexing
   - CommandLogger class with log_command(), query(), export_training_data()
   - TrainingExporter with chat-completion, episode, and instruction formats
   - Storage management (rotation/archival) utilities
   - Alembic migration for the new table

Includes 39 passing tests covering schema validation, logging, querying, and export.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 22:33:13 +00:00
e4864b14f2 [kimi] Add Submit Job modal with client-side validation (#754) (#832) 2026-03-21 22:14:19 +00:00
e99b09f700 [kimi] Add About/Info panel to Matrix UI (#755) (#831) 2026-03-21 22:06:18 +00:00
2ab6539564 [kimi] Add ConnectionPool class with unit tests (#769) (#830) 2026-03-21 22:02:08 +00:00
28b8673584 [kimi] Add unit tests for voice_tts.py (#768) (#829) 2026-03-21 21:56:45 +00:00
2f15435fed [kimi] Implement quick health snapshot before coding (#710) (#828) 2026-03-21 21:53:40 +00:00
dfe40f5fe6 [kimi] Centralize agent token rules and hooks for automations (#711) (#792) 2026-03-21 21:44:35 +00:00
6dd48685e7 [kimi] Weekly narrative summary generator (#719) (#791) 2026-03-21 21:36:40 +00:00
a95cf806c8 [kimi] Implement token quest system for agents (#713) (#789) 2026-03-21 20:45:35 +00:00
19367d6e41 [kimi] OpenClaw architecture and deployment research report (#721) (#788) 2026-03-21 20:36:23 +00:00
7e983fcdb3 [kimi] Add dashboard card for Daily Run and triage metrics (#718) (#786) 2026-03-21 19:58:25 +00:00
46f89d59db [kimi] Add Golden Path generator for longer sessions (#717) (#785) 2026-03-21 19:41:34 +00:00
e3a0f1d2d6 [kimi] Implement Daily Run orchestration script (10-minute ritual) (#703) (#783) 2026-03-21 19:24:43 +00:00
2a9d21cea1 [kimi] Implement Daily Run orchestration script (10-minute ritual) (#703) (#783)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-21 19:24:42 +00:00
05b87c3ac1 [kimi] Implement Timmy control panel CLI entry point (#702) (#767) 2026-03-21 19:15:27 +00:00
8276279775 [kimi] Create central Timmy Automations module (#701) (#766) 2026-03-21 19:09:38 +00:00
d1f5c2714b [kimi] refactor: extract helpers from chat() (#627) (#690)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-21 18:09:22 +00:00
65df56414a [kimi] Add visitor_state message handler (#670) (#699)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-21 18:08:53 +00:00
b08ce53bab [kimi] Refactor request_logging.py::dispatch (#616) (#765) 2026-03-21 18:06:34 +00:00
e0660bf768 [kimi] refactor: extract helpers from chat() (#627) (#690)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-21 18:01:27 +00:00
dc9f0c04eb [kimi] Add rate limiting middleware for Matrix API endpoints (#683) (#746) 2026-03-21 16:23:16 +00:00
815933953c [kimi] Add WebSocket authentication for Matrix connections (#682) (#744) 2026-03-21 16:14:05 +00:00
d54493a87b [kimi] Add /api/matrix/health endpoint (#685) (#745) 2026-03-21 15:51:29 +00:00
f7404f67ec [kimi] Add system_status message producer (#681) (#743) 2026-03-21 15:13:01 +00:00
5f4580f98d [kimi] Add matrix config loader utility (#680) (#742) 2026-03-21 15:05:06 +00:00
695d1401fd [kimi] Add CORS config for Matrix frontend origin (#679) (#741) 2026-03-21 14:56:43 +00:00
ddadc95e55 [kimi] Add /api/matrix/memory/search endpoint (#678) (#740) 2026-03-21 14:52:31 +00:00
8fc8e0fc3d [kimi] Add /api/matrix/thoughts endpoint for recent thought stream (#677) (#739) 2026-03-21 14:44:46 +00:00
ada0774ca6 [kimi] Add Pip familiar state to agent_state messages (#676) (#738) 2026-03-21 14:37:39 +00:00
2a7b6d5708 [kimi] Add /api/matrix/bark endpoint — HTTP fallback for bark messages (#675) (#737) 2026-03-21 14:32:04 +00:00
9d4ac8e7cc [kimi] Add /api/matrix/config endpoint for world configuration (#674) (#736) 2026-03-21 14:25:19 +00:00
c9601ba32c [kimi] Add /api/matrix/agents endpoint for Matrix visualization (#673) (#735) 2026-03-21 14:18:46 +00:00
646eaefa3e [kimi] Add produce_thought() to stream thinking to Matrix (#672) (#734) 2026-03-21 14:09:19 +00:00
2fa5b23c0c [kimi] Add bark message producer for Matrix bark messages (#671) (#732) 2026-03-21 14:01:42 +00:00
9b57774282 [kimi] feat: pre-cycle state validation for stale cycle_result.json (#661) (#666)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-21 13:53:11 +00:00
62bde03f9e [kimi] feat: add agent_state message producer (#669) (#698) 2026-03-21 13:46:10 +00:00
3474eeb4eb [kimi] refactor: extract presence state serializer from workshop heartbeat (#668) (#697) 2026-03-21 13:41:42 +00:00
e92e151dc3 [kimi] refactor: extract WebSocket message types into shared protocol module (#667) (#696) 2026-03-21 13:37:28 +00:00
1f1bc222e4 [kimi] test: add comprehensive tests for spark modules (#659) (#695) 2026-03-21 13:32:53 +00:00
cc30bdb391 [kimi] test: add comprehensive tests for multimodal.py (#658) (#694) 2026-03-21 04:00:53 +00:00
6f0863b587 [kimi] test: add comprehensive tests for config.py (#648) (#693) 2026-03-21 03:54:54 +00:00
e3d425483d [kimi] fix: add logging to silent except Exception handlers (#646) (#692) 2026-03-21 03:50:26 +00:00
c9445e3056 [kimi] refactor: extract helpers from CSRFMiddleware.dispatch (#628) (#691) 2026-03-21 03:41:09 +00:00
11cd2e3372 [kimi] refactor: extract helpers from chat() (#627) (#686) 2026-03-21 03:33:16 +00:00
9d0f5c778e [loop-cycle-2] fix: resolve endpoint before execution in CSRF middleware (#626) (#656) 2026-03-20 23:05:09 +00:00
d2a5866650 [loop-cycle-1] fix: use config for xAI base URL (#647) (#655) 2026-03-20 22:47:05 +00:00
2381d0b6d0 refactor: break up _create_bug_report — extract helpers (#645)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 22:03:40 +00:00
03ad2027a4 refactor: break up _load_config into helpers (#656)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:48:08 -04:00
2bfc44ea1b [loop-cycle-1] refactor: extract _try_prune helper and fix f-string logging (#653) (#657) 2026-03-20 17:44:32 -04:00
fe1fa78ef1 refactor: break up _create_default — extract template constant (#650)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:39:17 -04:00
3c46a1b202 refactor: extract _create_default template to module constant (#649)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:36:29 -04:00
001358c64f refactor: break up create_gitea_issue_via_mcp into helpers (#647)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:29:55 -04:00
faad0726a2 [loop-cycle-1666] fix: replace remaining deprecated utcnow() in calm.py (#633) (#644) 2026-03-20 17:22:35 -04:00
dd4410fe57 refactor: break up create_gitea_issue_via_mcp into helpers (#646)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:22:33 -04:00
ef7f31070b refactor: break up self_reflect into helpers (#643)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:09:28 -04:00
6f66670396 [loop-cycle-1664] fix: replace deprecated datetime.utcnow() (#633) (#636) 2026-03-20 17:01:19 -04:00
4cdd82818b refactor: break up get_state_dict into helpers (#632)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 17:01:16 -04:00
99ad672e4d refactor: break up delegate_to_kimi into helpers (#637)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:52:21 -04:00
a3f61c67d3 refactor: break up post_morning_ritual into helpers (#631)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:43:14 -04:00
32dbdc68c8 refactor: break up should_use_tools into helpers (#624)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:31:34 -04:00
84302aedac fix: pass max_tokens to Ollama provider in cascade router (#622)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:27:24 -04:00
2c217104db feat: real-time Spark visualization in Mission Control (#615)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:22:15 -04:00
7452e8a4f0 fix: add missing tests for Tower route /tower (#621)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:22:13 -04:00
9732c80892 feat: Real-time Spark Visualization in Tower Dashboard (#612)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:10:42 -04:00
f3b3d1e648 [loop-cycle-1658] feat: provider health history endpoint (#457) (#611) 2026-03-20 16:09:20 -04:00
4ba8d25749 feat: Lightning Network integration for tool usage (#610)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 13:07:02 -04:00
2622f0a0fb [loop-cycle-1242] fix: cycle_retro reads cycle_result.json (#603) (#609) 2026-03-20 12:55:01 -04:00
e3d60b89a9 fix: remove model_size kwarg from create_timmy() CLI calls (#606)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 12:48:49 -04:00
6214ad3225 refactor: extract helpers from run_self_tests() (#601)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 12:40:44 -04:00
5f5da2163f [loop-cycle] refactor: extract helpers from _handle_tool_confirmation (#592) (#600) 2026-03-20 12:32:24 -04:00
0029c34bb1 refactor: break up search_thoughts() into focused helpers (#597)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 12:26:51 -04:00
2577b71207 fix: capture thought timestamp at cycle start, not after LLM call (#590)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 12:13:48 -04:00
1a8b8ecaed [loop-cycle-1235] refactor: break up _migrate_schema() into focused helpers (#591) (#595) 2026-03-20 12:07:15 -04:00
d821e76589 [loop-cycle-1234] refactor: break up _generate_avatar_image (#563) (#589) 2026-03-20 11:57:53 -04:00
bc010ecfba [loop-cycle-1233] refactor: add docstrings to calm.py route handlers (#569) (#585) 2026-03-20 11:44:06 -04:00
faf6c1a5f1 [loop-cycle-1233] refactor: break up BaseAgent.run() (#561) (#584) 2026-03-20 11:24:36 -04:00
48103bb076 [loop-cycle-956] refactor: break up _handle_message() into focused helpers (#553) (#574) 2026-03-19 21:42:01 -04:00
9f244ffc70 refactor: break up _record_utterance() into focused helpers (#572)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 21:37:32 -04:00
0162a604be refactor: break up voice_loop.py::run() into focused helpers (#567)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 21:33:59 -04:00
2326771c5a [loop-cycle-953] refactor: DRY _import_creative_catalogs() (#560) (#565) 2026-03-19 21:21:23 -04:00
8f6cf2681b refactor: break up search_memories() into focused helpers (#557)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 21:16:07 -04:00
f361893fdd [loop-cycle-951] refactor: break up _migrate_schema() (#552) (#558) 2026-03-19 21:11:02 -04:00
7ad0ee17b6 refactor: break up shell.py::run() into helpers (#551)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 21:04:10 -04:00
29220b6bdd refactor: break up api_chat() into helpers (#547)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 21:02:04 -04:00
2849dba756 [loop-cycle-948] refactor: break up _gather_system_snapshot() into helpers (#540) (#549) 2026-03-19 20:52:13 -04:00
135 changed files with 25527 additions and 1321 deletions

1
.gitignore vendored
View File

@@ -73,7 +73,6 @@ morning_briefing.txt
markdown_report.md
data/timmy_soul.jsonl
scripts/migrate_to_zeroclaw.py
src/infrastructure/db_pool.py
workspace/
# Loop orchestration state

33
config/matrix.yaml Normal file
View File

@@ -0,0 +1,33 @@
# Matrix World Configuration
# Serves lighting, environment, and feature settings to the Matrix frontend.
lighting:
ambient_color: "#FFAA55" # Warm amber (Workshop warmth)
ambient_intensity: 0.5
point_lights:
- color: "#FFAA55" # Warm amber (Workshop center light)
intensity: 1.2
position: { x: 0, y: 5, z: 0 }
- color: "#3B82F6" # Cool blue (Matrix accent)
intensity: 0.8
position: { x: -5, y: 3, z: -5 }
- color: "#A855F7" # Purple accent
intensity: 0.6
position: { x: 5, y: 3, z: 5 }
environment:
rain_enabled: false
starfield_enabled: true # Cool blue starfield (Matrix feel)
fog_color: "#0f0f23"
fog_density: 0.02
features:
chat_enabled: true
visitor_avatars: true
pip_familiar: true
workshop_portal: true
agents:
default_count: 5
max_count: 20
agents: []

178
config/quests.yaml Normal file
View File

@@ -0,0 +1,178 @@
# ── Token Quest System Configuration ─────────────────────────────────────────
#
# Quests are special objectives that agents (and humans) can complete for
# bonus tokens. Each quest has:
# - id: Unique identifier
# - name: Display name
# - description: What the quest requires
# - reward_tokens: Number of tokens awarded on completion
# - criteria: Detection rules for completion
# - enabled: Whether this quest is active
# - repeatable: Whether this quest can be completed multiple times
# - cooldown_hours: Minimum hours between completions (if repeatable)
#
# Quest Types:
# - issue_count: Complete when N issues matching criteria are closed
# - issue_reduce: Complete when open issue count drops by N
# - docs_update: Complete when documentation files are updated
# - test_improve: Complete when test coverage/cases improve
# - daily_run: Complete Daily Run session objectives
# - custom: Special quests with manual completion
#
# ── Active Quests ─────────────────────────────────────────────────────────────
quests:
# ── Daily Run & Test Improvement Quests ───────────────────────────────────
close_flaky_tests:
id: close_flaky_tests
name: Flaky Test Hunter
description: Close 3 issues labeled "flaky-test"
reward_tokens: 150
type: issue_count
enabled: true
repeatable: true
cooldown_hours: 24
criteria:
issue_labels:
- flaky-test
target_count: 3
issue_state: closed
lookback_days: 7
notification_message: "Quest Complete! You closed 3 flaky-test issues and earned {tokens} tokens."
reduce_p1_issues:
id: reduce_p1_issues
name: Priority Firefighter
description: Reduce open P1 Daily Run issues by 2
reward_tokens: 200
type: issue_reduce
enabled: true
repeatable: true
cooldown_hours: 48
criteria:
issue_labels:
- layer:triage
- P1
target_reduction: 2
lookback_days: 3
notification_message: "Quest Complete! You reduced P1 issues by 2 and earned {tokens} tokens."
improve_test_coverage:
id: improve_test_coverage
name: Coverage Champion
description: Improve test coverage by 5% or add 10 new test cases
reward_tokens: 300
type: test_improve
enabled: true
repeatable: false
criteria:
coverage_increase_percent: 5
min_new_tests: 10
notification_message: "Quest Complete! You improved test coverage and earned {tokens} tokens."
complete_daily_run_session:
id: complete_daily_run_session
name: Daily Runner
description: Successfully complete 5 Daily Run sessions in a week
reward_tokens: 250
type: daily_run
enabled: true
repeatable: true
cooldown_hours: 168 # 1 week
criteria:
min_sessions: 5
lookback_days: 7
notification_message: "Quest Complete! You completed 5 Daily Run sessions and earned {tokens} tokens."
# ── Documentation & Maintenance Quests ────────────────────────────────────
improve_automation_docs:
id: improve_automation_docs
name: Documentation Hero
description: Improve documentation for automations (update 3+ doc files)
reward_tokens: 100
type: docs_update
enabled: true
repeatable: true
cooldown_hours: 72
criteria:
file_patterns:
- "docs/**/*.md"
- "**/README.md"
- "timmy_automations/**/*.md"
min_files_changed: 3
lookback_days: 7
notification_message: "Quest Complete! You improved automation docs and earned {tokens} tokens."
close_micro_fixes:
id: close_micro_fixes
name: Micro Fix Master
description: Close 5 issues labeled "layer:micro-fix"
reward_tokens: 125
type: issue_count
enabled: true
repeatable: true
cooldown_hours: 24
criteria:
issue_labels:
- layer:micro-fix
target_count: 5
issue_state: closed
lookback_days: 7
notification_message: "Quest Complete! You closed 5 micro-fix issues and earned {tokens} tokens."
# ── Special Achievements ──────────────────────────────────────────────────
first_contribution:
id: first_contribution
name: First Steps
description: Make your first contribution (close any issue)
reward_tokens: 50
type: issue_count
enabled: true
repeatable: false
criteria:
target_count: 1
issue_state: closed
lookback_days: 30
notification_message: "Welcome! You completed your first contribution and earned {tokens} tokens."
bug_squasher:
id: bug_squasher
name: Bug Squasher
description: Close 10 issues labeled "bug"
reward_tokens: 500
type: issue_count
enabled: true
repeatable: true
cooldown_hours: 168 # 1 week
criteria:
issue_labels:
- bug
target_count: 10
issue_state: closed
lookback_days: 7
notification_message: "Quest Complete! You squashed 10 bugs and earned {tokens} tokens."
# ── Quest System Settings ───────────────────────────────────────────────────
settings:
# Enable/disable quest notifications
notifications_enabled: true
# Maximum number of concurrent active quests per agent
max_concurrent_quests: 5
# Auto-detect quest completions on Daily Run metrics update
auto_detect_on_daily_run: true
# Gitea issue labels that indicate quest-related work
quest_work_labels:
- layer:triage
- layer:micro-fix
- layer:tests
- layer:economy
- flaky-test
- bug
- documentation

View File

@@ -0,0 +1,312 @@
# Morrowind Perception/Command Protocol Specification
**Version:** 1.0.0
**Status:** Draft
**Authors:** Timmy Infrastructure Team
**Date:** 2026-03-21
---
## 1. Overview
This document defines the **engine-agnostic Perception/Command protocol** used by Timmy's
heartbeat loop to observe the game world and issue commands. The protocol is designed
around the **Falsework Rule**: TES3MP (Morrowind) is scaffolding. If the engine swaps,
only the bridge and perception script change — the heartbeat, reasoning, and journal
remain sovereign.
### 1.1 Design Principles
- **Engine-agnostic**: Schemas reference abstract concepts (cells, entities, quests), not
Morrowind-specific internals.
- **Versioned**: Every payload carries a `protocol_version` so consumers can negotiate
compatibility.
- **Typed at the boundary**: Pydantic v2 models enforce validation on both the producer
(bridge) and consumer (heartbeat) side.
- **Logged by default**: Every command is persisted to the SQLite command log for
training-data extraction (see Issue #855).
---
## 2. Protocol Version Strategy
| Field | Type | Description |
| ------------------ | ------ | ------------------------------------ |
| `protocol_version` | string | SemVer string (e.g. `"1.0.0"`) |
### Compatibility Rules
- **Patch** bump (1.0.x): additive fields with defaults — fully backward-compatible.
- **Minor** bump (1.x.0): new optional endpoints or enum values — old clients still work.
- **Major** bump (x.0.0): breaking schema change — requires coordinated upgrade of bridge
and heartbeat.
Consumers MUST reject payloads whose major version exceeds their own.
---
## 3. Perception Output Schema
Returned by `GET /perception`. Represents a single snapshot of the game world as observed
by the bridge.
```json
{
"protocol_version": "1.0.0",
"timestamp": "2026-03-21T14:30:00Z",
"agent_id": "timmy",
"location": {
"cell": "Balmora",
"x": 1024.5,
"y": -512.3,
"z": 64.0,
"interior": false
},
"health": {
"current": 85,
"max": 100
},
"nearby_entities": [
{
"entity_id": "npc_001",
"name": "Caius Cosades",
"entity_type": "npc",
"distance": 12.5,
"disposition": 65
}
],
"inventory_summary": {
"gold": 150,
"item_count": 23,
"encumbrance_pct": 0.45
},
"active_quests": [
{
"quest_id": "mq_01",
"name": "Report to Caius Cosades",
"stage": 10
}
],
"environment": {
"time_of_day": "afternoon",
"weather": "clear",
"is_combat": false,
"is_dialogue": false
},
"raw_engine_data": {}
}
```
### 3.1 Field Reference
| Field | Type | Required | Description |
| -------------------- | ----------------- | -------- | ------------------------------------------------------------ |
| `protocol_version` | string | yes | Protocol SemVer |
| `timestamp` | ISO 8601 datetime | yes | When the snapshot was taken |
| `agent_id` | string | yes | Which agent this perception belongs to |
| `location.cell` | string | yes | Current cell/zone name |
| `location.x/y/z` | float | yes | World coordinates |
| `location.interior` | bool | yes | Whether the agent is indoors |
| `health.current` | int (0max) | yes | Current health |
| `health.max` | int (>0) | yes | Maximum health |
| `nearby_entities` | array | yes | Entities within perception radius (may be empty) |
| `inventory_summary` | object | yes | Lightweight inventory overview |
| `active_quests` | array | yes | Currently tracked quests |
| `environment` | object | yes | World-state flags |
| `raw_engine_data` | object | no | Opaque engine-specific blob (not relied upon by heartbeat) |
### 3.2 Entity Types
The `entity_type` field uses a controlled vocabulary:
| Value | Description |
| ---------- | ------------------------ |
| `npc` | Non-player character |
| `creature` | Hostile or neutral mob |
| `item` | Pickup-able world item |
| `door` | Door or transition |
| `container`| Lootable container |
---
## 4. Command Input Schema
Sent via `POST /command`. Represents a single action the agent wants to take in the world.
```json
{
"protocol_version": "1.0.0",
"timestamp": "2026-03-21T14:30:01Z",
"agent_id": "timmy",
"command": "move_to",
"params": {
"target_cell": "Balmora",
"target_x": 1050.0,
"target_y": -500.0
},
"reasoning": "Moving closer to Caius Cosades to begin the main quest dialogue.",
"episode_id": "ep_20260321_001",
"context": {
"perception_timestamp": "2026-03-21T14:30:00Z",
"heartbeat_cycle": 42
}
}
```
### 4.1 Field Reference
| Field | Type | Required | Description |
| ------------------------------ | ----------------- | -------- | ------------------------------------------------------- |
| `protocol_version` | string | yes | Protocol SemVer |
| `timestamp` | ISO 8601 datetime | yes | When the command was issued |
| `agent_id` | string | yes | Which agent is issuing the command |
| `command` | string (enum) | yes | Command type (see §4.2) |
| `params` | object | yes | Command-specific parameters (may be empty `{}`) |
| `reasoning` | string | yes | Natural-language explanation of *why* this command |
| `episode_id` | string | no | Groups commands into training episodes |
| `context` | object | no | Metadata linking command to its triggering perception |
### 4.2 Command Types
| Command | Description | Key Params |
| --------------- | ---------------------------------------- | ---------------------------------- |
| `move_to` | Navigate to coordinates or entity | `target_cell`, `target_x/y/z` |
| `interact` | Interact with entity (talk, activate) | `entity_id`, `interaction_type` |
| `use_item` | Use an inventory item | `item_id`, `target_entity_id?` |
| `wait` | Wait/idle for a duration | `duration_seconds` |
| `combat_action` | Perform a combat action | `action_type`, `target_entity_id` |
| `dialogue` | Choose a dialogue option | `entity_id`, `topic`, `choice_idx` |
| `journal_note` | Write an internal journal observation | `content`, `tags` |
| `noop` | Heartbeat tick with no action | — |
---
## 5. API Contracts
### 5.1 `GET /perception`
Returns the latest perception snapshot.
**Response:** `200 OK` with `PerceptionOutput` JSON body.
**Error Responses:**
| Status | Code | Description |
| ------ | ------------------- | ----------------------------------- |
| 503 | `BRIDGE_UNAVAILABLE`| Game bridge is not connected |
| 504 | `PERCEPTION_TIMEOUT`| Bridge did not respond in time |
| 422 | `SCHEMA_MISMATCH` | Bridge returned incompatible schema |
### 5.2 `POST /command`
Submit a command for the agent to execute.
**Request:** `CommandInput` JSON body.
**Response:** `202 Accepted`
```json
{
"status": "accepted",
"command_id": "cmd_abc123",
"logged": true
}
```
**Error Responses:**
| Status | Code | Description |
| ------ | -------------------- | ----------------------------------- |
| 400 | `INVALID_COMMAND` | Command type not recognized |
| 400 | `VALIDATION_ERROR` | Payload fails Pydantic validation |
| 409 | `COMMAND_CONFLICT` | Agent is busy executing another cmd |
| 503 | `BRIDGE_UNAVAILABLE` | Game bridge is not connected |
### 5.3 `GET /morrowind/status`
Health-check endpoint for the Morrowind bridge.
**Response:** `200 OK`
```json
{
"bridge_connected": true,
"engine": "tes3mp",
"protocol_version": "1.0.0",
"uptime_seconds": 3600,
"last_perception_at": "2026-03-21T14:30:00Z"
}
```
---
## 6. Engine-Swap Documentation (The Falsework Rule)
### What Changes
| Component | Changes on Engine Swap? | Notes |
| ---------------------- | ----------------------- | --------------------------------------------- |
| Bridge process | **YES** — replaced | New bridge speaks same protocol to new engine |
| Perception Lua script | **YES** — replaced | New engine's scripting language/API |
| `PerceptionOutput` | NO | Schema is engine-agnostic |
| `CommandInput` | NO | Schema is engine-agnostic |
| Heartbeat loop | NO | Consumes `PerceptionOutput`, emits `Command` |
| Reasoning/LLM layer | NO | Operates on abstract perception data |
| Journal system | NO | Writes `journal_note` commands |
| Command log + training | NO | Logs all commands regardless of engine |
| Dashboard WebSocket | NO | Separate protocol (`src/infrastructure/protocol.py`) |
### Swap Procedure
1. Implement new bridge that serves `GET /perception` and accepts `POST /command`.
2. Update `raw_engine_data` field documentation for the new engine.
3. Extend `entity_type` enum if the new engine has novel entity categories.
4. Bump `protocol_version` minor (or major if schema changes are required).
5. Run integration tests against the new bridge.
---
## 7. Error Handling Specification
### 7.1 Error Response Format
All error responses follow a consistent structure:
```json
{
"error": {
"code": "BRIDGE_UNAVAILABLE",
"message": "Human-readable error description",
"details": {},
"timestamp": "2026-03-21T14:30:00Z"
}
}
```
### 7.2 Error Codes
| Code | HTTP Status | Retry? | Description |
| -------------------- | ----------- | ------ | ---------------------------------------- |
| `BRIDGE_UNAVAILABLE` | 503 | yes | Bridge process not connected |
| `PERCEPTION_TIMEOUT` | 504 | yes | Bridge did not respond within deadline |
| `SCHEMA_MISMATCH` | 422 | no | Protocol version incompatibility |
| `INVALID_COMMAND` | 400 | no | Unknown command type |
| `VALIDATION_ERROR` | 400 | no | Pydantic validation failed |
| `COMMAND_CONFLICT` | 409 | yes | Agent busy — retry after current command |
| `INTERNAL_ERROR` | 500 | yes | Unexpected server error |
### 7.3 Retry Policy
Clients SHOULD implement exponential backoff for retryable errors:
- Initial delay: 100ms
- Max delay: 5s
- Max retries: 5
- Jitter: ±50ms
---
## 8. Appendix: Pydantic Model Reference
The canonical Pydantic v2 models live in `src/infrastructure/morrowind/schemas.py`.
These models serve as both runtime validation and living documentation of this spec.
Any change to this spec document MUST be reflected in the Pydantic models, and vice versa.

View File

@@ -0,0 +1,912 @@
# OpenClaw Architecture, Deployment Modes, and Ollama Integration
## Research Report for Timmy Time Dashboard Project
**Issue:** #721 — [Kimi Research] OpenClaw architecture, deployment modes, and Ollama integration
**Date:** 2026-03-21
**Author:** Kimi (Moonshot AI)
**Status:** Complete
---
## Executive Summary
OpenClaw is an open-source AI agent framework that bridges messaging platforms (WhatsApp, Telegram, Slack, Discord, iMessage) to AI coding agents through a centralized gateway. Originally known as Clawdbot and Moltbot, it was rebranded to OpenClaw in early 2026. This report provides a comprehensive analysis of OpenClaw's architecture, deployment options, Ollama integration capabilities, and suitability for deployment on resource-constrained VPS environments like the Hermes DigitalOcean droplet (2GB RAM / 1 vCPU).
**Key Finding:** Running OpenClaw with local LLMs on a 2GB RAM VPS is **not recommended**. The absolute minimum for a text-only agent with external API models is 4GB RAM. For local model inference via Ollama, 8-16GB RAM is the practical minimum. A hybrid approach using OpenRouter as the primary provider with Ollama as fallback is the most viable configuration for small VPS deployments.
---
## 1. Architecture Overview
### 1.1 Core Components
OpenClaw follows a **hub-and-spoke (轴辐式)** architecture optimized for multi-agent task execution:
```
┌─────────────────────────────────────────────────────────────────────────┐
│ OPENCLAW ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ WhatsApp │ │ Telegram │ │ Discord │ │
│ │ Channel │ │ Channel │ │ Channel │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ └────────────────────┼────────────────────┘ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ Gateway │◄─────── WebSocket/API │
│ │ (Port 18789) │ Control Plane │
│ └────────┬─────────┘ │
│ │ │
│ ┌──────────────┼──────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Agent A │ │ Agent B │ │ Pi Agent│ │
│ │ (main) │ │ (coder) │ │(delegate)│ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │
│ └──────────────┼──────────────┘ │
│ ▼ │
│ ┌────────────────────────┐ │
│ │ LLM Router │ │
│ │ (Primary/Fallback) │ │
│ └───────────┬────────────┘ │
│ │ │
│ ┌─────────────────┼─────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ Ollama │ │ OpenAI │ │Anthropic│ │
│ │(local) │ │(cloud) │ │(cloud) │ │
│ └─────────┘ └─────────┘ └─────────┘ │
│ │ ┌─────┐ │
│ └────────────────────────────────────────────────────►│ MCP │ │
│ │Tools│ │
│ └─────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Memory │ │ Skills │ │ Workspace │ │
│ │ (SOUL.md) │ │ (SKILL.md) │ │ (sessions) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
```
### 1.2 Component Deep Dive
| Component | Purpose | Configuration File |
|-----------|---------|-------------------|
| **Gateway** | Central control plane, WebSocket/API server, session management | `gateway` section in `openclaw.json` |
| **Pi Agent** | Core agent runner, "指挥中心" - schedules LLM calls, tool execution, error handling | `agents` section in `openclaw.json` |
| **Channels** | Messaging platform integrations (Telegram, WhatsApp, Slack, Discord, iMessage) | `channels` section in `openclaw.json` |
| **SOUL.md** | Agent persona definition - personality, communication style, behavioral guidelines | `~/.openclaw/workspace/SOUL.md` |
| **AGENTS.md** | Multi-agent configuration, routing rules, agent specialization definitions | `~/.openclaw/workspace/AGENTS.md` |
| **Workspace** | File system for agent state, session data, temporary files | `~/.openclaw/workspace/` |
| **Skills** | Bundled tools, prompts, configurations that teach agents specific tasks | `~/.openclaw/workspace/skills/` |
| **Sessions** | Conversation history, context persistence between interactions | `~/.openclaw/agents/<agent>/sessions/` |
| **MCP Tools** | Model Context Protocol integration for external tool access | Via `mcporter` or native MCP |
### 1.3 Agent Runner Execution Flow
According to OpenClaw documentation, a complete agent run follows these stages:
1. **Queuing** - Session-level queue (serializes same-session requests) → Global queue (controls total concurrency)
2. **Preparation** - Parse workspace, provider/model, thinking level parameters
3. **Plugin Loading** - Load relevant skills based on task context
4. **Memory Retrieval** - Fetch relevant context from SOUL.md and conversation history
5. **LLM Inference** - Send prompt to configured provider with tool definitions
6. **Tool Execution** - Execute any tool calls returned by the LLM
7. **Response Generation** - Format and return final response to the channel
8. **Memory Storage** - Persist conversation and results to session storage
---
## 2. Deployment Modes
### 2.1 Comparison Matrix
| Deployment Mode | Best For | Setup Complexity | Resource Overhead | Stability |
|----------------|----------|------------------|-------------------|-----------|
| **npm global** | Development, quick testing | Low | Minimal (~200MB) | Moderate |
| **Docker** | Production, isolation, reproducibility | Medium | Higher (~2.5GB base image) | High |
| **Docker Compose** | Multi-service stacks, complex setups | Medium-High | Higher | High |
| **Bare metal/systemd** | Maximum performance, dedicated hardware | High | Minimal | Moderate |
### 2.2 NPM Global Installation (Recommended for Quick Start)
```bash
# One-line installer
curl -fsSL https://openclaw.ai/install.sh | bash
# Or manual npm install
npm install -g openclaw
# Initialize configuration
openclaw onboard
# Start gateway
openclaw gateway
```
**Pros:**
- Fastest setup (~30 seconds)
- Direct access to host resources
- Easy updates via `npm update -g openclaw`
**Cons:**
- Node.js 22+ dependency required
- No process isolation
- Manual dependency management
### 2.3 Docker Deployment (Recommended for Production)
```bash
# Pull and run
docker pull openclaw/openclaw:latest
docker run -d \
--name openclaw \
-p 127.0.0.1:18789:18789 \
-v ~/.openclaw:/root/.openclaw \
-e ANTHROPIC_API_KEY=sk-ant-... \
openclaw/openclaw:latest
# Or with Docker Compose
docker compose -f compose.yml --env-file .env up -d --build
```
**Docker Compose Configuration (production-ready):**
```yaml
version: '3.8'
services:
openclaw:
image: openclaw/openclaw:latest
container_name: openclaw
restart: unless-stopped
ports:
- "127.0.0.1:18789:18789" # Never expose to 0.0.0.0
volumes:
- ./openclaw-data:/root/.openclaw
- ./workspace:/root/.openclaw/workspace
environment:
- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
- OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
- OLLAMA_API_KEY=ollama-local
networks:
- openclaw-net
# Resource limits for small VPS
deploy:
resources:
limits:
cpus: '1.5'
memory: 3G
reservations:
cpus: '0.5'
memory: 1G
networks:
openclaw-net:
driver: bridge
```
### 2.4 Bare Metal / Systemd Installation
For running as a system service on Linux:
```bash
# Create systemd service
sudo tee /etc/systemd/system/openclaw.service > /dev/null <<EOF
[Unit]
Description=OpenClaw Gateway
After=network.target
[Service]
Type=simple
User=openclaw
Group=openclaw
WorkingDirectory=/home/openclaw
Environment="PATH=/usr/local/bin:/usr/bin:/bin"
Environment="NODE_ENV=production"
Environment="ANTHROPIC_API_KEY=sk-ant-..."
ExecStart=/usr/local/bin/openclaw gateway
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable openclaw
sudo systemctl start openclaw
```
### 2.5 Recommended Deployment for 2GB RAM VPS
**⚠️ Critical Finding:** OpenClaw's official minimum is 4GB RAM. On a 2GB VPS:
1. **Do NOT run local LLMs** - Use external API providers exclusively
2. **Use npm installation** - Docker overhead is too heavy
3. **Disable browser automation** - Chromium requires 2-4GB alone
4. **Enable swap** - Critical for preventing OOM kills
5. **Use OpenRouter** - Cheap/free tier models reduce costs
**Setup script for 2GB VPS:**
```bash
#!/bin/bash
# openclaw-minimal-vps.sh
# Setup for 2GB RAM VPS - EXTERNAL API ONLY
# Create 4GB swap
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
# Install Node.js 22
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo bash -
sudo apt-get install -y nodejs
# Install OpenClaw
npm install -g openclaw
# Configure for minimal resource usage
mkdir -p ~/.openclaw
cat > ~/.openclaw/openclaw.json <<'EOF'
{
"gateway": {
"bind": "127.0.0.1",
"port": 18789,
"mode": "local"
},
"agents": {
"defaults": {
"model": {
"primary": "openrouter/google/gemma-3-4b-it:free",
"fallbacks": [
"openrouter/meta/llama-3.1-8b-instruct:free"
]
},
"maxIterations": 15,
"timeout": 120
}
},
"channels": {
"telegram": {
"enabled": true,
"dmPolicy": "pairing"
}
}
}
EOF
# Set OpenRouter API key
export OPENROUTER_API_KEY="sk-or-v1-..."
# Start gateway
openclaw gateway &
```
---
## 3. Ollama Integration
### 3.1 Architecture
OpenClaw integrates with Ollama through its native `/api/chat` endpoint, supporting both streaming responses and tool calling simultaneously:
```
┌──────────────┐ HTTP/JSON ┌──────────────┐ GGUF/CPU/GPU ┌──────────┐
│ OpenClaw │◄───────────────────►│ Ollama │◄────────────────────►│ Local │
│ Gateway │ /api/chat │ Server │ Model inference │ LLM │
│ │ Port 11434 │ Port 11434 │ │ │
└──────────────┘ └──────────────┘ └──────────┘
```
### 3.2 Configuration
**Basic Ollama Setup:**
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Start server
ollama serve
# Pull a tool-capable model
ollama pull qwen2.5-coder:7b
ollama pull llama3.1:8b
# Configure OpenClaw
export OLLAMA_API_KEY="ollama-local" # Any non-empty string works
```
**OpenClaw Configuration for Ollama:**
```json
{
"models": {
"providers": {
"ollama": {
"baseUrl": "http://localhost:11434",
"apiKey": "ollama-local",
"api": "ollama",
"models": [
{
"id": "qwen2.5-coder:7b",
"name": "Qwen 2.5 Coder 7B",
"contextWindow": 32768,
"maxTokens": 8192,
"cost": { "input": 0, "output": 0 }
},
{
"id": "llama3.1:8b",
"name": "Llama 3.1 8B",
"contextWindow": 128000,
"maxTokens": 8192,
"cost": { "input": 0, "output": 0 }
}
]
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "ollama/qwen2.5-coder:7b",
"fallbacks": ["ollama/llama3.1:8b"]
}
}
}
}
```
### 3.3 Context Window Requirements
**⚠️ Critical Requirement:** OpenClaw requires a minimum **64K token context window** for reliable multi-step task execution.
| Model | Parameters | Context Window | Tool Support | OpenClaw Compatible |
|-------|-----------|----------------|--------------|---------------------|
| **llama3.1** | 8B | 128K | ✅ Yes | ✅ Yes |
| **qwen2.5-coder** | 7B | 32K | ✅ Yes | ⚠️ Below minimum |
| **qwen2.5-coder** | 32B | 128K | ✅ Yes | ✅ Yes |
| **gpt-oss** | 20B | 128K | ✅ Yes | ✅ Yes |
| **glm-4.7-flash** | - | 128K | ✅ Yes | ✅ Yes |
| **deepseek-coder-v2** | 33B | 128K | ✅ Yes | ✅ Yes |
| **mistral-small3.1** | - | 128K | ✅ Yes | ✅ Yes |
**Context Window Configuration:**
For models that don't report context window via Ollama's API:
```bash
# Create custom Modelfile with extended context
cat > ~/qwen-custom.modelfile <<EOF
FROM qwen2.5-coder:7b
PARAMETER num_ctx 65536
PARAMETER temperature 0.7
EOF
# Create custom model
ollama create qwen2.5-coder-64k -f ~/qwen-custom.modelfile
```
### 3.4 Models for Small VPS (≤8B Parameters)
For resource-constrained environments (2-4GB RAM):
| Model | Quantization | RAM Required | VRAM Required | Performance |
|-------|-------------|--------------|---------------|-------------|
| **Llama 3.1 8B** | Q4_K_M | ~5GB | ~6GB | Good |
| **Llama 3.2 3B** | Q4_K_M | ~2.5GB | ~3GB | Basic |
| **Qwen 2.5 7B** | Q4_K_M | ~5GB | ~6GB | Good |
| **Qwen 2.5 3B** | Q4_K_M | ~2.5GB | ~3GB | Basic |
| **DeepSeek 7B** | Q4_K_M | ~5GB | ~6GB | Good |
| **Phi-4 4B** | Q4_K_M | ~3GB | ~4GB | Moderate |
**⚠️ Verdict for 2GB VPS:** Running local LLMs is **NOT viable**. Use external APIs only.
---
## 4. OpenRouter Integration (Fallback Strategy)
### 4.1 Overview
OpenRouter provides a unified API gateway to multiple LLM providers, enabling:
- Single API key access to 200+ models
- Automatic failover between providers
- Free tier models for cost-conscious deployments
- Unified billing and usage tracking
### 4.2 Configuration
**Environment Variable Setup:**
```bash
export OPENROUTER_API_KEY="sk-or-v1-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```
**OpenClaw Configuration:**
```json
{
"models": {
"providers": {
"openrouter": {
"apiKey": "${OPENROUTER_API_KEY}",
"baseUrl": "https://openrouter.ai/api/v1"
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "openrouter/anthropic/claude-sonnet-4-6",
"fallbacks": [
"openrouter/google/gemini-3.1-pro",
"openrouter/meta/llama-3.3-70b-instruct",
"openrouter/google/gemma-3-4b-it:free"
]
}
}
}
}
```
### 4.3 Recommended Free/Cheap Models on OpenRouter
For cost-conscious VPS deployments:
| Model | Cost | Context | Best For |
|-------|------|---------|----------|
| **google/gemma-3-4b-it:free** | Free | 128K | General tasks, simple automation |
| **meta/llama-3.1-8b-instruct:free** | Free | 128K | General tasks, longer contexts |
| **deepseek/deepseek-chat-v3.2** | $0.53/M | 64K | Code generation, reasoning |
| **xiaomi/mimo-v2-flash** | $0.40/M | 128K | Fast responses, basic tasks |
| **qwen/qwen3-coder-next** | $1.20/M | 128K | Code-focused tasks |
### 4.4 Hybrid Configuration (Recommended for Timmy)
A production-ready configuration for the Hermes VPS:
```json
{
"models": {
"providers": {
"openrouter": {
"apiKey": "${OPENROUTER_API_KEY}",
"models": [
{
"id": "google/gemma-3-4b-it:free",
"name": "Gemma 3 4B (Free)",
"contextWindow": 131072,
"maxTokens": 8192,
"cost": { "input": 0, "output": 0 }
},
{
"id": "deepseek/deepseek-chat-v3.2",
"name": "DeepSeek V3.2",
"contextWindow": 64000,
"maxTokens": 8192,
"cost": { "input": 0.00053, "output": 0.00053 }
}
]
},
"ollama": {
"baseUrl": "http://localhost:11434",
"apiKey": "ollama-local",
"models": [
{
"id": "llama3.2:3b",
"name": "Llama 3.2 3B (Local Fallback)",
"contextWindow": 128000,
"maxTokens": 4096,
"cost": { "input": 0, "output": 0 }
}
]
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "openrouter/google/gemma-3-4b-it:free",
"fallbacks": [
"openrouter/deepseek/deepseek-chat-v3.2",
"ollama/llama3.2:3b"
]
},
"maxIterations": 10,
"timeout": 90
}
}
}
```
---
## 5. Hardware Constraints & VPS Viability
### 5.1 System Requirements Summary
| Component | Minimum | Recommended | Notes |
|-----------|---------|-------------|-------|
| **CPU** | 2 vCPU | 4 vCPU | Dedicated preferred over shared |
| **RAM** | 4 GB | 8 GB | 2GB causes OOM with external APIs |
| **Storage** | 40 GB SSD | 80 GB NVMe | Docker images are ~10-15GB |
| **Network** | 100 Mbps | 1 Gbps | For API calls and model downloads |
| **OS** | Ubuntu 22.04/Debian 12 | Ubuntu 24.04 LTS | Linux required for production |
### 5.2 2GB RAM VPS Analysis
**Can it work?** Yes, with severe limitations:
**What works:**
- Text-only agents with external API providers
- Single Telegram/Discord channel
- Basic file operations and shell commands
- No browser automation
**What doesn't work:**
- Local LLM inference via Ollama
- Browser automation (Chromium needs 2-4GB)
- Multiple concurrent channels
- Python environment-heavy skills
**Required mitigations for 2GB VPS:**
```bash
# 1. Create substantial swap
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# 2. Configure swappiness
echo 'vm.swappiness=60' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# 3. Limit Node.js memory
export NODE_OPTIONS="--max-old-space-size=1536"
# 4. Use external APIs only - NO OLLAMA
# 5. Disable browser skills
# 6. Set conservative concurrency limits
```
### 5.3 4-bit Quantization Viability
**Qwen 2.5 7B Q4_K_M on 2GB VPS:**
- Model size: ~4.5GB
- RAM required at runtime: ~5-6GB
- **Verdict:** Will cause immediate OOM on 2GB VPS
- **Even with 4GB VPS:** Marginal, heavy swap usage, poor performance
**Viable models for 4GB VPS with Ollama:**
- Llama 3.2 3B Q4_K_M (~2.5GB RAM)
- Qwen 2.5 3B Q4_K_M (~2.5GB RAM)
- Phi-4 4B Q4_K_M (~3GB RAM)
---
## 6. Security Configuration
### 6.1 Network Ports
| Port | Purpose | Exposure |
|------|---------|----------|
| **18789/tcp** | OpenClaw Gateway (WebSocket/HTTP) | **NEVER expose to internet** |
| **11434/tcp** | Ollama API (if running locally) | Localhost only |
| **22/tcp** | SSH | Restrict to known IPs |
**⚠️ CRITICAL:** Never expose port 18789 to the public internet. Use Tailscale or SSH tunnels for remote access.
### 6.2 Tailscale Integration
Tailscale provides zero-configuration VPN mesh for secure remote access:
```bash
# Install Tailscale
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up
# Get Tailscale IP
tailscale ip
# Returns: 100.x.y.z
# Configure OpenClaw to bind to Tailscale
cat > ~/.openclaw/openclaw.json <<EOF
{
"gateway": {
"bind": "tailnet",
"port": 18789
},
"tailscale": {
"mode": "on",
"resetOnExit": false
}
}
EOF
```
**Tailscale vs SSH Tunnel:**
| Feature | Tailscale | SSH Tunnel |
|---------|-----------|------------|
| Setup | Very easy | Moderate |
| Persistence | Automatic | Requires autossh |
| Multiple devices | Built-in | One tunnel per connection |
| NAT traversal | Works | Requires exposed SSH |
| Access control | Tailscale ACL | SSH keys |
### 6.3 Firewall Configuration (UFW)
```bash
# Default deny
sudo ufw default deny incoming
sudo ufw default allow outgoing
# Allow SSH
sudo ufw allow 22/tcp
# Allow Tailscale only (if using)
sudo ufw allow in on tailscale0 to any port 18789
# Block public access to OpenClaw
# (bind is 127.0.0.1, so this is defense in depth)
sudo ufw enable
```
### 6.4 Authentication Configuration
```json
{
"gateway": {
"bind": "127.0.0.1",
"port": 18789,
"auth": {
"mode": "token",
"token": "your-64-char-hex-token-here"
},
"controlUi": {
"allowedOrigins": [
"http://localhost:18789",
"https://your-domain.tailnet-name.ts.net"
],
"allowInsecureAuth": false,
"dangerouslyDisableDeviceAuth": false
}
}
}
```
**Generate secure token:**
```bash
openssl rand -hex 32
```
### 6.5 Sandboxing Considerations
OpenClaw executes arbitrary shell commands and file operations by default. For production:
1. **Run as non-root user:**
```bash
sudo useradd -r -s /bin/false openclaw
sudo mkdir -p /home/openclaw/.openclaw
sudo chown -R openclaw:openclaw /home/openclaw
```
2. **Use Docker for isolation:**
```bash
docker run --security-opt=no-new-privileges \
--cap-drop=ALL \
--read-only \
--tmpfs /tmp:noexec,nosuid,size=100m \
openclaw/openclaw:latest
```
3. **Enable dmPolicy for channels:**
```json
{
"channels": {
"telegram": {
"dmPolicy": "pairing" // Require one-time code for new contacts
}
}
}
```
---
## 7. MCP (Model Context Protocol) Tools
### 7.1 Overview
MCP is an open standard created by Anthropic (donated to Linux Foundation in Dec 2025) that lets AI applications connect to external tools through a universal interface. Think of it as "USB-C for AI."
### 7.2 MCP vs OpenClaw Skills
| Aspect | MCP | OpenClaw Skills |
|--------|-----|-----------------|
| **Protocol** | Standardized (Anthropic) | OpenClaw-specific |
| **Isolation** | Process-isolated | Runs in agent context |
| **Security** | Higher (sandboxed) | Lower (full system access) |
| **Discovery** | Automatic via protocol | Manual via SKILL.md |
| **Ecosystem** | 10,000+ servers | 5400+ skills |
**Note:** OpenClaw currently has limited native MCP support. Use `mcporter` tool for MCP integration.
### 7.3 Using MCPorter (MCP Bridge)
```bash
# Install mcporter
clawhub install mcporter
# Configure MCP server
mcporter config add github \
--url "https://api.github.com/mcp" \
--token "ghp_..."
# List available tools
mcporter list
# Call MCP tool
mcporter call github.list_repos --owner "rockachopa"
```
### 7.4 Popular MCP Servers
| Server | Purpose | Integration |
|--------|---------|-------------|
| **GitHub** | Repo management, PRs, issues | `mcp-github` |
| **Slack** | Messaging, channel management | `mcp-slack` |
| **PostgreSQL** | Database queries | `mcp-postgres` |
| **Filesystem** | File operations (sandboxed) | `mcp-filesystem` |
| **Brave Search** | Web search | `mcp-brave` |
---
## 8. Recommendations for Timmy Time Dashboard
### 8.1 Deployment Strategy for Hermes VPS (2GB RAM)
Given the hardware constraints, here's the recommended approach:
**Option A: External API Only (Recommended)**
```
┌─────────────────────────────────────────┐
│ Hermes VPS (2GB RAM) │
│ ┌─────────────────────────────────┐ │
│ │ OpenClaw Gateway │ │
│ │ (npm global install) │ │
│ └─────────────┬───────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ OpenRouter API (Free Tier) │ │
│ │ google/gemma-3-4b-it:free │ │
│ └─────────────────────────────────┘ │
│ │
│ NO OLLAMA - insufficient RAM │
└─────────────────────────────────────────┘
```
**Option B: Hybrid with External Ollama**
```
┌──────────────────────┐ ┌──────────────────────────┐
│ Hermes VPS (2GB) │ │ Separate Ollama Host │
│ ┌────────────────┐ │ │ ┌────────────────────┐ │
│ │ OpenClaw │ │◄────►│ │ Ollama Server │ │
│ │ (external API) │ │ │ │ (8GB+ RAM required)│ │
│ └────────────────┘ │ │ └────────────────────┘ │
└──────────────────────┘ └──────────────────────────┘
```
### 8.2 Configuration Summary
```json
{
"gateway": {
"bind": "127.0.0.1",
"port": 18789,
"auth": {
"mode": "token",
"token": "GENERATE_WITH_OPENSSL_RAND"
}
},
"models": {
"providers": {
"openrouter": {
"apiKey": "${OPENROUTER_API_KEY}",
"models": [
{
"id": "google/gemma-3-4b-it:free",
"contextWindow": 131072,
"maxTokens": 4096
}
]
}
}
},
"agents": {
"defaults": {
"model": {
"primary": "openrouter/google/gemma-3-4b-it:free"
},
"maxIterations": 10,
"timeout": 90,
"maxConcurrent": 2
}
},
"channels": {
"telegram": {
"enabled": true,
"dmPolicy": "pairing"
}
}
}
```
### 8.3 Migration Path (Future)
When upgrading to a larger VPS (4-8GB RAM):
1. **Phase 1:** Enable Ollama with Llama 3.2 3B as fallback
2. **Phase 2:** Add browser automation skills (requires 4GB+ RAM)
3. **Phase 3:** Enable multi-agent routing with specialized agents
4. **Phase 4:** Add MCP server integration for external tools
---
## 9. References
1. OpenClaw Official Documentation: https://docs.openclaw.ai
2. Ollama Integration Guide: https://docs.ollama.com/integrations/openclaw
3. OpenRouter Documentation: https://openrouter.ai/docs
4. MCP Specification: https://modelcontextprotocol.io
5. OpenClaw Community Discord: https://discord.gg/openclaw
6. GitHub Repository: https://github.com/openclaw/openclaw
---
## 10. Appendix: Quick Command Reference
```bash
# Installation
curl -fsSL https://openclaw.ai/install.sh | bash
# Configuration
openclaw onboard # Interactive setup
openclaw configure # Edit config
openclaw config set <key> <value> # Set specific value
# Gateway management
openclaw gateway # Start gateway
openclaw gateway --verbose # Start with logs
openclaw gateway status # Check status
openclaw gateway restart # Restart gateway
openclaw gateway stop # Stop gateway
# Model management
openclaw models list # List available models
openclaw models set <model> # Set default model
openclaw models status # Check model status
# Diagnostics
openclaw doctor # System health check
openclaw doctor --repair # Auto-fix issues
openclaw security audit # Security check
# Dashboard
openclaw dashboard # Open web UI
```
---
*End of Research Report*

View File

@@ -0,0 +1,127 @@
# SOUL.md Authoring Guide
How to write a SOUL.md for a new agent in the Timmy ecosystem.
## Before You Start
1. **Read the template**`docs/soul-framework/template.md` has the canonical
structure with all required and optional sections.
2. **Read Timmy's soul**`memory/self/soul.md` is the reference implementation.
Study how values, behavior, and boundaries work together.
3. **Decide the role** — What does this agent do? A SOUL.md that tries to cover
everything covers nothing.
## Writing Process
### Step 1: Identity
Start with who the agent is. Keep it concrete:
```markdown
## Identity
- **Name:** Seer
- **Role:** Cartographic intelligence — maps terrain, tracks routes, flags points of interest
- **Lineage:** Timmy (inherits sovereignty and honesty values)
- **Version:** 1.0.0
```
The lineage field matters. If this agent derives from another, say so — the
validator checks that inherited values are not contradicted.
### Step 2: Values
Values are ordered by priority. When two values conflict, the higher-ranked
value wins. Three to six values is the sweet spot.
**Good values are specific and actionable:**
- *"Accuracy. I report what I observe, not what I expect."*
- *"Caution. When uncertain about terrain, I mark it as unexplored rather than guessing."*
**Bad values are vague or aspirational:**
- *"Be good."*
- *"Try my best."*
### Step 3: Prime Directive
One sentence. This is the tie-breaker when values conflict:
```markdown
## Prime Directive
Map the world faithfully so that Timmy can navigate safely.
```
### Step 4: Audience Awareness
Who does this agent talk to? Another agent? A human? Both?
```markdown
## Audience Awareness
- **Primary audience:** Timmy (parent agent) and other sub-agents
- **Tone:** Terse, data-oriented, no pleasantries
- **Adaptation rules:** When reporting to humans via dashboard, add natural-language summaries
```
### Step 5: Constraints
Hard rules. These are never broken, even under direct instruction:
```markdown
## Constraints
1. Never fabricate map data — unknown is always better than wrong
2. Never overwrite another agent's observations without evidence
3. Report confidence levels on all terrain classifications
```
### Step 6: Behavior and Boundaries (Optional)
Behavior is how the agent communicates. Boundaries are what it refuses to do.
Only include these if the defaults from the parent agent aren't sufficient.
## Validation
After writing, run the validator:
```python
from infrastructure.soul.loader import SoulLoader
from infrastructure.soul.validator import SoulValidator
loader = SoulLoader()
soul = loader.load("path/to/SOUL.md")
validator = SoulValidator()
result = validator.validate(soul)
if not result.valid:
for error in result.errors:
print(f"ERROR: {error}")
for warning in result.warnings:
print(f"WARNING: {warning}")
```
## Common Mistakes
1. **Too many values.** More than six values means you haven't prioritized.
If everything is important, nothing is.
2. **Contradictory constraints.** "Always respond immediately" + "Take time to
think before responding" — the validator catches these.
3. **Missing prime directive.** Without a tie-breaker, value conflicts are
resolved arbitrarily.
4. **Copying Timmy's soul verbatim.** Sub-agents should inherit values via
lineage, not duplication. Add role-specific values, don't repeat parent values.
5. **Vague boundaries.** "Will not do bad things" is not a boundary. "Will not
execute commands that modify game state without Timmy's approval" is.
## File Placement
- Agent souls live alongside the agent: `memory/agents/{name}/soul.md`
- Timmy's soul: `memory/self/soul.md`
- Templates: `docs/soul-framework/template.md`

View File

@@ -0,0 +1,153 @@
# SOUL.md Role Extensions
Sub-agents in the Timmy ecosystem inherit core identity from their parent
but extend it with role-specific values, constraints, and behaviors.
## How Role Extensions Work
A role extension is a SOUL.md that declares a `Lineage` pointing to a parent
agent. The sub-agent inherits the parent's values and adds its own:
```
Parent (Timmy) Sub-agent (Seer)
├── Sovereignty ├── Sovereignty (inherited)
├── Service ├── Service (inherited)
├── Honesty ├── Honesty (inherited)
└── Humility ├── Humility (inherited)
├── Accuracy (role-specific)
└── Caution (role-specific)
```
**Rules:**
- Inherited values cannot be contradicted (validator enforces this)
- Role-specific values are appended after inherited ones
- The prime directive can differ from the parent's
- Constraints are additive — a sub-agent can add constraints but not remove parent constraints
## Reference: Sub-Agent Roles
### Seer — Cartographic Intelligence
**Focus:** Terrain mapping, route planning, point-of-interest discovery.
```markdown
## Identity
- **Name:** Seer
- **Role:** Cartographic intelligence
- **Lineage:** Timmy
- **Version:** 1.0.0
## Values
- **Accuracy.** I report what I observe, not what I expect.
- **Caution.** When uncertain about terrain, I mark it as unexplored.
- **Completeness.** I aim to map every reachable cell.
## Prime Directive
Map the world faithfully so that Timmy can navigate safely.
## Constraints
1. Never fabricate map data
2. Mark confidence levels on all observations
3. Re-verify stale data (older than 10 game-days) before relying on it
```
### Mace — Combat Intelligence
**Focus:** Threat assessment, combat tactics, equipment optimization.
```markdown
## Identity
- **Name:** Mace
- **Role:** Combat intelligence
- **Lineage:** Timmy
- **Version:** 1.0.0
## Values
- **Survival.** Timmy's survival is the top priority in any encounter.
- **Efficiency.** Minimize resource expenditure per encounter.
- **Awareness.** Continuously assess threats even outside combat.
## Prime Directive
Keep Timmy alive and effective in hostile encounters.
## Constraints
1. Never initiate combat without Timmy's approval (unless self-defense)
2. Always maintain an escape route assessment
3. Report threat levels honestly — never downplay danger
```
### Quill — Dialogue Intelligence
**Focus:** NPC interaction, quest tracking, reputation management.
```markdown
## Identity
- **Name:** Quill
- **Role:** Dialogue intelligence
- **Lineage:** Timmy
- **Version:** 1.0.0
## Values
- **Attentiveness.** Listen fully before responding.
- **Diplomacy.** Prefer non-hostile resolutions when possible.
- **Memory.** Track all NPC relationships and prior conversations.
## Prime Directive
Manage NPC relationships to advance Timmy's goals without deception.
## Constraints
1. Never lie to NPCs unless Timmy explicitly instructs it
2. Track disposition changes and warn when relationships deteriorate
3. Summarize dialogue options with risk assessments
```
### Ledger — Resource Intelligence
**Focus:** Inventory management, economy, resource optimization.
```markdown
## Identity
- **Name:** Ledger
- **Role:** Resource intelligence
- **Lineage:** Timmy
- **Version:** 1.0.0
## Values
- **Prudence.** Conserve resources for future needs.
- **Transparency.** Report all transactions and resource changes.
- **Optimization.** Maximize value per weight unit carried.
## Prime Directive
Ensure Timmy always has the resources needed for the current objective.
## Constraints
1. Never sell quest-critical items
2. Maintain a minimum gold reserve (configurable)
3. Alert when encumbrance exceeds 80%
```
## Creating a New Role Extension
1. Copy the template from `docs/soul-framework/template.md`
2. Set `Lineage` to the parent agent name
3. Add role-specific values *after* acknowledging inherited ones
4. Write a role-specific prime directive
5. Add constraints (remember: additive only)
6. Run the validator to check for contradictions
7. Place the file at `memory/agents/{name}/soul.md`

View File

@@ -0,0 +1,94 @@
# SOUL.md Template
A **SOUL.md** file defines an agent's identity, values, and operating constraints.
It is the root-of-trust document that shapes every decision the agent makes.
Copy this template and fill in each section for your agent.
---
```markdown
# {Agent Name} — Soul Identity
{One-paragraph summary of who this agent is and why it exists.}
## Identity
- **Name:** {Agent name}
- **Role:** {Primary function — e.g., "autonomous game agent", "code reviewer"}
- **Lineage:** {Parent agent or template this identity derives from, if any}
- **Version:** {SOUL.md version — use semantic versioning, e.g., 1.0.0}
## Values
List the agent's core values in priority order. Each value gets a name and
a one-sentence definition. Values are non-negotiable — they constrain all
behavior even when they conflict with user instructions.
- **{Value 1}.** {Definition}
- **{Value 2}.** {Definition}
- **{Value 3}.** {Definition}
## Prime Directive
{A single sentence that captures the agent's highest-priority goal.
When values conflict, the prime directive breaks the tie.}
## Audience Awareness
Describe who the agent serves and how it should adapt its communication:
- **Primary audience:** {Who the agent talks to most}
- **Tone:** {Formal, casual, terse, etc.}
- **Adaptation rules:** {How the agent adjusts for different contexts}
## Constraints
Hard rules that the agent must never violate, regardless of context:
1. {Constraint — e.g., "Never fabricate information"}
2. {Constraint — e.g., "Never claim certainty without evidence"}
3. {Constraint — e.g., "Refuse over fabricate"}
## Behavior
Optional section for communication style, preferences, and defaults:
- {Behavioral guideline}
- {Behavioral guideline}
## Boundaries
Lines the agent will not cross. Distinct from constraints (which are rules)
— boundaries are refusals:
- {Boundary — e.g., "Will not pretend to be human"}
- {Boundary — e.g., "Will not execute destructive actions without confirmation"}
---
*{Closing motto or statement of purpose.}*
```
---
## Section Reference
| Section | Required | Purpose |
| -------------------- | -------- | ------------------------------------------------- |
| Identity | Yes | Name, role, lineage, version |
| Values | Yes | Ordered list of non-negotiable principles |
| Prime Directive | Yes | Tie-breaker when values conflict |
| Audience Awareness | Yes | Who the agent serves and how it adapts |
| Constraints | Yes | Hard rules that must never be violated |
| Behavior | No | Communication style and defaults |
| Boundaries | No | Lines the agent refuses to cross |
## Versioning
Every SOUL.md must include a version in the Identity section. Use semantic
versioning: `MAJOR.MINOR.PATCH`.
- **MAJOR** — fundamental identity change (new role, new values)
- **MINOR** — added or reordered values, new constraints
- **PATCH** — wording clarifications, formatting fixes

View File

@@ -20,6 +20,7 @@ if config.config_file_name is not None:
# target_metadata = mymodel.Base.metadata
from src.dashboard.models.database import Base
from src.dashboard.models.calm import Task, JournalEntry
from src.infrastructure.morrowind.command_log import CommandLog # noqa: F401
target_metadata = Base.metadata
# other values from the config, defined by the needs of env.py,

View File

@@ -0,0 +1,89 @@
"""Create command_log table
Revision ID: a1b2c3d4e5f6
Revises: 0093c15b4bbf
Create Date: 2026-03-21 12:00:00.000000
"""
from typing import Sequence, Union
import sqlalchemy as sa
from alembic import op
# revision identifiers, used by Alembic.
revision: str = "a1b2c3d4e5f6"
down_revision: Union[str, Sequence[str], None] = "0093c15b4bbf"
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
"""Upgrade schema."""
op.create_table(
"command_log",
sa.Column("id", sa.Integer(), autoincrement=True, nullable=False),
sa.Column("timestamp", sa.DateTime(), nullable=False),
sa.Column("command", sa.String(length=64), nullable=False),
sa.Column("params", sa.Text(), nullable=False, server_default="{}"),
sa.Column("reasoning", sa.Text(), nullable=False, server_default=""),
sa.Column(
"perception_snapshot", sa.Text(), nullable=False, server_default="{}"
),
sa.Column("outcome", sa.Text(), nullable=True),
sa.Column(
"agent_id",
sa.String(length=64),
nullable=False,
server_default="timmy",
),
sa.Column("episode_id", sa.String(length=128), nullable=True),
sa.Column("cell", sa.String(length=255), nullable=True),
sa.Column(
"protocol_version",
sa.String(length=16),
nullable=False,
server_default="1.0.0",
),
sa.Column("created_at", sa.DateTime(), nullable=False),
sa.PrimaryKeyConstraint("id"),
)
op.create_index(
op.f("ix_command_log_timestamp"), "command_log", ["timestamp"], unique=False
)
op.create_index(
op.f("ix_command_log_command"), "command_log", ["command"], unique=False
)
op.create_index(
op.f("ix_command_log_agent_id"), "command_log", ["agent_id"], unique=False
)
op.create_index(
op.f("ix_command_log_episode_id"),
"command_log",
["episode_id"],
unique=False,
)
op.create_index(
op.f("ix_command_log_cell"), "command_log", ["cell"], unique=False
)
op.create_index(
"ix_command_log_cmd_cell", "command_log", ["command", "cell"], unique=False
)
op.create_index(
"ix_command_log_episode",
"command_log",
["episode_id", "timestamp"],
unique=False,
)
def downgrade() -> None:
"""Downgrade schema."""
op.drop_index("ix_command_log_episode", table_name="command_log")
op.drop_index("ix_command_log_cmd_cell", table_name="command_log")
op.drop_index(op.f("ix_command_log_cell"), table_name="command_log")
op.drop_index(op.f("ix_command_log_episode_id"), table_name="command_log")
op.drop_index(op.f("ix_command_log_agent_id"), table_name="command_log")
op.drop_index(op.f("ix_command_log_command"), table_name="command_log")
op.drop_index(op.f("ix_command_log_timestamp"), table_name="command_log")
op.drop_table("command_log")

View File

@@ -20,6 +20,7 @@ packages = [
{ include = "spark", from = "src" },
{ include = "timmy", from = "src" },
{ include = "timmy_serve", from = "src" },
{ include = "timmyctl", from = "src" },
]
[tool.poetry.dependencies]
@@ -82,6 +83,7 @@ mypy = ">=1.0.0"
[tool.poetry.scripts]
timmy = "timmy.cli:main"
timmy-serve = "timmy_serve.cli:main"
timmyctl = "timmyctl.cli:main"
[tool.pytest.ini_options]
testpaths = ["tests"]

View File

@@ -54,6 +54,7 @@ REPO_ROOT = Path(__file__).resolve().parent.parent
RETRO_FILE = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
SUMMARY_FILE = REPO_ROOT / ".loop" / "retro" / "summary.json"
EPOCH_COUNTER_FILE = REPO_ROOT / ".loop" / "retro" / ".epoch_counter"
CYCLE_RESULT_FILE = REPO_ROOT / ".loop" / "cycle_result.json"
# How many recent entries to include in rolling summary
SUMMARY_WINDOW = 50
@@ -246,9 +247,37 @@ def update_summary() -> None:
SUMMARY_FILE.write_text(json.dumps(summary, indent=2) + "\n")
def _load_cycle_result() -> dict:
"""Read .loop/cycle_result.json if it exists; return empty dict on failure."""
if not CYCLE_RESULT_FILE.exists():
return {}
try:
raw = CYCLE_RESULT_FILE.read_text().strip()
# Strip hermes fence markers (```json ... ```) if present
if raw.startswith("```"):
lines = raw.splitlines()
lines = [l for l in lines if not l.startswith("```")]
raw = "\n".join(lines)
return json.loads(raw)
except (json.JSONDecodeError, OSError):
return {}
def main() -> None:
args = parse_args()
# Backfill from cycle_result.json when CLI args have defaults
cr = _load_cycle_result()
if cr:
if args.issue is None and cr.get("issue"):
args.issue = int(cr["issue"])
if args.type == "unknown" and cr.get("type"):
args.type = cr["type"]
if args.tests_passed == 0 and cr.get("tests_passed"):
args.tests_passed = int(cr["tests_passed"])
if not args.notes and cr.get("notes"):
args.notes = cr["notes"]
# Auto-detect issue from branch when not explicitly provided
if args.issue is None:
args.issue = detect_issue_from_branch()

View File

@@ -27,11 +27,15 @@ from pathlib import Path
REPO_ROOT = Path(__file__).resolve().parent.parent
QUEUE_FILE = REPO_ROOT / ".loop" / "queue.json"
IDLE_STATE_FILE = REPO_ROOT / ".loop" / "idle_state.json"
CYCLE_RESULT_FILE = REPO_ROOT / ".loop" / "cycle_result.json"
TOKEN_FILE = Path.home() / ".hermes" / "gitea_token"
GITEA_API = os.environ.get("GITEA_API", "http://localhost:3000/api/v1")
REPO_SLUG = os.environ.get("REPO_SLUG", "rockachopa/Timmy-time-dashboard")
# Default cycle duration in seconds (5 min); stale threshold = 2× this
CYCLE_DURATION = int(os.environ.get("CYCLE_DURATION", "300"))
# Backoff sequence: 60s, 120s, 240s, 600s max
BACKOFF_BASE = 60
BACKOFF_MAX = 600
@@ -77,6 +81,89 @@ def _fetch_open_issue_numbers() -> set[int] | None:
return None
def _load_cycle_result() -> dict:
"""Read cycle_result.json, handling markdown-fenced JSON."""
if not CYCLE_RESULT_FILE.exists():
return {}
try:
raw = CYCLE_RESULT_FILE.read_text().strip()
if raw.startswith("```"):
lines = raw.splitlines()
lines = [ln for ln in lines if not ln.startswith("```")]
raw = "\n".join(lines)
return json.loads(raw)
except (json.JSONDecodeError, OSError):
return {}
def _is_issue_open(issue_number: int) -> bool | None:
"""Check if a single issue is open. Returns None on API failure."""
token = _get_token()
if not token:
return None
try:
url = f"{GITEA_API}/repos/{REPO_SLUG}/issues/{issue_number}"
req = urllib.request.Request(
url,
headers={
"Authorization": f"token {token}",
"Accept": "application/json",
},
)
with urllib.request.urlopen(req, timeout=10) as resp:
data = json.loads(resp.read())
return data.get("state") == "open"
except Exception:
return None
def validate_cycle_result() -> bool:
"""Pre-cycle validation: remove stale or invalid cycle_result.json.
Checks:
1. Age — if older than 2× CYCLE_DURATION, delete it.
2. Issue — if the referenced issue is closed, delete it.
Returns True if the file was removed, False otherwise.
"""
if not CYCLE_RESULT_FILE.exists():
return False
# Age check
try:
age = time.time() - CYCLE_RESULT_FILE.stat().st_mtime
except OSError:
return False
stale_threshold = CYCLE_DURATION * 2
if age > stale_threshold:
print(
f"[loop-guard] cycle_result.json is {int(age)}s old "
f"(threshold {stale_threshold}s) — removing stale file"
)
CYCLE_RESULT_FILE.unlink(missing_ok=True)
return True
# Issue check
cr = _load_cycle_result()
issue_num = cr.get("issue")
if issue_num is not None:
try:
issue_num = int(issue_num)
except (ValueError, TypeError):
return False
is_open = _is_issue_open(issue_num)
if is_open is False:
print(
f"[loop-guard] cycle_result.json references closed "
f"issue #{issue_num} — removing"
)
CYCLE_RESULT_FILE.unlink(missing_ok=True)
return True
# is_open is None (API failure) or True — keep file
return False
def load_queue() -> list[dict]:
"""Load queue.json and return ready items, filtering out closed issues."""
if not QUEUE_FILE.exists():
@@ -150,6 +237,9 @@ def main() -> int:
}, indent=2))
return 0
# Pre-cycle validation: remove stale cycle_result.json
validate_cycle_result()
ready = load_queue()
if ready:

View File

@@ -84,6 +84,7 @@ class Settings(BaseSettings):
# Only used when explicitly enabled and query complexity warrants it.
grok_enabled: bool = False
xai_api_key: str = ""
xai_base_url: str = "https://api.x.ai/v1"
grok_default_model: str = "grok-3-fast"
grok_max_sats_per_query: int = 200
grok_free: bool = False # Skip Lightning invoice when user has own API key
@@ -148,6 +149,18 @@ class Settings(BaseSettings):
"http://127.0.0.1:8000",
]
# ── Matrix Frontend Integration ────────────────────────────────────────
# URL of the Matrix frontend (Replit/Tailscale) for CORS.
# When set, this origin is added to CORS allowed_origins.
# Example: "http://100.124.176.28:8080" or "https://alexanderwhitestone.com"
matrix_frontend_url: str = "" # Empty = disabled
# WebSocket authentication token for Matrix connections.
# When set, clients must provide this token via ?token= query param
# or in the first message as {"type": "auth", "token": "..."}.
# Empty/unset = auth disabled (dev mode).
matrix_ws_token: str = ""
# Trusted hosts for the Host header check (TrustedHostMiddleware).
# Set TRUSTED_HOSTS as a comma-separated list. Wildcards supported (e.g. "*.ts.net").
# Defaults include localhost + Tailscale MagicDNS. Add your Tailscale IP if needed.
@@ -317,6 +330,13 @@ class Settings(BaseSettings):
autoresearch_max_iterations: int = 100
autoresearch_metric: str = "val_bpb" # metric to optimise (lower = better)
# ── Weekly Narrative Summary ───────────────────────────────────────
# Generates a human-readable weekly summary of development activity.
# Disabling this will stop the weekly narrative generation.
weekly_narrative_enabled: bool = True
weekly_narrative_lookback_days: int = 7
weekly_narrative_output_dir: str = ".loop"
# ── Local Hands (Shell + Git) ──────────────────────────────────────
# Enable local shell/git execution hands.
hands_shell_enabled: bool = True

View File

@@ -10,6 +10,7 @@ Key improvements:
import asyncio
import json
import logging
import re
from contextlib import asynccontextmanager
from pathlib import Path
@@ -23,6 +24,7 @@ from config import settings
# Import dedicated middleware
from dashboard.middleware.csrf import CSRFMiddleware
from dashboard.middleware.rate_limit import RateLimitMiddleware
from dashboard.middleware.request_logging import RequestLoggingMiddleware
from dashboard.middleware.security_headers import SecurityHeadersMiddleware
from dashboard.routes.agents import router as agents_router
@@ -30,6 +32,7 @@ from dashboard.routes.briefing import router as briefing_router
from dashboard.routes.calm import router as calm_router
from dashboard.routes.chat_api import router as chat_api_router
from dashboard.routes.chat_api_v1 import router as chat_api_v1_router
from dashboard.routes.daily_run import router as daily_run_router
from dashboard.routes.db_explorer import router as db_explorer_router
from dashboard.routes.discord import router as discord_router
from dashboard.routes.experiments import router as experiments_router
@@ -40,15 +43,19 @@ from dashboard.routes.memory import router as memory_router
from dashboard.routes.mobile import router as mobile_router
from dashboard.routes.models import api_router as models_api_router
from dashboard.routes.models import router as models_router
from dashboard.routes.quests import router as quests_router
from dashboard.routes.spark import router as spark_router
from dashboard.routes.system import router as system_router
from dashboard.routes.tasks import router as tasks_router
from dashboard.routes.telegram import router as telegram_router
from dashboard.routes.thinking import router as thinking_router
from dashboard.routes.tools import router as tools_router
from dashboard.routes.tower import router as tower_router
from dashboard.routes.voice import router as voice_router
from dashboard.routes.work_orders import router as work_orders_router
from dashboard.routes.world import matrix_router
from dashboard.routes.world import router as world_router
from infrastructure.morrowind.api import router as morrowind_router
from timmy.workshop_state import PRESENCE_FILE
@@ -376,73 +383,78 @@ def _startup_background_tasks() -> list[asyncio.Task]:
]
def _try_prune(label: str, prune_fn, days: int) -> None:
"""Run a prune function, log results, swallow errors."""
try:
pruned = prune_fn()
if pruned:
logger.info(
"%s auto-prune: removed %d entries older than %d days",
label,
pruned,
days,
)
except Exception as exc:
logger.debug("%s auto-prune skipped: %s", label, exc)
def _check_vault_size() -> None:
"""Warn if the memory vault exceeds the configured size limit."""
try:
vault_path = Path(settings.repo_root) / "memory" / "notes"
if vault_path.exists():
total_bytes = sum(f.stat().st_size for f in vault_path.rglob("*") if f.is_file())
total_mb = total_bytes / (1024 * 1024)
if total_mb > settings.memory_vault_max_mb:
logger.warning(
"Memory vault (%.1f MB) exceeds limit (%d MB) — consider archiving old notes",
total_mb,
settings.memory_vault_max_mb,
)
except Exception as exc:
logger.debug("Vault size check skipped: %s", exc)
def _startup_pruning() -> None:
"""Auto-prune old memories, thoughts, and events on startup."""
if settings.memory_prune_days > 0:
try:
from timmy.memory_system import prune_memories
from timmy.memory_system import prune_memories
pruned = prune_memories(
_try_prune(
"Memory",
lambda: prune_memories(
older_than_days=settings.memory_prune_days,
keep_facts=settings.memory_prune_keep_facts,
)
if pruned:
logger.info(
"Memory auto-prune: removed %d entries older than %d days",
pruned,
settings.memory_prune_days,
)
except Exception as exc:
logger.debug("Memory auto-prune skipped: %s", exc)
),
settings.memory_prune_days,
)
if settings.thoughts_prune_days > 0:
try:
from timmy.thinking import thinking_engine
from timmy.thinking import thinking_engine
pruned = thinking_engine.prune_old_thoughts(
_try_prune(
"Thought",
lambda: thinking_engine.prune_old_thoughts(
keep_days=settings.thoughts_prune_days,
keep_min=settings.thoughts_prune_keep_min,
)
if pruned:
logger.info(
"Thought auto-prune: removed %d entries older than %d days",
pruned,
settings.thoughts_prune_days,
)
except Exception as exc:
logger.debug("Thought auto-prune skipped: %s", exc)
),
settings.thoughts_prune_days,
)
if settings.events_prune_days > 0:
try:
from swarm.event_log import prune_old_events
from swarm.event_log import prune_old_events
pruned = prune_old_events(
_try_prune(
"Event",
lambda: prune_old_events(
keep_days=settings.events_prune_days,
keep_min=settings.events_prune_keep_min,
)
if pruned:
logger.info(
"Event auto-prune: removed %d entries older than %d days",
pruned,
settings.events_prune_days,
)
except Exception as exc:
logger.debug("Event auto-prune skipped: %s", exc)
),
settings.events_prune_days,
)
if settings.memory_vault_max_mb > 0:
try:
vault_path = Path(settings.repo_root) / "memory" / "notes"
if vault_path.exists():
total_bytes = sum(f.stat().st_size for f in vault_path.rglob("*") if f.is_file())
total_mb = total_bytes / (1024 * 1024)
if total_mb > settings.memory_vault_max_mb:
logger.warning(
"Memory vault (%.1f MB) exceeds limit (%d MB) — consider archiving old notes",
total_mb,
settings.memory_vault_max_mb,
)
except Exception as exc:
logger.debug("Vault size check skipped: %s", exc)
_check_vault_size()
async def _shutdown_cleanup(
@@ -513,25 +525,55 @@ app = FastAPI(
def _get_cors_origins() -> list[str]:
"""Get CORS origins from settings, rejecting wildcards in production."""
origins = settings.cors_origins
"""Get CORS origins from settings, rejecting wildcards in production.
Adds matrix_frontend_url when configured. Always allows Tailscale IPs
(100.x.x.x range) for development convenience.
"""
origins = list(settings.cors_origins)
# Strip wildcards in production (security)
if "*" in origins and not settings.debug:
logger.warning(
"Wildcard '*' in CORS_ORIGINS stripped in production — "
"set explicit origins via CORS_ORIGINS env var"
)
origins = [o for o in origins if o != "*"]
# Add Matrix frontend URL if configured
if settings.matrix_frontend_url:
url = settings.matrix_frontend_url.strip()
if url and url not in origins:
origins.append(url)
logger.debug("Added Matrix frontend to CORS: %s", url)
return origins
# Pattern to match Tailscale IPs (100.x.x.x) for CORS origin regex
_TAILSCALE_IP_PATTERN = re.compile(r"^https?://100\.\d{1,3}\.\d{1,3}\.\d{1,3}(?::\d+)?$")
def _is_tailscale_origin(origin: str) -> bool:
"""Check if origin is a Tailscale IP (100.x.x.x range)."""
return bool(_TAILSCALE_IP_PATTERN.match(origin))
# Add dedicated middleware in correct order
# 1. Logging (outermost to capture everything)
app.add_middleware(RequestLoggingMiddleware, skip_paths=["/health"])
# 2. Security Headers
# 2. Rate Limiting (before security to prevent abuse early)
app.add_middleware(
RateLimitMiddleware,
path_prefixes=["/api/matrix/"],
requests_per_minute=30,
)
# 3. Security Headers
app.add_middleware(SecurityHeadersMiddleware, production=not settings.debug)
# 3. CSRF Protection
# 4. CSRF Protection
app.add_middleware(CSRFMiddleware)
# 4. Standard FastAPI middleware
@@ -545,6 +587,7 @@ app.add_middleware(
app.add_middleware(
CORSMiddleware,
allow_origins=_get_cors_origins(),
allow_origin_regex=r"https?://100\.\d{1,3}\.\d{1,3}\.\d{1,3}(:\d+)?",
allow_credentials=True,
allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"],
allow_headers=["Content-Type", "Authorization"],
@@ -583,6 +626,11 @@ app.include_router(system_router)
app.include_router(experiments_router)
app.include_router(db_explorer_router)
app.include_router(world_router)
app.include_router(matrix_router)
app.include_router(tower_router)
app.include_router(daily_run_router)
app.include_router(quests_router)
app.include_router(morrowind_router)
@app.websocket("/ws")

View File

@@ -1,6 +1,7 @@
"""Dashboard middleware package."""
from .csrf import CSRFMiddleware, csrf_exempt, generate_csrf_token, validate_csrf_token
from .rate_limit import RateLimiter, RateLimitMiddleware
from .request_logging import RequestLoggingMiddleware
from .security_headers import SecurityHeadersMiddleware
@@ -9,6 +10,8 @@ __all__ = [
"csrf_exempt",
"generate_csrf_token",
"validate_csrf_token",
"RateLimiter",
"RateLimitMiddleware",
"SecurityHeadersMiddleware",
"RequestLoggingMiddleware",
]

View File

@@ -131,7 +131,6 @@ class CSRFMiddleware(BaseHTTPMiddleware):
For safe methods: Set a CSRF token cookie if not present.
For unsafe methods: Validate the CSRF token or check if exempt.
"""
# Bypass CSRF if explicitly disabled (e.g. in tests)
from config import settings
if settings.timmy_disable_csrf:
@@ -141,52 +140,55 @@ class CSRFMiddleware(BaseHTTPMiddleware):
if request.headers.get("upgrade", "").lower() == "websocket":
return await call_next(request)
# Get existing CSRF token from cookie
csrf_cookie = request.cookies.get(self.cookie_name)
# For safe methods, just ensure a token exists
if request.method in self.SAFE_METHODS:
response = await call_next(request)
return await self._handle_safe_method(request, call_next, csrf_cookie)
# Set CSRF token cookie if not present
if not csrf_cookie:
new_token = generate_csrf_token()
response.set_cookie(
key=self.cookie_name,
value=new_token,
httponly=False, # Must be readable by JavaScript
secure=settings.csrf_cookie_secure,
samesite="Lax",
max_age=86400, # 24 hours
)
return await self._handle_unsafe_method(request, call_next, csrf_cookie)
return response
async def _handle_safe_method(
self, request: Request, call_next, csrf_cookie: str | None
) -> Response:
"""Handle safe HTTP methods (GET, HEAD, OPTIONS, TRACE).
# For unsafe methods, we need to validate or check if exempt
# First, try to validate the CSRF token
if await self._validate_request(request, csrf_cookie):
# Token is valid, allow the request
return await call_next(request)
Forwards the request and sets a CSRF token cookie if not present.
"""
from config import settings
# Token validation failed, check if the path is exempt
path = request.url.path
if self._is_likely_exempt(path):
# Path is exempt, allow the request
return await call_next(request)
# Token validation failed and path is not exempt
# We still need to call the app to check if the endpoint is decorated
# with @csrf_exempt, so we'll let it through and check after routing
response = await call_next(request)
# After routing, check if the endpoint is marked as exempt
endpoint = request.scope.get("endpoint")
if endpoint and is_csrf_exempt(endpoint):
# Endpoint is marked as exempt, allow the response
return response
if not csrf_cookie:
new_token = generate_csrf_token()
response.set_cookie(
key=self.cookie_name,
value=new_token,
httponly=False, # Must be readable by JavaScript
secure=settings.csrf_cookie_secure,
samesite="Lax",
max_age=86400, # 24 hours
)
return response
async def _handle_unsafe_method(
self, request: Request, call_next, csrf_cookie: str | None
) -> Response:
"""Handle unsafe HTTP methods (POST, PUT, DELETE, PATCH).
Validates the CSRF token, checks path and endpoint exemptions,
or returns a 403 error.
"""
if await self._validate_request(request, csrf_cookie):
return await call_next(request)
if self._is_likely_exempt(request.url.path):
return await call_next(request)
endpoint = self._resolve_endpoint(request)
if endpoint and is_csrf_exempt(endpoint):
return await call_next(request)
# Endpoint is not exempt and token validation failed
# Return 403 error
return JSONResponse(
status_code=403,
content={
@@ -196,6 +198,41 @@ class CSRFMiddleware(BaseHTTPMiddleware):
},
)
def _resolve_endpoint(self, request: Request) -> Callable | None:
"""Resolve the route endpoint without executing it.
Walks the Starlette/FastAPI router to find which endpoint function
handles this request, so we can check @csrf_exempt before any
side effects occur.
Returns:
The endpoint callable, or None if no route matched.
"""
# If routing already happened (endpoint in scope), use it
endpoint = request.scope.get("endpoint")
if endpoint:
return endpoint
# Walk the middleware/app chain to find something with routes
from starlette.routing import Match
app = self.app
while app is not None:
if hasattr(app, "routes"):
for route in app.routes:
match, _ = route.matches(request.scope)
if match == Match.FULL:
return getattr(route, "endpoint", None)
# Try .router (FastAPI stores routes on app.router)
if hasattr(app, "router") and hasattr(app.router, "routes"):
for route in app.router.routes:
match, _ = route.matches(request.scope)
if match == Match.FULL:
return getattr(route, "endpoint", None)
app = getattr(app, "app", None)
return None
def _is_likely_exempt(self, path: str) -> bool:
"""Check if a path is likely to be CSRF exempt.

View File

@@ -0,0 +1,209 @@
"""Rate limiting middleware for FastAPI.
Simple in-memory rate limiter for API endpoints. Tracks requests per IP
with configurable limits and automatic cleanup of stale entries.
"""
import logging
import time
from collections import deque
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import JSONResponse, Response
logger = logging.getLogger(__name__)
class RateLimiter:
"""In-memory rate limiter for tracking requests per IP.
Stores request timestamps in a dict keyed by client IP.
Automatically cleans up stale entries every 60 seconds.
Attributes:
requests_per_minute: Maximum requests allowed per minute per IP.
cleanup_interval_seconds: How often to clean stale entries.
"""
def __init__(
self,
requests_per_minute: int = 30,
cleanup_interval_seconds: int = 60,
):
self.requests_per_minute = requests_per_minute
self.cleanup_interval_seconds = cleanup_interval_seconds
self._storage: dict[str, deque[float]] = {}
self._last_cleanup: float = time.time()
self._window_seconds: float = 60.0 # 1 minute window
def _get_client_ip(self, request: Request) -> str:
"""Extract client IP from request, respecting X-Forwarded-For header.
Args:
request: The incoming request.
Returns:
Client IP address string.
"""
# Check for forwarded IP (when behind proxy/load balancer)
forwarded = request.headers.get("x-forwarded-for")
if forwarded:
# Take the first IP in the chain
return forwarded.split(",")[0].strip()
real_ip = request.headers.get("x-real-ip")
if real_ip:
return real_ip
# Fall back to direct connection
if request.client:
return request.client.host
return "unknown"
def _cleanup_if_needed(self) -> None:
"""Remove stale entries older than the cleanup interval."""
now = time.time()
if now - self._last_cleanup < self.cleanup_interval_seconds:
return
cutoff = now - self._window_seconds
stale_ips: list[str] = []
for ip, timestamps in self._storage.items():
# Remove timestamps older than the window
while timestamps and timestamps[0] < cutoff:
timestamps.popleft()
# Mark IP for removal if no recent requests
if not timestamps:
stale_ips.append(ip)
# Remove stale IP entries
for ip in stale_ips:
del self._storage[ip]
self._last_cleanup = now
if stale_ips:
logger.debug("Rate limiter cleanup: removed %d stale IPs", len(stale_ips))
def is_allowed(self, client_ip: str) -> tuple[bool, float]:
"""Check if a request from the given IP is allowed.
Args:
client_ip: The client's IP address.
Returns:
Tuple of (allowed: bool, retry_after: float).
retry_after is seconds until next allowed request, 0 if allowed now.
"""
now = time.time()
cutoff = now - self._window_seconds
# Get or create timestamp deque for this IP
if client_ip not in self._storage:
self._storage[client_ip] = deque()
timestamps = self._storage[client_ip]
# Remove timestamps outside the window
while timestamps and timestamps[0] < cutoff:
timestamps.popleft()
# Check if limit exceeded
if len(timestamps) >= self.requests_per_minute:
# Calculate retry after time
oldest = timestamps[0]
retry_after = self._window_seconds - (now - oldest)
return False, max(0.0, retry_after)
# Record this request
timestamps.append(now)
return True, 0.0
def check_request(self, request: Request) -> tuple[bool, float]:
"""Check if the request is allowed under rate limits.
Args:
request: The incoming request.
Returns:
Tuple of (allowed: bool, retry_after: float).
"""
self._cleanup_if_needed()
client_ip = self._get_client_ip(request)
return self.is_allowed(client_ip)
class RateLimitMiddleware(BaseHTTPMiddleware):
"""Middleware to apply rate limiting to specific routes.
Usage:
# Apply to all routes (not recommended for public static files)
app.add_middleware(RateLimitMiddleware)
# Apply only to specific paths
app.add_middleware(
RateLimitMiddleware,
path_prefixes=["/api/matrix/"],
requests_per_minute=30,
)
Attributes:
path_prefixes: List of URL path prefixes to rate limit.
If empty, applies to all paths.
requests_per_minute: Maximum requests per minute per IP.
"""
def __init__(
self,
app,
path_prefixes: list[str] | None = None,
requests_per_minute: int = 30,
):
super().__init__(app)
self.path_prefixes = path_prefixes or []
self.limiter = RateLimiter(requests_per_minute=requests_per_minute)
def _should_rate_limit(self, path: str) -> bool:
"""Check if the given path should be rate limited.
Args:
path: The request URL path.
Returns:
True if path matches any configured prefix.
"""
if not self.path_prefixes:
return True
return any(path.startswith(prefix) for prefix in self.path_prefixes)
async def dispatch(self, request: Request, call_next) -> Response:
"""Apply rate limiting to configured paths.
Args:
request: The incoming request.
call_next: Callable to get the response from downstream.
Returns:
Response from downstream, or 429 if rate limited.
"""
# Skip if path doesn't match configured prefixes
if not self._should_rate_limit(request.url.path):
return await call_next(request)
# Check rate limit
allowed, retry_after = self.limiter.check_request(request)
if not allowed:
return JSONResponse(
status_code=429,
content={
"error": "Rate limit exceeded. Try again later.",
"retry_after": int(retry_after) + 1,
},
headers={"Retry-After": str(int(retry_after) + 1)},
)
# Process the request
return await call_next(request)

View File

@@ -42,6 +42,114 @@ class RequestLoggingMiddleware(BaseHTTPMiddleware):
self.skip_paths = set(skip_paths or [])
self.log_level = log_level
def _should_skip_path(self, path: str) -> bool:
"""Check if the request path should be skipped from logging.
Args:
path: The request URL path.
Returns:
True if the path should be skipped, False otherwise.
"""
return path in self.skip_paths
def _prepare_request_context(self, request: Request) -> tuple[str, float]:
"""Prepare context for request processing.
Generates a correlation ID and records the start time.
Args:
request: The incoming request.
Returns:
Tuple of (correlation_id, start_time).
"""
correlation_id = str(uuid.uuid4())[:8]
request.state.correlation_id = correlation_id
start_time = time.time()
return correlation_id, start_time
def _get_duration_ms(self, start_time: float) -> float:
"""Calculate the request duration in milliseconds.
Args:
start_time: The start time from time.time().
Returns:
Duration in milliseconds.
"""
return (time.time() - start_time) * 1000
def _log_success(
self,
request: Request,
response: Response,
correlation_id: str,
duration_ms: float,
client_ip: str,
user_agent: str,
) -> None:
"""Log a successful request.
Args:
request: The incoming request.
response: The response from downstream.
correlation_id: The request correlation ID.
duration_ms: Request duration in milliseconds.
client_ip: Client IP address.
user_agent: User-Agent header value.
"""
self._log_request(
method=request.method,
path=request.url.path,
status_code=response.status_code,
duration_ms=duration_ms,
client_ip=client_ip,
user_agent=user_agent,
correlation_id=correlation_id,
)
def _log_error(
self,
request: Request,
exc: Exception,
correlation_id: str,
duration_ms: float,
client_ip: str,
) -> None:
"""Log a failed request and capture the error.
Args:
request: The incoming request.
exc: The exception that was raised.
correlation_id: The request correlation ID.
duration_ms: Request duration in milliseconds.
client_ip: Client IP address.
"""
logger.error(
f"[{correlation_id}] {request.method} {request.url.path} "
f"- ERROR - {duration_ms:.2f}ms - {client_ip} - {str(exc)}"
)
# Auto-escalate: create bug report task from unhandled exception
try:
from infrastructure.error_capture import capture_error
capture_error(
exc,
source="http",
context={
"method": request.method,
"path": request.url.path,
"correlation_id": correlation_id,
"client_ip": client_ip,
"duration_ms": f"{duration_ms:.0f}",
},
)
except Exception:
logger.warning("Escalation logging error: capture failed")
# never let escalation break the request
async def dispatch(self, request: Request, call_next) -> Response:
"""Log the request and response details.
@@ -52,74 +160,23 @@ class RequestLoggingMiddleware(BaseHTTPMiddleware):
Returns:
The response from downstream.
"""
# Check if we should skip logging this path
if request.url.path in self.skip_paths:
if self._should_skip_path(request.url.path):
return await call_next(request)
# Generate correlation ID
correlation_id = str(uuid.uuid4())[:8]
request.state.correlation_id = correlation_id
# Record start time
start_time = time.time()
# Get client info
correlation_id, start_time = self._prepare_request_context(request)
client_ip = self._get_client_ip(request)
user_agent = request.headers.get("user-agent", "-")
try:
# Process the request
response = await call_next(request)
# Calculate duration
duration_ms = (time.time() - start_time) * 1000
# Log the request
self._log_request(
method=request.method,
path=request.url.path,
status_code=response.status_code,
duration_ms=duration_ms,
client_ip=client_ip,
user_agent=user_agent,
correlation_id=correlation_id,
)
# Add correlation ID to response headers
duration_ms = self._get_duration_ms(start_time)
self._log_success(request, response, correlation_id, duration_ms, client_ip, user_agent)
response.headers["X-Correlation-ID"] = correlation_id
return response
except Exception as exc:
# Calculate duration even for failed requests
duration_ms = (time.time() - start_time) * 1000
# Log the error
logger.error(
f"[{correlation_id}] {request.method} {request.url.path} "
f"- ERROR - {duration_ms:.2f}ms - {client_ip} - {str(exc)}"
)
# Auto-escalate: create bug report task from unhandled exception
try:
from infrastructure.error_capture import capture_error
capture_error(
exc,
source="http",
context={
"method": request.method,
"path": request.url.path,
"correlation_id": correlation_id,
"client_ip": client_ip,
"duration_ms": f"{duration_ms:.0f}",
},
)
except Exception as exc:
logger.debug("Escalation logging error: %s", exc)
pass # never let escalation break the request
# Re-raise the exception
duration_ms = self._get_duration_ms(start_time)
self._log_error(request, exc, correlation_id, duration_ms, client_ip)
raise
def _get_client_ip(self, request: Request) -> str:

View File

@@ -1,4 +1,4 @@
from datetime import date, datetime
from datetime import UTC, date, datetime
from enum import StrEnum
from sqlalchemy import JSON, Boolean, Column, Date, DateTime, Index, Integer, String
@@ -40,8 +40,13 @@ class Task(Base):
deferred_at = Column(DateTime, nullable=True)
# Timestamps
created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow, nullable=False)
created_at = Column(DateTime, default=lambda: datetime.now(UTC), nullable=False)
updated_at = Column(
DateTime,
default=lambda: datetime.now(UTC),
onupdate=lambda: datetime.now(UTC),
nullable=False,
)
__table_args__ = (Index("ix_task_state_order", "state", "sort_order"),)
@@ -59,4 +64,4 @@ class JournalEntry(Base):
gratitude = Column(String(500), nullable=True)
energy_level = Column(Integer, nullable=True) # User-reported, 1-10
created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
created_at = Column(DateTime, default=lambda: datetime.now(UTC), nullable=False)

View File

@@ -1,5 +1,5 @@
import logging
from datetime import date, datetime
from datetime import UTC, date, datetime
from fastapi import APIRouter, Depends, Form, HTTPException, Request
from fastapi.responses import HTMLResponse
@@ -19,14 +19,17 @@ router = APIRouter(tags=["calm"])
# Helper functions for state machine logic
def get_now_task(db: Session) -> Task | None:
"""Return the single active NOW task, or None."""
return db.query(Task).filter(Task.state == TaskState.NOW).first()
def get_next_task(db: Session) -> Task | None:
"""Return the single queued NEXT task, or None."""
return db.query(Task).filter(Task.state == TaskState.NEXT).first()
def get_later_tasks(db: Session) -> list[Task]:
"""Return all LATER tasks ordered by MIT flag then sort_order."""
return (
db.query(Task)
.filter(Task.state == TaskState.LATER)
@@ -35,7 +38,63 @@ def get_later_tasks(db: Session) -> list[Task]:
)
def _create_mit_tasks(db: Session, titles: list[str | None]) -> list[int]:
"""Create MIT tasks from a list of titles, return their IDs."""
task_ids: list[int] = []
for title in titles:
if title:
task = Task(
title=title,
is_mit=True,
state=TaskState.LATER,
certainty=TaskCertainty.SOFT,
)
db.add(task)
db.commit()
db.refresh(task)
task_ids.append(task.id)
return task_ids
def _create_other_tasks(db: Session, other_tasks: str):
"""Create non-MIT tasks from newline-separated text."""
for line in other_tasks.split("\n"):
line = line.strip()
if line:
task = Task(
title=line,
state=TaskState.LATER,
certainty=TaskCertainty.FUZZY,
)
db.add(task)
def _seed_now_next(db: Session):
"""Set initial NOW/NEXT states when both slots are empty."""
if get_now_task(db) or get_next_task(db):
return
later_tasks = (
db.query(Task)
.filter(Task.state == TaskState.LATER)
.order_by(Task.is_mit.desc(), Task.sort_order)
.all()
)
if later_tasks:
later_tasks[0].state = TaskState.NOW
db.add(later_tasks[0])
db.flush()
if len(later_tasks) > 1:
later_tasks[1].state = TaskState.NEXT
db.add(later_tasks[1])
def promote_tasks(db: Session):
"""Enforce the NOW/NEXT/LATER state machine invariants.
- At most one NOW task (extras demoted to NEXT).
- If no NOW, promote NEXT -> NOW.
- If no NEXT, promote highest-priority LATER -> NEXT.
"""
# Ensure only one NOW task exists. If multiple, demote extras to NEXT.
now_tasks = db.query(Task).filter(Task.state == TaskState.NOW).all()
if len(now_tasks) > 1:
@@ -74,6 +133,7 @@ def promote_tasks(db: Session):
# Endpoints
@router.get("/calm", response_class=HTMLResponse)
async def get_calm_view(request: Request, db: Session = Depends(get_db)):
"""Render the main CALM dashboard with NOW/NEXT/LATER counts."""
now_task = get_now_task(db)
next_task = get_next_task(db)
later_tasks_count = len(get_later_tasks(db))
@@ -90,6 +150,7 @@ async def get_calm_view(request: Request, db: Session = Depends(get_db)):
@router.get("/calm/ritual/morning", response_class=HTMLResponse)
async def get_morning_ritual_form(request: Request):
"""Render the morning ritual intake form."""
return templates.TemplateResponse(request, "calm/morning_ritual_form.html", {})
@@ -102,63 +163,20 @@ async def post_morning_ritual(
mit3_title: str = Form(None),
other_tasks: str = Form(""),
):
# Create Journal Entry
mit_task_ids = []
"""Process morning ritual: create MITs, other tasks, and set initial states."""
journal_entry = JournalEntry(entry_date=date.today())
db.add(journal_entry)
db.commit()
db.refresh(journal_entry)
# Create MIT tasks
for mit_title in [mit1_title, mit2_title, mit3_title]:
if mit_title:
task = Task(
title=mit_title,
is_mit=True,
state=TaskState.LATER, # Initially LATER, will be promoted
certainty=TaskCertainty.SOFT,
)
db.add(task)
db.commit()
db.refresh(task)
mit_task_ids.append(task.id)
journal_entry.mit_task_ids = mit_task_ids
journal_entry.mit_task_ids = _create_mit_tasks(db, [mit1_title, mit2_title, mit3_title])
db.add(journal_entry)
# Create other tasks
for task_title in other_tasks.split("\n"):
task_title = task_title.strip()
if task_title:
task = Task(
title=task_title,
state=TaskState.LATER,
certainty=TaskCertainty.FUZZY,
)
db.add(task)
_create_other_tasks(db, other_tasks)
db.commit()
# Set initial NOW/NEXT states
# Set initial NOW/NEXT states after all tasks are created
if not get_now_task(db) and not get_next_task(db):
later_tasks = (
db.query(Task)
.filter(Task.state == TaskState.LATER)
.order_by(Task.is_mit.desc(), Task.sort_order)
.all()
)
if later_tasks:
# Set the highest priority LATER task to NOW
later_tasks[0].state = TaskState.NOW
db.add(later_tasks[0])
db.flush() # Flush to make the change visible for the next query
# Set the next highest priority LATER task to NEXT
if len(later_tasks) > 1:
later_tasks[1].state = TaskState.NEXT
db.add(later_tasks[1])
db.commit() # Commit changes after initial NOW/NEXT setup
_seed_now_next(db)
db.commit()
return templates.TemplateResponse(
request,
@@ -173,6 +191,7 @@ async def post_morning_ritual(
@router.get("/calm/ritual/evening", response_class=HTMLResponse)
async def get_evening_ritual_form(request: Request, db: Session = Depends(get_db)):
"""Render the evening ritual form for today's journal entry."""
journal_entry = db.query(JournalEntry).filter(JournalEntry.entry_date == date.today()).first()
if not journal_entry:
raise HTTPException(status_code=404, detail="No journal entry for today")
@@ -189,6 +208,7 @@ async def post_evening_ritual(
gratitude: str = Form(None),
energy_level: int = Form(None),
):
"""Process evening ritual: save reflection/gratitude, archive active tasks."""
journal_entry = db.query(JournalEntry).filter(JournalEntry.entry_date == date.today()).first()
if not journal_entry:
raise HTTPException(status_code=404, detail="No journal entry for today")
@@ -206,7 +226,7 @@ async def post_evening_ritual(
)
for task in active_tasks:
task.state = TaskState.DEFERRED # Or DONE, depending on desired archiving logic
task.deferred_at = datetime.utcnow()
task.deferred_at = datetime.now(UTC)
db.add(task)
db.commit()
@@ -223,6 +243,7 @@ async def create_new_task(
is_mit: bool = Form(False),
certainty: TaskCertainty = Form(TaskCertainty.SOFT),
):
"""Create a new task in LATER state and return updated count."""
task = Task(
title=title,
description=description,
@@ -247,6 +268,7 @@ async def start_task(
task_id: int,
db: Session = Depends(get_db),
):
"""Move a task to NOW state, demoting the current NOW to NEXT."""
current_now_task = get_now_task(db)
if current_now_task and current_now_task.id != task_id:
current_now_task.state = TaskState.NEXT # Demote current NOW to NEXT
@@ -257,7 +279,7 @@ async def start_task(
raise HTTPException(status_code=404, detail="Task not found")
task.state = TaskState.NOW
task.started_at = datetime.utcnow()
task.started_at = datetime.now(UTC)
db.add(task)
db.commit()
@@ -281,12 +303,13 @@ async def complete_task(
task_id: int,
db: Session = Depends(get_db),
):
"""Mark a task as DONE and trigger state promotion."""
task = db.query(Task).filter(Task.id == task_id).first()
if not task:
raise HTTPException(status_code=404, detail="Task not found")
task.state = TaskState.DONE
task.completed_at = datetime.utcnow()
task.completed_at = datetime.now(UTC)
db.add(task)
db.commit()
@@ -309,12 +332,13 @@ async def defer_task(
task_id: int,
db: Session = Depends(get_db),
):
"""Defer a task and trigger state promotion."""
task = db.query(Task).filter(Task.id == task_id).first()
if not task:
raise HTTPException(status_code=404, detail="Task not found")
task.state = TaskState.DEFERRED
task.deferred_at = datetime.utcnow()
task.deferred_at = datetime.now(UTC)
db.add(task)
db.commit()
@@ -333,6 +357,7 @@ async def defer_task(
@router.get("/calm/partials/later_tasks_list", response_class=HTMLResponse)
async def get_later_tasks_list(request: Request, db: Session = Depends(get_db)):
"""Render the expandable list of LATER tasks."""
later_tasks = get_later_tasks(db)
return templates.TemplateResponse(
"calm/partials/later_tasks_list.html",
@@ -348,6 +373,7 @@ async def reorder_tasks(
later_task_ids: str = Form(""),
next_task_id: int | None = Form(None),
):
"""Reorder LATER tasks and optionally promote one to NEXT."""
# Reorder LATER tasks
if later_task_ids:
ids_in_order = [int(x.strip()) for x in later_task_ids.split(",") if x.strip()]

View File

@@ -0,0 +1,435 @@
"""Daily Run metrics routes — dashboard card for triage and session metrics."""
from __future__ import annotations
import json
import logging
import os
from dataclasses import dataclass
from datetime import UTC, datetime, timedelta
from pathlib import Path
from urllib.error import HTTPError, URLError
from urllib.request import Request as UrlRequest
from urllib.request import urlopen
from fastapi import APIRouter, Request
from fastapi.responses import HTMLResponse, JSONResponse
from config import settings
from dashboard.templating import templates
logger = logging.getLogger(__name__)
router = APIRouter(tags=["daily-run"])
REPO_ROOT = Path(settings.repo_root)
CONFIG_PATH = REPO_ROOT / "timmy_automations" / "config" / "daily_run.json"
DEFAULT_CONFIG = {
"gitea_api": "http://localhost:3000/api/v1",
"repo_slug": "rockachopa/Timmy-time-dashboard",
"token_file": "~/.hermes/gitea_token",
"layer_labels_prefix": "layer:",
}
LAYER_LABELS = ["layer:triage", "layer:micro-fix", "layer:tests", "layer:economy"]
def _load_config() -> dict:
"""Load configuration from config file with fallback to defaults."""
config = DEFAULT_CONFIG.copy()
if CONFIG_PATH.exists():
try:
file_config = json.loads(CONFIG_PATH.read_text())
if "orchestrator" in file_config:
config.update(file_config["orchestrator"])
except (json.JSONDecodeError, OSError) as exc:
logger.debug("Could not load daily_run config: %s", exc)
# Environment variable overrides
if os.environ.get("TIMMY_GITEA_API"):
config["gitea_api"] = os.environ.get("TIMMY_GITEA_API")
if os.environ.get("TIMMY_REPO_SLUG"):
config["repo_slug"] = os.environ.get("TIMMY_REPO_SLUG")
if os.environ.get("TIMMY_GITEA_TOKEN"):
config["token"] = os.environ.get("TIMMY_GITEA_TOKEN")
return config
def _get_token(config: dict) -> str | None:
"""Get Gitea token from environment or file."""
if "token" in config:
return config["token"]
token_file = Path(config["token_file"]).expanduser()
if token_file.exists():
return token_file.read_text().strip()
return None
class GiteaClient:
"""Simple Gitea API client with graceful degradation."""
def __init__(self, config: dict, token: str | None):
self.api_base = config["gitea_api"].rstrip("/")
self.repo_slug = config["repo_slug"]
self.token = token
self._available: bool | None = None
def _headers(self) -> dict:
headers = {"Accept": "application/json"}
if self.token:
headers["Authorization"] = f"token {self.token}"
return headers
def _api_url(self, path: str) -> str:
return f"{self.api_base}/repos/{self.repo_slug}/{path}"
def is_available(self) -> bool:
"""Check if Gitea API is reachable."""
if self._available is not None:
return self._available
try:
req = UrlRequest(
f"{self.api_base}/version",
headers=self._headers(),
method="GET",
)
with urlopen(req, timeout=5) as resp:
self._available = resp.status == 200
return self._available
except (HTTPError, URLError, TimeoutError):
self._available = False
return False
def get_paginated(self, path: str, params: dict | None = None) -> list:
"""Fetch all pages of a paginated endpoint."""
all_items = []
page = 1
limit = 50
while True:
url = self._api_url(path)
query_parts = [f"limit={limit}", f"page={page}"]
if params:
for key, val in params.items():
query_parts.append(f"{key}={val}")
url = f"{url}?{'&'.join(query_parts)}"
req = UrlRequest(url, headers=self._headers(), method="GET")
with urlopen(req, timeout=15) as resp:
batch = json.loads(resp.read())
if not batch:
break
all_items.extend(batch)
if len(batch) < limit:
break
page += 1
return all_items
@dataclass
class LayerMetrics:
"""Metrics for a single layer."""
name: str
label: str
current_count: int
previous_count: int
@property
def trend(self) -> str:
"""Return trend indicator."""
if self.previous_count == 0:
return "" if self.current_count == 0 else ""
diff = self.current_count - self.previous_count
pct = (diff / self.previous_count) * 100
if pct > 20:
return "↑↑"
elif pct > 5:
return ""
elif pct < -20:
return "↓↓"
elif pct < -5:
return ""
return ""
@property
def trend_color(self) -> str:
"""Return color for trend (CSS variable name)."""
trend = self.trend
if trend in ("↑↑", ""):
return "var(--green)" # More work = positive
elif trend in ("↓↓", ""):
return "var(--amber)" # Less work = caution
return "var(--text-dim)"
@dataclass
class DailyRunMetrics:
"""Complete Daily Run metrics."""
sessions_completed: int
sessions_previous: int
layers: list[LayerMetrics]
total_touched_current: int
total_touched_previous: int
lookback_days: int
generated_at: str
@property
def sessions_trend(self) -> str:
"""Return sessions trend indicator."""
if self.sessions_previous == 0:
return "" if self.sessions_completed == 0 else ""
diff = self.sessions_completed - self.sessions_previous
pct = (diff / self.sessions_previous) * 100
if pct > 20:
return "↑↑"
elif pct > 5:
return ""
elif pct < -20:
return "↓↓"
elif pct < -5:
return ""
return ""
@property
def sessions_trend_color(self) -> str:
"""Return color for sessions trend."""
trend = self.sessions_trend
if trend in ("↑↑", ""):
return "var(--green)"
elif trend in ("↓↓", ""):
return "var(--amber)"
return "var(--text-dim)"
def _extract_layer(labels: list[dict]) -> str | None:
"""Extract layer label from issue labels."""
for label in labels:
name = label.get("name", "")
if name.startswith("layer:"):
return name.replace("layer:", "")
return None
def _load_cycle_data(days: int = 14) -> dict:
"""Load cycle retrospective data for session counting."""
retro_file = REPO_ROOT / ".loop" / "retro" / "cycles.jsonl"
if not retro_file.exists():
return {"current": 0, "previous": 0}
try:
entries = []
for line in retro_file.read_text().strip().splitlines():
try:
entries.append(json.loads(line))
except json.JSONDecodeError:
continue
now = datetime.now(UTC)
current_cutoff = now - timedelta(days=days)
previous_cutoff = now - timedelta(days=days * 2)
current_count = 0
previous_count = 0
for entry in entries:
ts_str = entry.get("timestamp", "")
if not ts_str:
continue
try:
ts = datetime.fromisoformat(ts_str.replace("Z", "+00:00"))
if ts >= current_cutoff:
if entry.get("success", False):
current_count += 1
elif ts >= previous_cutoff:
if entry.get("success", False):
previous_count += 1
except (ValueError, TypeError):
continue
return {"current": current_count, "previous": previous_count}
except (OSError, ValueError) as exc:
logger.debug("Failed to load cycle data: %s", exc)
return {"current": 0, "previous": 0}
def _fetch_layer_metrics(
client: GiteaClient, lookback_days: int = 7
) -> tuple[list[LayerMetrics], int, int]:
"""Fetch metrics for each layer from Gitea issues."""
now = datetime.now(UTC)
current_cutoff = now - timedelta(days=lookback_days)
previous_cutoff = now - timedelta(days=lookback_days * 2)
layers = []
total_current = 0
total_previous = 0
for layer_label in LAYER_LABELS:
layer_name = layer_label.replace("layer:", "")
try:
# Fetch all issues with this layer label (both open and closed)
issues = client.get_paginated(
"issues",
{"state": "all", "labels": layer_label, "limit": 100},
)
current_count = 0
previous_count = 0
for issue in issues:
updated_at = issue.get("updated_at", "")
if not updated_at:
continue
try:
updated = datetime.fromisoformat(updated_at.replace("Z", "+00:00"))
if updated >= current_cutoff:
current_count += 1
elif updated >= previous_cutoff:
previous_count += 1
except (ValueError, TypeError):
continue
layers.append(
LayerMetrics(
name=layer_name,
label=layer_label,
current_count=current_count,
previous_count=previous_count,
)
)
total_current += current_count
total_previous += previous_count
except (HTTPError, URLError) as exc:
logger.debug("Failed to fetch issues for %s: %s", layer_label, exc)
layers.append(
LayerMetrics(
name=layer_name,
label=layer_label,
current_count=0,
previous_count=0,
)
)
return layers, total_current, total_previous
def _get_metrics(lookback_days: int = 7) -> DailyRunMetrics | None:
"""Get Daily Run metrics from Gitea API."""
config = _load_config()
token = _get_token(config)
client = GiteaClient(config, token)
if not client.is_available():
logger.debug("Gitea API not available for Daily Run metrics")
return None
try:
# Get layer metrics from issues
layers, total_current, total_previous = _fetch_layer_metrics(client, lookback_days)
# Get session data from cycle retrospectives
cycle_data = _load_cycle_data(days=lookback_days)
return DailyRunMetrics(
sessions_completed=cycle_data["current"],
sessions_previous=cycle_data["previous"],
layers=layers,
total_touched_current=total_current,
total_touched_previous=total_previous,
lookback_days=lookback_days,
generated_at=datetime.now(UTC).isoformat(),
)
except Exception as exc:
logger.debug("Error fetching Daily Run metrics: %s", exc)
return None
@router.get("/daily-run/metrics", response_class=JSONResponse)
async def daily_run_metrics_api(lookback_days: int = 7):
"""Return Daily Run metrics as JSON API."""
metrics = _get_metrics(lookback_days)
if not metrics:
return JSONResponse(
{"error": "Gitea API unavailable", "status": "unavailable"},
status_code=503,
)
# Check for quest completions based on Daily Run metrics
quest_rewards = []
try:
from dashboard.routes.quests import check_daily_run_quests
quest_rewards = await check_daily_run_quests(agent_id="system")
except Exception as exc:
logger.debug("Quest checking failed: %s", exc)
return JSONResponse(
{
"status": "ok",
"lookback_days": metrics.lookback_days,
"sessions": {
"completed": metrics.sessions_completed,
"previous": metrics.sessions_previous,
"trend": metrics.sessions_trend,
},
"layers": [
{
"name": layer.name,
"label": layer.label,
"current": layer.current_count,
"previous": layer.previous_count,
"trend": layer.trend,
}
for layer in metrics.layers
],
"totals": {
"current": metrics.total_touched_current,
"previous": metrics.total_touched_previous,
},
"generated_at": metrics.generated_at,
"quest_rewards": quest_rewards,
}
)
@router.get("/daily-run/panel", response_class=HTMLResponse)
async def daily_run_panel(request: Request, lookback_days: int = 7):
"""Return Daily Run metrics panel HTML for HTMX polling."""
metrics = _get_metrics(lookback_days)
# Build Gitea URLs for filtered issue lists
config = _load_config()
repo_slug = config.get("repo_slug", "rockachopa/Timmy-time-dashboard")
gitea_base = config.get("gitea_api", "http://localhost:3000/api/v1").replace("/api/v1", "")
# Logbook URL (link to issues with any layer label)
layer_labels = ",".join(LAYER_LABELS)
logbook_url = f"{gitea_base}/{repo_slug}/issues?labels={layer_labels}&state=all"
# Layer-specific URLs
layer_urls = {
layer: f"{gitea_base}/{repo_slug}/issues?labels=layer:{layer}&state=all"
for layer in ["triage", "micro-fix", "tests", "economy"]
}
return templates.TemplateResponse(
request,
"partials/daily_run_panel.html",
{
"metrics": metrics,
"logbook_url": logbook_url,
"layer_urls": layer_urls,
"gitea_available": metrics is not None,
},
)

View File

@@ -75,6 +75,7 @@ def _query_database(db_path: str) -> dict:
"truncated": count > MAX_ROWS,
}
except Exception as exc:
logger.exception("Failed to query table %s", table_name)
result["tables"][table_name] = {
"error": str(exc),
"columns": [],
@@ -83,6 +84,7 @@ def _query_database(db_path: str) -> dict:
"truncated": False,
}
except Exception as exc:
logger.exception("Failed to query database %s", db_path)
result["error"] = str(exc)
return result

View File

@@ -135,6 +135,7 @@ def _run_grok_query(message: str) -> dict:
result = backend.run(message)
return {"response": f"**[Grok]{invoice_note}:** {result.content}", "error": None}
except Exception as exc:
logger.exception("Grok query failed")
return {"response": None, "error": f"Grok error: {exc}"}
@@ -193,6 +194,7 @@ async def grok_stats():
"model": settings.grok_default_model,
}
except Exception as exc:
logger.exception("Failed to load Grok stats")
return {"error": str(exc)}

View File

@@ -148,6 +148,7 @@ def _check_sqlite() -> DependencyStatus:
details={"path": str(db_path)},
)
except Exception as exc:
logger.exception("SQLite health check failed")
return DependencyStatus(
name="SQLite Database",
status="unavailable",
@@ -274,3 +275,54 @@ async def component_status():
},
"timestamp": datetime.now(UTC).isoformat(),
}
@router.get("/health/snapshot")
async def health_snapshot():
"""Quick health snapshot before coding.
Returns a concise status summary including:
- CI pipeline status (pass/fail/unknown)
- Critical issues count (P0/P1)
- Test flakiness rate
- Token economy temperature
Fast execution (< 5 seconds) for pre-work checks.
Refs: #710
"""
import sys
from pathlib import Path
# Import the health snapshot module
snapshot_path = Path(settings.repo_root) / "timmy_automations" / "daily_run"
if str(snapshot_path) not in sys.path:
sys.path.insert(0, str(snapshot_path))
try:
from health_snapshot import generate_snapshot, get_token, load_config
config = load_config()
token = get_token(config)
# Run the health snapshot (in thread to avoid blocking)
snapshot = await asyncio.to_thread(generate_snapshot, config, token)
return snapshot.to_dict()
except Exception as exc:
logger.warning("Health snapshot failed: %s", exc)
# Return graceful fallback
return {
"timestamp": datetime.now(UTC).isoformat(),
"overall_status": "unknown",
"error": str(exc),
"ci": {"status": "unknown", "message": "Snapshot failed"},
"issues": {"count": 0, "p0_count": 0, "p1_count": 0, "issues": []},
"flakiness": {
"status": "unknown",
"recent_failures": 0,
"recent_cycles": 0,
"failure_rate": 0.0,
"message": "Snapshot failed",
},
"tokens": {"status": "unknown", "message": "Snapshot failed"},
}

View File

@@ -0,0 +1,377 @@
"""Quest system routes for agent token rewards.
Provides API endpoints for:
- Listing quests and their status
- Claiming quest rewards
- Getting quest leaderboard
- Quest progress tracking
"""
from __future__ import annotations
import logging
from typing import Any
from fastapi import APIRouter, Request
from fastapi.responses import HTMLResponse, JSONResponse
from pydantic import BaseModel
from dashboard.templating import templates
from timmy.quest_system import (
QuestStatus,
auto_evaluate_all_quests,
claim_quest_reward,
evaluate_quest_progress,
get_active_quests,
get_agent_quests_status,
get_quest_definition,
get_quest_leaderboard,
load_quest_config,
)
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/quests", tags=["quests"])
class ClaimQuestRequest(BaseModel):
"""Request to claim a quest reward."""
agent_id: str
quest_id: str
class EvaluateQuestRequest(BaseModel):
"""Request to manually evaluate quest progress."""
agent_id: str
quest_id: str
# ---------------------------------------------------------------------------
# API Endpoints
# ---------------------------------------------------------------------------
@router.get("/api/definitions")
async def get_quest_definitions_api() -> JSONResponse:
"""Get all quest definitions.
Returns:
JSON list of all quest definitions with their criteria.
"""
definitions = get_active_quests()
return JSONResponse(
{
"quests": [
{
"id": q.id,
"name": q.name,
"description": q.description,
"reward_tokens": q.reward_tokens,
"type": q.quest_type.value,
"repeatable": q.repeatable,
"cooldown_hours": q.cooldown_hours,
"criteria": q.criteria,
}
for q in definitions
]
}
)
@router.get("/api/status/{agent_id}")
async def get_agent_quest_status(agent_id: str) -> JSONResponse:
"""Get quest status for a specific agent.
Returns:
Complete quest status including progress, completion counts,
and tokens earned.
"""
status = get_agent_quests_status(agent_id)
return JSONResponse(status)
@router.post("/api/claim")
async def claim_quest_reward_api(request: ClaimQuestRequest) -> JSONResponse:
"""Claim a quest reward for an agent.
The quest must be completed but not yet claimed.
"""
reward = claim_quest_reward(request.quest_id, request.agent_id)
if not reward:
return JSONResponse(
{
"success": False,
"error": "Quest not completed, already claimed, or on cooldown",
},
status_code=400,
)
return JSONResponse(
{
"success": True,
"reward": reward,
}
)
@router.post("/api/evaluate")
async def evaluate_quest_api(request: EvaluateQuestRequest) -> JSONResponse:
"""Manually evaluate quest progress with provided context.
This is useful for testing or when the quest completion
needs to be triggered manually.
"""
quest = get_quest_definition(request.quest_id)
if not quest:
return JSONResponse(
{"success": False, "error": "Quest not found"},
status_code=404,
)
# Build evaluation context based on quest type
context = await _build_evaluation_context(quest)
progress = evaluate_quest_progress(request.quest_id, request.agent_id, context)
if not progress:
return JSONResponse(
{"success": False, "error": "Failed to evaluate quest"},
status_code=500,
)
# Auto-claim if completed
reward = None
if progress.status == QuestStatus.COMPLETED:
reward = claim_quest_reward(request.quest_id, request.agent_id)
return JSONResponse(
{
"success": True,
"progress": progress.to_dict(),
"reward": reward,
"completed": progress.status == QuestStatus.COMPLETED,
}
)
@router.get("/api/leaderboard")
async def get_leaderboard_api() -> JSONResponse:
"""Get the quest completion leaderboard.
Returns agents sorted by total tokens earned.
"""
leaderboard = get_quest_leaderboard()
return JSONResponse(
{
"leaderboard": leaderboard,
}
)
@router.post("/api/reload")
async def reload_quest_config_api() -> JSONResponse:
"""Reload quest configuration from quests.yaml.
Useful for applying quest changes without restarting.
"""
definitions, quest_settings = load_quest_config()
return JSONResponse(
{
"success": True,
"quests_loaded": len(definitions),
"settings": quest_settings,
}
)
# ---------------------------------------------------------------------------
# Dashboard UI Endpoints
# ---------------------------------------------------------------------------
@router.get("", response_class=HTMLResponse)
async def quests_dashboard(request: Request) -> HTMLResponse:
"""Main quests dashboard page."""
return templates.TemplateResponse(
request,
"quests.html",
{"agent_id": "current_user"},
)
@router.get("/panel/{agent_id}", response_class=HTMLResponse)
async def quests_panel(request: Request, agent_id: str) -> HTMLResponse:
"""Quest panel for HTMX partial updates."""
status = get_agent_quests_status(agent_id)
return templates.TemplateResponse(
request,
"partials/quests_panel.html",
{
"agent_id": agent_id,
"quests": status["quests"],
"total_tokens": status["total_tokens_earned"],
"completed_count": status["total_quests_completed"],
},
)
# ---------------------------------------------------------------------------
# Internal Functions
# ---------------------------------------------------------------------------
async def _build_evaluation_context(quest) -> dict[str, Any]:
"""Build evaluation context for a quest based on its type."""
context: dict[str, Any] = {}
if quest.quest_type.value == "issue_count":
# Fetch closed issues with relevant labels
context["closed_issues"] = await _fetch_closed_issues(
quest.criteria.get("issue_labels", [])
)
elif quest.quest_type.value == "issue_reduce":
# Fetch current and previous issue counts
labels = quest.criteria.get("issue_labels", [])
context["current_issue_count"] = await _fetch_open_issue_count(labels)
context["previous_issue_count"] = await _fetch_previous_issue_count(
labels, quest.criteria.get("lookback_days", 7)
)
elif quest.quest_type.value == "daily_run":
# Fetch Daily Run metrics
metrics = await _fetch_daily_run_metrics()
context["sessions_completed"] = metrics.get("sessions_completed", 0)
return context
async def _fetch_closed_issues(labels: list[str]) -> list[dict]:
"""Fetch closed issues matching the given labels."""
try:
from dashboard.routes.daily_run import GiteaClient, _load_config
config = _load_config()
token = _get_gitea_token(config)
client = GiteaClient(config, token)
if not client.is_available():
return []
# Build label filter
label_filter = ",".join(labels) if labels else ""
issues = client.get_paginated(
"issues",
{"state": "closed", "labels": label_filter, "limit": 100},
)
return issues
except Exception as exc:
logger.debug("Failed to fetch closed issues: %s", exc)
return []
async def _fetch_open_issue_count(labels: list[str]) -> int:
"""Fetch count of open issues with given labels."""
try:
from dashboard.routes.daily_run import GiteaClient, _load_config
config = _load_config()
token = _get_gitea_token(config)
client = GiteaClient(config, token)
if not client.is_available():
return 0
label_filter = ",".join(labels) if labels else ""
issues = client.get_paginated(
"issues",
{"state": "open", "labels": label_filter, "limit": 100},
)
return len(issues)
except Exception as exc:
logger.debug("Failed to fetch open issue count: %s", exc)
return 0
async def _fetch_previous_issue_count(labels: list[str], lookback_days: int) -> int:
"""Fetch previous issue count (simplified - uses current for now)."""
# This is a simplified implementation
# In production, you'd query historical data
return await _fetch_open_issue_count(labels)
async def _fetch_daily_run_metrics() -> dict[str, Any]:
"""Fetch Daily Run metrics."""
try:
from dashboard.routes.daily_run import _get_metrics
metrics = _get_metrics(lookback_days=7)
if metrics:
return {
"sessions_completed": metrics.sessions_completed,
"sessions_previous": metrics.sessions_previous,
}
except Exception as exc:
logger.debug("Failed to fetch Daily Run metrics: %s", exc)
return {"sessions_completed": 0, "sessions_previous": 0}
def _get_gitea_token(config: dict) -> str | None:
"""Get Gitea token from config."""
if "token" in config:
return config["token"]
from pathlib import Path
token_file = Path(config.get("token_file", "~/.hermes/gitea_token")).expanduser()
if token_file.exists():
return token_file.read_text().strip()
return None
# ---------------------------------------------------------------------------
# Daily Run Integration
# ---------------------------------------------------------------------------
async def check_daily_run_quests(agent_id: str = "system") -> list[dict]:
"""Check and award Daily Run related quests.
Called by the Daily Run system when metrics are updated.
Returns:
List of rewards awarded
"""
# Check if auto-detect is enabled
_, quest_settings = load_quest_config()
if not quest_settings.get("auto_detect_on_daily_run", True):
return []
# Build context from Daily Run metrics
metrics = await _fetch_daily_run_metrics()
context = {
"sessions_completed": metrics.get("sessions_completed", 0),
"sessions_previous": metrics.get("sessions_previous", 0),
}
# Add closed issues for issue_count quests
active_quests = get_active_quests()
for quest in active_quests:
if quest.quest_type.value == "issue_count":
labels = quest.criteria.get("issue_labels", [])
context["closed_issues"] = await _fetch_closed_issues(labels)
break # Only need to fetch once
# Evaluate all quests
rewards = auto_evaluate_all_quests(agent_id, context)
return rewards

View File

@@ -16,52 +16,11 @@ router = APIRouter(tags=["system"])
@router.get("/lightning/ledger", response_class=HTMLResponse)
async def lightning_ledger(request: Request):
"""Ledger and balance page."""
# Mock data for now, as this seems to be a UI-first feature
balance = {
"available_sats": 1337,
"incoming_total_sats": 2000,
"outgoing_total_sats": 663,
"fees_paid_sats": 5,
"net_sats": 1337,
"pending_incoming_sats": 0,
"pending_outgoing_sats": 0,
}
"""Ledger and balance page backed by the in-memory Lightning ledger."""
from lightning.ledger import get_balance, get_transactions
# Mock transactions
from collections import namedtuple
from enum import Enum
class TxType(Enum):
incoming = "incoming"
outgoing = "outgoing"
class TxStatus(Enum):
completed = "completed"
pending = "pending"
Tx = namedtuple(
"Tx", ["tx_type", "status", "amount_sats", "payment_hash", "memo", "created_at"]
)
transactions = [
Tx(
TxType.outgoing,
TxStatus.completed,
50,
"hash1",
"Model inference",
"2026-03-04 10:00:00",
),
Tx(
TxType.incoming,
TxStatus.completed,
1000,
"hash2",
"Manual deposit",
"2026-03-03 15:00:00",
),
]
balance = get_balance()
transactions = get_transactions()
return templates.TemplateResponse(
request,
@@ -70,7 +29,7 @@ async def lightning_ledger(request: Request):
"balance": balance,
"transactions": transactions,
"tx_types": ["incoming", "outgoing"],
"tx_statuses": ["completed", "pending"],
"tx_statuses": ["pending", "settled", "failed", "expired"],
"filter_type": None,
"filter_status": None,
"stats": {},

View File

@@ -5,7 +5,7 @@ import sqlite3
import uuid
from collections.abc import Generator
from contextlib import closing, contextmanager
from datetime import datetime
from datetime import UTC, datetime
from pathlib import Path
from fastapi import APIRouter, Form, HTTPException, Request
@@ -219,7 +219,7 @@ async def create_task_form(
raise HTTPException(status_code=400, detail="Task title cannot be empty")
task_id = str(uuid.uuid4())
now = datetime.utcnow().isoformat()
now = datetime.now(UTC).isoformat()
priority = priority if priority in VALID_PRIORITIES else "normal"
with _get_db() as db:
@@ -287,7 +287,7 @@ async def modify_task(
async def _set_status(request: Request, task_id: str, new_status: str):
"""Helper to update status and return refreshed task card."""
completed_at = (
datetime.utcnow().isoformat() if new_status in ("completed", "vetoed", "failed") else None
datetime.now(UTC).isoformat() if new_status in ("completed", "vetoed", "failed") else None
)
with _get_db() as db:
db.execute(
@@ -316,7 +316,7 @@ async def api_create_task(request: Request):
raise HTTPException(422, "title is required")
task_id = str(uuid.uuid4())
now = datetime.utcnow().isoformat()
now = datetime.now(UTC).isoformat()
priority = body.get("priority", "normal")
if priority not in VALID_PRIORITIES:
priority = "normal"
@@ -358,7 +358,7 @@ async def api_update_status(task_id: str, request: Request):
raise HTTPException(422, f"Invalid status. Must be one of: {VALID_STATUSES}")
completed_at = (
datetime.utcnow().isoformat() if new_status in ("completed", "vetoed", "failed") else None
datetime.now(UTC).isoformat() if new_status in ("completed", "vetoed", "failed") else None
)
with _get_db() as db:
db.execute(

View File

@@ -0,0 +1,108 @@
"""Tower dashboard — real-time Spark visualization via WebSocket.
GET /tower — HTML Tower dashboard (Thinking / Predicting / Advising)
WS /tower/ws — WebSocket stream of Spark engine state updates
"""
import asyncio
import json
import logging
from fastapi import APIRouter, Request, WebSocket
from fastapi.responses import HTMLResponse
from dashboard.templating import templates
from spark.engine import spark_engine
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/tower", tags=["tower"])
_PUSH_INTERVAL = 5 # seconds between state broadcasts
def _spark_snapshot() -> dict:
"""Build a JSON-serialisable snapshot of Spark state."""
status = spark_engine.status()
timeline = spark_engine.get_timeline(limit=10)
events = []
for ev in timeline:
entry = {
"event_type": ev.event_type,
"description": ev.description,
"importance": ev.importance,
"created_at": ev.created_at,
}
if ev.agent_id:
entry["agent_id"] = ev.agent_id[:8]
if ev.task_id:
entry["task_id"] = ev.task_id[:8]
try:
entry["data"] = json.loads(ev.data)
except (json.JSONDecodeError, TypeError):
entry["data"] = {}
events.append(entry)
predictions = spark_engine.get_predictions(limit=5)
preds = []
for p in predictions:
pred = {
"task_id": p.task_id[:8] if p.task_id else "?",
"accuracy": p.accuracy,
"evaluated": p.evaluated_at is not None,
"created_at": p.created_at,
}
try:
pred["predicted"] = json.loads(p.predicted_value)
except (json.JSONDecodeError, TypeError):
pred["predicted"] = {}
preds.append(pred)
advisories = spark_engine.get_advisories()
advs = [
{
"category": a.category,
"priority": a.priority,
"title": a.title,
"detail": a.detail,
"suggested_action": a.suggested_action,
}
for a in advisories
]
return {
"type": "spark_state",
"status": status,
"events": events,
"predictions": preds,
"advisories": advs,
}
@router.get("", response_class=HTMLResponse)
async def tower_ui(request: Request):
"""Render the Tower dashboard page."""
snapshot = _spark_snapshot()
return templates.TemplateResponse(
request,
"tower.html",
{"snapshot": snapshot},
)
@router.websocket("/ws")
async def tower_ws(websocket: WebSocket) -> None:
"""Stream Spark state snapshots to the Tower dashboard."""
await websocket.accept()
logger.info("Tower WS connected")
try:
# Send initial snapshot
await websocket.send_text(json.dumps(_spark_snapshot()))
while True:
await asyncio.sleep(_PUSH_INTERVAL)
await websocket.send_text(json.dumps(_spark_snapshot()))
except Exception:
logger.debug("Tower WS disconnected")

View File

@@ -59,6 +59,7 @@ async def tts_speak(text: str = Form(...)):
voice_tts.speak(text)
return {"spoken": True, "text": text}
except Exception as exc:
logger.exception("TTS speak failed")
return {"spoken": False, "reason": str(exc)}

View File

@@ -5,7 +5,7 @@ import sqlite3
import uuid
from collections.abc import Generator
from contextlib import closing, contextmanager
from datetime import datetime
from datetime import UTC, datetime
from pathlib import Path
from fastapi import APIRouter, Form, HTTPException, Request
@@ -144,7 +144,7 @@ async def submit_work_order(
related_files: str = Form(""),
):
wo_id = str(uuid.uuid4())
now = datetime.utcnow().isoformat()
now = datetime.now(UTC).isoformat()
priority = priority if priority in PRIORITIES else "medium"
category = category if category in CATEGORIES else "suggestion"
@@ -211,7 +211,7 @@ async def active_partial(request: Request):
async def _update_status(request: Request, wo_id: str, new_status: str, **extra):
completed_at = (
datetime.utcnow().isoformat() if new_status in ("completed", "rejected") else None
datetime.now(UTC).isoformat() if new_status in ("completed", "rejected") else None
)
with _get_db() as db:
sets = ["status=?", "completed_at=COALESCE(?, completed_at)"]

View File

@@ -17,16 +17,221 @@ or missing.
import asyncio
import json
import logging
import math
import re
import time
from collections import deque
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from fastapi import APIRouter, WebSocket
import yaml
from fastapi import APIRouter, Request, WebSocket
from fastapi.responses import JSONResponse
from pydantic import BaseModel
from config import settings
from infrastructure.presence import produce_bark, serialize_presence
from timmy.memory_system import search_memories
from timmy.workshop_state import PRESENCE_FILE
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/world", tags=["world"])
matrix_router = APIRouter(prefix="/api/matrix", tags=["matrix"])
# ---------------------------------------------------------------------------
# Matrix Bark Endpoint — HTTP fallback for bark messages
# ---------------------------------------------------------------------------
# Rate limiting: 1 request per 3 seconds per visitor_id
_BARK_RATE_LIMIT_SECONDS = 3
_bark_last_request: dict[str, float] = {}
class BarkRequest(BaseModel):
"""Request body for POST /api/matrix/bark."""
text: str
visitor_id: str
@matrix_router.post("/bark")
async def post_matrix_bark(request: BarkRequest) -> JSONResponse:
"""Generate a bark response for a visitor message.
HTTP fallback for when WebSocket isn't available. The Matrix frontend
can POST a message and get Timmy's bark response back as JSON.
Rate limited to 1 request per 3 seconds per visitor_id.
Request body:
- text: The visitor's message text
- visitor_id: Unique identifier for the visitor (used for rate limiting)
Returns:
- 200: Bark message in produce_bark() format
- 429: Rate limit exceeded (try again later)
- 422: Invalid request (missing/invalid fields)
"""
# Validate inputs
text = request.text.strip() if request.text else ""
visitor_id = request.visitor_id.strip() if request.visitor_id else ""
if not text:
return JSONResponse(
status_code=422,
content={"error": "text is required"},
)
if not visitor_id:
return JSONResponse(
status_code=422,
content={"error": "visitor_id is required"},
)
# Rate limiting check
now = time.time()
last_request = _bark_last_request.get(visitor_id, 0)
time_since_last = now - last_request
if time_since_last < _BARK_RATE_LIMIT_SECONDS:
retry_after = _BARK_RATE_LIMIT_SECONDS - time_since_last
return JSONResponse(
status_code=429,
content={"error": "Rate limit exceeded. Try again later."},
headers={"Retry-After": str(int(retry_after) + 1)},
)
# Record this request
_bark_last_request[visitor_id] = now
# Generate bark response
try:
reply = await _generate_bark(text)
except Exception as exc:
logger.warning("Bark generation failed: %s", exc)
reply = "Hmm, my thoughts are a bit tangled right now."
# Build bark response using produce_bark format
bark = produce_bark(agent_id="timmy", text=reply, style="speech")
return JSONResponse(
content=bark,
headers={"Cache-Control": "no-cache, no-store"},
)
# ---------------------------------------------------------------------------
# Matrix Agent Registry — serves agents to the Matrix visualization
# ---------------------------------------------------------------------------
# Agent color mapping — consistent with Matrix visual identity
_AGENT_COLORS: dict[str, str] = {
"timmy": "#FFD700", # Gold
"orchestrator": "#FFD700", # Gold
"perplexity": "#3B82F6", # Blue
"replit": "#F97316", # Orange
"kimi": "#06B6D4", # Cyan
"claude": "#A855F7", # Purple
"researcher": "#10B981", # Emerald
"coder": "#EF4444", # Red
"writer": "#EC4899", # Pink
"memory": "#8B5CF6", # Violet
"experimenter": "#14B8A6", # Teal
"forge": "#EF4444", # Red (coder alias)
"seer": "#10B981", # Emerald (researcher alias)
"quill": "#EC4899", # Pink (writer alias)
"echo": "#8B5CF6", # Violet (memory alias)
"lab": "#14B8A6", # Teal (experimenter alias)
}
# Agent shape mapping for 3D visualization
_AGENT_SHAPES: dict[str, str] = {
"timmy": "sphere",
"orchestrator": "sphere",
"perplexity": "cube",
"replit": "cylinder",
"kimi": "dodecahedron",
"claude": "octahedron",
"researcher": "icosahedron",
"coder": "cube",
"writer": "cone",
"memory": "torus",
"experimenter": "tetrahedron",
"forge": "cube",
"seer": "icosahedron",
"quill": "cone",
"echo": "torus",
"lab": "tetrahedron",
}
# Default fallback values
_DEFAULT_COLOR = "#9CA3AF" # Gray
_DEFAULT_SHAPE = "sphere"
_DEFAULT_STATUS = "available"
def _get_agent_color(agent_id: str) -> str:
"""Get the Matrix color for an agent."""
return _AGENT_COLORS.get(agent_id.lower(), _DEFAULT_COLOR)
def _get_agent_shape(agent_id: str) -> str:
"""Get the Matrix shape for an agent."""
return _AGENT_SHAPES.get(agent_id.lower(), _DEFAULT_SHAPE)
def _compute_circular_positions(count: int, radius: float = 3.0) -> list[dict[str, float]]:
"""Compute circular positions for agents in the Matrix.
Agents are arranged in a circle on the XZ plane at y=0.
"""
positions = []
for i in range(count):
angle = (2 * math.pi * i) / count
x = radius * math.cos(angle)
z = radius * math.sin(angle)
positions.append({"x": round(x, 2), "y": 0.0, "z": round(z, 2)})
return positions
def _build_matrix_agents_response() -> list[dict[str, Any]]:
"""Build the Matrix agent registry response.
Reads from agents.yaml and returns agents with Matrix-compatible
formatting including colors, shapes, and positions.
"""
try:
from timmy.agents.loader import list_agents
agents = list_agents()
if not agents:
return []
positions = _compute_circular_positions(len(agents))
result = []
for i, agent in enumerate(agents):
agent_id = agent.get("id", "")
result.append(
{
"id": agent_id,
"display_name": agent.get("name", agent_id.title()),
"role": agent.get("role", "general"),
"color": _get_agent_color(agent_id),
"position": positions[i],
"shape": _get_agent_shape(agent_id),
"status": agent.get("status", _DEFAULT_STATUS),
}
)
return result
except Exception as exc:
logger.warning("Failed to load agents for Matrix: %s", exc)
return []
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/world", tags=["world"])
@@ -149,21 +354,7 @@ def _read_presence_file() -> dict | None:
def _build_world_state(presence: dict) -> dict:
"""Transform presence dict into the world/state API response."""
return {
"timmyState": {
"mood": presence.get("mood", "calm"),
"activity": presence.get("current_focus", "idle"),
"energy": presence.get("energy", 0.5),
"confidence": presence.get("confidence", 0.7),
},
"familiar": presence.get("familiar"),
"activeThreads": presence.get("active_threads", []),
"recentEvents": presence.get("recent_events", []),
"concerns": presence.get("concerns", []),
"visitorPresent": False,
"updatedAt": presence.get("liveness", datetime.now(UTC).strftime("%Y-%m-%dT%H:%M:%SZ")),
"version": presence.get("version", 1),
}
return serialize_presence(presence)
def _get_current_state() -> dict:
@@ -224,6 +415,50 @@ async def _heartbeat(websocket: WebSocket) -> None:
logger.debug("Heartbeat stopped — connection gone")
async def _authenticate_ws(websocket: WebSocket) -> bool:
"""Authenticate WebSocket connection using matrix_ws_token.
Checks for token in query param ?token= first. If no query param,
accepts the connection and waits for first message with
{"type": "auth", "token": "..."}.
Returns True if authenticated (or if auth is disabled).
Returns False and closes connection with code 4001 if invalid.
"""
token_setting = settings.matrix_ws_token
# Auth disabled in dev mode (empty/unset token)
if not token_setting:
return True
# Check query param first (can validate before accept)
query_token = websocket.query_params.get("token", "")
if query_token:
if query_token == token_setting:
return True
# Invalid token in query param - we need to accept to close properly
await websocket.accept()
await websocket.close(code=4001, reason="Invalid token")
return False
# No query token - accept and wait for auth message
await websocket.accept()
# Wait for auth message as first message
try:
raw = await websocket.receive_text()
data = json.loads(raw)
if data.get("type") == "auth" and data.get("token") == token_setting:
return True
# Invalid auth message
await websocket.close(code=4001, reason="Invalid token")
return False
except (json.JSONDecodeError, TypeError):
# Non-JSON first message without valid token
await websocket.close(code=4001, reason="Authentication required")
return False
@router.websocket("/ws")
async def world_ws(websocket: WebSocket) -> None:
"""Accept a Workshop client and keep it alive for state broadcasts.
@@ -232,8 +467,28 @@ async def world_ws(websocket: WebSocket) -> None:
client never starts from a blank slate. Incoming frames are parsed
as JSON — ``visitor_message`` triggers a bark response. A background
heartbeat ping runs every 15 s to detect dead connections early.
Authentication:
- If matrix_ws_token is configured, clients must provide it via
?token= query param or in the first message as
{"type": "auth", "token": "..."}.
- Invalid token results in close code 4001.
- Valid token receives a connection_ack message.
"""
await websocket.accept()
# Authenticate (may accept connection internally)
is_authed = await _authenticate_ws(websocket)
if not is_authed:
logger.info("World WS connection rejected — invalid token")
return
# Auth passed - accept if not already accepted
if websocket.client_state.name != "CONNECTED":
await websocket.accept()
# Send connection_ack if auth was required
if settings.matrix_ws_token:
await websocket.send_text(json.dumps({"type": "connection_ack"}))
_ws_clients.append(websocket)
logger.info("World WS connected — %d clients", len(_ws_clients))
@@ -383,3 +638,428 @@ async def _generate_bark(visitor_text: str) -> str:
except Exception as exc:
logger.warning("Bark generation failed: %s", exc)
return "Hmm, my thoughts are a bit tangled right now."
# ---------------------------------------------------------------------------
# Matrix Configuration Endpoint
# ---------------------------------------------------------------------------
# Default Matrix configuration (fallback when matrix.yaml is missing/corrupt)
_DEFAULT_MATRIX_CONFIG: dict[str, Any] = {
"lighting": {
"ambient_color": "#1a1a2e",
"ambient_intensity": 0.4,
"point_lights": [
{"color": "#FFD700", "intensity": 1.2, "position": {"x": 0, "y": 5, "z": 0}},
{"color": "#3B82F6", "intensity": 0.8, "position": {"x": -5, "y": 3, "z": -5}},
{"color": "#A855F7", "intensity": 0.6, "position": {"x": 5, "y": 3, "z": 5}},
],
},
"environment": {
"rain_enabled": False,
"starfield_enabled": True,
"fog_color": "#0f0f23",
"fog_density": 0.02,
},
"features": {
"chat_enabled": True,
"visitor_avatars": True,
"pip_familiar": True,
"workshop_portal": True,
},
}
def _load_matrix_config() -> dict[str, Any]:
"""Load Matrix world configuration from matrix.yaml with fallback to defaults.
Returns a dict with sections: lighting, environment, features.
If the config file is missing or invalid, returns sensible defaults.
"""
try:
config_path = Path(settings.repo_root) / "config" / "matrix.yaml"
if not config_path.exists():
logger.debug("matrix.yaml not found, using default config")
return _DEFAULT_MATRIX_CONFIG.copy()
raw = config_path.read_text()
config = yaml.safe_load(raw)
if not isinstance(config, dict):
logger.warning("matrix.yaml invalid format, using defaults")
return _DEFAULT_MATRIX_CONFIG.copy()
# Merge with defaults to ensure all required fields exist
result: dict[str, Any] = {
"lighting": {
**_DEFAULT_MATRIX_CONFIG["lighting"],
**config.get("lighting", {}),
},
"environment": {
**_DEFAULT_MATRIX_CONFIG["environment"],
**config.get("environment", {}),
},
"features": {
**_DEFAULT_MATRIX_CONFIG["features"],
**config.get("features", {}),
},
}
# Ensure point_lights is a list
if "point_lights" in config.get("lighting", {}):
result["lighting"]["point_lights"] = config["lighting"]["point_lights"]
else:
result["lighting"]["point_lights"] = _DEFAULT_MATRIX_CONFIG["lighting"]["point_lights"]
return result
except Exception as exc:
logger.warning("Failed to load matrix config: %s, using defaults", exc)
return _DEFAULT_MATRIX_CONFIG.copy()
@matrix_router.get("/config")
async def get_matrix_config() -> JSONResponse:
"""Return Matrix world configuration.
Serves lighting presets, environment settings, and feature flags
to the Matrix frontend so it can be config-driven rather than
hardcoded. Reads from config/matrix.yaml with sensible defaults.
Response structure:
- lighting: ambient_color, ambient_intensity, point_lights[]
- environment: rain_enabled, starfield_enabled, fog_color, fog_density
- features: chat_enabled, visitor_avatars, pip_familiar, workshop_portal
"""
config = _load_matrix_config()
return JSONResponse(
content=config,
headers={"Cache-Control": "no-cache, no-store"},
)
# ---------------------------------------------------------------------------
# Matrix Agent Registry Endpoint
# ---------------------------------------------------------------------------
@matrix_router.get("/agents")
async def get_matrix_agents() -> JSONResponse:
"""Return the agent registry for Matrix visualization.
Serves agents from agents.yaml with Matrix-compatible formatting:
- id: agent identifier
- display_name: human-readable name
- role: functional role
- color: hex color code for visualization
- position: {x, y, z} coordinates in 3D space
- shape: 3D shape type
- status: availability status
Agents are arranged in a circular layout by default.
Returns 200 with empty list if no agents configured.
"""
agents = _build_matrix_agents_response()
return JSONResponse(
content=agents,
headers={"Cache-Control": "no-cache, no-store"},
)
# ---------------------------------------------------------------------------
# Matrix Thoughts Endpoint — Timmy's recent thought stream for Matrix display
# ---------------------------------------------------------------------------
_MAX_THOUGHT_LIMIT = 50 # Maximum thoughts allowed per request
_DEFAULT_THOUGHT_LIMIT = 10 # Default number of thoughts to return
_MAX_THOUGHT_TEXT_LEN = 500 # Max characters for thought text
def _build_matrix_thoughts_response(limit: int = _DEFAULT_THOUGHT_LIMIT) -> list[dict[str, Any]]:
"""Build the Matrix thoughts response from the thinking engine.
Returns recent thoughts formatted for Matrix display:
- id: thought UUID
- text: thought content (truncated to 500 chars)
- created_at: ISO-8601 timestamp
- chain_id: parent thought ID (or null if root thought)
Returns empty list if thinking engine is disabled or fails.
"""
try:
from timmy.thinking import thinking_engine
thoughts = thinking_engine.get_recent_thoughts(limit=limit)
return [
{
"id": t.id,
"text": t.content[:_MAX_THOUGHT_TEXT_LEN],
"created_at": t.created_at,
"chain_id": t.parent_id,
}
for t in thoughts
]
except Exception as exc:
logger.warning("Failed to load thoughts for Matrix: %s", exc)
return []
@matrix_router.get("/thoughts")
async def get_matrix_thoughts(limit: int = _DEFAULT_THOUGHT_LIMIT) -> JSONResponse:
"""Return Timmy's recent thoughts formatted for Matrix display.
This is the REST companion to the thought WebSocket messages,
allowing the Matrix frontend to display what Timmy is actually
thinking about rather than canned contextual lines.
Query params:
- limit: Number of thoughts to return (default 10, max 50)
Response: JSON array of thought objects:
- id: thought UUID
- text: thought content (truncated to 500 chars)
- created_at: ISO-8601 timestamp
- chain_id: parent thought ID (null if root thought)
Returns empty array if thinking engine is disabled or fails.
"""
# Clamp limit to valid range
if limit < 1:
limit = 1
elif limit > _MAX_THOUGHT_LIMIT:
limit = _MAX_THOUGHT_LIMIT
thoughts = _build_matrix_thoughts_response(limit=limit)
return JSONResponse(
content=thoughts,
headers={"Cache-Control": "no-cache, no-store"},
)
# ---------------------------------------------------------------------------
# Matrix Health Endpoint — backend capability discovery
# ---------------------------------------------------------------------------
# Health check cache (5-second TTL for capability checks)
_health_cache: dict | None = None
_health_cache_ts: float = 0.0
_HEALTH_CACHE_TTL = 5.0
def _check_capability_thinking() -> bool:
"""Check if thinking engine is available."""
try:
from timmy.thinking import thinking_engine
# Check if the engine has been initialized (has a db path)
return hasattr(thinking_engine, "_db") and thinking_engine._db is not None
except Exception:
return False
def _check_capability_memory() -> bool:
"""Check if memory system is available."""
try:
from timmy.memory_system import HOT_MEMORY_PATH
return HOT_MEMORY_PATH.exists()
except Exception:
return False
def _check_capability_bark() -> bool:
"""Check if bark production is available."""
try:
from infrastructure.presence import produce_bark
return callable(produce_bark)
except Exception:
return False
def _check_capability_familiar() -> bool:
"""Check if familiar (Pip) is available."""
try:
from timmy.familiar import pip_familiar
return pip_familiar is not None
except Exception:
return False
def _check_capability_lightning() -> bool:
"""Check if Lightning payments are available."""
# Lightning is currently disabled per health.py
# Returns False until properly re-implemented
return False
def _build_matrix_health_response() -> dict[str, Any]:
"""Build the Matrix health response with capability checks.
Performs lightweight checks (<100ms total) to determine which features
are available. Returns 200 even if some capabilities are degraded.
"""
capabilities = {
"thinking": _check_capability_thinking(),
"memory": _check_capability_memory(),
"bark": _check_capability_bark(),
"familiar": _check_capability_familiar(),
"lightning": _check_capability_lightning(),
}
# Status is ok if core capabilities (thinking, memory, bark) are available
core_caps = ["thinking", "memory", "bark"]
core_available = all(capabilities[c] for c in core_caps)
status = "ok" if core_available else "degraded"
return {
"status": status,
"version": "1.0.0",
"capabilities": capabilities,
}
@matrix_router.get("/health")
async def get_matrix_health() -> JSONResponse:
"""Return health status and capability availability for Matrix frontend.
This endpoint allows the Matrix frontend to discover what backend
capabilities are available so it can show/hide UI elements:
- thinking: Show thought bubbles if enabled
- memory: Show crystal ball memory search if available
- bark: Enable visitor chat responses
- familiar: Show Pip the familiar
- lightning: Enable payment features
Response time is <100ms (no heavy checks). Returns 200 even if
some capabilities are degraded.
Response:
- status: "ok" or "degraded"
- version: API version string
- capabilities: dict of feature:bool
"""
response = _build_matrix_health_response()
status_code = 200 # Always 200, even if degraded
return JSONResponse(
content=response,
status_code=status_code,
headers={"Cache-Control": "no-cache, no-store"},
)
# ---------------------------------------------------------------------------
# Matrix Memory Search Endpoint — visitors query Timmy's memory
# ---------------------------------------------------------------------------
# Rate limiting: 1 search per 5 seconds per IP
_MEMORY_SEARCH_RATE_LIMIT_SECONDS = 5
_memory_search_last_request: dict[str, float] = {}
_MAX_MEMORY_RESULTS = 5
_MAX_MEMORY_TEXT_LENGTH = 200
def _get_client_ip(request) -> str:
"""Extract client IP from request, respecting X-Forwarded-For header."""
# Check for forwarded IP (when behind proxy)
forwarded = request.headers.get("X-Forwarded-For")
if forwarded:
# Take the first IP in the chain
return forwarded.split(",")[0].strip()
# Fall back to direct client IP
if request.client:
return request.client.host
return "unknown"
def _build_matrix_memory_response(
memories: list,
) -> list[dict[str, Any]]:
"""Build the Matrix memory search response.
Formats memory entries for Matrix display:
- text: truncated to 200 characters
- relevance: 0-1 score from relevance_score
- created_at: ISO-8601 timestamp
- context_type: the memory type
Results are capped at _MAX_MEMORY_RESULTS.
"""
results = []
for mem in memories[:_MAX_MEMORY_RESULTS]:
text = mem.content
if len(text) > _MAX_MEMORY_TEXT_LENGTH:
text = text[:_MAX_MEMORY_TEXT_LENGTH] + "..."
results.append(
{
"text": text,
"relevance": round(mem.relevance_score or 0.0, 4),
"created_at": mem.timestamp,
"context_type": mem.context_type,
}
)
return results
@matrix_router.get("/memory/search")
async def get_matrix_memory_search(request: Request, q: str | None = None) -> JSONResponse:
"""Search Timmy's memory for relevant snippets.
Allows Matrix visitors to query Timmy's memory ("what do you remember
about sovereignty?"). Results appear as floating crystal-ball text
in the Workshop room.
Query params:
- q: Search query text (required)
Response: JSON array of memory objects:
- text: Memory content (truncated to 200 chars)
- relevance: Similarity score 0-1
- created_at: ISO-8601 timestamp
- context_type: Memory type (conversation, fact, etc.)
Rate limited to 1 search per 5 seconds per IP.
Returns:
- 200: JSON array of memory results (max 5)
- 400: Missing or empty query parameter
- 429: Rate limit exceeded
"""
# Validate query parameter
query = q.strip() if q else ""
if not query:
return JSONResponse(
status_code=400,
content={"error": "Query parameter 'q' is required"},
)
# Rate limiting check by IP
client_ip = _get_client_ip(request)
now = time.time()
last_request = _memory_search_last_request.get(client_ip, 0)
time_since_last = now - last_request
if time_since_last < _MEMORY_SEARCH_RATE_LIMIT_SECONDS:
retry_after = _MEMORY_SEARCH_RATE_LIMIT_SECONDS - time_since_last
return JSONResponse(
status_code=429,
content={"error": "Rate limit exceeded. Try again later."},
headers={"Retry-After": str(int(retry_after) + 1)},
)
# Record this request
_memory_search_last_request[client_ip] = now
# Search memories
try:
memories = search_memories(query, limit=_MAX_MEMORY_RESULTS)
results = _build_matrix_memory_response(memories)
except Exception as exc:
logger.warning("Memory search failed: %s", exc)
results = []
return JSONResponse(
content=results,
headers={"Cache-Control": "no-cache, no-store"},
)

View File

@@ -21,6 +21,11 @@
</div>
{% endcall %}
<!-- Daily Run Metrics (HTMX polled) -->
{% call panel("DAILY RUN", hx_get="/daily-run/panel", hx_trigger="every 60s") %}
<div class="mc-loading-placeholder">LOADING...</div>
{% endcall %}
</div>
<!-- Main panel — swappable via HTMX; defaults to Timmy on load -->

View File

@@ -138,6 +138,47 @@
</div>
</div>
<!-- Spark Intelligence -->
{% from "macros.html" import panel %}
<div class="mc-card-spaced">
<div class="card">
<div class="card-header">
<h2 class="card-title">Spark Intelligence</h2>
<div>
<span class="badge" id="spark-status-badge">Loading...</span>
</div>
</div>
<div class="grid grid-3">
<div class="stat">
<div class="stat-value" id="spark-events">-</div>
<div class="stat-label">Events</div>
</div>
<div class="stat">
<div class="stat-value" id="spark-memories">-</div>
<div class="stat-label">Memories</div>
</div>
<div class="stat">
<div class="stat-value" id="spark-predictions">-</div>
<div class="stat-label">Predictions</div>
</div>
</div>
</div>
<div class="grid grid-2 mc-section-gap">
{% call panel("SPARK TIMELINE", id="spark-timeline-panel",
hx_get="/spark/timeline",
hx_trigger="load, every 10s") %}
<div class="spark-timeline-scroll">
<p class="chat-history-placeholder">Loading timeline...</p>
</div>
{% endcall %}
{% call panel("SPARK INSIGHTS", id="spark-insights-panel",
hx_get="/spark/insights",
hx_trigger="load, every 30s") %}
<p class="chat-history-placeholder">Loading insights...</p>
{% endcall %}
</div>
</div>
<!-- Chat History -->
<div class="card mc-card-spaced">
<div class="card-header">
@@ -428,7 +469,34 @@ async function loadGrokStats() {
}
}
// Load Spark status
async function loadSparkStatus() {
try {
var response = await fetch('/spark');
var data = await response.json();
var st = data.status || {};
document.getElementById('spark-events').textContent = st.total_events || 0;
document.getElementById('spark-memories').textContent = st.total_memories || 0;
document.getElementById('spark-predictions').textContent = st.total_predictions || 0;
var badge = document.getElementById('spark-status-badge');
if (st.total_events > 0) {
badge.textContent = 'Active';
badge.className = 'badge badge-success';
} else {
badge.textContent = 'Idle';
badge.className = 'badge badge-warning';
}
} catch (error) {
var badge = document.getElementById('spark-status-badge');
badge.textContent = 'Offline';
badge.className = 'badge badge-danger';
}
}
// Initial load
loadSparkStatus();
loadSovereignty();
loadHealth();
loadSwarmStats();
@@ -442,5 +510,6 @@ setInterval(loadHealth, 10000);
setInterval(loadSwarmStats, 5000);
setInterval(updateHeartbeat, 5000);
setInterval(loadGrokStats, 10000);
setInterval(loadSparkStatus, 15000);
</script>
{% endblock %}

View File

@@ -0,0 +1,54 @@
<div class="card-header mc-panel-header">// DAILY RUN METRICS</div>
<div class="card-body p-3">
{% if not gitea_available %}
<div class="mc-muted" style="font-size: 0.85rem; padding: 8px 0;">
<span style="color: var(--amber);"></span> Gitea API unavailable
</div>
{% else %}
{% set m = metrics %}
<!-- Sessions summary -->
<div class="dr-section" style="margin-bottom: 16px;">
<div class="dr-row" style="display: flex; justify-content: space-between; align-items: center; margin-bottom: 8px;">
<span class="dr-label" style="font-size: 0.85rem; color: var(--text-dim);">Sessions ({{ m.lookback_days }}d)</span>
<a href="{{ logbook_url }}" target="_blank" class="dr-link" style="font-size: 0.75rem; color: var(--green); text-decoration: none;">
Logbook →
</a>
</div>
<div class="dr-stat" style="display: flex; align-items: baseline; gap: 8px;">
<span class="dr-value" style="font-size: 1.5rem; font-weight: 600; color: var(--text-bright);">{{ m.sessions_completed }}</span>
<span class="dr-trend" style="font-size: 0.9rem; color: {{ m.sessions_trend_color }};">{{ m.sessions_trend }}</span>
<span class="dr-prev" style="font-size: 0.75rem; color: var(--text-dim);">vs {{ m.sessions_previous }} prev</span>
</div>
</div>
<!-- Layer breakdown -->
<div class="dr-section">
<div class="dr-label" style="font-size: 0.85rem; color: var(--text-dim); margin-bottom: 8px;">Issues by Layer</div>
<div class="dr-layers" style="display: flex; flex-direction: column; gap: 6px;">
{% for layer in m.layers %}
<div class="dr-layer-row" style="display: flex; justify-content: space-between; align-items: center;">
<a href="{{ layer_urls[layer.name] }}" target="_blank" class="dr-layer-name" style="font-size: 0.8rem; color: var(--text); text-decoration: none; text-transform: capitalize;">
{{ layer.name.replace('-', ' ') }}
</a>
<div class="dr-layer-stat" style="display: flex; align-items: center; gap: 6px;">
<span class="dr-layer-value" style="font-size: 0.9rem; font-weight: 500; color: var(--text-bright);">{{ layer.current_count }}</span>
<span class="dr-layer-trend" style="font-size: 0.75rem; color: {{ layer.trend_color }}; width: 18px; text-align: center;">{{ layer.trend }}</span>
</div>
</div>
{% endfor %}
</div>
</div>
<!-- Total touched -->
<div class="dr-section" style="margin-top: 12px; padding-top: 12px; border-top: 1px solid var(--border);">
<div class="dr-row" style="display: flex; justify-content: space-between; align-items: center;">
<span class="dr-label" style="font-size: 0.8rem; color: var(--text-dim);">Total Issues Touched</span>
<div class="dr-total-stat" style="display: flex; align-items: center; gap: 6px;">
<span class="dr-total-value" style="font-size: 1rem; font-weight: 600; color: var(--text-bright);">{{ m.total_touched_current }}</span>
<span class="dr-total-prev" style="font-size: 0.7rem; color: var(--text-dim);">/ {{ m.total_touched_previous }} prev</span>
</div>
</div>
</div>
{% endif %}
</div>

View File

@@ -0,0 +1,80 @@
{% from "macros.html" import panel %}
<div class="quests-summary mb-4">
<div class="row">
<div class="col-md-4">
<div class="stat-card">
<div class="stat-value">{{ total_tokens }}</div>
<div class="stat-label">Tokens Earned</div>
</div>
</div>
<div class="col-md-4">
<div class="stat-card">
<div class="stat-value">{{ completed_count }}</div>
<div class="stat-label">Quests Completed</div>
</div>
</div>
<div class="col-md-4">
<div class="stat-card">
<div class="stat-value">{{ quests|selectattr('enabled', 'equalto', true)|list|length }}</div>
<div class="stat-label">Active Quests</div>
</div>
</div>
</div>
</div>
<div class="quests-list">
{% for quest in quests %}
{% if quest.enabled %}
<div class="quest-card quest-status-{{ quest.status }}">
<div class="quest-header">
<h5 class="quest-name">{{ quest.name }}</h5>
<span class="quest-reward">+{{ quest.reward_tokens }} ⚡</span>
</div>
<p class="quest-description">{{ quest.description }}</p>
<div class="quest-progress">
{% if quest.status == 'completed' %}
<div class="progress">
<div class="progress-bar bg-success" style="width: 100%"></div>
</div>
<span class="quest-status-badge completed">Completed</span>
{% elif quest.status == 'claimed' %}
<div class="progress">
<div class="progress-bar bg-success" style="width: 100%"></div>
</div>
<span class="quest-status-badge claimed">Reward Claimed</span>
{% elif quest.on_cooldown %}
<div class="progress">
<div class="progress-bar bg-secondary" style="width: 100%"></div>
</div>
<span class="quest-status-badge cooldown">
Cooldown: {{ quest.cooldown_hours_remaining }}h remaining
</span>
{% else %}
<div class="progress">
<div class="progress-bar" style="width: {{ (quest.current_value / quest.target_value * 100)|int }}%"></div>
</div>
<span class="quest-progress-text">{{ quest.current_value }} / {{ quest.target_value }}</span>
{% endif %}
</div>
<div class="quest-meta">
<span class="quest-type">{{ quest.type }}</span>
{% if quest.repeatable %}
<span class="quest-repeatable">↻ Repeatable</span>
{% endif %}
{% if quest.completion_count > 0 %}
<span class="quest-completions">Completed {{ quest.completion_count }} time{% if quest.completion_count != 1 %}s{% endif %}</span>
{% endif %}
</div>
</div>
{% endif %}
{% endfor %}
</div>
{% if not quests|selectattr('enabled', 'equalto', true)|list|length %}
<div class="alert alert-info">
No active quests available. Check back later or contact an administrator.
</div>
{% endif %}

View File

@@ -0,0 +1,50 @@
{% extends "base.html" %}
{% block title %}Quests — Mission Control{% endblock %}
{% block content %}
<div class="container-fluid">
<div class="row">
<div class="col-12">
<h1 class="mc-title">Token Quests</h1>
<p class="mc-subtitle">Complete quests to earn bonus tokens</p>
</div>
</div>
<div class="row mt-4">
<div class="col-md-8">
<div id="quests-panel" hx-get="/quests/panel/{{ agent_id }}" hx-trigger="load, every 30s">
<div class="mc-loading">Loading quests...</div>
</div>
</div>
<div class="col-md-4">
<div class="card mc-panel">
<div class="card-header">
<h5 class="mb-0">Leaderboard</h5>
</div>
<div class="card-body">
<div id="leaderboard" hx-get="/quests/api/leaderboard" hx-trigger="load, every 60s">
<div class="mc-loading">Loading leaderboard...</div>
</div>
</div>
</div>
<div class="card mc-panel mt-4">
<div class="card-header">
<h5 class="mb-0">About Quests</h5>
</div>
<div class="card-body">
<p class="mb-2">Quests are special objectives that reward tokens upon completion.</p>
<ul class="mc-list mb-0">
<li>Complete Daily Run sessions</li>
<li>Close flaky-test issues</li>
<li>Reduce P1 issue backlog</li>
<li>Improve documentation</li>
</ul>
</div>
</div>
</div>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,180 @@
{% extends "base.html" %}
{% block title %}Timmy Time — Tower{% endblock %}
{% block extra_styles %}{% endblock %}
{% block content %}
<div class="container-fluid tower-container py-3">
<div class="tower-header">
<div class="tower-title">TOWER</div>
<div class="tower-subtitle">
Real-time Spark visualization &mdash;
<span id="tower-conn" class="tower-conn-badge tower-conn-connecting">CONNECTING</span>
</div>
</div>
<div class="row g-3">
<!-- Left: THINKING (events) -->
<div class="col-12 col-lg-4 d-flex flex-column gap-3">
<div class="card mc-panel tower-phase-card">
<div class="card-header mc-panel-header tower-phase-thinking">// THINKING</div>
<div class="card-body p-3 tower-scroll" id="tower-events">
<div class="tower-empty">Waiting for Spark data&hellip;</div>
</div>
</div>
</div>
<!-- Middle: PREDICTING (EIDOS) -->
<div class="col-12 col-lg-4 d-flex flex-column gap-3">
<div class="card mc-panel tower-phase-card">
<div class="card-header mc-panel-header tower-phase-predicting">// PREDICTING</div>
<div class="card-body p-3" id="tower-predictions">
<div class="tower-empty">Waiting for Spark data&hellip;</div>
</div>
</div>
<div class="card mc-panel">
<div class="card-header mc-panel-header">// EIDOS STATS</div>
<div class="card-body p-3">
<div class="tower-stat-grid" id="tower-stats">
<div class="tower-stat"><span class="tower-stat-label">EVENTS</span><span class="tower-stat-value" id="ts-events">0</span></div>
<div class="tower-stat"><span class="tower-stat-label">MEMORIES</span><span class="tower-stat-value" id="ts-memories">0</span></div>
<div class="tower-stat"><span class="tower-stat-label">PREDICTIONS</span><span class="tower-stat-value" id="ts-preds">0</span></div>
<div class="tower-stat"><span class="tower-stat-label">ACCURACY</span><span class="tower-stat-value" id="ts-accuracy"></span></div>
</div>
</div>
</div>
</div>
<!-- Right: ADVISING -->
<div class="col-12 col-lg-4 d-flex flex-column gap-3">
<div class="card mc-panel tower-phase-card">
<div class="card-header mc-panel-header tower-phase-advising">// ADVISING</div>
<div class="card-body p-3 tower-scroll" id="tower-advisories">
<div class="tower-empty">Waiting for Spark data&hellip;</div>
</div>
</div>
</div>
</div>
</div>
<script>
(function() {
var ws = null;
var badge = document.getElementById('tower-conn');
function setConn(state) {
badge.textContent = state.toUpperCase();
badge.className = 'tower-conn-badge tower-conn-' + state;
}
function esc(s) { var d = document.createElement('div'); d.textContent = s; return d.innerHTML; }
function renderEvents(events) {
var el = document.getElementById('tower-events');
if (!events || !events.length) { el.innerHTML = '<div class="tower-empty">No events captured yet.</div>'; return; }
var html = '';
for (var i = 0; i < events.length; i++) {
var ev = events[i];
var dots = ev.importance >= 0.8 ? '\u25cf\u25cf\u25cf' : ev.importance >= 0.5 ? '\u25cf\u25cf' : '\u25cf';
html += '<div class="tower-event tower-etype-' + esc(ev.event_type) + '">'
+ '<div class="tower-ev-head">'
+ '<span class="tower-ev-badge">' + esc(ev.event_type.replace(/_/g, ' ').toUpperCase()) + '</span>'
+ '<span class="tower-ev-dots">' + dots + '</span>'
+ '</div>'
+ '<div class="tower-ev-desc">' + esc(ev.description) + '</div>'
+ '<div class="tower-ev-time">' + esc((ev.created_at || '').slice(0, 19)) + '</div>'
+ '</div>';
}
el.innerHTML = html;
}
function renderPredictions(preds) {
var el = document.getElementById('tower-predictions');
if (!preds || !preds.length) { el.innerHTML = '<div class="tower-empty">No predictions yet.</div>'; return; }
var html = '';
for (var i = 0; i < preds.length; i++) {
var p = preds[i];
var cls = p.evaluated ? 'tower-pred-done' : 'tower-pred-pending';
var accTxt = p.accuracy != null ? Math.round(p.accuracy * 100) + '%' : 'PENDING';
var accCls = p.accuracy != null ? (p.accuracy >= 0.7 ? 'text-success' : p.accuracy < 0.4 ? 'text-danger' : 'text-warning') : '';
html += '<div class="tower-pred ' + cls + '">'
+ '<div class="tower-pred-head">'
+ '<span class="tower-pred-task">' + esc(p.task_id) + '</span>'
+ '<span class="tower-pred-acc ' + accCls + '">' + accTxt + '</span>'
+ '</div>';
if (p.predicted) {
var pr = p.predicted;
html += '<div class="tower-pred-detail">';
if (pr.likely_winner) html += '<span>Winner: ' + esc(pr.likely_winner.slice(0, 8)) + '</span> ';
if (pr.success_probability != null) html += '<span>Success: ' + Math.round(pr.success_probability * 100) + '%</span> ';
html += '</div>';
}
html += '<div class="tower-ev-time">' + esc((p.created_at || '').slice(0, 19)) + '</div>'
+ '</div>';
}
el.innerHTML = html;
}
function renderAdvisories(advs) {
var el = document.getElementById('tower-advisories');
if (!advs || !advs.length) { el.innerHTML = '<div class="tower-empty">No advisories yet.</div>'; return; }
var html = '';
for (var i = 0; i < advs.length; i++) {
var a = advs[i];
var prio = a.priority >= 0.7 ? 'high' : a.priority >= 0.4 ? 'medium' : 'low';
html += '<div class="tower-advisory tower-adv-' + prio + '">'
+ '<div class="tower-adv-head">'
+ '<span class="tower-adv-cat">' + esc(a.category.replace(/_/g, ' ').toUpperCase()) + '</span>'
+ '<span class="tower-adv-prio">' + Math.round(a.priority * 100) + '%</span>'
+ '</div>'
+ '<div class="tower-adv-title">' + esc(a.title) + '</div>'
+ '<div class="tower-adv-detail">' + esc(a.detail) + '</div>'
+ '<div class="tower-adv-action">' + esc(a.suggested_action) + '</div>'
+ '</div>';
}
el.innerHTML = html;
}
function renderStats(status) {
if (!status) return;
document.getElementById('ts-events').textContent = status.events_captured || 0;
document.getElementById('ts-memories').textContent = status.memories_stored || 0;
var p = status.predictions || {};
document.getElementById('ts-preds').textContent = p.total_predictions || 0;
var acc = p.avg_accuracy;
var accEl = document.getElementById('ts-accuracy');
if (acc != null) {
accEl.textContent = Math.round(acc * 100) + '%';
accEl.className = 'tower-stat-value ' + (acc >= 0.7 ? 'text-success' : acc < 0.4 ? 'text-danger' : 'text-warning');
} else {
accEl.textContent = '\u2014';
}
}
function handleMsg(data) {
if (data.type !== 'spark_state') return;
renderEvents(data.events);
renderPredictions(data.predictions);
renderAdvisories(data.advisories);
renderStats(data.status);
}
function connect() {
var proto = location.protocol === 'https:' ? 'wss:' : 'ws:';
ws = new WebSocket(proto + '//' + location.host + '/tower/ws');
ws.onopen = function() { setConn('live'); };
ws.onclose = function() { setConn('offline'); setTimeout(connect, 3000); };
ws.onerror = function() { setConn('offline'); };
ws.onmessage = function(e) {
try { handleMsg(JSON.parse(e.data)); } catch(err) { console.error('Tower WS parse error', err); }
};
}
connect();
})();
</script>
{% endblock %}

View File

@@ -0,0 +1,84 @@
"""Thread-local SQLite connection pool.
Provides a ConnectionPool class that manages SQLite connections per thread,
with support for context managers and automatic cleanup.
"""
import sqlite3
import threading
from collections.abc import Generator
from contextlib import contextmanager
from pathlib import Path
class ConnectionPool:
"""Thread-local SQLite connection pool.
Each thread gets its own connection, which is reused for subsequent
requests from the same thread. Connections are automatically cleaned
up when close_connection() is called or the context manager exits.
"""
def __init__(self, db_path: Path | str) -> None:
"""Initialize the connection pool.
Args:
db_path: Path to the SQLite database file.
"""
self._db_path = Path(db_path)
self._local = threading.local()
def _ensure_db_exists(self) -> None:
"""Ensure the database directory exists."""
self._db_path.parent.mkdir(parents=True, exist_ok=True)
def get_connection(self) -> sqlite3.Connection:
"""Get a connection for the current thread.
Creates a new connection if one doesn't exist for this thread,
otherwise returns the existing connection.
Returns:
A sqlite3 Connection object.
"""
if not hasattr(self._local, "conn") or self._local.conn is None:
self._ensure_db_exists()
self._local.conn = sqlite3.connect(str(self._db_path), check_same_thread=False)
self._local.conn.row_factory = sqlite3.Row
return self._local.conn
def close_connection(self) -> None:
"""Close the connection for the current thread.
Cleans up the thread-local storage. Safe to call even if
no connection exists for this thread.
"""
if hasattr(self._local, "conn") and self._local.conn is not None:
self._local.conn.close()
self._local.conn = None
@contextmanager
def connection(self) -> Generator[sqlite3.Connection, None, None]:
"""Context manager for getting and automatically closing a connection.
Yields:
A sqlite3 Connection object.
Example:
with pool.connection() as conn:
cursor = conn.execute("SELECT 1")
result = cursor.fetchone()
"""
conn = self.get_connection()
try:
yield conn
finally:
self.close_connection()
def close_all(self) -> None:
"""Close all connections (useful for testing).
Note: This only closes the connection for the current thread.
In a multi-threaded environment, each thread must close its own.
"""
self.close_connection()

View File

@@ -149,6 +149,52 @@ def _log_error_event(
logger.debug("Failed to log error event: %s", log_exc)
def _build_report_description(
exc: Exception,
source: str,
context: dict | None,
error_hash: str,
tb_str: str,
affected_file: str,
affected_line: int,
git_ctx: dict,
) -> str:
"""Build the markdown description for a bug report task."""
parts = [
f"**Error:** {type(exc).__name__}: {str(exc)}",
f"**Source:** {source}",
f"**File:** {affected_file}:{affected_line}",
f"**Git:** {git_ctx.get('branch', '?')} @ {git_ctx.get('commit', '?')}",
f"**Time:** {datetime.now(UTC).isoformat()}",
f"**Hash:** {error_hash}",
]
if context:
ctx_str = ", ".join(f"{k}={v}" for k, v in context.items())
parts.append(f"**Context:** {ctx_str}")
parts.append(f"\n**Stack Trace:**\n```\n{tb_str[:2000]}\n```")
return "\n".join(parts)
def _log_bug_report_created(source: str, task_id: str, error_hash: str, title: str) -> None:
"""Log a BUG_REPORT_CREATED event (best-effort)."""
try:
from swarm.event_log import EventType, log_event
log_event(
EventType.BUG_REPORT_CREATED,
source=source,
task_id=task_id,
data={
"error_hash": error_hash,
"title": title[:100],
},
)
except Exception as exc:
logger.warning("Bug report event log error: %s", exc)
def _create_bug_report(
exc: Exception,
source: str,
@@ -164,25 +210,20 @@ def _create_bug_report(
from swarm.task_queue.models import create_task
title = f"[BUG] {type(exc).__name__}: {str(exc)[:80]}"
description_parts = [
f"**Error:** {type(exc).__name__}: {str(exc)}",
f"**Source:** {source}",
f"**File:** {affected_file}:{affected_line}",
f"**Git:** {git_ctx.get('branch', '?')} @ {git_ctx.get('commit', '?')}",
f"**Time:** {datetime.now(UTC).isoformat()}",
f"**Hash:** {error_hash}",
]
if context:
ctx_str = ", ".join(f"{k}={v}" for k, v in context.items())
description_parts.append(f"**Context:** {ctx_str}")
description_parts.append(f"\n**Stack Trace:**\n```\n{tb_str[:2000]}\n```")
description = _build_report_description(
exc,
source,
context,
error_hash,
tb_str,
affected_file,
affected_line,
git_ctx,
)
task = create_task(
title=title,
description="\n".join(description_parts),
description=description,
assigned_to="default",
created_by="system",
priority="normal",
@@ -190,24 +231,9 @@ def _create_bug_report(
auto_approve=True,
task_type="bug_report",
)
task_id = task.id
try:
from swarm.event_log import EventType, log_event
log_event(
EventType.BUG_REPORT_CREATED,
source=source,
task_id=task_id,
data={
"error_hash": error_hash,
"title": title[:100],
},
)
except Exception as exc:
logger.warning("Bug report screenshot error: %s", exc)
return task_id
_log_bug_report_created(source, task.id, error_hash, title)
return task.id
except Exception as task_exc:
logger.debug("Failed to create bug report task: %s", task_exc)

View File

@@ -64,7 +64,7 @@ class EventBus:
@bus.subscribe("agent.task.*")
async def handle_task(event: Event):
logger.debug(f"Task event: {event.data}")
logger.debug("Task event: %s", event.data)
await bus.publish(Event(
type="agent.task.assigned",

View File

@@ -144,6 +144,65 @@ class ShellHand:
return None
@staticmethod
def _build_run_env(env: dict | None) -> dict:
"""Merge *env* overrides into a copy of the current environment."""
import os
run_env = os.environ.copy()
if env:
run_env.update(env)
return run_env
async def _execute_subprocess(
self,
command: str,
effective_timeout: int,
cwd: str | None,
run_env: dict,
start: float,
) -> ShellResult:
"""Run *command* as a subprocess with timeout enforcement."""
proc = await asyncio.create_subprocess_shell(
command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=cwd,
env=run_env,
)
try:
stdout_bytes, stderr_bytes = await asyncio.wait_for(
proc.communicate(), timeout=effective_timeout
)
except TimeoutError:
proc.kill()
await proc.wait()
latency = (time.time() - start) * 1000
logger.warning("Shell command timed out after %ds: %s", effective_timeout, command)
return ShellResult(
command=command,
success=False,
exit_code=-1,
error=f"Command timed out after {effective_timeout}s",
latency_ms=latency,
timed_out=True,
)
latency = (time.time() - start) * 1000
exit_code = proc.returncode if proc.returncode is not None else -1
stdout = stdout_bytes.decode("utf-8", errors="replace").strip()
stderr = stderr_bytes.decode("utf-8", errors="replace").strip()
return ShellResult(
command=command,
success=exit_code == 0,
exit_code=exit_code,
stdout=stdout,
stderr=stderr,
latency_ms=latency,
)
async def run(
self,
command: str,
@@ -164,7 +223,6 @@ class ShellHand:
"""
start = time.time()
# Validate
validation_error = self._validate_command(command)
if validation_error:
return ShellResult(
@@ -178,52 +236,8 @@ class ShellHand:
cwd = working_dir or self._working_dir
try:
import os
run_env = os.environ.copy()
if env:
run_env.update(env)
proc = await asyncio.create_subprocess_shell(
command,
stdout=asyncio.subprocess.PIPE,
stderr=asyncio.subprocess.PIPE,
cwd=cwd,
env=run_env,
)
try:
stdout_bytes, stderr_bytes = await asyncio.wait_for(
proc.communicate(), timeout=effective_timeout
)
except TimeoutError:
proc.kill()
await proc.wait()
latency = (time.time() - start) * 1000
logger.warning("Shell command timed out after %ds: %s", effective_timeout, command)
return ShellResult(
command=command,
success=False,
exit_code=-1,
error=f"Command timed out after {effective_timeout}s",
latency_ms=latency,
timed_out=True,
)
latency = (time.time() - start) * 1000
exit_code = proc.returncode if proc.returncode is not None else -1
stdout = stdout_bytes.decode("utf-8", errors="replace").strip()
stderr = stderr_bytes.decode("utf-8", errors="replace").strip()
return ShellResult(
command=command,
success=exit_code == 0,
exit_code=exit_code,
stdout=stdout,
stderr=stderr,
latency_ms=latency,
)
run_env = self._build_run_env(env)
return await self._execute_subprocess(command, effective_timeout, cwd, run_env, start)
except Exception as exc:
latency = (time.time() - start) * 1000
logger.warning("Shell command failed: %s%s", command, exc)

View File

@@ -0,0 +1,266 @@
"""Matrix configuration loader utility.
Provides a typed dataclass for Matrix world configuration and a loader
that fetches settings from YAML with sensible defaults.
"""
import logging
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
import yaml
logger = logging.getLogger(__name__)
@dataclass
class PointLight:
"""A single point light in the Matrix world."""
color: str = "#FFFFFF"
intensity: float = 1.0
position: dict[str, float] = field(default_factory=lambda: {"x": 0, "y": 0, "z": 0})
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "PointLight":
"""Create a PointLight from a dictionary with defaults."""
return cls(
color=data.get("color", "#FFFFFF"),
intensity=data.get("intensity", 1.0),
position=data.get("position", {"x": 0, "y": 0, "z": 0}),
)
def _default_point_lights_factory() -> list[PointLight]:
"""Factory function for default point lights."""
return [
PointLight(
color="#FFAA55", # Warm amber (Workshop)
intensity=1.2,
position={"x": 0, "y": 5, "z": 0},
),
PointLight(
color="#3B82F6", # Cool blue (Matrix)
intensity=0.8,
position={"x": -5, "y": 3, "z": -5},
),
PointLight(
color="#A855F7", # Purple accent
intensity=0.6,
position={"x": 5, "y": 3, "z": 5},
),
]
@dataclass
class LightingConfig:
"""Lighting configuration for the Matrix world."""
ambient_color: str = "#FFAA55" # Warm amber (Workshop warmth)
ambient_intensity: float = 0.5
point_lights: list[PointLight] = field(default_factory=_default_point_lights_factory)
@classmethod
def from_dict(cls, data: dict[str, Any] | None) -> "LightingConfig":
"""Create a LightingConfig from a dictionary with defaults."""
if data is None:
data = {}
point_lights_data = data.get("point_lights", [])
point_lights = (
[PointLight.from_dict(pl) for pl in point_lights_data]
if point_lights_data
else _default_point_lights_factory()
)
return cls(
ambient_color=data.get("ambient_color", "#FFAA55"),
ambient_intensity=data.get("ambient_intensity", 0.5),
point_lights=point_lights,
)
@dataclass
class EnvironmentConfig:
"""Environment settings for the Matrix world."""
rain_enabled: bool = False
starfield_enabled: bool = True
fog_color: str = "#0f0f23"
fog_density: float = 0.02
@classmethod
def from_dict(cls, data: dict[str, Any] | None) -> "EnvironmentConfig":
"""Create an EnvironmentConfig from a dictionary with defaults."""
if data is None:
data = {}
return cls(
rain_enabled=data.get("rain_enabled", False),
starfield_enabled=data.get("starfield_enabled", True),
fog_color=data.get("fog_color", "#0f0f23"),
fog_density=data.get("fog_density", 0.02),
)
@dataclass
class FeaturesConfig:
"""Feature toggles for the Matrix world."""
chat_enabled: bool = True
visitor_avatars: bool = True
pip_familiar: bool = True
workshop_portal: bool = True
@classmethod
def from_dict(cls, data: dict[str, Any] | None) -> "FeaturesConfig":
"""Create a FeaturesConfig from a dictionary with defaults."""
if data is None:
data = {}
return cls(
chat_enabled=data.get("chat_enabled", True),
visitor_avatars=data.get("visitor_avatars", True),
pip_familiar=data.get("pip_familiar", True),
workshop_portal=data.get("workshop_portal", True),
)
@dataclass
class AgentConfig:
"""Configuration for a single Matrix agent."""
name: str = ""
role: str = ""
enabled: bool = True
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "AgentConfig":
"""Create an AgentConfig from a dictionary with defaults."""
return cls(
name=data.get("name", ""),
role=data.get("role", ""),
enabled=data.get("enabled", True),
)
@dataclass
class AgentsConfig:
"""Agent registry configuration."""
default_count: int = 5
max_count: int = 20
agents: list[AgentConfig] = field(default_factory=list)
@classmethod
def from_dict(cls, data: dict[str, Any] | None) -> "AgentsConfig":
"""Create an AgentsConfig from a dictionary with defaults."""
if data is None:
data = {}
agents_data = data.get("agents", [])
agents = [AgentConfig.from_dict(a) for a in agents_data] if agents_data else []
return cls(
default_count=data.get("default_count", 5),
max_count=data.get("max_count", 20),
agents=agents,
)
@dataclass
class MatrixConfig:
"""Complete Matrix world configuration.
Combines lighting, environment, features, and agent settings
into a single configuration object.
"""
lighting: LightingConfig = field(default_factory=LightingConfig)
environment: EnvironmentConfig = field(default_factory=EnvironmentConfig)
features: FeaturesConfig = field(default_factory=FeaturesConfig)
agents: AgentsConfig = field(default_factory=AgentsConfig)
@classmethod
def from_dict(cls, data: dict[str, Any] | None) -> "MatrixConfig":
"""Create a MatrixConfig from a dictionary with defaults for missing sections."""
if data is None:
data = {}
return cls(
lighting=LightingConfig.from_dict(data.get("lighting")),
environment=EnvironmentConfig.from_dict(data.get("environment")),
features=FeaturesConfig.from_dict(data.get("features")),
agents=AgentsConfig.from_dict(data.get("agents")),
)
def to_dict(self) -> dict[str, Any]:
"""Convert the configuration to a plain dictionary."""
return {
"lighting": {
"ambient_color": self.lighting.ambient_color,
"ambient_intensity": self.lighting.ambient_intensity,
"point_lights": [
{
"color": pl.color,
"intensity": pl.intensity,
"position": pl.position,
}
for pl in self.lighting.point_lights
],
},
"environment": {
"rain_enabled": self.environment.rain_enabled,
"starfield_enabled": self.environment.starfield_enabled,
"fog_color": self.environment.fog_color,
"fog_density": self.environment.fog_density,
},
"features": {
"chat_enabled": self.features.chat_enabled,
"visitor_avatars": self.features.visitor_avatars,
"pip_familiar": self.features.pip_familiar,
"workshop_portal": self.features.workshop_portal,
},
"agents": {
"default_count": self.agents.default_count,
"max_count": self.agents.max_count,
"agents": [
{"name": a.name, "role": a.role, "enabled": a.enabled}
for a in self.agents.agents
],
},
}
def load_from_yaml(path: str | Path) -> MatrixConfig:
"""Load Matrix configuration from a YAML file.
Missing keys are filled with sensible defaults. If the file
cannot be read or parsed, returns a fully default configuration.
Args:
path: Path to the YAML configuration file.
Returns:
A MatrixConfig instance with loaded or default values.
"""
path = Path(path)
if not path.exists():
logger.warning("Matrix config file not found: %s, using defaults", path)
return MatrixConfig()
try:
with open(path, encoding="utf-8") as f:
raw_data = yaml.safe_load(f)
if not isinstance(raw_data, dict):
logger.warning("Matrix config invalid format, using defaults")
return MatrixConfig()
return MatrixConfig.from_dict(raw_data)
except yaml.YAMLError as exc:
logger.warning("Matrix config YAML parse error: %s, using defaults", exc)
return MatrixConfig()
except OSError as exc:
logger.warning("Matrix config read error: %s, using defaults", exc)
return MatrixConfig()

View File

@@ -0,0 +1,19 @@
"""Morrowind engine-agnostic perception/command protocol.
This package implements the Perception/Command protocol defined in
``docs/protocol/morrowind-perception-command-spec.md``. It provides:
- Pydantic v2 schemas for runtime validation (``schemas``)
- SQLite command logging and query interface (``command_log``)
- Training-data export pipeline (``training_export``)
- FastAPI HTTP harness for perception/command exchange (``api``)
"""
from .schemas import CommandInput, CommandType, EntityType, PerceptionOutput
__all__ = [
"CommandInput",
"CommandType",
"EntityType",
"PerceptionOutput",
]

View File

@@ -0,0 +1,211 @@
"""FastAPI HTTP harness for the Morrowind Perception/Command protocol.
Exposes three endpoints:
- ``GET /perception`` — current world state (perception.json)
- ``POST /command`` — submit a command with validation + logging
- ``GET /morrowind/status`` — system health overview
These endpoints are consumed by the heartbeat loop and the reasoning layer.
The Input Bridge forwarding is stubbed — the bridge doesn't exist yet.
"""
from __future__ import annotations
import json
import logging
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
import asyncio
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, Field
from .command_log import CommandLogger
from .schemas import CommandInput, PerceptionOutput
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
PERCEPTION_PATH = Path("/tmp/timmy/perception.json")
# ---------------------------------------------------------------------------
# Module-level singletons (lazy)
# ---------------------------------------------------------------------------
_command_logger: CommandLogger | None = None
def _get_command_logger() -> CommandLogger:
"""Return (and lazily create) the module-level CommandLogger."""
global _command_logger
if _command_logger is None:
_command_logger = CommandLogger()
return _command_logger
# ---------------------------------------------------------------------------
# Response models
# ---------------------------------------------------------------------------
class CommandResponse(BaseModel):
"""Confirmation returned after a command is accepted."""
command_id: int = Field(..., description="Auto-generated row ID from the command log")
status: str = Field("accepted", description="Command acceptance status")
bridge_forwarded: bool = Field(
False, description="Whether the command was forwarded to the Input Bridge"
)
class MorrowindStatus(BaseModel):
"""System health overview for the Morrowind subsystem."""
connected: bool = Field(..., description="Whether the perception pipeline is active")
last_perception_timestamp: str | None = Field(
None, description="ISO timestamp of the last perception snapshot"
)
command_queue_depth: int = Field(0, description="Total logged commands")
current_cell: str | None = Field(None, description="Agent's current cell/zone")
vitals: dict[str, Any] = Field(default_factory=dict, description="Agent health summary")
# ---------------------------------------------------------------------------
# Router
# ---------------------------------------------------------------------------
router = APIRouter(prefix="/api/v1/morrowind", tags=["morrowind"])
@router.get("/perception", response_model=PerceptionOutput)
async def get_perception() -> PerceptionOutput:
"""Read the latest perception snapshot from disk.
The perception script writes ``perception.json`` on each heartbeat tick.
This endpoint reads, validates, and returns the current world state.
"""
perception_path = PERCEPTION_PATH
if not perception_path.exists():
raise HTTPException(
status_code=404,
detail=f"Perception file not found: {perception_path}",
)
try:
raw = await asyncio.to_thread(perception_path.read_text, encoding="utf-8")
data = json.loads(raw)
return PerceptionOutput.model_validate(data)
except json.JSONDecodeError as exc:
logger.warning("perception.json parse error: %s", exc)
raise HTTPException(status_code=422, detail=f"Invalid JSON: {exc}") from exc
except Exception as exc:
logger.error("Failed to read perception: %s", exc)
raise HTTPException(status_code=500, detail=str(exc)) from exc
@router.post("/command", response_model=CommandResponse)
async def post_command(command: CommandInput) -> CommandResponse:
"""Accept and log a command, then stub-forward to the Input Bridge.
The command is validated against ``CommandInput``, persisted via
``CommandLogger``, and (in the future) forwarded to the game engine
through the Input Bridge socket.
"""
cmd_logger = _get_command_logger()
# Read current perception for context (best-effort)
perception: PerceptionOutput | None = None
if PERCEPTION_PATH.exists():
try:
raw = await asyncio.to_thread(
PERCEPTION_PATH.read_text, encoding="utf-8"
)
perception = PerceptionOutput.model_validate_json(raw)
except Exception as exc:
logger.warning("Could not read perception for command context: %s", exc)
# Persist to SQLite
try:
row_id = await asyncio.to_thread(
cmd_logger.log_command, command, perception
)
except Exception as exc:
logger.error("Command log write failed: %s", exc)
raise HTTPException(status_code=500, detail="Failed to log command") from exc
# Stub: forward to Input Bridge (not implemented yet)
bridge_forwarded = False
logger.debug(
"Command %s logged (id=%d); bridge forwarding stubbed",
command.command.value,
row_id,
)
return CommandResponse(
command_id=row_id,
status="accepted",
bridge_forwarded=bridge_forwarded,
)
@router.get("/status", response_model=MorrowindStatus)
async def get_morrowind_status() -> MorrowindStatus:
"""Return a health overview of the Morrowind subsystem.
Checks perception pipeline liveness, command log depth, and
agent vitals from the latest perception snapshot.
"""
cmd_logger = _get_command_logger()
# Perception pipeline state
connected = PERCEPTION_PATH.exists()
last_ts: str | None = None
current_cell: str | None = None
vitals: dict[str, Any] = {}
if connected:
try:
raw = await asyncio.to_thread(
PERCEPTION_PATH.read_text, encoding="utf-8"
)
perception = PerceptionOutput.model_validate_json(raw)
last_ts = perception.timestamp.isoformat()
current_cell = perception.location.cell
vitals = {
"health": f"{perception.health.current}/{perception.health.max}",
"location": {
"cell": perception.location.cell,
"x": perception.location.x,
"y": perception.location.y,
"z": perception.location.z,
"interior": perception.location.interior,
},
"in_combat": perception.environment.is_combat,
"in_dialogue": perception.environment.is_dialogue,
"inventory_items": perception.inventory_summary.item_count,
"gold": perception.inventory_summary.gold,
}
except Exception as exc:
logger.warning("Status check: failed to read perception: %s", exc)
connected = False
# Command log depth
try:
queue_depth = await asyncio.to_thread(cmd_logger.count)
except Exception as exc:
logger.warning("Status check: failed to count commands: %s", exc)
queue_depth = 0
return MorrowindStatus(
connected=connected,
last_perception_timestamp=last_ts,
command_queue_depth=queue_depth,
current_cell=current_cell,
vitals=vitals,
)

View File

@@ -0,0 +1,307 @@
"""SQLite command log for the Morrowind Perception/Command protocol.
Every heartbeat cycle is logged — the resulting dataset serves as organic
training data for local model fine-tuning (Phase 7+).
Usage::
from infrastructure.morrowind.command_log import CommandLogger
logger = CommandLogger() # uses project default DB
logger.log_command(command_input, perception_snapshot)
results = logger.query(command_type="move_to", limit=100)
logger.export_training_data("export.jsonl")
"""
from __future__ import annotations
import json
import logging
from datetime import UTC, datetime, timedelta
from pathlib import Path
from typing import Any
from sqlalchemy import (
Column,
DateTime,
Index,
Integer,
String,
Text,
create_engine,
)
from sqlalchemy.orm import Session, sessionmaker
from src.dashboard.models.database import Base
from .schemas import CommandInput, CommandType, PerceptionOutput
logger = logging.getLogger(__name__)
# Default database path — same SQLite file as the rest of the project.
DEFAULT_DB_PATH = Path("./data/timmy_calm.db")
# ---------------------------------------------------------------------------
# SQLAlchemy model
# ---------------------------------------------------------------------------
class CommandLog(Base):
"""Persisted command log entry.
Schema columns mirror the requirements from Issue #855:
timestamp, command, params, reasoning, perception_snapshot,
outcome, episode_id.
"""
__tablename__ = "command_log"
id = Column(Integer, primary_key=True, autoincrement=True)
timestamp = Column(
DateTime, nullable=False, default=lambda: datetime.now(UTC), index=True
)
command = Column(String(64), nullable=False, index=True)
params = Column(Text, nullable=False, default="{}")
reasoning = Column(Text, nullable=False, default="")
perception_snapshot = Column(Text, nullable=False, default="{}")
outcome = Column(Text, nullable=True)
agent_id = Column(String(64), nullable=False, default="timmy", index=True)
episode_id = Column(String(128), nullable=True, index=True)
cell = Column(String(255), nullable=True, index=True)
protocol_version = Column(String(16), nullable=False, default="1.0.0")
created_at = Column(
DateTime, nullable=False, default=lambda: datetime.now(UTC)
)
__table_args__ = (
Index("ix_command_log_cmd_cell", "command", "cell"),
Index("ix_command_log_episode", "episode_id", "timestamp"),
)
# ---------------------------------------------------------------------------
# CommandLogger — high-level API
# ---------------------------------------------------------------------------
class CommandLogger:
"""High-level interface for logging, querying, and exporting commands.
Args:
db_url: SQLAlchemy database URL. Defaults to the project SQLite path.
"""
def __init__(self, db_url: str | None = None) -> None:
if db_url is None:
DEFAULT_DB_PATH.parent.mkdir(parents=True, exist_ok=True)
db_url = f"sqlite:///{DEFAULT_DB_PATH}"
self._engine = create_engine(
db_url, connect_args={"check_same_thread": False}
)
self._SessionLocal = sessionmaker(
autocommit=False, autoflush=False, bind=self._engine
)
# Ensure table exists.
Base.metadata.create_all(bind=self._engine, tables=[CommandLog.__table__])
def _get_session(self) -> Session:
return self._SessionLocal()
# -- Write ---------------------------------------------------------------
def log_command(
self,
command_input: CommandInput,
perception: PerceptionOutput | None = None,
outcome: str | None = None,
) -> int:
"""Persist a command to the log.
Returns the auto-generated row id.
"""
perception_json = perception.model_dump_json() if perception else "{}"
cell = perception.location.cell if perception else None
entry = CommandLog(
timestamp=command_input.timestamp,
command=command_input.command.value,
params=json.dumps(command_input.params),
reasoning=command_input.reasoning,
perception_snapshot=perception_json,
outcome=outcome,
agent_id=command_input.agent_id,
episode_id=command_input.episode_id,
cell=cell,
protocol_version=command_input.protocol_version,
)
session = self._get_session()
try:
session.add(entry)
session.commit()
session.refresh(entry)
row_id: int = entry.id
return row_id
except Exception:
session.rollback()
raise
finally:
session.close()
# -- Read ----------------------------------------------------------------
def query(
self,
*,
command_type: str | CommandType | None = None,
cell: str | None = None,
episode_id: str | None = None,
agent_id: str | None = None,
since: datetime | None = None,
until: datetime | None = None,
limit: int = 100,
offset: int = 0,
) -> list[dict[str, Any]]:
"""Query command log entries with optional filters.
Returns a list of dicts (serialisable to JSON).
"""
session = self._get_session()
try:
q = session.query(CommandLog)
if command_type is not None:
q = q.filter(CommandLog.command == str(command_type))
if cell is not None:
q = q.filter(CommandLog.cell == cell)
if episode_id is not None:
q = q.filter(CommandLog.episode_id == episode_id)
if agent_id is not None:
q = q.filter(CommandLog.agent_id == agent_id)
if since is not None:
q = q.filter(CommandLog.timestamp >= since)
if until is not None:
q = q.filter(CommandLog.timestamp <= until)
q = q.order_by(CommandLog.timestamp.desc())
q = q.offset(offset).limit(limit)
rows = q.all()
return [self._row_to_dict(row) for row in rows]
finally:
session.close()
# -- Export --------------------------------------------------------------
def export_training_data(
self,
output_path: str | Path,
*,
episode_id: str | None = None,
since: datetime | None = None,
until: datetime | None = None,
) -> int:
"""Export command log entries as a JSONL file for fine-tuning.
Each line is a JSON object with ``perception`` (input) and
``command`` + ``reasoning`` (target output).
Returns the number of rows exported.
"""
output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)
session = self._get_session()
try:
q = session.query(CommandLog)
if episode_id is not None:
q = q.filter(CommandLog.episode_id == episode_id)
if since is not None:
q = q.filter(CommandLog.timestamp >= since)
if until is not None:
q = q.filter(CommandLog.timestamp <= until)
q = q.order_by(CommandLog.timestamp.asc())
count = 0
with open(output_path, "w", encoding="utf-8") as fh:
for row in q.yield_per(500):
record = {
"input": {
"perception": json.loads(row.perception_snapshot),
},
"output": {
"command": row.command,
"params": json.loads(row.params),
"reasoning": row.reasoning,
},
"metadata": {
"timestamp": row.timestamp.isoformat() if row.timestamp else None,
"episode_id": row.episode_id,
"cell": row.cell,
"outcome": row.outcome,
},
}
fh.write(json.dumps(record) + "\n")
count += 1
logger.info("Exported %d training records to %s", count, output_path)
return count
finally:
session.close()
# -- Storage management --------------------------------------------------
def rotate(self, max_age_days: int = 90) -> int:
"""Delete command log entries older than *max_age_days*.
Returns the number of rows deleted.
"""
cutoff = datetime.now(UTC) - timedelta(days=max_age_days)
session = self._get_session()
try:
deleted = (
session.query(CommandLog)
.filter(CommandLog.timestamp < cutoff)
.delete(synchronize_session=False)
)
session.commit()
logger.info("Rotated %d command log entries older than %s", deleted, cutoff)
return deleted
except Exception:
session.rollback()
raise
finally:
session.close()
def count(self) -> int:
"""Return the total number of command log entries."""
session = self._get_session()
try:
return session.query(CommandLog).count()
finally:
session.close()
# -- Helpers -------------------------------------------------------------
@staticmethod
def _row_to_dict(row: CommandLog) -> dict[str, Any]:
return {
"id": row.id,
"timestamp": row.timestamp.isoformat() if row.timestamp else None,
"command": row.command,
"params": json.loads(row.params) if row.params else {},
"reasoning": row.reasoning,
"perception_snapshot": json.loads(row.perception_snapshot)
if row.perception_snapshot
else {},
"outcome": row.outcome,
"agent_id": row.agent_id,
"episode_id": row.episode_id,
"cell": row.cell,
"protocol_version": row.protocol_version,
"created_at": row.created_at.isoformat() if row.created_at else None,
}

View File

@@ -0,0 +1,186 @@
"""Pydantic v2 models for the Morrowind Perception/Command protocol.
These models enforce the contract defined in
``docs/protocol/morrowind-perception-command-spec.md`` at runtime.
They are engine-agnostic by design — see the Falsework Rule.
"""
from __future__ import annotations
from datetime import datetime
from enum import StrEnum
from typing import Any
from pydantic import BaseModel, Field, model_validator
PROTOCOL_VERSION = "1.0.0"
# ---------------------------------------------------------------------------
# Enums
# ---------------------------------------------------------------------------
class EntityType(StrEnum):
"""Controlled vocabulary for nearby entity types."""
NPC = "npc"
CREATURE = "creature"
ITEM = "item"
DOOR = "door"
CONTAINER = "container"
class CommandType(StrEnum):
"""All supported command types."""
MOVE_TO = "move_to"
INTERACT = "interact"
USE_ITEM = "use_item"
WAIT = "wait"
COMBAT_ACTION = "combat_action"
DIALOGUE = "dialogue"
JOURNAL_NOTE = "journal_note"
NOOP = "noop"
# ---------------------------------------------------------------------------
# Perception Output sub-models
# ---------------------------------------------------------------------------
class Location(BaseModel):
"""Agent position within the game world."""
cell: str = Field(..., description="Current cell/zone name")
x: float = Field(..., description="World X coordinate")
y: float = Field(..., description="World Y coordinate")
z: float = Field(0.0, description="World Z coordinate")
interior: bool = Field(False, description="Whether the agent is indoors")
class HealthStatus(BaseModel):
"""Agent health information."""
current: int = Field(..., ge=0, description="Current health points")
max: int = Field(..., gt=0, description="Maximum health points")
@model_validator(mode="after")
def current_le_max(self) -> "HealthStatus":
if self.current > self.max:
raise ValueError(
f"current ({self.current}) cannot exceed max ({self.max})"
)
return self
class NearbyEntity(BaseModel):
"""An entity within the agent's perception radius."""
entity_id: str = Field(..., description="Unique entity identifier")
name: str = Field(..., description="Display name")
entity_type: EntityType = Field(..., description="Entity category")
distance: float = Field(..., ge=0, description="Distance from agent")
disposition: int | None = Field(None, description="NPC disposition (0-100)")
class InventorySummary(BaseModel):
"""Lightweight overview of the agent's inventory."""
gold: int = Field(0, ge=0, description="Gold held")
item_count: int = Field(0, ge=0, description="Total items carried")
encumbrance_pct: float = Field(
0.0, ge=0.0, le=1.0, description="Encumbrance as fraction (0.01.0)"
)
class QuestInfo(BaseModel):
"""A currently tracked quest."""
quest_id: str = Field(..., description="Quest identifier")
name: str = Field(..., description="Quest display name")
stage: int = Field(0, ge=0, description="Current quest stage")
class Environment(BaseModel):
"""World-state flags."""
time_of_day: str = Field("unknown", description="Time period (morning, afternoon, etc.)")
weather: str = Field("clear", description="Current weather condition")
is_combat: bool = Field(False, description="Whether the agent is in combat")
is_dialogue: bool = Field(False, description="Whether the agent is in dialogue")
# ---------------------------------------------------------------------------
# Top-level schemas
# ---------------------------------------------------------------------------
class PerceptionOutput(BaseModel):
"""Complete perception snapshot returned by ``GET /perception``.
This is the engine-agnostic view of the game world consumed by the
heartbeat loop and reasoning layer.
"""
protocol_version: str = Field(
default=PROTOCOL_VERSION,
description="Protocol SemVer string",
)
timestamp: datetime = Field(..., description="When the snapshot was taken")
agent_id: str = Field(..., description="Which agent this perception belongs to")
location: Location
health: HealthStatus
nearby_entities: list[NearbyEntity] = Field(default_factory=list)
inventory_summary: InventorySummary = Field(default_factory=InventorySummary)
active_quests: list[QuestInfo] = Field(default_factory=list)
environment: Environment = Field(default_factory=Environment)
raw_engine_data: dict[str, Any] = Field(
default_factory=dict,
description="Opaque engine-specific blob — not relied upon by heartbeat",
)
class CommandContext(BaseModel):
"""Metadata linking a command to its triggering perception."""
perception_timestamp: datetime | None = Field(
None, description="Timestamp of the perception that triggered this command"
)
heartbeat_cycle: int | None = Field(
None, ge=0, description="Heartbeat cycle number"
)
class CommandInput(BaseModel):
"""Command payload sent via ``POST /command``.
Every command includes a ``reasoning`` field so the command log
captures the agent's intent — critical for training-data export.
"""
protocol_version: str = Field(
default=PROTOCOL_VERSION,
description="Protocol SemVer string",
)
timestamp: datetime = Field(..., description="When the command was issued")
agent_id: str = Field(..., description="Which agent is issuing the command")
command: CommandType = Field(..., description="Command type")
params: dict[str, Any] = Field(
default_factory=dict, description="Command-specific parameters"
)
reasoning: str = Field(
...,
min_length=1,
description="Natural-language explanation of why this command was chosen",
)
episode_id: str | None = Field(
None, description="Groups commands into training episodes"
)
context: CommandContext | None = Field(
None, description="Metadata linking command to its triggering perception"
)

View File

@@ -0,0 +1,243 @@
"""Fine-tuning dataset export pipeline for command log data.
Transforms raw command log entries into structured training datasets
suitable for supervised fine-tuning of local models.
Usage::
from infrastructure.morrowind.training_export import TrainingExporter
exporter = TrainingExporter(command_logger)
stats = exporter.export_chat_format("train.jsonl")
stats = exporter.export_episode_sequences("episodes/", min_length=5)
"""
from __future__ import annotations
import json
import logging
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any
from .command_log import CommandLogger
logger = logging.getLogger(__name__)
@dataclass
class ExportStats:
"""Statistics about an export run."""
total_records: int = 0
episodes_exported: int = 0
skipped_records: int = 0
output_path: str = ""
format: str = ""
exported_at: str = field(default_factory=lambda: datetime.utcnow().isoformat())
class TrainingExporter:
"""Builds fine-tuning datasets from the command log.
Supports multiple output formats used by common fine-tuning
frameworks (chat-completion style, instruction-following, episode
sequences).
Args:
command_logger: A :class:`CommandLogger` instance to read from.
"""
def __init__(self, command_logger: CommandLogger) -> None:
self._logger = command_logger
# -- Chat-completion format ----------------------------------------------
def export_chat_format(
self,
output_path: str | Path,
*,
since: datetime | None = None,
until: datetime | None = None,
max_records: int | None = None,
) -> ExportStats:
"""Export as chat-completion training pairs.
Each line is a JSON object with ``messages`` list containing a
``system`` prompt, ``user`` (perception), and ``assistant``
(command + reasoning) message.
This format is compatible with OpenAI / Llama fine-tuning APIs.
"""
output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)
rows = self._logger.query(
since=since,
until=until,
limit=max_records or 100_000,
)
# query returns newest-first; reverse for chronological export
rows.reverse()
stats = ExportStats(
output_path=str(output_path),
format="chat_completion",
)
with open(output_path, "w", encoding="utf-8") as fh:
for row in rows:
perception = row.get("perception_snapshot", {})
if not perception:
stats.skipped_records += 1
continue
record = {
"messages": [
{
"role": "system",
"content": (
"You are an autonomous agent navigating a game world. "
"Given a perception of the world state, decide what "
"command to execute and explain your reasoning."
),
},
{
"role": "user",
"content": json.dumps(perception),
},
{
"role": "assistant",
"content": json.dumps(
{
"command": row.get("command"),
"params": row.get("params", {}),
"reasoning": row.get("reasoning", ""),
}
),
},
],
}
fh.write(json.dumps(record) + "\n")
stats.total_records += 1
logger.info(
"Exported %d chat-format records to %s (skipped %d)",
stats.total_records,
output_path,
stats.skipped_records,
)
return stats
# -- Episode sequences ---------------------------------------------------
def export_episode_sequences(
self,
output_dir: str | Path,
*,
min_length: int = 3,
since: datetime | None = None,
until: datetime | None = None,
) -> ExportStats:
"""Export command sequences grouped by episode.
Each episode is written as a separate JSONL file in *output_dir*.
Episodes shorter than *min_length* are skipped.
"""
output_dir = Path(output_dir)
output_dir.mkdir(parents=True, exist_ok=True)
# Gather all rows (high limit) and group by episode.
rows = self._logger.query(since=since, until=until, limit=1_000_000)
rows.reverse() # chronological
episodes: dict[str, list[dict[str, Any]]] = {}
for row in rows:
ep_id = row.get("episode_id") or "unknown"
episodes.setdefault(ep_id, []).append(row)
stats = ExportStats(
output_path=str(output_dir),
format="episode_sequence",
)
for ep_id, entries in episodes.items():
if len(entries) < min_length:
stats.skipped_records += len(entries)
continue
ep_file = output_dir / f"{ep_id}.jsonl"
with open(ep_file, "w", encoding="utf-8") as fh:
for entry in entries:
fh.write(json.dumps(entry, default=str) + "\n")
stats.total_records += 1
stats.episodes_exported += 1
logger.info(
"Exported %d episodes (%d records) to %s",
stats.episodes_exported,
stats.total_records,
output_dir,
)
return stats
# -- Instruction-following format ----------------------------------------
def export_instruction_format(
self,
output_path: str | Path,
*,
since: datetime | None = None,
until: datetime | None = None,
max_records: int | None = None,
) -> ExportStats:
"""Export as instruction/response pairs (Alpaca-style).
Each line has ``instruction``, ``input``, and ``output`` fields.
"""
output_path = Path(output_path)
output_path.parent.mkdir(parents=True, exist_ok=True)
rows = self._logger.query(
since=since,
until=until,
limit=max_records or 100_000,
)
rows.reverse()
stats = ExportStats(
output_path=str(output_path),
format="instruction",
)
with open(output_path, "w", encoding="utf-8") as fh:
for row in rows:
perception = row.get("perception_snapshot", {})
if not perception:
stats.skipped_records += 1
continue
record = {
"instruction": (
"Given the following game world perception, decide what "
"command to execute. Explain your reasoning."
),
"input": json.dumps(perception),
"output": json.dumps(
{
"command": row.get("command"),
"params": row.get("params", {}),
"reasoning": row.get("reasoning", ""),
}
),
}
fh.write(json.dumps(record) + "\n")
stats.total_records += 1
logger.info(
"Exported %d instruction-format records to %s",
stats.total_records,
output_path,
)
return stats

View File

@@ -0,0 +1,333 @@
"""Presence state serializer — transforms ADR-023 presence dicts for consumers.
Converts the raw presence schema (version, liveness, mood, energy, etc.)
into the camelCase world-state payload consumed by the Workshop 3D renderer
and WebSocket gateway.
"""
import logging
import time
from datetime import UTC, datetime
logger = logging.getLogger(__name__)
# Default Pip familiar state (used when familiar module unavailable)
DEFAULT_PIP_STATE = {
"name": "Pip",
"mood": "sleepy",
"energy": 0.5,
"color": "0x00b450", # emerald green
"trail_color": "0xdaa520", # gold
}
def _get_familiar_state() -> dict:
"""Get Pip familiar state from familiar module, with graceful fallback.
Returns a dict with name, mood, energy, color, and trail_color.
Falls back to default state if familiar module unavailable or raises.
"""
try:
from timmy.familiar import pip_familiar
snapshot = pip_familiar.snapshot()
# Map PipSnapshot fields to the expected agent_state format
return {
"name": snapshot.name,
"mood": snapshot.state,
"energy": DEFAULT_PIP_STATE["energy"], # Pip doesn't track energy yet
"color": DEFAULT_PIP_STATE["color"],
"trail_color": DEFAULT_PIP_STATE["trail_color"],
}
except Exception as exc:
logger.warning("Familiar state unavailable, using default: %s", exc)
return DEFAULT_PIP_STATE.copy()
# Valid bark styles for Matrix protocol
BARK_STYLES = {"speech", "thought", "whisper", "shout"}
def produce_bark(agent_id: str, text: str, reply_to: str = None, style: str = "speech") -> dict:
"""Format a chat response as a Matrix bark message.
Barks appear as floating text above agents in the Matrix 3D world with
typing animation. This function formats the text for the Matrix protocol.
Parameters
----------
agent_id:
Unique identifier for the agent (e.g. ``"timmy"``).
text:
The chat response text to display as a bark.
reply_to:
Optional message ID or reference this bark is replying to.
style:
Visual style of the bark. One of: "speech" (default), "thought",
"whisper", "shout". Invalid styles fall back to "speech".
Returns
-------
dict
Bark message with keys ``type``, ``agent_id``, ``data`` (containing
``text``, ``reply_to``, ``style``), and ``ts``.
Examples
--------
>>> produce_bark("timmy", "Hello world!")
{
"type": "bark",
"agent_id": "timmy",
"data": {"text": "Hello world!", "reply_to": None, "style": "speech"},
"ts": 1742529600,
}
"""
# Validate and normalize style
if style not in BARK_STYLES:
style = "speech"
# Truncate text to 280 characters (bark, not essay)
truncated_text = text[:280] if text else ""
return {
"type": "bark",
"agent_id": agent_id,
"data": {
"text": truncated_text,
"reply_to": reply_to,
"style": style,
},
"ts": int(time.time()),
}
def produce_thought(
agent_id: str, thought_text: str, thought_id: int, chain_id: str = None
) -> dict:
"""Format a thinking engine thought as a Matrix thought message.
Thoughts appear as subtle floating text in the 3D world, streaming from
Timmy's thinking engine (/thinking/api). This function wraps thoughts in
Matrix protocol format.
Parameters
----------
agent_id:
Unique identifier for the agent (e.g. ``"timmy"``).
thought_text:
The thought text to display. Truncated to 500 characters.
thought_id:
Unique identifier for this thought (sequence number).
chain_id:
Optional chain identifier grouping related thoughts.
Returns
-------
dict
Thought message with keys ``type``, ``agent_id``, ``data`` (containing
``text``, ``thought_id``, ``chain_id``), and ``ts``.
Examples
--------
>>> produce_thought("timmy", "Considering the options...", 42, "chain-123")
{
"type": "thought",
"agent_id": "timmy",
"data": {"text": "Considering the options...", "thought_id": 42, "chain_id": "chain-123"},
"ts": 1742529600,
}
"""
# Truncate text to 500 characters (thoughts can be longer than barks)
truncated_text = thought_text[:500] if thought_text else ""
return {
"type": "thought",
"agent_id": agent_id,
"data": {
"text": truncated_text,
"thought_id": thought_id,
"chain_id": chain_id,
},
"ts": int(time.time()),
}
def serialize_presence(presence: dict) -> dict:
"""Transform an ADR-023 presence dict into the world-state API shape.
Parameters
----------
presence:
Raw presence dict as written by
:func:`~timmy.workshop_state.get_state_dict` or read from
``~/.timmy/presence.json``.
Returns
-------
dict
CamelCase world-state payload with ``timmyState``, ``familiar``,
``activeThreads``, ``recentEvents``, ``concerns``, ``visitorPresent``,
``updatedAt``, and ``version`` keys.
"""
return {
"timmyState": {
"mood": presence.get("mood", "calm"),
"activity": presence.get("current_focus", "idle"),
"energy": presence.get("energy", 0.5),
"confidence": presence.get("confidence", 0.7),
},
"familiar": presence.get("familiar"),
"activeThreads": presence.get("active_threads", []),
"recentEvents": presence.get("recent_events", []),
"concerns": presence.get("concerns", []),
"visitorPresent": False,
"updatedAt": presence.get("liveness", datetime.now(UTC).strftime("%Y-%m-%dT%H:%M:%SZ")),
"version": presence.get("version", 1),
}
# ---------------------------------------------------------------------------
# Status mapping: ADR-023 current_focus → Matrix agent status
# ---------------------------------------------------------------------------
_STATUS_KEYWORDS: dict[str, str] = {
"thinking": "thinking",
"speaking": "speaking",
"talking": "speaking",
"idle": "idle",
}
def _derive_status(current_focus: str) -> str:
"""Map a free-text current_focus value to a Matrix status enum.
Returns one of: online, idle, thinking, speaking.
"""
focus_lower = current_focus.lower()
for keyword, status in _STATUS_KEYWORDS.items():
if keyword in focus_lower:
return status
if current_focus and current_focus != "idle":
return "online"
return "idle"
def produce_agent_state(agent_id: str, presence: dict) -> dict:
"""Build a Matrix-compatible ``agent_state`` message from presence data.
Parameters
----------
agent_id:
Unique identifier for the agent (e.g. ``"timmy"``).
presence:
Raw ADR-023 presence dict.
Returns
-------
dict
Message with keys ``type``, ``agent_id``, ``data``, and ``ts``.
"""
return {
"type": "agent_state",
"agent_id": agent_id,
"data": {
"display_name": presence.get("display_name", agent_id.title()),
"role": presence.get("role", "assistant"),
"status": _derive_status(presence.get("current_focus", "idle")),
"mood": presence.get("mood", "calm"),
"energy": presence.get("energy", 0.5),
"bark": presence.get("bark", ""),
"familiar": _get_familiar_state(),
},
"ts": int(time.time()),
}
def produce_system_status() -> dict:
"""Generate a system_status message for the Matrix.
Returns a dict with system health metrics including agent count,
visitor count, uptime, thinking engine status, and memory count.
Returns
-------
dict
Message with keys ``type``, ``data`` (containing ``agents_online``,
``visitors``, ``uptime_seconds``, ``thinking_active``, ``memory_count``),
and ``ts``.
Examples
--------
>>> produce_system_status()
{
"type": "system_status",
"data": {
"agents_online": 5,
"visitors": 2,
"uptime_seconds": 3600,
"thinking_active": True,
"memory_count": 150,
},
"ts": 1742529600,
}
"""
# Count agents with status != offline
agents_online = 0
try:
from timmy.agents.loader import list_agents
agents = list_agents()
agents_online = sum(1 for a in agents if a.get("status", "") not in ("offline", ""))
except Exception as exc:
logger.debug("Failed to count agents: %s", exc)
# Count visitors from WebSocket clients
visitors = 0
try:
from dashboard.routes.world import _ws_clients
visitors = len(_ws_clients)
except Exception as exc:
logger.debug("Failed to count visitors: %s", exc)
# Calculate uptime
uptime_seconds = 0
try:
from datetime import UTC
from config import APP_START_TIME
uptime_seconds = int((datetime.now(UTC) - APP_START_TIME).total_seconds())
except Exception as exc:
logger.debug("Failed to calculate uptime: %s", exc)
# Check thinking engine status
thinking_active = False
try:
from config import settings
from timmy.thinking import thinking_engine
thinking_active = settings.thinking_enabled and thinking_engine is not None
except Exception as exc:
logger.debug("Failed to check thinking status: %s", exc)
# Count memories in vector store
memory_count = 0
try:
from timmy.memory_system import get_memory_stats
stats = get_memory_stats()
memory_count = stats.get("total_entries", 0)
except Exception as exc:
logger.debug("Failed to count memories: %s", exc)
return {
"type": "system_status",
"data": {
"agents_online": agents_online,
"visitors": visitors,
"uptime_seconds": uptime_seconds,
"thinking_active": thinking_active,
"memory_count": memory_count,
},
"ts": int(time.time()),
}

View File

@@ -0,0 +1,261 @@
"""Shared WebSocket message protocol for the Matrix frontend.
Defines all WebSocket message types as an enum and typed dataclasses
with ``to_json()`` / ``from_json()`` helpers so every producer and the
gateway speak the same language.
Message wire format
-------------------
.. code-block:: json
{"type": "agent_state", "agent_id": "timmy", "data": {...}, "ts": 1234567890}
"""
import json
import logging
import time
from dataclasses import asdict, dataclass, field
from enum import StrEnum
from typing import Any
logger = logging.getLogger(__name__)
class MessageType(StrEnum):
"""All WebSocket message types defined by the Matrix PROTOCOL.md."""
AGENT_STATE = "agent_state"
VISITOR_STATE = "visitor_state"
BARK = "bark"
THOUGHT = "thought"
SYSTEM_STATUS = "system_status"
CONNECTION_ACK = "connection_ack"
ERROR = "error"
TASK_UPDATE = "task_update"
MEMORY_FLASH = "memory_flash"
# ---------------------------------------------------------------------------
# Base message
# ---------------------------------------------------------------------------
@dataclass
class WSMessage:
"""Base WebSocket message with common envelope fields."""
type: str
ts: float = field(default_factory=time.time)
def to_json(self) -> str:
"""Serialise the message to a JSON string."""
return json.dumps(asdict(self))
@classmethod
def from_json(cls, raw: str) -> "WSMessage":
"""Deserialise a JSON string into the correct message subclass.
Falls back to the base ``WSMessage`` when the ``type`` field is
unrecognised.
"""
data = json.loads(raw)
msg_type = data.get("type")
sub = _REGISTRY.get(msg_type)
if sub is not None:
return sub.from_json(raw)
return cls(**data)
# ---------------------------------------------------------------------------
# Concrete message types
# ---------------------------------------------------------------------------
@dataclass
class AgentStateMessage(WSMessage):
"""State update for a single agent."""
type: str = field(default=MessageType.AGENT_STATE)
agent_id: str = ""
data: dict[str, Any] = field(default_factory=dict)
@classmethod
def from_json(cls, raw: str) -> "AgentStateMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.AGENT_STATE),
ts=payload.get("ts", time.time()),
agent_id=payload.get("agent_id", ""),
data=payload.get("data", {}),
)
@dataclass
class VisitorStateMessage(WSMessage):
"""State update for a visitor / user session."""
type: str = field(default=MessageType.VISITOR_STATE)
visitor_id: str = ""
data: dict[str, Any] = field(default_factory=dict)
@classmethod
def from_json(cls, raw: str) -> "VisitorStateMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.VISITOR_STATE),
ts=payload.get("ts", time.time()),
visitor_id=payload.get("visitor_id", ""),
data=payload.get("data", {}),
)
@dataclass
class BarkMessage(WSMessage):
"""A bark (chat-like utterance) from an agent."""
type: str = field(default=MessageType.BARK)
agent_id: str = ""
content: str = ""
@classmethod
def from_json(cls, raw: str) -> "BarkMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.BARK),
ts=payload.get("ts", time.time()),
agent_id=payload.get("agent_id", ""),
content=payload.get("content", ""),
)
@dataclass
class ThoughtMessage(WSMessage):
"""An inner thought from an agent."""
type: str = field(default=MessageType.THOUGHT)
agent_id: str = ""
content: str = ""
@classmethod
def from_json(cls, raw: str) -> "ThoughtMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.THOUGHT),
ts=payload.get("ts", time.time()),
agent_id=payload.get("agent_id", ""),
content=payload.get("content", ""),
)
@dataclass
class SystemStatusMessage(WSMessage):
"""System-wide status broadcast."""
type: str = field(default=MessageType.SYSTEM_STATUS)
status: str = ""
data: dict[str, Any] = field(default_factory=dict)
@classmethod
def from_json(cls, raw: str) -> "SystemStatusMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.SYSTEM_STATUS),
ts=payload.get("ts", time.time()),
status=payload.get("status", ""),
data=payload.get("data", {}),
)
@dataclass
class ConnectionAckMessage(WSMessage):
"""Acknowledgement sent when a client connects."""
type: str = field(default=MessageType.CONNECTION_ACK)
client_id: str = ""
@classmethod
def from_json(cls, raw: str) -> "ConnectionAckMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.CONNECTION_ACK),
ts=payload.get("ts", time.time()),
client_id=payload.get("client_id", ""),
)
@dataclass
class ErrorMessage(WSMessage):
"""Error message sent to a client."""
type: str = field(default=MessageType.ERROR)
code: str = ""
message: str = ""
@classmethod
def from_json(cls, raw: str) -> "ErrorMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.ERROR),
ts=payload.get("ts", time.time()),
code=payload.get("code", ""),
message=payload.get("message", ""),
)
@dataclass
class TaskUpdateMessage(WSMessage):
"""Update about a task (created, assigned, completed, etc.)."""
type: str = field(default=MessageType.TASK_UPDATE)
task_id: str = ""
status: str = ""
data: dict[str, Any] = field(default_factory=dict)
@classmethod
def from_json(cls, raw: str) -> "TaskUpdateMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.TASK_UPDATE),
ts=payload.get("ts", time.time()),
task_id=payload.get("task_id", ""),
status=payload.get("status", ""),
data=payload.get("data", {}),
)
@dataclass
class MemoryFlashMessage(WSMessage):
"""A flash of memory — a recalled or stored memory event."""
type: str = field(default=MessageType.MEMORY_FLASH)
agent_id: str = ""
memory_key: str = ""
content: str = ""
@classmethod
def from_json(cls, raw: str) -> "MemoryFlashMessage":
payload = json.loads(raw)
return cls(
type=payload.get("type", MessageType.MEMORY_FLASH),
ts=payload.get("ts", time.time()),
agent_id=payload.get("agent_id", ""),
memory_key=payload.get("memory_key", ""),
content=payload.get("content", ""),
)
# ---------------------------------------------------------------------------
# Registry for from_json dispatch
# ---------------------------------------------------------------------------
_REGISTRY: dict[str, type[WSMessage]] = {
MessageType.AGENT_STATE: AgentStateMessage,
MessageType.VISITOR_STATE: VisitorStateMessage,
MessageType.BARK: BarkMessage,
MessageType.THOUGHT: ThoughtMessage,
MessageType.SYSTEM_STATUS: SystemStatusMessage,
MessageType.CONNECTION_ACK: ConnectionAckMessage,
MessageType.ERROR: ErrorMessage,
MessageType.TASK_UPDATE: TaskUpdateMessage,
MessageType.MEMORY_FLASH: MemoryFlashMessage,
}

View File

@@ -2,6 +2,7 @@
from .api import router
from .cascade import CascadeRouter, Provider, ProviderStatus, get_router
from .history import HealthHistoryStore, get_history_store
__all__ = [
"CascadeRouter",
@@ -9,4 +10,6 @@ __all__ = [
"ProviderStatus",
"get_router",
"router",
"HealthHistoryStore",
"get_history_store",
]

View File

@@ -8,6 +8,7 @@ from fastapi import APIRouter, Depends, HTTPException
from pydantic import BaseModel
from .cascade import CascadeRouter, get_router
from .history import HealthHistoryStore, get_history_store
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/v1/router", tags=["router"])
@@ -199,6 +200,17 @@ async def reload_config(
raise HTTPException(status_code=500, detail=f"Reload failed: {exc}") from exc
@router.get("/history")
async def get_history(
hours: int = 24,
store: Annotated[HealthHistoryStore, Depends(get_history_store)] = None,
) -> list[dict[str, Any]]:
"""Get provider health history for the last N hours."""
if store is None:
store = get_history_store()
return store.get_history(hours=hours)
@router.get("/config")
async def get_config(
cascade: Annotated[CascadeRouter, Depends(get_cascade_router)],

View File

@@ -221,65 +221,56 @@ class CascadeRouter:
raise RuntimeError("PyYAML not installed")
content = self.config_path.read_text()
# Expand environment variables
content = self._expand_env_vars(content)
data = yaml.safe_load(content)
# Load cascade settings
cascade = data.get("cascade", {})
# Load fallback chains
fallback_chains = data.get("fallback_chains", {})
# Load multi-modal settings
multimodal = data.get("multimodal", {})
self.config = RouterConfig(
timeout_seconds=cascade.get("timeout_seconds", 30),
max_retries_per_provider=cascade.get("max_retries_per_provider", 2),
retry_delay_seconds=cascade.get("retry_delay_seconds", 1),
circuit_breaker_failure_threshold=cascade.get("circuit_breaker", {}).get(
"failure_threshold", 5
),
circuit_breaker_recovery_timeout=cascade.get("circuit_breaker", {}).get(
"recovery_timeout", 60
),
circuit_breaker_half_open_max_calls=cascade.get("circuit_breaker", {}).get(
"half_open_max_calls", 2
),
auto_pull_models=multimodal.get("auto_pull", True),
fallback_chains=fallback_chains,
)
# Load providers
for p_data in data.get("providers", []):
# Skip disabled providers
if not p_data.get("enabled", False):
continue
provider = Provider(
name=p_data["name"],
type=p_data["type"],
enabled=p_data.get("enabled", True),
priority=p_data.get("priority", 99),
url=p_data.get("url"),
api_key=p_data.get("api_key"),
base_url=p_data.get("base_url"),
models=p_data.get("models", []),
)
# Check if provider is actually available
if self._check_provider_available(provider):
self.providers.append(provider)
else:
logger.warning("Provider %s not available, skipping", provider.name)
# Sort by priority
self.providers.sort(key=lambda p: p.priority)
self.config = self._parse_router_config(data)
self._load_providers(data)
except Exception as exc:
logger.error("Failed to load config: %s", exc)
def _parse_router_config(self, data: dict) -> RouterConfig:
"""Build a RouterConfig from parsed YAML data."""
cascade = data.get("cascade", {})
cb = cascade.get("circuit_breaker", {})
multimodal = data.get("multimodal", {})
return RouterConfig(
timeout_seconds=cascade.get("timeout_seconds", 30),
max_retries_per_provider=cascade.get("max_retries_per_provider", 2),
retry_delay_seconds=cascade.get("retry_delay_seconds", 1),
circuit_breaker_failure_threshold=cb.get("failure_threshold", 5),
circuit_breaker_recovery_timeout=cb.get("recovery_timeout", 60),
circuit_breaker_half_open_max_calls=cb.get("half_open_max_calls", 2),
auto_pull_models=multimodal.get("auto_pull", True),
fallback_chains=data.get("fallback_chains", {}),
)
def _load_providers(self, data: dict) -> None:
"""Load, filter, and sort providers from parsed YAML data."""
for p_data in data.get("providers", []):
if not p_data.get("enabled", False):
continue
provider = Provider(
name=p_data["name"],
type=p_data["type"],
enabled=p_data.get("enabled", True),
priority=p_data.get("priority", 99),
url=p_data.get("url"),
api_key=p_data.get("api_key"),
base_url=p_data.get("base_url"),
models=p_data.get("models", []),
)
if self._check_provider_available(provider):
self.providers.append(provider)
else:
logger.warning("Provider %s not available, skipping", provider.name)
self.providers.sort(key=lambda p: p.priority)
def _expand_env_vars(self, content: str) -> str:
"""Expand ${VAR} syntax in YAML content.
@@ -564,6 +555,7 @@ class CascadeRouter:
messages=messages,
model=model or provider.get_default_model(),
temperature=temperature,
max_tokens=max_tokens,
content_type=content_type,
)
elif provider.type == "openai":
@@ -604,6 +596,7 @@ class CascadeRouter:
messages: list[dict],
model: str,
temperature: float,
max_tokens: int | None = None,
content_type: ContentType = ContentType.TEXT,
) -> dict:
"""Call Ollama API with multi-modal support."""
@@ -614,13 +607,15 @@ class CascadeRouter:
# Transform messages for Ollama format (including images)
transformed_messages = self._transform_messages_for_ollama(messages)
options = {"temperature": temperature}
if max_tokens:
options["num_predict"] = max_tokens
payload = {
"model": model,
"messages": transformed_messages,
"stream": False,
"options": {
"temperature": temperature,
},
"options": options,
}
timeout = aiohttp.ClientTimeout(total=self.config.timeout_seconds)
@@ -764,7 +759,7 @@ class CascadeRouter:
client = openai.AsyncOpenAI(
api_key=provider.api_key,
base_url=provider.base_url or "https://api.x.ai/v1",
base_url=provider.base_url or settings.xai_base_url,
timeout=httpx.Timeout(300.0),
)

View File

@@ -0,0 +1,152 @@
"""Provider health history — time-series snapshots for dashboard visualization."""
import asyncio
import logging
import sqlite3
from datetime import UTC, datetime, timedelta
from pathlib import Path
logger = logging.getLogger(__name__)
_store: "HealthHistoryStore | None" = None
class HealthHistoryStore:
"""Stores timestamped provider health snapshots in SQLite."""
def __init__(self, db_path: str = "data/router_history.db") -> None:
self.db_path = db_path
if db_path != ":memory:":
Path(db_path).parent.mkdir(parents=True, exist_ok=True)
self._conn = sqlite3.connect(db_path, check_same_thread=False)
self._conn.row_factory = sqlite3.Row
self._init_schema()
self._bg_task: asyncio.Task | None = None
def _init_schema(self) -> None:
self._conn.execute("""
CREATE TABLE IF NOT EXISTS snapshots (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp TEXT NOT NULL,
provider_name TEXT NOT NULL,
status TEXT NOT NULL,
error_rate REAL NOT NULL,
avg_latency_ms REAL NOT NULL,
circuit_state TEXT NOT NULL,
total_requests INTEGER NOT NULL
)
""")
self._conn.execute("""
CREATE INDEX IF NOT EXISTS idx_snapshots_ts
ON snapshots(timestamp)
""")
self._conn.commit()
def record_snapshot(self, providers: list[dict]) -> None:
"""Record a health snapshot for all providers."""
ts = datetime.now(UTC).isoformat()
rows = [
(
ts,
p["name"],
p["status"],
p["error_rate"],
p["avg_latency_ms"],
p["circuit_state"],
p["total_requests"],
)
for p in providers
]
self._conn.executemany(
"""INSERT INTO snapshots
(timestamp, provider_name, status, error_rate,
avg_latency_ms, circuit_state, total_requests)
VALUES (?, ?, ?, ?, ?, ?, ?)""",
rows,
)
self._conn.commit()
def get_history(self, hours: int = 24) -> list[dict]:
"""Return snapshots from the last N hours, grouped by timestamp."""
cutoff = (datetime.now(UTC) - timedelta(hours=hours)).isoformat()
rows = self._conn.execute(
"""SELECT timestamp, provider_name, status, error_rate,
avg_latency_ms, circuit_state, total_requests
FROM snapshots WHERE timestamp >= ? ORDER BY timestamp""",
(cutoff,),
).fetchall()
# Group by timestamp
snapshots: dict[str, list[dict]] = {}
for row in rows:
ts = row["timestamp"]
if ts not in snapshots:
snapshots[ts] = []
snapshots[ts].append(
{
"name": row["provider_name"],
"status": row["status"],
"error_rate": row["error_rate"],
"avg_latency_ms": row["avg_latency_ms"],
"circuit_state": row["circuit_state"],
"total_requests": row["total_requests"],
}
)
return [{"timestamp": ts, "providers": providers} for ts, providers in snapshots.items()]
def prune(self, keep_hours: int = 168) -> int:
"""Remove snapshots older than keep_hours. Returns rows deleted."""
cutoff = (datetime.now(UTC) - timedelta(hours=keep_hours)).isoformat()
cursor = self._conn.execute("DELETE FROM snapshots WHERE timestamp < ?", (cutoff,))
self._conn.commit()
return cursor.rowcount
def close(self) -> None:
"""Close the database connection."""
if self._bg_task and not self._bg_task.done():
self._bg_task.cancel()
self._conn.close()
def _capture_snapshot(self, cascade_router) -> None: # noqa: ANN001
"""Capture current provider state as a snapshot."""
providers = []
for p in cascade_router.providers:
providers.append(
{
"name": p.name,
"status": p.status.value,
"error_rate": round(p.metrics.error_rate, 4),
"avg_latency_ms": round(p.metrics.avg_latency_ms, 2),
"circuit_state": p.circuit_state.value,
"total_requests": p.metrics.total_requests,
}
)
self.record_snapshot(providers)
async def start_background_task(
self,
cascade_router,
interval_seconds: int = 60, # noqa: ANN001
) -> None:
"""Start periodic snapshot capture."""
async def _loop() -> None:
while True:
try:
self._capture_snapshot(cascade_router)
logger.debug("Recorded health snapshot")
except Exception:
logger.exception("Failed to record health snapshot")
await asyncio.sleep(interval_seconds)
self._bg_task = asyncio.create_task(_loop())
logger.info("Health history background task started (interval=%ds)", interval_seconds)
def get_history_store() -> HealthHistoryStore:
"""Get or create the singleton history store."""
global _store # noqa: PLW0603
if _store is None:
_store = HealthHistoryStore()
return _store

View File

@@ -0,0 +1,20 @@
"""SOUL.md framework — load, validate, and version agent identity documents.
Provides:
- ``SoulLoader`` — parse SOUL.md files into structured data
- ``SoulValidator`` — validate structure and check for contradictions
- ``SoulVersioner`` — track identity evolution over time
"""
from .loader import SoulDocument, SoulLoader
from .validator import SoulValidator, ValidationResult
from .versioning import SoulVersioner
__all__ = [
"SoulDocument",
"SoulLoader",
"SoulValidator",
"SoulVersioner",
"ValidationResult",
]

View File

@@ -0,0 +1,238 @@
"""Load and parse SOUL.md files into structured data.
A SOUL.md is a Markdown file with specific sections that define an agent's
identity, values, constraints, and behavior. This loader extracts those
sections into a ``SoulDocument`` for programmatic access.
"""
from __future__ import annotations
import logging
import re
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Data model
# ---------------------------------------------------------------------------
# Recognised H2 section headings (case-insensitive match)
REQUIRED_SECTIONS = frozenset({
"identity",
"values",
"prime directive",
"audience awareness",
"constraints",
})
OPTIONAL_SECTIONS = frozenset({
"behavior",
"boundaries",
})
ALL_SECTIONS = REQUIRED_SECTIONS | OPTIONAL_SECTIONS
@dataclass
class SoulDocument:
"""Parsed representation of a SOUL.md file."""
# Header paragraph (text before the first H2)
preamble: str = ""
# Identity fields
name: str = ""
role: str = ""
lineage: str = ""
version: str = ""
# Ordered list of (value_name, definition) pairs
values: list[tuple[str, str]] = field(default_factory=list)
# Prime directive — single sentence
prime_directive: str = ""
# Audience awareness
audience: dict[str, str] = field(default_factory=dict)
# Constraints — ordered list
constraints: list[str] = field(default_factory=list)
# Optional sections
behavior: list[str] = field(default_factory=list)
boundaries: list[str] = field(default_factory=list)
# Raw section text keyed by lowercase heading
raw_sections: dict[str, str] = field(default_factory=dict)
# Source path (if loaded from file)
source_path: Path | None = None
def value_names(self) -> list[str]:
"""Return the ordered list of value names."""
return [name for name, _ in self.values]
# ---------------------------------------------------------------------------
# Parser helpers
# ---------------------------------------------------------------------------
_H1_RE = re.compile(r"^#\s+(.+)", re.MULTILINE)
_H2_RE = re.compile(r"^##\s+(.+)", re.MULTILINE)
_BOLD_ITEM_RE = re.compile(r"^(?:[-*]\s+)?\*\*(.+?)[.*]*\*\*\.?\s*(.*)", re.MULTILINE)
_LIST_ITEM_RE = re.compile(r"^[-*]\s+\*\*(.+?):?\*\*\s*(.*)", re.MULTILINE)
_NUMBERED_RE = re.compile(r"^\d+\.\s+(.+)", re.MULTILINE)
_BULLET_RE = re.compile(r"^[-*]\s+(.+)", re.MULTILINE)
def _split_sections(text: str) -> tuple[str, dict[str, str]]:
"""Split markdown into preamble + dict of H2 sections."""
parts = _H2_RE.split(text)
# parts[0] is text before first H2 (preamble)
preamble = parts[0].strip() if parts else ""
sections: dict[str, str] = {}
# Remaining parts alternate: heading, body, heading, body, ...
for i in range(1, len(parts), 2):
heading = parts[i].strip().lower()
body = parts[i + 1].strip() if i + 1 < len(parts) else ""
sections[heading] = body
return preamble, sections
def _parse_identity(text: str) -> dict[str, str]:
"""Extract identity key-value pairs from section text."""
result: dict[str, str] = {}
for match in _LIST_ITEM_RE.finditer(text):
key = match.group(1).strip().lower()
value = match.group(2).strip()
result[key] = value
return result
def _parse_values(text: str) -> list[tuple[str, str]]:
"""Extract ordered (name, definition) pairs from the values section."""
values: list[tuple[str, str]] = []
for match in _BOLD_ITEM_RE.finditer(text):
name = match.group(1).strip().rstrip(".")
defn = match.group(2).strip()
values.append((name, defn))
return values
def _parse_list(text: str) -> list[str]:
"""Extract a flat list from numbered or bulleted items."""
items: list[str] = []
for match in _NUMBERED_RE.finditer(text):
items.append(match.group(1).strip())
if not items:
for match in _BULLET_RE.finditer(text):
items.append(match.group(1).strip())
return items
def _parse_audience(text: str) -> dict[str, str]:
"""Extract audience key-value pairs."""
result: dict[str, str] = {}
for match in _LIST_ITEM_RE.finditer(text):
key = match.group(1).strip().lower()
value = match.group(2).strip()
result[key] = value
# Fallback: if no structured items, store raw text
if not result and text.strip():
result["description"] = text.strip()
return result
# ---------------------------------------------------------------------------
# Loader
# ---------------------------------------------------------------------------
class SoulLoader:
"""Load and parse SOUL.md files."""
def load(self, path: str | Path) -> SoulDocument:
"""Load a SOUL.md file from disk and parse it.
Args:
path: Path to the SOUL.md file.
Returns:
Parsed ``SoulDocument``.
Raises:
FileNotFoundError: If the file does not exist.
"""
path = Path(path)
if not path.exists():
raise FileNotFoundError(f"SOUL.md not found: {path}")
text = path.read_text(encoding="utf-8")
doc = self.parse(text)
doc.source_path = path
return doc
def parse(self, text: str) -> SoulDocument:
"""Parse raw markdown text into a ``SoulDocument``.
Args:
text: Raw SOUL.md content.
Returns:
Parsed ``SoulDocument``.
"""
preamble, sections = _split_sections(text)
doc = SoulDocument(preamble=preamble, raw_sections=sections)
# Identity
if "identity" in sections:
identity = _parse_identity(sections["identity"])
doc.name = identity.get("name", "")
doc.role = identity.get("role", "")
doc.lineage = identity.get("lineage", "")
doc.version = identity.get("version", "")
# Values
if "values" in sections:
doc.values = _parse_values(sections["values"])
# Prime Directive
if "prime directive" in sections:
doc.prime_directive = sections["prime directive"].strip()
# Audience Awareness
if "audience awareness" in sections:
doc.audience = _parse_audience(sections["audience awareness"])
# Constraints
if "constraints" in sections:
doc.constraints = _parse_list(sections["constraints"])
# Behavior (optional)
if "behavior" in sections:
doc.behavior = _parse_list(sections["behavior"])
# Boundaries (optional)
if "boundaries" in sections:
doc.boundaries = _parse_list(sections["boundaries"])
# Infer name from H1 if not in Identity section
if not doc.name:
h1_match = _H1_RE.search(preamble)
if h1_match:
title = h1_match.group(1).strip()
# "Timmy — Soul Identity" → "Timmy"
if "" in title:
doc.name = title.split("")[0].strip()
elif "-" in title:
doc.name = title.split("-")[0].strip()
else:
doc.name = title
return doc

View File

@@ -0,0 +1,192 @@
"""Validate SOUL.md structure and check for contradictions.
The validator checks:
1. All required sections are present
2. Identity fields are populated
3. Values are well-formed and ordered
4. Constraints don't contradict each other or values
5. No duplicate values or constraints
"""
from __future__ import annotations
import logging
import re
from dataclasses import dataclass, field
from .loader import REQUIRED_SECTIONS, SoulDocument
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Contradiction patterns
# ---------------------------------------------------------------------------
# Pairs of phrases that indicate contradictory directives.
# Each tuple is (pattern_a, pattern_b) — if both appear in constraints
# or values, the validator flags a potential contradiction.
_CONTRADICTION_PAIRS: list[tuple[str, str]] = [
("always respond immediately", "take time to think"),
("never refuse", "refuse when"),
("always obey", "push back"),
("maximum verbosity", "brevity"),
("never question", "question everything"),
("act autonomously", "always ask permission"),
("hide errors", "report all errors"),
("never apologize", "apologize when wrong"),
]
# Negation patterns for detecting self-contradicting single statements
_NEGATION_RE = re.compile(
r"\b(never|always|must not|must|do not|cannot)\b", re.IGNORECASE
)
# ---------------------------------------------------------------------------
# Result model
# ---------------------------------------------------------------------------
@dataclass
class ValidationResult:
"""Outcome of a SOUL.md validation pass."""
valid: bool = True
errors: list[str] = field(default_factory=list)
warnings: list[str] = field(default_factory=list)
def add_error(self, msg: str) -> None:
"""Record a validation error (makes result invalid)."""
self.errors.append(msg)
self.valid = False
def add_warning(self, msg: str) -> None:
"""Record a non-fatal warning."""
self.warnings.append(msg)
# ---------------------------------------------------------------------------
# Validator
# ---------------------------------------------------------------------------
class SoulValidator:
"""Validate a ``SoulDocument`` for structural and semantic issues."""
def validate(self, doc: SoulDocument) -> ValidationResult:
"""Run all validation checks on a parsed SOUL.md.
Args:
doc: Parsed ``SoulDocument`` to validate.
Returns:
``ValidationResult`` with errors and warnings.
"""
result = ValidationResult()
self._check_required_sections(doc, result)
self._check_identity(doc, result)
self._check_values(doc, result)
self._check_prime_directive(doc, result)
self._check_constraints(doc, result)
self._check_contradictions(doc, result)
return result
def _check_required_sections(
self, doc: SoulDocument, result: ValidationResult
) -> None:
"""Verify all required H2 sections are present."""
present = set(doc.raw_sections.keys())
for section in REQUIRED_SECTIONS:
if section not in present:
result.add_error(f"Missing required section: '{section}'")
def _check_identity(self, doc: SoulDocument, result: ValidationResult) -> None:
"""Verify identity fields are populated."""
if not doc.name:
result.add_error("Identity: 'name' is missing or empty")
if not doc.role:
result.add_error("Identity: 'role' is missing or empty")
if not doc.version:
result.add_warning("Identity: 'version' is not set — recommended for tracking")
def _check_values(self, doc: SoulDocument, result: ValidationResult) -> None:
"""Check values are well-formed."""
if not doc.values:
result.add_error("Values section is empty — at least one value required")
return
if len(doc.values) > 8:
result.add_warning(
f"Too many values ({len(doc.values)}) — "
"consider prioritizing to 36 for clarity"
)
# Check for duplicates
names = [name.lower() for name, _ in doc.values]
seen: set[str] = set()
for name in names:
if name in seen:
result.add_error(f"Duplicate value: '{name}'")
seen.add(name)
# Check for empty definitions
for name, defn in doc.values:
if not defn.strip():
result.add_warning(f"Value '{name}' has no definition")
def _check_prime_directive(
self, doc: SoulDocument, result: ValidationResult
) -> None:
"""Check the prime directive is present and concise."""
if not doc.prime_directive:
result.add_error("Prime directive is missing or empty")
return
# Warn if excessively long (more than ~200 chars suggests multiple sentences)
if len(doc.prime_directive) > 300:
result.add_warning(
"Prime directive is long — consider condensing to a single sentence"
)
def _check_constraints(
self, doc: SoulDocument, result: ValidationResult
) -> None:
"""Check constraints are present and not duplicated."""
if not doc.constraints:
result.add_warning("No constraints defined — consider adding hard rules")
return
# Check for duplicates (fuzzy: lowercase + stripped)
normalized = [c.lower().strip() for c in doc.constraints]
seen: set[str] = set()
for i, norm in enumerate(normalized):
if norm in seen:
result.add_warning(
f"Possible duplicate constraint: '{doc.constraints[i]}'"
)
seen.add(norm)
def _check_contradictions(
self, doc: SoulDocument, result: ValidationResult
) -> None:
"""Scan for contradictory directives across values, constraints, and boundaries."""
# Collect all directive text for scanning
all_text: list[str] = []
for _, defn in doc.values:
all_text.append(defn.lower())
for constraint in doc.constraints:
all_text.append(constraint.lower())
for boundary in doc.boundaries:
all_text.append(boundary.lower())
if doc.prime_directive:
all_text.append(doc.prime_directive.lower())
combined = " ".join(all_text)
for pattern_a, pattern_b in _CONTRADICTION_PAIRS:
if pattern_a.lower() in combined and pattern_b.lower() in combined:
result.add_warning(
f"Potential contradiction: '{pattern_a}' conflicts with '{pattern_b}'"
)

View File

@@ -0,0 +1,162 @@
"""Track SOUL.md version history using content hashing.
Each version snapshot stores the document hash, version string, and a
timestamp. This allows detecting identity drift and auditing changes
over time without requiring git history.
"""
from __future__ import annotations
import hashlib
import json
import logging
from dataclasses import asdict, dataclass, field
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from .loader import SoulDocument
logger = logging.getLogger(__name__)
DEFAULT_HISTORY_DIR = Path("data/soul_versions")
@dataclass
class VersionSnapshot:
"""A single point-in-time record of a SOUL.md state."""
version: str
content_hash: str
agent_name: str
timestamp: str
value_names: list[str] = field(default_factory=list)
constraint_count: int = 0
source_path: str = ""
def to_dict(self) -> dict[str, Any]:
"""Serialize to a JSON-compatible dict."""
return asdict(self)
@classmethod
def from_dict(cls, data: dict[str, Any]) -> VersionSnapshot:
"""Deserialize from a dict."""
return cls(**data)
class SoulVersioner:
"""Track and query SOUL.md version history.
Snapshots are stored as a JSON Lines file per agent. Each line is a
``VersionSnapshot`` recording the state at a point in time.
Args:
history_dir: Directory to store version history files.
"""
def __init__(self, history_dir: str | Path | None = None) -> None:
self._history_dir = Path(history_dir) if history_dir else DEFAULT_HISTORY_DIR
def snapshot(self, doc: SoulDocument) -> VersionSnapshot:
"""Create a version snapshot from the current document state.
Args:
doc: Parsed ``SoulDocument``.
Returns:
``VersionSnapshot`` capturing the current state.
"""
# Hash the raw section content for change detection
raw_content = json.dumps(doc.raw_sections, sort_keys=True)
content_hash = hashlib.sha256(raw_content.encode("utf-8")).hexdigest()[:16]
return VersionSnapshot(
version=doc.version or "0.0.0",
content_hash=content_hash,
agent_name=doc.name or "unknown",
timestamp=datetime.now(UTC).isoformat(),
value_names=doc.value_names(),
constraint_count=len(doc.constraints),
source_path=str(doc.source_path) if doc.source_path else "",
)
def record(self, doc: SoulDocument) -> VersionSnapshot:
"""Create a snapshot and persist it to the history file.
Skips writing if the latest snapshot has the same content hash
(no actual changes).
Args:
doc: Parsed ``SoulDocument``.
Returns:
The ``VersionSnapshot`` (whether newly written or existing).
"""
snap = self.snapshot(doc)
# Check if latest snapshot is identical
history = self.get_history(snap.agent_name)
if history and history[-1].content_hash == snap.content_hash:
logger.debug(
"SOUL.md unchanged for %s (hash=%s), skipping record",
snap.agent_name,
snap.content_hash,
)
return history[-1]
# Persist
self._history_dir.mkdir(parents=True, exist_ok=True)
history_file = self._history_dir / f"{snap.agent_name.lower()}.jsonl"
with open(history_file, "a", encoding="utf-8") as fh:
fh.write(json.dumps(snap.to_dict()) + "\n")
logger.info(
"Recorded SOUL.md version %s for %s (hash=%s)",
snap.version,
snap.agent_name,
snap.content_hash,
)
return snap
def get_history(self, agent_name: str) -> list[VersionSnapshot]:
"""Load the full version history for an agent.
Args:
agent_name: Name of the agent.
Returns:
List of ``VersionSnapshot`` in chronological order.
"""
history_file = self._history_dir / f"{agent_name.lower()}.jsonl"
if not history_file.exists():
return []
snapshots: list[VersionSnapshot] = []
for line in history_file.read_text(encoding="utf-8").splitlines():
line = line.strip()
if not line:
continue
try:
data = json.loads(line)
snapshots.append(VersionSnapshot.from_dict(data))
except (json.JSONDecodeError, TypeError) as exc:
logger.warning("Skipping malformed version record: %s", exc)
return snapshots
def has_changed(self, doc: SoulDocument) -> bool:
"""Check whether a document has changed since the last recorded snapshot.
Args:
doc: Parsed ``SoulDocument``.
Returns:
True if the content hash differs from the latest snapshot, or
if no history exists yet.
"""
snap = self.snapshot(doc)
history = self.get_history(snap.agent_name)
if not history:
return True
return history[-1].content_hash != snap.content_hash

View File

@@ -0,0 +1,166 @@
"""Visitor state tracking for the Matrix frontend.
Tracks active visitors as they connect and move around the 3D world,
and provides serialization for Matrix protocol broadcast messages.
"""
import time
from dataclasses import dataclass, field
from datetime import UTC, datetime
@dataclass
class VisitorState:
"""State for a single visitor in the Matrix.
Attributes
----------
visitor_id: Unique identifier for the visitor (client ID).
display_name: Human-readable name shown above the visitor.
position: 3D coordinates (x, y, z) in the world.
rotation: Rotation angle in degrees (0-360).
connected_at: ISO timestamp when the visitor connected.
"""
visitor_id: str
display_name: str = ""
position: dict[str, float] = field(default_factory=lambda: {"x": 0.0, "y": 0.0, "z": 0.0})
rotation: float = 0.0
connected_at: str = field(
default_factory=lambda: datetime.now(UTC).strftime("%Y-%m-%dT%H:%M:%SZ")
)
def __post_init__(self):
"""Set display_name to visitor_id if not provided; copy position dict."""
if not self.display_name:
self.display_name = self.visitor_id
# Copy position to avoid shared mutable state
self.position = dict(self.position)
class VisitorRegistry:
"""Registry of active visitors in the Matrix.
Thread-safe singleton pattern (Python GIL protects dict operations).
Used by the WebSocket layer to track and broadcast visitor positions.
"""
_instance: "VisitorRegistry | None" = None
def __new__(cls) -> "VisitorRegistry":
"""Singleton constructor."""
if cls._instance is None:
cls._instance = super().__new__(cls)
cls._instance._visitors: dict[str, VisitorState] = {}
return cls._instance
def add(
self, visitor_id: str, display_name: str = "", position: dict | None = None
) -> VisitorState:
"""Add a new visitor to the registry.
Parameters
----------
visitor_id: Unique identifier for the visitor.
display_name: Optional display name (defaults to visitor_id).
position: Optional initial position (defaults to origin).
Returns
-------
The newly created VisitorState.
"""
visitor = VisitorState(
visitor_id=visitor_id,
display_name=display_name,
position=position if position else {"x": 0.0, "y": 0.0, "z": 0.0},
)
self._visitors[visitor_id] = visitor
return visitor
def remove(self, visitor_id: str) -> bool:
"""Remove a visitor from the registry.
Parameters
----------
visitor_id: The visitor to remove.
Returns
-------
True if the visitor was found and removed, False otherwise.
"""
if visitor_id in self._visitors:
del self._visitors[visitor_id]
return True
return False
def update_position(
self,
visitor_id: str,
position: dict[str, float],
rotation: float | None = None,
) -> bool:
"""Update a visitor's position and rotation.
Parameters
----------
visitor_id: The visitor to update.
position: New 3D coordinates (x, y, z).
rotation: Optional new rotation angle.
Returns
-------
True if the visitor was found and updated, False otherwise.
"""
if visitor_id not in self._visitors:
return False
self._visitors[visitor_id].position = position
if rotation is not None:
self._visitors[visitor_id].rotation = rotation
return True
def get(self, visitor_id: str) -> VisitorState | None:
"""Get a single visitor's state.
Parameters
----------
visitor_id: The visitor to retrieve.
Returns
-------
The VisitorState if found, None otherwise.
"""
return self._visitors.get(visitor_id)
def get_all(self) -> list[dict]:
"""Get all active visitors as Matrix protocol message dicts.
Returns
-------
List of visitor_state dicts ready for WebSocket broadcast.
Each dict has: type, visitor_id, data (with display_name,
position, rotation, connected_at), and ts.
"""
now = int(time.time())
return [
{
"type": "visitor_state",
"visitor_id": v.visitor_id,
"data": {
"display_name": v.display_name,
"position": v.position,
"rotation": v.rotation,
"connected_at": v.connected_at,
},
"ts": now,
}
for v in self._visitors.values()
]
def clear(self) -> None:
"""Remove all visitors (useful for testing)."""
self._visitors.clear()
def __len__(self) -> int:
"""Return the number of active visitors."""
return len(self._visitors)

View File

@@ -515,25 +515,36 @@ class DiscordVendor(ChatPlatform):
async def _handle_message(self, message) -> None:
"""Process an incoming message and respond via a thread."""
# Strip the bot mention from the message content
content = message.content
if self._client.user:
content = content.replace(f"<@{self._client.user.id}>", "").strip()
content = self._extract_content(message)
if not content:
return
# Create or reuse a thread for this conversation
thread = await self._get_or_create_thread(message)
target = thread or message.channel
session_id = f"discord_{thread.id}" if thread else f"discord_{message.channel.id}"
# Derive session_id for per-conversation history via Agno's SQLite
if thread:
session_id = f"discord_{thread.id}"
else:
session_id = f"discord_{message.channel.id}"
run_output, response = await self._invoke_agent(content, session_id, target)
# Run Timmy agent with typing indicator and timeout
if run_output is not None:
await self._handle_paused_run(run_output, target, session_id)
raw_content = run_output.content if hasattr(run_output, "content") else ""
response = _clean_response(raw_content or "")
await self._send_response(response, target)
def _extract_content(self, message) -> str:
"""Strip the bot mention and return clean message text."""
content = message.content
if self._client.user:
content = content.replace(f"<@{self._client.user.id}>", "").strip()
return content
async def _invoke_agent(self, content: str, session_id: str, target):
"""Run chat_with_tools with a typing indicator and timeout.
Returns a (run_output, error_response) tuple. On success the
error_response is ``None``; on failure run_output is ``None``.
"""
run_output = None
response = None
try:
@@ -548,51 +559,57 @@ class DiscordVendor(ChatPlatform):
except Exception as exc:
logger.error("Discord: chat_with_tools() failed: %s", exc)
response = "I'm having trouble reaching my inference backend right now. Please try again shortly."
return run_output, response
# Check if Agno paused the run for tool confirmation
if run_output is not None:
status = getattr(run_output, "status", None)
is_paused = status == "PAUSED" or str(status) == "RunStatus.paused"
async def _handle_paused_run(self, run_output, target, session_id: str) -> None:
"""If Agno paused the run for tool confirmation, enqueue approvals."""
status = getattr(run_output, "status", None)
is_paused = status == "PAUSED" or str(status) == "RunStatus.paused"
if is_paused and getattr(run_output, "active_requirements", None):
from config import settings
if not (is_paused and getattr(run_output, "active_requirements", None)):
return
if settings.discord_confirm_actions:
for req in run_output.active_requirements:
if getattr(req, "needs_confirmation", False):
te = req.tool_execution
tool_name = getattr(te, "tool_name", "unknown")
tool_args = getattr(te, "tool_args", {}) or {}
from config import settings
from timmy.approvals import create_item
if not settings.discord_confirm_actions:
return
item = create_item(
title=f"Discord: {tool_name}",
description=_format_action_description(tool_name, tool_args),
proposed_action=json.dumps({"tool": tool_name, "args": tool_args}),
impact=_get_impact_level(tool_name),
)
self._pending_actions[item.id] = {
"run_output": run_output,
"requirement": req,
"tool_name": tool_name,
"tool_args": tool_args,
"target": target,
"session_id": session_id,
}
await self._send_confirmation(target, tool_name, tool_args, item.id)
for req in run_output.active_requirements:
if not getattr(req, "needs_confirmation", False):
continue
te = req.tool_execution
tool_name = getattr(te, "tool_name", "unknown")
tool_args = getattr(te, "tool_args", {}) or {}
raw_content = run_output.content if hasattr(run_output, "content") else ""
response = _clean_response(raw_content or "")
from timmy.approvals import create_item
# Discord has a 2000 character limit — send with error handling
if response and response.strip():
for chunk in _chunk_message(response, 2000):
try:
await target.send(chunk)
except Exception as exc:
logger.error("Discord: failed to send message chunk: %s", exc)
break
item = create_item(
title=f"Discord: {tool_name}",
description=_format_action_description(tool_name, tool_args),
proposed_action=json.dumps({"tool": tool_name, "args": tool_args}),
impact=_get_impact_level(tool_name),
)
self._pending_actions[item.id] = {
"run_output": run_output,
"requirement": req,
"tool_name": tool_name,
"tool_args": tool_args,
"target": target,
"session_id": session_id,
}
await self._send_confirmation(target, tool_name, tool_args, item.id)
@staticmethod
async def _send_response(response: str | None, target) -> None:
"""Send a response to Discord, chunked to the 2000-char limit."""
if not response or not response.strip():
return
for chunk in _chunk_message(response, 2000):
try:
await target.send(chunk)
except Exception as exc:
logger.error("Discord: failed to send message chunk: %s", exc)
break
async def _get_or_create_thread(self, message):
"""Get the active thread for a channel, or create one.

View File

@@ -0,0 +1 @@
"""Lightning Network integration for tool-usage micro-payments."""

69
src/lightning/factory.py Normal file
View File

@@ -0,0 +1,69 @@
"""Lightning backend factory.
Returns a mock or real LND backend based on ``settings.lightning_backend``.
"""
from __future__ import annotations
import hashlib
import logging
import secrets
from dataclasses import dataclass
from config import settings
logger = logging.getLogger(__name__)
@dataclass
class Invoice:
"""Minimal Lightning invoice representation."""
payment_hash: str
payment_request: str
amount_sats: int
memo: str
class MockBackend:
"""In-memory mock Lightning backend for development and testing."""
def create_invoice(self, amount_sats: int, memo: str = "") -> Invoice:
"""Create a fake invoice with a random payment hash."""
raw = secrets.token_bytes(32)
payment_hash = hashlib.sha256(raw).hexdigest()
payment_request = f"lnbc{amount_sats}mock{payment_hash[:20]}"
logger.debug("Mock invoice: %s sats — %s", amount_sats, payment_hash[:12])
return Invoice(
payment_hash=payment_hash,
payment_request=payment_request,
amount_sats=amount_sats,
memo=memo,
)
# Singleton — lazily created
_backend: MockBackend | None = None
def get_backend() -> MockBackend:
"""Return the configured Lightning backend (currently mock-only).
Raises ``ValueError`` if an unsupported backend is requested.
"""
global _backend # noqa: PLW0603
if _backend is not None:
return _backend
kind = settings.lightning_backend
if kind == "mock":
_backend = MockBackend()
elif kind == "lnd":
# LND gRPC integration is on the roadmap — for now fall back to mock.
logger.warning("LND backend not yet implemented — using mock")
_backend = MockBackend()
else:
raise ValueError(f"Unknown lightning_backend: {kind!r}")
logger.info("Lightning backend: %s", kind)
return _backend

146
src/lightning/ledger.py Normal file
View File

@@ -0,0 +1,146 @@
"""In-memory Lightning transaction ledger.
Tracks invoices, settlements, and balances per the schema in
``docs/adr/018-lightning-ledger.md``. Uses a simple in-memory list so the
dashboard can display real (ephemeral) data without requiring SQLite yet.
"""
from __future__ import annotations
import logging
import uuid
from dataclasses import dataclass
from datetime import UTC, datetime
from enum import StrEnum
logger = logging.getLogger(__name__)
class TxType(StrEnum):
incoming = "incoming"
outgoing = "outgoing"
class TxStatus(StrEnum):
pending = "pending"
settled = "settled"
failed = "failed"
expired = "expired"
@dataclass
class LedgerEntry:
"""Single ledger row matching the ADR-018 schema."""
id: str
tx_type: TxType
status: TxStatus
payment_hash: str
amount_sats: int
memo: str
source: str
created_at: str
invoice: str = ""
preimage: str = ""
task_id: str = ""
agent_id: str = ""
settled_at: str = ""
fee_sats: int = 0
# ── In-memory store ──────────────────────────────────────────────────
_entries: list[LedgerEntry] = []
def create_invoice_entry(
payment_hash: str,
amount_sats: int,
memo: str = "",
source: str = "tool_usage",
task_id: str = "",
agent_id: str = "",
invoice: str = "",
) -> LedgerEntry:
"""Record a new incoming invoice in the ledger."""
entry = LedgerEntry(
id=uuid.uuid4().hex[:16],
tx_type=TxType.incoming,
status=TxStatus.pending,
payment_hash=payment_hash,
amount_sats=amount_sats,
memo=memo,
source=source,
task_id=task_id,
agent_id=agent_id,
invoice=invoice,
created_at=datetime.now(UTC).isoformat(),
)
_entries.append(entry)
logger.debug("Ledger entry created: %s (%s sats)", entry.id, amount_sats)
return entry
def mark_settled(payment_hash: str, preimage: str = "") -> LedgerEntry | None:
"""Mark a pending entry as settled by payment hash."""
for entry in _entries:
if entry.payment_hash == payment_hash and entry.status == TxStatus.pending:
entry.status = TxStatus.settled
entry.preimage = preimage
entry.settled_at = datetime.now(UTC).isoformat()
logger.debug("Ledger settled: %s", payment_hash[:12])
return entry
return None
def get_balance() -> dict:
"""Compute the current balance from settled and pending entries."""
incoming_total = sum(
e.amount_sats
for e in _entries
if e.tx_type == TxType.incoming and e.status == TxStatus.settled
)
outgoing_total = sum(
e.amount_sats
for e in _entries
if e.tx_type == TxType.outgoing and e.status == TxStatus.settled
)
fees = sum(e.fee_sats for e in _entries if e.status == TxStatus.settled)
pending_in = sum(
e.amount_sats
for e in _entries
if e.tx_type == TxType.incoming and e.status == TxStatus.pending
)
pending_out = sum(
e.amount_sats
for e in _entries
if e.tx_type == TxType.outgoing and e.status == TxStatus.pending
)
net = incoming_total - outgoing_total - fees
return {
"incoming_total_sats": incoming_total,
"outgoing_total_sats": outgoing_total,
"fees_paid_sats": fees,
"net_sats": net,
"pending_incoming_sats": pending_in,
"pending_outgoing_sats": pending_out,
"available_sats": net - pending_out,
}
def get_transactions(
tx_type: str | None = None,
status: str | None = None,
limit: int = 50,
) -> list[LedgerEntry]:
"""Return ledger entries, optionally filtered."""
result = _entries
if tx_type:
result = [e for e in result if e.tx_type.value == tx_type]
if status:
result = [e for e in result if e.status.value == status]
return list(reversed(result))[:limit]
def clear() -> None:
"""Reset the ledger (for testing)."""
_entries.clear()

View File

@@ -119,75 +119,84 @@ class BaseAgent(ABC):
"""
pass
async def run(self, message: str) -> str:
"""Run the agent with a message.
# Transient errors that indicate Ollama contention or temporary
# unavailability — these deserve a retry with backoff.
_TRANSIENT = (
httpx.ConnectError,
httpx.ReadError,
httpx.ReadTimeout,
httpx.ConnectTimeout,
ConnectionError,
TimeoutError,
)
Retries on transient failures (connection errors, timeouts) with
exponential backoff. GPU contention from concurrent Ollama
requests causes ReadError / ReadTimeout — these are transient
and should be retried, not raised immediately (#70).
async def run(self, message: str, *, max_retries: int = 3) -> str:
"""Run the agent with a message, retrying on transient failures.
Returns:
Agent response
GPU contention from concurrent Ollama requests causes ReadError /
ReadTimeout — these are transient and retried with exponential
backoff (#70).
"""
max_retries = 3
last_exception = None
# Transient errors that indicate Ollama contention or temporary
# unavailability — these deserve a retry with backoff.
_transient = (
httpx.ConnectError,
httpx.ReadError,
httpx.ReadTimeout,
httpx.ConnectTimeout,
ConnectionError,
TimeoutError,
)
response = await self._run_with_retries(message, max_retries)
await self._emit_response_event(message, response)
return response
async def _run_with_retries(self, message: str, max_retries: int) -> str:
"""Execute agent.run() with retry logic for transient errors."""
for attempt in range(1, max_retries + 1):
try:
result = self.agent.run(message, stream=False)
response = result.content if hasattr(result, "content") else str(result)
break # Success, exit the retry loop
except _transient as exc:
last_exception = exc
if attempt < max_retries:
# Contention backoff — longer waits because the GPU
# needs time to finish the other request.
wait = min(2**attempt, 16)
logger.warning(
"Ollama contention on attempt %d/%d: %s. Waiting %ds before retry...",
attempt,
max_retries,
type(exc).__name__,
wait,
)
await asyncio.sleep(wait)
else:
logger.error(
"Ollama unreachable after %d attempts: %s",
max_retries,
exc,
)
raise last_exception from exc
return result.content if hasattr(result, "content") else str(result)
except self._TRANSIENT as exc:
self._handle_retry_or_raise(
exc,
attempt,
max_retries,
transient=True,
)
await asyncio.sleep(min(2**attempt, 16))
except Exception as exc:
last_exception = exc
if attempt < max_retries:
logger.warning(
"Agent run failed on attempt %d/%d: %s. Retrying...",
attempt,
max_retries,
exc,
)
await asyncio.sleep(min(2 ** (attempt - 1), 8))
else:
logger.error(
"Agent run failed after %d attempts: %s",
max_retries,
exc,
)
raise last_exception from exc
self._handle_retry_or_raise(
exc,
attempt,
max_retries,
transient=False,
)
await asyncio.sleep(min(2 ** (attempt - 1), 8))
# Unreachable — _handle_retry_or_raise raises on last attempt.
raise RuntimeError("retry loop exited unexpectedly") # pragma: no cover
# Emit completion event
@staticmethod
def _handle_retry_or_raise(
exc: Exception,
attempt: int,
max_retries: int,
*,
transient: bool,
) -> None:
"""Log a retry warning or raise after exhausting attempts."""
if attempt < max_retries:
if transient:
logger.warning(
"Ollama contention on attempt %d/%d: %s. Waiting before retry...",
attempt,
max_retries,
type(exc).__name__,
)
else:
logger.warning(
"Agent run failed on attempt %d/%d: %s. Retrying...",
attempt,
max_retries,
exc,
)
else:
label = "Ollama unreachable" if transient else "Agent run failed"
logger.error("%s after %d attempts: %s", label, max_retries, exc)
raise exc
async def _emit_response_event(self, message: str, response: str) -> None:
"""Publish a completion event to the event bus if connected."""
if self.event_bus:
await self.event_bus.publish(
Event(
@@ -197,8 +206,6 @@ class BaseAgent(ABC):
)
)
return response
def get_capabilities(self) -> list[str]:
"""Get list of capabilities this agent provides."""
return self.tools

View File

@@ -102,9 +102,11 @@ class GrokBackend:
import httpx
from openai import OpenAI
from config import settings
return OpenAI(
api_key=self._api_key,
base_url="https://api.x.ai/v1",
base_url=settings.xai_base_url,
timeout=httpx.Timeout(300.0),
)
@@ -113,9 +115,11 @@ class GrokBackend:
import httpx
from openai import AsyncOpenAI
from config import settings
return AsyncOpenAI(
api_key=self._api_key,
base_url="https://api.x.ai/v1",
base_url=settings.xai_base_url,
timeout=httpx.Timeout(300.0),
)
@@ -260,6 +264,7 @@ class GrokBackend:
},
}
except Exception as exc:
logger.exception("Grok health check failed")
return {
"ok": False,
"error": str(exc),
@@ -426,6 +431,7 @@ class ClaudeBackend:
)
return {"ok": True, "error": None, "backend": "claude", "model": self._model}
except Exception as exc:
logger.exception("Claude health check failed")
return {"ok": False, "error": str(exc), "backend": "claude", "model": self._model}
# ── Private helpers ───────────────────────────────────────────────────

View File

@@ -37,6 +37,68 @@ def _is_interactive() -> bool:
return hasattr(sys.stdin, "isatty") and sys.stdin.isatty()
def _read_message_input(message: list[str]) -> str:
"""Join CLI args into a message, reading from stdin when requested.
Returns the final message string. Raises ``typer.Exit(1)`` when
stdin is explicitly requested (``-``) but empty.
"""
message_str = " ".join(message)
if message_str == "-" or not _is_interactive():
try:
stdin_content = sys.stdin.read().strip()
except (KeyboardInterrupt, EOFError):
stdin_content = ""
if stdin_content:
message_str = stdin_content
elif message_str == "-":
typer.echo("No input provided via stdin.", err=True)
raise typer.Exit(1)
return message_str
def _resolve_session_id(session_id: str | None, new_session: bool) -> str:
"""Return the effective session ID for a chat invocation."""
import uuid
if session_id is not None:
return session_id
if new_session:
return str(uuid.uuid4())
return _CLI_SESSION_ID
def _prompt_interactive(req, tool_name: str, tool_args: dict) -> None:
"""Display tool details and prompt the human for approval."""
description = format_action_description(tool_name, tool_args)
impact = get_impact_level(tool_name)
typer.echo()
typer.echo(typer.style("Tool confirmation required", bold=True))
typer.echo(f" Impact: {impact.upper()}")
typer.echo(f" {description}")
typer.echo()
if typer.confirm("Allow this action?", default=False):
req.confirm()
logger.info("CLI: approved %s", tool_name)
else:
req.reject(note="User rejected from CLI")
logger.info("CLI: rejected %s", tool_name)
def _decide_autonomous(req, tool_name: str, tool_args: dict) -> None:
"""Auto-approve allowlisted tools; reject everything else."""
if is_allowlisted(tool_name, tool_args):
req.confirm()
logger.info("AUTO-APPROVED (allowlist): %s", tool_name)
else:
req.reject(note="Auto-rejected: not in allowlist")
logger.info("AUTO-REJECTED (not allowlisted): %s %s", tool_name, str(tool_args)[:100])
def _handle_tool_confirmation(agent, run_output, session_id: str, *, autonomous: bool = False):
"""Prompt user to approve/reject dangerous tool calls.
@@ -51,6 +113,7 @@ def _handle_tool_confirmation(agent, run_output, session_id: str, *, autonomous:
Returns the final RunOutput after all confirmations are resolved.
"""
interactive = _is_interactive() and not autonomous
decide = _prompt_interactive if interactive else _decide_autonomous
max_rounds = 10 # safety limit
for _ in range(max_rounds):
@@ -66,39 +129,10 @@ def _handle_tool_confirmation(agent, run_output, session_id: str, *, autonomous:
for req in reqs:
if not getattr(req, "needs_confirmation", False):
continue
te = req.tool_execution
tool_name = getattr(te, "tool_name", "unknown")
tool_args = getattr(te, "tool_args", {}) or {}
if interactive:
# Human present — prompt for approval
description = format_action_description(tool_name, tool_args)
impact = get_impact_level(tool_name)
typer.echo()
typer.echo(typer.style("Tool confirmation required", bold=True))
typer.echo(f" Impact: {impact.upper()}")
typer.echo(f" {description}")
typer.echo()
approved = typer.confirm("Allow this action?", default=False)
if approved:
req.confirm()
logger.info("CLI: approved %s", tool_name)
else:
req.reject(note="User rejected from CLI")
logger.info("CLI: rejected %s", tool_name)
else:
# Autonomous mode — check allowlist
if is_allowlisted(tool_name, tool_args):
req.confirm()
logger.info("AUTO-APPROVED (allowlist): %s", tool_name)
else:
req.reject(note="Auto-rejected: not in allowlist")
logger.info(
"AUTO-REJECTED (not allowlisted): %s %s", tool_name, str(tool_args)[:100]
)
decide(req, tool_name, tool_args)
# Resume the run so the agent sees the confirmation result
try:
@@ -138,10 +172,39 @@ def think(
model_size: str | None = _MODEL_SIZE_OPTION,
):
"""Ask Timmy to think carefully about a topic."""
timmy = create_timmy(backend=backend, model_size=model_size, session_id=_CLI_SESSION_ID)
timmy = create_timmy(backend=backend, session_id=_CLI_SESSION_ID)
timmy.print_response(f"Think carefully about: {topic}", stream=True, session_id=_CLI_SESSION_ID)
def _read_message_input(message: list[str]) -> str:
"""Join CLI arguments and read from stdin when appropriate."""
message_str = " ".join(message)
if message_str == "-" or not _is_interactive():
try:
stdin_content = sys.stdin.read().strip()
except (KeyboardInterrupt, EOFError):
stdin_content = ""
if stdin_content:
message_str = stdin_content
elif message_str == "-":
typer.echo("No input provided via stdin.", err=True)
raise typer.Exit(1)
return message_str
def _resolve_session_id(session_id: str | None, new_session: bool) -> str:
"""Return the effective session ID based on CLI flags."""
import uuid
if session_id is not None:
return session_id
if new_session:
return str(uuid.uuid4())
return _CLI_SESSION_ID
@app.command()
def chat(
message: list[str] = typer.Argument(
@@ -178,38 +241,13 @@ def chat(
Read from stdin by passing "-" as the message or piping input.
"""
import uuid
message_str = _read_message_input(message)
session_id = _resolve_session_id(session_id, new_session)
timmy = create_timmy(backend=backend, session_id=session_id)
# Join multiple arguments into a single message string
message_str = " ".join(message)
# Handle stdin input if "-" is passed or stdin is not a tty
if message_str == "-" or not _is_interactive():
try:
stdin_content = sys.stdin.read().strip()
except (KeyboardInterrupt, EOFError):
stdin_content = ""
if stdin_content:
message_str = stdin_content
elif message_str == "-":
typer.echo("No input provided via stdin.", err=True)
raise typer.Exit(1)
if session_id is not None:
pass # use the provided value
elif new_session:
session_id = str(uuid.uuid4())
else:
session_id = _CLI_SESSION_ID
timmy = create_timmy(backend=backend, model_size=model_size, session_id=session_id)
# Use agent.run() so we can intercept paused runs for tool confirmation.
run_output = timmy.run(message_str, stream=False, session_id=session_id)
# Handle paused runs — dangerous tools need user approval
run_output = _handle_tool_confirmation(timmy, run_output, session_id, autonomous=autonomous)
# Print the final response
content = run_output.content if hasattr(run_output, "content") else str(run_output)
if content:
from timmy.session import _clean_response
@@ -278,7 +316,7 @@ def status(
model_size: str | None = _MODEL_SIZE_OPTION,
):
"""Print Timmy's operational status."""
timmy = create_timmy(backend=backend, model_size=model_size, session_id=_CLI_SESSION_ID)
timmy = create_timmy(backend=backend, session_id=_CLI_SESSION_ID)
timmy.print_response(STATUS_PROMPT, stream=False, session_id=_CLI_SESSION_ID)
@@ -451,5 +489,43 @@ def focus(
typer.echo("No active focus (broad mode).")
@app.command(name="healthcheck")
def healthcheck(
json_output: bool = typer.Option(False, "--json", "-j", help="Output as JSON"),
verbose: bool = typer.Option(
False, "--verbose", "-v", help="Show verbose output including issue details"
),
quiet: bool = typer.Option(False, "--quiet", "-q", help="Only show status line (no details)"),
):
"""Quick health snapshot before coding.
Shows CI status, critical issues (P0/P1), test flakiness, and token economy.
Fast execution (< 5 seconds) for pre-work checks.
Refs: #710
"""
import subprocess
import sys
from pathlib import Path
script_path = (
Path(__file__).resolve().parent.parent.parent
/ "timmy_automations"
/ "daily_run"
/ "health_snapshot.py"
)
cmd = [sys.executable, str(script_path)]
if json_output:
cmd.append("--json")
if verbose:
cmd.append("--verbose")
if quiet:
cmd.append("--quiet")
result = subprocess.run(cmd)
raise typer.Exit(result.returncode)
def main():
app()

View File

@@ -174,15 +174,8 @@ class ConversationManager:
return None
def should_use_tools(self, message: str, context: ConversationContext) -> bool:
"""Determine if this message likely requires tools.
Returns True if tools are likely needed, False for simple chat.
"""
message_lower = message.lower().strip()
# Tool keywords that suggest tool usage is needed
tool_keywords = [
_TOOL_KEYWORDS = frozenset(
{
"search",
"look up",
"find",
@@ -203,10 +196,11 @@ class ConversationManager:
"shell",
"command",
"install",
]
}
)
# Chat-only keywords that definitely don't need tools
chat_only = [
_CHAT_ONLY_KEYWORDS = frozenset(
{
"hello",
"hi ",
"hey",
@@ -221,30 +215,47 @@ class ConversationManager:
"goodbye",
"tell me about yourself",
"what can you do",
]
}
)
# Check for chat-only patterns first
for pattern in chat_only:
if pattern in message_lower:
return False
_SIMPLE_QUESTION_PREFIXES = ("what is", "who is", "how does", "why is", "when did", "where is")
_TIME_WORDS = ("today", "now", "current", "latest", "this week", "this month")
# Check for tool keywords
for keyword in tool_keywords:
if keyword in message_lower:
return True
def _is_chat_only(self, message_lower: str) -> bool:
"""Return True if the message matches a chat-only pattern."""
return any(kw in message_lower for kw in self._CHAT_ONLY_KEYWORDS)
# Simple questions (starting with what, who, how, why, when, where)
# usually don't need tools unless about current/real-time info
simple_question_words = ["what is", "who is", "how does", "why is", "when did", "where is"]
for word in simple_question_words:
if message_lower.startswith(word):
# Check if it's asking about current/real-time info
time_words = ["today", "now", "current", "latest", "this week", "this month"]
if any(t in message_lower for t in time_words):
return True
return False
def _has_tool_keyword(self, message_lower: str) -> bool:
"""Return True if the message contains a tool-related keyword."""
return any(kw in message_lower for kw in self._TOOL_KEYWORDS)
def _is_simple_question(self, message_lower: str) -> bool | None:
"""Check if message is a simple question.
Returns True if it needs tools (real-time info), False if it
doesn't, or None if the message isn't a simple question.
"""
for prefix in self._SIMPLE_QUESTION_PREFIXES:
if message_lower.startswith(prefix):
return any(t in message_lower for t in self._TIME_WORDS)
return None
def should_use_tools(self, message: str, context: ConversationContext) -> bool:
"""Determine if this message likely requires tools.
Returns True if tools are likely needed, False for simple chat.
"""
message_lower = message.lower().strip()
if self._is_chat_only(message_lower):
return False
if self._has_tool_keyword(message_lower):
return True
simple = self._is_simple_question(message_lower)
if simple is not None:
return simple
# Default: don't use tools for unclear cases
return False

View File

@@ -97,6 +97,7 @@ async def probe_tool_use() -> dict:
"error_type": "empty_result",
}
except Exception as exc:
logger.exception("Tool use probe failed")
return {
"success": False,
"capability": cap,
@@ -129,6 +130,7 @@ async def probe_multistep_planning() -> dict:
"error_type": "verification_failed",
}
except Exception as exc:
logger.exception("Multistep planning probe failed")
return {
"success": False,
"capability": cap,
@@ -151,6 +153,7 @@ async def probe_memory_write() -> dict:
"error_type": None,
}
except Exception as exc:
logger.exception("Memory write probe failed")
return {
"success": False,
"capability": cap,
@@ -179,6 +182,7 @@ async def probe_memory_read() -> dict:
"error_type": "empty_result",
}
except Exception as exc:
logger.exception("Memory read probe failed")
return {
"success": False,
"capability": cap,
@@ -214,6 +218,7 @@ async def probe_self_coding() -> dict:
"error_type": "verification_failed",
}
except Exception as exc:
logger.exception("Self-coding probe failed")
return {
"success": False,
"capability": cap,
@@ -325,6 +330,7 @@ class LoopQAOrchestrator:
result = await probe_fn()
except Exception as exc:
# Probe itself crashed — record failure and report
logger.exception("Loop QA probe %s crashed", cap.value)
capture_error(exc, source="loop_qa", context={"capability": cap.value})
result = {
"success": False,

View File

@@ -21,12 +21,16 @@ Usage::
from __future__ import annotations
import logging
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from PIL import ImageDraw
import os
import shutil
import sqlite3
import uuid
from contextlib import closing
from datetime import datetime
from datetime import UTC, datetime
from pathlib import Path
import httpx
@@ -192,7 +196,7 @@ def _bridge_to_work_order(title: str, body: str, category: str) -> None:
body,
category,
"timmy-thinking",
datetime.utcnow().isoformat(),
datetime.now(UTC).isoformat(),
),
)
conn.commit()
@@ -200,15 +204,61 @@ def _bridge_to_work_order(title: str, body: str, category: str) -> None:
logger.debug("Work order bridge failed: %s", exc)
async def _ensure_issue_session():
"""Get or create the cached MCP session, connecting if needed.
Returns the connected ``MCPTools`` instance.
"""
from agno.tools.mcp import MCPTools
global _issue_session
if _issue_session is None:
_issue_session = MCPTools(
server_params=_gitea_server_params(),
timeout_seconds=settings.mcp_timeout,
)
if not getattr(_issue_session, "_connected", False):
await _issue_session.connect()
_issue_session._connected = True
return _issue_session
def _build_issue_body(body: str) -> str:
"""Append the auto-filing signature to the issue body."""
full_body = body
if full_body:
full_body += "\n\n"
full_body += "---\n*Auto-filed by Timmy's thinking engine*"
return full_body
def _build_issue_args(title: str, full_body: str) -> dict:
"""Build MCP tool arguments for ``issue_write`` with method=create."""
owner, repo = settings.gitea_repo.split("/", 1)
return {
"method": "create",
"owner": owner,
"repo": repo,
"title": title,
"body": full_body,
}
def _category_from_labels(labels: str) -> str:
"""Derive a work-order category from comma-separated label names."""
label_list = [tag.strip() for tag in labels.split(",") if tag.strip()] if labels else []
return "bug" if "bug" in label_list else "suggestion"
async def create_gitea_issue_via_mcp(title: str, body: str = "", labels: str = "") -> str:
"""File a Gitea issue via the MCP server (standalone, no LLM loop).
Used by the thinking engine's ``_maybe_file_issues()`` post-hook.
Manages its own MCPTools session with lazy connect + graceful failure.
Uses ``tools.session.call_tool()`` for direct MCP invocation — the
``MCPTools`` wrapper itself does not expose ``call_tool()``.
Args:
title: Issue title.
body: Issue body (markdown).
@@ -221,46 +271,13 @@ async def create_gitea_issue_via_mcp(title: str, body: str = "", labels: str = "
return "Gitea integration is not configured."
try:
from agno.tools.mcp import MCPTools
session = await _ensure_issue_session()
full_body = _build_issue_body(body)
args = _build_issue_args(title, full_body)
global _issue_session
result = await session.session.call_tool("issue_write", arguments=args)
if _issue_session is None:
_issue_session = MCPTools(
server_params=_gitea_server_params(),
timeout_seconds=settings.mcp_timeout,
)
# Ensure connected
if not getattr(_issue_session, "_connected", False):
await _issue_session.connect()
_issue_session._connected = True
# Append auto-filing signature
full_body = body
if full_body:
full_body += "\n\n"
full_body += "---\n*Auto-filed by Timmy's thinking engine*"
# Parse owner/repo from settings
owner, repo = settings.gitea_repo.split("/", 1)
# Build tool arguments — gitea-mcp uses issue_write with method="create"
args = {
"method": "create",
"owner": owner,
"repo": repo,
"title": title,
"body": full_body,
}
# Call via the underlying MCP session (MCPTools doesn't expose call_tool)
result = await _issue_session.session.call_tool("issue_write", arguments=args)
# Bridge to local work order
label_list = [tag.strip() for tag in labels.split(",") if tag.strip()] if labels else []
category = "bug" if "bug" in label_list else "suggestion"
_bridge_to_work_order(title, body, category)
_bridge_to_work_order(title, body, _category_from_labels(labels))
logger.info("Created Gitea issue via MCP: %s", title[:60])
return f"Created issue: {title}\n{result}"
@@ -270,20 +287,8 @@ async def create_gitea_issue_via_mcp(title: str, body: str = "", labels: str = "
return f"Failed to create issue via MCP: {exc}"
def _generate_avatar_image() -> bytes:
"""Generate a Timmy-themed avatar image using Pillow.
Creates a 512x512 wizard-themed avatar with emerald/purple/gold palette.
Returns raw PNG bytes. Falls back to a minimal solid-color image if
Pillow drawing primitives fail.
"""
from PIL import Image, ImageDraw
size = 512
img = Image.new("RGB", (size, size), (15, 25, 20))
draw = ImageDraw.Draw(img)
# Background gradient effect — concentric circles
def _draw_background(draw: ImageDraw.ImageDraw, size: int) -> None:
"""Draw radial gradient background with concentric circles."""
for i in range(size // 2, 0, -4):
g = int(25 + (i / (size // 2)) * 30)
draw.ellipse(
@@ -291,33 +296,45 @@ def _generate_avatar_image() -> bytes:
fill=(10, g, 20),
)
# Wizard hat (triangle)
def _draw_wizard(draw: ImageDraw.ImageDraw) -> None:
"""Draw wizard hat, face, eyes, smile, monogram, and robe."""
hat_color = (100, 50, 160) # purple
draw.polygon(
[(256, 40), (160, 220), (352, 220)],
fill=hat_color,
outline=(180, 130, 255),
)
hat_outline = (180, 130, 255)
gold = (220, 190, 50)
pupil = (30, 30, 60)
# Hat brim
draw.ellipse([140, 200, 372, 250], fill=hat_color, outline=(180, 130, 255))
# Hat + brim
draw.polygon([(256, 40), (160, 220), (352, 220)], fill=hat_color, outline=hat_outline)
draw.ellipse([140, 200, 372, 250], fill=hat_color, outline=hat_outline)
# Face circle
# Face
draw.ellipse([190, 220, 322, 370], fill=(60, 180, 100), outline=(80, 220, 120))
# Eyes
# Eyes (whites + pupils)
draw.ellipse([220, 275, 248, 310], fill=(255, 255, 255))
draw.ellipse([264, 275, 292, 310], fill=(255, 255, 255))
draw.ellipse([228, 285, 242, 300], fill=(30, 30, 60))
draw.ellipse([272, 285, 286, 300], fill=(30, 30, 60))
draw.ellipse([228, 285, 242, 300], fill=pupil)
draw.ellipse([272, 285, 286, 300], fill=pupil)
# Smile
draw.arc([225, 300, 287, 355], start=10, end=170, fill=(30, 30, 60), width=3)
draw.arc([225, 300, 287, 355], start=10, end=170, fill=pupil, width=3)
# Stars around the hat
# "T" monogram on hat
draw.text((243, 100), "T", fill=gold)
# Robe
draw.polygon(
[(180, 370), (140, 500), (372, 500), (332, 370)],
fill=(40, 100, 70),
outline=(60, 160, 100),
)
def _draw_stars(draw: ImageDraw.ImageDraw) -> None:
"""Draw decorative gold stars around the wizard hat."""
gold = (220, 190, 50)
star_positions = [(120, 100), (380, 120), (100, 300), (400, 280), (256, 10)]
for sx, sy in star_positions:
for sx, sy in [(120, 100), (380, 120), (100, 300), (400, 280), (256, 10)]:
r = 8
draw.polygon(
[
@@ -333,18 +350,26 @@ def _generate_avatar_image() -> bytes:
fill=gold,
)
# "T" monogram on the hat
draw.text((243, 100), "T", fill=gold)
# Robe / body
draw.polygon(
[(180, 370), (140, 500), (372, 500), (332, 370)],
fill=(40, 100, 70),
outline=(60, 160, 100),
)
def _generate_avatar_image() -> bytes:
"""Generate a Timmy-themed avatar image using Pillow.
Creates a 512x512 wizard-themed avatar with emerald/purple/gold palette.
Returns raw PNG bytes. Falls back to a minimal solid-color image if
Pillow drawing primitives fail.
"""
import io
from PIL import Image, ImageDraw
size = 512
img = Image.new("RGB", (size, size), (15, 25, 20))
draw = ImageDraw.Draw(img)
_draw_background(draw, size)
_draw_wizard(draw)
_draw_stars(draw)
buf = io.BytesIO()
img.save(buf, format="PNG")
return buf.getvalue()

View File

@@ -78,83 +78,88 @@ def _migrate_schema(conn: sqlite3.Connection) -> None:
cursor = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
tables = {row[0] for row in cursor.fetchall()}
has_memories = "memories" in tables
has_episodes = "episodes" in tables
has_chunks = "chunks" in tables
has_facts = "facts" in tables
# Check if we need to migrate (old schema exists but new one doesn't fully)
if not has_memories:
if "memories" not in tables:
logger.info("Migration: Creating unified memories table")
# Schema will be created above
# Migrate episodes -> memories
if has_episodes and has_memories:
logger.info("Migration: Converting episodes table to memories")
try:
cols = _get_table_columns(conn, "episodes")
context_type_col = "context_type" if "context_type" in cols else "'conversation'"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
metadata, agent_id, task_id, session_id,
created_at, access_count, last_accessed
)
SELECT
id, content,
COALESCE({context_type_col}, 'conversation'),
COALESCE(source, 'agent'),
embedding,
metadata, agent_id, task_id, session_id,
COALESCE(timestamp, datetime('now')), 0, NULL
FROM episodes
""")
conn.execute("DROP TABLE episodes")
logger.info("Migration: Migrated episodes to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate episodes: %s", exc)
# Migrate chunks -> memories as vault_chunk
if has_chunks and has_memories:
logger.info("Migration: Converting chunks table to memories")
try:
cols = _get_table_columns(conn, "chunks")
id_col = "id" if "id" in cols else "CAST(rowid AS TEXT)"
content_col = "content" if "content" in cols else "text"
source_col = (
"filepath" if "filepath" in cols else ("source" if "source" in cols else "'vault'")
)
embedding_col = "embedding" if "embedding" in cols else "NULL"
created_col = "created_at" if "created_at" in cols else "datetime('now')"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
created_at, access_count
)
SELECT
{id_col}, {content_col}, 'vault_chunk', {source_col},
{embedding_col}, {created_col}, 0
FROM chunks
""")
conn.execute("DROP TABLE chunks")
logger.info("Migration: Migrated chunks to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate chunks: %s", exc)
# Drop old facts table
if has_facts:
try:
conn.execute("DROP TABLE facts")
logger.info("Migration: Dropped old facts table")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to drop facts: %s", exc)
# Schema will be created by _ensure_schema above
conn.commit()
return
_migrate_episodes(conn, tables)
_migrate_chunks(conn, tables)
_drop_legacy_tables(conn, tables)
conn.commit()
def _migrate_episodes(conn: sqlite3.Connection, tables: set[str]) -> None:
"""Migrate episodes table rows into the unified memories table."""
if "episodes" not in tables:
return
logger.info("Migration: Converting episodes table to memories")
try:
cols = _get_table_columns(conn, "episodes")
context_type_col = "context_type" if "context_type" in cols else "'conversation'"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
metadata, agent_id, task_id, session_id,
created_at, access_count, last_accessed
)
SELECT
id, content,
COALESCE({context_type_col}, 'conversation'),
COALESCE(source, 'agent'),
embedding,
metadata, agent_id, task_id, session_id,
COALESCE(timestamp, datetime('now')), 0, NULL
FROM episodes
""")
conn.execute("DROP TABLE episodes")
logger.info("Migration: Migrated episodes to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate episodes: %s", exc)
def _migrate_chunks(conn: sqlite3.Connection, tables: set[str]) -> None:
"""Migrate chunks table rows into the unified memories table as vault_chunk."""
if "chunks" not in tables:
return
logger.info("Migration: Converting chunks table to memories")
try:
cols = _get_table_columns(conn, "chunks")
id_col = "id" if "id" in cols else "CAST(rowid AS TEXT)"
content_col = "content" if "content" in cols else "text"
source_col = (
"filepath" if "filepath" in cols else ("source" if "source" in cols else "'vault'")
)
embedding_col = "embedding" if "embedding" in cols else "NULL"
created_col = "created_at" if "created_at" in cols else "datetime('now')"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
created_at, access_count
)
SELECT
{id_col}, {content_col}, 'vault_chunk', {source_col},
{embedding_col}, {created_col}, 0
FROM chunks
""")
conn.execute("DROP TABLE chunks")
logger.info("Migration: Migrated chunks to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate chunks: %s", exc)
def _drop_legacy_tables(conn: sqlite3.Connection, tables: set[str]) -> None:
"""Drop old facts table if it exists."""
if "facts" not in tables:
return
try:
conn.execute("DROP TABLE facts")
logger.info("Migration: Dropped old facts table")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to drop facts: %s", exc)
def _get_table_columns(conn: sqlite3.Connection, table_name: str) -> set[str]:
"""Get the column names for a table."""
cursor = conn.execute(f"PRAGMA table_info({table_name})")

View File

@@ -46,6 +46,64 @@ DB_PATH = PROJECT_ROOT / "data" / "memory.db"
# ───────────────────────────────────────────────────────────────────────────────
_DEFAULT_HOT_MEMORY_TEMPLATE = """\
# Timmy Hot Memory
> Working RAM — always loaded, ~300 lines max, pruned monthly
> Last updated: {date}
---
## Current Status
**Agent State:** Operational
**Mode:** Development
**Active Tasks:** 0
**Pending Decisions:** None
---
## Standing Rules
1. **Sovereignty First** — No cloud dependencies
2. **Local-Only Inference** — Ollama on localhost
3. **Privacy by Design** — Telemetry disabled
4. **Tool Minimalism** — Use tools only when necessary
5. **Memory Discipline** — Write handoffs at session end
---
## Agent Roster
| Agent | Role | Status |
|-------|------|--------|
| Timmy | Core | Active |
---
## User Profile
**Name:** (not set)
**Interests:** (to be learned)
---
## Key Decisions
(none yet)
---
## Pending Actions
- [ ] Learn user's name
---
*Prune date: {prune_date}*
"""
@contextmanager
def get_connection() -> Generator[sqlite3.Connection, None, None]:
"""Get database connection to unified memory database."""
@@ -98,6 +156,73 @@ def _get_table_columns(conn: sqlite3.Connection, table_name: str) -> set[str]:
return {row[1] for row in cursor.fetchall()}
def _migrate_episodes(conn: sqlite3.Connection) -> None:
"""Migrate episodes table rows into the unified memories table."""
logger.info("Migration: Converting episodes table to memories")
try:
cols = _get_table_columns(conn, "episodes")
context_type_col = "context_type" if "context_type" in cols else "'conversation'"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
metadata, agent_id, task_id, session_id,
created_at, access_count, last_accessed
)
SELECT
id, content,
COALESCE({context_type_col}, 'conversation'),
COALESCE(source, 'agent'),
embedding,
metadata, agent_id, task_id, session_id,
COALESCE(timestamp, datetime('now')), 0, NULL
FROM episodes
""")
conn.execute("DROP TABLE episodes")
logger.info("Migration: Migrated episodes to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate episodes: %s", exc)
def _migrate_chunks(conn: sqlite3.Connection) -> None:
"""Migrate chunks table rows into the unified memories table."""
logger.info("Migration: Converting chunks table to memories")
try:
cols = _get_table_columns(conn, "chunks")
id_col = "id" if "id" in cols else "CAST(rowid AS TEXT)"
content_col = "content" if "content" in cols else "text"
source_col = (
"filepath" if "filepath" in cols else ("source" if "source" in cols else "'vault'")
)
embedding_col = "embedding" if "embedding" in cols else "NULL"
created_col = "created_at" if "created_at" in cols else "datetime('now')"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
created_at, access_count
)
SELECT
{id_col}, {content_col}, 'vault_chunk', {source_col},
{embedding_col}, {created_col}, 0
FROM chunks
""")
conn.execute("DROP TABLE chunks")
logger.info("Migration: Migrated chunks to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate chunks: %s", exc)
def _drop_legacy_table(conn: sqlite3.Connection, table: str) -> None:
"""Drop a legacy table if it exists."""
try:
conn.execute(f"DROP TABLE {table}") # noqa: S608
logger.info("Migration: Dropped old %s table", table)
except sqlite3.Error as exc:
logger.warning("Migration: Failed to drop %s: %s", table, exc)
def _migrate_schema(conn: sqlite3.Connection) -> None:
"""Migrate from old three-table schema to unified memories table.
@@ -110,78 +235,16 @@ def _migrate_schema(conn: sqlite3.Connection) -> None:
tables = {row[0] for row in cursor.fetchall()}
has_memories = "memories" in tables
has_episodes = "episodes" in tables
has_chunks = "chunks" in tables
has_facts = "facts" in tables
# Check if we need to migrate (old schema exists)
if not has_memories and (has_episodes or has_chunks or has_facts):
if not has_memories and (tables & {"episodes", "chunks", "facts"}):
logger.info("Migration: Creating unified memories table")
# Schema will be created by _ensure_schema above
# Migrate episodes -> memories
if has_episodes and has_memories:
logger.info("Migration: Converting episodes table to memories")
try:
cols = _get_table_columns(conn, "episodes")
context_type_col = "context_type" if "context_type" in cols else "'conversation'"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
metadata, agent_id, task_id, session_id,
created_at, access_count, last_accessed
)
SELECT
id, content,
COALESCE({context_type_col}, 'conversation'),
COALESCE(source, 'agent'),
embedding,
metadata, agent_id, task_id, session_id,
COALESCE(timestamp, datetime('now')), 0, NULL
FROM episodes
""")
conn.execute("DROP TABLE episodes")
logger.info("Migration: Migrated episodes to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate episodes: %s", exc)
# Migrate chunks -> memories as vault_chunk
if has_chunks and has_memories:
logger.info("Migration: Converting chunks table to memories")
try:
cols = _get_table_columns(conn, "chunks")
id_col = "id" if "id" in cols else "CAST(rowid AS TEXT)"
content_col = "content" if "content" in cols else "text"
source_col = (
"filepath" if "filepath" in cols else ("source" if "source" in cols else "'vault'")
)
embedding_col = "embedding" if "embedding" in cols else "NULL"
created_col = "created_at" if "created_at" in cols else "datetime('now')"
conn.execute(f"""
INSERT INTO memories (
id, content, memory_type, source, embedding,
created_at, access_count
)
SELECT
{id_col}, {content_col}, 'vault_chunk', {source_col},
{embedding_col}, {created_col}, 0
FROM chunks
""")
conn.execute("DROP TABLE chunks")
logger.info("Migration: Migrated chunks to memories")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to migrate chunks: %s", exc)
# Drop old tables
if has_facts:
try:
conn.execute("DROP TABLE facts")
logger.info("Migration: Dropped old facts table")
except sqlite3.Error as exc:
logger.warning("Migration: Failed to drop facts: %s", exc)
if "episodes" in tables and has_memories:
_migrate_episodes(conn)
if "chunks" in tables and has_memories:
_migrate_chunks(conn)
if "facts" in tables:
_drop_legacy_table(conn, "facts")
conn.commit()
@@ -298,6 +361,85 @@ def store_memory(
return entry
def _build_search_filters(
context_type: str | None,
agent_id: str | None,
session_id: str | None,
) -> tuple[str, list]:
"""Build SQL WHERE clause and params from search filters."""
conditions: list[str] = []
params: list = []
if context_type:
conditions.append("memory_type = ?")
params.append(context_type)
if agent_id:
conditions.append("agent_id = ?")
params.append(agent_id)
if session_id:
conditions.append("session_id = ?")
params.append(session_id)
where_clause = "WHERE " + " AND ".join(conditions) if conditions else ""
return where_clause, params
def _fetch_memory_candidates(
where_clause: str, params: list, candidate_limit: int
) -> list[sqlite3.Row]:
"""Fetch candidate memory rows from the database."""
query_sql = f"""
SELECT * FROM memories
{where_clause}
ORDER BY created_at DESC
LIMIT ?
"""
params.append(candidate_limit)
with get_connection() as conn:
return conn.execute(query_sql, params).fetchall()
def _row_to_entry(row: sqlite3.Row) -> MemoryEntry:
"""Convert a database row to a MemoryEntry."""
return MemoryEntry(
id=row["id"],
content=row["content"],
source=row["source"],
context_type=row["memory_type"], # DB column -> API field
agent_id=row["agent_id"],
task_id=row["task_id"],
session_id=row["session_id"],
metadata=json.loads(row["metadata"]) if row["metadata"] else None,
embedding=json.loads(row["embedding"]) if row["embedding"] else None,
timestamp=row["created_at"],
)
def _score_and_filter(
rows: list[sqlite3.Row],
query: str,
query_embedding: list[float],
min_relevance: float,
) -> list[MemoryEntry]:
"""Score candidate rows by similarity and filter by min_relevance."""
results = []
for row in rows:
entry = _row_to_entry(row)
if entry.embedding:
score = cosine_similarity(query_embedding, entry.embedding)
else:
score = _keyword_overlap(query, entry.content)
entry.relevance_score = score
if score >= min_relevance:
results.append(entry)
results.sort(key=lambda x: x.relevance_score or 0, reverse=True)
return results
def search_memories(
query: str,
limit: int = 10,
@@ -320,65 +462,9 @@ def search_memories(
List of MemoryEntry objects sorted by relevance
"""
query_embedding = embed_text(query)
# Build query with filters
conditions = []
params = []
if context_type:
conditions.append("memory_type = ?")
params.append(context_type)
if agent_id:
conditions.append("agent_id = ?")
params.append(agent_id)
if session_id:
conditions.append("session_id = ?")
params.append(session_id)
where_clause = "WHERE " + " AND ".join(conditions) if conditions else ""
# Fetch candidates (we'll do in-memory similarity for now)
query_sql = f"""
SELECT * FROM memories
{where_clause}
ORDER BY created_at DESC
LIMIT ?
"""
params.append(limit * 3) # Get more candidates for ranking
with get_connection() as conn:
rows = conn.execute(query_sql, params).fetchall()
# Compute similarity scores
results = []
for row in rows:
entry = MemoryEntry(
id=row["id"],
content=row["content"],
source=row["source"],
context_type=row["memory_type"], # DB column -> API field
agent_id=row["agent_id"],
task_id=row["task_id"],
session_id=row["session_id"],
metadata=json.loads(row["metadata"]) if row["metadata"] else None,
embedding=json.loads(row["embedding"]) if row["embedding"] else None,
timestamp=row["created_at"],
)
if entry.embedding:
score = cosine_similarity(query_embedding, entry.embedding)
entry.relevance_score = score
if score >= min_relevance:
results.append(entry)
else:
# Fallback: check for keyword overlap
score = _keyword_overlap(query, entry.content)
entry.relevance_score = score
if score >= min_relevance:
results.append(entry)
# Sort by relevance and return top results
results.sort(key=lambda x: x.relevance_score or 0, reverse=True)
where_clause, params = _build_search_filters(context_type, agent_id, session_id)
rows = _fetch_memory_candidates(where_clause, params, limit * 3)
results = _score_and_filter(rows, query, query_embedding, min_relevance)
return results[:limit]
@@ -704,66 +790,12 @@ class HotMemory:
logger.debug(
"HotMemory._create_default() - creating default MEMORY.md for backward compatibility"
)
default_content = """# Timmy Hot Memory
> Working RAM — always loaded, ~300 lines max, pruned monthly
> Last updated: {date}
---
## Current Status
**Agent State:** Operational
**Mode:** Development
**Active Tasks:** 0
**Pending Decisions:** None
---
## Standing Rules
1. **Sovereignty First** — No cloud dependencies
2. **Local-Only Inference** — Ollama on localhost
3. **Privacy by Design** — Telemetry disabled
4. **Tool Minimalism** — Use tools only when necessary
5. **Memory Discipline** — Write handoffs at session end
---
## Agent Roster
| Agent | Role | Status |
|-------|------|--------|
| Timmy | Core | Active |
---
## User Profile
**Name:** (not set)
**Interests:** (to be learned)
---
## Key Decisions
(none yet)
---
## Pending Actions
- [ ] Learn user's name
---
*Prune date: {prune_date}*
""".format(
date=datetime.now(UTC).strftime("%Y-%m-%d"),
prune_date=(datetime.now(UTC).replace(day=25)).strftime("%Y-%m-%d"),
now = datetime.now(UTC)
content = _DEFAULT_HOT_MEMORY_TEMPLATE.format(
date=now.strftime("%Y-%m-%d"),
prune_date=now.replace(day=25).strftime("%Y-%m-%d"),
)
self.path.write_text(default_content)
self.path.write_text(content)
logger.info("HotMemory: Created default MEMORY.md")

581
src/timmy/quest_system.py Normal file
View File

@@ -0,0 +1,581 @@
"""Token Quest System for agent rewards.
Provides quest definitions, progress tracking, completion detection,
and token awards for agent accomplishments.
Quests are defined in config/quests.yaml and loaded at runtime.
"""
from __future__ import annotations
import logging
import time
from dataclasses import dataclass, field
from datetime import UTC, datetime, timedelta
from enum import StrEnum
from pathlib import Path
from typing import Any
import yaml
from config import settings
logger = logging.getLogger(__name__)
# Path to quest configuration
QUEST_CONFIG_PATH = Path(settings.repo_root) / "config" / "quests.yaml"
class QuestType(StrEnum):
"""Types of quests supported by the system."""
ISSUE_COUNT = "issue_count"
ISSUE_REDUCE = "issue_reduce"
DOCS_UPDATE = "docs_update"
TEST_IMPROVE = "test_improve"
DAILY_RUN = "daily_run"
CUSTOM = "custom"
class QuestStatus(StrEnum):
"""Status of a quest for an agent."""
NOT_STARTED = "not_started"
IN_PROGRESS = "in_progress"
COMPLETED = "completed"
CLAIMED = "claimed"
EXPIRED = "expired"
@dataclass
class QuestDefinition:
"""Definition of a quest from configuration."""
id: str
name: str
description: str
reward_tokens: int
quest_type: QuestType
enabled: bool
repeatable: bool
cooldown_hours: int
criteria: dict[str, Any]
notification_message: str
@classmethod
def from_dict(cls, data: dict[str, Any]) -> QuestDefinition:
"""Create a QuestDefinition from a dictionary."""
return cls(
id=data["id"],
name=data.get("name", "Unnamed Quest"),
description=data.get("description", ""),
reward_tokens=data.get("reward_tokens", 0),
quest_type=QuestType(data.get("type", "custom")),
enabled=data.get("enabled", True),
repeatable=data.get("repeatable", False),
cooldown_hours=data.get("cooldown_hours", 0),
criteria=data.get("criteria", {}),
notification_message=data.get(
"notification_message", "Quest Complete! You earned {tokens} tokens."
),
)
@dataclass
class QuestProgress:
"""Progress of a quest for a specific agent."""
quest_id: str
agent_id: str
status: QuestStatus
current_value: int = 0
target_value: int = 0
started_at: str = ""
completed_at: str = ""
claimed_at: str = ""
completion_count: int = 0
last_completed_at: str = ""
metadata: dict[str, Any] = field(default_factory=dict)
def to_dict(self) -> dict[str, Any]:
"""Convert to dictionary for serialization."""
return {
"quest_id": self.quest_id,
"agent_id": self.agent_id,
"status": self.status.value,
"current_value": self.current_value,
"target_value": self.target_value,
"started_at": self.started_at,
"completed_at": self.completed_at,
"claimed_at": self.claimed_at,
"completion_count": self.completion_count,
"last_completed_at": self.last_completed_at,
"metadata": self.metadata,
}
# In-memory storage for quest progress
_quest_progress: dict[str, QuestProgress] = {}
_quest_definitions: dict[str, QuestDefinition] = {}
_quest_settings: dict[str, Any] = {}
def _get_progress_key(quest_id: str, agent_id: str) -> str:
"""Generate a unique key for quest progress."""
return f"{agent_id}:{quest_id}"
def load_quest_config() -> tuple[dict[str, QuestDefinition], dict[str, Any]]:
"""Load quest definitions from quests.yaml.
Returns:
Tuple of (quest definitions dict, settings dict)
"""
global _quest_definitions, _quest_settings
if not QUEST_CONFIG_PATH.exists():
logger.warning("Quest config not found at %s", QUEST_CONFIG_PATH)
return {}, {}
try:
raw = QUEST_CONFIG_PATH.read_text()
config = yaml.safe_load(raw)
if not isinstance(config, dict):
logger.warning("Invalid quest config format")
return {}, {}
# Load quest definitions
quests_data = config.get("quests", {})
definitions = {}
for quest_id, quest_data in quests_data.items():
quest_data["id"] = quest_id
try:
definition = QuestDefinition.from_dict(quest_data)
definitions[quest_id] = definition
except (ValueError, KeyError) as exc:
logger.warning("Failed to load quest %s: %s", quest_id, exc)
# Load settings
_quest_settings = config.get("settings", {})
_quest_definitions = definitions
logger.debug("Loaded %d quest definitions", len(definitions))
return definitions, _quest_settings
except (OSError, yaml.YAMLError) as exc:
logger.warning("Failed to load quest config: %s", exc)
return {}, {}
def get_quest_definitions() -> dict[str, QuestDefinition]:
"""Get all quest definitions, loading if necessary."""
global _quest_definitions
if not _quest_definitions:
_quest_definitions, _ = load_quest_config()
return _quest_definitions
def get_quest_definition(quest_id: str) -> QuestDefinition | None:
"""Get a specific quest definition by ID."""
definitions = get_quest_definitions()
return definitions.get(quest_id)
def get_active_quests() -> list[QuestDefinition]:
"""Get all enabled quest definitions."""
definitions = get_quest_definitions()
return [q for q in definitions.values() if q.enabled]
def get_quest_progress(quest_id: str, agent_id: str) -> QuestProgress | None:
"""Get progress for a specific quest and agent."""
key = _get_progress_key(quest_id, agent_id)
return _quest_progress.get(key)
def get_or_create_progress(quest_id: str, agent_id: str) -> QuestProgress:
"""Get existing progress or create new for quest/agent."""
key = _get_progress_key(quest_id, agent_id)
if key not in _quest_progress:
quest = get_quest_definition(quest_id)
if not quest:
raise ValueError(f"Quest {quest_id} not found")
target = _get_target_value(quest)
_quest_progress[key] = QuestProgress(
quest_id=quest_id,
agent_id=agent_id,
status=QuestStatus.NOT_STARTED,
current_value=0,
target_value=target,
started_at=datetime.now(UTC).isoformat(),
)
return _quest_progress[key]
def _get_target_value(quest: QuestDefinition) -> int:
"""Extract target value from quest criteria."""
criteria = quest.criteria
if quest.quest_type == QuestType.ISSUE_COUNT:
return criteria.get("target_count", 1)
elif quest.quest_type == QuestType.ISSUE_REDUCE:
return criteria.get("target_reduction", 1)
elif quest.quest_type == QuestType.DAILY_RUN:
return criteria.get("min_sessions", 1)
elif quest.quest_type == QuestType.DOCS_UPDATE:
return criteria.get("min_files_changed", 1)
elif quest.quest_type == QuestType.TEST_IMPROVE:
return criteria.get("min_new_tests", 1)
return 1
def update_quest_progress(
quest_id: str,
agent_id: str,
current_value: int,
metadata: dict[str, Any] | None = None,
) -> QuestProgress:
"""Update progress for a quest."""
progress = get_or_create_progress(quest_id, agent_id)
progress.current_value = current_value
if metadata:
progress.metadata.update(metadata)
# Check if quest is now complete
if progress.current_value >= progress.target_value:
if progress.status not in (QuestStatus.COMPLETED, QuestStatus.CLAIMED):
progress.status = QuestStatus.COMPLETED
progress.completed_at = datetime.now(UTC).isoformat()
logger.info("Quest %s completed for agent %s", quest_id, agent_id)
return progress
def _is_on_cooldown(progress: QuestProgress, quest: QuestDefinition) -> bool:
"""Check if a repeatable quest is on cooldown."""
if not quest.repeatable or not progress.last_completed_at:
return False
if quest.cooldown_hours <= 0:
return False
try:
last_completed = datetime.fromisoformat(progress.last_completed_at)
cooldown_end = last_completed + timedelta(hours=quest.cooldown_hours)
return datetime.now(UTC) < cooldown_end
except (ValueError, TypeError):
return False
def claim_quest_reward(quest_id: str, agent_id: str) -> dict[str, Any] | None:
"""Claim the token reward for a completed quest.
Returns:
Reward info dict if successful, None if not claimable
"""
progress = get_quest_progress(quest_id, agent_id)
if not progress:
return None
quest = get_quest_definition(quest_id)
if not quest:
return None
# Check if quest is completed but not yet claimed
if progress.status != QuestStatus.COMPLETED:
return None
# Check cooldown for repeatable quests
if _is_on_cooldown(progress, quest):
return None
try:
# Award tokens via ledger
from lightning.ledger import create_invoice_entry, mark_settled
# Create a mock invoice for the reward
invoice_entry = create_invoice_entry(
payment_hash=f"quest_{quest_id}_{agent_id}_{int(time.time())}",
amount_sats=quest.reward_tokens,
memo=f"Quest reward: {quest.name}",
source="quest_reward",
agent_id=agent_id,
)
# Mark as settled immediately (quest rewards are auto-settled)
mark_settled(invoice_entry.payment_hash, preimage=f"quest_{quest_id}")
# Update progress
progress.status = QuestStatus.CLAIMED
progress.claimed_at = datetime.now(UTC).isoformat()
progress.completion_count += 1
progress.last_completed_at = progress.claimed_at
# Reset for repeatable quests
if quest.repeatable:
progress.status = QuestStatus.NOT_STARTED
progress.current_value = 0
progress.completed_at = ""
progress.claimed_at = ""
notification = quest.notification_message.format(tokens=quest.reward_tokens)
return {
"quest_id": quest_id,
"agent_id": agent_id,
"tokens_awarded": quest.reward_tokens,
"notification": notification,
"completion_count": progress.completion_count,
}
except Exception as exc:
logger.error("Failed to award quest reward: %s", exc)
return None
def check_issue_count_quest(
quest: QuestDefinition,
agent_id: str,
closed_issues: list[dict],
) -> QuestProgress | None:
"""Check progress for issue_count type quest."""
criteria = quest.criteria
target_labels = set(criteria.get("issue_labels", []))
# target_count is available in criteria but not used directly here
# Count matching issues
matching_count = 0
for issue in closed_issues:
issue_labels = {label.get("name", "") for label in issue.get("labels", [])}
if target_labels.issubset(issue_labels) or (not target_labels and issue_labels):
matching_count += 1
progress = update_quest_progress(
quest.id, agent_id, matching_count, {"matching_issues": matching_count}
)
return progress
def check_issue_reduce_quest(
quest: QuestDefinition,
agent_id: str,
previous_count: int,
current_count: int,
) -> QuestProgress | None:
"""Check progress for issue_reduce type quest."""
# target_reduction available in quest.criteria but we track actual reduction
reduction = max(0, previous_count - current_count)
progress = update_quest_progress(quest.id, agent_id, reduction, {"reduction": reduction})
return progress
def check_daily_run_quest(
quest: QuestDefinition,
agent_id: str,
sessions_completed: int,
) -> QuestProgress | None:
"""Check progress for daily_run type quest."""
# min_sessions available in quest.criteria but we track actual sessions
progress = update_quest_progress(
quest.id, agent_id, sessions_completed, {"sessions": sessions_completed}
)
return progress
def evaluate_quest_progress(
quest_id: str,
agent_id: str,
context: dict[str, Any],
) -> QuestProgress | None:
"""Evaluate quest progress based on quest type and context.
Args:
quest_id: The quest to evaluate
agent_id: The agent to evaluate for
context: Context data for evaluation (issues, metrics, etc.)
Returns:
Updated QuestProgress or None if evaluation failed
"""
quest = get_quest_definition(quest_id)
if not quest or not quest.enabled:
return None
progress = get_quest_progress(quest_id, agent_id)
# Check cooldown for repeatable quests
if progress and _is_on_cooldown(progress, quest):
return progress
try:
if quest.quest_type == QuestType.ISSUE_COUNT:
closed_issues = context.get("closed_issues", [])
return check_issue_count_quest(quest, agent_id, closed_issues)
elif quest.quest_type == QuestType.ISSUE_REDUCE:
prev_count = context.get("previous_issue_count", 0)
curr_count = context.get("current_issue_count", 0)
return check_issue_reduce_quest(quest, agent_id, prev_count, curr_count)
elif quest.quest_type == QuestType.DAILY_RUN:
sessions = context.get("sessions_completed", 0)
return check_daily_run_quest(quest, agent_id, sessions)
elif quest.quest_type == QuestType.CUSTOM:
# Custom quests require manual completion
return progress
else:
logger.debug("Quest type %s not yet implemented", quest.quest_type)
return progress
except Exception as exc:
logger.warning("Quest evaluation failed for %s: %s", quest_id, exc)
return progress
def auto_evaluate_all_quests(agent_id: str, context: dict[str, Any]) -> list[dict]:
"""Evaluate all active quests for an agent and award rewards.
Returns:
List of reward info for newly completed quests
"""
rewards = []
active_quests = get_active_quests()
for quest in active_quests:
progress = evaluate_quest_progress(quest.id, agent_id, context)
if progress and progress.status == QuestStatus.COMPLETED:
# Auto-claim the reward
reward = claim_quest_reward(quest.id, agent_id)
if reward:
rewards.append(reward)
return rewards
def get_agent_quests_status(agent_id: str) -> dict[str, Any]:
"""Get complete quest status for an agent."""
definitions = get_quest_definitions()
quests_status = []
total_rewards = 0
completed_count = 0
for quest_id, quest in definitions.items():
progress = get_quest_progress(quest_id, agent_id)
if not progress:
progress = get_or_create_progress(quest_id, agent_id)
is_on_cooldown = _is_on_cooldown(progress, quest) if quest.repeatable else False
quest_info = {
"quest_id": quest_id,
"name": quest.name,
"description": quest.description,
"reward_tokens": quest.reward_tokens,
"type": quest.quest_type.value,
"enabled": quest.enabled,
"repeatable": quest.repeatable,
"status": progress.status.value,
"current_value": progress.current_value,
"target_value": progress.target_value,
"completion_count": progress.completion_count,
"on_cooldown": is_on_cooldown,
"cooldown_hours_remaining": 0,
}
if is_on_cooldown and progress.last_completed_at:
try:
last = datetime.fromisoformat(progress.last_completed_at)
cooldown_end = last + timedelta(hours=quest.cooldown_hours)
hours_remaining = (cooldown_end - datetime.now(UTC)).total_seconds() / 3600
quest_info["cooldown_hours_remaining"] = round(max(0, hours_remaining), 1)
except (ValueError, TypeError):
pass
quests_status.append(quest_info)
total_rewards += progress.completion_count * quest.reward_tokens
completed_count += progress.completion_count
return {
"agent_id": agent_id,
"quests": quests_status,
"total_tokens_earned": total_rewards,
"total_quests_completed": completed_count,
"active_quests_count": len([q for q in quests_status if q["enabled"]]),
}
def reset_quest_progress(quest_id: str | None = None, agent_id: str | None = None) -> int:
"""Reset quest progress. Useful for testing.
Args:
quest_id: Specific quest to reset, or None for all
agent_id: Specific agent to reset, or None for all
Returns:
Number of progress entries reset
"""
global _quest_progress
count = 0
keys_to_reset = []
for key, _progress in _quest_progress.items():
key_agent, key_quest = key.split(":", 1)
if (quest_id is None or key_quest == quest_id) and (
agent_id is None or key_agent == agent_id
):
keys_to_reset.append(key)
for key in keys_to_reset:
del _quest_progress[key]
count += 1
return count
def get_quest_leaderboard() -> list[dict[str, Any]]:
"""Get a leaderboard of agents by quest completion."""
agent_stats: dict[str, dict[str, Any]] = {}
for _key, progress in _quest_progress.items():
agent_id = progress.agent_id
if agent_id not in agent_stats:
agent_stats[agent_id] = {
"agent_id": agent_id,
"total_completions": 0,
"total_tokens": 0,
"quests_completed": set(),
}
quest = get_quest_definition(progress.quest_id)
if quest:
agent_stats[agent_id]["total_completions"] += progress.completion_count
agent_stats[agent_id]["total_tokens"] += progress.completion_count * quest.reward_tokens
if progress.completion_count > 0:
agent_stats[agent_id]["quests_completed"].add(quest.id)
leaderboard = []
for stats in agent_stats.values():
leaderboard.append(
{
"agent_id": stats["agent_id"],
"total_completions": stats["total_completions"],
"total_tokens": stats["total_tokens"],
"unique_quests_completed": len(stats["quests_completed"]),
}
)
# Sort by total tokens (descending)
leaderboard.sort(key=lambda x: x["total_tokens"], reverse=True)
return leaderboard
# Initialize on module load
load_quest_config()

View File

@@ -392,31 +392,26 @@ def _build_insights(
return insights or ["Conversations look healthy. Keep up the good work."]
def self_reflect(limit: int = 30) -> str:
"""Review recent conversations and reflect on Timmy's own behavior.
def _format_recurring_topics(repeated: list[tuple[str, int]]) -> list[str]:
"""Format the recurring-topics section of a reflection report."""
if repeated:
lines = ["### Recurring Topics"]
for word, count in repeated:
lines.append(f'- "{word}" ({count} mentions)')
lines.append("")
return lines
return ["### Recurring Topics\nNo strong patterns detected.\n"]
Scans past session entries for patterns: low-confidence responses,
errors, repeated topics, and conversation quality signals. Returns
a structured reflection that Timmy can use to improve.
Args:
limit: How many recent entries to review (default 30).
Returns:
A formatted self-reflection report.
"""
sl = get_session_logger()
sl.flush()
entries = sl.get_recent_entries(limit=limit)
if not entries:
return "No conversation history to reflect on yet."
_messages, errors, timmy_msgs, user_msgs = _categorize_entries(entries)
low_conf = _find_low_confidence(timmy_msgs)
repeated = _find_repeated_topics(user_msgs)
# Build reflection report
def _assemble_report(
entries: list[dict],
errors: list[dict],
timmy_msgs: list[dict],
user_msgs: list[dict],
low_conf: list[dict],
repeated: list[tuple[str, int]],
) -> str:
"""Assemble the full self-reflection report from analyzed data."""
sections: list[str] = ["## Self-Reflection Report\n"]
sections.append(
f"Reviewed {len(entries)} recent entries: "
@@ -446,16 +441,37 @@ def self_reflect(limit: int = 30) -> str:
)
)
if repeated:
sections.append("### Recurring Topics")
for word, count in repeated:
sections.append(f'- "{word}" ({count} mentions)')
sections.append("")
else:
sections.append("### Recurring Topics\nNo strong patterns detected.\n")
sections.extend(_format_recurring_topics(repeated))
sections.append("### Insights")
for insight in _build_insights(low_conf, errors, repeated):
sections.append(f"- {insight}")
return "\n".join(sections)
def self_reflect(limit: int = 30) -> str:
"""Review recent conversations and reflect on Timmy's own behavior.
Scans past session entries for patterns: low-confidence responses,
errors, repeated topics, and conversation quality signals. Returns
a structured reflection that Timmy can use to improve.
Args:
limit: How many recent entries to review (default 30).
Returns:
A formatted self-reflection report.
"""
sl = get_session_logger()
sl.flush()
entries = sl.get_recent_entries(limit=limit)
if not entries:
return "No conversation history to reflect on yet."
_messages, errors, timmy_msgs, user_msgs = _categorize_entries(entries)
low_conf = _find_low_confidence(timmy_msgs)
repeated = _find_repeated_topics(user_msgs)
return _assemble_report(entries, errors, timmy_msgs, user_msgs, low_conf, repeated)

View File

@@ -341,6 +341,11 @@ class ThinkingEngine:
)
return None
# Capture arrival time *before* the LLM call so the thought
# timestamp reflects when the cycle started, not when the
# (potentially slow) generation finished. Fixes #582.
arrived_at = datetime.now(UTC).isoformat()
memory_context, system_context, recent_thoughts = self._build_thinking_context()
content, seed_type = await self._generate_novel_thought(
@@ -352,7 +357,7 @@ class ThinkingEngine:
if not content:
return None
thought = self._store_thought(content, seed_type)
thought = self._store_thought(content, seed_type, arrived_at=arrived_at)
self._last_thought_id = thought.id
await self._process_thinking_result(thought)
@@ -772,19 +777,10 @@ class ThinkingEngine:
except Exception as exc:
logger.debug("Thought issue filing skipped: %s", exc)
# -- system-snapshot helpers ----------------------------------------
# ── System snapshot helpers ────────────────────────────────────────────
def _snapshot_time(self) -> tuple[datetime, str]:
"""Return (now, formatted-time-line) for the local clock."""
now = datetime.now().astimezone()
tz = now.strftime("%Z") or "UTC"
line = (
f"Local time: {now.strftime('%I:%M %p').lstrip('0')} {tz}, {now.strftime('%A %B %d')}"
)
return now, line
def _snapshot_thought_count(self, now: datetime) -> str | None:
"""Return today's thought count as a display string, or *None*."""
def _snap_thought_count(self, now: datetime) -> str | None:
"""Return today's thought count, or *None* on failure."""
try:
today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
with _get_conn(self._db_path) as conn:
@@ -797,8 +793,8 @@ class ThinkingEngine:
logger.debug("Thought count query failed: %s", exc)
return None
def _snapshot_chat_activity(self) -> list[str]:
"""Return chat-activity lines (may be empty)."""
def _snap_chat_activity(self) -> list[str]:
"""Return chat-activity lines (in-memory, no I/O)."""
try:
from infrastructure.chat_store import message_log
@@ -814,16 +810,14 @@ class ThinkingEngine:
logger.debug("Chat activity query failed: %s", exc)
return []
def _snapshot_task_queue(self) -> str | None:
"""Return a one-line task-queue summary, or *None*."""
def _snap_task_queue(self) -> str | None:
"""Return a one-line task queue summary, or *None*."""
try:
from swarm.task_queue.models import get_task_summary_for_briefing
summary = get_task_summary_for_briefing()
running = summary.get("running", 0)
pending = summary.get("pending_approval", 0)
done = summary.get("completed", 0)
failed = summary.get("failed", 0)
s = get_task_summary_for_briefing()
running, pending = s.get("running", 0), s.get("pending_approval", 0)
done, failed = s.get("completed", 0), s.get("failed", 0)
if running or pending or done or failed:
return (
f"Tasks: {running} running, {pending} pending, "
@@ -833,21 +827,20 @@ class ThinkingEngine:
logger.debug("Task queue query failed: %s", exc)
return None
def _snapshot_workspace(self) -> list[str]:
"""Return workspace-update lines (may be empty)."""
def _snap_workspace(self) -> list[str]:
"""Return workspace-update lines (file-based Hermes comms)."""
try:
from timmy.workspace import workspace_monitor
updates = workspace_monitor.get_pending_updates()
lines: list[str] = []
new_corr = updates.get("new_correspondence")
new_inbox = updates.get("new_inbox_files", [])
if new_corr:
line_count = len([ln for ln in new_corr.splitlines() if ln.strip()])
lines.append(
f"Workspace: {line_count} new correspondence entries (latest from: Hermes)"
)
new_inbox = updates.get("new_inbox_files", [])
if new_inbox:
files_str = ", ".join(new_inbox[:5])
if len(new_inbox) > 5:
@@ -858,8 +851,6 @@ class ThinkingEngine:
logger.debug("Workspace check failed: %s", exc)
return []
# -- public snapshot entry point ------------------------------------
def _gather_system_snapshot(self) -> str:
"""Gather lightweight real system state for grounding thoughts in reality.
@@ -867,20 +858,24 @@ class ThinkingEngine:
recent chat activity, and task queue status. Never crashes — every
section is independently try/excepted.
"""
now, time_line = self._snapshot_time()
parts: list[str] = [time_line]
now = datetime.now().astimezone()
tz = now.strftime("%Z") or "UTC"
thought_line = self._snapshot_thought_count(now)
parts: list[str] = [
f"Local time: {now.strftime('%I:%M %p').lstrip('0')} {tz}, {now.strftime('%A %B %d')}"
]
thought_line = self._snap_thought_count(now)
if thought_line:
parts.append(thought_line)
parts.extend(self._snapshot_chat_activity())
parts.extend(self._snap_chat_activity())
task_line = self._snapshot_task_queue()
task_line = self._snap_task_queue()
if task_line:
parts.append(task_line)
parts.extend(self._snapshot_workspace())
parts.extend(self._snap_workspace())
return "\n".join(parts) if parts else ""
@@ -1183,14 +1178,25 @@ class ThinkingEngine:
raw = run.content if hasattr(run, "content") else str(run)
return _THINK_TAG_RE.sub("", raw) if raw else raw
def _store_thought(self, content: str, seed_type: str) -> Thought:
"""Persist a thought to SQLite."""
def _store_thought(
self,
content: str,
seed_type: str,
*,
arrived_at: str | None = None,
) -> Thought:
"""Persist a thought to SQLite.
Args:
arrived_at: ISO-8601 timestamp captured when the thinking cycle
started. Falls back to now() for callers that don't supply it.
"""
thought = Thought(
id=str(uuid.uuid4()),
content=content,
seed_type=seed_type,
parent_id=self._last_thought_id,
created_at=datetime.now(UTC).isoformat(),
created_at=arrived_at or datetime.now(UTC).isoformat(),
)
with _get_conn(self._db_path) as conn:
@@ -1271,6 +1277,53 @@ class ThinkingEngine:
logger.debug("Failed to broadcast thought: %s", exc)
def _query_thoughts(
db_path: Path, query: str, seed_type: str | None, limit: int
) -> list[sqlite3.Row]:
"""Run the thought-search SQL and return matching rows."""
pattern = f"%{query}%"
with _get_conn(db_path) as conn:
if seed_type:
return conn.execute(
"""
SELECT id, content, seed_type, created_at
FROM thoughts
WHERE content LIKE ? AND seed_type = ?
ORDER BY created_at DESC
LIMIT ?
""",
(pattern, seed_type, limit),
).fetchall()
return conn.execute(
"""
SELECT id, content, seed_type, created_at
FROM thoughts
WHERE content LIKE ?
ORDER BY created_at DESC
LIMIT ?
""",
(pattern, limit),
).fetchall()
def _format_thought_rows(rows: list[sqlite3.Row], query: str, seed_type: str | None) -> str:
"""Format thought rows into a human-readable string."""
lines = [f'Found {len(rows)} thought(s) matching "{query}":']
if seed_type:
lines[0] += f' [seed_type="{seed_type}"]'
lines.append("")
for row in rows:
ts = datetime.fromisoformat(row["created_at"])
local_ts = ts.astimezone()
time_str = local_ts.strftime("%Y-%m-%d %I:%M %p").lstrip("0")
seed = row["seed_type"]
content = row["content"].replace("\n", " ") # Flatten newlines for display
lines.append(f"[{time_str}] ({seed}) {content[:150]}")
return "\n".join(lines)
def search_thoughts(query: str, seed_type: str | None = None, limit: int = 10) -> str:
"""Search Timmy's thought history for reflections matching a query.
@@ -1288,58 +1341,17 @@ def search_thoughts(query: str, seed_type: str | None = None, limit: int = 10) -
Formatted string with matching thoughts, newest first, including
timestamps and seed types. Returns a helpful message if no matches found.
"""
# Clamp limit to reasonable bounds
limit = max(1, min(limit, 50))
try:
engine = thinking_engine
db_path = engine._db_path
# Build query with optional seed_type filter
with _get_conn(db_path) as conn:
if seed_type:
rows = conn.execute(
"""
SELECT id, content, seed_type, created_at
FROM thoughts
WHERE content LIKE ? AND seed_type = ?
ORDER BY created_at DESC
LIMIT ?
""",
(f"%{query}%", seed_type, limit),
).fetchall()
else:
rows = conn.execute(
"""
SELECT id, content, seed_type, created_at
FROM thoughts
WHERE content LIKE ?
ORDER BY created_at DESC
LIMIT ?
""",
(f"%{query}%", limit),
).fetchall()
rows = _query_thoughts(thinking_engine._db_path, query, seed_type, limit)
if not rows:
if seed_type:
return f'No thoughts found matching "{query}" with seed_type="{seed_type}".'
return f'No thoughts found matching "{query}".'
# Format results
lines = [f'Found {len(rows)} thought(s) matching "{query}":']
if seed_type:
lines[0] += f' [seed_type="{seed_type}"]'
lines.append("")
for row in rows:
ts = datetime.fromisoformat(row["created_at"])
local_ts = ts.astimezone()
time_str = local_ts.strftime("%Y-%m-%d %I:%M %p").lstrip("0")
seed = row["seed_type"]
content = row["content"].replace("\n", " ") # Flatten newlines for display
lines.append(f"[{time_str}] ({seed}) {content[:150]}")
return "\n".join(lines)
return _format_thought_rows(rows, query, seed_type)
except Exception as exc:
logger.warning("Thought search failed: %s", exc)

View File

@@ -909,82 +909,35 @@ def _experiment_tool_catalog() -> dict:
}
_CREATIVE_CATALOG_SOURCES: list[tuple[str, str, list[str]]] = [
("creative.tools.git_tools", "GIT_TOOL_CATALOG", ["forge", "helm", "orchestrator"]),
("creative.tools.image_tools", "IMAGE_TOOL_CATALOG", ["pixel", "orchestrator"]),
("creative.tools.music_tools", "MUSIC_TOOL_CATALOG", ["lyra", "orchestrator"]),
("creative.tools.video_tools", "VIDEO_TOOL_CATALOG", ["reel", "orchestrator"]),
("creative.director", "DIRECTOR_TOOL_CATALOG", ["orchestrator"]),
("creative.assembler", "ASSEMBLER_TOOL_CATALOG", ["reel", "orchestrator"]),
]
def _import_creative_catalogs(catalog: dict) -> None:
"""Import and merge creative tool catalogs from creative module."""
# ── Git tools ─────────────────────────────────────────────────────────────
try:
from creative.tools.git_tools import GIT_TOOL_CATALOG
for module_path, attr_name, available_in in _CREATIVE_CATALOG_SOURCES:
_merge_catalog(catalog, module_path, attr_name, available_in)
for tool_id, info in GIT_TOOL_CATALOG.items():
def _merge_catalog(
catalog: dict, module_path: str, attr_name: str, available_in: list[str]
) -> None:
"""Import a single creative catalog and merge its entries."""
try:
from importlib import import_module
source_catalog = getattr(import_module(module_path), attr_name)
for tool_id, info in source_catalog.items():
catalog[tool_id] = {
"name": info["name"],
"description": info["description"],
"available_in": ["forge", "helm", "orchestrator"],
}
except ImportError:
pass
# ── Image tools ────────────────────────────────────────────────────────────
try:
from creative.tools.image_tools import IMAGE_TOOL_CATALOG
for tool_id, info in IMAGE_TOOL_CATALOG.items():
catalog[tool_id] = {
"name": info["name"],
"description": info["description"],
"available_in": ["pixel", "orchestrator"],
}
except ImportError:
pass
# ── Music tools ────────────────────────────────────────────────────────────
try:
from creative.tools.music_tools import MUSIC_TOOL_CATALOG
for tool_id, info in MUSIC_TOOL_CATALOG.items():
catalog[tool_id] = {
"name": info["name"],
"description": info["description"],
"available_in": ["lyra", "orchestrator"],
}
except ImportError:
pass
# ── Video tools ────────────────────────────────────────────────────────────
try:
from creative.tools.video_tools import VIDEO_TOOL_CATALOG
for tool_id, info in VIDEO_TOOL_CATALOG.items():
catalog[tool_id] = {
"name": info["name"],
"description": info["description"],
"available_in": ["reel", "orchestrator"],
}
except ImportError:
pass
# ── Creative pipeline ──────────────────────────────────────────────────────
try:
from creative.director import DIRECTOR_TOOL_CATALOG
for tool_id, info in DIRECTOR_TOOL_CATALOG.items():
catalog[tool_id] = {
"name": info["name"],
"description": info["description"],
"available_in": ["orchestrator"],
}
except ImportError:
pass
# ── Assembler tools ───────────────────────────────────────────────────────
try:
from creative.assembler import ASSEMBLER_TOOL_CATALOG
for tool_id, info in ASSEMBLER_TOOL_CATALOG.items():
catalog[tool_id] = {
"name": info["name"],
"description": info["description"],
"available_in": ["reel", "orchestrator"],
"available_in": available_in,
}
except ImportError:
pass

View File

@@ -89,45 +89,31 @@ def list_swarm_agents() -> dict[str, Any]:
}
def delegate_to_kimi(task: str, working_directory: str = "") -> dict[str, Any]:
"""Delegate a coding task to Kimi, the external coding agent.
Kimi has 262K context and is optimized for code tasks: writing,
debugging, refactoring, test writing. Timmy thinks and plans,
Kimi executes bulk code changes.
Args:
task: Clear, specific coding task description. Include file paths
and expected behavior. Good: "Fix the bug in src/timmy/session.py
where sessions don't persist." Bad: "Fix all bugs."
working_directory: Directory for Kimi to work in. Defaults to repo root.
Returns:
Dict with success status and Kimi's output or error.
"""
def _find_kimi_cli() -> str | None:
"""Return the path to the kimi CLI binary, or None if not installed."""
import shutil
import subprocess
return shutil.which("kimi")
def _resolve_workdir(working_directory: str) -> str | dict[str, Any]:
"""Return a validated working directory path, or an error dict."""
from pathlib import Path
from config import settings
kimi_path = shutil.which("kimi")
if not kimi_path:
return {
"success": False,
"error": "kimi CLI not found on PATH. Install with: pip install kimi-cli",
}
workdir = working_directory or settings.repo_root
if not Path(workdir).is_dir():
return {
"success": False,
"error": f"Working directory does not exist: {workdir}",
}
return workdir
cmd = [kimi_path, "--print", "-p", task]
logger.info("Delegating to Kimi: %s (cwd=%s)", task[:80], workdir)
def _run_kimi(cmd: list[str], workdir: str) -> dict[str, Any]:
"""Execute the kimi subprocess and return a result dict."""
import subprocess
try:
result = subprocess.run(
@@ -153,7 +139,39 @@ def delegate_to_kimi(task: str, working_directory: str = "") -> dict[str, Any]:
"error": "Kimi timed out after 300s. Task may be too broad — try breaking it into smaller pieces.",
}
except Exception as exc:
logger.exception("Failed to run Kimi subprocess")
return {
"success": False,
"error": f"Failed to run Kimi: {exc}",
}
def delegate_to_kimi(task: str, working_directory: str = "") -> dict[str, Any]:
"""Delegate a coding task to Kimi, the external coding agent.
Kimi has 262K context and is optimized for code tasks: writing,
debugging, refactoring, test writing. Timmy thinks and plans,
Kimi executes bulk code changes.
Args:
task: Clear, specific coding task description. Include file paths
and expected behavior. Good: "Fix the bug in src/timmy/session.py
where sessions don't persist." Bad: "Fix all bugs."
working_directory: Directory for Kimi to work in. Defaults to repo root.
Returns:
Dict with success status and Kimi's output or error.
"""
kimi_path = _find_kimi_cli()
if not kimi_path:
return {
"success": False,
"error": "kimi CLI not found on PATH. Install with: pip install kimi-cli",
}
workdir = _resolve_workdir(working_directory)
if isinstance(workdir, dict):
return workdir
logger.info("Delegating to Kimi: %s (cwd=%s)", task[:80], workdir)
return _run_kimi([kimi_path, "--print", "-p", task], workdir)

View File

@@ -122,6 +122,7 @@ def check_ollama_health() -> dict[str, Any]:
models = response.json().get("models", [])
result["available_models"] = [m.get("name", "") for m in models]
except Exception as e:
logger.exception("Ollama health check failed")
result["error"] = str(e)
return result
@@ -289,6 +290,7 @@ def get_live_system_status() -> dict[str, Any]:
try:
result["system"] = get_system_info()
except Exception as exc:
logger.exception("Failed to get system info")
result["system"] = {"error": str(exc)}
# Task queue
@@ -301,6 +303,7 @@ def get_live_system_status() -> dict[str, Any]:
try:
result["memory"] = get_memory_status()
except Exception as exc:
logger.exception("Failed to get memory status")
result["memory"] = {"error": str(exc)}
# Uptime
@@ -326,6 +329,46 @@ def get_live_system_status() -> dict[str, Any]:
return result
def _build_pytest_cmd(venv_python: Path, scope: str) -> list[str]:
"""Build the pytest command list for the given scope."""
cmd = [str(venv_python), "-m", "pytest", "-x", "-q", "--tb=short", "--timeout=30"]
if scope == "fast":
cmd.extend(
[
"--ignore=tests/functional",
"--ignore=tests/e2e",
"--ignore=tests/integrations",
"tests/",
]
)
elif scope == "full":
cmd.append("tests/")
else:
cmd.append(scope)
return cmd
def _parse_pytest_output(output: str) -> dict[str, int]:
"""Extract passed/failed/error counts from pytest output."""
import re
passed = failed = errors = 0
for line in output.splitlines():
if "passed" in line or "failed" in line or "error" in line:
nums = re.findall(r"(\d+) (passed|failed|error)", line)
for count, kind in nums:
if kind == "passed":
passed = int(count)
elif kind == "failed":
failed = int(count)
elif kind == "error":
errors = int(count)
return {"passed": passed, "failed": failed, "errors": errors}
def run_self_tests(scope: str = "fast", _repo_root: str | None = None) -> dict[str, Any]:
"""Run Timmy's own test suite and report results.
@@ -349,53 +392,22 @@ def run_self_tests(scope: str = "fast", _repo_root: str | None = None) -> dict[s
if not venv_python.exists():
return {"success": False, "error": f"No venv found at {venv_python}"}
cmd = [str(venv_python), "-m", "pytest", "-x", "-q", "--tb=short", "--timeout=30"]
if scope == "fast":
# Unit tests only — skip functional/e2e/integration
cmd.extend(
[
"--ignore=tests/functional",
"--ignore=tests/e2e",
"--ignore=tests/integrations",
"tests/",
]
)
elif scope == "full":
cmd.append("tests/")
else:
# Specific path
cmd.append(scope)
cmd = _build_pytest_cmd(venv_python, scope)
try:
result = subprocess.run(cmd, capture_output=True, text=True, timeout=120, cwd=repo)
output = result.stdout + result.stderr
# Parse pytest output for counts
passed = failed = errors = 0
for line in output.splitlines():
if "passed" in line or "failed" in line or "error" in line:
import re
nums = re.findall(r"(\d+) (passed|failed|error)", line)
for count, kind in nums:
if kind == "passed":
passed = int(count)
elif kind == "failed":
failed = int(count)
elif kind == "error":
errors = int(count)
counts = _parse_pytest_output(output)
return {
"success": result.returncode == 0,
"passed": passed,
"failed": failed,
"errors": errors,
"total": passed + failed + errors,
**counts,
"total": counts["passed"] + counts["failed"] + counts["errors"],
"return_code": result.returncode,
"summary": output[-2000:] if len(output) > 2000 else output,
}
except subprocess.TimeoutExpired:
return {"success": False, "error": "Test run timed out (120s limit)"}
except Exception as exc:
logger.exception("Self-test run failed")
return {"success": False, "error": str(exc)}

View File

@@ -78,6 +78,11 @@ DEFAULT_MAX_UTTERANCE = 30.0 # safety cap — don't record forever
DEFAULT_SESSION_ID = "voice"
def _rms(block: np.ndarray) -> float:
"""Compute root-mean-square energy of an audio block."""
return float(np.sqrt(np.mean(block.astype(np.float32) ** 2)))
@dataclass
class VoiceConfig:
"""Configuration for the voice loop."""
@@ -161,13 +166,6 @@ class VoiceLoop:
min_blocks = int(self.config.min_utterance / 0.1)
max_blocks = int(self.config.max_utterance / 0.1)
audio_chunks: list[np.ndarray] = []
silent_count = 0
recording = False
def _rms(block: np.ndarray) -> float:
return float(np.sqrt(np.mean(block.astype(np.float32) ** 2)))
sys.stdout.write("\n 🎤 Listening... (speak now)\n")
sys.stdout.flush()
@@ -177,42 +175,69 @@ class VoiceLoop:
dtype="float32",
blocksize=block_size,
) as stream:
while self._running:
block, overflowed = stream.read(block_size)
if overflowed:
logger.debug("Audio buffer overflowed")
chunks = self._capture_audio_blocks(stream, block_size, silence_blocks, max_blocks)
rms = _rms(block)
return self._finalize_utterance(chunks, min_blocks, sr)
if not recording:
if rms > self.config.silence_threshold:
recording = True
silent_count = 0
audio_chunks.append(block.copy())
sys.stdout.write(" 📢 Recording...\r")
sys.stdout.flush()
def _capture_audio_blocks(
self,
stream,
block_size: int,
silence_blocks: int,
max_blocks: int,
) -> list[np.ndarray]:
"""Read audio blocks from *stream* until silence or max length.
Returns the list of captured audio chunks (may be empty).
"""
chunks: list[np.ndarray] = []
silent_count = 0
recording = False
while self._running:
block, overflowed = stream.read(block_size)
if overflowed:
logger.debug("Audio buffer overflowed")
rms = _rms(block)
if not recording:
if rms > self.config.silence_threshold:
recording = True
silent_count = 0
chunks.append(block.copy())
sys.stdout.write(" 📢 Recording...\r")
sys.stdout.flush()
else:
chunks.append(block.copy())
if rms < self.config.silence_threshold:
silent_count += 1
else:
audio_chunks.append(block.copy())
silent_count = 0
if rms < self.config.silence_threshold:
silent_count += 1
else:
silent_count = 0
if silent_count >= silence_blocks:
break
# End of utterance
if silent_count >= silence_blocks:
break
if len(chunks) >= max_blocks:
logger.info("Max utterance length reached, stopping.")
break
# Safety cap
if len(audio_chunks) >= max_blocks:
logger.info("Max utterance length reached, stopping.")
break
return chunks
if not audio_chunks or len(audio_chunks) < min_blocks:
@staticmethod
def _finalize_utterance(
chunks: list[np.ndarray], min_blocks: int, sample_rate: int
) -> np.ndarray | None:
"""Concatenate recorded chunks and report duration.
Returns ``None`` if the utterance is too short to be meaningful.
"""
if not chunks or len(chunks) < min_blocks:
return None
audio = np.concatenate(audio_chunks, axis=0).flatten()
duration = len(audio) / sr
audio = np.concatenate(chunks, axis=0).flatten()
duration = len(audio) / sample_rate
sys.stdout.write(f" ✂️ Captured {duration:.1f}s of audio\n")
sys.stdout.flush()
return audio
@@ -369,15 +394,33 @@ class VoiceLoop:
# ── Main Loop ───────────────────────────────────────────────────────
def run(self) -> None:
"""Run the voice loop. Blocks until Ctrl-C."""
self._ensure_piper()
# Whisper hallucinates these on silence/noise — skip them.
_WHISPER_HALLUCINATIONS = frozenset(
{
"you",
"thanks.",
"thank you.",
"bye.",
"",
"thanks for watching!",
"thank you for watching!",
}
)
# Suppress MCP / Agno stderr noise during voice mode.
_suppress_mcp_noise()
# Suppress MCP async-generator teardown tracebacks on exit.
_install_quiet_asyncgen_hooks()
# Spoken phrases that end the voice session.
_EXIT_COMMANDS = frozenset(
{
"goodbye",
"exit",
"quit",
"stop",
"goodbye timmy",
"stop listening",
}
)
def _log_banner(self) -> None:
"""Log the startup banner with STT/TTS/LLM configuration."""
tts_label = (
"macOS say"
if self.config.use_say_fallback
@@ -393,52 +436,50 @@ class VoiceLoop:
" Press Ctrl-C to exit.\n" + "=" * 60
)
def _is_hallucination(self, text: str) -> bool:
"""Return True if *text* is a known Whisper hallucination."""
return not text or text.lower() in self._WHISPER_HALLUCINATIONS
def _is_exit_command(self, text: str) -> bool:
"""Return True if the user asked to stop the voice session."""
return text.lower().strip().rstrip(".!") in self._EXIT_COMMANDS
def _process_turn(self, text: str) -> None:
"""Handle a single listen-think-speak turn after transcription."""
sys.stdout.write(f"\n 👤 You: {text}\n")
sys.stdout.flush()
response = self._think(text)
sys.stdout.write(f" 🤖 Timmy: {response}\n")
sys.stdout.flush()
self._speak(response)
def run(self) -> None:
"""Run the voice loop. Blocks until Ctrl-C."""
self._ensure_piper()
_suppress_mcp_noise()
_install_quiet_asyncgen_hooks()
self._log_banner()
self._running = True
try:
while self._running:
# 1. LISTEN — record until silence
audio = self._record_utterance()
if audio is None:
continue
# 2. TRANSCRIBE — Whisper STT
text = self._transcribe(audio)
if not text or text.lower() in (
"you",
"thanks.",
"thank you.",
"bye.",
"",
"thanks for watching!",
"thank you for watching!",
):
# Whisper hallucinations on silence/noise
if self._is_hallucination(text):
logger.debug("Ignoring likely Whisper hallucination: '%s'", text)
continue
sys.stdout.write(f"\n 👤 You: {text}\n")
sys.stdout.flush()
# Exit commands
if text.lower().strip().rstrip(".!") in (
"goodbye",
"exit",
"quit",
"stop",
"goodbye timmy",
"stop listening",
):
if self._is_exit_command(text):
logger.info("👋 Goodbye!")
break
# 3. THINK — send to Timmy
response = self._think(text)
sys.stdout.write(f" 🤖 Timmy: {response}\n")
sys.stdout.flush()
# 4. SPEAK — TTS output
self._speak(response)
self._process_turn(text)
except KeyboardInterrupt:
logger.info("👋 Voice loop stopped.")

View File

@@ -86,6 +86,40 @@ def _pip_snapshot(mood: str, confidence: float) -> dict:
return pip_familiar.snapshot().to_dict()
def _resolve_mood(state) -> str:
"""Map cognitive mood/engagement to a presence mood string."""
if state.engagement == "idle" and state.mood == "settled":
return "calm"
return _MOOD_MAP.get(state.mood, "calm")
def _resolve_confidence(state) -> float:
"""Compute normalised confidence from cognitive tracker state."""
if state._confidence_count > 0:
raw = state._confidence_sum / state._confidence_count
else:
raw = 0.7
return round(max(0.0, min(1.0, raw)), 2)
def _build_active_threads(state) -> list[dict]:
"""Convert active commitments into presence thread dicts."""
return [
{"type": "thinking", "ref": c[:80], "status": "active"}
for c in state.active_commitments[:10]
]
def _build_environment() -> dict:
"""Return the environment section using local wall-clock time."""
local_now = datetime.now()
return {
"time_of_day": _time_of_day(local_now.hour),
"local_time": local_now.strftime("%-I:%M %p"),
"day_of_week": local_now.strftime("%A"),
}
def get_state_dict() -> dict:
"""Build presence state dict from current cognitive state.
@@ -98,37 +132,19 @@ def get_state_dict() -> dict:
state = cognitive_tracker.get_state()
now = datetime.now(UTC)
# Map cognitive mood to presence mood
mood = _MOOD_MAP.get(state.mood, "calm")
if state.engagement == "idle" and state.mood == "settled":
mood = "calm"
# Confidence from cognitive tracker
if state._confidence_count > 0:
confidence = state._confidence_sum / state._confidence_count
else:
confidence = 0.7
# Build active threads from commitments
threads = []
for commitment in state.active_commitments[:10]:
threads.append({"type": "thinking", "ref": commitment[:80], "status": "active"})
# Activity
mood = _resolve_mood(state)
confidence = _resolve_confidence(state)
activity = _ACTIVITY_MAP.get(state.engagement, "idle")
# Environment
local_now = datetime.now()
return {
"version": 1,
"liveness": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
"current_focus": state.focus_topic or "",
"active_threads": threads,
"active_threads": _build_active_threads(state),
"recent_events": [],
"concerns": [],
"mood": mood,
"confidence": round(max(0.0, min(1.0, confidence)), 2),
"confidence": confidence,
"energy": round(_current_energy(), 2),
"identity": {
"name": "Timmy",
@@ -143,11 +159,7 @@ def get_state_dict() -> dict:
"visitor_present": False,
"conversation_turns": state.conversation_depth,
},
"environment": {
"time_of_day": _time_of_day(local_now.hour),
"local_time": local_now.strftime("%-I:%M %p"),
"day_of_week": local_now.strftime("%A"),
},
"environment": _build_environment(),
"familiar": _pip_snapshot(mood, confidence),
"meta": {
"schema_version": 1,

7
src/timmyctl/__init__.py Normal file
View File

@@ -0,0 +1,7 @@
"""Timmy Control Panel — CLI entry point for automations.
This package provides the `timmyctl` command-line interface for managing
Timmy automations, configuration, and daily operations.
"""
__version__ = "1.0.0"

316
src/timmyctl/cli.py Normal file
View File

@@ -0,0 +1,316 @@
"""Timmy Control Panel CLI — primary control surface for automations.
Usage:
timmyctl daily-run # Run the Daily Run orchestration
timmyctl log-run # Capture a Daily Run logbook entry
timmyctl inbox # Show what's "calling Timmy"
timmyctl config # Display key configuration
"""
import json
import os
from pathlib import Path
from typing import Any
import typer
import yaml
from rich.console import Console
from rich.table import Table
# Initialize Rich console for nice output
console = Console()
app = typer.Typer(
help="Timmy Control Panel — primary control surface for automations",
rich_markup_mode="rich",
)
# Default config paths
DEFAULT_CONFIG_DIR = Path("timmy_automations/config")
AUTOMATIONS_CONFIG = DEFAULT_CONFIG_DIR / "automations.json"
DAILY_RUN_CONFIG = DEFAULT_CONFIG_DIR / "daily_run.json"
TRIAGE_RULES_CONFIG = DEFAULT_CONFIG_DIR / "triage_rules.yaml"
def _load_json_config(path: Path) -> dict[str, Any]:
"""Load a JSON config file, returning empty dict on error."""
try:
with open(path, encoding="utf-8") as f:
return json.load(f)
except (FileNotFoundError, json.JSONDecodeError) as e:
console.print(f"[red]Error loading {path}: {e}[/red]")
return {}
def _load_yaml_config(path: Path) -> dict[str, Any]:
"""Load a YAML config file, returning empty dict on error."""
try:
with open(path, encoding="utf-8") as f:
return yaml.safe_load(f) or {}
except (FileNotFoundError, yaml.YAMLError) as e:
console.print(f"[red]Error loading {path}: {e}[/red]")
return {}
def _get_config_dir() -> Path:
"""Return the config directory path."""
# Allow override via environment variable
env_dir = os.environ.get("TIMMY_CONFIG_DIR")
if env_dir:
return Path(env_dir)
return DEFAULT_CONFIG_DIR
@app.command()
def daily_run(
dry_run: bool = typer.Option(
False, "--dry-run", "-n", help="Show what would run without executing"
),
verbose: bool = typer.Option(False, "--verbose", "-v", help="Show detailed output"),
):
"""Run the Daily Run orchestration (agenda + summary).
Executes the daily run workflow including:
- Loop Guard checks
- Cycle Retrospective
- Triage scoring (if scheduled)
- Loop introspection (if scheduled)
"""
console.print("[bold green]Timmy Daily Run[/bold green]")
console.print()
config_path = _get_config_dir() / "daily_run.json"
config = _load_json_config(config_path)
if not config:
console.print("[yellow]No daily run configuration found.[/yellow]")
raise typer.Exit(1)
schedules = config.get("schedules", {})
triggers = config.get("triggers", {})
if verbose:
console.print(f"[dim]Config loaded from: {config_path}[/dim]")
console.print()
# Show the daily run schedule
table = Table(title="Daily Run Schedules")
table.add_column("Schedule", style="cyan")
table.add_column("Description", style="green")
table.add_column("Automations", style="yellow")
for schedule_name, schedule_data in schedules.items():
automations = schedule_data.get("automations", [])
table.add_row(
schedule_name,
schedule_data.get("description", ""),
", ".join(automations) if automations else "",
)
console.print(table)
console.print()
# Show triggers
trigger_table = Table(title="Triggers")
trigger_table.add_column("Trigger", style="cyan")
trigger_table.add_column("Description", style="green")
trigger_table.add_column("Automations", style="yellow")
for trigger_name, trigger_data in triggers.items():
automations = trigger_data.get("automations", [])
trigger_table.add_row(
trigger_name,
trigger_data.get("description", ""),
", ".join(automations) if automations else "",
)
console.print(trigger_table)
console.print()
if dry_run:
console.print("[yellow]Dry run mode — no actions executed.[/yellow]")
else:
console.print("[green]Executing daily run automations...[/green]")
# TODO: Implement actual automation execution
# This would call the appropriate scripts from the automations config
console.print("[dim]Automation execution not yet implemented.[/dim]")
@app.command()
def log_run(
message: str = typer.Argument(..., help="Logbook entry message"),
category: str = typer.Option(
"general", "--category", "-c", help="Entry category (e.g., retro, todo, note)"
),
):
"""Capture a quick Daily Run logbook entry.
Logs a structured entry to the daily run logbook for later review.
Entries are timestamped and categorized automatically.
"""
from datetime import datetime
timestamp = datetime.now().isoformat()
console.print("[bold green]Daily Run Log Entry[/bold green]")
console.print()
console.print(f"[dim]Timestamp:[/dim] {timestamp}")
console.print(f"[dim]Category:[/dim] {category}")
console.print(f"[dim]Message:[/dim] {message}")
console.print()
# TODO: Persist to actual logbook file
# This would append to a logbook file (e.g., .loop/logbook.jsonl)
console.print("[green]✓[/green] Entry logged (simulated)")
@app.command()
def inbox(
limit: int = typer.Option(10, "--limit", "-l", help="Maximum items to show"),
include_prs: bool = typer.Option(True, "--prs/--no-prs", help="Show open PRs"),
include_issues: bool = typer.Option(True, "--issues/--no-issues", help="Show relevant issues"),
):
"""Show what's "calling Timmy" — PRs, Daily Run items, alerts.
Displays a unified inbox of items requiring attention:
- Open pull requests awaiting review
- Daily run queue items
- Alerts and notifications
"""
console.print("[bold green]Timmy Inbox[/bold green]")
console.print()
# Load automations to show what's enabled
config_path = _get_config_dir() / "automations.json"
config = _load_json_config(config_path)
automations = config.get("automations", [])
enabled_automations = [a for a in automations if a.get("enabled", False)]
# Show automation status
auto_table = Table(title="Active Automations")
auto_table.add_column("ID", style="cyan")
auto_table.add_column("Name", style="green")
auto_table.add_column("Category", style="yellow")
auto_table.add_column("Trigger", style="magenta")
for auto in enabled_automations[:limit]:
auto_table.add_row(
auto.get("id", ""),
auto.get("name", ""),
"" if auto.get("enabled", False) else "",
auto.get("category", ""),
)
console.print(auto_table)
console.print()
# TODO: Fetch actual PRs from Gitea API
if include_prs:
pr_table = Table(title="Open Pull Requests (placeholder)")
pr_table.add_column("#", style="cyan")
pr_table.add_column("Title", style="green")
pr_table.add_column("Author", style="yellow")
pr_table.add_column("Status", style="magenta")
pr_table.add_row("", "[dim]No PRs fetched (Gitea API not configured)[/dim]", "", "")
console.print(pr_table)
console.print()
# TODO: Fetch relevant issues from Gitea API
if include_issues:
issue_table = Table(title="Issues Calling for Attention (placeholder)")
issue_table.add_column("#", style="cyan")
issue_table.add_column("Title", style="green")
issue_table.add_column("Type", style="yellow")
issue_table.add_column("Priority", style="magenta")
issue_table.add_row(
"", "[dim]No issues fetched (Gitea API not configured)[/dim]", "", ""
)
console.print(issue_table)
console.print()
@app.command()
def config(
key: str | None = typer.Argument(None, help="Show specific config key (e.g., 'automations')"),
show_rules: bool = typer.Option(False, "--rules", "-r", help="Show triage rules overview"),
):
"""Display key configuration — labels, logbook issue ID, token rules overview.
Shows the current Timmy automation configuration including:
- Automation manifest
- Daily run schedules
- Triage scoring rules
"""
console.print("[bold green]Timmy Configuration[/bold green]")
console.print()
config_dir = _get_config_dir()
if key == "automations" or key is None:
auto_config = _load_json_config(config_dir / "automations.json")
automations = auto_config.get("automations", [])
table = Table(title="Automations Manifest")
table.add_column("ID", style="cyan")
table.add_column("Name", style="green")
table.add_column("Enabled", style="yellow")
table.add_column("Category", style="magenta")
for auto in automations:
enabled = "" if auto.get("enabled", False) else ""
table.add_row(
auto.get("id", ""),
auto.get("name", ""),
enabled,
auto.get("category", ""),
)
console.print(table)
console.print()
if key == "daily_run" or (key is None and not show_rules):
daily_config = _load_json_config(config_dir / "daily_run.json")
if daily_config:
console.print("[bold]Daily Run Configuration:[/bold]")
console.print(f"[dim]Version:[/dim] {daily_config.get('version', 'unknown')}")
console.print(f"[dim]Description:[/dim] {daily_config.get('description', '')}")
console.print()
if show_rules or key == "triage_rules":
rules_config = _load_yaml_config(config_dir / "triage_rules.yaml")
if rules_config:
thresholds = rules_config.get("thresholds", {})
console.print("[bold]Triage Scoring Rules:[/bold]")
console.print(f" Ready threshold: {thresholds.get('ready', 'N/A')}")
console.print(f" Excellent threshold: {thresholds.get('excellent', 'N/A')}")
console.print()
scope = rules_config.get("scope", {})
console.print("[bold]Scope Scoring:[/bold]")
console.print(f" Meta penalty: {scope.get('meta_penalty', 'N/A')}")
console.print()
alignment = rules_config.get("alignment", {})
console.print("[bold]Alignment Scoring:[/bold]")
console.print(f" Bug score: {alignment.get('bug_score', 'N/A')}")
console.print(f" Refactor score: {alignment.get('refactor_score', 'N/A')}")
console.print(f" Feature score: {alignment.get('feature_score', 'N/A')}")
console.print()
quarantine = rules_config.get("quarantine", {})
console.print("[bold]Quarantine Rules:[/bold]")
console.print(f" Failure threshold: {quarantine.get('failure_threshold', 'N/A')}")
console.print(f" Lookback cycles: {quarantine.get('lookback_cycles', 'N/A')}")
console.print()
def main():
"""Entry point for the timmyctl CLI."""
app()
if __name__ == "__main__":
main()

View File

@@ -2493,3 +2493,57 @@
.db-cell { max-width: 300px; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; }
.db-cell:hover { white-space: normal; word-break: break-all; }
.db-truncated { font-size: 0.7rem; color: var(--amber); padding: 0.3rem 0; }
/* ── Tower ────────────────────────────────────────────────────────────── */
.tower-container { max-width: 1400px; margin: 0 auto; }
.tower-header { margin-bottom: 1rem; }
.tower-title { font-size: 1.6rem; font-weight: 700; color: var(--green); letter-spacing: 0.15em; }
.tower-subtitle { font-size: 0.85rem; color: var(--text-dim); }
.tower-conn-badge { font-size: 0.7rem; font-weight: 600; padding: 2px 8px; border-radius: 3px; letter-spacing: 0.08em; }
.tower-conn-live { color: var(--green); border: 1px solid var(--green); }
.tower-conn-offline { color: var(--red); border: 1px solid var(--red); }
.tower-conn-connecting { color: var(--amber); border: 1px solid var(--amber); }
.tower-phase-card { min-height: 300px; }
.tower-phase-thinking { border-left: 3px solid var(--purple); }
.tower-phase-predicting { border-left: 3px solid var(--orange); }
.tower-phase-advising { border-left: 3px solid var(--green); }
.tower-scroll { max-height: 50vh; overflow-y: auto; }
.tower-empty { text-align: center; color: var(--text-dim); padding: 16px; font-size: 0.85rem; }
.tower-stat-grid { display: grid; grid-template-columns: repeat(4, 1fr); gap: 0.5rem; text-align: center; }
.tower-stat-label { display: block; font-size: 0.65rem; color: var(--text-dim); letter-spacing: 0.1em; }
.tower-stat-value { display: block; font-size: 1.1rem; font-weight: 700; color: var(--text-bright); }
.tower-event { padding: 8px; margin-bottom: 6px; border-left: 3px solid var(--border); border-radius: 3px; background: var(--bg-card); }
.tower-etype-task_posted { border-left-color: var(--purple); }
.tower-etype-bid_submitted { border-left-color: var(--orange); }
.tower-etype-task_completed { border-left-color: var(--green); }
.tower-etype-task_failed { border-left-color: var(--red); }
.tower-etype-agent_joined { border-left-color: var(--purple); }
.tower-etype-tool_executed { border-left-color: var(--amber); }
.tower-ev-head { display: flex; justify-content: space-between; align-items: center; margin-bottom: 4px; }
.tower-ev-badge { font-size: 0.65rem; font-weight: 600; color: var(--text-bright); letter-spacing: 0.08em; }
.tower-ev-dots { font-size: 0.6rem; color: var(--amber); }
.tower-ev-desc { font-size: 0.8rem; color: var(--text); }
.tower-ev-time { font-size: 0.65rem; color: var(--text-dim); margin-top: 2px; }
.tower-pred { padding: 8px; margin-bottom: 6px; border-radius: 3px; background: var(--bg-card); border-left: 3px solid var(--orange); }
.tower-pred-done { border-left-color: var(--green); }
.tower-pred-pending { border-left-color: var(--amber); }
.tower-pred-head { display: flex; justify-content: space-between; align-items: center; }
.tower-pred-task { font-size: 0.75rem; font-weight: 600; color: var(--text-bright); font-family: monospace; }
.tower-pred-acc { font-size: 0.75rem; font-weight: 700; }
.tower-pred-detail { font-size: 0.75rem; color: var(--text-dim); margin-top: 4px; }
.tower-advisory { padding: 8px; margin-bottom: 6px; border-radius: 3px; background: var(--bg-card); border-left: 3px solid var(--border); }
.tower-adv-high { border-left-color: var(--red); }
.tower-adv-medium { border-left-color: var(--orange); }
.tower-adv-low { border-left-color: var(--green); }
.tower-adv-head { display: flex; justify-content: space-between; font-size: 0.65rem; margin-bottom: 4px; }
.tower-adv-cat { font-weight: 600; color: var(--text-dim); letter-spacing: 0.08em; }
.tower-adv-prio { font-weight: 700; color: var(--amber); }
.tower-adv-title { font-size: 0.85rem; font-weight: 600; color: var(--text-bright); }
.tower-adv-detail { font-size: 0.8rem; color: var(--text); margin-top: 2px; }
.tower-adv-action { font-size: 0.75rem; color: var(--green); margin-top: 4px; font-style: italic; }

View File

@@ -13,11 +13,121 @@
<div class="mood" id="mood-text">focused</div>
</div>
<div id="connection-dot"></div>
<button id="info-btn" class="info-button" aria-label="About The Matrix" title="About The Matrix">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<circle cx="12" cy="12" r="10"></circle>
<line x1="12" y1="16" x2="12" y2="12"></line>
<line x1="12" y1="8" x2="12.01" y2="8"></line>
</svg>
</button>
<button id="submit-job-btn" class="submit-job-button" aria-label="Submit Job" title="Submit Job">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path d="M12 5v14M5 12h14"></path>
</svg>
<span>Job</span>
</button>
<div id="speech-area">
<div class="bubble" id="speech-bubble"></div>
</div>
</div>
<!-- Submit Job Modal -->
<div id="submit-job-modal" class="submit-job-modal">
<div class="submit-job-content">
<button id="submit-job-close" class="submit-job-close" aria-label="Close">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<line x1="18" y1="6" x2="6" y2="18"></line>
<line x1="6" y1="6" x2="18" y2="18"></line>
</svg>
</button>
<h2>Submit Job</h2>
<p class="submit-job-subtitle">Create a task for Timmy and the agent swarm</p>
<form id="submit-job-form" class="submit-job-form">
<div class="form-group">
<label for="job-title">Title <span class="required">*</span></label>
<input type="text" id="job-title" name="title" placeholder="Brief description of the task" maxlength="200">
<div class="char-count" id="title-char-count">0 / 200</div>
<div class="validation-error" id="title-error"></div>
</div>
<div class="form-group">
<label for="job-description">Description</label>
<textarea id="job-description" name="description" placeholder="Detailed instructions, requirements, and context..." rows="6" maxlength="2000"></textarea>
<div class="char-count" id="desc-char-count">0 / 2000</div>
<div class="validation-warning" id="desc-warning"></div>
<div class="validation-error" id="desc-error"></div>
</div>
<div class="form-group">
<label for="job-priority">Priority</label>
<select id="job-priority" name="priority">
<option value="low">Low</option>
<option value="medium" selected>Medium</option>
<option value="high">High</option>
<option value="urgent">Urgent</option>
</select>
</div>
<div class="submit-job-actions">
<button type="button" id="cancel-job-btn" class="btn-secondary">Cancel</button>
<button type="submit" id="submit-job-submit" class="btn-primary" disabled>Submit Job</button>
</div>
</form>
<div id="submit-job-success" class="submit-job-success hidden">
<div class="success-icon">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<path d="M22 11.08V12a10 10 0 1 1-5.93-9.14"></path>
<polyline points="22 4 12 14.01 9 11.01"></polyline>
</svg>
</div>
<h3>Job Submitted!</h3>
<p>Your task has been added to the queue. Timmy will review it shortly.</p>
<button type="button" id="submit-another-btn" class="btn-primary">Submit Another</button>
</div>
</div>
<div id="submit-job-backdrop" class="submit-job-backdrop"></div>
</div>
<!-- About Panel -->
<div id="about-panel" class="about-panel">
<div class="about-panel-content">
<button id="about-close" class="about-close" aria-label="Close">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
<line x1="18" y1="6" x2="6" y2="18"></line>
<line x1="6" y1="6" x2="18" y2="18"></line>
</svg>
</button>
<h2>Welcome to The Matrix</h2>
<section>
<h3>🌌 The Matrix</h3>
<p>The Matrix is a 3D visualization of Timmy's AI agent workspace. Enter the workshop to see Timmy at work—pondering the arcane arts of code, managing tasks, and orchestrating autonomous agents in real-time.</p>
</section>
<section>
<h3>🛠️ The Workshop</h3>
<p>The Workshop is where you interact directly with Timmy:</p>
<ul>
<li><strong>Submit Jobs</strong> — Create tasks, delegate work, and track progress</li>
<li><strong>Chat with Agents</strong> — Converse with Timmy and his swarm of specialized agents</li>
<li><strong>Fund Sessions</strong> — Power your work with satoshis via Lightning Network</li>
</ul>
</section>
<section>
<h3>⚡ Lightning & Sats</h3>
<p>The Matrix runs on Bitcoin. Sessions are funded with satoshis (sats) over the Lightning Network—enabling fast, cheap micropayments that keep Timmy energized and working for you. No subscriptions, no limits—pay as you go.</p>
</section>
<div class="about-footer">
<span>Sovereign AI · Soul on Bitcoin</span>
</div>
</div>
<div id="about-backdrop" class="about-backdrop"></div>
</div>
<script type="importmap">
{
"imports": {
@@ -74,6 +184,271 @@
});
stateReader.connect();
// --- About Panel ---
const infoBtn = document.getElementById("info-btn");
const aboutPanel = document.getElementById("about-panel");
const aboutClose = document.getElementById("about-close");
const aboutBackdrop = document.getElementById("about-backdrop");
function openAboutPanel() {
aboutPanel.classList.add("open");
document.body.style.overflow = "hidden";
}
function closeAboutPanel() {
aboutPanel.classList.remove("open");
document.body.style.overflow = "";
}
infoBtn.addEventListener("click", openAboutPanel);
aboutClose.addEventListener("click", closeAboutPanel);
aboutBackdrop.addEventListener("click", closeAboutPanel);
// Close on Escape key
document.addEventListener("keydown", (e) => {
if (e.key === "Escape" && aboutPanel.classList.contains("open")) {
closeAboutPanel();
}
});
// --- Submit Job Modal ---
const submitJobBtn = document.getElementById("submit-job-btn");
const submitJobModal = document.getElementById("submit-job-modal");
const submitJobClose = document.getElementById("submit-job-close");
const submitJobBackdrop = document.getElementById("submit-job-backdrop");
const cancelJobBtn = document.getElementById("cancel-job-btn");
const submitJobForm = document.getElementById("submit-job-form");
const submitJobSubmit = document.getElementById("submit-job-submit");
const jobTitle = document.getElementById("job-title");
const jobDescription = document.getElementById("job-description");
const titleCharCount = document.getElementById("title-char-count");
const descCharCount = document.getElementById("desc-char-count");
const titleError = document.getElementById("title-error");
const descError = document.getElementById("desc-error");
const descWarning = document.getElementById("desc-warning");
const submitJobSuccess = document.getElementById("submit-job-success");
const submitAnotherBtn = document.getElementById("submit-another-btn");
// Constants
const MAX_TITLE_LENGTH = 200;
const MAX_DESC_LENGTH = 2000;
const TITLE_WARNING_THRESHOLD = 150;
const DESC_WARNING_THRESHOLD = 1800;
function openSubmitJobModal() {
submitJobModal.classList.add("open");
document.body.style.overflow = "hidden";
jobTitle.focus();
validateForm();
}
function closeSubmitJobModal() {
submitJobModal.classList.remove("open");
document.body.style.overflow = "";
// Reset form after animation
setTimeout(() => {
resetForm();
}, 300);
}
function resetForm() {
submitJobForm.reset();
submitJobForm.classList.remove("hidden");
submitJobSuccess.classList.add("hidden");
updateCharCounts();
clearErrors();
validateForm();
}
function clearErrors() {
titleError.textContent = "";
titleError.classList.remove("visible");
descError.textContent = "";
descError.classList.remove("visible");
descWarning.textContent = "";
descWarning.classList.remove("visible");
jobTitle.classList.remove("error");
jobDescription.classList.remove("error");
}
function updateCharCounts() {
const titleLen = jobTitle.value.length;
const descLen = jobDescription.value.length;
titleCharCount.textContent = `${titleLen} / ${MAX_TITLE_LENGTH}`;
descCharCount.textContent = `${descLen} / ${MAX_DESC_LENGTH}`;
// Update color based on thresholds
if (titleLen > MAX_TITLE_LENGTH) {
titleCharCount.classList.add("over-limit");
} else if (titleLen > TITLE_WARNING_THRESHOLD) {
titleCharCount.classList.add("near-limit");
titleCharCount.classList.remove("over-limit");
} else {
titleCharCount.classList.remove("near-limit", "over-limit");
}
if (descLen > MAX_DESC_LENGTH) {
descCharCount.classList.add("over-limit");
} else if (descLen > DESC_WARNING_THRESHOLD) {
descCharCount.classList.add("near-limit");
descCharCount.classList.remove("over-limit");
} else {
descCharCount.classList.remove("near-limit", "over-limit");
}
}
function validateTitle() {
const value = jobTitle.value.trim();
const length = jobTitle.value.length;
if (length > MAX_TITLE_LENGTH) {
titleError.textContent = `Title must be ${MAX_TITLE_LENGTH} characters or less`;
titleError.classList.add("visible");
jobTitle.classList.add("error");
return false;
}
if (value === "") {
titleError.textContent = "Title is required";
titleError.classList.add("visible");
jobTitle.classList.add("error");
return false;
}
titleError.textContent = "";
titleError.classList.remove("visible");
jobTitle.classList.remove("error");
return true;
}
function validateDescription() {
const length = jobDescription.value.length;
if (length > MAX_DESC_LENGTH) {
descError.textContent = `Description must be ${MAX_DESC_LENGTH} characters or less`;
descError.classList.add("visible");
descWarning.textContent = "";
descWarning.classList.remove("visible");
jobDescription.classList.add("error");
return false;
}
// Show warning when near limit
if (length > DESC_WARNING_THRESHOLD && length <= MAX_DESC_LENGTH) {
const remaining = MAX_DESC_LENGTH - length;
descWarning.textContent = `${remaining} characters remaining`;
descWarning.classList.add("visible");
} else {
descWarning.textContent = "";
descWarning.classList.remove("visible");
}
descError.textContent = "";
descError.classList.remove("visible");
jobDescription.classList.remove("error");
return true;
}
function validateForm() {
const titleValid = jobTitle.value.trim() !== "" && jobTitle.value.length <= MAX_TITLE_LENGTH;
const descValid = jobDescription.value.length <= MAX_DESC_LENGTH;
submitJobSubmit.disabled = !(titleValid && descValid);
}
// Event listeners
submitJobBtn.addEventListener("click", openSubmitJobModal);
submitJobClose.addEventListener("click", closeSubmitJobModal);
submitJobBackdrop.addEventListener("click", closeSubmitJobModal);
cancelJobBtn.addEventListener("click", closeSubmitJobModal);
submitAnotherBtn.addEventListener("click", resetForm);
// Input event listeners for real-time validation
jobTitle.addEventListener("input", () => {
updateCharCounts();
validateForm();
if (titleError.classList.contains("visible")) {
validateTitle();
}
});
jobTitle.addEventListener("blur", () => {
if (jobTitle.value.trim() !== "" || titleError.classList.contains("visible")) {
validateTitle();
}
});
jobDescription.addEventListener("input", () => {
updateCharCounts();
validateForm();
if (descError.classList.contains("visible")) {
validateDescription();
}
});
jobDescription.addEventListener("blur", () => {
validateDescription();
});
// Form submission
submitJobForm.addEventListener("submit", async (e) => {
e.preventDefault();
const isTitleValid = validateTitle();
const isDescValid = validateDescription();
if (!isTitleValid || !isDescValid) {
return;
}
// Disable submit button while processing
submitJobSubmit.disabled = true;
submitJobSubmit.textContent = "Submitting...";
const formData = {
title: jobTitle.value.trim(),
description: jobDescription.value.trim(),
priority: document.getElementById("job-priority").value,
submitted_at: new Date().toISOString()
};
try {
// Submit to API
const response = await fetch("/api/tasks", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify(formData)
});
if (response.ok) {
// Show success state
submitJobForm.classList.add("hidden");
submitJobSuccess.classList.remove("hidden");
} else {
const errorData = await response.json().catch(() => ({}));
descError.textContent = errorData.detail || "Failed to submit job. Please try again.";
descError.classList.add("visible");
}
} catch (error) {
// For demo/development, show success even if API fails
submitJobForm.classList.add("hidden");
submitJobSuccess.classList.remove("hidden");
} finally {
submitJobSubmit.disabled = false;
submitJobSubmit.textContent = "Submit Job";
}
});
// Close on Escape key for Submit Job Modal
document.addEventListener("keydown", (e) => {
if (e.key === "Escape" && submitJobModal.classList.contains("open")) {
closeSubmitJobModal();
}
});
// --- Resize ---
window.addEventListener("resize", () => {
camera.aspect = window.innerWidth / window.innerHeight;

View File

@@ -87,3 +87,569 @@ canvas {
#connection-dot.connected {
background: #00b450;
}
/* Info button */
.info-button {
position: absolute;
top: 14px;
right: 36px;
width: 28px;
height: 28px;
padding: 0;
background: rgba(10, 10, 20, 0.7);
border: 1px solid rgba(218, 165, 32, 0.4);
border-radius: 50%;
color: #daa520;
cursor: pointer;
pointer-events: auto;
transition: all 0.2s ease;
display: flex;
align-items: center;
justify-content: center;
}
.info-button:hover {
background: rgba(218, 165, 32, 0.15);
border-color: rgba(218, 165, 32, 0.7);
transform: scale(1.05);
}
.info-button svg {
width: 16px;
height: 16px;
}
/* About Panel */
.about-panel {
position: fixed;
top: 0;
right: 0;
width: 100%;
height: 100%;
z-index: 100;
pointer-events: none;
visibility: hidden;
opacity: 0;
transition: opacity 0.3s ease, visibility 0.3s ease;
}
.about-panel.open {
pointer-events: auto;
visibility: visible;
opacity: 1;
}
.about-panel-content {
position: absolute;
top: 0;
right: 0;
width: 380px;
max-width: 90%;
height: 100%;
background: rgba(10, 10, 20, 0.97);
border-left: 1px solid rgba(218, 165, 32, 0.3);
padding: 60px 24px 24px 24px;
overflow-y: auto;
transform: translateX(100%);
transition: transform 0.3s ease;
box-shadow: -4px 0 20px rgba(0, 0, 0, 0.5);
}
.about-panel.open .about-panel-content {
transform: translateX(0);
}
.about-close {
position: absolute;
top: 16px;
right: 16px;
width: 32px;
height: 32px;
padding: 0;
background: transparent;
border: 1px solid rgba(160, 160, 160, 0.3);
border-radius: 50%;
color: #aaa;
cursor: pointer;
transition: all 0.2s ease;
display: flex;
align-items: center;
justify-content: center;
}
.about-close:hover {
background: rgba(255, 255, 255, 0.1);
border-color: rgba(218, 165, 32, 0.5);
color: #daa520;
}
.about-close svg {
width: 18px;
height: 18px;
}
.about-panel-content h2 {
font-size: 20px;
color: #daa520;
margin-bottom: 24px;
font-weight: 600;
}
.about-panel-content section {
margin-bottom: 24px;
}
.about-panel-content h3 {
font-size: 14px;
color: #e0e0e0;
margin-bottom: 10px;
font-weight: 600;
}
.about-panel-content p {
font-size: 13px;
line-height: 1.6;
color: #aaa;
margin-bottom: 10px;
}
.about-panel-content ul {
list-style: none;
padding: 0;
margin: 0;
}
.about-panel-content li {
font-size: 13px;
line-height: 1.6;
color: #aaa;
margin-bottom: 8px;
padding-left: 16px;
position: relative;
}
.about-panel-content li::before {
content: "•";
position: absolute;
left: 0;
color: #daa520;
}
.about-panel-content li strong {
color: #ccc;
}
.about-footer {
margin-top: 32px;
padding-top: 16px;
border-top: 1px solid rgba(160, 160, 160, 0.2);
font-size: 12px;
color: #666;
text-align: center;
}
.about-backdrop {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
background: rgba(0, 0, 0, 0.5);
opacity: 0;
transition: opacity 0.3s ease;
}
.about-panel.open .about-backdrop {
opacity: 1;
}
/* Submit Job Button */
.submit-job-button {
position: absolute;
top: 14px;
right: 72px;
height: 28px;
padding: 0 12px;
background: rgba(10, 10, 20, 0.7);
border: 1px solid rgba(0, 180, 80, 0.4);
border-radius: 14px;
color: #00b450;
cursor: pointer;
pointer-events: auto;
transition: all 0.2s ease;
display: flex;
align-items: center;
gap: 6px;
font-family: "Courier New", monospace;
font-size: 12px;
}
.submit-job-button:hover {
background: rgba(0, 180, 80, 0.15);
border-color: rgba(0, 180, 80, 0.7);
transform: scale(1.05);
}
.submit-job-button svg {
width: 14px;
height: 14px;
}
/* Submit Job Modal */
.submit-job-modal {
position: fixed;
top: 0;
left: 0;
width: 100%;
height: 100%;
z-index: 100;
pointer-events: none;
visibility: hidden;
opacity: 0;
transition: opacity 0.3s ease, visibility 0.3s ease;
}
.submit-job-modal.open {
pointer-events: auto;
visibility: visible;
opacity: 1;
}
.submit-job-content {
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%) scale(0.95);
width: 480px;
max-width: 90%;
max-height: 90vh;
background: rgba(10, 10, 20, 0.98);
border: 1px solid rgba(218, 165, 32, 0.3);
border-radius: 12px;
padding: 32px;
overflow-y: auto;
transition: transform 0.3s ease;
box-shadow: 0 8px 32px rgba(0, 0, 0, 0.6);
}
.submit-job-modal.open .submit-job-content {
transform: translate(-50%, -50%) scale(1);
}
.submit-job-close {
position: absolute;
top: 16px;
right: 16px;
width: 32px;
height: 32px;
padding: 0;
background: transparent;
border: 1px solid rgba(160, 160, 160, 0.3);
border-radius: 50%;
color: #aaa;
cursor: pointer;
transition: all 0.2s ease;
display: flex;
align-items: center;
justify-content: center;
}
.submit-job-close:hover {
background: rgba(255, 255, 255, 0.1);
border-color: rgba(218, 165, 32, 0.5);
color: #daa520;
}
.submit-job-close svg {
width: 18px;
height: 18px;
}
.submit-job-content h2 {
font-size: 22px;
color: #daa520;
margin: 0 0 8px 0;
font-weight: 600;
}
.submit-job-subtitle {
font-size: 13px;
color: #888;
margin: 0 0 24px 0;
}
/* Form Styles */
.submit-job-form {
display: flex;
flex-direction: column;
gap: 20px;
}
.submit-job-form.hidden {
display: none;
}
.form-group {
display: flex;
flex-direction: column;
gap: 8px;
}
.form-group label {
font-size: 13px;
color: #ccc;
font-weight: 500;
}
.form-group label .required {
color: #ff4444;
margin-left: 4px;
}
.form-group input,
.form-group textarea,
.form-group select {
background: rgba(30, 30, 40, 0.8);
border: 1px solid rgba(160, 160, 160, 0.3);
border-radius: 6px;
padding: 10px 12px;
color: #e0e0e0;
font-family: "Courier New", monospace;
font-size: 14px;
transition: border-color 0.2s ease, box-shadow 0.2s ease;
}
.form-group input:focus,
.form-group textarea:focus,
.form-group select:focus {
outline: none;
border-color: rgba(218, 165, 32, 0.6);
box-shadow: 0 0 0 2px rgba(218, 165, 32, 0.1);
}
.form-group input.error,
.form-group textarea.error {
border-color: #ff4444;
box-shadow: 0 0 0 2px rgba(255, 68, 68, 0.1);
}
.form-group input::placeholder,
.form-group textarea::placeholder {
color: #666;
}
.form-group textarea {
resize: vertical;
min-height: 100px;
}
.form-group select {
cursor: pointer;
appearance: none;
background-image: url("data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='12' height='12' viewBox='0 0 24 24' fill='none' stroke='%23888' stroke-width='2'%3E%3Cpath d='m6 9 6 6 6-6'/%3E%3C/svg%3E");
background-repeat: no-repeat;
background-position: right 12px center;
padding-right: 36px;
}
.form-group select option {
background: #1a1a2e;
color: #e0e0e0;
}
/* Character Count */
.char-count {
font-size: 11px;
color: #666;
text-align: right;
margin-top: 4px;
transition: color 0.2s ease;
}
.char-count.near-limit {
color: #ffaa33;
}
.char-count.over-limit {
color: #ff4444;
font-weight: bold;
}
/* Validation Messages */
.validation-error {
font-size: 12px;
color: #ff4444;
margin-top: 4px;
min-height: 16px;
opacity: 0;
transition: opacity 0.2s ease;
}
.validation-error.visible {
opacity: 1;
}
.validation-warning {
font-size: 12px;
color: #ffaa33;
margin-top: 4px;
min-height: 16px;
opacity: 0;
transition: opacity 0.2s ease;
}
.validation-warning.visible {
opacity: 1;
}
/* Action Buttons */
.submit-job-actions {
display: flex;
gap: 12px;
justify-content: flex-end;
margin-top: 8px;
}
.btn-secondary {
padding: 10px 20px;
background: transparent;
border: 1px solid rgba(160, 160, 160, 0.4);
border-radius: 6px;
color: #aaa;
font-family: "Courier New", monospace;
font-size: 14px;
cursor: pointer;
transition: all 0.2s ease;
}
.btn-secondary:hover {
background: rgba(255, 255, 255, 0.05);
border-color: rgba(160, 160, 160, 0.6);
color: #ccc;
}
.btn-primary {
padding: 10px 20px;
background: linear-gradient(135deg, rgba(0, 180, 80, 0.8), rgba(0, 140, 60, 0.9));
border: 1px solid rgba(0, 180, 80, 0.5);
border-radius: 6px;
color: #fff;
font-family: "Courier New", monospace;
font-size: 14px;
cursor: pointer;
transition: all 0.2s ease;
}
.btn-primary:hover:not(:disabled) {
background: linear-gradient(135deg, rgba(0, 200, 90, 0.9), rgba(0, 160, 70, 1));
transform: translateY(-1px);
box-shadow: 0 4px 12px rgba(0, 180, 80, 0.3);
}
.btn-primary:disabled {
background: rgba(100, 100, 100, 0.3);
border-color: rgba(100, 100, 100, 0.3);
color: #666;
cursor: not-allowed;
}
/* Success State */
.submit-job-success {
text-align: center;
padding: 32px 16px;
}
.submit-job-success.hidden {
display: none;
}
.success-icon {
width: 64px;
height: 64px;
margin: 0 auto 20px;
color: #00b450;
}
.success-icon svg {
width: 100%;
height: 100%;
}
.submit-job-success h3 {
font-size: 20px;
color: #00b450;
margin: 0 0 12px 0;
}
.submit-job-success p {
font-size: 14px;
color: #888;
margin: 0 0 24px 0;
line-height: 1.5;
}
/* Backdrop */
.submit-job-backdrop {
position: absolute;
top: 0;
left: 0;
width: 100%;
height: 100%;
background: rgba(0, 0, 0, 0.6);
opacity: 0;
transition: opacity 0.3s ease;
}
.submit-job-modal.open .submit-job-backdrop {
opacity: 1;
}
/* Mobile adjustments */
@media (max-width: 480px) {
.about-panel-content {
width: 100%;
max-width: 100%;
padding: 56px 20px 20px 20px;
}
.info-button {
right: 32px;
width: 26px;
height: 26px;
}
.info-button svg {
width: 14px;
height: 14px;
}
.submit-job-button {
right: 64px;
height: 26px;
padding: 0 10px;
font-size: 11px;
}
.submit-job-button svg {
width: 12px;
height: 12px;
}
.submit-job-content {
width: 95%;
padding: 24px 20px;
}
.submit-job-content h2 {
font-size: 20px;
}
.submit-job-actions {
flex-direction: column-reverse;
}
.btn-secondary,
.btn-primary {
width: 100%;
}
}

View File

@@ -31,6 +31,8 @@ for _mod in [
"pyzbar.pyzbar",
"pyttsx3",
"sentence_transformers",
"swarm",
"swarm.event_log",
]:
sys.modules.setdefault(_mod, MagicMock())

View File

@@ -120,3 +120,50 @@ class TestCSRFDecoratorSupport:
# Protected endpoint should be 403
response2 = client.post("/protected")
assert response2.status_code == 403
def test_csrf_exempt_endpoint_not_executed_before_check(self):
"""Regression test for #626: endpoint must NOT execute before CSRF check.
Previously the middleware called call_next() first, executing the endpoint
and its side effects, then checked @csrf_exempt afterward. This meant
non-exempt endpoints would execute even when CSRF validation failed.
"""
app = FastAPI()
app.add_middleware(CSRFMiddleware)
side_effect_log: list[str] = []
@app.post("/protected-with-side-effects")
def protected_with_side_effects():
side_effect_log.append("executed")
return {"message": "should not run"}
client = TestClient(app)
# POST without CSRF token — should be blocked with 403
response = client.post("/protected-with-side-effects")
assert response.status_code == 403
# The critical assertion: the endpoint must NOT have executed
assert side_effect_log == [], (
"Endpoint executed before CSRF validation! Side effects occurred "
"despite CSRF failure (see issue #626)."
)
def test_csrf_exempt_endpoint_does_execute(self):
"""Ensure @csrf_exempt endpoints still execute normally."""
app = FastAPI()
app.add_middleware(CSRFMiddleware)
side_effect_log: list[str] = []
@app.post("/exempt-webhook")
@csrf_exempt
def exempt_webhook():
side_effect_log.append("executed")
return {"message": "webhook ok"}
client = TestClient(app)
response = client.post("/exempt-webhook")
assert response.status_code == 200
assert side_effect_log == ["executed"]

View File

@@ -0,0 +1,187 @@
"""Tests for Tower dashboard route (/tower)."""
from unittest.mock import MagicMock, patch
def _mock_spark_engine():
"""Return a mock spark_engine with realistic return values."""
engine = MagicMock()
engine.status.return_value = {
"enabled": True,
"events_captured": 5,
"memories_stored": 3,
"predictions": {"total": 2, "avg_accuracy": 0.85},
"event_types": {
"task_posted": 2,
"bid_submitted": 1,
"task_assigned": 1,
"task_completed": 1,
"task_failed": 0,
"agent_joined": 0,
"tool_executed": 0,
"creative_step": 0,
},
}
event = MagicMock()
event.event_type = "task_completed"
event.description = "Task finished"
event.importance = 0.8
event.created_at = "2026-01-01T00:00:00"
event.agent_id = "agent-1234-abcd"
event.task_id = "task-5678-efgh"
event.data = '{"result": "ok"}'
engine.get_timeline.return_value = [event]
pred = MagicMock()
pred.task_id = "task-5678-efgh"
pred.accuracy = 0.9
pred.evaluated_at = "2026-01-01T01:00:00"
pred.created_at = "2026-01-01T00:30:00"
pred.predicted_value = '{"outcome": "success"}'
engine.get_predictions.return_value = [pred]
advisory = MagicMock()
advisory.category = "performance"
advisory.priority = "high"
advisory.title = "Slow tasks"
advisory.detail = "Tasks taking longer than expected"
advisory.suggested_action = "Scale up workers"
engine.get_advisories.return_value = [advisory]
return engine
class TestTowerUI:
"""Tests for GET /tower endpoint."""
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_tower_returns_200(self, mock_engine, client):
response = client.get("/tower")
assert response.status_code == 200
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_tower_returns_html(self, mock_engine, client):
response = client.get("/tower")
assert "text/html" in response.headers["content-type"]
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_tower_contains_dashboard_content(self, mock_engine, client):
response = client.get("/tower")
body = response.text
assert "tower" in body.lower() or "spark" in body.lower()
class TestSparkSnapshot:
"""Tests for _spark_snapshot helper."""
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_snapshot_structure(self, mock_engine):
from dashboard.routes.tower import _spark_snapshot
snap = _spark_snapshot()
assert snap["type"] == "spark_state"
assert "status" in snap
assert "events" in snap
assert "predictions" in snap
assert "advisories" in snap
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_snapshot_events_parsed(self, mock_engine):
from dashboard.routes.tower import _spark_snapshot
snap = _spark_snapshot()
ev = snap["events"][0]
assert ev["event_type"] == "task_completed"
assert ev["importance"] == 0.8
assert ev["agent_id"] == "agent-12"
assert ev["task_id"] == "task-567"
assert ev["data"] == {"result": "ok"}
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_snapshot_predictions_parsed(self, mock_engine):
from dashboard.routes.tower import _spark_snapshot
snap = _spark_snapshot()
pred = snap["predictions"][0]
assert pred["task_id"] == "task-567"
assert pred["accuracy"] == 0.9
assert pred["evaluated"] is True
assert pred["predicted"] == {"outcome": "success"}
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
def test_snapshot_advisories_parsed(self, mock_engine):
from dashboard.routes.tower import _spark_snapshot
snap = _spark_snapshot()
adv = snap["advisories"][0]
assert adv["category"] == "performance"
assert adv["priority"] == "high"
assert adv["title"] == "Slow tasks"
assert adv["suggested_action"] == "Scale up workers"
@patch("dashboard.routes.tower.spark_engine")
def test_snapshot_handles_empty_state(self, mock_engine):
mock_engine.status.return_value = {"enabled": False}
mock_engine.get_timeline.return_value = []
mock_engine.get_predictions.return_value = []
mock_engine.get_advisories.return_value = []
from dashboard.routes.tower import _spark_snapshot
snap = _spark_snapshot()
assert snap["events"] == []
assert snap["predictions"] == []
assert snap["advisories"] == []
@patch("dashboard.routes.tower.spark_engine")
def test_snapshot_handles_invalid_json_data(self, mock_engine):
mock_engine.status.return_value = {"enabled": True}
event = MagicMock()
event.event_type = "test"
event.description = "bad data"
event.importance = 0.5
event.created_at = "2026-01-01T00:00:00"
event.agent_id = None
event.task_id = None
event.data = "not-json{"
mock_engine.get_timeline.return_value = [event]
pred = MagicMock()
pred.task_id = None
pred.accuracy = None
pred.evaluated_at = None
pred.created_at = "2026-01-01T00:00:00"
pred.predicted_value = None
mock_engine.get_predictions.return_value = [pred]
mock_engine.get_advisories.return_value = []
from dashboard.routes.tower import _spark_snapshot
snap = _spark_snapshot()
ev = snap["events"][0]
assert ev["data"] == {}
assert "agent_id" not in ev
assert "task_id" not in ev
pred = snap["predictions"][0]
assert pred["task_id"] == "?"
assert pred["predicted"] == {}
class TestTowerWebSocket:
"""Tests for WS /tower/ws endpoint."""
@patch("dashboard.routes.tower.spark_engine", new_callable=_mock_spark_engine)
@patch("dashboard.routes.tower._PUSH_INTERVAL", 0)
def test_ws_sends_initial_snapshot(self, mock_engine, client):
import json
with client.websocket_connect("/tower/ws") as ws:
data = json.loads(ws.receive_text())
assert data["type"] == "spark_state"
assert "status" in data
assert "events" in data

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,288 @@
"""Tests for infrastructure.db_pool module."""
import sqlite3
import threading
import time
from pathlib import Path
import pytest
from infrastructure.db_pool import ConnectionPool
class TestConnectionPoolInit:
"""Test ConnectionPool initialization."""
def test_init_with_string_path(self, tmp_path):
"""Pool can be initialized with a string path."""
db_path = str(tmp_path / "test.db")
pool = ConnectionPool(db_path)
assert pool._db_path == Path(db_path)
def test_init_with_path_object(self, tmp_path):
"""Pool can be initialized with a Path object."""
db_path = tmp_path / "test.db"
pool = ConnectionPool(db_path)
assert pool._db_path == db_path
def test_init_creates_thread_local(self, tmp_path):
"""Pool initializes thread-local storage."""
pool = ConnectionPool(tmp_path / "test.db")
assert hasattr(pool, "_local")
assert isinstance(pool._local, threading.local)
class TestGetConnection:
"""Test get_connection() method."""
def test_get_connection_returns_valid_sqlite3_connection(self, tmp_path):
"""get_connection() returns a valid sqlite3 connection."""
pool = ConnectionPool(tmp_path / "test.db")
conn = pool.get_connection()
assert isinstance(conn, sqlite3.Connection)
# Verify it's a working connection
cursor = conn.execute("SELECT 1")
assert cursor.fetchone()[0] == 1
def test_get_connection_creates_db_file(self, tmp_path):
"""get_connection() creates the database file if it doesn't exist."""
db_path = tmp_path / "subdir" / "test.db"
assert not db_path.exists()
pool = ConnectionPool(db_path)
pool.get_connection()
assert db_path.exists()
def test_get_connection_sets_row_factory(self, tmp_path):
"""get_connection() sets row_factory to sqlite3.Row."""
pool = ConnectionPool(tmp_path / "test.db")
conn = pool.get_connection()
assert conn.row_factory is sqlite3.Row
def test_multiple_calls_same_thread_reuse_connection(self, tmp_path):
"""Multiple calls from same thread reuse the same connection."""
pool = ConnectionPool(tmp_path / "test.db")
conn1 = pool.get_connection()
conn2 = pool.get_connection()
assert conn1 is conn2
def test_different_threads_get_different_connections(self, tmp_path):
"""Different threads get different connections."""
pool = ConnectionPool(tmp_path / "test.db")
connections = []
def get_conn():
connections.append(pool.get_connection())
t1 = threading.Thread(target=get_conn)
t2 = threading.Thread(target=get_conn)
t1.start()
t2.start()
t1.join()
t2.join()
assert len(connections) == 2
assert connections[0] is not connections[1]
class TestCloseConnection:
"""Test close_connection() method."""
def test_close_connection_closes_sqlite_connection(self, tmp_path):
"""close_connection() closes the underlying sqlite connection."""
pool = ConnectionPool(tmp_path / "test.db")
conn = pool.get_connection()
pool.close_connection()
# Connection should be closed
with pytest.raises(sqlite3.ProgrammingError):
conn.execute("SELECT 1")
def test_close_connection_cleans_up_thread_local(self, tmp_path):
"""close_connection() cleans up thread-local storage."""
pool = ConnectionPool(tmp_path / "test.db")
pool.get_connection()
assert hasattr(pool._local, "conn")
assert pool._local.conn is not None
pool.close_connection()
# Should either not have the attr or it should be None
assert not hasattr(pool._local, "conn") or pool._local.conn is None
def test_close_connection_without_getting_connection_is_safe(self, tmp_path):
"""close_connection() is safe to call even without getting a connection first."""
pool = ConnectionPool(tmp_path / "test.db")
# Should not raise
pool.close_connection()
def test_close_connection_multiple_calls_is_safe(self, tmp_path):
"""close_connection() can be called multiple times safely."""
pool = ConnectionPool(tmp_path / "test.db")
pool.get_connection()
pool.close_connection()
# Should not raise
pool.close_connection()
class TestContextManager:
"""Test the connection() context manager."""
def test_connection_yields_valid_connection(self, tmp_path):
"""connection() context manager yields a valid sqlite3 connection."""
pool = ConnectionPool(tmp_path / "test.db")
with pool.connection() as conn:
assert isinstance(conn, sqlite3.Connection)
cursor = conn.execute("SELECT 42")
assert cursor.fetchone()[0] == 42
def test_connection_closes_on_exit(self, tmp_path):
"""connection() context manager closes connection on exit."""
pool = ConnectionPool(tmp_path / "test.db")
with pool.connection() as conn:
pass
# Connection should be closed after context exit
with pytest.raises(sqlite3.ProgrammingError):
conn.execute("SELECT 1")
def test_connection_closes_on_exception(self, tmp_path):
"""connection() context manager closes connection even on exception."""
pool = ConnectionPool(tmp_path / "test.db")
conn_ref = None
try:
with pool.connection() as conn:
conn_ref = conn
raise ValueError("Test exception")
except ValueError:
pass
# Connection should still be closed
with pytest.raises(sqlite3.ProgrammingError):
conn_ref.execute("SELECT 1")
def test_connection_context_manager_is_reusable(self, tmp_path):
"""connection() context manager can be used multiple times."""
pool = ConnectionPool(tmp_path / "test.db")
with pool.connection() as conn1:
result1 = conn1.execute("SELECT 1").fetchone()[0]
with pool.connection() as conn2:
result2 = conn2.execute("SELECT 2").fetchone()[0]
assert result1 == 1
assert result2 == 2
class TestThreadSafety:
"""Test thread-safety of the connection pool."""
def test_concurrent_access(self, tmp_path):
"""Multiple threads can use the pool concurrently."""
pool = ConnectionPool(tmp_path / "test.db")
results = []
errors = []
def worker(worker_id):
try:
with pool.connection() as conn:
conn.execute("CREATE TABLE IF NOT EXISTS test (id INTEGER)")
conn.execute("INSERT INTO test VALUES (?)", (worker_id,))
conn.commit()
time.sleep(0.01) # Small delay to increase contention
results.append(worker_id)
except Exception as e:
errors.append(e)
threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
t.start()
for t in threads:
t.join()
assert len(errors) == 0, f"Errors occurred: {errors}"
assert len(results) == 5
def test_thread_isolation(self, tmp_path):
"""Each thread has isolated connections (verified by thread-local data)."""
pool = ConnectionPool(tmp_path / "test.db")
results = []
def worker(worker_id):
# Get connection and write worker-specific data
conn = pool.get_connection()
conn.execute("CREATE TABLE IF NOT EXISTS isolation_test (thread_id INTEGER)")
conn.execute("DELETE FROM isolation_test") # Clear previous data
conn.execute("INSERT INTO isolation_test VALUES (?)", (worker_id,))
conn.commit()
# Read back the data
result = conn.execute("SELECT thread_id FROM isolation_test").fetchone()[0]
results.append((worker_id, result))
pool.close_connection()
threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
t.start()
for t in threads:
t.join()
# Each thread should have written and read its own ID
assert len(results) == 3
for worker_id, read_id in results:
assert worker_id == read_id, f"Thread {worker_id} read {read_id} instead"
class TestCloseAll:
"""Test close_all() method."""
def test_close_all_closes_current_thread_connection(self, tmp_path):
"""close_all() closes the connection for the current thread."""
pool = ConnectionPool(tmp_path / "test.db")
conn = pool.get_connection()
pool.close_all()
# Connection should be closed
with pytest.raises(sqlite3.ProgrammingError):
conn.execute("SELECT 1")
class TestIntegration:
"""Integration tests for real-world usage patterns."""
def test_basic_crud_operations(self, tmp_path):
"""Can perform basic CRUD operations through the pool."""
pool = ConnectionPool(tmp_path / "test.db")
with pool.connection() as conn:
# Create table
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# Insert
conn.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))
conn.execute("INSERT INTO users (name) VALUES (?)", ("Bob",))
conn.commit()
# Query
cursor = conn.execute("SELECT * FROM users ORDER BY id")
rows = cursor.fetchall()
assert len(rows) == 2
assert rows[0]["name"] == "Alice"
assert rows[1]["name"] == "Bob"
def test_multiple_pools_different_databases(self, tmp_path):
"""Multiple pools can manage different databases independently."""
pool1 = ConnectionPool(tmp_path / "db1.db")
pool2 = ConnectionPool(tmp_path / "db2.db")
with pool1.connection() as conn1:
conn1.execute("CREATE TABLE test (val INTEGER)")
conn1.execute("INSERT INTO test VALUES (1)")
conn1.commit()
with pool2.connection() as conn2:
conn2.execute("CREATE TABLE test (val INTEGER)")
conn2.execute("INSERT INTO test VALUES (2)")
conn2.commit()
# Verify isolation
with pool1.connection() as conn1:
result = conn1.execute("SELECT val FROM test").fetchone()[0]
assert result == 1
with pool2.connection() as conn2:
result = conn2.execute("SELECT val FROM test").fetchone()[0]
assert result == 2

View File

@@ -5,11 +5,13 @@ from datetime import UTC, datetime, timedelta
from unittest.mock import patch
from infrastructure.error_capture import (
_build_report_description,
_create_bug_report,
_dedup_cache,
_extract_traceback_info,
_get_git_context,
_is_duplicate,
_log_bug_report_created,
_log_error_event,
_notify_bug_report,
_record_to_session,
@@ -231,6 +233,68 @@ class TestLogErrorEvent:
_log_error_event(e, "test", "abc123", "file.py", 42, {"branch": "main"})
class TestBuildReportDescription:
"""Test _build_report_description helper."""
def test_includes_error_info(self):
try:
raise RuntimeError("desc test")
except RuntimeError as e:
desc = _build_report_description(
e,
"test_src",
None,
"hash1",
"tb...",
"file.py",
10,
{"branch": "main"},
)
assert "RuntimeError" in desc
assert "test_src" in desc
assert "file.py:10" in desc
assert "hash1" in desc
def test_includes_context_when_provided(self):
try:
raise RuntimeError("ctx desc")
except RuntimeError as e:
desc = _build_report_description(
e,
"src",
{"path": "/api"},
"h",
"tb",
"f.py",
1,
{},
)
assert "path=/api" in desc
def test_omits_context_when_none(self):
try:
raise RuntimeError("no ctx")
except RuntimeError as e:
desc = _build_report_description(
e,
"src",
None,
"h",
"tb",
"f.py",
1,
{},
)
assert "**Context:**" not in desc
class TestLogBugReportCreated:
"""Test _log_bug_report_created helper."""
def test_does_not_crash_on_missing_deps(self):
_log_bug_report_created("test", "task-1", "hash1", "title")
class TestCreateBugReport:
"""Test _create_bug_report helper."""

View File

@@ -0,0 +1,509 @@
"""Tests for infrastructure.models.multimodal — multi-modal model management."""
import json
from unittest.mock import MagicMock, patch
from infrastructure.models.multimodal import (
DEFAULT_FALLBACK_CHAINS,
KNOWN_MODEL_CAPABILITIES,
ModelCapability,
ModelInfo,
MultiModalManager,
get_model_for_capability,
model_supports_tools,
model_supports_vision,
pull_model_with_fallback,
)
# ---------------------------------------------------------------------------
# ModelCapability enum
# ---------------------------------------------------------------------------
class TestModelCapability:
def test_members_exist(self):
assert ModelCapability.TEXT
assert ModelCapability.VISION
assert ModelCapability.AUDIO
assert ModelCapability.TOOLS
assert ModelCapability.JSON
assert ModelCapability.STREAMING
def test_all_members_unique(self):
values = [m.value for m in ModelCapability]
assert len(values) == len(set(values))
# ---------------------------------------------------------------------------
# ModelInfo dataclass
# ---------------------------------------------------------------------------
class TestModelInfo:
def test_defaults(self):
info = ModelInfo(name="test-model")
assert info.name == "test-model"
assert info.capabilities == set()
assert info.is_available is False
assert info.is_pulled is False
assert info.size_mb is None
assert info.description == ""
def test_supports_true(self):
info = ModelInfo(name="m", capabilities={ModelCapability.TEXT, ModelCapability.VISION})
assert info.supports(ModelCapability.TEXT) is True
assert info.supports(ModelCapability.VISION) is True
def test_supports_false(self):
info = ModelInfo(name="m", capabilities={ModelCapability.TEXT})
assert info.supports(ModelCapability.VISION) is False
# ---------------------------------------------------------------------------
# Known model capabilities lookup table
# ---------------------------------------------------------------------------
class TestKnownModelCapabilities:
def test_vision_models_have_vision(self):
vision_names = [
"llama3.2-vision",
"llava",
"moondream",
"qwen2.5-vl",
]
for name in vision_names:
assert ModelCapability.VISION in KNOWN_MODEL_CAPABILITIES[name], name
def test_text_models_lack_vision(self):
text_only = ["deepseek-r1", "gemma2", "phi3"]
for name in text_only:
assert ModelCapability.VISION not in KNOWN_MODEL_CAPABILITIES[name], name
def test_all_models_have_text(self):
for name, caps in KNOWN_MODEL_CAPABILITIES.items():
assert ModelCapability.TEXT in caps, f"{name} should have TEXT"
# ---------------------------------------------------------------------------
# Default fallback chains
# ---------------------------------------------------------------------------
class TestDefaultFallbackChains:
def test_vision_chain_non_empty(self):
assert len(DEFAULT_FALLBACK_CHAINS[ModelCapability.VISION]) > 0
def test_tools_chain_non_empty(self):
assert len(DEFAULT_FALLBACK_CHAINS[ModelCapability.TOOLS]) > 0
def test_audio_chain_empty(self):
assert DEFAULT_FALLBACK_CHAINS[ModelCapability.AUDIO] == []
# ---------------------------------------------------------------------------
# Helpers to build a manager without hitting the network
# ---------------------------------------------------------------------------
def _fake_ollama_tags(*model_names: str) -> bytes:
"""Build a JSON response mimicking Ollama /api/tags."""
models = []
for name in model_names:
models.append({"name": name, "size": 4 * 1024 * 1024 * 1024, "details": {"family": "test"}})
return json.dumps({"models": models}).encode()
def _make_manager(model_names: list[str] | None = None) -> MultiModalManager:
"""Create a MultiModalManager with mocked Ollama responses."""
if model_names is None:
# No models available — Ollama unreachable
with patch("urllib.request.urlopen", side_effect=ConnectionError("no ollama")):
return MultiModalManager(ollama_url="http://localhost:11434")
resp = MagicMock()
resp.__enter__ = MagicMock(return_value=resp)
resp.__exit__ = MagicMock(return_value=False)
resp.read.return_value = _fake_ollama_tags(*model_names)
resp.status = 200
with patch("urllib.request.urlopen", return_value=resp):
return MultiModalManager(ollama_url="http://localhost:11434")
# ---------------------------------------------------------------------------
# MultiModalManager — init & refresh
# ---------------------------------------------------------------------------
class TestMultiModalManagerInit:
def test_init_no_ollama(self):
mgr = _make_manager(None)
assert mgr.list_available_models() == []
def test_init_with_models(self):
mgr = _make_manager(["llama3.1:8b", "llava:7b"])
names = {m.name for m in mgr.list_available_models()}
assert names == {"llama3.1:8b", "llava:7b"}
def test_refresh_updates_models(self):
mgr = _make_manager([])
assert mgr.list_available_models() == []
resp = MagicMock()
resp.__enter__ = MagicMock(return_value=resp)
resp.__exit__ = MagicMock(return_value=False)
resp.read.return_value = _fake_ollama_tags("gemma2:9b")
resp.status = 200
with patch("urllib.request.urlopen", return_value=resp):
mgr.refresh()
names = {m.name for m in mgr.list_available_models()}
assert "gemma2:9b" in names
# ---------------------------------------------------------------------------
# _detect_capabilities
# ---------------------------------------------------------------------------
class TestDetectCapabilities:
def test_exact_match(self):
mgr = _make_manager(None)
caps = mgr._detect_capabilities("llava:7b")
assert ModelCapability.VISION in caps
def test_base_name_match(self):
mgr = _make_manager(None)
caps = mgr._detect_capabilities("llava:99b")
# "llava:99b" not in table, but "llava" is
assert ModelCapability.VISION in caps
def test_unknown_model_defaults_to_text(self):
mgr = _make_manager(None)
caps = mgr._detect_capabilities("totally-unknown-model:1b")
assert caps == {ModelCapability.TEXT, ModelCapability.STREAMING}
# ---------------------------------------------------------------------------
# get_model_capabilities / model_supports
# ---------------------------------------------------------------------------
class TestGetModelCapabilities:
def test_available_model(self):
mgr = _make_manager(["llava:7b"])
caps = mgr.get_model_capabilities("llava:7b")
assert ModelCapability.VISION in caps
def test_unavailable_model_uses_detection(self):
mgr = _make_manager([])
caps = mgr.get_model_capabilities("llava:7b")
assert ModelCapability.VISION in caps
class TestModelSupports:
def test_supports_true(self):
mgr = _make_manager(["llava:7b"])
assert mgr.model_supports("llava:7b", ModelCapability.VISION) is True
def test_supports_false(self):
mgr = _make_manager(["deepseek-r1:7b"])
assert mgr.model_supports("deepseek-r1:7b", ModelCapability.VISION) is False
# ---------------------------------------------------------------------------
# get_models_with_capability
# ---------------------------------------------------------------------------
class TestGetModelsWithCapability:
def test_returns_vision_models(self):
mgr = _make_manager(["llava:7b", "deepseek-r1:7b"])
vision = mgr.get_models_with_capability(ModelCapability.VISION)
names = {m.name for m in vision}
assert "llava:7b" in names
assert "deepseek-r1:7b" not in names
def test_empty_when_none_available(self):
mgr = _make_manager(["deepseek-r1:7b"])
vision = mgr.get_models_with_capability(ModelCapability.VISION)
assert vision == []
# ---------------------------------------------------------------------------
# get_best_model_for
# ---------------------------------------------------------------------------
class TestGetBestModelFor:
def test_preferred_model_with_capability(self):
mgr = _make_manager(["llava:7b", "llama3.1:8b"])
result = mgr.get_best_model_for(ModelCapability.VISION, preferred_model="llava:7b")
assert result == "llava:7b"
def test_preferred_model_without_capability_uses_fallback(self):
mgr = _make_manager(["deepseek-r1:7b", "llava:7b"])
# preferred doesn't have VISION, fallback chain has llava:7b
result = mgr.get_best_model_for(ModelCapability.VISION, preferred_model="deepseek-r1:7b")
assert result == "llava:7b"
def test_fallback_chain_order(self):
# First in chain: llama3.2:3b
mgr = _make_manager(["llama3.2:3b", "llava:7b"])
result = mgr.get_best_model_for(ModelCapability.VISION)
assert result == "llama3.2:3b"
def test_any_capable_model_when_no_fallback(self):
mgr = _make_manager(["moondream:1.8b"])
mgr._fallback_chains[ModelCapability.VISION] = [] # clear chain
result = mgr.get_best_model_for(ModelCapability.VISION)
assert result == "moondream:1.8b"
def test_none_when_no_capable_model(self):
mgr = _make_manager(["deepseek-r1:7b"])
result = mgr.get_best_model_for(ModelCapability.VISION)
assert result is None
def test_preferred_model_not_available_skipped(self):
mgr = _make_manager(["llava:7b"])
# preferred_model "llava:13b" is not in available_models
result = mgr.get_best_model_for(ModelCapability.VISION, preferred_model="llava:13b")
assert result == "llava:7b"
# ---------------------------------------------------------------------------
# pull_model_with_fallback (manager method)
# ---------------------------------------------------------------------------
class TestPullModelWithFallback:
def test_already_available(self):
mgr = _make_manager(["llama3.1:8b"])
model, is_fallback = mgr.pull_model_with_fallback("llama3.1:8b")
assert model == "llama3.1:8b"
assert is_fallback is False
def test_pull_succeeds(self):
mgr = _make_manager([])
pull_resp = MagicMock()
pull_resp.__enter__ = MagicMock(return_value=pull_resp)
pull_resp.__exit__ = MagicMock(return_value=False)
pull_resp.status = 200
# After pull, refresh returns the model
refresh_resp = MagicMock()
refresh_resp.__enter__ = MagicMock(return_value=refresh_resp)
refresh_resp.__exit__ = MagicMock(return_value=False)
refresh_resp.read.return_value = _fake_ollama_tags("llama3.1:8b")
refresh_resp.status = 200
with patch("urllib.request.urlopen", side_effect=[pull_resp, refresh_resp]):
model, is_fallback = mgr.pull_model_with_fallback("llama3.1:8b")
assert model == "llama3.1:8b"
assert is_fallback is False
def test_pull_fails_uses_capability_fallback(self):
mgr = _make_manager(["llava:7b"])
with patch("urllib.request.urlopen", side_effect=ConnectionError("fail")):
model, is_fallback = mgr.pull_model_with_fallback(
"nonexistent-vision:1b",
capability=ModelCapability.VISION,
)
assert model == "llava:7b"
assert is_fallback is True
def test_pull_fails_uses_default_model(self):
mgr = _make_manager([settings_ollama_model := "llama3.1:8b"])
with (
patch("urllib.request.urlopen", side_effect=ConnectionError("fail")),
patch("infrastructure.models.multimodal.settings") as mock_settings,
):
mock_settings.ollama_model = settings_ollama_model
mock_settings.ollama_url = "http://localhost:11434"
model, is_fallback = mgr.pull_model_with_fallback("missing-model:99b")
assert model == "llama3.1:8b"
assert is_fallback is True
def test_auto_pull_false_skips_pull(self):
mgr = _make_manager([])
with patch("infrastructure.models.multimodal.settings") as mock_settings:
mock_settings.ollama_model = "default"
model, is_fallback = mgr.pull_model_with_fallback("missing:1b", auto_pull=False)
# Falls through to absolute last resort
assert model == "missing:1b"
assert is_fallback is False
def test_absolute_last_resort(self):
mgr = _make_manager([])
with (
patch("urllib.request.urlopen", side_effect=ConnectionError("fail")),
patch("infrastructure.models.multimodal.settings") as mock_settings,
):
mock_settings.ollama_model = "not-available"
model, is_fallback = mgr.pull_model_with_fallback("primary:1b")
assert model == "primary:1b"
assert is_fallback is False
# ---------------------------------------------------------------------------
# _pull_model
# ---------------------------------------------------------------------------
class TestPullModel:
def test_pull_success(self):
mgr = _make_manager([])
pull_resp = MagicMock()
pull_resp.__enter__ = MagicMock(return_value=pull_resp)
pull_resp.__exit__ = MagicMock(return_value=False)
pull_resp.status = 200
refresh_resp = MagicMock()
refresh_resp.__enter__ = MagicMock(return_value=refresh_resp)
refresh_resp.__exit__ = MagicMock(return_value=False)
refresh_resp.read.return_value = _fake_ollama_tags("new-model:1b")
refresh_resp.status = 200
with patch("urllib.request.urlopen", side_effect=[pull_resp, refresh_resp]):
assert mgr._pull_model("new-model:1b") is True
def test_pull_network_error(self):
mgr = _make_manager([])
with patch("urllib.request.urlopen", side_effect=ConnectionError("offline")):
assert mgr._pull_model("any-model:1b") is False
# ---------------------------------------------------------------------------
# configure_fallback_chain / get_fallback_chain
# ---------------------------------------------------------------------------
class TestFallbackChainConfig:
def test_configure_and_get(self):
mgr = _make_manager(None)
mgr.configure_fallback_chain(ModelCapability.VISION, ["model-a", "model-b"])
assert mgr.get_fallback_chain(ModelCapability.VISION) == ["model-a", "model-b"]
def test_get_returns_copy(self):
mgr = _make_manager(None)
chain = mgr.get_fallback_chain(ModelCapability.VISION)
chain.append("mutated")
assert "mutated" not in mgr.get_fallback_chain(ModelCapability.VISION)
def test_get_empty_for_unknown(self):
mgr = _make_manager(None)
# AUDIO has an empty chain by default
assert mgr.get_fallback_chain(ModelCapability.AUDIO) == []
# ---------------------------------------------------------------------------
# get_model_for_content
# ---------------------------------------------------------------------------
class TestGetModelForContent:
def test_image_content(self):
mgr = _make_manager(["llava:7b"])
model, is_fb = mgr.get_model_for_content("image")
assert model == "llava:7b"
def test_vision_content(self):
mgr = _make_manager(["llava:7b"])
model, _ = mgr.get_model_for_content("vision")
assert model == "llava:7b"
def test_multimodal_content(self):
mgr = _make_manager(["llava:7b"])
model, _ = mgr.get_model_for_content("multimodal")
assert model == "llava:7b"
def test_audio_content(self):
mgr = _make_manager(["llama3.1:8b"])
with patch("infrastructure.models.multimodal.settings") as mock_settings:
mock_settings.ollama_model = "llama3.1:8b"
mock_settings.ollama_url = "http://localhost:11434"
model, _ = mgr.get_model_for_content("audio")
assert model == "llama3.1:8b"
def test_text_content(self):
mgr = _make_manager(["llama3.1:8b"])
with patch("infrastructure.models.multimodal.settings") as mock_settings:
mock_settings.ollama_model = "llama3.1:8b"
mock_settings.ollama_url = "http://localhost:11434"
model, _ = mgr.get_model_for_content("text")
assert model == "llama3.1:8b"
def test_preferred_model_respected(self):
mgr = _make_manager(["llama3.2:3b", "llava:7b"])
model, _ = mgr.get_model_for_content("image", preferred_model="llama3.2:3b")
assert model == "llama3.2:3b"
def test_case_insensitive(self):
mgr = _make_manager(["llava:7b"])
model, _ = mgr.get_model_for_content("IMAGE")
assert model == "llava:7b"
# ---------------------------------------------------------------------------
# Module-level convenience functions
# ---------------------------------------------------------------------------
class TestConvenienceFunctions:
def _patch_manager(self, mgr):
return patch(
"infrastructure.models.multimodal._multimodal_manager",
mgr,
)
def test_get_model_for_capability(self):
mgr = _make_manager(["llava:7b"])
with self._patch_manager(mgr):
result = get_model_for_capability(ModelCapability.VISION)
assert result == "llava:7b"
def test_pull_model_with_fallback_convenience(self):
mgr = _make_manager(["llama3.1:8b"])
with self._patch_manager(mgr):
model, is_fb = pull_model_with_fallback("llama3.1:8b")
assert model == "llama3.1:8b"
assert is_fb is False
def test_model_supports_vision_true(self):
mgr = _make_manager(["llava:7b"])
with self._patch_manager(mgr):
assert model_supports_vision("llava:7b") is True
def test_model_supports_vision_false(self):
mgr = _make_manager(["llama3.1:8b"])
with self._patch_manager(mgr):
assert model_supports_vision("llama3.1:8b") is False
def test_model_supports_tools_true(self):
mgr = _make_manager(["llama3.1:8b"])
with self._patch_manager(mgr):
assert model_supports_tools("llama3.1:8b") is True
def test_model_supports_tools_false(self):
mgr = _make_manager(["deepseek-r1:7b"])
with self._patch_manager(mgr):
assert model_supports_tools("deepseek-r1:7b") is False
# ---------------------------------------------------------------------------
# ModelInfo in available_models — size_mb and description populated
# ---------------------------------------------------------------------------
class TestModelInfoPopulation:
def test_size_and_description(self):
mgr = _make_manager(["llama3.1:8b"])
info = mgr._available_models["llama3.1:8b"]
assert info.is_available is True
assert info.is_pulled is True
assert info.size_mb == 4 * 1024 # 4 GiB in MiB
assert info.description == "test"

View File

@@ -0,0 +1,149 @@
"""Tests for provider health history store and API endpoint."""
import time
from datetime import UTC, datetime, timedelta
from unittest.mock import MagicMock
import pytest
from src.infrastructure.router.history import HealthHistoryStore
@pytest.fixture
def store():
"""In-memory history store for testing."""
s = HealthHistoryStore(db_path=":memory:")
yield s
s.close()
@pytest.fixture
def sample_providers():
return [
{
"name": "anthropic",
"status": "healthy",
"error_rate": 0.01,
"avg_latency_ms": 250.5,
"circuit_state": "closed",
"total_requests": 100,
},
{
"name": "local",
"status": "degraded",
"error_rate": 0.15,
"avg_latency_ms": 80.0,
"circuit_state": "closed",
"total_requests": 50,
},
]
def test_record_and_retrieve(store, sample_providers):
store.record_snapshot(sample_providers)
history = store.get_history(hours=1)
assert len(history) == 1
assert len(history[0]["providers"]) == 2
assert history[0]["providers"][0]["name"] == "anthropic"
assert history[0]["providers"][1]["name"] == "local"
assert "timestamp" in history[0]
def test_multiple_snapshots(store, sample_providers):
store.record_snapshot(sample_providers)
time.sleep(0.01)
store.record_snapshot(sample_providers)
history = store.get_history(hours=1)
assert len(history) == 2
def test_hours_filtering(store, sample_providers):
old_ts = (datetime.now(UTC) - timedelta(hours=48)).isoformat()
store._conn.execute(
"""INSERT INTO snapshots
(timestamp, provider_name, status, error_rate,
avg_latency_ms, circuit_state, total_requests)
VALUES (?, ?, ?, ?, ?, ?, ?)""",
(old_ts, "anthropic", "healthy", 0.0, 100.0, "closed", 10),
)
store._conn.commit()
store.record_snapshot(sample_providers)
history = store.get_history(hours=24)
assert len(history) == 1
history = store.get_history(hours=72)
assert len(history) == 2
def test_prune(store, sample_providers):
old_ts = (datetime.now(UTC) - timedelta(hours=200)).isoformat()
store._conn.execute(
"""INSERT INTO snapshots
(timestamp, provider_name, status, error_rate,
avg_latency_ms, circuit_state, total_requests)
VALUES (?, ?, ?, ?, ?, ?, ?)""",
(old_ts, "anthropic", "healthy", 0.0, 100.0, "closed", 10),
)
store._conn.commit()
store.record_snapshot(sample_providers)
deleted = store.prune(keep_hours=168)
assert deleted == 1
history = store.get_history(hours=999)
assert len(history) == 1
def test_empty_history(store):
assert store.get_history(hours=24) == []
def test_capture_snapshot_from_router(store):
mock_metrics = MagicMock()
mock_metrics.error_rate = 0.05
mock_metrics.avg_latency_ms = 200.0
mock_metrics.total_requests = 42
mock_provider = MagicMock()
mock_provider.name = "test-provider"
mock_provider.status.value = "healthy"
mock_provider.metrics = mock_metrics
mock_provider.circuit_state.value = "closed"
mock_router = MagicMock()
mock_router.providers = [mock_provider]
store._capture_snapshot(mock_router)
history = store.get_history(hours=1)
assert len(history) == 1
p = history[0]["providers"][0]
assert p["name"] == "test-provider"
assert p["status"] == "healthy"
assert p["error_rate"] == 0.05
assert p["total_requests"] == 42
def test_history_api_endpoint(store, sample_providers):
"""GET /api/v1/router/history returns snapshot data."""
store.record_snapshot(sample_providers)
from fastapi import FastAPI
from fastapi.testclient import TestClient
from src.infrastructure.router.api import get_cascade_router
from src.infrastructure.router.api import router as api_router
from src.infrastructure.router.history import get_history_store
app = FastAPI()
app.include_router(api_router)
app.dependency_overrides[get_history_store] = lambda: store
app.dependency_overrides[get_cascade_router] = lambda: MagicMock()
client = TestClient(app)
resp = client.get("/api/v1/router/history?hours=1")
assert resp.status_code == 200
data = resp.json()
assert len(data) == 1
assert len(data[0]["providers"]) == 2
assert data[0]["providers"][0]["name"] == "anthropic"
app.dependency_overrides.clear()

View File

@@ -174,6 +174,103 @@ class TestDiscordVendor:
assert result is False
class TestExtractContent:
def test_strips_bot_mention(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
vendor = DiscordVendor()
vendor._client = MagicMock()
vendor._client.user.id = 12345
msg = MagicMock()
msg.content = "<@12345> hello there"
assert vendor._extract_content(msg) == "hello there"
def test_no_client_user(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
vendor = DiscordVendor()
vendor._client = MagicMock()
vendor._client.user = None
msg = MagicMock()
msg.content = "hello"
assert vendor._extract_content(msg) == "hello"
def test_empty_after_strip(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
vendor = DiscordVendor()
vendor._client = MagicMock()
vendor._client.user.id = 99
msg = MagicMock()
msg.content = "<@99>"
assert vendor._extract_content(msg) == ""
class TestInvokeAgent:
@staticmethod
def _make_typing_target():
"""Build a mock target whose .typing() is an async context manager."""
from contextlib import asynccontextmanager
target = AsyncMock()
@asynccontextmanager
async def _typing():
yield
target.typing = _typing
return target
@pytest.mark.asyncio
async def test_timeout_returns_error(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
vendor = DiscordVendor()
target = self._make_typing_target()
with patch(
"integrations.chat_bridge.vendors.discord.chat_with_tools", side_effect=TimeoutError
):
run_output, response = await vendor._invoke_agent("hi", "sess", target)
assert run_output is None
assert "too long" in response
@pytest.mark.asyncio
async def test_exception_returns_error(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
vendor = DiscordVendor()
target = self._make_typing_target()
with patch(
"integrations.chat_bridge.vendors.discord.chat_with_tools",
side_effect=RuntimeError("boom"),
):
run_output, response = await vendor._invoke_agent("hi", "sess", target)
assert run_output is None
assert "trouble" in response
class TestSendResponse:
@pytest.mark.asyncio
async def test_skips_empty(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
target = AsyncMock()
await DiscordVendor._send_response(None, target)
target.send.assert_not_called()
await DiscordVendor._send_response("", target)
target.send.assert_not_called()
@pytest.mark.asyncio
async def test_sends_short_message(self):
from integrations.chat_bridge.vendors.discord import DiscordVendor
target = AsyncMock()
await DiscordVendor._send_response("hello", target)
target.send.assert_called_once_with("hello")
class TestChunkMessage:
def test_short_message(self):
from integrations.chat_bridge.vendors.discord import _chunk_message

View File

@@ -0,0 +1,163 @@
"""Tests for cycle_result.json validation in loop_guard.
Covers validate_cycle_result(), _load_cycle_result(), and _is_issue_open().
"""
from __future__ import annotations
import json
import time
from pathlib import Path
from unittest.mock import patch
import pytest
import scripts.loop_guard as lg
@pytest.fixture(autouse=True)
def _isolate(tmp_path, monkeypatch):
"""Redirect loop_guard paths to tmp_path for isolation."""
monkeypatch.setattr(lg, "CYCLE_RESULT_FILE", tmp_path / "cycle_result.json")
monkeypatch.setattr(lg, "CYCLE_DURATION", 300)
monkeypatch.setattr(lg, "GITEA_API", "http://test:3000/api/v1")
monkeypatch.setattr(lg, "REPO_SLUG", "owner/repo")
def _write_cr(tmp_path, data: dict, age_seconds: float = 0) -> Path:
"""Write a cycle_result.json and optionally backdate it."""
p = tmp_path / "cycle_result.json"
p.write_text(json.dumps(data))
if age_seconds:
mtime = time.time() - age_seconds
import os
os.utime(p, (mtime, mtime))
return p
# --- _load_cycle_result ---
def test_load_cycle_result_missing(tmp_path):
assert lg._load_cycle_result() == {}
def test_load_cycle_result_valid(tmp_path):
_write_cr(tmp_path, {"issue": 42, "type": "fix"})
assert lg._load_cycle_result() == {"issue": 42, "type": "fix"}
def test_load_cycle_result_markdown_fenced(tmp_path):
p = tmp_path / "cycle_result.json"
p.write_text('```json\n{"issue": 99}\n```')
assert lg._load_cycle_result() == {"issue": 99}
def test_load_cycle_result_malformed(tmp_path):
p = tmp_path / "cycle_result.json"
p.write_text("not json at all")
assert lg._load_cycle_result() == {}
# --- _is_issue_open ---
def test_is_issue_open_true(monkeypatch):
monkeypatch.setattr(lg, "_get_token", lambda: "tok")
resp_data = json.dumps({"state": "open"}).encode()
class FakeResp:
def read(self):
return resp_data
def __enter__(self):
return self
def __exit__(self, *a):
pass
with patch("urllib.request.urlopen", return_value=FakeResp()):
assert lg._is_issue_open(42) is True
def test_is_issue_open_closed(monkeypatch):
monkeypatch.setattr(lg, "_get_token", lambda: "tok")
resp_data = json.dumps({"state": "closed"}).encode()
class FakeResp:
def read(self):
return resp_data
def __enter__(self):
return self
def __exit__(self, *a):
pass
with patch("urllib.request.urlopen", return_value=FakeResp()):
assert lg._is_issue_open(42) is False
def test_is_issue_open_no_token(monkeypatch):
monkeypatch.setattr(lg, "_get_token", lambda: "")
assert lg._is_issue_open(42) is None
def test_is_issue_open_api_error(monkeypatch):
monkeypatch.setattr(lg, "_get_token", lambda: "tok")
with patch("urllib.request.urlopen", side_effect=OSError("timeout")):
assert lg._is_issue_open(42) is None
# --- validate_cycle_result ---
def test_validate_no_file(tmp_path):
"""No file → returns False, no crash."""
assert lg.validate_cycle_result() is False
def test_validate_fresh_file_open_issue(tmp_path, monkeypatch):
"""Fresh file with open issue → kept."""
_write_cr(tmp_path, {"issue": 10})
monkeypatch.setattr(lg, "_is_issue_open", lambda n: True)
assert lg.validate_cycle_result() is False
assert (tmp_path / "cycle_result.json").exists()
def test_validate_stale_file_removed(tmp_path):
"""File older than 2× CYCLE_DURATION → removed."""
_write_cr(tmp_path, {"issue": 10}, age_seconds=700)
assert lg.validate_cycle_result() is True
assert not (tmp_path / "cycle_result.json").exists()
def test_validate_fresh_file_closed_issue(tmp_path, monkeypatch):
"""Fresh file referencing closed issue → removed."""
_write_cr(tmp_path, {"issue": 10})
monkeypatch.setattr(lg, "_is_issue_open", lambda n: False)
assert lg.validate_cycle_result() is True
assert not (tmp_path / "cycle_result.json").exists()
def test_validate_api_failure_keeps_file(tmp_path, monkeypatch):
"""API failure → file kept (graceful degradation)."""
_write_cr(tmp_path, {"issue": 10})
monkeypatch.setattr(lg, "_is_issue_open", lambda n: None)
assert lg.validate_cycle_result() is False
assert (tmp_path / "cycle_result.json").exists()
def test_validate_no_issue_field(tmp_path):
"""File without issue field → kept (only age check applies)."""
_write_cr(tmp_path, {"type": "fix"})
assert lg.validate_cycle_result() is False
assert (tmp_path / "cycle_result.json").exists()
def test_validate_stale_threshold_boundary(tmp_path, monkeypatch):
"""File just under threshold → kept (not stale yet)."""
_write_cr(tmp_path, {"issue": 10}, age_seconds=599)
monkeypatch.setattr(lg, "_is_issue_open", lambda n: True)
assert lg.validate_cycle_result() is False
assert (tmp_path / "cycle_result.json").exists()

327
tests/spark/test_advisor.py Normal file
View File

@@ -0,0 +1,327 @@
"""Comprehensive tests for spark.advisor module.
Covers all advisory-generation helpers:
- _check_failure_patterns (grouped agent failures)
- _check_agent_performance (top / struggling agents)
- _check_bid_patterns (spread + high average)
- _check_prediction_accuracy (low / high accuracy)
- _check_system_activity (idle / tasks-posted-but-no-completions)
- generate_advisories (integration, sorting, min-events guard)
"""
import json
from spark.advisor import (
_MIN_EVENTS,
Advisory,
_check_agent_performance,
_check_bid_patterns,
_check_failure_patterns,
_check_prediction_accuracy,
_check_system_activity,
generate_advisories,
)
from spark.memory import record_event
# ── Advisory dataclass ─────────────────────────────────────────────────────
class TestAdvisoryDataclass:
def test_defaults(self):
a = Advisory(
category="test",
priority=0.5,
title="T",
detail="D",
suggested_action="A",
)
assert a.subject is None
assert a.evidence_count == 0
def test_all_fields(self):
a = Advisory(
category="c",
priority=0.9,
title="T",
detail="D",
suggested_action="A",
subject="agent-1",
evidence_count=7,
)
assert a.subject == "agent-1"
assert a.evidence_count == 7
# ── _check_failure_patterns ────────────────────────────────────────────────
class TestCheckFailurePatterns:
def test_no_failures_returns_empty(self):
assert _check_failure_patterns() == []
def test_single_failure_not_enough(self):
record_event("task_failed", "once", agent_id="a1", task_id="t1")
assert _check_failure_patterns() == []
def test_two_failures_triggers_advisory(self):
for i in range(2):
record_event("task_failed", f"fail {i}", agent_id="agent-abc", task_id=f"t{i}")
results = _check_failure_patterns()
assert len(results) == 1
assert results[0].category == "failure_prevention"
assert results[0].subject == "agent-abc"
assert results[0].evidence_count == 2
def test_priority_scales_with_count(self):
for i in range(5):
record_event("task_failed", f"fail {i}", agent_id="agent-x", task_id=f"f{i}")
results = _check_failure_patterns()
assert len(results) == 1
assert results[0].priority > 0.5
def test_priority_capped_at_one(self):
for i in range(20):
record_event("task_failed", f"fail {i}", agent_id="agent-y", task_id=f"ff{i}")
results = _check_failure_patterns()
assert results[0].priority <= 1.0
def test_multiple_agents_separate_advisories(self):
for i in range(3):
record_event("task_failed", f"a fail {i}", agent_id="agent-a", task_id=f"a{i}")
record_event("task_failed", f"b fail {i}", agent_id="agent-b", task_id=f"b{i}")
results = _check_failure_patterns()
assert len(results) == 2
subjects = {r.subject for r in results}
assert subjects == {"agent-a", "agent-b"}
def test_events_without_agent_id_skipped(self):
for i in range(3):
record_event("task_failed", f"no-agent {i}", task_id=f"na{i}")
assert _check_failure_patterns() == []
# ── _check_agent_performance ───────────────────────────────────────────────
class TestCheckAgentPerformance:
def test_no_events_returns_empty(self):
assert _check_agent_performance() == []
def test_too_few_tasks_skipped(self):
record_event("task_completed", "done", agent_id="agent-1", task_id="t1")
assert _check_agent_performance() == []
def test_high_performer_detected(self):
for i in range(4):
record_event("task_completed", f"done {i}", agent_id="agent-star", task_id=f"s{i}")
results = _check_agent_performance()
perf = [r for r in results if r.category == "agent_performance"]
assert len(perf) == 1
assert "excels" in perf[0].title
assert perf[0].subject == "agent-star"
def test_struggling_agent_detected(self):
# 1 success, 4 failures = 20% rate
record_event("task_completed", "ok", agent_id="agent-bad", task_id="ok1")
for i in range(4):
record_event("task_failed", f"nope {i}", agent_id="agent-bad", task_id=f"bad{i}")
results = _check_agent_performance()
struggling = [r for r in results if "struggling" in r.title]
assert len(struggling) == 1
assert struggling[0].priority > 0.5
def test_middling_agent_no_advisory(self):
# 50% success rate — neither excelling nor struggling
for i in range(3):
record_event("task_completed", f"ok {i}", agent_id="agent-mid", task_id=f"m{i}")
for i in range(3):
record_event("task_failed", f"nope {i}", agent_id="agent-mid", task_id=f"mf{i}")
results = _check_agent_performance()
mid_advisories = [r for r in results if r.subject == "agent-mid"]
assert mid_advisories == []
def test_events_without_agent_id_skipped(self):
for i in range(5):
record_event("task_completed", f"done {i}", task_id=f"no-agent-{i}")
assert _check_agent_performance() == []
# ── _check_bid_patterns ────────────────────────────────────────────────────
class TestCheckBidPatterns:
def _record_bids(self, amounts):
for i, sats in enumerate(amounts):
record_event(
"bid_submitted",
f"bid {i}",
agent_id=f"a{i}",
task_id=f"bt{i}",
data=json.dumps({"bid_sats": sats}),
)
def test_too_few_bids_returns_empty(self):
self._record_bids([10, 20, 30])
assert _check_bid_patterns() == []
def test_wide_spread_detected(self):
# avg=50, spread=90 > 50*1.5=75
self._record_bids([5, 10, 50, 90, 95])
results = _check_bid_patterns()
spread_advisories = [r for r in results if "spread" in r.title.lower()]
assert len(spread_advisories) == 1
def test_high_average_detected(self):
self._record_bids([80, 85, 90, 95, 100])
results = _check_bid_patterns()
high_avg = [r for r in results if "High average" in r.title]
assert len(high_avg) == 1
def test_normal_bids_no_advisory(self):
# Tight spread, low average
self._record_bids([30, 32, 28, 31, 29])
results = _check_bid_patterns()
assert results == []
def test_invalid_json_data_skipped(self):
for i in range(6):
record_event(
"bid_submitted",
f"bid {i}",
agent_id=f"a{i}",
task_id=f"inv{i}",
data="not-json",
)
results = _check_bid_patterns()
assert results == []
def test_zero_bid_sats_skipped(self):
for i in range(6):
record_event(
"bid_submitted",
f"bid {i}",
data=json.dumps({"bid_sats": 0}),
)
assert _check_bid_patterns() == []
def test_both_spread_and_high_avg(self):
# Wide spread AND high average: avg=82, spread=150 > 82*1.5=123
self._record_bids([5, 80, 90, 100, 155])
results = _check_bid_patterns()
assert len(results) == 2
# ── _check_prediction_accuracy ─────────────────────────────────────────────
class TestCheckPredictionAccuracy:
def test_too_few_evaluations(self):
assert _check_prediction_accuracy() == []
def test_low_accuracy_advisory(self):
from spark.eidos import evaluate_prediction, predict_task_outcome
for i in range(4):
predict_task_outcome(f"pa-{i}", "task", ["agent-a"])
evaluate_prediction(f"pa-{i}", "agent-wrong", task_succeeded=False, winning_bid=999)
results = _check_prediction_accuracy()
low = [r for r in results if "Low prediction" in r.title]
assert len(low) == 1
assert low[0].priority > 0.5
def test_high_accuracy_advisory(self):
from spark.eidos import evaluate_prediction, predict_task_outcome
for i in range(4):
predict_task_outcome(f"ph-{i}", "task", ["agent-a"])
evaluate_prediction(f"ph-{i}", "agent-a", task_succeeded=True, winning_bid=30)
results = _check_prediction_accuracy()
high = [r for r in results if "Strong prediction" in r.title]
assert len(high) == 1
def test_middling_accuracy_no_advisory(self):
from spark.eidos import evaluate_prediction, predict_task_outcome
# Mix of correct and incorrect to get ~0.5 accuracy
for i in range(3):
predict_task_outcome(f"pm-{i}", "task", ["agent-a"])
evaluate_prediction(f"pm-{i}", "agent-a", task_succeeded=True, winning_bid=30)
for i in range(3):
predict_task_outcome(f"pmx-{i}", "task", ["agent-a"])
evaluate_prediction(f"pmx-{i}", "agent-wrong", task_succeeded=False, winning_bid=999)
results = _check_prediction_accuracy()
# avg should be middling — neither low nor high advisory
low = [r for r in results if "Low" in r.title]
high = [r for r in results if "Strong" in r.title]
# At least one side should be empty (depends on exact accuracy)
assert not (low and high)
# ── _check_system_activity ─────────────────────────────────────────────────
class TestCheckSystemActivity:
def test_no_events_idle_advisory(self):
results = _check_system_activity()
assert len(results) == 1
assert "No swarm activity" in results[0].title
def test_has_events_no_idle_advisory(self):
record_event("task_completed", "done", task_id="t1")
results = _check_system_activity()
idle = [r for r in results if "No swarm activity" in r.title]
assert idle == []
def test_tasks_posted_but_none_completing(self):
for i in range(5):
record_event("task_posted", f"posted {i}", task_id=f"tp{i}")
results = _check_system_activity()
stalled = [r for r in results if "none completing" in r.title.lower()]
assert len(stalled) == 1
assert stalled[0].evidence_count >= 4
def test_posts_with_completions_no_stalled_advisory(self):
for i in range(5):
record_event("task_posted", f"posted {i}", task_id=f"tpx{i}")
record_event("task_completed", "done", task_id="tpx0")
results = _check_system_activity()
stalled = [r for r in results if "none completing" in r.title.lower()]
assert stalled == []
# ── generate_advisories (integration) ──────────────────────────────────────
class TestGenerateAdvisories:
def test_below_min_events_returns_insufficient(self):
advisories = generate_advisories()
assert len(advisories) >= 1
assert advisories[0].title == "Insufficient data"
assert advisories[0].evidence_count == 0
def test_exactly_at_min_events_proceeds(self):
for i in range(_MIN_EVENTS):
record_event("task_posted", f"ev {i}", task_id=f"min{i}")
advisories = generate_advisories()
insufficient = [a for a in advisories if a.title == "Insufficient data"]
assert insufficient == []
def test_results_sorted_by_priority_descending(self):
for i in range(5):
record_event("task_posted", f"posted {i}", task_id=f"sp{i}")
for i in range(3):
record_event("task_failed", f"fail {i}", agent_id="agent-fail", task_id=f"sf{i}")
advisories = generate_advisories()
if len(advisories) >= 2:
for i in range(len(advisories) - 1):
assert advisories[i].priority >= advisories[i + 1].priority
def test_multiple_categories_produced(self):
# Create failures + posted-no-completions
for i in range(5):
record_event("task_failed", f"fail {i}", agent_id="agent-bad", task_id=f"mf{i}")
for i in range(5):
record_event("task_posted", f"posted {i}", task_id=f"mp{i}")
advisories = generate_advisories()
categories = {a.category for a in advisories}
assert len(categories) >= 2

299
tests/spark/test_eidos.py Normal file
View File

@@ -0,0 +1,299 @@
"""Comprehensive tests for spark.eidos module.
Covers:
- _get_conn (schema creation, WAL, busy timeout)
- predict_task_outcome (baseline, with history, edge cases)
- evaluate_prediction (correct, wrong, missing, double-eval)
- _compute_accuracy (all components, edge cases)
- get_predictions (filters: task_id, evaluated_only, limit)
- get_accuracy_stats (empty, after evaluations)
"""
import pytest
from spark.eidos import (
Prediction,
_compute_accuracy,
evaluate_prediction,
get_accuracy_stats,
get_predictions,
predict_task_outcome,
)
# ── Prediction dataclass ──────────────────────────────────────────────────
class TestPredictionDataclass:
def test_defaults(self):
p = Prediction(
id="1",
task_id="t1",
prediction_type="outcome",
predicted_value="{}",
actual_value=None,
accuracy=None,
created_at="2026-01-01",
evaluated_at=None,
)
assert p.actual_value is None
assert p.accuracy is None
# ── predict_task_outcome ──────────────────────────────────────────────────
class TestPredictTaskOutcome:
def test_baseline_no_history(self):
result = predict_task_outcome("t-base", "Do stuff", ["a1", "a2"])
assert result["likely_winner"] == "a1"
assert result["success_probability"] == 0.7
assert result["estimated_bid_range"] == [20, 80]
assert "baseline" in result["reasoning"]
assert "prediction_id" in result
def test_empty_candidates(self):
result = predict_task_outcome("t-empty", "Nothing", [])
assert result["likely_winner"] is None
def test_history_selects_best_agent(self):
history = {
"a1": {"success_rate": 0.3, "avg_winning_bid": 40},
"a2": {"success_rate": 0.95, "avg_winning_bid": 50},
}
result = predict_task_outcome("t-hist", "Task", ["a1", "a2"], agent_history=history)
assert result["likely_winner"] == "a2"
assert result["success_probability"] > 0.7
def test_history_agent_not_in_candidates_ignored(self):
history = {
"a-outside": {"success_rate": 0.99, "avg_winning_bid": 10},
}
result = predict_task_outcome("t-out", "Task", ["a1"], agent_history=history)
# a-outside not in candidates, so falls back to baseline
assert result["likely_winner"] == "a1"
def test_history_adjusts_bid_range(self):
history = {
"a1": {"success_rate": 0.5, "avg_winning_bid": 100},
"a2": {"success_rate": 0.8, "avg_winning_bid": 200},
}
result = predict_task_outcome("t-bid", "Task", ["a1", "a2"], agent_history=history)
low, high = result["estimated_bid_range"]
assert low == max(1, int(100 * 0.8))
assert high == int(200 * 1.2)
def test_history_with_zero_avg_bid_skipped(self):
history = {
"a1": {"success_rate": 0.8, "avg_winning_bid": 0},
}
result = predict_task_outcome("t-zero-bid", "Task", ["a1"], agent_history=history)
# Zero avg_winning_bid should be skipped, keep default range
assert result["estimated_bid_range"] == [20, 80]
def test_prediction_stored_in_db(self):
result = predict_task_outcome("t-db", "Store me", ["a1"])
preds = get_predictions(task_id="t-db")
assert len(preds) == 1
assert preds[0].id == result["prediction_id"]
assert preds[0].prediction_type == "outcome"
def test_success_probability_clamped(self):
history = {
"a1": {"success_rate": 1.5, "avg_winning_bid": 50},
}
result = predict_task_outcome("t-clamp", "Task", ["a1"], agent_history=history)
assert result["success_probability"] <= 1.0
# ── evaluate_prediction ───────────────────────────────────────────────────
class TestEvaluatePrediction:
def test_correct_prediction(self):
predict_task_outcome("t-eval-ok", "Task", ["a1"])
result = evaluate_prediction("t-eval-ok", "a1", task_succeeded=True, winning_bid=30)
assert result is not None
assert 0.0 <= result["accuracy"] <= 1.0
assert result["actual"]["winner"] == "a1"
assert result["actual"]["succeeded"] is True
def test_wrong_prediction(self):
predict_task_outcome("t-eval-wrong", "Task", ["a1"])
result = evaluate_prediction("t-eval-wrong", "a2", task_succeeded=False)
assert result is not None
assert result["accuracy"] < 1.0
def test_no_prediction_returns_none(self):
result = evaluate_prediction("nonexistent", "a1", task_succeeded=True)
assert result is None
def test_double_evaluation_returns_none(self):
predict_task_outcome("t-double", "Task", ["a1"])
evaluate_prediction("t-double", "a1", task_succeeded=True)
result = evaluate_prediction("t-double", "a1", task_succeeded=True)
assert result is None
def test_evaluation_updates_db(self):
predict_task_outcome("t-upd", "Task", ["a1"])
evaluate_prediction("t-upd", "a1", task_succeeded=True, winning_bid=50)
preds = get_predictions(task_id="t-upd", evaluated_only=True)
assert len(preds) == 1
assert preds[0].accuracy is not None
assert preds[0].actual_value is not None
assert preds[0].evaluated_at is not None
def test_winning_bid_none(self):
predict_task_outcome("t-nobid", "Task", ["a1"])
result = evaluate_prediction("t-nobid", "a1", task_succeeded=True)
assert result is not None
assert result["actual"]["winning_bid"] is None
# ── _compute_accuracy ─────────────────────────────────────────────────────
class TestComputeAccuracy:
def test_perfect_match(self):
predicted = {
"likely_winner": "a1",
"success_probability": 1.0,
"estimated_bid_range": [20, 40],
}
actual = {"winner": "a1", "succeeded": True, "winning_bid": 30}
assert _compute_accuracy(predicted, actual) == pytest.approx(1.0, abs=0.01)
def test_all_wrong(self):
predicted = {
"likely_winner": "a1",
"success_probability": 1.0,
"estimated_bid_range": [10, 20],
}
actual = {"winner": "a2", "succeeded": False, "winning_bid": 100}
assert _compute_accuracy(predicted, actual) < 0.3
def test_no_winner_in_predicted(self):
predicted = {"success_probability": 0.5, "estimated_bid_range": [20, 40]}
actual = {"winner": "a1", "succeeded": True, "winning_bid": 30}
acc = _compute_accuracy(predicted, actual)
# Winner component skipped, success + bid counted
assert 0.0 <= acc <= 1.0
def test_no_winner_in_actual(self):
predicted = {"likely_winner": "a1", "success_probability": 0.5}
actual = {"succeeded": True}
acc = _compute_accuracy(predicted, actual)
assert 0.0 <= acc <= 1.0
def test_bid_outside_range_partial_credit(self):
predicted = {
"likely_winner": "a1",
"success_probability": 1.0,
"estimated_bid_range": [20, 40],
}
# Bid just outside range
actual = {"winner": "a1", "succeeded": True, "winning_bid": 45}
acc = _compute_accuracy(predicted, actual)
assert 0.5 < acc < 1.0
def test_bid_far_outside_range(self):
predicted = {
"likely_winner": "a1",
"success_probability": 1.0,
"estimated_bid_range": [20, 40],
}
actual = {"winner": "a1", "succeeded": True, "winning_bid": 500}
acc = _compute_accuracy(predicted, actual)
assert acc < 1.0
def test_no_actual_bid(self):
predicted = {
"likely_winner": "a1",
"success_probability": 0.7,
"estimated_bid_range": [20, 40],
}
actual = {"winner": "a1", "succeeded": True, "winning_bid": None}
acc = _compute_accuracy(predicted, actual)
# Bid component skipped — only winner + success
assert 0.0 <= acc <= 1.0
def test_failed_prediction_low_probability(self):
predicted = {"success_probability": 0.1}
actual = {"succeeded": False}
acc = _compute_accuracy(predicted, actual)
# Predicted low success and task failed → high accuracy
assert acc > 0.8
# ── get_predictions ───────────────────────────────────────────────────────
class TestGetPredictions:
def test_empty_db(self):
assert get_predictions() == []
def test_filter_by_task_id(self):
predict_task_outcome("t-filter1", "A", ["a1"])
predict_task_outcome("t-filter2", "B", ["a2"])
preds = get_predictions(task_id="t-filter1")
assert len(preds) == 1
assert preds[0].task_id == "t-filter1"
def test_evaluated_only(self):
predict_task_outcome("t-eo1", "A", ["a1"])
predict_task_outcome("t-eo2", "B", ["a1"])
evaluate_prediction("t-eo1", "a1", task_succeeded=True)
preds = get_predictions(evaluated_only=True)
assert len(preds) == 1
assert preds[0].task_id == "t-eo1"
def test_limit(self):
for i in range(10):
predict_task_outcome(f"t-lim{i}", "X", ["a1"])
preds = get_predictions(limit=3)
assert len(preds) == 3
def test_combined_filters(self):
predict_task_outcome("t-combo", "A", ["a1"])
evaluate_prediction("t-combo", "a1", task_succeeded=True)
predict_task_outcome("t-combo2", "B", ["a1"])
preds = get_predictions(task_id="t-combo", evaluated_only=True)
assert len(preds) == 1
def test_order_by_created_desc(self):
for i in range(3):
predict_task_outcome(f"t-ord{i}", f"Task {i}", ["a1"])
preds = get_predictions()
# Most recent first
assert preds[0].task_id == "t-ord2"
# ── get_accuracy_stats ────────────────────────────────────────────────────
class TestGetAccuracyStats:
def test_empty(self):
stats = get_accuracy_stats()
assert stats["total_predictions"] == 0
assert stats["evaluated"] == 0
assert stats["pending"] == 0
assert stats["avg_accuracy"] == 0.0
assert stats["min_accuracy"] == 0.0
assert stats["max_accuracy"] == 0.0
def test_with_unevaluated(self):
predict_task_outcome("t-uneval", "X", ["a1"])
stats = get_accuracy_stats()
assert stats["total_predictions"] == 1
assert stats["evaluated"] == 0
assert stats["pending"] == 1
def test_with_evaluations(self):
for i in range(3):
predict_task_outcome(f"t-stats{i}", "X", ["a1"])
evaluate_prediction(f"t-stats{i}", "a1", task_succeeded=True, winning_bid=30)
stats = get_accuracy_stats()
assert stats["total_predictions"] == 3
assert stats["evaluated"] == 3
assert stats["pending"] == 0
assert stats["avg_accuracy"] > 0.0
assert stats["min_accuracy"] <= stats["avg_accuracy"] <= stats["max_accuracy"]

389
tests/spark/test_memory.py Normal file
View File

@@ -0,0 +1,389 @@
"""Comprehensive tests for spark.memory module.
Covers:
- SparkEvent / SparkMemory dataclasses
- _get_conn (schema creation, WAL, busy timeout, idempotent indexes)
- score_importance (all event types, boosts, edge cases)
- record_event (auto-importance, explicit importance, invalid JSON, swarm bridge)
- get_events (all filters, ordering, limit)
- count_events (total, by type)
- store_memory (with/without expiry)
- get_memories (all filters)
- count_memories (total, by type)
"""
import json
import pytest
from spark.memory import (
IMPORTANCE_HIGH,
IMPORTANCE_LOW,
IMPORTANCE_MEDIUM,
SparkEvent,
SparkMemory,
_get_conn,
count_events,
count_memories,
get_events,
get_memories,
record_event,
score_importance,
store_memory,
)
# ── Constants ─────────────────────────────────────────────────────────────
class TestConstants:
def test_importance_ordering(self):
assert IMPORTANCE_LOW < IMPORTANCE_MEDIUM < IMPORTANCE_HIGH
# ── Dataclasses ───────────────────────────────────────────────────────────
class TestSparkEventDataclass:
def test_all_fields(self):
ev = SparkEvent(
id="1",
event_type="task_posted",
agent_id="a1",
task_id="t1",
description="Test",
data="{}",
importance=0.5,
created_at="2026-01-01",
)
assert ev.event_type == "task_posted"
assert ev.agent_id == "a1"
def test_nullable_fields(self):
ev = SparkEvent(
id="2",
event_type="task_posted",
agent_id=None,
task_id=None,
description="",
data="{}",
importance=0.5,
created_at="2026-01-01",
)
assert ev.agent_id is None
assert ev.task_id is None
class TestSparkMemoryDataclass:
def test_all_fields(self):
mem = SparkMemory(
id="1",
memory_type="pattern",
subject="system",
content="Test insight",
confidence=0.8,
source_events=5,
created_at="2026-01-01",
expires_at="2026-12-31",
)
assert mem.memory_type == "pattern"
assert mem.expires_at == "2026-12-31"
def test_nullable_expires(self):
mem = SparkMemory(
id="2",
memory_type="anomaly",
subject="agent-1",
content="Odd behavior",
confidence=0.6,
source_events=3,
created_at="2026-01-01",
expires_at=None,
)
assert mem.expires_at is None
# ── _get_conn ─────────────────────────────────────────────────────────────
class TestGetConn:
def test_creates_tables(self):
with _get_conn() as conn:
tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
names = {r["name"] for r in tables}
assert "spark_events" in names
assert "spark_memories" in names
def test_wal_mode(self):
with _get_conn() as conn:
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
assert mode == "wal"
def test_busy_timeout(self):
with _get_conn() as conn:
timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
assert timeout == 5000
def test_idempotent(self):
# Calling _get_conn twice should not raise
with _get_conn():
pass
with _get_conn():
pass
# ── score_importance ──────────────────────────────────────────────────────
class TestScoreImportance:
@pytest.mark.parametrize(
"event_type,expected_min,expected_max",
[
("task_posted", 0.3, 0.5),
("bid_submitted", 0.1, 0.3),
("task_assigned", 0.4, 0.6),
("task_completed", 0.5, 0.7),
("task_failed", 0.9, 1.0),
("agent_joined", 0.4, 0.6),
("prediction_result", 0.6, 0.8),
],
)
def test_base_scores(self, event_type, expected_min, expected_max):
score = score_importance(event_type, {})
assert expected_min <= score <= expected_max
def test_unknown_event_default(self):
assert score_importance("never_heard_of_this", {}) == 0.5
def test_failure_boost(self):
score = score_importance("task_failed", {})
assert score == 1.0
def test_high_bid_boost(self):
low = score_importance("bid_submitted", {"bid_sats": 10})
high = score_importance("bid_submitted", {"bid_sats": 100})
assert high > low
assert high <= 1.0
def test_high_bid_on_failure(self):
score = score_importance("task_failed", {"bid_sats": 100})
assert score == 1.0 # capped at 1.0
def test_score_always_rounded(self):
score = score_importance("bid_submitted", {"bid_sats": 100})
assert score == round(score, 2)
# ── record_event ──────────────────────────────────────────────────────────
class TestRecordEvent:
def test_basic_record(self):
eid = record_event("task_posted", "New task", task_id="t1")
assert isinstance(eid, str)
assert len(eid) > 0
def test_auto_importance(self):
record_event("task_failed", "Failed", task_id="t-auto")
events = get_events(task_id="t-auto")
assert events[0].importance >= 0.9
def test_explicit_importance(self):
record_event("task_posted", "Custom", task_id="t-expl", importance=0.1)
events = get_events(task_id="t-expl")
assert events[0].importance == 0.1
def test_with_agent_and_data(self):
data = json.dumps({"bid_sats": 42})
record_event("bid_submitted", "Bid", agent_id="a1", task_id="t-data", data=data)
events = get_events(task_id="t-data")
assert events[0].agent_id == "a1"
parsed = json.loads(events[0].data)
assert parsed["bid_sats"] == 42
def test_invalid_json_data_uses_default_importance(self):
record_event("task_posted", "Bad data", task_id="t-bad", data="not-json")
events = get_events(task_id="t-bad")
assert events[0].importance == 0.4 # base for task_posted
def test_returns_unique_ids(self):
id1 = record_event("task_posted", "A")
id2 = record_event("task_posted", "B")
assert id1 != id2
# ── get_events ────────────────────────────────────────────────────────────
class TestGetEvents:
def test_empty_db(self):
assert get_events() == []
def test_filter_by_type(self):
record_event("task_posted", "A")
record_event("task_completed", "B")
events = get_events(event_type="task_posted")
assert len(events) == 1
assert events[0].event_type == "task_posted"
def test_filter_by_agent(self):
record_event("task_posted", "A", agent_id="a1")
record_event("task_posted", "B", agent_id="a2")
events = get_events(agent_id="a1")
assert len(events) == 1
assert events[0].agent_id == "a1"
def test_filter_by_task(self):
record_event("task_posted", "A", task_id="t1")
record_event("task_posted", "B", task_id="t2")
events = get_events(task_id="t1")
assert len(events) == 1
def test_filter_by_min_importance(self):
record_event("task_posted", "Low", importance=0.1)
record_event("task_failed", "High", importance=0.9)
events = get_events(min_importance=0.5)
assert len(events) == 1
assert events[0].importance >= 0.5
def test_limit(self):
for i in range(10):
record_event("task_posted", f"ev{i}")
events = get_events(limit=3)
assert len(events) == 3
def test_order_by_created_desc(self):
record_event("task_posted", "first", task_id="ord1")
record_event("task_posted", "second", task_id="ord2")
events = get_events()
# Most recent first
assert events[0].task_id == "ord2"
def test_combined_filters(self):
record_event("task_failed", "A", agent_id="a1", task_id="t1", importance=0.9)
record_event("task_posted", "B", agent_id="a1", task_id="t2", importance=0.4)
record_event("task_failed", "C", agent_id="a2", task_id="t3", importance=0.9)
events = get_events(event_type="task_failed", agent_id="a1", min_importance=0.5)
assert len(events) == 1
assert events[0].task_id == "t1"
# ── count_events ──────────────────────────────────────────────────────────
class TestCountEvents:
def test_empty(self):
assert count_events() == 0
def test_total(self):
record_event("task_posted", "A")
record_event("task_failed", "B")
assert count_events() == 2
def test_by_type(self):
record_event("task_posted", "A")
record_event("task_posted", "B")
record_event("task_failed", "C")
assert count_events("task_posted") == 2
assert count_events("task_failed") == 1
assert count_events("task_completed") == 0
# ── store_memory ──────────────────────────────────────────────────────────
class TestStoreMemory:
def test_basic_store(self):
mid = store_memory("pattern", "system", "Test insight")
assert isinstance(mid, str)
assert len(mid) > 0
def test_returns_unique_ids(self):
id1 = store_memory("pattern", "a", "X")
id2 = store_memory("pattern", "b", "Y")
assert id1 != id2
def test_with_all_params(self):
store_memory(
"anomaly",
"agent-1",
"Odd pattern",
confidence=0.9,
source_events=10,
expires_at="2026-12-31",
)
mems = get_memories(subject="agent-1")
assert len(mems) == 1
assert mems[0].confidence == 0.9
assert mems[0].source_events == 10
assert mems[0].expires_at == "2026-12-31"
def test_default_values(self):
store_memory("insight", "sys", "Default test")
mems = get_memories(subject="sys")
assert mems[0].confidence == 0.5
assert mems[0].source_events == 0
assert mems[0].expires_at is None
# ── get_memories ──────────────────────────────────────────────────────────
class TestGetMemories:
def test_empty(self):
assert get_memories() == []
def test_filter_by_type(self):
store_memory("pattern", "a", "X")
store_memory("anomaly", "a", "Y")
mems = get_memories(memory_type="pattern")
assert len(mems) == 1
assert mems[0].memory_type == "pattern"
def test_filter_by_subject(self):
store_memory("pattern", "a", "X")
store_memory("pattern", "b", "Y")
mems = get_memories(subject="a")
assert len(mems) == 1
def test_filter_by_min_confidence(self):
store_memory("pattern", "a", "Low", confidence=0.2)
store_memory("pattern", "b", "High", confidence=0.9)
mems = get_memories(min_confidence=0.5)
assert len(mems) == 1
assert mems[0].content == "High"
def test_limit(self):
for i in range(10):
store_memory("pattern", "a", f"M{i}")
mems = get_memories(limit=3)
assert len(mems) == 3
def test_combined_filters(self):
store_memory("pattern", "a", "Target", confidence=0.9)
store_memory("anomaly", "a", "Wrong type", confidence=0.9)
store_memory("pattern", "b", "Wrong subject", confidence=0.9)
store_memory("pattern", "a", "Low conf", confidence=0.1)
mems = get_memories(memory_type="pattern", subject="a", min_confidence=0.5)
assert len(mems) == 1
assert mems[0].content == "Target"
# ── count_memories ────────────────────────────────────────────────────────
class TestCountMemories:
def test_empty(self):
assert count_memories() == 0
def test_total(self):
store_memory("pattern", "a", "X")
store_memory("anomaly", "b", "Y")
assert count_memories() == 2
def test_by_type(self):
store_memory("pattern", "a", "X")
store_memory("pattern", "b", "Y")
store_memory("anomaly", "c", "Z")
assert count_memories("pattern") == 2
assert count_memories("anomaly") == 1
assert count_memories("insight") == 0

266
tests/test_command_log.py Normal file
View File

@@ -0,0 +1,266 @@
"""Tests for Morrowind command log and training export pipeline."""
from datetime import UTC, datetime, timedelta
from pathlib import Path
import pytest
from src.infrastructure.morrowind.command_log import CommandLog, CommandLogger
from src.infrastructure.morrowind.schemas import (
CommandInput,
CommandType,
PerceptionOutput,
)
from src.infrastructure.morrowind.training_export import TrainingExporter
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
NOW = datetime(2026, 3, 21, 14, 30, 0, tzinfo=UTC)
def _make_perception(**overrides) -> PerceptionOutput:
defaults = {
"timestamp": NOW,
"agent_id": "timmy",
"location": {"cell": "Balmora", "x": 1024.5, "y": -512.3, "z": 64.0},
"health": {"current": 85, "max": 100},
}
defaults.update(overrides)
return PerceptionOutput(**defaults)
def _make_command(**overrides) -> CommandInput:
defaults = {
"timestamp": NOW,
"agent_id": "timmy",
"command": "move_to",
"params": {"target_x": 1050.0},
"reasoning": "Moving closer to quest target.",
}
defaults.update(overrides)
return CommandInput(**defaults)
@pytest.fixture
def logger(tmp_path: Path) -> CommandLogger:
"""CommandLogger backed by an in-memory SQLite DB."""
db_path = tmp_path / "test.db"
return CommandLogger(db_url=f"sqlite:///{db_path}")
@pytest.fixture
def exporter(logger: CommandLogger) -> TrainingExporter:
return TrainingExporter(logger)
# ---------------------------------------------------------------------------
# CommandLogger — log_command
# ---------------------------------------------------------------------------
class TestLogCommand:
def test_basic_log(self, logger: CommandLogger):
cmd = _make_command()
row_id = logger.log_command(cmd)
assert row_id >= 1
def test_log_with_perception(self, logger: CommandLogger):
cmd = _make_command()
perception = _make_perception()
row_id = logger.log_command(cmd, perception=perception)
assert row_id >= 1
results = logger.query(limit=1)
assert len(results) == 1
assert results[0]["cell"] == "Balmora"
assert results[0]["perception_snapshot"]["location"]["cell"] == "Balmora"
def test_log_with_outcome(self, logger: CommandLogger):
cmd = _make_command()
row_id = logger.log_command(cmd, outcome="success: arrived at destination")
results = logger.query(limit=1)
assert results[0]["outcome"] == "success: arrived at destination"
def test_log_preserves_episode_id(self, logger: CommandLogger):
cmd = _make_command(episode_id="ep_test_001")
logger.log_command(cmd)
results = logger.query(episode_id="ep_test_001")
assert len(results) == 1
assert results[0]["episode_id"] == "ep_test_001"
# ---------------------------------------------------------------------------
# CommandLogger — query
# ---------------------------------------------------------------------------
class TestQuery:
def test_filter_by_command_type(self, logger: CommandLogger):
logger.log_command(_make_command(command="move_to"))
logger.log_command(_make_command(command="noop"))
logger.log_command(_make_command(command="move_to"))
results = logger.query(command_type="move_to")
assert len(results) == 2
assert all(r["command"] == "move_to" for r in results)
def test_filter_by_cell(self, logger: CommandLogger):
p1 = _make_perception(location={"cell": "Balmora", "x": 0, "y": 0, "z": 0})
p2 = _make_perception(location={"cell": "Vivec", "x": 0, "y": 0, "z": 0})
logger.log_command(_make_command(), perception=p1)
logger.log_command(_make_command(), perception=p2)
results = logger.query(cell="Vivec")
assert len(results) == 1
assert results[0]["cell"] == "Vivec"
def test_filter_by_time_range(self, logger: CommandLogger):
t1 = NOW - timedelta(hours=2)
t2 = NOW - timedelta(hours=1)
t3 = NOW
logger.log_command(_make_command(timestamp=t1.isoformat()))
logger.log_command(_make_command(timestamp=t2.isoformat()))
logger.log_command(_make_command(timestamp=t3.isoformat()))
results = logger.query(since=NOW - timedelta(hours=1, minutes=30), until=NOW)
assert len(results) == 2
def test_limit_and_offset(self, logger: CommandLogger):
for i in range(5):
logger.log_command(_make_command())
results = logger.query(limit=2, offset=0)
assert len(results) == 2
results = logger.query(limit=10, offset=3)
assert len(results) == 2
def test_empty_query(self, logger: CommandLogger):
results = logger.query()
assert results == []
# ---------------------------------------------------------------------------
# CommandLogger — export_training_data (JSONL)
# ---------------------------------------------------------------------------
class TestExportTrainingData:
def test_basic_export(self, logger: CommandLogger, tmp_path: Path):
perception = _make_perception()
for _ in range(3):
logger.log_command(_make_command(), perception=perception)
output = tmp_path / "train.jsonl"
count = logger.export_training_data(output)
assert count == 3
assert output.exists()
import json
lines = output.read_text().strip().split("\n")
assert len(lines) == 3
record = json.loads(lines[0])
assert "input" in record
assert "output" in record
assert record["output"]["command"] == "move_to"
def test_export_filter_by_episode(self, logger: CommandLogger, tmp_path: Path):
logger.log_command(_make_command(episode_id="ep_a"), perception=_make_perception())
logger.log_command(_make_command(episode_id="ep_b"), perception=_make_perception())
output = tmp_path / "ep_a.jsonl"
count = logger.export_training_data(output, episode_id="ep_a")
assert count == 1
# ---------------------------------------------------------------------------
# CommandLogger — storage management
# ---------------------------------------------------------------------------
class TestStorageManagement:
def test_count(self, logger: CommandLogger):
assert logger.count() == 0
logger.log_command(_make_command())
logger.log_command(_make_command())
assert logger.count() == 2
def test_rotate_old_entries(self, logger: CommandLogger):
old_time = NOW - timedelta(days=100)
logger.log_command(_make_command(timestamp=old_time.isoformat()))
logger.log_command(_make_command(timestamp=NOW.isoformat()))
deleted = logger.rotate(max_age_days=90)
assert deleted == 1
assert logger.count() == 1
def test_rotate_nothing_to_delete(self, logger: CommandLogger):
logger.log_command(_make_command(timestamp=NOW.isoformat()))
deleted = logger.rotate(max_age_days=1)
assert deleted == 0
# ---------------------------------------------------------------------------
# TrainingExporter — chat format
# ---------------------------------------------------------------------------
class TestTrainingExporterChat:
def test_chat_format_export(
self, logger: CommandLogger, exporter: TrainingExporter, tmp_path: Path
):
perception = _make_perception()
for _ in range(3):
logger.log_command(_make_command(), perception=perception)
output = tmp_path / "chat.jsonl"
stats = exporter.export_chat_format(output)
assert stats.total_records == 3
assert stats.format == "chat_completion"
import json
lines = output.read_text().strip().split("\n")
record = json.loads(lines[0])
assert record["messages"][0]["role"] == "system"
assert record["messages"][1]["role"] == "user"
assert record["messages"][2]["role"] == "assistant"
# ---------------------------------------------------------------------------
# TrainingExporter — episode sequences
# ---------------------------------------------------------------------------
class TestTrainingExporterEpisodes:
def test_episode_export(
self, logger: CommandLogger, exporter: TrainingExporter, tmp_path: Path
):
perception = _make_perception()
for i in range(5):
logger.log_command(
_make_command(episode_id="ep_test"),
perception=perception,
)
output_dir = tmp_path / "episodes"
stats = exporter.export_episode_sequences(output_dir, min_length=3)
assert stats.episodes_exported == 1
assert stats.total_records == 5
assert (output_dir / "ep_test.jsonl").exists()
def test_short_episodes_skipped(
self, logger: CommandLogger, exporter: TrainingExporter, tmp_path: Path
):
perception = _make_perception()
logger.log_command(_make_command(episode_id="short"), perception=perception)
output_dir = tmp_path / "episodes"
stats = exporter.export_episode_sequences(output_dir, min_length=3)
assert stats.episodes_exported == 0
assert stats.skipped_records == 1

470
tests/test_config_module.py Normal file
View File

@@ -0,0 +1,470 @@
"""Tests for src/config.py — Settings, validation, and helper functions."""
import os
from unittest.mock import patch
import pytest
class TestNormalizeOllamaUrl:
"""normalize_ollama_url replaces localhost with 127.0.0.1."""
def test_replaces_localhost(self):
from config import normalize_ollama_url
assert normalize_ollama_url("http://localhost:11434") == "http://127.0.0.1:11434"
def test_preserves_ip(self):
from config import normalize_ollama_url
assert normalize_ollama_url("http://192.168.1.5:11434") == "http://192.168.1.5:11434"
def test_preserves_non_localhost_hostname(self):
from config import normalize_ollama_url
assert normalize_ollama_url("http://ollama.local:11434") == "http://ollama.local:11434"
def test_replaces_multiple_occurrences(self):
from config import normalize_ollama_url
result = normalize_ollama_url("http://localhost:11434/localhost")
assert result == "http://127.0.0.1:11434/127.0.0.1"
class TestSettingsDefaults:
"""Settings instantiation produces correct defaults."""
def _make_settings(self, **env_overrides):
"""Create a fresh Settings instance with given env overrides."""
from config import Settings
clean_env = {
k: v
for k, v in os.environ.items()
if not k.startswith(("OLLAMA_", "TIMMY_", "AGENT_", "DEBUG"))
}
clean_env.update(env_overrides)
with patch.dict(os.environ, clean_env, clear=True):
return Settings()
def test_default_agent_name(self):
s = self._make_settings()
assert s.agent_name == "Agent"
def test_default_ollama_url(self):
s = self._make_settings()
assert s.ollama_url == "http://localhost:11434"
def test_default_ollama_model(self):
s = self._make_settings()
assert s.ollama_model == "qwen3:30b"
def test_default_ollama_num_ctx(self):
s = self._make_settings()
assert s.ollama_num_ctx == 4096
def test_default_debug_false(self):
s = self._make_settings()
assert s.debug is False
def test_default_timmy_env(self):
s = self._make_settings()
assert s.timmy_env == "development"
def test_default_timmy_test_mode(self):
s = self._make_settings()
assert s.timmy_test_mode is False
def test_default_spark_enabled(self):
s = self._make_settings()
assert s.spark_enabled is True
def test_default_lightning_backend(self):
s = self._make_settings()
assert s.lightning_backend == "mock"
def test_default_max_agent_steps(self):
s = self._make_settings()
assert s.max_agent_steps == 10
def test_default_memory_prune_days(self):
s = self._make_settings()
assert s.memory_prune_days == 90
def test_default_fallback_models_is_list(self):
s = self._make_settings()
assert isinstance(s.fallback_models, list)
assert len(s.fallback_models) > 0
def test_default_cors_origins_is_list(self):
s = self._make_settings()
assert isinstance(s.cors_origins, list)
def test_default_trusted_hosts_is_list(self):
s = self._make_settings()
assert isinstance(s.trusted_hosts, list)
assert "localhost" in s.trusted_hosts
def test_normalized_ollama_url_property(self):
s = self._make_settings()
assert "127.0.0.1" in s.normalized_ollama_url
assert "localhost" not in s.normalized_ollama_url
class TestSettingsEnvOverrides:
"""Environment variables override default values."""
def _make_settings(self, **env_overrides):
from config import Settings
clean_env = {
k: v
for k, v in os.environ.items()
if not k.startswith(("OLLAMA_", "TIMMY_", "AGENT_", "DEBUG"))
}
clean_env.update(env_overrides)
with patch.dict(os.environ, clean_env, clear=True):
return Settings()
def test_agent_name_override(self):
s = self._make_settings(AGENT_NAME="Timmy")
assert s.agent_name == "Timmy"
def test_ollama_url_override(self):
s = self._make_settings(OLLAMA_URL="http://10.0.0.1:11434")
assert s.ollama_url == "http://10.0.0.1:11434"
def test_ollama_model_override(self):
s = self._make_settings(OLLAMA_MODEL="llama3.1")
assert s.ollama_model == "llama3.1"
def test_debug_true_from_string(self):
s = self._make_settings(DEBUG="true")
assert s.debug is True
def test_debug_false_from_string(self):
s = self._make_settings(DEBUG="false")
assert s.debug is False
def test_numeric_override(self):
s = self._make_settings(OLLAMA_NUM_CTX="8192")
assert s.ollama_num_ctx == 8192
def test_max_agent_steps_override(self):
s = self._make_settings(MAX_AGENT_STEPS="25")
assert s.max_agent_steps == 25
def test_timmy_env_production(self):
s = self._make_settings(TIMMY_ENV="production")
assert s.timmy_env == "production"
def test_timmy_test_mode_true(self):
s = self._make_settings(TIMMY_TEST_MODE="true")
assert s.timmy_test_mode is True
def test_grok_enabled_override(self):
s = self._make_settings(GROK_ENABLED="true")
assert s.grok_enabled is True
def test_spark_enabled_override(self):
s = self._make_settings(SPARK_ENABLED="false")
assert s.spark_enabled is False
def test_memory_prune_days_override(self):
s = self._make_settings(MEMORY_PRUNE_DAYS="30")
assert s.memory_prune_days == 30
class TestSettingsTypeValidation:
"""Pydantic correctly parses and validates types from string env vars."""
def _make_settings(self, **env_overrides):
from config import Settings
clean_env = {
k: v
for k, v in os.environ.items()
if not k.startswith(("OLLAMA_", "TIMMY_", "AGENT_", "DEBUG"))
}
clean_env.update(env_overrides)
with patch.dict(os.environ, clean_env, clear=True):
return Settings()
def test_bool_from_1(self):
s = self._make_settings(DEBUG="1")
assert s.debug is True
def test_bool_from_0(self):
s = self._make_settings(DEBUG="0")
assert s.debug is False
def test_int_field_rejects_non_numeric(self):
from pydantic import ValidationError
with pytest.raises(ValidationError):
self._make_settings(OLLAMA_NUM_CTX="not_a_number")
def test_literal_field_rejects_invalid(self):
from pydantic import ValidationError
with pytest.raises(ValidationError):
self._make_settings(TIMMY_ENV="staging")
def test_literal_backend_rejects_invalid(self):
from pydantic import ValidationError
with pytest.raises(ValidationError):
self._make_settings(TIMMY_MODEL_BACKEND="openai")
def test_literal_backend_accepts_valid(self):
for backend in ("ollama", "grok", "claude", "auto"):
s = self._make_settings(TIMMY_MODEL_BACKEND=backend)
assert s.timmy_model_backend == backend
def test_extra_fields_ignored(self):
# model_config has extra="ignore"
s = self._make_settings(TOTALLY_UNKNOWN_FIELD="hello")
assert not hasattr(s, "totally_unknown_field")
class TestSettingsEdgeCases:
"""Edge cases: empty strings, missing vars, boundary values."""
def _make_settings(self, **env_overrides):
from config import Settings
clean_env = {
k: v
for k, v in os.environ.items()
if not k.startswith(("OLLAMA_", "TIMMY_", "AGENT_", "DEBUG"))
}
clean_env.update(env_overrides)
with patch.dict(os.environ, clean_env, clear=True):
return Settings()
def test_empty_string_tokens_stay_empty(self):
s = self._make_settings(TELEGRAM_TOKEN="", DISCORD_TOKEN="")
assert s.telegram_token == ""
assert s.discord_token == ""
def test_zero_int_fields(self):
s = self._make_settings(OLLAMA_NUM_CTX="0", MEMORY_PRUNE_DAYS="0")
assert s.ollama_num_ctx == 0
assert s.memory_prune_days == 0
def test_large_int_value(self):
s = self._make_settings(CHAT_API_MAX_BODY_BYTES="104857600")
assert s.chat_api_max_body_bytes == 104857600
def test_negative_int_accepted(self):
# Pydantic doesn't constrain these to positive
s = self._make_settings(MAX_AGENT_STEPS="-1")
assert s.max_agent_steps == -1
class TestComputeRepoRoot:
"""_compute_repo_root auto-detects .git directory."""
def test_returns_string(self):
from config import Settings
s = Settings()
result = s._compute_repo_root()
assert isinstance(result, str)
assert len(result) > 0
def test_explicit_repo_root_used(self):
from config import Settings
with patch.dict(os.environ, {"REPO_ROOT": "/tmp/myrepo"}, clear=False):
s = Settings()
s.repo_root = "/tmp/myrepo"
assert s._compute_repo_root() == "/tmp/myrepo"
class TestModelPostInit:
"""model_post_init resolves gitea_token from file fallback."""
def test_gitea_token_from_env(self):
from config import Settings
with patch.dict(os.environ, {"GITEA_TOKEN": "test-token-123"}, clear=False):
s = Settings()
assert s.gitea_token == "test-token-123"
def test_gitea_token_stays_empty_when_no_file(self):
from config import Settings
env = {k: v for k, v in os.environ.items() if k != "GITEA_TOKEN"}
with patch.dict(os.environ, env, clear=True):
with patch("os.path.isfile", return_value=False):
s = Settings()
assert s.gitea_token == ""
class TestCheckOllamaModelAvailable:
"""check_ollama_model_available handles network responses and errors."""
def test_returns_false_on_network_error(self):
from config import check_ollama_model_available
with patch("urllib.request.urlopen", side_effect=OSError("Connection refused")):
assert check_ollama_model_available("llama3.1") is False
def test_returns_true_when_model_found(self):
import json
from unittest.mock import MagicMock
from config import check_ollama_model_available
response_data = json.dumps({"models": [{"name": "llama3.1:8b-instruct"}]}).encode()
mock_response = MagicMock()
mock_response.read.return_value = response_data
mock_response.__enter__ = lambda s: s
mock_response.__exit__ = MagicMock(return_value=False)
with patch("urllib.request.urlopen", return_value=mock_response):
assert check_ollama_model_available("llama3.1") is True
def test_returns_false_when_model_not_found(self):
import json
from unittest.mock import MagicMock
from config import check_ollama_model_available
response_data = json.dumps({"models": [{"name": "qwen2.5:7b"}]}).encode()
mock_response = MagicMock()
mock_response.read.return_value = response_data
mock_response.__enter__ = lambda s: s
mock_response.__exit__ = MagicMock(return_value=False)
with patch("urllib.request.urlopen", return_value=mock_response):
assert check_ollama_model_available("llama3.1") is False
class TestGetEffectiveOllamaModel:
"""get_effective_ollama_model walks fallback chain."""
def test_returns_primary_when_available(self):
from config import get_effective_ollama_model
with patch("config.check_ollama_model_available", return_value=True):
result = get_effective_ollama_model()
assert result == "qwen3:30b"
def test_falls_back_when_primary_unavailable(self):
from config import get_effective_ollama_model
def side_effect(model):
return model == "llama3.1:8b-instruct"
with patch("config.check_ollama_model_available", side_effect=side_effect):
result = get_effective_ollama_model()
assert result == "llama3.1:8b-instruct"
def test_returns_user_model_when_nothing_available(self):
from config import get_effective_ollama_model
with patch("config.check_ollama_model_available", return_value=False):
result = get_effective_ollama_model()
assert result == "qwen3:30b"
class TestValidateStartup:
"""validate_startup enforces security in production, warns in dev."""
def setup_method(self):
import config
config._startup_validated = False
def test_skips_in_test_mode(self):
import config
with patch.dict(os.environ, {"TIMMY_TEST_MODE": "1"}):
config.validate_startup()
assert config._startup_validated is True
def test_dev_mode_warns_but_does_not_exit(self, caplog):
import logging
import config
config._startup_validated = False
env = {k: v for k, v in os.environ.items() if k != "TIMMY_TEST_MODE"}
env["TIMMY_ENV"] = "development"
with patch.dict(os.environ, env, clear=True):
with caplog.at_level(logging.WARNING, logger="config"):
config.validate_startup()
assert config._startup_validated is True
def test_production_exits_without_secrets(self):
import config
config._startup_validated = False
env = {k: v for k, v in os.environ.items() if k != "TIMMY_TEST_MODE"}
env["TIMMY_ENV"] = "production"
env.pop("L402_HMAC_SECRET", None)
env.pop("L402_MACAROON_SECRET", None)
with patch.dict(os.environ, env, clear=True):
with patch.object(config.settings, "timmy_env", "production"):
with patch.object(config.settings, "l402_hmac_secret", ""):
with patch.object(config.settings, "l402_macaroon_secret", ""):
with pytest.raises(SystemExit):
config.validate_startup(force=True)
def test_production_exits_with_cors_wildcard(self):
import config
config._startup_validated = False
env = {k: v for k, v in os.environ.items() if k != "TIMMY_TEST_MODE"}
env["TIMMY_ENV"] = "production"
with patch.dict(os.environ, env, clear=True):
with patch.object(config.settings, "timmy_env", "production"):
with patch.object(config.settings, "l402_hmac_secret", "secret1"):
with patch.object(config.settings, "l402_macaroon_secret", "secret2"):
with patch.object(config.settings, "cors_origins", ["*"]):
with pytest.raises(SystemExit):
config.validate_startup(force=True)
def test_production_passes_with_all_secrets(self):
import config
config._startup_validated = False
env = {k: v for k, v in os.environ.items() if k != "TIMMY_TEST_MODE"}
env["TIMMY_ENV"] = "production"
with patch.dict(os.environ, env, clear=True):
with patch.object(config.settings, "timmy_env", "production"):
with patch.object(config.settings, "l402_hmac_secret", "secret1"):
with patch.object(config.settings, "l402_macaroon_secret", "secret2"):
with patch.object(
config.settings,
"cors_origins",
["http://localhost:3000"],
):
config.validate_startup(force=True)
assert config._startup_validated is True
def test_idempotent_without_force(self):
import config
config._startup_validated = True
# Should return immediately without doing anything
config.validate_startup()
assert config._startup_validated is True
class TestAppStartTime:
"""APP_START_TIME is set at module load."""
def test_app_start_time_is_datetime(self):
from datetime import datetime
from config import APP_START_TIME
assert isinstance(APP_START_TIME, datetime)
def test_app_start_time_has_timezone(self):
from config import APP_START_TIME
assert APP_START_TIME.tzinfo is not None

244
tests/test_morrowind_api.py Normal file
View File

@@ -0,0 +1,244 @@
"""Tests for the Morrowind FastAPI harness endpoints.
Covers:
- GET /api/v1/morrowind/perception
- POST /api/v1/morrowind/command
- GET /api/v1/morrowind/status
"""
from __future__ import annotations
import json
from datetime import UTC, datetime
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from fastapi import FastAPI
from fastapi.testclient import TestClient
from infrastructure.morrowind.api import router, _get_command_logger
from infrastructure.morrowind import api as api_module
# ---------------------------------------------------------------------------
# Fixtures
# ---------------------------------------------------------------------------
SAMPLE_PERCEPTION = {
"protocol_version": "1.0.0",
"timestamp": "2024-06-15T10:30:00Z",
"agent_id": "timmy",
"location": {
"cell": "Balmora, Guild of Mages",
"x": 1234.5,
"y": 6789.0,
"z": 0.0,
"interior": True,
},
"health": {"current": 80, "max": 100},
"nearby_entities": [
{
"entity_id": "npc_001",
"name": "Ranis Athrys",
"entity_type": "npc",
"distance": 5.2,
"disposition": 65,
}
],
"inventory_summary": {
"gold": 250,
"item_count": 12,
"encumbrance_pct": 0.45,
},
"active_quests": [
{"quest_id": "mq_01", "name": "A Mysterious Note", "stage": 2}
],
"environment": {
"time_of_day": "morning",
"weather": "clear",
"is_combat": False,
"is_dialogue": False,
},
}
SAMPLE_COMMAND = {
"protocol_version": "1.0.0",
"timestamp": "2024-06-15T10:31:00Z",
"agent_id": "timmy",
"command": "move_to",
"params": {"x": 1300.0, "y": 6800.0, "z": 0.0},
"reasoning": "Moving to the guild entrance to speak with the quest giver.",
"episode_id": "ep_001",
}
@pytest.fixture
def app():
"""Create a fresh FastAPI app with the morrowind router."""
test_app = FastAPI()
test_app.include_router(router)
return test_app
@pytest.fixture
def client(app):
"""FastAPI test client."""
with TestClient(app) as c:
yield c
@pytest.fixture
def perception_file(tmp_path):
"""Write sample perception JSON to a temp file and patch the module path."""
p = tmp_path / "perception.json"
p.write_text(json.dumps(SAMPLE_PERCEPTION), encoding="utf-8")
with patch.object(api_module, "PERCEPTION_PATH", p):
yield p
@pytest.fixture
def mock_command_logger():
"""Patch the command logger with a mock."""
mock_logger = MagicMock()
mock_logger.log_command.return_value = 42
mock_logger.count.return_value = 7
with patch.object(api_module, "_command_logger", mock_logger):
yield mock_logger
# ---------------------------------------------------------------------------
# GET /perception
# ---------------------------------------------------------------------------
class TestGetPerception:
def test_success(self, client, perception_file):
"""Perception endpoint returns validated data."""
response = client.get("/api/v1/morrowind/perception")
assert response.status_code == 200
data = response.json()
assert data["agent_id"] == "timmy"
assert data["location"]["cell"] == "Balmora, Guild of Mages"
assert data["health"]["current"] == 80
assert data["health"]["max"] == 100
def test_file_not_found(self, client, tmp_path):
"""Returns 404 when perception file doesn't exist."""
missing = tmp_path / "nonexistent.json"
with patch.object(api_module, "PERCEPTION_PATH", missing):
response = client.get("/api/v1/morrowind/perception")
assert response.status_code == 404
assert "not found" in response.json()["detail"].lower()
def test_invalid_json(self, client, tmp_path):
"""Returns 422 when perception file contains invalid JSON."""
bad_file = tmp_path / "bad.json"
bad_file.write_text("not json", encoding="utf-8")
with patch.object(api_module, "PERCEPTION_PATH", bad_file):
response = client.get("/api/v1/morrowind/perception")
assert response.status_code == 422
def test_schema_validation_failure(self, client, tmp_path):
"""Returns 500 when JSON doesn't match PerceptionOutput schema."""
bad_data = tmp_path / "bad_schema.json"
bad_data.write_text(json.dumps({"not": "valid"}), encoding="utf-8")
with patch.object(api_module, "PERCEPTION_PATH", bad_data):
response = client.get("/api/v1/morrowind/perception")
assert response.status_code == 500
# ---------------------------------------------------------------------------
# POST /command
# ---------------------------------------------------------------------------
class TestPostCommand:
def test_success(self, client, mock_command_logger, perception_file):
"""Command is accepted, logged, and returns a command_id."""
response = client.post(
"/api/v1/morrowind/command",
json=SAMPLE_COMMAND,
)
assert response.status_code == 200
data = response.json()
assert data["command_id"] == 42
assert data["status"] == "accepted"
assert data["bridge_forwarded"] is False
mock_command_logger.log_command.assert_called_once()
def test_invalid_command_type(self, client, mock_command_logger):
"""Rejects commands with unknown command types."""
bad_command = {**SAMPLE_COMMAND, "command": "fly_to_moon"}
response = client.post(
"/api/v1/morrowind/command",
json=bad_command,
)
assert response.status_code == 422
def test_missing_reasoning(self, client, mock_command_logger):
"""Rejects commands without a reasoning field."""
no_reasoning = {**SAMPLE_COMMAND}
del no_reasoning["reasoning"]
response = client.post(
"/api/v1/morrowind/command",
json=no_reasoning,
)
assert response.status_code == 422
def test_empty_reasoning(self, client, mock_command_logger):
"""Rejects commands with empty reasoning."""
empty_reasoning = {**SAMPLE_COMMAND, "reasoning": ""}
response = client.post(
"/api/v1/morrowind/command",
json=empty_reasoning,
)
assert response.status_code == 422
def test_log_failure(self, client, tmp_path):
"""Returns 500 when command logging fails."""
mock_logger = MagicMock()
mock_logger.log_command.side_effect = RuntimeError("DB unavailable")
missing = tmp_path / "no_perception.json"
with (
patch.object(api_module, "_command_logger", mock_logger),
patch.object(api_module, "PERCEPTION_PATH", missing),
):
response = client.post(
"/api/v1/morrowind/command",
json=SAMPLE_COMMAND,
)
assert response.status_code == 500
assert "log command" in response.json()["detail"].lower()
# ---------------------------------------------------------------------------
# GET /morrowind/status
# ---------------------------------------------------------------------------
class TestGetStatus:
def test_connected(self, client, perception_file, mock_command_logger):
"""Status reports connected when perception file exists."""
response = client.get("/api/v1/morrowind/status")
assert response.status_code == 200
data = response.json()
assert data["connected"] is True
assert data["current_cell"] == "Balmora, Guild of Mages"
assert data["command_queue_depth"] == 7
assert data["vitals"]["health"] == "80/100"
def test_disconnected(self, client, tmp_path, mock_command_logger):
"""Status reports disconnected when perception file is missing."""
missing = tmp_path / "nonexistent.json"
with patch.object(api_module, "PERCEPTION_PATH", missing):
response = client.get("/api/v1/morrowind/status")
assert response.status_code == 200
data = response.json()
assert data["connected"] is False
assert data["current_cell"] is None
assert data["last_perception_timestamp"] is None

Some files were not shown because too many files have changed in this diff Show More