Commit Graph

80 Commits

Author SHA1 Message Date
fe7e14b10e [kimi] Split scorecard_service.py into focused modules (#1406) (#1461) 2026-03-24 20:06:31 +00:00
4d2aeb937f [loop-cycle-7] refactor: split research.py into research/ subpackage (#1405) (#1458) 2026-03-24 19:53:04 +00:00
8518db921e [kimi] Implement graceful shutdown and health checks (#1397) (#1457) 2026-03-24 19:31:14 +00:00
d0b6d87eb1 [perplexity] feat: Nexus v2 — Cognitive Awareness & Introspection Engine (#1090) (#1348)
Co-authored-by: Perplexity Computer <perplexity@tower.local>
Co-committed-by: Perplexity Computer <perplexity@tower.local>
2026-03-24 02:50:40 +00:00
0fefb1c297 [loop-cycle-2112] chore: remove unused imports (#1328) 2026-03-24 02:24:57 +00:00
af162f1a80 [claude] Add unit tests for scorecard_service.py (#1139) (#1320)
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-24 02:12:47 +00:00
6bb5e7e1a6 [claude] Real-time monitoring dashboard for all agent systems (#862) (#1319) 2026-03-24 02:07:38 +00:00
b5fb6a85cf [claude] Fix pre-existing ruff lint errors blocking git hooks (#1247) (#1248) 2026-03-23 23:33:37 +00:00
3217c32356 [claude] feat: Nexus — persistent conversational awareness space with live memory (#1208) (#1211) 2026-03-23 22:34:48 +00:00
25157a71a8 [loop-cycle] fix: remove unused imports and fix formatting (lint) (#1209) 2026-03-23 22:30:03 +00:00
b12fa8aa07 [claude] Add unit tests for daily_run.py (#1186) (#1199) 2026-03-23 21:58:33 +00:00
f2a277f7b5 [claude] Add vllm-mlx as high-performance local inference backend (#1069) (#1089)
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 15:34:13 +00:00
b5a65b9d10 [claude] Add unit tests for health.py (#945) (#1002)
Co-authored-by: Claude (Opus 4.6) <claude@hermes.local>
Co-committed-by: Claude (Opus 4.6) <claude@hermes.local>
2026-03-23 15:10:53 +00:00
447e2b18c2 [kimi] Generate daily/weekly agent scorecards (#712) (#790)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-22 01:41:52 +00:00
815933953c [kimi] Add WebSocket authentication for Matrix connections (#682) (#744) 2026-03-21 16:14:05 +00:00
d54493a87b [kimi] Add /api/matrix/health endpoint (#685) (#745) 2026-03-21 15:51:29 +00:00
ddadc95e55 [kimi] Add /api/matrix/memory/search endpoint (#678) (#740) 2026-03-21 14:52:31 +00:00
8fc8e0fc3d [kimi] Add /api/matrix/thoughts endpoint for recent thought stream (#677) (#739) 2026-03-21 14:44:46 +00:00
2a7b6d5708 [kimi] Add /api/matrix/bark endpoint — HTTP fallback for bark messages (#675) (#737) 2026-03-21 14:32:04 +00:00
9d4ac8e7cc [kimi] Add /api/matrix/config endpoint for world configuration (#674) (#736) 2026-03-21 14:25:19 +00:00
c9601ba32c [kimi] Add /api/matrix/agents endpoint for Matrix visualization (#673) (#735) 2026-03-21 14:18:46 +00:00
9d0f5c778e [loop-cycle-2] fix: resolve endpoint before execution in CSRF middleware (#626) (#656) 2026-03-20 23:05:09 +00:00
7452e8a4f0 fix: add missing tests for Tower route /tower (#621)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-20 16:22:13 -04:00
7da434c85b [loop-cycle-946] refactor: complete airllm removal (#486) (#545) 2026-03-19 20:46:20 -04:00
3afb62afb7 fix: add self_reflect tool for past behavior review (#417)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 09:39:14 -04:00
76b26ead55 rescue: WS heartbeat ping + commitment tracking from stale PRs (#415)
## What
Manually integrated unique code from two stale PRs that were **not** superseded by merged work.

### PR #399 (kimi/issue-362) — WebSocket heartbeat ping
- 15-second ping loop detects dead iPad/Safari connections
- `_heartbeat()` coroutine launched as background task per WS client
- `ping_task` properly cancelled on disconnect
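
A minimal sketch of the heartbeat pattern described above, assuming an asyncio-based server. The `ws` object, its `ping()` method, and the `serve_client` wrapper are hypothetical stand-ins; only the 15-second interval and the cancel-on-disconnect behaviour come from the commit message.

```python
import asyncio

PING_INTERVAL_SECONDS = 15  # interval stated in the commit message


async def heartbeat(ws, interval=PING_INTERVAL_SECONDS):
    """Ping `ws` on a fixed interval; return once a ping fails."""
    while True:
        await asyncio.sleep(interval)
        try:
            await ws.ping()  # hypothetical method on a hypothetical client object
        except Exception:
            return  # treat any ping failure as a dead connection


async def serve_client(ws, handle_messages, interval=PING_INTERVAL_SECONDS):
    """Run the per-client handler with a background heartbeat task."""
    ping_task = asyncio.create_task(heartbeat(ws, interval))
    try:
        await handle_messages(ws)
    finally:
        # Cancel the heartbeat on disconnect and wait for it to finish.
        ping_task.cancel()
        await asyncio.gather(ping_task, return_exceptions=True)
```

In a real handler, the failure branch would probably also close the socket and remove the client from the broadcast pool; that part is not shown here.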

### PR #408 (kimi/issue-322) — Conversation commitment tracking
- Regex extraction of commitments from Timmy replies (`I'll` / `I will` / `Let me`)
- `_record_commitments()` stores with dedup + cap at 10
- `_tick_commitments()` increments message counter per commitment
- `_build_commitment_context()` surfaces overdue commitments as grounding context
- Wired into `_bark_and_broadcast()` and `_generate_bark()`
- Public API: `get_commitments()`, `close_commitment()`, `reset_commitments()`

### Tests
22 new tests covering both features: extraction, recording, dedup, caps, tick/context, integration, heartbeat ping, dead connection handling.

---
This PR rescues unique code from stale PRs #399 and #408. The other two stale PRs (#402, #411) were already superseded by merged work and should be closed.

Co-authored-by: Perplexity Computer <perplexity@tower.dev>
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/415
Co-authored-by: Perplexity Computer <perplexity@tower.local>
Co-committed-by: Perplexity Computer <perplexity@tower.local>
2026-03-19 03:22:44 -04:00
b67dbe922f fix: conversation grounding to prevent topic drift in Workshop (#406)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 02:39:15 -04:00
3571d528ad feat: Workshop Phase 1 — State Schema v1 (#404)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 02:24:13 -04:00
e89aef41bc [loop-cycle-392] refactor: DRY broadcast + bark error logging (#397, #398) (#400) 2026-03-19 02:01:58 -04:00
86224d042d feat: Workshop Phase 4 — visitor chat via WebSocket bark engine (#394)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 01:54:06 -04:00
f9d8509c15 fix: send world state snapshot on WS client connect (#390)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 01:28:57 -04:00
da43421d4e feat: broadcast Timmy state changes via WS relay (#380)
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-19 00:25:11 -04:00
19e7e61c92 [loop-cycle] refactor: DRY PRESENCE_FILE — single source of truth in workshop_state (#381) (#382) 2026-03-18 22:33:06 -04:00
3108971bd5 [loop-cycle-155] feat: GET /api/world/state — Workshop bootstrap endpoint (#373) (#378) 2026-03-18 22:13:49 -04:00
b4cb3e9975 [loop-cycle-54] refactor: consolidate three memory stores into single table (#37) (#223) 2026-03-15 13:33:24 -04:00
4a68f6cb8b [loop-cycle-53] refactor: break circular imports between packages (#164) (#193) 2026-03-15 12:52:18 -04:00
466ad08d7d [loop-cycle-34] fix: mock Ollama model resolution in create_timmy tests (#121) (#126) 2026-03-15 08:20:00 -04:00
b3809f5246 feat: add JSON status endpoints for briefing, memory, swarm (#49, #50) 2026-03-14 19:23:32 -04:00
79edfd1106 feat: persist chat history in SQLite — survives server restarts
Replace in-memory MessageLog with SQLite-backed implementation.
Same API surface (append/all/clear/len) so zero caller changes needed.

- data/chat.db stores messages with role, content, timestamp, source
- Lazy DB connection (opened on first use, not at import time)
- Retention policy: oldest messages pruned when count > 500
- New .recent(limit) method for efficient last-N queries
- Thread-safe with explicit locking
- WAL mode for concurrent read performance
- Test isolation: conftest redirects DB to tmp_path per test
- 8 new tests: persistence, retention, concurrency, source field
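
A rough sketch of how such a SQLite-backed log could be built with the standard library. The table name, column types, default `source` value, and the `max_messages` constructor parameter are assumptions; the method names, the 500-message limit, lazy connection, locking, and WAL mode follow the list above.

```python
import sqlite3
import threading
from datetime import datetime, timezone

_COLUMNS = "role, content, timestamp, source"


class MessageLog:
    def __init__(self, path="data/chat.db", max_messages=500):
        self._path = path
        self._max = max_messages
        self._conn = None  # opened lazily on first use
        self._lock = threading.Lock()

    def _db(self):
        # Caller must hold self._lock.
        if self._conn is None:
            self._conn = sqlite3.connect(self._path, check_same_thread=False)
            self._conn.execute("PRAGMA journal_mode=WAL")
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS messages ("
                "id INTEGER PRIMARY KEY AUTOINCREMENT, role TEXT NOT NULL, "
                "content TEXT NOT NULL, timestamp TEXT NOT NULL, source TEXT NOT NULL)"
            )
        return self._conn

    def append(self, role, content, source="browser"):
        ts = datetime.now(timezone.utc).isoformat()
        with self._lock:
            db = self._db()
            db.execute(
                f"INSERT INTO messages ({_COLUMNS}) VALUES (?, ?, ?, ?)",
                (role, content, ts, source),
            )
            # Retention: keep only the newest `max_messages` rows.
            db.execute(
                "DELETE FROM messages WHERE id NOT IN "
                "(SELECT id FROM messages ORDER BY id DESC LIMIT ?)",
                (self._max,),
            )
            db.commit()

    def recent(self, limit):
        """Return the last `limit` messages, oldest first."""
        with self._lock:
            rows = self._db().execute(
                f"SELECT {_COLUMNS} FROM messages ORDER BY id DESC LIMIT ?", (limit,)
            ).fetchall()
        return rows[::-1]

    def all(self):
        with self._lock:
            return self._db().execute(
                f"SELECT {_COLUMNS} FROM messages ORDER BY id"
            ).fetchall()

    def clear(self):
        with self._lock:
            db = self._db()
            db.execute("DELETE FROM messages")
            db.commit()

    def __len__(self):
        with self._lock:
            return self._db().execute("SELECT COUNT(*) FROM messages").fetchone()[0]
```

Pointing `path` at a per-test temporary directory gives the kind of isolation the conftest change describes.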

Closes #46
2026-03-14 16:09:26 -04:00
2f623826bd cleanup: delete dead modules — ~7,900 lines removed
Closes #22, Closes #23

Deleted: brain/, swarm/, openfang/, paperclip/, cascade_adapter,
memory_migrate, agents/timmy.py, dead routes + all corresponding tests.

Updated pyproject.toml, app.py, loop_qa.py for removed imports.
2026-03-14 09:49:24 -04:00
Trip T
78167675f2 feat: replace custom Gitea client with MCP servers
Replace the bespoke GiteaHand httpx client and tools_gitea.py wrappers
with official MCP tool servers (gitea-mcp + filesystem MCP), wired into
Agno via MCPTools. Switch all session functions to async (arun/acontinue_run)
so MCP tools auto-connect. Delete ~1070 lines of custom Gitea code.

- Create src/timmy/mcp_tools.py with MCP factories + standalone issue bridge
- Wire MCPTools into agent.py tool list (Gitea + filesystem)
- Switch session.py chat/chat_with_tools/continue_chat to async
- Update all callers (dashboard routes, Discord vendor, CLI, thinking engine)
- Add gitea_token fallback from ~/.config/gitea/token
- Add MCP session cleanup to app shutdown hook
- Update tool_safety.py for MCP tool names
- 11 new tests, all 1417 passing, coverage 74.2%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:40:32 -04:00
Trip T
f6a6c0f62e feat: upgrade to qwen3.5, self-hosted Gitea CI, optimize Docker image
Model upgrade:
- qwen2.5:14b → qwen3.5:latest across config, tools, and docs
- Added qwen3.5 to multimodal model registry

Self-hosted Gitea CI:
- .gitea/workflows/tests.yml: lint + test jobs via act_runner
- Unified Dockerfile: pre-baked deps from poetry.lock for fast CI
- sitepackages=true in tox for ~2s dep resolution (was ~40s)
- OLLAMA_URL set to dead port in CI to prevent real LLM calls

Test isolation fixes:
- Smoke test fixture mocks create_timmy (was hitting real Ollama)
- WebSocket sends initial_state before joining broadcast pool (race fix)
- Tests use settings.ollama_model/url instead of hardcoded values
- skip_ci marker for Ollama-dependent tests, excluded in CI tox envs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:36:42 -04:00
Alexander Whitestone
36fc10097f Claude/angry cerf (#173)
* feat: set qwen3.5:latest as default model

- Make qwen3.5:latest the primary default model for faster inference
- Move llama3.1:8b-instruct to fallback chain
- Update text fallback chain to prioritize qwen3.5:latest

Retains full backward compatibility via cascade fallback.

* test: remove ~55 brittle, duplicate, and useless tests

Audit of all 100 test files identified tests that provided no real
regression protection. Removed:

- 4 files deleted entirely: test_setup_script (always skipped),
  test_csrf_bypass (tautological assertions), test_input_validation
  (accepts 200-500 status codes), test_security_regression (fragile
  source-pattern checks redundant with rendering tests)
- Duplicate test classes (TestToolTracking, TestCalculatorExtended)
- Mock-only tests that just verify mock wiring, not behavior
- Structurally broken tests (TestCreateToolFunctions patches after import)
- Empty/pass-body tests and meaningless assertions (len > 20)
- Flaky subprocess tests (aider tool calling real binary)

All 1328 remaining tests pass. Net: -699 lines, zero coverage loss.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent test pollution from autoresearch_enabled mutation

test_autoresearch_perplexity.py was setting settings.autoresearch_enabled = True
but never restoring it in the finally block — polluting subsequent tests.
When pytest-randomly ordered it before test_experiments_page_shows_disabled_when_off,
the victim test saw enabled=True and failed to find "Disabled" in the page.

Fix both sides:
- Restore autoresearch_enabled in the finally block (root cause)
- Mock settings explicitly in the victim test (defense in depth)
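
Both sides of the fix can be illustrated with the standard library alone. This is a sketch, not the project's test code: `settings` here is a `SimpleNamespace` stand-in, and the two helper functions are hypothetical.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Stand-in for the real settings object (assumption).
settings = SimpleNamespace(autoresearch_enabled=False)


def run_with_autoresearch_enabled(body):
    """Root-cause side: enable the flag for `body`, then always restore it."""
    original = settings.autoresearch_enabled
    settings.autoresearch_enabled = True
    try:
        return body()
    finally:
        settings.autoresearch_enabled = original


def page_text_with_flag_pinned_off():
    """Victim side: pin the flag explicitly instead of trusting global state."""
    with patch.object(settings, "autoresearch_enabled", False):
        return "Enabled" if settings.autoresearch_enabled else "Disabled"
```

With the restore in place, test order no longer matters; the explicit patch keeps the victim test stable even if another test leaks the flag again.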

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:55:27 -04:00
Alexander Whitestone
9d78eb31d1 ruff (#169)
* polish: streamline nav, extract inline styles, improve tablet UX

- Restructure desktop nav from 8+ flat links + overflow dropdown into
  5 grouped dropdowns (Core, Agents, Intel, System, More) matching
  the mobile menu structure to reduce decision fatigue
- Extract all inline styles from mission_control.html and base.html
  notification elements into mission-control.css with semantic classes
- Replace JS-built innerHTML with secure DOM construction in
  notification loader and chat history
- Add CONNECTING state to connection indicator (amber) instead of
  showing OFFLINE before WebSocket connects
- Add tablet breakpoint (1024px) with larger touch targets for
  Apple Pencil / stylus use and safe-area padding for iPad toolbar
- Add active-link highlighting in desktop dropdown menus
- Rename "Mission Control" page title to "System Overview" to
  disambiguate from the chat home page
- Add "Home — Timmy Time" page title to index.html

https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h

* fix(security): move auth-gate credentials to environment variables

Hardcoded username, password, and HMAC secret in auth-gate.py replaced
with os.environ lookups. Startup now refuses to run if any variable is
unset. Added AUTH_GATE_SECRET/USER/PASS to .env.example.
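
A fail-fast loader along these lines might look as follows. The variable names `AUTH_GATE_USER` and `AUTH_GATE_PASS` are inferred from the shorthand above, `load_auth_config` is a hypothetical helper, and treating an empty value as unset is an assumption.

```python
import os

# Names inferred from "AUTH_GATE_SECRET/USER/PASS" (assumption).
REQUIRED_VARS = ("AUTH_GATE_SECRET", "AUTH_GATE_USER", "AUTH_GATE_PASS")


def load_auth_config(environ=os.environ):
    """Return the credentials, or exit if any required variable is missing."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise SystemExit("auth-gate: missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```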

https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h

* refactor(tooling): migrate from black+isort+bandit to ruff

Replace three separate linting/formatting tools with a single ruff
invocation. Updates tox.ini (lint, format, pre-push, pre-commit envs),
.pre-commit-config.yaml, and CI workflow. Fixes all ruff errors
including unused imports, missing raise-from, and undefined names.
Ruff config maps existing bandit skips to equivalent S-rules.

https://claude.ai/code/session_015uPUoKyYa8M2UAcyk5Gt6h

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 12:23:35 -04:00
Alexander Whitestone
622a6a9204 polish: extract inline CSS, add connection status, panel macro, favicon, ollama cache, toast system (#164)
Major:
- Extract all inline <style> blocks from 22 Jinja2 templates into
  static/css/mission-control.css — single cacheable stylesheet
- Add tox lint check that fails on inline <style> in templates

Minor:
1. Connection status indicator in topbar (green/amber/red dot) reflecting
   WebSocket + Ollama reachability, with auto-reconnect
2. Jinja2 {% macro panel(title) %} in macros.html — eliminates repeated
   .card.mc-panel markup; index.html converted as example
3. SVG favicon (purple T + orange dot)
4. 30-second TTL cache on _check_ollama() to avoid blocking the event loop
   on every health poll (asyncio.to_thread was already in place)
5. Toast notification system (McToast.show) for transient status messages —
   wired into connection status for Ollama/WebSocket state changes
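
The TTL cache in item 4 can be sketched as a small wrapper. `TTLCache` and its injectable `clock` are illustrative names, not the project's code; only the 30-second lifetime comes from the text above.

```python
import time


class TTLCache:
    """Cache a single computed value for `ttl_seconds`."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._value = None
        self._expires_at = None

    def get(self, compute):
        """Return the cached value, calling `compute()` only when it has expired."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = compute()
            self._expires_at = now + self._ttl
        return self._value
```

A health poll would then call something like `cache.get(probe)` so that repeated polls within the window reuse the last reachability result instead of probing Ollama again.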

Enforcement:
- CLAUDE.md updated with conventions 11-14 (no inline CSS, use panel macro,
  use toasts, never block the event loop)
- tox lint + pre-push environments now fail on inline <style> blocks

https://claude.ai/code/session_014FQ785MQdyJQ4BAXrRSo9w

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-11 09:52:57 -04:00
Alexander Whitestone
2a5f317a12 fix: implement @csrf_exempt decorator support in CSRFMiddleware (#159) 2026-03-10 15:26:40 -04:00
Alexander Whitestone
904a7c564e feat: migrate to Agno native HITL tool confirmation flow (#158)
Replace the homebrew regex-based tool extraction and manual dispatch
(tool_executor.py) with Agno's built-in Human-In-The-Loop confirmation:

- Toolkit(requires_confirmation_tools=...) marks dangerous tools
- agent.run() returns RunOutput with status=paused when confirmation needed
- RunRequirement.confirm()/reject() + agent.continue_run() resumes execution

Dashboard and Discord vendor both use the native flow. DuckDuckGo import
isolated so its absence doesn't kill all tools. Test stubs cleaned up
(agno is a real dependency, only truly optional packages stubbed).

1384 tests pass in parallel (~14s).

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 21:54:04 -04:00
Alexander Whitestone
fe484ad7b6 Fix input validation for chat and memory routes (#155) 2026-03-09 09:36:16 -04:00
Alexander Whitestone
8dbce25183 fix: handle concurrent table creation race in SQLite (#151) 2026-03-08 13:27:11 -04:00
Alexander Whitestone
ae3bb1cc21 feat: code quality audit + autoresearch integration + infra hardening (#150) 2026-03-08 12:50:44 -04:00