Commit Graph

482 Commits

Author SHA1 Message Date
db8cf5e13b Merge branch 'main' into kimi/issue-347
All checks were successful
Tests / lint (pull_request) Successful in 5s
Tests / test (pull_request) Successful in 1m31s
2026-03-18 21:07:49 -04:00
kimi
d6cadd0f58 test: add comprehensive unit tests for agents/loader.py
All checks were successful
Tests / lint (pull_request) Successful in 5s
Tests / test (pull_request) Successful in 1m14s
Fixes #347

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 21:02:08 -04:00
dff07c6529 [loop-cycle-149] feat: Workshop config inventory generator (#320) (#348)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m10s
2026-03-18 20:58:27 -04:00
11357ffdb4 test: add comprehensive unit tests for agentic_loop.py (#345)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m16s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 20:54:02 -04:00
fcbb2b848b test: add unit tests for jot_note and log_decision artifact tools (#341)
All checks were successful
Tests / lint (push) Successful in 5s
Tests / test (push) Successful in 1m41s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 20:47:38 -04:00
6621f4bd31 [loop-cycle-147] refactor: expand .gitignore to cover junk files (#336) (#339)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m21s
2026-03-18 20:37:13 -04:00
243b1a656f feat: give Timmy hands — artifact tools for conversation (#337)
Some checks failed
Tests / lint (push) Successful in 7s
Tests / test (push) Has been cancelled
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 20:36:38 -04:00
22e0d2d4b3 [loop-cycle-66] fix: replace language-model with inference-backend in error messages (#334)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m19s
2026-03-18 20:27:06 -04:00
bcc7b068a4 [loop-cycle-66] fix: remove language-model self-reference and add anti-assistant-speak guidance (#323) (#333)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m30s
2026-03-18 20:21:03 -04:00
bfd924fe74 [loop-cycle-65] feat: scaffold three-phase loop skeleton (#324) (#330)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m19s
2026-03-18 20:11:02 -04:00
844923b16b [loop-cycle-65] fix: validate file paths before filing thinking-engine issues (#327) (#329)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m9s
2026-03-18 20:07:19 -04:00
8ef0ad1778 fix: pause thought counter during idle periods (#319)
All checks were successful
Tests / lint (push) Successful in 6s
Tests / test (push) Successful in 1m7s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 19:12:14 -04:00
9a21a4b0ff feat: SensoryEvent model + SensoryBus dispatcher (#318)
All checks were successful
Tests / lint (push) Successful in 7s
Tests / test (push) Successful in 1m2s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 19:02:12 -04:00
ab71c71036 feat: time adapter — circadian awareness for Timmy (#315)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 57s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 18:47:09 -04:00
39939270b7 fix: Gitea webhook adapter — normalize events to sensory bus (#309)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m1s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 18:37:01 -04:00
0ab1ee9378 fix: proactive memory status check during thought tracking (#313)
Some checks failed
Tests / test (push) Has been cancelled
Tests / lint (push) Has been cancelled
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 18:36:59 -04:00
234187c091 fix: add periodic memory status checks during thought tracking (#311)
All checks were successful
Tests / lint (push) Successful in 5s
Tests / test (push) Successful in 1m0s
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-18 18:26:53 -04:00
f4106452d2 feat: implement v1 API endpoints for iPad app (#312)
All checks were successful
Tests / lint (push) Successful in 7s
Tests / test (push) Successful in 1m12s
Co-authored-by: manus <manus@timmy.local>
Co-committed-by: manus <manus@timmy.local>
2026-03-18 18:20:14 -04:00
f5a570c56d fix: add real-time data disclaimer to welcome message (#304)
All checks were successful
Tests / lint (push) Successful in 13s
Tests / test (push) Successful in 1m15s
2026-03-18 16:56:21 -04:00
rockachopa
96e7961a0e fix: make confidence visible to users when below 0.7 threshold (#259)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 52s
Co-authored-by: rockachopa <alexpaynex@gmail.com>
Co-committed-by: rockachopa <alexpaynex@gmail.com>
2026-03-15 19:36:52 -04:00
bcbdc7d7cb feat: add thought_search tool for querying Timmy's thinking history (#260)
Some checks failed
Tests / lint (push) Successful in 4s
Tests / test (push) Has been cancelled
Co-authored-by: Kimi Agent <kimi@timmy.local>
Co-committed-by: Kimi Agent <kimi@timmy.local>
2026-03-15 19:35:58 -04:00
80aba0bf6d [loop-cycle-63] feat: session_history tool — Timmy searches past conversations (#251) (#258)
Some checks failed
Tests / lint (push) Failing after 3s
Tests / test (push) Has been skipped
2026-03-15 15:11:43 -04:00
dd34dc064f [loop-cycle-62] fix: MEMORY.md corruption and hot memory staleness (#252) (#256)
Some checks failed
Tests / lint (push) Failing after 2s
Tests / test (push) Has been skipped
2026-03-15 15:01:19 -04:00
7bc355eed6 [loop-cycle-61] fix: strip think tags and harden fact parsing (#237) (#254)
Some checks failed
Tests / lint (push) Failing after 3s
Tests / test (push) Has been skipped
2026-03-15 14:50:09 -04:00
f9911c002c [loop-cycle-60] fix: retry with backoff on Ollama GPU contention (#70) (#238)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 54s
2026-03-15 14:28:47 -04:00
7f656fcf22 [loop-cycle-59] feat: gematria computation tool (#234) (#235)
Some checks failed
Tests / lint (push) Failing after 2s
Tests / test (push) Has been skipped
2026-03-15 14:14:38 -04:00
8c63dabd9d [loop-cycle-57] fix: wire confidence estimation into chat flow (#231) (#232)
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 49s
2026-03-15 13:58:35 -04:00
a50af74ea2 [loop-cycle-56] fix: resolve 5 lint errors on main (#203) (#224)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 58s
2026-03-15 13:40:40 -04:00
b4cb3e9975 [loop-cycle-54] refactor: consolidate three memory stores into single table (#37) (#223)
Some checks failed
Tests / lint (push) Failing after 2s
Tests / test (push) Has been skipped
2026-03-15 13:33:24 -04:00
4a68f6cb8b [loop-cycle-53] refactor: break circular imports between packages (#164) (#193)
Some checks failed
Tests / lint (push) Failing after 3s
Tests / test (push) Has been skipped
2026-03-15 12:52:18 -04:00
b3840238cb [loop-cycle-52] feat: response audit trail with inputs, confidence, errors (#144) (#191)
Some checks failed
Tests / lint (push) Failing after 3s
Tests / test (push) Has been skipped
2026-03-15 12:34:48 -04:00
96c7e6deae [loop-cycle-52] fix: remove all qwen3.5 references (#182) (#190)
Some checks failed
Tests / lint (push) Has been cancelled
Tests / test (push) Has been cancelled
2026-03-15 12:34:21 -04:00
efef0cd7a2 fix: exclude backfilled data from success rate calculations (#189)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m3s
Backfilled retro entries lack main_green/hermes_clean fields (survivorship bias). Now rates are computed only from measured entries. LOOPSTAT shows "no data yet" instead of fake 100%.

Co-authored-by: Kimi Agent <kimi@timmy.local>
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/189
Co-authored-by: hermes <hermes@timmy.local>
Co-committed-by: hermes <hermes@timmy.local>
2026-03-15 12:29:27 -04:00
766add6415 [loop-cycle-52] test: comprehensive session_logger.py coverage (#175) (#187)
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Has been cancelled
2026-03-15 12:26:50 -04:00
56b08658b7 feat: workspace isolation + honest success metrics (#186)
Some checks failed
Tests / lint (push) Successful in 3s
Tests / test (push) Has been cancelled
## Workspace Isolation

No agent touches ~/Timmy-Time-dashboard anymore. Each agent gets a fully isolated clone under /tmp/timmy-agents/ with its own port, data directory, and TIMMY_HOME.

- scripts/agent_workspace.sh: init, reset, branch, destroy per agent
- Loop prompt updated: workspace paths replace worktree paths
- Smoke tests run in isolated /tmp/timmy-agents/smoke/repo

## Honest Success Metrics

Cycle success now requires BOTH hermes clean exit AND main green (smoke test passes). Tracks main_green_rate separately from hermes_clean_rate in summary.json.

Follows from PR #162 (triage + retro system).

Co-authored-by: Kimi Agent <kimi@timmy.local>
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/186
Co-authored-by: hermes <hermes@timmy.local>
Co-committed-by: hermes <hermes@timmy.local>
2026-03-15 12:25:27 -04:00
f6d74b9f1d [loop-cycle-51] refactor: remove dead code from memory_system.py (#173) (#185)
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 1m8s
2026-03-15 12:18:11 -04:00
e8dd065ad7 [loop-cycle-51] perf: mock subprocess in slow introspection test (#172) (#184)
Some checks failed
Tests / lint (push) Has been cancelled
Tests / test (push) Has been cancelled
2026-03-15 12:17:50 -04:00
5b57bf3dd0 [loop-cycle-50] fix: agent retry uses exponential backoff instead of fixed 1s delay (#174) (#181)
All checks were successful
Tests / lint (push) Successful in 6s
Tests / test (push) Successful in 1m20s
2026-03-15 12:08:30 -04:00
bcd6d7e321 [loop-cycle-50] refactor: replace bare sqlite3.connect() with context managers batch 2 (#157) (#180)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m55s
2026-03-15 11:58:43 -04:00
bea2749158 [loop-cycle-49] refactor: narrow broad except Exception catches — batch 1 (#158) (#178)
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 1m42s
2026-03-15 11:48:54 -04:00
ca01ce62ad [loop-cycle-49] fix: mock _warmup_model in agent tests to prevent Ollama network calls (#159) (#177)
Some checks failed
Tests / lint (push) Successful in 5s
Tests / test (push) Has been cancelled
2026-03-15 11:46:20 -04:00
b960096331 feat: triage scoring, cycle retros, deep triage, and LOOPSTAT panel (#162)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 59s
2026-03-15 11:24:01 -04:00
204a6ed4e5 refactor: decompose _maybe_distill() into focused helpers (#151) (#160)
Some checks failed
Tests / lint (push) Has been cancelled
Tests / test (push) Has been cancelled
2026-03-15 11:23:45 -04:00
f15ad3375a [loop-cycle-47] feat: add confidence signaling module (#143) (#161)
All checks were successful
Tests / lint (push) Successful in 13s
Tests / test (push) Successful in 1m2s
2026-03-15 11:20:30 -04:00
5aea8be223 [loop-cycle-47] refactor: replace bare sqlite3.connect() with context managers (#148) (#155)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m4s
2026-03-15 11:05:39 -04:00
717dba9816 [loop-cycle-46] refactor: break up oversized functions in tools.py (#151) (#154)
All checks were successful
Tests / lint (push) Successful in 3s
Tests / test (push) Successful in 1m20s
2026-03-15 10:56:33 -04:00
466db7aed2 [loop-cycle-44] refactor: remove dead code batch 2 — agent_core + test_agent_core (#147) (#150)
All checks were successful
Tests / lint (push) Successful in 2s
Tests / test (push) Successful in 1m27s
2026-03-15 10:22:41 -04:00
d2c51763d0 [loop-cycle-43] refactor: remove 1035 lines of dead code (#136) (#146)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m4s
2026-03-15 10:10:12 -04:00
16b31b30cb fix: shell hand returncode bug, delete worthless python-exec test (#140)
All checks were successful
Tests / lint (push) Successful in 4s
Tests / test (push) Successful in 1m10s
- Fixed `proc.returncode or 0` bug that masked non-zero exit codes
- Deleted test_run_python_expression — Timmy does not run python, test was environment-dependent garbage
- Fixed test_run_nonzero_exit to use `ls` on nonexistent path instead of sys.executable

1515 passed, 76.7% coverage.

Co-authored-by: Kimi Agent <kimi@timmy.local>
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/140
Co-authored-by: hermes <hermes@timmy.local>
Co-committed-by: hermes <hermes@timmy.local>
2026-03-15 09:56:50 -04:00
48c8efb2fb [loop-cycle-40] fix: use get_system_prompt() in cloud backends (#135) (#138)
Some checks failed
Tests / lint (push) Successful in 2s
Tests / test (push) Failing after 1m10s
## What

Cloud backends (Grok, Claude, AirLLM) were importing SYSTEM_PROMPT directly, which is always SYSTEM_PROMPT_LITE and contains unformatted {model_name} and {session_id} placeholders.

## Changes

- backends.py: Replace `from timmy.prompts import SYSTEM_PROMPT` with `from timmy.prompts import get_system_prompt`
- AirLLM: uses `get_system_prompt(tools_enabled=False, session_id="airllm")` (LITE tier, correct)
- Grok: uses `get_system_prompt(tools_enabled=True, session_id="grok")` (FULL tier)
- Claude: uses `get_system_prompt(tools_enabled=True, session_id="claude")` (FULL tier)
- 9 new tests verify formatted model names, correct tier selection, and session_id formatting

## Tests

1508 passed, 0 failed (41 new tests this cycle)

Fixes #135

Co-authored-by: Kimi Agent <kimi@timmy.local>
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/138
Reviewed-by: rockachopa <alexpaynex@gmail.com>
Co-authored-by: hermes <hermes@timmy.local>
Co-committed-by: hermes <hermes@timmy.local>
2026-03-15 09:44:43 -04:00