- Fixed `proc.returncode or 0` bug that masked non-zero exit codes
- Deleted test_run_python_expression — Timmy does not run python, test was environment-dependent garbage
- Fixed test_run_nonzero_exit to use `ls` on nonexistent path instead of sys.executable
1515 passed, 76.7% coverage.
Co-authored-by: Kimi Agent <kimi@timmy.local>
Reviewed-on: http://localhost:3000/rockachopa/Timmy-time-dashboard/pulls/140
Co-authored-by: hermes <hermes@timmy.local>
Co-committed-by: hermes <hermes@timmy.local>
- Add src/timmy/workspace.py: WorkspaceMonitor tracks correspondence.md
line count and inbox file list via data/workspace_state.json
- Wire workspace checks into _gather_system_snapshot() so Timmy sees
new workspace activity in his thinking context
- Add 'workspace' seed type for workspace-triggered reflections
- Add _check_workspace() post-hook to mark items as seen after processing
- 16 tests covering detection, mark_seen, persistence, edge cases
1. gitea-mcp → gitea-mcp-server (brew binary name). Fixes Timmy's
Gitea triage — MCP server can now be found on PATH.
2. Mark test_returns_dict_with_expected_keys as @pytest.mark.slow —
it runs pytest recursively and always exceeds the 30s timeout.
3. Fix ruff F841 lint in test_cli.py (unused result= variable).
Branch protection enabled on Gitea: direct push to main now rejected.
AGENTS.md updated with Merge Policy section documenting the workflow.
Also fixes bbbbdcd breakage: restores result= in repl test functions
which were dropped by Kimi's 'remove unused variable' commit.
RCA: Kimi Agent pushed directly to main without running tests.
Three optimizations to the agentic loop:
1. Cache loop agent as singleton (avoid repeated warmups)
2. Sliding window for step context (last 2 results, not all)
3. Replace summary LLM call with deterministic summary
Saves 1 full LLM inference call per agentic loop invocation
(30-60s on local models) and reduces context window pressure.
Also fixes pre-existing test_cli.py repl test bugs (missing result= assignment).
Timmy can now delegate coding tasks to Kimi CLI (262K context).
Includes timeout handling, workdir validation, output truncation.
Sovereign division of labor — Timmy plans, Kimi codes.
Timmy can now run his own test suite via the run_self_tests() tool.
Supports 'fast' (unit only), 'full', or specific path scopes.
Returns structured results with pass/fail counts.
Sovereign self-verification — a fundamental capability.
- Replace flat file list with layered architecture map (config→agent→prompt→tool→memory→interface)
- Add SELF-MODIFICATION section: Timmy knows he can edit his own config and code
- Remove false limitation 'cannot modify own source code'
- Update tests to match new section headers, add self-modification tests
Closes#81 (reasoning depth)
Closes#86 (self-modification awareness)
[loop-cycle-11]
_maybe_consolidate() now checks get_memories(subject=agent_id)
before storing. Skips if a memory of the same type (pattern/anomaly)
was created within the last hour. Prevents duplicate consolidation
entries on repeated task completion/failure events.
Also restructured branching: neutral success rates (0.3-0.8) now
return early instead of falling through.
9 new tests. 1465 total passing.
Adds SELF-KNOWLEDGE section to both SYSTEM_PROMPT_LITE and
SYSTEM_PROMPT_FULL with:
- Codebase map (all src/timmy/ modules with descriptions)
- Current capabilities list (grounded, not generic)
- Known limitations (real gaps, not LLM platitudes)
Lite prompt gets condensed version; full prompt gets detailed.
Timmy can now answer 'what does tool_safety.py do?' and give
grounded answers about his actual limitations.
10 new tests. 1456 total passing.