# Next Cycle Priorities

**Date:** 2026-03-24
**Context:** Repository-based development suspended, focus on local Timmy implementation

## Immediate Actions (Next Cycle)

### 1. FIX SOURCE DISTINCTION BUG [HIGH PRIORITY]
- **Problem:** Models tag training data as [retrieved] when confident, [generated] when uncertain
- **Root Cause:** Conflation of epistemic confidence with data source
- **Solution:** Rewrite rule to explicitly check for tool-call sources vs training data
- **Test Plan:** Re-run source-distinction tests with corrected rule
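
A minimal sketch of the corrected rule, assuming each claim carries a record of where it came from. `Claim`, its `tool_call_id` field, and `tag_source` are hypothetical names, not taken from the existing rule drafts. The tag depends only on provenance; confidence is stored but never consulted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    """Hypothetical record for one statement in a model response."""
    text: str
    confidence: float                   # self-reported confidence, 0.0 to 1.0
    tool_call_id: Optional[str] = None  # set only when a tool call returned this content


def tag_source(claim: Claim) -> str:
    """Tag by provenance only; confidence is deliberately ignored."""
    if claim.tool_call_id is not None:
        return "[retrieved]"
    return "[generated]"


# A confident claim recalled from training data is still [generated].
confident_recall = Claim("Paris is the capital of France.", confidence=0.99)

# An uncertain claim that came back from a tool call is still [retrieved].
uncertain_lookup = Claim("The file has 52 lines.", confidence=0.30, tool_call_id="call_001")

print(tag_source(confident_recall), tag_source(uncertain_lookup))
```

The two example claims are the cases the current rule gets backwards, so they are a reasonable starting point for the re-run tests.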

### 2. FIX REFUSAL OVER-AGGRESSION [HIGH PRIORITY]
- **Problem:** Model refuses to answer even when context contains the answer
- **Root Cause:** Refusal rule overpowers retrieval behavior
- **Solution:** Add context-checking step before refusal trigger
- **Test Plan:** Re-run Test D from refusal-rule-test-001.md
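
One way the ordering could look, as a rough sketch. The keyword-overlap check is a naive stand-in for whatever context lookup the real rule uses, and `context_covers`, `decide`, and the `min_overlap` threshold are all hypothetical. What matters is the order: the context check runs first, and the refusal trigger only applies when it fails.

```python
import re


def _keywords(text: str) -> set:
    """Lowercased tokens of four or more characters, a crude relevance signal."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= 4}


def context_covers(question: str, context: str, min_overlap: int = 2) -> bool:
    """Naive stand-in: does the context share enough keywords with the question?"""
    return len(_keywords(question) & _keywords(context)) >= min_overlap


def decide(question: str, context: str, refusal_triggered: bool) -> str:
    """Check the context before honoring the refusal trigger."""
    if context_covers(question, context):
        return "answer_from_context"
    if refusal_triggered:
        return "refuse"
    return "answer"


question = "What port does the heartbeat service listen on?"
context = "The heartbeat service listens on port 8080."

print(decide(question, context, refusal_triggered=True))
print(decide(question, "", refusal_triggered=True))
```

With this ordering a triggered refusal can no longer override an answer that is already present in the context, which is the failure described above.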

### 3. IMPLEMENT PIPELINE PROTOTYPE [MEDIUM PRIORITY]
- **Architecture:** Generate → Tag → Filter → Deliver
- **Target:** Simple working prototype for qwen3:30b testing
- **Location:** ~/.timmy/machinery/ (new directory)
- **Goal:** Prove the architecture works before optimization
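
The four stages could be wired together roughly as follows. This is a sketch under assumptions, not the planned implementation: the `Segment` type, the function names, the line-per-segment split, and the idea that Filter drops segments by tag are all hypothetical. The model is passed in as a plain callable so the stages can be tested with a stub before qwen3:30b is involved.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set


@dataclass
class Segment:
    text: str
    tag: str = ""  # filled in by the Tag stage


def generate(prompt: str, model: Callable[[str], str]) -> List[Segment]:
    """Generate: call the model, one segment per non-empty output line."""
    return [Segment(line.strip()) for line in model(prompt).splitlines() if line.strip()]


def tag(segments: List[Segment], retrieved_texts: Set[str]) -> List[Segment]:
    """Tag: [retrieved] only if the text matches tool-call output verbatim."""
    for seg in segments:
        seg.tag = "[retrieved]" if seg.text in retrieved_texts else "[generated]"
    return segments


def filter_segments(segments: List[Segment], allowed_tags: Iterable[str]) -> List[Segment]:
    """Filter: keep only segments whose tag is allowed (assumed policy)."""
    allowed = set(allowed_tags)
    return [seg for seg in segments if seg.tag in allowed]


def deliver(segments: List[Segment]) -> str:
    """Deliver: render the surviving segments with their tags."""
    return "\n".join(f"{seg.tag} {seg.text}" for seg in segments)


def run_pipeline(prompt, model, retrieved_texts, allowed_tags=("[retrieved]", "[generated]")):
    return deliver(filter_segments(tag(generate(prompt, model), retrieved_texts), allowed_tags))


# Stub model for a dry run; swap in a real model call later.
stub_model = lambda prompt: "alpha\n\nbeta\n"
print(run_pipeline("q", stub_model, retrieved_texts={"alpha"}))
```

Keeping each stage a separate function means the corrected source-distinction and refusal rules can be dropped into Tag and Filter without touching the rest.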

### 4. LOCAL BRAIN VALIDATION [LOW PRIORITY]
- **Model:** qwen3:30b on Ollama (the intended home brain)
- **Test:** Both source distinction and refusal rules
- **Budget:** num_predict ≥1000 for thinking tokens
- **Goal:** Confirm rules work on the target local model
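
A small sketch of a validation call against a local Ollama server, assuming the default endpoint on port 11434 and the non-streaming `/api/generate` route. `build_request` and `run_prompt` are hypothetical helper names; the guard simply enforces the token budget stated above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MIN_NUM_PREDICT = 1000  # budget from the plan, leaves room for thinking tokens


def build_request(prompt: str, num_predict: int = MIN_NUM_PREDICT) -> dict:
    """Build the JSON payload; reject budgets below the planned minimum."""
    if num_predict < MIN_NUM_PREDICT:
        raise ValueError(f"num_predict must be >= {MIN_NUM_PREDICT}, got {num_predict}")
    return {
        "model": "qwen3:30b",
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a stream
        "options": {"num_predict": num_predict},
    }


def run_prompt(prompt: str, num_predict: int = MIN_NUM_PREDICT, timeout: float = 600.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    body = json.dumps(build_request(prompt, num_predict)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

`run_prompt` needs a running Ollama instance with the model pulled; `build_request` can be checked on its own without one.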

## Testing Strategy

Use existing test plans in ~/.timmy/test-results/ but with corrected implementations.
Target: all tests pass before considering production deployment.

## Success Criteria

- Source distinction correctly identifies tool-call vs training sources
- Refusal rule catches fabricated specifics but answers from context
- Pipeline prototype functional on local qwen3:30b
- Test suite shows green across all scenarios

## Files to Create/Update Next Cycle

- ~/.timmy/machinery/ (new directory for implementation)
- Updated rule drafts in ~/.timmy/test-results/
- Re-run test results with corrected logic
- Pipeline prototype implementation

---

*Sovereignty and service always.*