From affc4e9a8fed24f75d643ea26bf8f28bac7fd988 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Sun, 1 Feb 2026 02:05:03 -0800
Subject: [PATCH] Update TODO.md

---
 TODO.md | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/TODO.md b/TODO.md
index 9ffb7d1d2..3f4d750b7 100644
--- a/TODO.md
+++ b/TODO.md
@@ -141,7 +141,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 2. Self-Reflection & Course Correction 🔄
+## 3. Self-Reflection & Course Correction 🔄
 
 **Problem:** Current retry logic handles malformed outputs but not semantic failures. Agent doesn't reason about *why* something failed.
 
@@ -166,7 +166,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 3. Tool Composition & Learning 🔧
+## 4. Tool Composition & Learning 🔧
 
 **Problem:** Tools are atomic. Complex tasks require repeated manual orchestration of the same tool sequences.
 
@@ -197,7 +197,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 4. Dynamic Skills Expansion 📚
+## 5. Dynamic Skills Expansion 📚
 
 **Problem:** Skills system is elegant but static. Skills must be manually created and added.
 
@@ -226,7 +226,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 5. Task Continuation Hints 🎯
+## 6. Task Continuation Hints 🎯
 
 **Problem:** Could be more helpful by suggesting logical next steps.
 
@@ -240,7 +240,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 6. Interactive Clarifying Questions Tool ❓
+## 7. Interactive Clarifying Questions Tool ❓
 
 **Problem:** Agent sometimes makes assumptions or guesses when it should ask the user. Currently can only ask via text, which gets lost in long outputs.
 
@@ -276,23 +276,6 @@ These items need to be addressed ASAP:
 
 ---
 
-## 7. Uncertainty & Honesty Calibration 🎚️
-
-**Problem:** Sometimes confidently wrong. Should be better calibrated about what I know vs. don't know.
-
-**Ideas:**
-- [ ] **Source attribution** - Track where information came from:
-  - "According to the docs I just fetched..." vs "From my training data (may be outdated)..."
-  - Let user assess reliability themselves
-
-- [ ] **Cross-reference high-stakes claims** - Self-check for made-up details:
-  - When stakes are high, verify with tools before presenting as fact
-  - "Let me verify that before you act on it..."
-
-**Files to modify:** `run_agent.py`, response generation logic
-
----
-
 ## 8. Resource Awareness & Efficiency 💰
 
 **Problem:** No awareness of costs, time, or resource usage. Could be smarter about efficiency.