From affc4e9a8fed24f75d643ea26bf8f28bac7fd988 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Sun, 1 Feb 2026 02:05:03 -0800
Subject: [PATCH] Update TODO.md

---
 TODO.md | 27 +++++----------------------
 1 file changed, 5 insertions(+), 22 deletions(-)

diff --git a/TODO.md b/TODO.md
index 9ffb7d1d2..3f4d750b7 100644
--- a/TODO.md
+++ b/TODO.md
@@ -141,7 +141,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 2. Self-Reflection & Course Correction 🔄
+## 3. Self-Reflection & Course Correction 🔄
 
 **Problem:** Current retry logic handles malformed outputs but not semantic failures. Agent doesn't reason about *why* something failed.
 
@@ -166,7 +166,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 3. Tool Composition & Learning 🔧
+## 4. Tool Composition & Learning 🔧
 
 **Problem:** Tools are atomic. Complex tasks require repeated manual orchestration of the same tool sequences.
 
@@ -197,7 +197,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 4. Dynamic Skills Expansion 📚
+## 5. Dynamic Skills Expansion 📚
 
 **Problem:** Skills system is elegant but static. Skills must be manually created and added.
 
@@ -226,7 +226,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 5. Task Continuation Hints 🎯
+## 6. Task Continuation Hints 🎯
 
 **Problem:** Could be more helpful by suggesting logical next steps.
 
@@ -240,7 +240,7 @@ These items need to be addressed ASAP:
 
 ---
 
-## 6. Interactive Clarifying Questions Tool ❓
+## 7. Interactive Clarifying Questions Tool ❓
 
 **Problem:** Agent sometimes makes assumptions or guesses when it should ask the user. Currently can only ask via text, which gets lost in long outputs.
 
@@ -276,23 +276,6 @@ These items need to be addressed ASAP:
 
 ---
 
-## 7. Uncertainty & Honesty Calibration 🎚️
-
-**Problem:** Sometimes confidently wrong. Should be better calibrated about what I know vs. don't know.
-
-**Ideas:**
-- [ ] **Source attribution** - Track where information came from:
-  - "According to the docs I just fetched..." vs "From my training data (may be outdated)..."
-  - Let user assess reliability themselves
-
-- [ ] **Cross-reference high-stakes claims** - Self-check for made-up details:
-  - When stakes are high, verify with tools before presenting as fact
-  - "Let me verify that before you act on it..."
-
-**Files to modify:** `run_agent.py`, response generation logic
-
----
-
 ## 8. Resource Awareness & Efficiency 💰
 
 **Problem:** No awareness of costs, time, or resource usage. Could be smarter about efficiency.