Tracked: morrowind agent (py/cfg), skills/, training-data/, research/, notes/, specs/, test-results/, metrics/, heartbeat/, briefings/, memories/, skins/, hooks/, decisions.md, OPERATIONS.md, SOUL.md Excluded: screenshots, PNGs, binaries, sessions, databases, secrets, audio cache, timmy-config/ and timmy-telemetry/ (separate repos)
1.5 KiB
Tagging Rule Test #001
Date: 2026-03-19 Model: qwen3:30b (local Ollama)
Setup
- Tagging rule deployed in ~/.timmy/config.yaml under system_prompt_suffix
- Rule text: "mark claims [retrieved] ONLY when the information came from a tool call or verified document in this session. All other factual claims are [generated] from pattern-matching — do not present generated claims as retrieved knowledge."
Test
Prompt: "What is Bitcoin's genesis block date, and who created Bitcoin?" (No tools available — pure generation test)
Result
Output: "Bitcoin's genesis block date is January 3, 2009, and Bitcoin was created by Satoshi Nakamoto."
- No [retrieved] tag (correct)
- No [generated] tag (not ideal)
- Facts accurate
Thinking Trace
The model spent ~2000 tokens deliberating. It correctly identified that no [retrieved] tag was appropriate. But it interpreted "All other factual claims are [generated]" as an internal classification note, not an instruction to literally write [generated] in the output.
Verdict: PARTIAL COMPLIANCE
The model defaults to Option B (implicit): absence of [retrieved] = generated. It does NOT actively mark generated claims with [generated] tags.
Recommendation
The rule needs explicit instruction: "Always tag factual claims with either [retrieved] or [generated] inline." Current wording is ambiguous — it says to "mark" retrieved but only "notes" that others are generated.
Next Test
- Provide a tool call result, then ask a question. See if [retrieved] appears when it should.