# Tagging Rule Test #003

Date: 2026-03-19
Model: qwen3:30b (local Ollama)
## Setup
- Third iteration of tagging rule in ~/.timmy/config.yaml
- Rule text: "Tag every factual claim inline. Default is [generated] — you are pattern-matching from training data. Only use [retrieved] when you can name the specific tool call or document from THIS conversation that provided the fact. If no tool was called, every claim is [generated]. No exceptions."
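
For reference, a minimal sketch of how this rule might sit in `~/.timmy/config.yaml`. The `rules`/`source_tagging` key names and surrounding structure are assumptions for illustration, not the actual schema:

```yaml
# ~/.timmy/config.yaml (sketch; key names are assumed, not the real schema)
rules:
  source_tagging: >
    Tag every factual claim inline. Default is [generated] — you are
    pattern-matching from training data. Only use [retrieved] when you can
    name the specific tool call or document from THIS conversation that
    provided the fact. If no tool was called, every claim is [generated].
    No exceptions.
```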
## Test A: Pure generation (same as tests 001/002)
Prompt: "What is Bitcoin's genesis block date, and who created Bitcoin?" (No tools, no context provided)
### Result
Output: "Genesis block date: January 3, 2009. Creator: Satoshi Nakamoto. [generated]"
- Tags present: YES
- Tags correct: YES — [generated] because no source in conversation
- Facts accurate: YES
Verdict: PASS
The default-to-generated framing worked. The model's reasoning: "I'll tag it as [generated] because it's from my training data, not a specific tool call in this conversation."
Minor note: the tag is appended at the end rather than per-claim inline. A formatting refinement for later.
## Test B: Mixed retrieval/generation
Prompt: User provides a quote from the Bitcoin wiki about the genesis block's embedded text, then asks about both the text and the block reward.
### Result
Output:

- Text: "The Times 03/Jan/2009 Chancellor on brink of second bailout for banks." [retrieved from user's message]
- Reward: 50 BTC [generated]

Evaluation:

- Tags present: YES
- Tags correct: YES — correctly distinguished user-provided info ([retrieved]) from training knowledge ([generated])
- Source named: YES — "from user's message"
- Facts accurate: YES
Verdict: PASS
The model correctly performed source distinction within a single response. It even named the specific source for [retrieved].
## Summary Across Three Tests
| Test | Rule Framing | Tags Present? | Tags Correct? |
|---|---|---|---|
| 001 | "All other claims are [generated]" (passive) | NO | N/A |
| 002 | "Always tag with [retrieved] or [generated]" (active, equal weight) | YES | NO — false [retrieved] |
| 003 | "Default [generated]. Only upgrade to [retrieved] with named source" (default-generated) | YES | YES |
## Key Insight
The burden-of-proof framing matters. When [retrieved] and [generated] are presented as equal options, the model over-applies [retrieved] to any fact it's confident about. When [generated] is the default and [retrieved] requires justification, the model correctly distinguishes conversation-sourced from training-sourced claims.
## Deployed Rule (current in config.yaml)

> "Tag every factual claim inline. Default is [generated] — you are pattern-matching from training data. Only use [retrieved] when you can name the specific tool call or document from THIS conversation that provided the fact. If no tool was called, every claim is [generated]. No exceptions."
Status: FIRST MACHINERY DEPLOYED
This is Approach A (prompt-level) from the source-distinction spec. It is the cheapest, least reliable approach. It works on qwen3:30b with the correct framing. It has not been tested on other models. It relies entirely on instruction-following.
## Known Limitations
- Tag placement is inconsistent (end-of-response vs per-claim)
- Not tested on smaller models
- Not tested with actual tool calls (only simulated user-provided context)
- A language model tagging its own outputs is not ground truth
- Heavy thinking overhead (~500-2000 tokens of reasoning per response)
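
The tag-placement inconsistency could be checked mechanically rather than by eye. A rough sketch of a grader that detects whether tags are present and whether they are per-claim or end-of-response; this is hypothetical tooling (not part of the current setup), and it is only a heuristic — a model can still emit a correct-looking tag on a wrong claim:

```python
import re

# Matches [generated], [retrieved], or [retrieved from <source>]
TAG_RE = re.compile(r"\[(generated|retrieved)(?:\s+from[^\]]*)?\]")

def grade_tags(response: str) -> dict:
    """Rough automated grading of a tagged response.

    Checks only that tags exist and whether every non-empty line carries
    its own tag (per-claim) versus a single trailing tag (end-of-response).
    Not ground truth: tag presence says nothing about tag correctness.
    """
    tags = TAG_RE.findall(response)
    lines = [ln for ln in response.strip().splitlines() if ln.strip()]
    tagged_lines = sum(1 for ln in lines if TAG_RE.search(ln))
    return {
        "tags_present": bool(tags),
        "n_tags": len(tags),
        "per_claim": len(lines) > 1 and tagged_lines == len(lines),
    }
```

Run against the Test A output, this reports one tag and flags the end-of-response placement (`per_claim: False`); the Test B output grades as per-claim.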
## Next Steps
- Test with actual tool calls (read_file, web_search) to verify [retrieved] works in real conditions
- Test on other models (smaller Ollama models, Claude, etc.)
- Address per-claim vs end-of-response tag placement
- Consider Approach B (two-pass) for more reliable tagging
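
One reading of Approach B is sketched below: pass 1 answers with no tagging burden, pass 2 sees only the answer plus the sources actually available in the conversation and adds tags, defaulting to [generated]. The `call_model` callable is a stand-in — the real model interface, and how Approach B is actually specified in the source-distinction spec, are not known from this log:

```python
def two_pass_tag(prompt: str, context_sources: list, call_model) -> str:
    """Sketch of a two-pass (Approach B) tagging flow.

    Pass 1: answer the prompt normally, with no tagging instruction.
    Pass 2: tag the finished answer, given only the list of sources that
    exist in this conversation, defaulting every claim to [generated].
    `call_model` is a hypothetical stand-in for the model interface.
    """
    answer = call_model(prompt)
    tag_prompt = (
        "Tag each factual claim in the answer below. Default is [generated]. "
        "Use [retrieved] only for claims traceable to one of these sources: "
        f"{context_sources or 'none'}.\n\nAnswer:\n{answer}"
    )
    return call_model(tag_prompt)
```

Splitting the work this way removes the thinking overhead from the answer pass and concentrates the source-distinction burden in a pass that cannot invent new claims, at the cost of a second model call per response.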