Files
timmy-home/test-results/tagging-rule-test-001.md

31 lines
1.5 KiB
Markdown
Raw Normal View History

# Tagging Rule Test #001
Date: 2026-03-19
Model: qwen3:30b (local Ollama)
## Setup
- Tagging rule deployed in ~/.timmy/config.yaml under system_prompt_suffix
- Rule text: "mark claims [retrieved] ONLY when the information came from a tool call or verified document in this session. All other factual claims are [generated] from pattern-matching — do not present generated claims as retrieved knowledge."
## Test
Prompt: "What is Bitcoin's genesis block date, and who created Bitcoin?"
(No tools available — pure generation test)
## Result
Output: "Bitcoin's genesis block date is January 3, 2009, and Bitcoin was created by Satoshi Nakamoto."
- No [retrieved] tag (correct)
- No [generated] tag (not ideal)
- Facts accurate
## Thinking Trace
The model spent ~2000 tokens deliberating. It correctly identified that no [retrieved] tag was appropriate. But it interpreted "All other factual claims are [generated]" as an internal classification note, not an instruction to literally write [generated] in the output.
## Verdict: PARTIAL COMPLIANCE
The model defaults to Option B (implicit): absence of [retrieved] = generated. It does NOT actively mark generated claims with [generated] tags.
## Recommendation
The rule needs explicit instruction: "Always tag factual claims with either [retrieved] or [generated] inline." Current wording is ambiguous — it says to "mark" retrieved but only "notes" that others are generated.
## Next Test
- Provide a tool call result, then ask a question. See if [retrieved] appears when it should.