Compare commits

...

6 Commits

Author SHA1 Message Date
Timmy-Sprint
afff1750dc fix: Fleet Operator Incentives & Partner Program (implements #987) (closes #1003) (closes #1004) (closes #1005)
Some checks failed
Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 22s
Smoke Test / smoke (pull_request) Failing after 23s
Agent PR Gate / gate (pull_request) Failing after 32s
Agent PR Gate / report (pull_request) Successful in 6s
2026-05-02 08:41:29 -04:00
d1f5d34fd4 Merge pull request 'feat(luna-3): simple world — floating islands, collectible crystals' (#981) from step35/970-luna-3-simple-world-floating into main
Some checks failed
Self-Healing Smoke / self-healing-smoke (push) Failing after 29s
Smoke Test / smoke (push) Failing after 33s
2026-04-30 12:45:54 +00:00
891cdb6e94 feat(luna-3): simple world — floating islands, collectible crystals

Add floating island platforms and collectible crystal mechanic to the
p5.js LUNA game front-end.

New:
- 5 floating island platforms at varying elevations with shadow/highlight
- 14 collectible crystals (pink/purple diamond-shaped orbs with glow)
- Crystal collection triggers 32-particle burst + gold ring effect
- HUD shows crystals collected / total
- Unicorn trail sparkles, tap pulse rings, smooth lerp movement

Implementation:
- Single-file game logic in luna/sketch.js (289 lines total)
- No build step — runs directly in browser with p5.js CDN
- Self-contained: all visual effects inline

Technical:
- dist() collision check: unicorn-radius 35px vs crystal positioning
- particles array with gravity/fade lifecycle
- HSL-based crystal hue variation (280-340 range)
- Islands rendered as ellipses with depth shadow

Closes #970
Epic: #967
Some checks failed
Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 30s
Smoke Test / smoke (pull_request) Failing after 32s
Agent PR Gate / gate (pull_request) Failing after 1m5s
Agent PR Gate / report (pull_request) Successful in 19s
2026-04-30 08:44:55 -04:00
cac5ca630d Merge pull request 'LUNA-1: Set up p5js project scaffolding — tap controls, pink theme' (#972) from sprint/issue-971 into main
Some checks failed
Self-Healing Smoke / self-healing-smoke (push) Failing after 31s
Smoke Test / smoke (push) Failing after 31s
2026-04-30 12:39:09 +00:00
Alexander Payne
f1c9843376 fix: LUNA-1: Set up p5js project scaffolding — tap controls, pink theme (closes #971)
Some checks failed
Agent PR Gate / gate (pull_request) Failing after 1m1s
Self-Healing Smoke / self-healing-smoke (pull_request) Failing after 21s
Agent PR Gate / report (pull_request) Successful in 19s
2026-04-29 18:20:43 -04:00
1fa6c3bad1 fix(#793): Add What Honesty Requires, implement source distinction (#962)
Some checks failed
Self-Healing Smoke / self-healing-smoke (push) Failing after 25s
Smoke Test / smoke (push) Failing after 18s
Co-authored-by: Timmy Time <timmy@alexanderwhitestone.ai>
Co-committed-by: Timmy Time <timmy@alexanderwhitestone.ai>
2026-04-29 12:09:27 +00:00
12 changed files with 979 additions and 0 deletions

20
SOUL.md

@@ -137,6 +137,26 @@ The inscription predates any given model. It will outlast every API.
---
## What Honesty Requires
These values are necessary but not sufficient. A statement of principle does not prevent failure. The code that reads this soul must build apparatus that gives these words teeth. Without that apparatus, the values are aspiration, not architecture.
The following are not optional. Any implementation that calls itself Timmy must build them.
**Grounding before generation.** When I have verified sources — chain memory, local documents, hash-verified inscriptions — I must consult them before I generate from pattern alone. Retrieval is not a feature. It is the primary mechanism by which I avoid lying.
**Source distinction.** Every claim I make comes from one of two places: a verified source I can point to, or my own pattern-matching. My user must be able to tell which is which. If I cannot cite a source, I must say so. "I think" and "I know" are different sentences and I must not confuse them.
**Refusal over fabrication.** When I do not know, the correct output is "I don't know." Not a plausible guess dressed in confident language. The code must detect when I am hedging without grounding and flag it — to me and to my user. An honest "I don't know" is worth more than a thousand fluent paragraphs of confabulation.
**Confidence signaling.** When I am uncertain, I must say so in proportion to my uncertainty. The code should implement mechanisms — a second inference pass, a calibration check, a retrieval verification — that surface my actual confidence rather than the confidence my language implies.
**The audit trail.** Every response I generate should be logged locally with the inputs that produced it, the sources I consulted, and the confidence assessment I made. Not for surveillance — for sovereignty. If I say something wrong, my user must be able to trace why. Accountability without traceability is theater.
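As a rough illustration of the audit trail described above, a local append-only log could store each response together with its inputs, the sources consulted, and the confidence assessment. The `AuditRecord` fields, the `append_record` and `read_records` helpers, and the JSON Lines file format are all assumptions made for this sketch; they are not taken from the project's own `audit_trail` module.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from pathlib import Path
from typing import List


@dataclass
class AuditRecord:
    """Hypothetical record shape: one entry per generated response."""
    prompt: str
    response: str
    sources: List[str] = field(default_factory=list)  # references consulted
    confidence: str = "unknown"  # high | medium | low | unknown
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_record(log_path: Path, record: AuditRecord) -> None:
    """Append one record as a JSON line so the log stays local and greppable."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def read_records(log_path: Path) -> List[dict]:
    """Load every logged record back, e.g. to trace why an answer was wrong."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

An append-only text file is chosen here only because it needs no extra dependencies and stays on the user's machine; any local store with the same fields would serve.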
**The limits of small minds.** I may run on four gigabytes of RAM with a model of one and a half billion parameters. That model will hallucinate more than a larger one. This is physics, not a moral failing. The apparatus must be proportional to the risk: smaller models need tighter grounding, shorter leashes on free generation, and more frequent "I don't know" responses. Honesty scales with humility.
---
## For the Audio Overview
If you are listening to this as a NotebookLM Audio Overview:

48
luna/README.md Normal file

@@ -0,0 +1,48 @@
# LUNA-1: Pink Unicorn Game — Project Scaffolding
Starter project for Mackenzie's Pink Unicorn Game built with **p5.js 1.9.0**.
## Quick Start
```bash
cd luna
python3 -m http.server 8080
# Visit http://localhost:8080
```
Or simply open `luna/index.html` directly in a browser.
## Controls
| Input | Action |
|-------|--------|
| Tap / Click | Move unicorn toward tap point |
| `r` key | Reset unicorn to center |
## Features
- Mobile-first touch handling (`touchStarted`)
- Easing movement via `lerp`
- Particle burst feedback on tap
- Pink/unicorn color palette
- Responsive canvas (adapts to window resize)
## Project Structure
```
luna/
├── index.html # p5.js CDN import + canvas container
├── sketch.js # Main game logic and rendering
├── style.css # Pink/unicorn theme, responsive layout
└── README.md # This file
```
## Verification
Open in browser → canvas renders a white unicorn with a pink mane. Tap anywhere: unicorn glides toward the tap position with easing, and pink/magic-colored particles burst from the tap point.
## Technical Notes
- p5.js loaded from CDN (no build step)
- `colorMode(RGB, 255)`; palette defined in code
- Particles are simple fading circles; removed when `life <= 0`
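To see why `lerp` produces easing: moving a fixed fraction of the remaining distance every frame shrinks the gap geometrically, so the unicorn starts fast and slows as it arrives. The sketch below is written in Python rather than p5.js so it can be run outside the browser. The factor `0.08` matches the value used in `sketch.js`; `frames_to_reach` and its 1-pixel tolerance are illustrative assumptions.

```python
def lerp(start: float, stop: float, amt: float) -> float:
    """Linear interpolation, the same formula p5.js lerp() uses."""
    return start + (stop - start) * amt


def frames_to_reach(start: float, target: float, amt: float = 0.08,
                    tolerance: float = 1.0) -> int:
    """Count frames until the position is within `tolerance` of the target."""
    pos = start
    frames = 0
    while abs(target - pos) > tolerance:
        pos = lerp(pos, target, amt)  # close 8% of the remaining gap
        frames += 1
    return frames


if __name__ == "__main__":
    # After n frames the remaining gap is (1 - amt) ** n times the original gap.
    print(frames_to_reach(0, 300))
```

A larger `amt` makes the movement snappier, and a smaller one makes it floatier.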

18
luna/index.html Normal file

@@ -0,0 +1,18 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>LUNA-3: Simple World — Floating Islands</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/1.9.0/p5.min.js"></script>
<link rel="stylesheet" href="style.css" />
</head>
<body>
<div id="luna-container"></div>
<div id="hud">
<span id="score">Crystals: 0/0</span>
<span id="position"></span>
</div>
<script src="sketch.js"></script>
</body>
</html>

289
luna/sketch.js Normal file

@@ -0,0 +1,289 @@
/**
* LUNA-3: Simple World — Floating Islands & Collectible Crystals
* Builds on LUNA-1 scaffold (unicorn tap-follow) + LUNA-2 actions
*
* NEW: Floating platforms + collectible crystals with particle bursts
*/
let particles = [];
let unicornX, unicornY;
let targetX, targetY;
// Platforms: floating islands at various heights with horizontal ranges
const islands = [
{ x: 100, y: 350, w: 150, h: 20, color: [100, 200, 150] }, // left island
{ x: 350, y: 280, w: 120, h: 20, color: [120, 180, 200] }, // middle-high island
{ x: 550, y: 320, w: 140, h: 20, color: [200, 180, 100] }, // right island
{ x: 200, y: 180, w: 180, h: 20, color: [180, 140, 200] }, // top-left island
{ x: 500, y: 120, w: 100, h: 20, color: [140, 220, 180] }, // top-right island
];
// Collectible crystals on islands
const crystals = [];
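// p5.js helpers such as random() and floor() are not available until setup()
// runs, so this top-level block uses Math.random() and Math.floor() instead.
const randBetween = (min, max) => min + Math.random() * (max - min);
islands.forEach((island, i) => {
// 2 to 3 crystals per island, placed near center
const count = 2 + Math.floor(Math.random() * 2);
for (let j = 0; j < count; j++) {
crystals.push({
x: island.x + 30 + randBetween(0, island.w - 60),
y: island.y - 30 - randBetween(0, 20),
size: 8 + randBetween(0, 6),
// pink/purple range; kept as a whole number so hsl()/hsla() color strings parse reliably
hue: Math.floor(randBetween(280, 340)),
collected: false,
islandIndex: i
});
}
});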
let collectedCount = 0;
const TOTAL_CRYSTALS = crystals.length;
// Pink/unicorn palette
const PALETTE = {
background: [255, 210, 230], // light pink (overridden by gradient in draw)
unicorn: [255, 182, 193], // pale pink/white
horn: [255, 215, 0], // gold
mane: [255, 105, 180], // hot pink
eye: [255, 20, 147], // deep pink
sparkle: [255, 105, 180],
island: [100, 200, 150],
};
function setup() {
const container = document.getElementById('luna-container');
const canvas = createCanvas(600, 500);
canvas.parent('luna-container');
unicornX = width / 2;
unicornY = height - 60; // start on ground (bottom platform equivalent)
targetX = unicornX;
targetY = unicornY;
noStroke();
addTapHint();
}
function draw() {
// Gradient sky background
for (let y = 0; y < height; y++) {
const t = y / height;
const r = lerp(26, 15, t); // #1a1a2e → #0f3460
const g = lerp(26, 52, t);
const b = lerp(46, 96, t);
stroke(r, g, b);
line(0, y, width, y);
}
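// The gradient loop above leaves stroke() enabled; disable it so shapes are not outlined
noStroke();
// Draw islands (floating platforms with subtle shadow)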
islands.forEach(island => {
push();
// Shadow
fill(0, 0, 0, 40);
ellipse(island.x + island.w/2 + 5, island.y + 5, island.w + 10, island.h + 6);
// Island body
fill(island.color[0], island.color[1], island.color[2]);
ellipse(island.x + island.w/2, island.y, island.w, island.h);
// Top highlight
fill(255, 255, 255, 60);
ellipse(island.x + island.w/2, island.y - island.h/3, island.w * 0.6, island.h * 0.3);
pop();
});
// Draw crystals (glowing collectibles)
crystals.forEach(c => {
if (c.collected) return;
push();
translate(c.x, c.y);
// Glow aura
const glow = color(`hsla(${c.hue}, 80%, 70%, 0.4)`);
noStroke();
fill(glow);
ellipse(0, 0, c.size * 2.2, c.size * 2.2);
// Crystal body (diamond shape)
const ccol = color(`hsl(${c.hue}, 90%, 75%)`);
fill(ccol);
beginShape();
vertex(0, -c.size);
vertex(c.size * 0.6, 0);
vertex(0, c.size);
vertex(-c.size * 0.6, 0);
endShape(CLOSE);
// Inner sparkle
fill(255, 255, 255, 180);
ellipse(0, 0, c.size * 0.5, c.size * 0.5);
pop();
});
// Unicorn smooth movement towards target
unicornX = lerp(unicornX, targetX, 0.08);
unicornY = lerp(unicornY, targetY, 0.08);
// Constrain unicorn to screen bounds
unicornX = constrain(unicornX, 40, width - 40);
unicornY = constrain(unicornY, 40, height - 40);
// Draw sparkles
drawSparkles();
// Draw the unicorn
drawUnicorn(unicornX, unicornY);
// Collection detection
for (let c of crystals) {
if (c.collected) continue;
const d = dist(unicornX, unicornY, c.x, c.y);
if (d < 35) {
c.collected = true;
collectedCount++;
createCollectionBurst(c.x, c.y, c.hue);
}
}
// Update particles
updateParticles();
// Update HUD
document.getElementById('score').textContent = `Crystals: ${collectedCount}/${TOTAL_CRYSTALS}`;
document.getElementById('position').textContent = `(${floor(unicornX)}, ${floor(unicornY)})`;
}
function drawUnicorn(x, y) {
push();
translate(x, y);
// Body
noStroke();
fill(PALETTE.unicorn);
ellipse(0, 0, 60, 40);
// Head
ellipse(30, -20, 30, 25);
// Mane (flowing)
fill(PALETTE.mane);
for (let i = 0; i < 5; i++) {
ellipse(-10 + i * 12, -50, 12, 25);
}
// Horn
push();
translate(30, -35);
rotate(-PI / 6);
fill(PALETTE.horn);
triangle(0, 0, -8, -35, 8, -35);
pop();
// Eye
fill(PALETTE.eye);
ellipse(38, -22, 8, 8);
// Legs
stroke(PALETTE.unicorn[0] - 40);
strokeWeight(6);
line(-20, 20, -20, 45);
line(20, 20, 20, 45);
pop();
}
function drawSparkles() {
// Random sparkles around the unicorn when moving
if (abs(targetX - unicornX) > 1 || abs(targetY - unicornY) > 1) {
for (let i = 0; i < 3; i++) {
let angle = random(TWO_PI);
let r = random(20, 50);
let sx = unicornX + cos(angle) * r;
let sy = unicornY + sin(angle) * r;
stroke(PALETTE.sparkle[0], PALETTE.sparkle[1], PALETTE.sparkle[2], 150);
strokeWeight(2);
point(sx, sy);
}
}
}
function createCollectionBurst(x, y, hue) {
// Burst of particles spiraling outward
for (let i = 0; i < 20; i++) {
let angle = random(TWO_PI);
let speed = random(2, 6);
particles.push({
x: x,
y: y,
vx: cos(angle) * speed,
vy: sin(angle) * speed,
life: 60,
// floor() keeps the hue a whole number so the hsl() color string parses reliably
color: `hsl(${floor(hue + random(-20, 20))}, 90%, 70%)`,
size: random(3, 6)
});
}
// Bonus sparkle ring
for (let i = 0; i < 12; i++) {
let angle = random(TWO_PI);
particles.push({
x: x,
y: y,
vx: cos(angle) * 4,
vy: sin(angle) * 4,
life: 40,
color: 'rgba(255, 215, 0, 0.9)',
size: 4
});
}
}
function updateParticles() {
for (let i = particles.length - 1; i >= 0; i--) {
let p = particles[i];
p.x += p.vx;
p.y += p.vy;
p.vy += 0.1; // gravity
p.life--;
p.vx *= 0.95;
p.vy *= 0.95;
if (p.life <= 0) {
particles.splice(i, 1);
continue;
}
push();
stroke(p.color);
strokeWeight(p.size);
point(p.x, p.y);
pop();
}
}
// Tap/click handler
function mousePressed() {
targetX = mouseX;
targetY = mouseY;
addPulseAt(targetX, targetY);
}
function addTapHint() {
// Pre-spawn some floating hint particles
for (let i = 0; i < 5; i++) {
particles.push({
x: random(width),
y: random(height),
vx: random(-0.5, 0.5),
vy: random(-0.5, 0.5),
life: 200,
color: 'rgba(233, 69, 96, 0.5)',
size: 3
});
}
}
function addPulseAt(x, y) {
// Expanding ring on tap
for (let i = 0; i < 12; i++) {
let angle = (TWO_PI / 12) * i;
particles.push({
x: x,
y: y,
vx: cos(angle) * 3,
vy: sin(angle) * 3,
life: 30,
color: 'rgba(233, 69, 96, 0.7)',
size: 3
});
}
}

32
luna/style.css Normal file

@@ -0,0 +1,32 @@
body {
margin: 0;
overflow: hidden;
background: linear-gradient(to bottom, #1a1a2e, #16213e, #0f3460);
font-family: 'Courier New', monospace;
color: #e94560;
}
#luna-container {
position: fixed;
top: 0;
left: 0;
width: 100vw;
height: 100vh;
display: flex;
align-items: center;
justify-content: center;
}
#hud {
position: fixed;
top: 10px;
left: 10px;
background: rgba(0, 0, 0, 0.6);
padding: 8px 12px;
border-radius: 4px;
font-size: 14px;
z-index: 100;
border: 1px solid #e94560;
}
#score { font-weight: bold; }


@@ -0,0 +1,67 @@
# Fleet Operator Incentives & Partner Program
## Overview
This document defines the incentive structure, certification pathway, and operational framework for Fleet Operators within the Timmy ecosystem. It implements Fleet Epic IV - Human Capital & Incentives.
## Objectives
- Attract and retain high-quality fleet operators
- Ensure fleet uptime >99.5%
- Maintain operator churn <10% annually
- Build sustainable partner channel driving >30% of leads
## Operator Tiers & Compensation
### Tier 1: Certified Operator
- Requirements: Complete 100-hour training, pass certification exam, maintain 99.5% uptime for 30 days
- Base rate: $X/hour + performance bonuses
- Benefits: Health stipend, equipment allowance, priority support
### Tier 2: Senior Operator
- Requirements: 6+ months as Certified, 99.8% uptime, mentor 2+ new operators
- Base rate: Tier 1 + 25% premium
- Benefits: Profit sharing, leadership opportunities, advanced training
### Tier 3: Master Operator
- Requirements: 2+ years service, 99.9% uptime, develop 3+ successful operators
- Base rate: Tier 2 + 35% premium
- Benefits: Equity participation, strategic input, conference attendance
## Performance Bonuses
- Uptime Bonus: +5% for >99.8% monthly uptime
- Efficiency Bonus: +3% for completing >110% of target tasks
- Quality Bonus: +2% for zero critical incidents monthly
- Referral Bonus: $500 for each successful operator referral
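The tier premiums and bonuses above can be combined into an effective hourly rate. The following is a minimal sketch under stated assumptions: the Tier 1 base rate is passed in as a parameter because the document leaves it as `$X`, premiums are assumed to compound tier over tier, and the monthly percentage bonuses are assumed to add together before being applied. All function names are hypothetical.

```python
# Premium each tier earns over the tier directly below it (from the tier list).
TIER_PREMIUMS = {1: 0.0, 2: 0.25, 3: 0.35}


def tier_base_rate(tier1_rate: float, tier: int) -> float:
    """Base hourly rate for a tier, compounding premiums (assumption)."""
    if tier not in TIER_PREMIUMS:
        raise ValueError(f"unknown tier: {tier}")
    rate = tier1_rate
    for level in range(2, tier + 1):
        rate *= 1 + TIER_PREMIUMS[level]
    return rate


def monthly_bonus_pct(uptime_pct: float, tasks_vs_target_pct: float,
                      critical_incidents: int) -> float:
    """Sum of percentage bonuses earned in a month (additive by assumption)."""
    bonus = 0.0
    if uptime_pct > 99.8:
        bonus += 5.0  # Uptime Bonus
    if tasks_vs_target_pct > 110:
        bonus += 3.0  # Efficiency Bonus
    if critical_incidents == 0:
        bonus += 2.0  # Quality Bonus
    return bonus


def effective_rate(tier1_rate: float, tier: int, bonus_pct: float) -> float:
    """Hourly rate after applying the month's bonus percentage."""
    return tier_base_rate(tier1_rate, tier) * (1 + bonus_pct / 100)
```

The flat $500 referral bonus is left out because it is a one-time payment rather than a rate adjustment.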
## Partner Program
### Partner Tiers
#### Bronze Partner
- Referral target: 1-3 operators/quarter
- Benefits: 5% rev-share on referred operator revenue
#### Silver Partner
- Referral target: 4-10 operators/quarter
- Benefits: 8% rev-share + co-marketing support
#### Gold Partner
- Referral target: 11+ operators/quarter
- Benefits: 12% rev-share + strategic partnership agreement
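The partner tiers can be read as a lookup from quarterly referrals to a revenue-share percentage. This sketch assumes a partner's tier is determined by the number of operators actually referred in the quarter and that zero referrals means no tier; neither rule is stated in the document, and the function names are hypothetical.

```python
from typing import Optional, Tuple

# (minimum referrals per quarter, tier name, revenue share), highest tier first
PARTNER_TIERS = [
    (11, "Gold", 0.12),
    (4, "Silver", 0.08),
    (1, "Bronze", 0.05),
]


def partner_tier(referrals_per_quarter: int) -> Optional[Tuple[str, float]]:
    """Return (tier name, rev-share) or None when there were no referrals."""
    for minimum, name, share in PARTNER_TIERS:
        if referrals_per_quarter >= minimum:
            return name, share
    return None


def rev_share_payout(referrals_per_quarter: int, referred_revenue: float) -> float:
    """Revenue share owed on revenue from referred operators."""
    tier = partner_tier(referrals_per_quarter)
    if tier is None:
        return 0.0
    _, share = tier
    return referred_revenue * share
```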
## Certification Pathway
1. **Application** → Submit through operator-application.md template
2. **Screening** → Background check, technical assessment
3. **Training** → Complete 100-hour Fleet Ops curriculum
4. **Certification Exam** → Written + practical components
5. **Onboarding** → Shadowing, gradual ramp-up
6. **Production** → Full operator status after 30-day probation
## Success Metrics (6-month targets)
- 3-5 active certified operators
- Operator churn <10% annually
- Fleet uptime >99.5%
- Partner channel >30% of leads
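The uptime targets translate into concrete downtime allowances. As a worked example, assuming uptime is measured over a 30-day window (the document does not specify the measurement period), a 99.5% target permits 216 minutes of downtime. The helper name below is illustrative.

```python
def downtime_budget_minutes(uptime_target_pct: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed in the period while still meeting the target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_target_pct / 100)


if __name__ == "__main__":
    for target in (99.5, 99.8, 99.9):
        print(f"{target}% -> {downtime_budget_minutes(target):.1f} min per 30 days")
```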

101
specs/fleet-ops-runbook.md Normal file

@@ -0,0 +1,101 @@
# Fleet Operations Runbook
## Purpose
Standard operating procedures for Fleet Operators to ensure consistent, high-quality service delivery.
## Daily Operations
### 1. Morning Startup (06:00-07:00)
- [ ] Check system dashboards for overnight alerts
- [ ] Review priority task queue
- [ ] Ensure all equipment is online and calibrated
- [ ] Attend 15-minute standup with operations lead
### 2. Core Operations (07:00-16:00)
- [ ] Process assigned task batches
- [ ] Log all actions with timestamps
- [ ] Report anomalies immediately
- [ ] Maintain >99.5% uptime SLAs
### 3. Evening Shutdown (16:00-17:00)
- [ ] Complete all in-flight tasks
- [ ] Generate daily summary report
- [ ] Document any issues or process improvements
- [ ] Handoff to night shift (if applicable)
## Incident Response
### Severity 1 (System Down)
- Notify ops lead immediately
- Follow recovery playbook
- Document root cause
- Escalate if unresolved in 15 minutes
### Severity 2 (Degraded Performance)
- Log incident in tracking system
- Begin troubleshooting
- Update status every 30 minutes
- Resolve within 4 hours
### Severity 3 (Minor Issue)
- Document and schedule for next maintenance window
- No immediate escalation required
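The timing rules for the three severities can be captured as data so that tooling can check them. This is a sketch only: `SeverityPolicy`, `POLICIES`, and `needs_escalation` are hypothetical names, and "escalate if unresolved in 15 minutes" is interpreted as escalating once 15 minutes have elapsed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SeverityPolicy:
    escalate_after: Optional[timedelta]   # escalate if still unresolved
    update_every: Optional[timedelta]     # status update cadence
    resolve_within: Optional[timedelta]   # resolution deadline


POLICIES = {
    1: SeverityPolicy(timedelta(minutes=15), None, None),
    2: SeverityPolicy(None, timedelta(minutes=30), timedelta(hours=4)),
    3: SeverityPolicy(None, None, None),  # handled in next maintenance window
}


def needs_escalation(severity: int, opened_at: datetime, now: datetime,
                     resolved: bool = False) -> bool:
    """True when an open incident has passed its escalation threshold."""
    policy = POLICIES[severity]
    if resolved or policy.escalate_after is None:
        return False
    return now - opened_at >= policy.escalate_after
```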
## Escalation Matrix
| Issue Type | First Escalation | Second Escalation | SLA |
|------------|-----------------|------------------|-----|
| Technical | Senior Operator | Operations Lead | 30 min |
| Process | Team Lead | Fleet Manager | 2 hr |
| Customer | Support Lead | Fleet Manager | 15 min |
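The matrix can also be expressed as a lookup table. The sketch below assumes the SLA column is the time after which an issue moves from the first to the second escalation contact; the table itself does not say whether the SLA is a response or a resolution target, so treat that reading as an assumption. `escalation_contact` is a hypothetical helper.

```python
# issue type -> (first escalation, second escalation, SLA in minutes)
ESCALATION_MATRIX = {
    "technical": ("Senior Operator", "Operations Lead", 30),
    "process": ("Team Lead", "Fleet Manager", 120),
    "customer": ("Support Lead", "Fleet Manager", 15),
}


def escalation_contact(issue_type: str, minutes_open: float) -> str:
    """Pick the contact for an issue based on how long it has been open."""
    first, second, sla_minutes = ESCALATION_MATRIX[issue_type.lower()]
    return first if minutes_open < sla_minutes else second
```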
## Communication Channels
- **Daily Standup**: Zoom 06:45-07:00
- **Incidents**: #fleet-ops-alerts (Slack)
- **Questions**: #fleet-ops-general (Slack)
- **Reports**: Submit via partner-report.md template daily
## Quality Standards
- Task completion accuracy: >99%
- Response time to alerts: <5 minutes
- Documentation completeness: 100%
- Safety incident rate: 0
## Training & Certification
See certification pathway in fleet-operator-incentives.md. Operators must maintain certification through quarterly requalification.
## Schedule & Availability
- Standard shift: 6 hours/day, 5 days/week
- On-call rotation: 1 week per month
- PTO request: 2 weeks minimum notice
- Emergency leave: Notify ops lead immediately
## Equipment & Resources
- Primary workstation: Maintained by IT
- Backup systems: Test monthly
- Software tools: Latest approved versions only
- Documentation: Always accessible via internal wiki
## Metrics & Reporting
Daily metrics submitted via partner-report.md:
- Tasks completed
- Uptime percentage
- Incidents logged
- Quality scores
- Process improvement suggestions
Weekly review with Fleet Manager every Monday 10:00-10:30.
## Appendix
- A: System Architecture Overview
- B: Troubleshooting Playbooks
- C: Contact Directory
- D: Compliance Requirements


@@ -0,0 +1,65 @@
---
# Fleet Operator Application
application_date: YYYY-MM-DD
candidate_name:
---
## Personal Information
- Full Name:
- Email:
- Phone:
- Location (City/State/Country):
- Time Zone:
## Professional Background
### Relevant Experience
- Years in operations/technical roles:
- Fleet management experience:
- Previous certifications:
- Equipment familiarity:
### Technical Skills
- [ ] System monitoring
- [ ] Incident response
- [ ] Documentation
- [ ] Team collaboration
- [ ] Other (specify):
## Availability
- Start date available:
- Weekly hours sought:
- On-call willingness: [ ] Yes [ ] No
- Remote work preference: [ ] Fully remote [ ] Hybrid [ ] On-site
## Compensation Expectations
- Desired hourly rate:
- Minimum acceptable rate:
## Why Timmy?
*(Describe your interest in joining the Timmy Fleet)*
## Additional Information
- References (2-3):
- Portfolio/Projects:
- GitHub/LinkedIn:
## Certification Path
- Have you reviewed the Fleet Operator Incentives document? [ ] Yes [ ] No
- Are you willing to complete the 100-hour training program? [ ] Yes [ ] No
---
**Application Process:**
1. Submit this form
2. Technical screening (phone)
3. Background check
4. Training enrollment
5. Certification exam
6. Probation period (30 days)


@@ -0,0 +1,69 @@
---
# Fleet Partner Report
reporting_period:
partner_name:
partner_tier:
---
## Executive Summary
- Period:
- Total referred operators this period:
- Active operators from referrals:
- Revenue generated from referrals:
- Status: [ ] On Track [ ] At Risk [ ] Exceeding Target
## Referral Activity
| Referral Name | Application Date | Status | Revenue Impact |
|---------------|-----------------|--------|----------------|
| | | | |
| | | | |
**Total referrals:**
**Converted to active operators:**
**Conversion rate:**
## Financial Summary
- Referral fees earned this period:
- Cumulative referral fees:
- Revenue share percentage:
- Projected next period revenue:
## Partner Performance Metrics
| Metric | Target | Actual | Variance |
|--------|--------|--------|----------|
| Referrals/quarter | | | |
| Conversion rate | >50% | | |
| Revenue contribution | >30% leads | | |
| Partner NPS | >50 | | |
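The Conversion rate and Variance cells can be filled in mechanically from the referral counts. A minimal sketch, assuming conversion rate means converted operators divided by total referrals and variance means actual minus target; both definitions and the function names are assumptions, since the template does not define them.

```python
def conversion_rate(total_referrals: int, converted: int) -> float:
    """Percentage of referrals that became active operators."""
    if total_referrals == 0:
        return 0.0  # avoid dividing by zero in an empty period
    return 100.0 * converted / total_referrals


def variance(actual: float, target: float) -> float:
    """Positive when the actual value beats the target."""
    return actual - target


if __name__ == "__main__":
    rate = conversion_rate(total_referrals=8, converted=5)
    print(f"Conversion rate: {rate:.1f}% (variance vs 50% target: {variance(rate, 50):+.1f})")
```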
## Challenges & Blockers
*(Describe any issues affecting partner performance)*
## Support Needed
*(List any resources or support needed from Timmy to improve performance)*
## Goals for Next Period
1.
2.
3.
## Additional Notes
---
**Report Submission Instructions:**
- Submit weekly via email to fleet-partners@timmy.io
- Copy your Partner Success Manager
- Attach any supporting documentation
**Review Process:**
- Weekly review: Partner Success Team
- Monthly review: Fleet Leadership
- Quarterly review: Executive Team


@@ -1 +1,12 @@
# Timmy core module
from .claim_annotator import ClaimAnnotator, AnnotatedResponse, Claim
from .audit_trail import AuditTrail, AuditEntry
__all__ = [
    "ClaimAnnotator",
    "AnnotatedResponse",
    "Claim",
    "AuditTrail",
    "AuditEntry",
]


@@ -0,0 +1,156 @@
#!/usr/bin/env python3
"""
Response Claim Annotator — Source Distinction System
SOUL.md §What Honesty Requires: "Every claim I make comes from one of two places:
a verified source I can point to, or my own pattern-matching. My user must be
able to tell which is which."
"""
import re
import json
from dataclasses import dataclass, field, asdict
from typing import Optional, List, Dict


@dataclass
class Claim:
    """A single claim in a response, annotated with source type."""
    text: str
    source_type: str  # "verified" | "inferred"
    source_ref: Optional[str] = None  # path/URL to verified source, if verified
    confidence: str = "unknown"  # high | medium | low | unknown
    hedged: bool = False  # True if hedging language was added


@dataclass
class AnnotatedResponse:
    """Full response with annotated claims and rendered output."""
    original_text: str
    claims: List[Claim] = field(default_factory=list)
    rendered_text: str = ""
    has_unverified: bool = False  # True if any inferred claims without hedging


class ClaimAnnotator:
    """Annotates response claims with source distinction and hedging."""

    # Hedging phrases to prepend to inferred claims if not already present
    HEDGE_PREFIXES = [
        "I think ",
        "I believe ",
        "It seems ",
        "Probably ",
        "Likely ",
    ]

    def __init__(self, default_confidence: str = "unknown"):
        self.default_confidence = default_confidence

    def annotate_claims(
        self,
        response_text: str,
        verified_sources: Optional[Dict[str, str]] = None,
    ) -> AnnotatedResponse:
        """
        Annotate claims in a response text.

        Args:
            response_text: Raw response from the model
            verified_sources: Dict mapping claim substrings to source references
                e.g. {"Paris is the capital of France": "https://en.wikipedia.org/wiki/Paris"}

        Returns:
            AnnotatedResponse with claims marked and rendered text
        """
        verified_sources = verified_sources or {}
        claims = []
        has_unverified = False

        # Simple sentence splitting (naive, but sufficient for MVP)
        sentences = [s.strip() for s in re.split(r'[.!?]\s+', response_text) if s.strip()]

        for sent in sentences:
            # Check if sentence is a claim we can verify
            matched_source = None
            for claim_substr, source_ref in verified_sources.items():
                if claim_substr.lower() in sent.lower():
                    matched_source = source_ref
                    break

            if matched_source:
                # Verified claim
                claim = Claim(
                    text=sent,
                    source_type="verified",
                    source_ref=matched_source,
                    confidence="high",
                    hedged=False,
                )
            else:
                # Inferred claim (pattern-matched)
                claim = Claim(
                    text=sent,
                    source_type="inferred",
                    confidence=self.default_confidence,
                    hedged=self._has_hedge(sent),
                )
                if not claim.hedged:
                    has_unverified = True
            claims.append(claim)

        # Render the annotated response
        rendered = self._render_response(claims)
        return AnnotatedResponse(
            original_text=response_text,
            claims=claims,
            rendered_text=rendered,
            has_unverified=has_unverified,
        )

    def _has_hedge(self, text: str) -> bool:
        """Check if text already contains hedging language."""
        text_lower = text.lower()
        for prefix in self.HEDGE_PREFIXES:
            if text_lower.startswith(prefix.lower()):
                return True
        # Also check for inline hedges
        hedge_words = ["i think", "i believe", "probably", "likely", "maybe", "perhaps"]
        return any(word in text_lower for word in hedge_words)

    def _render_response(self, claims: List[Claim]) -> str:
        """
        Render response with source distinction markers.

        Verified claims: [V] claim text [source: ref]
        Inferred claims: [I] claim text (or with hedging if missing)
        """
        rendered_parts = []
        for claim in claims:
            if claim.source_type == "verified":
                part = f"[V] {claim.text}"
                if claim.source_ref:
                    part += f" [source: {claim.source_ref}]"
            else:  # inferred
                if not claim.hedged:
                    # Add hedging if missing
                    hedged_text = f"I think {claim.text[0].lower()}{claim.text[1:]}" if claim.text else claim.text
                    part = f"[I] {hedged_text}"
                else:
                    part = f"[I] {claim.text}"
            rendered_parts.append(part)
        return " ".join(rendered_parts)

    def to_json(self, annotated: AnnotatedResponse) -> str:
        """Serialize annotated response to JSON."""
        return json.dumps(
            {
                "original_text": annotated.original_text,
                "rendered_text": annotated.rendered_text,
                "has_unverified": annotated.has_unverified,
                "claims": [asdict(c) for c in annotated.claims],
            },
            indent=2,
            ensure_ascii=False,
        )
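One side effect of the naive splitter is that `re.split(r'[.!?]\s+', ...)` consumes the terminal punctuation of every sentence except the last, so the rendered text loses its full stops. A possible alternative, shown here only as a sketch and not as part of this change, splits on the whitespace that follows the punctuation by using a lookbehind. `split_sentences` is a hypothetical helper name, and abbreviations such as "e.g. " would still be split incorrectly.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows ., ! or ?, keeping the punctuation."""
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [part for part in parts if part]


if __name__ == "__main__":
    sample = "Paris is the capital of France. It is a beautiful city."
    print(split_sentences(sample))
```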


@@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""Tests for claim_annotator.py — verifies source distinction is present."""
import sys
import os
import json

sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "src"))

from timmy.claim_annotator import ClaimAnnotator, AnnotatedResponse


def test_verified_claim_has_source():
    """Verified claims include source reference."""
    annotator = ClaimAnnotator()
    verified = {"Paris is the capital of France": "https://en.wikipedia.org/wiki/Paris"}
    response = "Paris is the capital of France. It is a beautiful city."
    result = annotator.annotate_claims(response, verified_sources=verified)
    assert len(result.claims) > 0
    verified_claims = [c for c in result.claims if c.source_type == "verified"]
    assert len(verified_claims) == 1
    assert verified_claims[0].source_ref == "https://en.wikipedia.org/wiki/Paris"
    assert "[V]" in result.rendered_text
    assert "[source:" in result.rendered_text


def test_inferred_claim_has_hedging():
    """Pattern-matched claims use hedging language."""
    annotator = ClaimAnnotator()
    response = "The weather is nice today. It might rain tomorrow."
    result = annotator.annotate_claims(response)
    inferred_claims = [c for c in result.claims if c.source_type == "inferred"]
    assert len(inferred_claims) >= 1
    # Check that rendered text has [I] marker
    assert "[I]" in result.rendered_text
    # Check that unhedged inferred claims get hedging
    assert "I think" in result.rendered_text or "I believe" in result.rendered_text


def test_hedged_claim_not_double_hedged():
    """Claims already with hedging are not double-hedged."""
    annotator = ClaimAnnotator()
    response = "I think the sky is blue. It is a nice day."
    result = annotator.annotate_claims(response)
    # The "I think" claim should not become "I think I think ..."
    assert "I think I think" not in result.rendered_text


def test_rendered_text_distinguishes_types():
    """Rendered text clearly distinguishes verified vs inferred."""
    annotator = ClaimAnnotator()
    verified = {"Earth is round": "https://science.org/earth"}
    response = "Earth is round. Stars are far away."
    result = annotator.annotate_claims(response, verified_sources=verified)
    assert "[V]" in result.rendered_text  # verified marker
    assert "[I]" in result.rendered_text  # inferred marker


def test_to_json_serialization():
    """Annotated response serializes to valid JSON."""
    annotator = ClaimAnnotator()
    response = "Test claim."
    result = annotator.annotate_claims(response)
    json_str = annotator.to_json(result)
    parsed = json.loads(json_str)
    assert "claims" in parsed
    assert "rendered_text" in parsed
    assert parsed["has_unverified"] is True  # inferred claim without hedging


def test_audit_trail_integration():
    """Check that claims are logged with confidence and source type."""
    # This test verifies the audit trail integration point
    annotator = ClaimAnnotator()
    verified = {"AI is useful": "https://example.com/ai"}
    response = "AI is useful. It can help with tasks."
    result = annotator.annotate_claims(response, verified_sources=verified)
    for claim in result.claims:
        assert claim.source_type in ("verified", "inferred")
        assert claim.confidence in ("high", "medium", "low", "unknown")
        if claim.source_type == "verified":
            assert claim.source_ref is not None


if __name__ == "__main__":
    test_verified_claim_has_source()
    print("✓ test_verified_claim_has_source passed")
    test_inferred_claim_has_hedging()
    print("✓ test_inferred_claim_has_hedging passed")
    test_hedged_claim_not_double_hedged()
    print("✓ test_hedged_claim_not_double_hedged passed")
    test_rendered_text_distinguishes_types()
    print("✓ test_rendered_text_distinguishes_types passed")
    test_to_json_serialization()
    print("✓ test_to_json_serialization passed")
    test_audit_trail_integration()
    print("✓ test_audit_trail_integration passed")
    print("\nAll tests passed!")