Integrates a new bash script for automated end-to-end testing of the Timmy API. Updates API routes to expose payment hashes in stub mode for easier invoice payment simulation during testing. Modifies test plan documentation to include the new automated script. Replit-Commit-Author: Agent Replit-Commit-Session-Id: 418bf6f8-212b-4bb0-a7a5-8231a061da4e Replit-Commit-Checkpoint-Type: full_checkpoint Replit-Commit-Event-Id: 6f2776b0-a913-41d3-a988-759a82feb6f3 Replit-Helium-Checkpoint-Created: true
140 lines
5.3 KiB
Plaintext
Agent:

Now I'll create specialized sub-agents to extract and organize the key content from this test plan document in parallel.

Now I'll assign tasks to all 4 sub-agents in parallel to extract different aspects of the content:

All sub-agents have completed their analysis. Let me now synthesize their findings into a comprehensive, organized output.

----

📋 Timmy API Test Plan: Extracted & Organized Content

1. 📊 Executive Summary

Main Purpose: Comprehensive testing protocol for Timmy, a Lightning Network-gated AI agent API with a multi-stage payment workflow.

Key Takeaways:

• 10-test validation suite covering health, job lifecycle, payments, and edge cases
• State-driven workflow: jobs progress through awaiting_eval_payment → awaiting_work_payment → complete (or rejected)
• Stub payment mode simulates Lightning transactions via the /api/dev/stub/pay/{paymentHash} endpoint (development-only)
• Pricing: 10 sats eval fee + 50/100/250 sats work fee (short/medium/long requests)
• Built-in safeguards: rate limiting on /api/demo (5 requests per IP), input validation, AI-powered content filtering

Target Audience: QA engineers, developers, and API reviewers

----

2. 🏗️ Document Structure

Timmy API Test Plan
├── Introduction (Product overview + Base URL)
├── Test Suite (10 sequential tests)
│   ├── Tests 1-6: Core workflow (happy path)
│   ├── Test 7: Demo endpoint
│   └── Tests 8-10: Edge cases & validation
├── Report Template (Results documentation)
└── Notes for Reviewers (Technical context)

Flow: Setup → Happy Path Tests → Edge Cases → Documentation → Reference

----

3. 🔧 Technical Specifications

API Endpoints

| Method | Path | Purpose |
|---|---|---|
| `GET` | `/api/healthz` | Health check |
| `POST` | `/api/jobs` | Create new job |
| `GET` | `/api/jobs/{jobId}` | Poll job status |
| `POST` | `/api/dev/stub/pay/{paymentHash}` | Simulate payment (dev-only) |
| `GET` | `/api/demo` | Free demo endpoint |
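
As a rough illustration of how a test client might address these endpoints, the sketch below builds (but does not send) one request per row of the table. The base URL is a placeholder, not taken from the test plan. The `request` field in the job body and the `request` query parameter on the demo endpoint are inferred from the error messages listed under Error Scenarios, so treat both as assumptions.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Hypothetical base URL; substitute the Base URL given in the test plan.
BASE_URL = "http://localhost:8080"


def health_check(base=BASE_URL):
    """GET /api/healthz"""
    return Request(f"{base}/api/healthz", method="GET")


def create_job(text, base=BASE_URL):
    """POST /api/jobs with a JSON body carrying the 'request' string (assumed shape)."""
    body = json.dumps({"request": text}).encode("utf-8")
    return Request(
        f"{base}/api/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def poll_job(job_id, base=BASE_URL):
    """GET /api/jobs/{jobId}"""
    return Request(f"{base}/api/jobs/{quote(job_id, safe='')}", method="GET")


def stub_pay(payment_hash, base=BASE_URL):
    """POST /api/dev/stub/pay/{paymentHash} (dev-only)"""
    return Request(
        f"{base}/api/dev/stub/pay/{quote(payment_hash, safe='')}", method="POST"
    )


def demo(text, base=BASE_URL):
    """GET /api/demo?request=... (query parameter name assumed)"""
    return Request(f"{base}/api/demo?{urlencode({'request': text})}", method="GET")
```

Each helper returns a `urllib.request.Request`, which can be passed to `urllib.request.urlopen` once a real server is available.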

Data Models

Job Object:

{
  "jobId": "<uuid>",
  "state": "awaiting_eval_payment|awaiting_work_payment|complete|rejected",
  "evalInvoice": { "paymentRequest": "...", "amountSats": 10 },
  "workInvoice": { "paymentRequest": "...", "amountSats": 50 },
  "result": "AI-generated response",
  "reason": "Rejection reason (if applicable)"
}

Invoice Format: lnbcrt<sats>u1stub_<first-16-chars-of-64-char-hash>
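
To make the invoice format concrete, here is a minimal sketch that assembles a stub invoice string from an amount and a payment hash, and pulls the 16-character prefix back out. The function names are hypothetical and this is not the server's actual code.

```python
def stub_invoice(amount_sats: int, payment_hash: str) -> str:
    """Build lnbcrt<sats>u1stub_<first 16 chars of the 64-char hash>."""
    if len(payment_hash) != 64:
        raise ValueError("payment hash must be 64 characters long")
    return f"lnbcrt{amount_sats}u1stub_{payment_hash[:16]}"


def hash_prefix(invoice: str) -> str:
    """Return the 16-character hash prefix embedded in a stub invoice."""
    head, marker, tail = invoice.partition("u1stub_")
    if not marker or not head.startswith("lnbcrt") or len(tail) != 16:
        raise ValueError("not a stub invoice")
    return tail
```

Only a prefix survives in the invoice string, so the full 64-character hash cannot be recovered from it. That is consistent with the commit exposing payment hashes directly in stub mode.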

State Machine

awaiting_eval_payment → [pay eval] → awaiting_work_payment → [pay work] → complete
    ↓
rejected (if agent declines)
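
The transitions can be sketched as a small pure function. The diagram does not show exactly where the rejection branch attaches, so this sketch assumes the agent accepts or declines at the moment the eval payment is processed. The `JobState` and `next_state` names are illustrative only.

```python
from enum import Enum


class JobState(str, Enum):
    AWAITING_EVAL_PAYMENT = "awaiting_eval_payment"
    AWAITING_WORK_PAYMENT = "awaiting_work_payment"
    COMPLETE = "complete"
    REJECTED = "rejected"


def next_state(state: JobState, *, paid: bool, accepted: bool = True) -> JobState:
    """Return the state a job moves to after one server-side check.

    paid     -- whether the invoice for the current stage has been paid
    accepted -- whether the evaluating agent accepted the request
                (assumed to matter only at the eval stage)
    """
    if state is JobState.AWAITING_EVAL_PAYMENT and paid:
        return JobState.AWAITING_WORK_PAYMENT if accepted else JobState.REJECTED
    if state is JobState.AWAITING_WORK_PAYMENT and paid:
        return JobState.COMPLETE
    # Unpaid stages and terminal states (complete, rejected) do not move.
    return state
```

Keeping the function free of I/O makes it easy to check every branch of the diagram in isolation.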

Pricing Structure

| Fee Type | Amount |
|---|---|
| Eval fee | 10 sats (fixed) |
| Work fee (short) | 50 sats |
| Work fee (medium) | 100 sats |
| Work fee (long) | 250 sats |
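
As a quick worked example of the totals a tester should expect to pay per job, the sketch below adds the fixed eval fee to the tiered work fee. How the server classifies a request as short, medium, or long is not stated in this extraction, so the tier is passed in explicitly.

```python
EVAL_FEE_SATS = 10
WORK_FEE_SATS = {"short": 50, "medium": 100, "long": 250}


def total_cost_sats(tier: str) -> int:
    """Total sats for a completed job: fixed eval fee plus the tier's work fee."""
    if tier not in WORK_FEE_SATS:
        raise ValueError(f"unknown tier: {tier!r}")
    return EVAL_FEE_SATS + WORK_FEE_SATS[tier]
```

A completed job therefore costs 60, 110, or 260 sats in total. A rejected job costs only the 10 sat eval fee, assuming rejection happens before the work invoice is paid.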

AI Models

| Phase | Model |
|---|---|
| Evaluation | `claude-haiku-4-5` (fast/cheap judgment) |
| Work Delivery | `claude-sonnet-4-6` (full capability) |

HTTP Status Codes

| Code | Meaning |
|---|---|
| 200 | OK |
| 201 | Created |
| 400 | Bad Request |
| 404 | Not Found |
| 429 | Rate Limited |

----

4. ✅ Test Requirements & Criteria

Test Case Summary (10 Tests)

| ID | Test Name | Purpose |
|---|---|---|
| T-01 | Health Check | Verify API availability |
| T-02 | Create Job | Submit request, receive eval invoice |
| T-03 | Poll Before Payment | Verify `awaiting_eval_payment` state |
| T-04 | Pay Eval Invoice | Simulate payment (stub mode) |
| T-05 | Poll After Eval Payment | Verify state machine advance |
| T-06 | Pay Work + Get Result | Complete workflow, receive AI result |
| T-07 | Free Demo Endpoint | Test free query endpoint |
| T-08 | Input Validation | Verify error handling |
| T-09 | Demo Rate Limiter | Test 5 req/IP limit |
| T-10 | Rejection Path | Verify harmful request rejection |

Pass/Fail Criteria by Test

| Test | Pass Criteria |
|---|---|
| T-01 | HTTP 200 + `{"status":"ok"}` |
| T-02 | HTTP 201 + `jobId` present + `evalInvoice.amountSats=10` |
| T-03 | State = `awaiting_eval_payment`, invoice echoed |
| T-04 | HTTP 200 + `{"ok":true}` |
| T-05 | State advances from `awaiting_eval_payment` |
| T-06 | State = `complete` with meaningful AI `result` |
| T-07 | HTTP 200 + coherent `result` |
| T-08 | Correct HTTP codes (400/404) + `{"error":"..."}` format |
| T-09 | Request 6 returns HTTP 429 |
| T-10 | Final state = `rejected` with `reason` |
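
Several of these criteria are mechanical enough to express as predicates over an HTTP status code and a parsed JSON body. The helpers below are a sketch of how an automated script might encode them. The function names are hypothetical, and subjective criteria such as "meaningful" or "coherent" results are left to a human reviewer.

```python
def check_health(status: int, body: dict) -> bool:
    """T-01: HTTP 200 and a body with status "ok"."""
    return status == 200 and body.get("status") == "ok"


def check_create_job(status: int, body: dict) -> bool:
    """T-02: HTTP 201, a jobId, and a 10 sat eval invoice."""
    invoice = body.get("evalInvoice") or {}
    return status == 201 and bool(body.get("jobId")) and invoice.get("amountSats") == 10


def check_stub_pay(status: int, body: dict) -> bool:
    """T-04: HTTP 200 and {"ok": true}."""
    return status == 200 and body.get("ok") is True


def check_error(status: int, body: dict, expected_status: int) -> bool:
    """T-08: expected error code and an {"error": "..."} body."""
    return status == expected_status and isinstance(body.get("error"), str)


def check_rejected(body: dict) -> bool:
    """T-10: final state is rejected and a reason is present."""
    return body.get("state") == "rejected" and bool(body.get("reason"))
```

For example, `check_error(404, {"error": "Job not found"}, 404)` evaluates to `True`, while a 200 response with the same body would fail the check.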

Error Scenarios

| Scenario | HTTP | Response |
|---|---|---|
| Missing request body | 400 | `Invalid request: 'request' string is required` |
| Unknown job ID | 404 | `Job not found` |
| Missing demo param | 400 | `Missing required query param: request` |
| Rate limit exceeded | 429 | `{"error":"..."}` |

Validation Rules

• Rate limit: 5 requests per IP on /api/demo
• Payment hash: 64 hex characters for stub endpoint
• Invoice format: lnbcrt<sats>u1stub_<16-char-hash>
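
The first two rules can be sketched in a few lines. The extraction does not state the time window for the rate limit or whether uppercase hex is accepted, so this sketch assumes a simple per-IP counter that never resets and accepts either case. `DemoRateLimiter` is a hypothetical name, not the server's implementation.

```python
import re
from collections import Counter

PAYMENT_HASH_RE = re.compile(r"[0-9a-fA-F]{64}")
DEMO_LIMIT = 5  # requests per IP on /api/demo


def is_valid_payment_hash(value: str) -> bool:
    """True when value is exactly 64 hexadecimal characters."""
    return PAYMENT_HASH_RE.fullmatch(value) is not None


class DemoRateLimiter:
    """Counts demo requests per IP and reports the HTTP status to return."""

    def __init__(self, limit: int = DEMO_LIMIT):
        self.limit = limit
        self.hits = Counter()

    def status_for(self, ip: str) -> int:
        # Count this request, then allow it only while within the limit.
        self.hits[ip] += 1
        return 200 if self.hits[ip] <= self.limit else 429
```

With the default limit, the first five calls to `status_for` for a given IP return 200 and the sixth returns 429, which is the behaviour T-09 checks for.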

----

5. 📝 Report Template

After testing, document results using this structure:

| Test | Pass / Fail / Skip | Notes |
|---|---|---|
| 1 - Health check | | |
| 2 - Create job | | |
| 3 - Poll before payment | | |
| 4 - Pay eval invoice | | |
| 5 - Poll after eval | | |
| 6 - Pay work + get result | | |
| 7 - Demo endpoint | | |
| 8 - Input validation | | |
| 9 - Rate limiter | | |
| 10 - Rejection path | | |

Overall verdict: Pass / Partial / Fail

Sections for additional findings:

• Issues found
• Observations on result quality
• Suggestions for improvement

----

6. 🔑 Key Implementation Notes

1. Stub Mode: No real Lightning node is involved; POST /api/dev/stub/pay/{paymentHash} simulates payments (dev-only)
2. State Transitions: All happen server-side during GET polling; there are no webhooks
3. Payment Hash: The full 64-char hash is stored in the invoices table; the stub invoice string embeds only its first 16 chars, while the stub pay endpoint takes the full 64-char hash
4. Production: Real LNbits integration replaces the stub endpoint
5. Latency: AI responses may take 2-5 seconds
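
Because state changes only surface through GET polling and AI responses can take a few seconds, a test client needs a polling loop with a timeout. The sketch below is one way to write it. `fetch_job` is a hypothetical callable standing in for a GET on /api/jobs/{jobId} that returns the parsed job object, and the interval and timeout defaults are arbitrary.

```python
import time

TERMINAL_STATES = {"complete", "rejected"}


def poll_until(fetch_job, job_id, done, *, interval=1.0, timeout=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch_job(job_id) until done(job) is true, then return the job.

    Raises TimeoutError if the condition is still false once the timeout
    has elapsed. sleep and clock are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if done(job):
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} did not reach the expected state")
        sleep(interval)


def is_terminal(job) -> bool:
    """True once the job has reached complete or rejected."""
    return job.get("state") in TERMINAL_STATES
```

A typical call would be `poll_until(fetch_job, job_id, is_terminal)` after paying the work invoice. A different predicate can wait for the move out of `awaiting_eval_payment` in T-05.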

----

This organized extraction covers all key content from the Timmy API Test Plan, ready for reference, implementation, or testing purposes.