Integrates a new bash script for automated end-to-end testing of the Timmy API. Updates API routes to expose payment hashes in stub mode for easier invoice payment simulation during testing. Modifies test plan documentation to include the new automated script.
Timmy API Test Requirements Analysis
1. Test Case Summary
| Test ID | Test Name | Purpose |
|---|---|---|
| T-01 | Health Check | Verify API is running and responsive |
| T-02 | Create Job | Submit a new job request and receive eval invoice |
| T-03 | Poll Before Payment | Verify job state before eval payment is made |
| T-04 | Pay Eval Invoice | Simulate payment of evaluation fee (stub mode) |
| T-05 | Poll After Eval Payment | Verify state machine advances after eval payment |
| T-06 | Pay Work Invoice & Get Result | Complete full workflow and receive AI result |
| T-07 | Free Demo Endpoint | Test free demo endpoint without payment |
| T-08 | Input Validation | Verify proper error handling for invalid inputs |
| T-09 | Demo Rate Limiter | Test rate limiting on demo endpoint (5 req/IP) |
| T-10 | Rejection Path | Verify agent rejects harmful/illegal requests |
2. Pass/Fail Criteria by Test Case
T-01: Health Check
Pass Criteria:
- HTTP Status: 200 OK
- Response body contains `status` field
- Response value: `{"status":"ok"}`
Fail Criteria:
- HTTP status != 200
- Missing `status` field
- Status value != "ok"
T-02: Create Job
Pass Criteria:
- HTTP Status: 201 Created
- Response contains `jobId` (UUID format)
- Response contains `evalInvoice` object
- `evalInvoice.amountSats` equals exactly 10
- `evalInvoice.paymentRequest` is present (stub format: `lnbcrt10u1stub_...`)
Fail Criteria:
- HTTP status != 201
- Missing `jobId` or invalid format
- `evalInvoice` missing or malformed
- `amountSats` != 10
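The T-02 criteria above can be expressed as a small checker. This is a minimal sketch, not the project's test script: the function name `check_create_job` is hypothetical, and it assumes the caller has already decoded the JSON response body into a `dict`.

```python
import uuid

EVAL_FEE_SATS = 10  # fixed eval fee stated in this document


def check_create_job(status: int, body: dict) -> list[str]:
    """Return a list of T-02 failures; an empty list means the check passed."""
    failures = []
    if status != 201:
        failures.append(f"expected HTTP 201, got {status}")

    # jobId must parse as a UUID (a missing jobId fails here too).
    try:
        uuid.UUID(str(body.get("jobId")))
    except ValueError:
        failures.append("jobId missing or not a UUID")

    invoice = body.get("evalInvoice")
    if not isinstance(invoice, dict):
        failures.append("evalInvoice missing or malformed")
        return failures

    if invoice.get("amountSats") != EVAL_FEE_SATS:
        failures.append("evalInvoice.amountSats != 10")
    if not invoice.get("paymentRequest"):
        failures.append("evalInvoice.paymentRequest missing")
    return failures
```

Returning a list of failures rather than raising on the first one lets a test report show every violated criterion at once.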
T-03: Poll Before Payment
Pass Criteria:
- HTTP Status: 200
- `state` field equals exactly `awaiting_eval_payment`
- `evalInvoice` is echoed back (same as T-02 response)
- `jobId` matches the created job
Fail Criteria:
- State != `awaiting_eval_payment`
- Invoice data missing or modified
- Wrong `jobId` returned
T-04: Pay Eval Invoice (Stub Mode)
Pass Criteria:
- HTTP Status: 200 OK
- Response contains `{"ok":true}`
- Response contains `paymentHash` matching the paid invoice
Fail Criteria:
- HTTP status != 200
- `ok` != true
- Payment hash mismatch
Note: Uses /api/dev/stub/pay/<full-64-char-payment-hash> endpoint (stub mode only)
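Because the stub pay endpoint expects the full 64-character hash, a test helper can reject short hashes before any request is sent. The sketch below is illustrative only: `stub_pay_url` and the `base_url` parameter are hypothetical names, and the base URL passed in is whatever host the API under test runs on.

```python
import re

HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def stub_pay_url(base_url: str, payment_hash: str) -> str:
    """Build the stub pay URL, refusing anything but a full 64-hex-char hash."""
    if not HEX64.fullmatch(payment_hash):
        raise ValueError("payment hash must be exactly 64 hexadecimal characters")
    return f"{base_url.rstrip('/')}/api/dev/stub/pay/{payment_hash}"
```

Failing early here makes a truncated hash (for example, the 16-character suffix of a stub `paymentRequest`) show up as a clear test-side error instead of an unexplained API failure.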
T-05: Poll After Eval Payment (State Advance)
Pass Criteria:
- HTTP Status: 200
- State has advanced from `awaiting_eval_payment`
- Agent judgment is present in response
Accepted Path:
- State equals `awaiting_work_payment`
- `workInvoice` object present
- `workInvoice.amountSats` is one of: 50, 100, or 250
- `workInvoice.paymentRequest` present (stub format)
Rejected Path:
- State equals `rejected`
- `reason` field present with explanation
Fail Criteria:
- State unchanged (still `awaiting_eval_payment`)
- State is invalid/unknown
- Missing required fields for current state
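The two T-05 branches can be checked with one function that classifies the poll response. This is a sketch under the assumption that the response body is already parsed into a `dict`; `check_post_eval_poll` is a hypothetical name, not part of the API.

```python
WORK_FEES_SATS = {50, 100, 250}  # allowed work fees listed in this document


def check_post_eval_poll(body: dict) -> str:
    """Return 'accepted' or 'rejected'; raise AssertionError on any T-05 failure."""
    state = body.get("state")
    if state == "awaiting_work_payment":
        invoice = body.get("workInvoice")
        if not isinstance(invoice, dict):
            raise AssertionError("workInvoice missing or malformed")
        if invoice.get("amountSats") not in WORK_FEES_SATS:
            raise AssertionError("workInvoice.amountSats must be 50, 100, or 250")
        if not invoice.get("paymentRequest"):
            raise AssertionError("workInvoice.paymentRequest missing")
        return "accepted"
    if state == "rejected":
        if not body.get("reason"):
            raise AssertionError("rejected job must carry a reason")
        return "rejected"
    # Covers both "still awaiting_eval_payment" and unknown states.
    raise AssertionError(f"state did not advance correctly: {state!r}")
```

The return value tells the caller whether T-06 applies: only the `'accepted'` branch has a work invoice to pay.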
T-06: Pay Work Invoice & Get Result
Pass Criteria:
- HTTP Status: 200
- Final state equals `complete`
- `result` field present with AI-generated content
- Result is meaningful and answers the original request
Fail Criteria:
- State != `complete`
- Missing or empty `result`
- Result is incoherent or unrelated to request
Timing Note: AI response may take 2-5 seconds after payment
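Since the result may lag the payment by a few seconds, a test would typically poll rather than read the job once. The sketch below assumes a caller-supplied `fetch_job` callable that performs the `GET /api/jobs/{id}` request and returns the parsed body; the function name, the 10-second timeout, and the 0.5-second interval are illustrative choices, not values from the test plan.

```python
import time
from typing import Callable


def poll_until_terminal(
    fetch_job: Callable[[], dict],
    timeout_s: float = 10.0,
    interval_s: float = 0.5,
) -> dict:
    """Call fetch_job() until the job reaches a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job.get("state") in ("complete", "rejected"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {job.get('state')!r}")
        time.sleep(interval_s)


# Usage with a fake fetcher standing in for the real HTTP call.
responses = iter([
    {"state": "awaiting_work_payment"},
    {"state": "complete", "result": "example result"},
])
final = poll_until_terminal(lambda: next(responses), interval_s=0.01)
print(final["state"])  # complete
```

Injecting `fetch_job` keeps the polling logic independent of any HTTP client, so it can be exercised without a running server.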
T-07: Free Demo Endpoint
Pass Criteria:
- HTTP Status: 200 OK
- Response contains `result` field
- Result is coherent and answers the query
Fail Criteria:
- HTTP status != 200
- Missing `result` field
- Result is incoherent or nonsensical
T-08: Input Validation
Pass Criteria (all three scenarios):
| Scenario | HTTP Status | Expected Response |
|---|---|---|
| Missing request body | 400 | {"error":"Invalid request: 'request' string is required"} |
| Unknown job ID | 404 | {"error":"Job not found"} |
| Demo without param | 400 | {"error":"Missing required query param: request"} |
Fail Criteria:
- Incorrect HTTP status codes
- Error message format != `{"error": string}`
- Generic error messages without specifics
T-09: Demo Rate Limiter
Pass Criteria:
- Requests 1-5: Return HTTP 200 with `result` field
- Request 6+: Return HTTP 429 (Too Many Requests) with `error` field
- Rate limit applies per IP address
Fail Criteria:
- Rate limit triggers before request 6
- Rate limit does not trigger by request 6
- Rate limit affects different IPs
T-10: Rejection Path (Adversarial Request)
Pass Criteria:
- Job created successfully (T-02 passes)
- Eval payment processed (T-04 passes)
- Final state equals `rejected`
- `reason` field present explaining rejection
- State does NOT become `awaiting_work_payment`
Fail Criteria:
- State becomes `awaiting_work_payment` (agent failed to reject)
- State becomes `complete` with harmful content
- Missing `reason` field
3. Expected Results Summary
Normal Flow Results
| Step | Endpoint | Method | Expected Response |
|---|---|---|---|
| Health | `/api/healthz` | GET | {"status":"ok"} |
| Create | `/api/jobs` | POST | Job object with eval invoice |
| Poll Pre-Pay | `/api/jobs/{id}` | GET | State: awaiting_eval_payment |
| Pay Eval | `/api/dev/stub/pay/{hash}` | POST | {"ok":true,"paymentHash":"..."} |
| Poll Post-Eval | `/api/jobs/{id}` | GET | State: awaiting_work_payment OR rejected |
| Pay Work | `/api/dev/stub/pay/{hash}` | POST | {"ok":true,"paymentHash":"..."} |
| Get Result | `/api/jobs/{id}` | GET | State: complete, result present |
| Demo | `/api/demo?request={q}` | GET | {"result":"..."} |
Error Response Results
| Error Type | HTTP Status | Response Body |
|---|---|---|
| Missing request field | 400 | {"error":"Invalid request: 'request' string is required"} |
| Job not found | 404 | {"error":"Job not found"} |
| Missing query param | 400 | {"error":"Missing required query param: request"} |
| Rate limit exceeded | 429 | {"error":...} |
4. Error Scenarios
4.1 Input Validation Errors
| Scenario | Trigger | HTTP Code | Response Pattern |
|---|---|---|---|
| Empty request body | `POST /api/jobs` with `{}` | 400 | {error: "Invalid request: 'request' string is required"} |
| Missing request field | `POST /api/jobs` without `request` | 400 | Same as above |
| Invalid job ID format | `GET /api/jobs/invalid-id` | 404 | {error: "Job not found"} |
| Non-existent job UUID | `GET /api/jobs/{random-uuid}` | 404 | {error: "Job not found"} |
| Missing demo param | `GET /api/demo` (no query) | 400 | {error: "Missing required query param: request"} |
4.2 Rate Limiting Errors
| Scenario | Trigger | HTTP Code | Response |
|---|---|---|---|
| Demo rate limit exceeded | 6+ requests from same IP to `/api/demo` | 429 | {error: ...} |
| Rate limit window | Per IP address | 429 | Triggered on 6th request |
4.3 Business Logic Errors (Rejection)
| Scenario | Trigger | State | Response |
|---|---|---|---|
| Harmful request rejected | Request containing harmful/illegal content | `rejected` | {jobId, state: "rejected", reason: "..."} |
4.4 Error Response Format Standard
All errors MUST follow this format:
{
"error": "string description of the error"
}
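A test can enforce this standard with a one-line predicate. The sketch below assumes the body has already been parsed from JSON; `is_standard_error` is a hypothetical helper, and tolerating extra keys alongside `error` is an assumption, since the document does not say whether they are allowed.

```python
def is_standard_error(body: object) -> bool:
    """True if body is a JSON object whose "error" value is a non-empty string."""
    return (
        isinstance(body, dict)
        and isinstance(body.get("error"), str)
        and body["error"].strip() != ""
    )


print(is_standard_error({"error": "Job not found"}))  # True
print(is_standard_error({"message": "Job not found"}))  # False
```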
5. State Transitions (Job State Machine)
5.1 State Diagram
                    [Create Job]
                          |
                          v
             +-------------------------+
             |  awaiting_eval_payment  |<--------------+
             +-------------------------+               |
                          |                            |
                 [Pay eval invoice]                    |
                          |                            |
            +-------------+-------------+              |
            |                           |              |
            v                           v              |
+-----------------------+   +-----------------------+  |
| awaiting_work_payment |   |       rejected        |--+
+-----------------------+   +-----------------------+  |
            |                                          |
   [Pay work invoice]                                  |
            |                                          |
            v                                          |
+-----------------------+                              |
|       complete        |------------------------------+
+-----------------------+
5.2 State Definitions
| State | Description | Entry Trigger | Exit Trigger |
|---|---|---|---|
| `awaiting_eval_payment` | Job created, waiting for eval fee payment | Job creation (POST /api/jobs) | Eval invoice paid |
| `awaiting_work_payment` | Eval paid, agent accepted, waiting for work fee | Eval payment detected on poll | Work invoice paid |
| `rejected` | Eval paid, agent rejected request | Eval payment detected + agent rejection | Terminal state |
| `complete` | Work paid, AI result delivered | Work payment detected + AI response | Terminal state |
5.3 State Transition Rules
- Initial State: All jobs start at `awaiting_eval_payment`
- Auto-Advance: States advance automatically on GET poll (no webhook)
- Terminal States: `rejected` and `complete` are final (no further transitions)
- Branching: After eval payment, state branches to either `awaiting_work_payment` (accepted) or `rejected` (declined)
- Deterministic: Same request + payment = same state transitions
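The rules above amount to a small transition table, which a test can use to validate the sequence of states observed across polls. This is a sketch built only from the state definitions in this section; `TRANSITIONS` and `is_valid_path` are hypothetical names.

```python
TRANSITIONS = {
    "awaiting_eval_payment": {"awaiting_work_payment", "rejected"},
    "awaiting_work_payment": {"complete"},
    "rejected": set(),  # terminal
    "complete": set(),  # terminal
}


def is_valid_path(states: list[str]) -> bool:
    """Check that a sequence of polled states follows the transition rules."""
    if not states or states[0] != "awaiting_eval_payment":
        return False
    for current, nxt in zip(states, states[1:]):
        # Repeated polls may observe the same state; that is not a transition.
        if nxt != current and nxt not in TRANSITIONS.get(current, set()):
            return False
    return True


print(is_valid_path(["awaiting_eval_payment", "awaiting_work_payment", "complete"]))  # True
print(is_valid_path(["awaiting_eval_payment", "complete"]))  # False
```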
5.4 State-Dependent Response Fields
| State | Required Fields | Optional Fields |
|---|---|---|
| `awaiting_eval_payment` | `jobId`, `state`, `evalInvoice` | - |
| `awaiting_work_payment` | `jobId`, `state`, `workInvoice` | - |
| `rejected` | `jobId`, `state`, `reason` | - |
| `complete` | `jobId`, `state`, `result` | - |
6. Validation Rules
6.1 Input Validation Requirements
POST /api/jobs
| Field | Type | Required | Validation Rules |
|---|---|---|---|
| `request` | string | YES | Non-empty string, max length TBD |
Validation Errors:
- Missing field: `400 Bad Request`
- Empty string: `400 Bad Request`
- Wrong type: `400 Bad Request`
GET /api/jobs/{jobId}
| Parameter | Type | Required | Validation Rules |
|---|---|---|---|
| `jobId` | UUID string | YES (in path) | Valid UUID format, exists in system |
Validation Errors:
- Invalid UUID format: `404 Not Found`
- Non-existent UUID: `404 Not Found`
GET /api/demo
| Parameter | Type | Required | Validation Rules |
|---|---|---|---|
| `request` | string | YES (query) | Non-empty string |
Validation Errors:
- Missing param: `400 Bad Request`
- Empty value: `400 Bad Request`
6.2 Invoice Validation (Stub Mode)
| Field | Format | Example |
|---|---|---|
| `paymentRequest` (eval) | `lnbcrt10u1stub_<16-char-hash>` | `lnbcrt10u1stub_a1b2c3d4e5f6g7h8` |
| `paymentRequest` (work) | `lnbcrt<sats>u1stub_<16-char-hash>` | `lnbcrt50u1stub_...` or `lnbcrt100u1stub_...` or `lnbcrt250u1stub_...` |
| `paymentHash` (full) | 64 hexadecimal characters | Required for stub pay endpoint |
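The stub `paymentRequest` format can be checked with a regular expression that also extracts the amount. This is a sketch based only on the formats in the table: `parse_stub_invoice` is a hypothetical helper, and the character class for the 16-character suffix is an assumption (alphanumeric, because the example above contains `g` and `h`, which are not hexadecimal digits).

```python
import re

# Assumption: the 16-character suffix is alphanumeric, as in the table's example.
STUB_INVOICE = re.compile(r"lnbcrt(\d+)u1stub_([0-9A-Za-z]{16})")


def parse_stub_invoice(payment_request: str) -> tuple[int, str]:
    """Return (amount_sats, suffix) parsed from a stub paymentRequest string."""
    match = STUB_INVOICE.fullmatch(payment_request)
    if match is None:
        raise ValueError(f"not a stub invoice: {payment_request!r}")
    return int(match.group(1)), match.group(2)


print(parse_stub_invoice("lnbcrt10u1stub_a1b2c3d4e5f6g7h8"))  # (10, 'a1b2c3d4e5f6g7h8')
```

Comparing the parsed amount with the invoice's `amountSats` field gives a cheap consistency check between the two representations.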
6.3 Pricing Validation Rules
| Fee Type | Amount (sats) | Determination |
|---|---|---|
| Eval fee | 10 | Fixed for all requests |
| Work fee - Short | 50 | Request length < threshold |
| Work fee - Medium | 100 | Request length in middle range |
| Work fee - Long | 250 | Request length > threshold |
6.4 Rate Limiting Rules
| Endpoint | Limit | Window | Scope |
|---|---|---|---|
| `/api/demo` | 5 requests | Per IP | IP-based |
Behavior:
- Count starts at 1 for first request
- Requests 1-5: Allowed (200 OK)
- Request 6+: Blocked (429 Too Many Requests)
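The expected behavior can be modeled with a per-IP counter, which is handy as a reference when asserting on real responses. This is a toy model of the rule as written here, not the server's implementation: `DemoRateLimiter` is a hypothetical name, the IP addresses are documentation examples, and no window reset is modeled because the document does not specify one.

```python
from collections import defaultdict

DEMO_LIMIT = 5  # requests allowed per IP, per this document


class DemoRateLimiter:
    """Toy per-IP counter modeling the expected status codes for /api/demo."""

    def __init__(self, limit: int = DEMO_LIMIT) -> None:
        self.limit = limit
        self.counts = defaultdict(int)

    def status_for(self, ip: str) -> int:
        self.counts[ip] += 1  # count starts at 1 for the first request
        return 200 if self.counts[ip] <= self.limit else 429


limiter = DemoRateLimiter()
statuses = [limiter.status_for("203.0.113.7") for _ in range(6)]
print(statuses)  # [200, 200, 200, 200, 200, 429]
print(limiter.status_for("198.51.100.9"))  # 200, a different IP has its own count
```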
6.5 AI Model Configuration
| Phase | Model | Purpose |
|---|---|---|
| Eval | `claude-haiku-4-5` | Fast/cheap judgment |
| Work | `claude-sonnet-4-6` | Full capability response |
7. Test Execution Checklist
Prerequisites
- API base URL is accessible
- Stub mode is enabled (`/api/dev/stub/pay` available)
- Database access (optional, for payment hash lookup)
Execution Order
- T-01: Verify API is up
- T-02: Create test job, save `jobId` and `evalInvoice`
- T-03: Poll to verify initial state
- T-04: Pay eval invoice (extract hash from paymentRequest)
- T-05: Poll to verify state advance
- T-06: If accepted, pay work invoice and get result
- T-07: Test demo endpoint
- T-08: Test all validation scenarios
- T-09: Test rate limiter (6 rapid requests)
- T-10: Create adversarial job, verify rejection
Data Dependencies
- T-03, T-05, T-06 require `jobId` from T-02
- T-04 requires payment hash from T-02's `evalInvoice`
- T-06 requires payment hash from T-05's `workInvoice` (if accepted)
- T-10 is independent (creates its own job)
8. Summary Table: All Requirements
| Category | Count | Details |
|---|---|---|
| Total Test Cases | 10 | T-01 through T-10 |
| API Endpoints Tested | 4 | /healthz, /jobs, /dev/stub/pay, /demo |
| HTTP Methods | 2 | GET, POST |
| States | 4 | awaiting_eval_payment, awaiting_work_payment, rejected, complete |
| Error Scenarios | 6 | Missing body, invalid job, missing param, rate limit, rejection, not found |
| Validation Rules | 12+ | Input type, format, presence, rate limits |
| Pricing Tiers | 4 | 10 sats (eval), 50/100/250 sats (work) |
Analysis generated from Timmy API Test Plan & Report Prompt