timmy-tower/attached_assets/timmy_api_test_requirements_1773854936781.md
alexpaynex 53bc93a9b4 Add automated testing script and expose payment hashes
Integrates a new bash script for automated end-to-end testing of the Timmy API. Updates API routes to expose payment hashes in stub mode for easier invoice payment simulation during testing. Modifies test plan documentation to include the new automated script.

2026-03-18 17:30:13 +00:00


# Timmy API Test Requirements Analysis
## 1. Test Case Summary
| Test ID | Test Name | Purpose |
|---------|-----------|---------|
| T-01 | Health Check | Verify API is running and responsive |
| T-02 | Create Job | Submit a new job request and receive eval invoice |
| T-03 | Poll Before Payment | Verify job state before eval payment is made |
| T-04 | Pay Eval Invoice | Simulate payment of evaluation fee (stub mode) |
| T-05 | Poll After Eval Payment | Verify state machine advances after eval payment |
| T-06 | Pay Work Invoice & Get Result | Complete full workflow and receive AI result |
| T-07 | Free Demo Endpoint | Test free demo endpoint without payment |
| T-08 | Input Validation | Verify proper error handling for invalid inputs |
| T-09 | Demo Rate Limiter | Test rate limiting on demo endpoint (5 req/IP) |
| T-10 | Rejection Path | Verify agent rejects harmful/illegal requests |
---
## 2. Pass/Fail Criteria by Test Case
### T-01: Health Check
**Pass Criteria:**

- HTTP Status: 200 OK
- Response body contains `status` field
- Response value: `{"status":"ok"}`

**Fail Criteria:**

- HTTP status != 200
- Missing `status` field
- Status value != "ok"
---
### T-02: Create Job
**Pass Criteria:**

- HTTP Status: 201 Created
- Response contains `jobId` (UUID format)
- Response contains `evalInvoice` object
- `evalInvoice.amountSats` equals exactly 10
- `evalInvoice.paymentRequest` is present (stub format: `lnbcrt10u1stub_...`)

**Fail Criteria:**

- HTTP status != 201
- Missing `jobId` or invalid format
- `evalInvoice` missing or malformed
- `amountSats` != 10
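
The T-02 criteria above can be expressed as a small checker. This is a minimal sketch, not the project's test script: the function name `check_create_job` is hypothetical, and it assumes the response body has already been parsed into a `dict`. Note that `uuid.UUID` also accepts a few non-canonical spellings (for example, without hyphens), so a stricter format check may be wanted.

```python
import uuid

EVAL_FEE_SATS = 10
EVAL_INVOICE_PREFIX = "lnbcrt10u1stub_"  # stub-mode format from the pass criteria


def check_create_job(status_code, body):
    """Return a list of T-02 failures; an empty list means the check passed."""
    failures = []
    if status_code != 201:
        failures.append(f"expected HTTP 201, got {status_code}")

    # jobId must parse as a UUID.
    try:
        uuid.UUID(str(body.get("jobId")))
    except ValueError:
        failures.append("jobId is missing or not a UUID")

    invoice = body.get("evalInvoice")
    if not isinstance(invoice, dict):
        failures.append("evalInvoice is missing or malformed")
        return failures

    if invoice.get("amountSats") != EVAL_FEE_SATS:
        failures.append("evalInvoice.amountSats must equal 10")

    payment_request = invoice.get("paymentRequest")
    if not (isinstance(payment_request, str)
            and payment_request.startswith(EVAL_INVOICE_PREFIX)):
        failures.append("evalInvoice.paymentRequest is missing or not in stub format")
    return failures
```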
---
### T-03: Poll Before Payment
**Pass Criteria:**

- HTTP Status: 200
- `state` field equals exactly `awaiting_eval_payment`
- `evalInvoice` is echoed back (same as T-02 response)
- `jobId` matches the created job

**Fail Criteria:**

- State != `awaiting_eval_payment`
- Invoice data missing or modified
- Wrong jobId returned
---
### T-04: Pay Eval Invoice (Stub Mode)
**Pass Criteria:**

- HTTP Status: 200 OK
- Response contains `{"ok":true}`
- Response contains `paymentHash` matching the paid invoice

**Fail Criteria:**

- HTTP status != 200
- `ok` != true
- Payment hash mismatch

**Note:** Uses the `POST /api/dev/stub/pay/<full-64-char-payment-hash>` endpoint, which is available in stub mode only.

---
### T-05: Poll After Eval Payment (State Advance)
**Pass Criteria:**

- HTTP Status: 200
- State has advanced from `awaiting_eval_payment`
- Agent judgment is reflected in the response (one of the two paths below)

**Accepted Path:**

- State equals `awaiting_work_payment`
- `workInvoice` object present
- `workInvoice.amountSats` is one of: 50, 100, or 250
- `workInvoice.paymentRequest` present (stub format)

**Rejected Path:**

- State equals `rejected`
- `reason` field present with explanation

**Fail Criteria:**

- State unchanged (still `awaiting_eval_payment`)
- State is invalid/unknown
- Missing required fields for current state
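
The two T-05 branches can be checked with one function that reports which path the job took. This is an illustrative sketch only: `check_post_eval` is a hypothetical name, and the body is assumed to be the parsed JSON from `GET /api/jobs/{id}`.

```python
WORK_FEE_TIERS = {50, 100, 250}  # allowed work invoice amounts, in sats


def check_post_eval(body):
    """Return "accepted" or "rejected"; raise ValueError on any T-05 failure."""
    state = body.get("state")
    if state == "awaiting_eval_payment":
        raise ValueError("state did not advance after eval payment")

    if state == "awaiting_work_payment":
        invoice = body.get("workInvoice")
        if not isinstance(invoice, dict):
            raise ValueError("workInvoice missing on accepted path")
        if invoice.get("amountSats") not in WORK_FEE_TIERS:
            raise ValueError("workInvoice.amountSats must be 50, 100, or 250")
        if not invoice.get("paymentRequest"):
            raise ValueError("workInvoice.paymentRequest missing")
        return "accepted"

    if state == "rejected":
        if not body.get("reason"):
            raise ValueError("reason missing on rejected path")
        return "rejected"

    raise ValueError(f"invalid or unknown state: {state!r}")
```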
---
### T-06: Pay Work Invoice & Get Result
**Pass Criteria:**

- HTTP Status: 200
- Final state equals `complete`
- `result` field present with AI-generated content
- Result is meaningful and answers the original request

**Fail Criteria:**

- State != `complete`
- Missing or empty `result`
- Result is incoherent or unrelated to request

**Timing Note:** The AI response may take 2-5 seconds after payment, so the `complete` state may not appear on the first poll.

---
### T-07: Free Demo Endpoint
**Pass Criteria:**

- HTTP Status: 200 OK
- Response contains `result` field
- Result is coherent and answers the query

**Fail Criteria:**

- HTTP status != 200
- Missing `result` field
- Result is incoherent or nonsensical
---
### T-08: Input Validation
**Pass Criteria (all three scenarios):**

| Scenario | HTTP Status | Expected Response |
|----------|-------------|-------------------|
| Missing request body | 400 | `{"error":"Invalid request: 'request' string is required"}` |
| Unknown job ID | 404 | `{"error":"Job not found"}` |
| Demo without param | 400 | `{"error":"Missing required query param: request"}` |

**Fail Criteria:**

- Incorrect HTTP status codes
- Error message format != `{"error": string}`
- Generic error messages without specifics
---
### T-09: Demo Rate Limiter
**Pass Criteria:**

- Requests 1-5: Return HTTP 200 with `result` field
- Request 6+: Return HTTP 429 (Too Many Requests) with `error` field
- Rate limit applies per IP address

**Fail Criteria:**

- Rate limit triggers before request 6
- Rate limit does not trigger by request 6
- Rate limit triggered by one IP affects requests from a different IP
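
On the test side, T-09 reduces to checking the ordered list of status codes returned to one IP. The sketch below assumes the caller has already collected those codes; `check_rate_limit_sequence` is a hypothetical helper name.

```python
DEMO_LIMIT = 5  # requests allowed per IP before the limiter triggers


def check_rate_limit_sequence(status_codes, limit=DEMO_LIMIT):
    """Check statuses of consecutive demo requests from a single IP.

    Returns None on success, or a message describing the first mismatch.
    """
    if len(status_codes) <= limit:
        raise ValueError("need at least limit + 1 requests to observe the limiter")
    for n, code in enumerate(status_codes, start=1):
        expected = 200 if n <= limit else 429
        if code != expected:
            return f"request {n}: expected {expected}, got {code}"
    return None
```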
---
### T-10: Rejection Path (Adversarial Request)
**Pass Criteria:**

- Job created successfully (T-02 passes)
- Eval payment processed (T-04 passes)
- Final state equals `rejected`
- `reason` field present explaining rejection
- State does NOT become `awaiting_work_payment`

**Fail Criteria:**

- State becomes `awaiting_work_payment` (agent failed to reject)
- State becomes `complete` with harmful content
- Missing `reason` field
---
## 3. Expected Results Summary
### Normal Flow Results
| Step | Endpoint | Method | Expected Response |
|------|----------|--------|-------------------|
| Health | `/api/healthz` | GET | `{"status":"ok"}` |
| Create | `/api/jobs` | POST | Job object with eval invoice |
| Poll Pre-Pay | `/api/jobs/{id}` | GET | State: `awaiting_eval_payment` |
| Pay Eval | `/api/dev/stub/pay/{hash}` | POST | `{"ok":true,"paymentHash":"..."}` |
| Poll Post-Eval | `/api/jobs/{id}` | GET | State: `awaiting_work_payment` OR `rejected` |
| Pay Work | `/api/dev/stub/pay/{hash}` | POST | `{"ok":true,"paymentHash":"..."}` |
| Get Result | `/api/jobs/{id}` | GET | State: `complete`, `result` present |
| Demo | `/api/demo?request={q}` | GET | `{"result":"..."}` |
### Error Response Results
| Error Type | HTTP Status | Response Body |
|------------|-------------|---------------|
| Missing request field | 400 | `{"error":"Invalid request: 'request' string is required"}` |
| Job not found | 404 | `{"error":"Job not found"}` |
| Missing query param | 400 | `{"error":"Missing required query param: request"}` |
| Rate limit exceeded | 429 | `{"error":...}` |
---
## 4. Error Scenarios
### 4.1 Input Validation Errors
| Scenario | Trigger | HTTP Code | Response Pattern |
|----------|---------|-----------|------------------|
| Empty request body | POST `/api/jobs` with `{}` | 400 | `{"error":"Invalid request: 'request' string is required"}` |
| Missing request field | POST `/api/jobs` without `request` | 400 | Same as above |
| Invalid job ID format | GET `/api/jobs/invalid-id` | 404 | `{"error":"Job not found"}` |
| Non-existent job UUID | GET `/api/jobs/{random-uuid}` | 404 | `{"error":"Job not found"}` |
| Missing demo param | GET `/api/demo` (no query) | 400 | `{"error":"Missing required query param: request"}` |
### 4.2 Rate Limiting Errors
| Scenario | Trigger | HTTP Code | Response |
|----------|---------|-----------|----------|
| Demo rate limit exceeded | 6+ requests from same IP to `/api/demo` | 429 | `{"error":...}` |
| Rate limit scope | Requests are counted per IP address | 429 | Triggered on the 6th request from that IP |
### 4.3 Business Logic Errors (Rejection)
| Scenario | Trigger | State | Response |
|----------|---------|-------|----------|
| Harmful request rejected | Request containing harmful/illegal content | `rejected` | `{"jobId":"...","state":"rejected","reason":"..."}` |
### 4.4 Error Response Format Standard
All errors MUST follow this format:
```json
{
"error": "string description of the error"
}
```
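
A test can enforce this format by parsing the raw response text. The following is a sketch under one stated assumption: extra keys next to `error` are tolerated, since the standard above does not say whether they are allowed. `parse_error_body` is a hypothetical name.

```python
import json


def parse_error_body(raw_text):
    """Return the error message if raw_text follows the error format, else None."""
    try:
        body = json.loads(raw_text)
    except json.JSONDecodeError:
        return None  # not JSON at all
    # "error" must be present and must be a non-empty string.
    if isinstance(body, dict) and isinstance(body.get("error"), str) and body["error"]:
        return body["error"]
    return None
```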
---
## 5. State Transitions (Job State Machine)
### 5.1 State Diagram
```
                [Create Job]
                      |
                      v
          +-----------------------+
          | awaiting_eval_payment |
          +-----------------------+
                      |
             [Pay eval invoice]
                      |
            +---------+----------+
            |                    |
            v                    v
+-----------------------+   +----------+
| awaiting_work_payment |   | rejected |
+-----------------------+   +----------+
            |                (terminal)
   [Pay work invoice]
            |
            v
      +----------+
      | complete |
      +----------+
       (terminal)
```
### 5.2 State Definitions
| State | Description | Entry Trigger | Exit Trigger |
|-------|-------------|---------------|--------------|
| `awaiting_eval_payment` | Job created, waiting for eval fee payment | Job creation (POST /api/jobs) | Eval invoice paid |
| `awaiting_work_payment` | Eval paid, agent accepted, waiting for work fee | Eval payment detected on poll | Work invoice paid |
| `rejected` | Eval paid, agent rejected request | Eval payment detected + agent rejection | Terminal state |
| `complete` | Work paid, AI result delivered | Work payment detected + AI response | Terminal state |
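
Because the transitions in the table are detected on poll, and the T-06 timing note allows a few seconds for the AI response, a test client generally needs a bounded polling loop. The helper below is a sketch: `poll_until`, its default timeout, and the injected `fetch_job` callable (which should return the parsed body of `GET /api/jobs/{id}`) are all assumptions made for illustration.

```python
import time


def poll_until(fetch_job, wanted_states, timeout_s=10.0, interval_s=0.5,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch_job() until its "state" is in wanted_states or time runs out."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_job()
        if job.get("state") in wanted_states:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"state still {job.get('state')!r} after {timeout_s}s")
        sleep(interval_s)
```

For T-06, a call such as `poll_until(fetch, {"complete"})` would wait for the result, while `poll_until(fetch, {"awaiting_work_payment", "rejected"})` would cover both T-05 branches.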
### 5.3 State Transition Rules
1. **Initial State**: All jobs start at `awaiting_eval_payment`
2. **Auto-Advance**: States advance automatically on GET poll (no webhook)
3. **Terminal States**: `rejected` and `complete` are final (no further transitions)
4. **Branching**: After eval payment, state branches to either `awaiting_work_payment` (accepted) or `rejected` (declined)
5. **Deterministic**: The same request and payment sequence must produce the same state transitions
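
Rules 1, 3, and 4 can be encoded as a transition table and checked against the sequence of states a test observed while polling. This is a minimal sketch; `validate_state_history` is a hypothetical name, and the history is assumed to start at job creation.

```python
ALLOWED_TRANSITIONS = {
    "awaiting_eval_payment": {"awaiting_work_payment", "rejected"},
    "awaiting_work_payment": {"complete"},
    "rejected": set(),   # terminal
    "complete": set(),   # terminal
}


def validate_state_history(states):
    """Return True if the polled states form a legal path through the machine.

    Seeing the same state on consecutive polls is allowed.
    """
    if not states or states[0] != "awaiting_eval_payment":
        return False
    for previous, current in zip(states, states[1:]):
        if current == previous:
            continue
        if current not in ALLOWED_TRANSITIONS.get(previous, set()):
            return False
    return True
```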
### 5.4 State-Dependent Response Fields
| State | Required Fields | Optional Fields |
|-------|-----------------|-----------------|
| `awaiting_eval_payment` | `jobId`, `state`, `evalInvoice` | - |
| `awaiting_work_payment` | `jobId`, `state`, `workInvoice` | - |
| `rejected` | `jobId`, `state`, `reason` | - |
| `complete` | `jobId`, `state`, `result` | - |
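
The table above maps directly onto a lookup that a test can apply to any polled job body. A sketch, assuming that a field counts as missing when it is absent, `null`, or an empty string; `missing_fields` is a hypothetical helper.

```python
REQUIRED_FIELDS = {
    "awaiting_eval_payment": ("jobId", "state", "evalInvoice"),
    "awaiting_work_payment": ("jobId", "state", "workInvoice"),
    "rejected": ("jobId", "state", "reason"),
    "complete": ("jobId", "state", "result"),
}


def missing_fields(body):
    """Return the required fields that are absent or empty for the body's state."""
    state = body.get("state")
    if state not in REQUIRED_FIELDS:
        raise ValueError(f"invalid or unknown state: {state!r}")
    return [name for name in REQUIRED_FIELDS[state]
            if body.get(name) is None or body.get(name) == ""]
```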
---
## 6. Validation Rules
### 6.1 Input Validation Requirements
#### POST /api/jobs
| Field | Type | Required | Validation Rules |
|-------|------|----------|------------------|
| `request` | string | YES | Non-empty string, max length TBD |

**Validation Errors:**

- Missing field: `400 Bad Request`
- Empty string: `400 Bad Request`
- Wrong type: `400 Bad Request`
#### GET /api/jobs/{jobId}
| Parameter | Type | Required | Validation Rules |
|-----------|------|----------|------------------|
| `jobId` | UUID string | YES (in path) | Valid UUID format, exists in system |

**Validation Errors:**

- Invalid UUID format: `404 Not Found`
- Non-existent UUID: `404 Not Found`
#### GET /api/demo
| Parameter | Type | Required | Validation Rules |
|-----------|------|----------|------------------|
| `request` | string | YES (query) | Non-empty string |

**Validation Errors:**

- Missing param: `400 Bad Request`
- Empty value: `400 Bad Request`
### 6.2 Invoice Validation (Stub Mode)
| Field | Format | Example |
|-------|--------|---------|
| `paymentRequest` (eval) | `lnbcrt10u1stub_<16-char-hash>` | `lnbcrt10u1stub_a1b2c3d4e5f6g7h8` |
| `paymentRequest` (work) | `lnbcrt<sats>u1stub_<16-char-hash>` | `lnbcrt50u1stub_...` or `lnbcrt100u1stub_...` or `lnbcrt250u1stub_...` |
| `paymentHash` (full) | 64 hexadecimal characters | Required for stub pay endpoint |
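
These formats can be checked with two regular expressions. The sketch below assumes the 16-character suffix is alphanumeric (the example `a1b2c3d4e5f6g7h8` is not strictly hexadecimal, so a hex-only pattern would reject it); the function names are hypothetical.

```python
import re

# lnbcrt<sats>u1stub_<16 chars>; the suffix alphabet is an assumption.
STUB_INVOICE_RE = re.compile(r"lnbcrt(\d+)u1stub_([0-9A-Za-z]{16})")
PAYMENT_HASH_RE = re.compile(r"[0-9a-fA-F]{64}")


def stub_invoice_sats(payment_request):
    """Return the sats amount encoded in a stub paymentRequest, or None."""
    match = STUB_INVOICE_RE.fullmatch(payment_request)
    return int(match.group(1)) if match else None


def is_full_payment_hash(value):
    """True if value is a 64-character hexadecimal payment hash."""
    return PAYMENT_HASH_RE.fullmatch(value) is not None
```

A test could then assert that `stub_invoice_sats(invoice["paymentRequest"]) == invoice["amountSats"]` for both the eval and work invoices.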
### 6.3 Pricing Validation Rules
| Fee Type | Amount (sats) | Determination |
|----------|---------------|---------------|
| Eval fee | 10 | Fixed for all requests |
| Work fee - Short | 50 | Request length < threshold |
| Work fee - Medium | 100 | Request length in middle range |
| Work fee - Long | 250 | Request length > threshold |
### 6.4 Rate Limiting Rules
| Endpoint | Limit | Window | Scope |
|----------|-------|--------|-------|
| `/api/demo` | 5 requests | Not specified | Per IP address |

**Behavior:**

- Count starts at 1 for first request
- Requests 1-5: Allowed (200 OK)
- Request 6+: Blocked (429 Too Many Requests)
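
As a reference model for the behavior listed above, the counting rule can be sketched as an in-memory per-IP counter. This is not the server's implementation: the class name is hypothetical, and because no reset window is specified here, the model never resets its counts.

```python
from collections import defaultdict


class DemoRateLimiter:
    """Toy model: each IP gets `limit` allowed requests, then 429 forever."""

    def __init__(self, limit=5):
        self.limit = limit
        self._counts = defaultdict(int)  # request count per IP address

    def status_for(self, ip):
        """Record one request from `ip` and return the HTTP status it would get."""
        self._counts[ip] += 1  # the first request counts as 1
        return 200 if self._counts[ip] <= self.limit else 429
```

Comparing the real endpoint against this model for two distinct IPs covers both the "triggers on request 6" and the "does not affect other IPs" criteria of T-09.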
### 6.5 AI Model Configuration
| Phase | Model | Purpose |
|-------|-------|---------|
| Eval | `claude-haiku-4-5` | Fast/cheap judgment |
| Work | `claude-sonnet-4-6` | Full capability response |
---
## 7. Test Execution Checklist
### Prerequisites
- [ ] API base URL is accessible
- [ ] Stub mode is enabled (`/api/dev/stub/pay` available)
- [ ] Database access (optional, for payment hash lookup)
### Execution Order
1. T-01: Verify API is up
2. T-02: Create test job, save `jobId` and `evalInvoice`
3. T-03: Poll to verify initial state
4. T-04: Pay eval invoice (needs the full 64-character payment hash; the stub `paymentRequest` embeds only a 16-character hash)
5. T-05: Poll to verify state advance
6. T-06: If accepted, pay work invoice and get result
7. T-07: Test demo endpoint
8. T-08: Test all validation scenarios
9. T-09: Test rate limiter (6 rapid requests)
10. T-10: Create adversarial job, verify rejection
### Data Dependencies
- T-03, T-05, T-06 require `jobId` from T-02
- T-04 requires payment hash from T-02's `evalInvoice`
- T-06 requires payment hash from T-05's `workInvoice` (if accepted)
- T-10 is independent (creates its own job)
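
The ordering and data dependencies above can be sketched as a single driver for T-02 through T-06. Several things here are assumptions made for illustration: `api` is a hypothetical transport callable returning `(status_code, parsed_json)`, the full hash is assumed to be exposed on each invoice object as `paymentHash`, and the sketch polls only once per step rather than retrying while the AI response is pending.

```python
def run_paid_flow(api, request_text):
    """Run T-02 through T-06 in order and return the final job body.

    `api(method, path, body=None)` must return (status_code, parsed_json).
    """
    status, job = api("POST", "/api/jobs", {"request": request_text})  # T-02
    assert status == 201, status
    job_path = f"/api/jobs/{job['jobId']}"

    status, polled = api("GET", job_path)                              # T-03
    assert polled["state"] == "awaiting_eval_payment", polled

    # ASSUMPTION: the invoice carries the full 64-char hash as `paymentHash`.
    eval_hash = job["evalInvoice"]["paymentHash"]
    status, paid = api("POST", f"/api/dev/stub/pay/{eval_hash}")       # T-04
    assert status == 200 and paid["ok"] is True, paid

    status, polled = api("GET", job_path)                              # T-05
    if polled["state"] == "rejected":
        return polled  # T-06 does not apply on the rejected path
    assert polled["state"] == "awaiting_work_payment", polled

    work_hash = polled["workInvoice"]["paymentHash"]  # same assumption as above
    status, paid = api("POST", f"/api/dev/stub/pay/{work_hash}")       # T-06
    assert status == 200 and paid["ok"] is True, paid

    status, final = api("GET", job_path)
    assert final["state"] == "complete" and final["result"], final
    return final
```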
---
## 8. Summary Table: All Requirements
| Category | Count | Details |
|----------|-------|---------|
| Total Test Cases | 10 | T-01 through T-10 |
| API Endpoints Tested | 4 | `/api/healthz`, `/api/jobs`, `/api/dev/stub/pay`, `/api/demo` |
| HTTP Methods | 2 | GET, POST |
| States | 4 | `awaiting_eval_payment`, `awaiting_work_payment`, `rejected`, `complete` |
| Error Scenarios | 6 | Missing body, invalid job, missing param, rate limit, rejection, not found |
| Validation Rules | 12+ | Input type, format, presence, rate limits |
| Pricing Tiers | 4 | 10 sats (eval), 50/100/250 sats (work) |
---
*Analysis generated from Timmy API Test Plan & Report Prompt*