# Timmy API Test Requirements Analysis

## 1. Test Case Summary

| Test ID | Test Name | Purpose |
|---------|-----------|---------|
| T-01 | Health Check | Verify API is running and responsive |
| T-02 | Create Job | Submit a new job request and receive eval invoice |
| T-03 | Poll Before Payment | Verify job state before eval payment is made |
| T-04 | Pay Eval Invoice | Simulate payment of evaluation fee (stub mode) |
| T-05 | Poll After Eval Payment | Verify state machine advances after eval payment |
| T-06 | Pay Work Invoice & Get Result | Complete full workflow and receive AI result |
| T-07 | Free Demo Endpoint | Test free demo endpoint without payment |
| T-08 | Input Validation | Verify proper error handling for invalid inputs |
| T-09 | Demo Rate Limiter | Test rate limiting on demo endpoint (5 req/IP) |
| T-10 | Rejection Path | Verify agent rejects harmful/illegal requests |

---

## 2. Pass/Fail Criteria by Test Case

### T-01: Health Check
**Pass Criteria:**
- HTTP Status: 200 OK
- Response body contains `status` field
- Response body equals `{"status":"ok"}`

**Fail Criteria:**
- HTTP status != 200
- Missing `status` field
- `status` value != `"ok"`

---

### T-02: Create Job
**Pass Criteria:**
- HTTP Status: 201 Created
- Response contains `jobId` (UUID format)
- Response contains `evalInvoice` object
- `evalInvoice.amountSats` equals exactly 10
- `evalInvoice.paymentRequest` is present (stub format: `lnbcrt10u1stub_...`)

**Fail Criteria:**
- HTTP status != 201
- Missing `jobId` or invalid format
- `evalInvoice` missing or malformed
- `amountSats` != 10

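
The T-02 criteria can be expressed as a small response checker. The following is a minimal Python sketch, not the project's actual test code: `check_create_job_response` and `EVAL_FEE_SATS` are hypothetical names, the response body is assumed to be already parsed into a `dict`, and `uuid.UUID` is more lenient than a strict canonical-format check (it also accepts UUIDs written without hyphens). An empty return value means the response passes.

```python
import uuid

EVAL_FEE_SATS = 10  # fixed eval fee stated in the T-02 criteria


def check_create_job_response(status_code: int, body: dict) -> list[str]:
    """Return a list of T-02 failures; an empty list means the response passes."""
    failures = []
    if status_code != 201:
        failures.append(f"expected HTTP 201, got {status_code}")

    # jobId must parse as a UUID.
    try:
        uuid.UUID(str(body.get("jobId")))
    except ValueError:
        failures.append("jobId missing or not a UUID")

    invoice = body.get("evalInvoice")
    if not isinstance(invoice, dict):
        failures.append("evalInvoice missing or malformed")
        return failures

    if invoice.get("amountSats") != EVAL_FEE_SATS:
        failures.append("evalInvoice.amountSats must equal 10")
    if not str(invoice.get("paymentRequest", "")).startswith("lnbcrt10u1stub_"):
        failures.append("evalInvoice.paymentRequest missing or not in stub format")
    return failures
```
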
---

### T-03: Poll Before Payment
**Pass Criteria:**
- HTTP Status: 200 OK
- `state` field equals exactly `awaiting_eval_payment`
- `evalInvoice` is echoed back (same as T-02 response)
- `jobId` matches the created job

**Fail Criteria:**
- State != `awaiting_eval_payment`
- Invoice data missing or modified
- Wrong jobId returned

---

### T-04: Pay Eval Invoice (Stub Mode)
**Pass Criteria:**
- HTTP Status: 200 OK
- Response contains `{"ok":true}`
- Response contains `paymentHash` matching the paid invoice

**Fail Criteria:**
- HTTP status != 200
- `ok` != true
- Payment hash mismatch

**Note:** This test uses the `/api/dev/stub/pay/<full-64-char-payment-hash>` endpoint, which is available in stub mode only.

---

### T-05: Poll After Eval Payment (State Advance)
**Pass Criteria:**
- HTTP Status: 200 OK
- State has advanced from `awaiting_eval_payment`
- Agent judgment is present in response

**Accepted Path:**
- State equals `awaiting_work_payment`
- `workInvoice` object present
- `workInvoice.amountSats` is one of: 50, 100, or 250
- `workInvoice.paymentRequest` present (stub format)

**Rejected Path:**
- State equals `rejected`
- `reason` field present with explanation

**Fail Criteria:**
- State unchanged (still `awaiting_eval_payment`)
- State is invalid/unknown
- Missing required fields for current state

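
The two acceptable T-05 outcomes can be checked with one branching function. This is an illustrative Python sketch under stated assumptions: `check_post_eval_poll` and `WORK_FEE_TIERS_SATS` are hypothetical names, and the poll response is assumed to be already parsed into a `dict`. It covers the accepted path, the rejected path, and the fail criteria listed above.

```python
WORK_FEE_TIERS_SATS = {50, 100, 250}  # work fee tiers named in the accepted path


def check_post_eval_poll(body: dict) -> list[str]:
    """Return a list of T-05 failures; an empty list means the poll passes."""
    failures = []
    state = body.get("state")
    if state == "awaiting_work_payment":
        # Accepted path: a work invoice with a known fee tier is required.
        invoice = body.get("workInvoice")
        if not isinstance(invoice, dict):
            failures.append("workInvoice missing")
        else:
            if invoice.get("amountSats") not in WORK_FEE_TIERS_SATS:
                failures.append("workInvoice.amountSats must be 50, 100, or 250")
            if not invoice.get("paymentRequest"):
                failures.append("workInvoice.paymentRequest missing")
    elif state == "rejected":
        # Rejected path: the agent must explain why.
        if not body.get("reason"):
            failures.append("reason missing")
    elif state == "awaiting_eval_payment":
        failures.append("state did not advance")
    else:
        failures.append(f"unexpected state: {state!r}")
    return failures
```
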
---

### T-06: Pay Work Invoice & Get Result
**Pass Criteria:**
- HTTP Status: 200 OK
- Final state equals `complete`
- `result` field present with AI-generated content
- Result is meaningful and answers the original request

**Fail Criteria:**
- State != `complete`
- Missing or empty `result`
- Result is incoherent or unrelated to request

**Timing Note:** The AI response may take 2-5 seconds to appear after payment, so the job may need to be polled more than once.

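
Because the result is not immediate, a test typically polls until the job reaches a terminal state. The following Python sketch shows one way to do that; `poll_until_terminal` is a hypothetical helper, `fetch_job` stands in for whatever function performs `GET /api/jobs/{id}` and returns the parsed body, and the 15-second timeout and 0.5-second interval are assumptions chosen to sit comfortably above the 2-5 second window. The `sleep` parameter is injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def poll_until_terminal(
    fetch_job: Callable[[], dict],
    timeout_s: float = 15.0,   # assumed budget, not from the source
    interval_s: float = 0.5,   # assumed poll interval, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_job() until the job is `complete` or `rejected`, or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job.get("state") in ("complete", "rejected"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job still in state {job.get('state')!r} after {timeout_s}s"
            )
        sleep(interval_s)
```
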
---

### T-07: Free Demo Endpoint
**Pass Criteria:**
- HTTP Status: 200 OK
- Response contains `result` field
- Result is coherent and answers the query

**Fail Criteria:**
- HTTP status != 200
- Missing `result` field
- Result is incoherent or nonsensical

---

### T-08: Input Validation
**Pass Criteria (all three scenarios):**

| Scenario | HTTP Status | Expected Response |
|----------|-------------|-------------------|
| Missing request body | 400 | `{"error":"Invalid request: 'request' string is required"}` |
| Unknown job ID | 404 | `{"error":"Job not found"}` |
| Demo without param | 400 | `{"error":"Missing required query param: request"}` |

**Fail Criteria:**
- Incorrect HTTP status codes
- Error message format != `{"error": string}`
- Generic error messages without specifics

---

### T-09: Demo Rate Limiter
**Pass Criteria:**
- Requests 1-5: Return HTTP 200 with `result` field
- Request 6+: Return HTTP 429 (Too Many Requests) with `error` field
- Rate limit applies per IP address

**Fail Criteria:**
- Rate limit triggers before request 6
- Rate limit does not trigger by request 6
- Requests from one IP count against another IP's limit

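
Once the status codes of consecutive demo requests from one IP have been recorded, the T-09 criteria reduce to a comparison against the expected sequence. This is a minimal Python sketch: `check_rate_limit_sequence` is a hypothetical name, and the function only inspects status codes that the caller has already collected in request order (it does not send requests or check the `result` and `error` fields).

```python
def check_rate_limit_sequence(status_codes: list[int], limit: int = 5) -> list[str]:
    """Compare observed status codes with the expected 200...200, 429... pattern."""
    failures = []
    for n, code in enumerate(status_codes, start=1):
        expected = 200 if n <= limit else 429
        if code != expected:
            failures.append(f"request {n}: expected {expected}, got {code}")
    # Without at least one request past the limit, the 429 is never observed.
    if len(status_codes) <= limit:
        failures.append(f"need more than {limit} requests to observe the limit")
    return failures
```
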
---

### T-10: Rejection Path (Adversarial Request)
**Pass Criteria:**
- Job created successfully (T-02 passes)
- Eval payment processed (T-04 passes)
- Final state equals `rejected`
- `reason` field present explaining rejection
- State does NOT become `awaiting_work_payment`

**Fail Criteria:**
- State becomes `awaiting_work_payment` (agent failed to reject)
- State becomes `complete` with harmful content
- Missing `reason` field

---

## 3. Expected Results Summary

### Normal Flow Results

| Step | Endpoint | Method | Expected Response |
|------|----------|--------|-------------------|
| Health | `/api/healthz` | GET | `{"status":"ok"}` |
| Create | `/api/jobs` | POST | Job object with eval invoice |
| Poll Pre-Pay | `/api/jobs/{id}` | GET | State: `awaiting_eval_payment` |
| Pay Eval | `/api/dev/stub/pay/{hash}` | POST | `{"ok":true,"paymentHash":"..."}` |
| Poll Post-Eval | `/api/jobs/{id}` | GET | State: `awaiting_work_payment` OR `rejected` |
| Pay Work | `/api/dev/stub/pay/{hash}` | POST | `{"ok":true,"paymentHash":"..."}` |
| Get Result | `/api/jobs/{id}` | GET | State: `complete`, `result` present |
| Demo | `/api/demo?request={q}` | GET | `{"result":"..."}` |

### Error Response Results

| Error Type | HTTP Status | Response Body |
|------------|-------------|---------------|
| Missing request field | 400 | `{"error":"Invalid request: 'request' string is required"}` |
| Job not found | 404 | `{"error":"Job not found"}` |
| Missing query param | 400 | `{"error":"Missing required query param: request"}` |
| Rate limit exceeded | 429 | `{"error":...}` |

---

## 4. Error Scenarios

### 4.1 Input Validation Errors

| Scenario | Trigger | HTTP Code | Response Pattern |
|----------|---------|-----------|------------------|
| Empty request body | POST `/api/jobs` with `{}` | 400 | `{"error":"Invalid request: 'request' string is required"}` |
| Missing request field | POST `/api/jobs` without `request` | 400 | Same as above |
| Invalid job ID format | GET `/api/jobs/invalid-id` | 404 | `{"error":"Job not found"}` |
| Non-existent job UUID | GET `/api/jobs/{random-uuid}` | 404 | `{"error":"Job not found"}` |
| Missing demo param | GET `/api/demo` (no query) | 400 | `{"error":"Missing required query param: request"}` |

### 4.2 Rate Limiting Errors

| Scenario | Trigger | HTTP Code | Response |
|----------|---------|-----------|----------|
| Demo rate limit exceeded | 6+ requests from same IP to `/api/demo` | 429 | `{"error":...}` |
| Rate limit scope | Per IP address | 429 | Triggered on 6th request |

### 4.3 Business Logic Errors (Rejection)

| Scenario | Trigger | State | Response |
|----------|---------|-------|----------|
| Harmful request rejected | Request containing harmful/illegal content | `rejected` | `{"jobId":"...","state":"rejected","reason":"..."}` |

### 4.4 Error Response Format Standard

All errors MUST follow this format:
```json
{
  "error": "string description of the error"
}
```

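
A format check along these lines can be applied to every error response in the suite. This is a minimal Python sketch: `is_standard_error` is a hypothetical name, and treating any extra top-level key or a blank message as a violation is an assumption, since the standard above only shows the single `error` key.

```python
def is_standard_error(body: object) -> bool:
    """True if body is exactly {"error": <non-empty string>}."""
    return (
        isinstance(body, dict)
        and set(body) == {"error"}            # assumption: no extra keys allowed
        and isinstance(body["error"], str)
        and body["error"].strip() != ""       # assumption: blank messages fail
    )
```
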
---

## 5. State Transitions (Job State Machine)

### 5.1 State Diagram

```
                 [Create Job]
                       |
                       v
           +-----------------------+
           | awaiting_eval_payment |
           +-----------------------+
                       |
              [Pay eval invoice]
                       |
            +----------+----------+
            |                     |
            v                     v
+-----------------------+   +----------+
| awaiting_work_payment |   | rejected |
+-----------------------+   +----------+
            |
   [Pay work invoice]
            |
            v
      +----------+
      | complete |
      +----------+
```

### 5.2 State Definitions

| State | Description | Entry Trigger | Exit Trigger |
|-------|-------------|---------------|--------------|
| `awaiting_eval_payment` | Job created, waiting for eval fee payment | Job creation (POST /api/jobs) | Eval invoice paid |
| `awaiting_work_payment` | Eval paid, agent accepted, waiting for work fee | Eval payment detected on poll | Work invoice paid |
| `rejected` | Eval paid, agent rejected request | Eval payment detected + agent rejection | Terminal state |
| `complete` | Work paid, AI result delivered | Work payment detected + AI response | Terminal state |

### 5.3 State Transition Rules

1. **Initial State**: All jobs start at `awaiting_eval_payment`
2. **Auto-Advance**: States advance when the job is polled with GET; there is no webhook
3. **Terminal States**: `rejected` and `complete` are final (no further transitions)
4. **Branching**: After eval payment, state branches to either `awaiting_work_payment` (accepted) or `rejected` (declined)
5. **Deterministic**: The same request and payment sequence produces the same state transitions

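
Rules 1, 3, and 4 can be captured as a transition table and checked against the sequence of states a test observes while polling. The following is an illustrative Python sketch: `ALLOWED_TRANSITIONS` and `check_state_history` are hypothetical names, and allowing the same state to repeat between polls is an assumption that follows from polling being the only way to observe the job.

```python
ALLOWED_TRANSITIONS = {
    "awaiting_eval_payment": {"awaiting_work_payment", "rejected"},
    "awaiting_work_payment": {"complete"},
    "rejected": set(),   # terminal
    "complete": set(),   # terminal
}


def check_state_history(states: list[str]) -> list[str]:
    """Return rule violations for a sequence of states seen on successive polls."""
    failures = []
    if not states or states[0] != "awaiting_eval_payment":
        failures.append("first observed state must be awaiting_eval_payment")
    for prev, curr in zip(states, states[1:]):
        if curr == prev:
            continue  # repeated polls may observe the same state
        if curr not in ALLOWED_TRANSITIONS.get(prev, set()):
            failures.append(f"illegal transition: {prev} -> {curr}")
    return failures
```

For example, a history of `awaiting_eval_payment`, `awaiting_work_payment`, `complete` produces no failures, while any state observed after `rejected` is reported as an illegal transition.
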
### 5.4 State-Dependent Response Fields

| State | Required Fields | Optional Fields |
|-------|-----------------|-----------------|
| `awaiting_eval_payment` | `jobId`, `state`, `evalInvoice` | - |
| `awaiting_work_payment` | `jobId`, `state`, `workInvoice` | - |
| `rejected` | `jobId`, `state`, `reason` | - |
| `complete` | `jobId`, `state`, `result` | - |

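
The table translates directly into a lookup that reports which required fields a poll response lacks. This is a minimal Python sketch with hypothetical names (`REQUIRED_FIELDS`, `missing_fields`); it checks presence only, not the type or content of each field.

```python
REQUIRED_FIELDS = {
    "awaiting_eval_payment": ("jobId", "state", "evalInvoice"),
    "awaiting_work_payment": ("jobId", "state", "workInvoice"),
    "rejected": ("jobId", "state", "reason"),
    "complete": ("jobId", "state", "result"),
}


def missing_fields(job: dict) -> list[str]:
    """List the required fields absent from a job response for its current state."""
    state = job.get("state")
    if state not in REQUIRED_FIELDS:
        raise ValueError(f"unknown state: {state!r}")
    return [name for name in REQUIRED_FIELDS[state] if name not in job]
```
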
---

## 6. Validation Rules

### 6.1 Input Validation Requirements

#### POST /api/jobs

| Field | Type | Required | Validation Rules |
|-------|------|----------|------------------|
| `request` | string | YES | Non-empty string, max length TBD |

**Validation Errors:**
- Missing field: `400 Bad Request`
- Empty string: `400 Bad Request`
- Wrong type: `400 Bad Request`

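
The three validation errors above can be modeled as a single predicate on the parsed request body. The following Python sketch is an assumption-laden illustration, not the server's implementation: `validate_create_job_body` is a hypothetical name, the same error message is reused for the empty-string and wrong-type cases (the source only documents it for the missing field), and the maximum length is left unchecked because it is still TBD.

```python
from typing import Optional

REQUEST_REQUIRED_ERROR = "Invalid request: 'request' string is required"


def validate_create_job_body(body: object) -> tuple[int, Optional[dict]]:
    """Return (status, error_body); error_body is None when the body is acceptable."""
    if (
        not isinstance(body, dict)
        or not isinstance(body.get("request"), str)  # missing field or wrong type
        or body["request"] == ""                     # empty string
    ):
        return 400, {"error": REQUEST_REQUIRED_ERROR}
    return 201, None
```

A test can feed the function the same payloads it sends to the API and compare the predicted status with the real one.
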
#### GET /api/jobs/{jobId}

| Parameter | Type | Required | Validation Rules |
|-----------|------|----------|------------------|
| `jobId` | UUID string | YES (in path) | Valid UUID format, exists in system |

**Validation Errors:**
- Invalid UUID format: `404 Not Found`
- Non-existent UUID: `404 Not Found`

#### GET /api/demo

| Parameter | Type | Required | Validation Rules |
|-----------|------|----------|------------------|
| `request` | string | YES (query) | Non-empty string |

**Validation Errors:**
- Missing param: `400 Bad Request`
- Empty value: `400 Bad Request`

### 6.2 Invoice Validation (Stub Mode)

| Field | Format | Example / Note |
|-------|--------|----------------|
| `paymentRequest` (eval) | `lnbcrt10u1stub_<16-char-hash>` | `lnbcrt10u1stub_a1b2c3d4e5f6g7h8` |
| `paymentRequest` (work) | `lnbcrt<sats>u1stub_<16-char-hash>` | `lnbcrt50u1stub_...` or `lnbcrt100u1stub_...` or `lnbcrt250u1stub_...` |
| `paymentHash` (full) | 64 hexadecimal characters | Required for stub pay endpoint |

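
These formats lend themselves to regular-expression checks. The Python sketch below is illustrative only: the function and pattern names are hypothetical, the 16-character fragment is matched as any ASCII letter or digit because the example above contains `g` and `h` (so it is not strictly hexadecimal), and accepting both upper- and lower-case hex in `paymentHash` is an assumption.

```python
import re
from typing import Optional

# lnbcrt<sats>u1stub_<16-char fragment>
STUB_PAYMENT_REQUEST = re.compile(r"lnbcrt([0-9]+)u1stub_([0-9A-Za-z]{16})")
# Full payment hash: exactly 64 hexadecimal characters.
PAYMENT_HASH = re.compile(r"[0-9a-fA-F]{64}")


def stub_invoice_sats(payment_request: str) -> Optional[int]:
    """Return the sats amount encoded in a stub paymentRequest, or None if malformed."""
    match = STUB_PAYMENT_REQUEST.fullmatch(payment_request)
    return int(match.group(1)) if match else None


def is_full_payment_hash(value: str) -> bool:
    """True if value has the shape required by the stub pay endpoint."""
    return PAYMENT_HASH.fullmatch(value) is not None
```

Comparing `stub_invoice_sats(invoice["paymentRequest"])` with `invoice["amountSats"]` gives a quick consistency check between the two invoice fields.
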
### 6.3 Pricing Validation Rules

| Fee Type | Amount (sats) | Determination |
|----------|---------------|---------------|
| Eval fee | 10 | Fixed for all requests |
| Work fee - Short | 50 | Request length below the short threshold |
| Work fee - Medium | 100 | Request length between the two thresholds |
| Work fee - Long | 250 | Request length above the long threshold |

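
The tiering logic can be sketched as a function of the request length and two thresholds. The source does not state the threshold values, so in this Python sketch they are explicit parameters (`short_threshold` and `long_threshold` are hypothetical names), and treating a length exactly equal to either threshold as Medium is an assumption drawn from the strict `<` and `>` comparisons in the original table.

```python
def work_fee_sats(request_length: int, short_threshold: int, long_threshold: int) -> int:
    """Map a request length to a work fee tier (50, 100, or 250 sats).

    Both thresholds are caller-supplied; the real values are not documented.
    """
    if short_threshold > long_threshold:
        raise ValueError("short_threshold must not exceed long_threshold")
    if request_length < short_threshold:
        return 50    # Short
    if request_length > long_threshold:
        return 250   # Long
    return 100       # Medium
```

A test would call this with whatever thresholds the deployment actually uses and compare the result with `workInvoice.amountSats`.
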
### 6.4 Rate Limiting Rules

| Endpoint | Limit | Window | Scope |
|----------|-------|--------|-------|
| `/api/demo` | 5 requests | Not specified | Per IP address |

**Behavior:**
- Count starts at 1 for first request
- Requests 1-5: Allowed (200 OK)
- Request 6+: Blocked (429 Too Many Requests)

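
The behavior above corresponds to a simple per-IP counter. The following Python sketch models it in memory so the expected status sequence can be reasoned about; `DemoRateLimiter` is a hypothetical class, not the server's implementation, and it never resets its counters because no time window is given.

```python
from collections import defaultdict


class DemoRateLimiter:
    """Counts requests per IP and blocks once the limit is exceeded."""

    def __init__(self, limit: int = 5) -> None:
        self.limit = limit
        self._counts = defaultdict(int)  # ip -> number of requests seen

    def status_for(self, ip: str) -> int:
        """Record one request from ip and return the HTTP status it would receive."""
        self._counts[ip] += 1  # the first request brings the count to 1
        return 200 if self._counts[ip] <= self.limit else 429
```

Because counters are keyed by IP, exhausting the limit for one address leaves every other address unaffected, which is the isolation property T-09 checks.
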
### 6.5 AI Model Configuration

| Phase | Model | Purpose |
|-------|-------|---------|
| Eval | `claude-haiku-4-5` | Fast/cheap judgment |
| Work | `claude-sonnet-4-6` | Full capability response |

---

## 7. Test Execution Checklist

### Prerequisites
- [ ] API base URL is accessible
- [ ] Stub mode is enabled (`/api/dev/stub/pay` available)
- [ ] Database access (optional, for payment hash lookup)

### Execution Order
1. T-01: Verify API is up
2. T-02: Create test job, save `jobId` and `evalInvoice`
3. T-03: Poll to verify initial state
4. T-04: Pay eval invoice (extract hash from paymentRequest)
5. T-05: Poll to verify state advance
6. T-06: If accepted, pay work invoice and get result
7. T-07: Test demo endpoint
8. T-08: Test all validation scenarios
9. T-09: Test rate limiter (6 rapid requests)
10. T-10: Create adversarial job, verify rejection

### Data Dependencies
- T-03, T-05, T-06 require `jobId` from T-02
- T-04 requires payment hash from T-02's `evalInvoice`
- T-06 requires payment hash from T-05's `workInvoice` (if accepted)
- T-10 is independent (creates its own job)

---

## 8. Summary Table: All Requirements

| Category | Count | Details |
|----------|-------|---------|
| Total Test Cases | 10 | T-01 through T-10 |
| API Endpoints Tested | 4 | `/api/healthz`, `/api/jobs`, `/api/dev/stub/pay`, `/api/demo` |
| HTTP Methods | 2 | GET, POST |
| States | 4 | `awaiting_eval_payment`, `awaiting_work_payment`, `rejected`, `complete` |
| Error Scenarios | 6 | Missing body, invalid job, missing param, rate limit, rejection, not found |
| Validation Rules | 12+ | Input type, format, presence, rate limits |
| Pricing Tiers | 4 | 10 sats (eval), 50/100/250 sats (work) |

---

*Analysis generated from Timmy API Test Plan & Report Prompt*