# Tool Call Regression Results

**Generated:** 2026-04-16T01:56:48.462512+00:00

**Model:** dry-run

**Endpoint:** none

**KV Type:** none

## Summary
| Metric | Value |
|--------|-------|
| Total tests | 10 |
| Passed | 10 |
| Failed | 0 |
| Accuracy | 100.0% |
| Threshold | 100% |
| Verdict | PASS |
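
The arithmetic behind the summary can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the `summarize` helper is hypothetical, and the rule that the verdict is PASS when accuracy is at or above the threshold is an assumption, since the report does not state the comparison it uses.

```python
def summarize(passed: int, total: int, threshold_pct: float = 100.0) -> dict:
    """Hypothetical helper: derive the summary fields from raw pass counts."""
    if total <= 0:
        raise ValueError("total must be positive")
    accuracy = 100.0 * passed / total
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "accuracy": f"{accuracy:.1f}%",
        # Assumption: PASS means accuracy meets or exceeds the threshold.
        "verdict": "PASS" if accuracy >= threshold_pct else "FAIL",
    }


# Values taken from the summary table above.
summary = summarize(passed=10, total=10)
print(summary)
```

With a 100% threshold, a single failing test would flip the verdict under this assumed rule, for example `summarize(passed=9, total=10)`.
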
## Test Matrix
| Test ID | Tool Expected | Tool Called | Schema | Args | Latency | Status |
|---------|--------------|-------------|--------|------|---------|--------|
| read_file_basic | read_file | none | OK | OK | 0ms | PASS |
| read_file_offset | read_file | none | OK | OK | 0ms | PASS |
| web_search_basic | web_search | none | OK | OK | 0ms | PASS |
| terminal_basic | terminal | none | OK | OK | 0ms | PASS |
| terminal_complex | terminal | none | OK | OK | 0ms | PASS |
| code_exec_basic | execute_code | none | OK | OK | 0ms | PASS |
| code_exec_complex | execute_code | none | OK | OK | 0ms | PASS |
| delegate_basic | delegate_task | none | OK | OK | 0ms | PASS |
| delegate_context | delegate_task | none | OK | OK | 0ms | PASS |
| parallel_two | read_file | none | OK | OK | 0ms | PASS |