[claude] Add agent performance regression benchmark suite (#1015) #1053
Fixes #1015
Summary
- reached_location, interacted_with) enable early success detection
- compare_runs() for regression detection (catches REGRESSION, IMPROVEMENT, SLOWER)
- scripts/run_benchmarks.py) with --tags filtering and --compare baseline analysis
- tox -e benchmark environment for CI integration

Test plan
- pytest tests/infrastructure/world/test_benchmark.py)
- tox -e lint)
- tox -e benchmark to verify CLI execution
- tox -e benchmark -- --tags navigation to verify tag filtering
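
For reviewers, the regression detection named in the Summary could look roughly like the sketch below. It is an illustration only, not the code in this PR: the RunResult fields, the compare_runs signature, the slower_ratio threshold, and the UNCHANGED label are all assumptions made for the example. Only the REGRESSION, IMPROVEMENT, and SLOWER labels come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Hypothetical per-scenario outcome; not the PR's actual data model."""
    success: bool
    cycles: int  # steps the agent needed before the scenario ended


def compare_runs(baseline, current, slower_ratio=1.2):
    """Classify every scenario that appears in both runs.

    baseline and current map scenario name -> RunResult.
    slower_ratio is an assumed threshold, not a value taken from the PR.
    """
    report = {}
    for name in sorted(baseline.keys() & current.keys()):
        old, new = baseline[name], current[name]
        if old.success and not new.success:
            # Passed before, fails now.
            report[name] = "REGRESSION"
        elif new.success and not old.success:
            # Failed before, passes now.
            report[name] = "IMPROVEMENT"
        elif old.success and new.success and new.cycles > old.cycles * slower_ratio:
            # Still passes, but needs noticeably more cycles.
            report[name] = "SLOWER"
        else:
            # Label invented for this sketch.
            report[name] = "UNCHANGED"
    return report


if __name__ == "__main__":
    # Scenario names below are made up for the example.
    baseline = {
        "walk_to_home": RunResult(True, 10),
        "talk_to_npc": RunResult(True, 8),
        "fetch_item": RunResult(False, 30),
    }
    current = {
        "walk_to_home": RunResult(True, 13),
        "talk_to_npc": RunResult(False, 30),
        "fetch_item": RunResult(True, 12),
    }
    for scenario, status in compare_runs(baseline, current).items():
        print(f"{scenario}: {status}")
```

Comparing only scenarios present in both runs keeps a newly added benchmark from being reported as a regression; whether the real implementation makes the same choice is not stated in the description.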