Claude/angry cerf (#173)

* feat: set qwen3.5:latest as default model

- Make qwen3.5:latest the primary default model for faster inference
- Move llama3.1:8b-instruct to fallback chain
- Update text fallback chain to prioritize qwen3.5:latest

Retains full backward compatibility via cascade fallback.
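
The cascade fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `TEXT_FALLBACK_CHAIN`, `pick_model`, and the `is_available` callback are hypothetical names, and the chain order is only assumed from the bullet points.

```python
from typing import Callable, Iterable

# Hypothetical chain: model names come from the commit message,
# the exact order and contents are an assumption.
TEXT_FALLBACK_CHAIN = ["qwen3.5:latest", "llama3.1:8b-instruct"]


def pick_model(chain: Iterable[str], is_available: Callable[[str], bool]) -> str:
    """Return the first model in the chain that is_available accepts."""
    for name in chain:
        if is_available(name):
            return name
    raise RuntimeError("no model in the fallback chain is available")


# Usage: a host that only has the older model installed still gets a model.
installed = {"llama3.1:8b-instruct"}
chosen = pick_model(TEXT_FALLBACK_CHAIN, lambda name: name in installed)
print(chosen)
```

Walking the chain in order is what would keep older installs working: a machine without the new default simply falls through to the next entry.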

* test: remove ~55 brittle, duplicate, and useless tests

Audit of all 100 test files identified tests that provided no real
regression protection. Removed:

- 4 files deleted entirely: test_setup_script (always skipped),
  test_csrf_bypass (tautological assertions), test_input_validation
  (accepts 200-500 status codes), test_security_regression (fragile
  source-pattern checks redundant with rendering tests)
- Duplicate test classes (TestToolTracking, TestCalculatorExtended)
- Mock-only tests that just verify mock wiring, not behavior
- Structurally broken tests (TestCreateToolFunctions patches after import)
- Empty/pass-body tests and meaningless assertions (len > 20)
- Flaky subprocess tests (aider tool calling real binary)

All 1328 remaining tests pass. Net: -699 lines, zero coverage loss.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent test pollution from autoresearch_enabled mutation

test_autoresearch_perplexity.py set settings.autoresearch_enabled = True
but never restored it in the finally block, which polluted subsequent tests.
When pytest-randomly ordered it before test_experiments_page_shows_disabled_when_off,
the victim test saw enabled=True and failed to find "Disabled" in the page.

Fix both sides:
- Restore autoresearch_enabled in the finally block (root cause)
- Mock settings explicitly in the victim test (defense in depth)
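
Both sides of the fix can be sketched in isolation. This is an illustration under stated assumptions, not the repository's tests: `settings` here is a `SimpleNamespace` stand-in, and `page_text`, `polluter_fixed`, and `victim_defended` are hypothetical names.

```python
from types import SimpleNamespace
from unittest.mock import patch

# Hypothetical stand-in for the project's shared settings object.
settings = SimpleNamespace(autoresearch_enabled=False)


def page_text() -> str:
    # Hypothetical stand-in for rendering the experiments page.
    return "Enabled" if settings.autoresearch_enabled else "Disabled"


def polluter_fixed() -> str:
    """Root-cause fix: remember the old value and restore it in finally."""
    original = settings.autoresearch_enabled
    settings.autoresearch_enabled = True
    try:
        return page_text()
    finally:
        settings.autoresearch_enabled = original


def victim_defended() -> str:
    """Defense in depth: pin the value the test depends on for its duration."""
    with patch.object(settings, "autoresearch_enabled", False):
        return page_text()
```

With the restore in place, test order no longer matters for the shared flag; the explicit patch means the victim test would pass even if some other test leaked state again.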

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Alexander Whitestone
2026-03-11 16:55:27 -04:00
committed by GitHub
parent 0b91e45d90
commit 36fc10097f
17 changed files with 24 additions and 707 deletions

@@ -11,7 +11,14 @@ class TestExperimentsRoute:
         assert response.status_code == 200
         assert "Autoresearch" in response.text
 
-    def test_experiments_page_shows_disabled_when_off(self, client):
+    @patch("dashboard.routes.experiments.settings")
+    def test_experiments_page_shows_disabled_when_off(self, mock_settings, client):
+        mock_settings.autoresearch_enabled = False
+        mock_settings.autoresearch_metric = "perplexity"
+        mock_settings.autoresearch_time_budget = 300
+        mock_settings.autoresearch_max_iterations = 10
+        mock_settings.repo_root = "/tmp"
+        mock_settings.autoresearch_workspace = "test-experiments"
         response = client.get("/experiments")
         assert response.status_code == 200
         assert "disabled" in response.text.lower() or "Disabled" in response.text