Timmy-time-dashboard

perplexity/Timmy-time-dashboard

Archived

Fork 0

forked from Rockachopa/Timmy-time-dashboard

Commit Graph

Author	SHA1	Message	Date
Trip T	ea2dbdb4b5	fix: test DB isolation, Discord recovery, and over-mocked tests Test data was bleeding into production tasks.db because swarm.task_queue.models.DB_PATH (relative path) was never patched in conftest.clean_database. Fixed by switching to absolute paths via settings.repo_root and adding the missing module to the patching list. Discord bot could leak orphaned clients on retry after ERROR state. Added _cleanup_stale() to close stale client/task before each start() attempt, with improved logging in the token watcher. Rewrote test_paperclip_client.py to use httpx.MockTransport instead of patching _get/_post/_delete — tests now exercise real HTTP status codes, error handling, and JSON parsing. Added end-to-end test for capture_error → create_task DB isolation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 20:33:59 -04:00
Alexander Whitestone	36fc10097f	Claude/angry cerf (#173 ) * feat: set qwen3.5:latest as default model - Make qwen3.5:latest the primary default model for faster inference - Move llama3.1:8b-instruct to fallback chain - Update text fallback chain to prioritize qwen3.5:latest Retains full backward compatibility via cascade fallback. * test: remove ~55 brittle, duplicate, and useless tests Audit of all 100 test files identified tests that provided no real regression protection. Removed: - 4 files deleted entirely: test_setup_script (always skipped), test_csrf_bypass (tautological assertions), test_input_validation (accepts 200-500 status codes), test_security_regression (fragile source-pattern checks redundant with rendering tests) - Duplicate test classes (TestToolTracking, TestCalculatorExtended) - Mock-only tests that just verify mock wiring, not behavior - Structurally broken tests (TestCreateToolFunctions patches after import) - Empty/pass-body tests and meaningless assertions (len > 20) - Flaky subprocess tests (aider tool calling real binary) All 1328 remaining tests pass. Net: -699 lines, zero coverage loss. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent test pollution from autoresearch_enabled mutation test_autoresearch_perplexity.py was setting settings.autoresearch_enabled = True but never restoring it in the finally block — polluting subsequent tests. When pytest-randomly ordered it before test_experiments_page_shows_disabled_when_off, the victim test saw enabled=True and failed to find "Disabled" in the page. Fix both sides: - Restore autoresearch_enabled in the finally block (root cause) - Mock settings explicitly in the victim test (defense in depth) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Trip T <trip@local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 16:55:27 -04:00
Alexander Whitestone	87dc5eadfe	Wire orchestrator pipe into task runner + pipe-verifying integration tests (#134 )	2026-03-06 01:20:14 -05:00

Author

SHA1

Message

Date

Trip T

ea2dbdb4b5

fix: test DB isolation, Discord recovery, and over-mocked tests

Test data was bleeding into production tasks.db because
swarm.task_queue.models.DB_PATH (relative path) was never patched in
conftest.clean_database. Fixed by switching to absolute paths via
settings.repo_root and adding the missing module to the patching list.

Discord bot could leak orphaned clients on retry after ERROR state.
Added _cleanup_stale() to close stale client/task before each start()
attempt, with improved logging in the token watcher.

Rewrote test_paperclip_client.py to use httpx.MockTransport instead of
patching _get/_post/_delete — tests now exercise real HTTP status codes,
error handling, and JSON parsing. Added end-to-end test for
capture_error → create_task DB isolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-11 20:33:59 -04:00

Alexander Whitestone

36fc10097f

Claude/angry cerf (#173 )

* feat: set qwen3.5:latest as default model

- Make qwen3.5:latest the primary default model for faster inference
- Move llama3.1:8b-instruct to fallback chain
- Update text fallback chain to prioritize qwen3.5:latest

Retains full backward compatibility via cascade fallback.

* test: remove ~55 brittle, duplicate, and useless tests

Audit of all 100 test files identified tests that provided no real
regression protection. Removed:

- 4 files deleted entirely: test_setup_script (always skipped),
  test_csrf_bypass (tautological assertions), test_input_validation
  (accepts 200-500 status codes), test_security_regression (fragile
  source-pattern checks redundant with rendering tests)
- Duplicate test classes (TestToolTracking, TestCalculatorExtended)
- Mock-only tests that just verify mock wiring, not behavior
- Structurally broken tests (TestCreateToolFunctions patches after import)
- Empty/pass-body tests and meaningless assertions (len > 20)
- Flaky subprocess tests (aider tool calling real binary)

All 1328 remaining tests pass. Net: -699 lines, zero coverage loss.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent test pollution from autoresearch_enabled mutation

test_autoresearch_perplexity.py was setting settings.autoresearch_enabled = True
but never restoring it in the finally block — polluting subsequent tests.
When pytest-randomly ordered it before test_experiments_page_shows_disabled_when_off,
the victim test saw enabled=True and failed to find "Disabled" in the page.

Fix both sides:
- Restore autoresearch_enabled in the finally block (root cause)
- Mock settings explicitly in the victim test (defense in depth)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Trip T <trip@local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-11 16:55:27 -04:00

Alexander Whitestone

87dc5eadfe

Wire orchestrator pipe into task runner + pipe-verifying integration tests (#134 )

2026-03-06 01:20:14 -05:00

3 Commits