# Test Optimization Guide for Hermes Agent

## Current Test Execution Analysis

### Test Suite Statistics

- **Total Test Files:** 373
- **Estimated Test Functions:** ~4,311
- **Async Tests:** ~679 (15.8%)
- **Integration Tests:** 7 files (excluded from CI)
- **Average Tests per File:** ~11.6

### Current CI Configuration

```yaml
# .github/workflows/tests.yml
- name: Run tests
  run: |
    source .venv/bin/activate
    python -m pytest tests/ -q --ignore=tests/integration --tb=short -n auto
```

**Current Flags:**

- `-q`: Quiet mode
- `--ignore=tests/integration`: Skip integration tests
- `--tb=short`: Short traceback format
- `-n auto`: Auto-detect parallel workers

---

## Optimization Recommendations

### 1. Add Test Duration Reporting

**Current:** No duration tracking

**Recommended:**

```yaml
run: |
  python -m pytest tests/ \
    --ignore=tests/integration \
    -n auto \
    --durations=20 \
    --durations-min=1.0
```

`--durations=20` reports the 20 slowest tests; `--durations-min=1.0` limits the report to tests slower than one second. (Comments cannot follow the trailing `\` on a shell continuation line, so the flags are annotated here instead.) This will help identify slow tests that need optimization.

### 2. Implement Test Categories

Add markers to `pyproject.toml`:

```toml
[tool.pytest.ini_options]
testpaths = ["tests"]
markers = [
    "integration: marks tests requiring external services",
    "slow: marks tests that take >5 seconds",
    "unit: marks fast unit tests",
    "security: marks security-focused tests",
    "flaky: marks tests that may be unstable",
]
addopts = "-m 'not integration and not slow' -n auto"
```

**Usage:**

```bash
# Run only fast unit tests
pytest -m unit

# Run all tests, including slow ones
pytest -m "not integration"

# Run only security tests
pytest -m security
```

### 3. Optimize Slow Test Candidates

Based on file sizes, these tests likely need optimization:

| File | Lines | Optimization Strategy |
|------|-------|-----------------------|
| `test_run_agent.py` | 3,329 | Split into multiple files by feature |
| `test_mcp_tool.py` | 2,902 | Split by MCP functionality |
| `test_voice_command.py` | 2,632 | Review for redundant tests |
| `test_feishu.py` | 2,580 | Mock external API calls |
| `test_api_server.py` | 1,503 | Parallelize independent tests |

### 4. Add Coverage Reporting to CI

**Updated workflow:**

```yaml
- name: Run tests with coverage
  run: |
    source .venv/bin/activate
    python -m pytest tests/ \
      --ignore=tests/integration \
      -n auto \
      --cov=agent --cov=tools --cov=gateway --cov=hermes_cli \
      --cov-report=xml \
      --cov-report=html \
      --cov-fail-under=70

- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v3
  with:
    files: ./coverage.xml
    fail_ci_if_error: true
```

### 5. Implement Flaky Test Handling

Add `pytest-rerunfailures`:

```toml
dev = [
    "pytest>=9.0.2,<10",
    "pytest-asyncio>=1.3.0,<2",
    "pytest-xdist>=3.0,<4",
    "pytest-cov>=5.0,<6",
    "pytest-rerunfailures>=14.0,<15",  # Add this
]
```

**Usage:**

```python
# Mark known flaky tests; pytest-rerunfailures provides the `flaky` marker
@pytest.mark.flaky(reruns=3, reruns_delay=1)
async def test_network_dependent_feature():
    # Test that sometimes fails due to network
    pass
```

### 6. Optimize Fixture Scopes

Review `conftest.py` fixtures:

```python
# Current: function scope (runs for every test)
@pytest.fixture()
def mock_config():
    return {...}

# Optimized: session scope (runs once per session;
# only safe if no test mutates the returned object)
@pytest.fixture(scope="session")
def mock_config():
    return {...}

# Optimized: module scope (runs once per module)
@pytest.fixture(scope="module")
def expensive_setup():
    # Setup that can be reused across the module
    pass
```

### 7. Parallel Execution Tuning

**Current:** `-n auto` (uses all CPUs)

**Issues:**

- May cause resource contention
- Some tests may not be thread-safe

**Recommendations:**

```bash
# Limit workers to prevent resource exhaustion
pytest -n 4  # Use 4 workers regardless of CPU count

# Use load-based scheduling for uneven test durations
pytest -n auto --dist=load

# Group tests by module to reduce setup overhead
pytest -n auto --dist=loadscope
```

### 8. Test Data Management

**Current Issue:** Tests may create files in `/tmp` without cleanup

**Solution - Factory Pattern:**

```python
# tests/factories.py
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def temp_workspace():
    """Create an isolated temp directory for tests."""
    path = tempfile.mkdtemp(prefix="hermes_test_")
    try:
        yield Path(path)
    finally:
        shutil.rmtree(path, ignore_errors=True)

# Usage in tests
def test_file_operations():
    with temp_workspace() as tmp:
        # All file operations happen in an isolated directory
        file_path = tmp / "test.txt"
        file_path.write_text("content")
        assert file_path.exists()
    # Automatically cleaned up
```

### 9. Database/State Isolation

**Current:** Uses `monkeypatch` for env vars

**Enhancement:** Database mocking

```python
from unittest.mock import MagicMock, patch

@pytest.fixture
def mock_honcho():
    """Mock the Honcho client for tests."""
    with patch("honcho_integration.client.HonchoClient") as mock:
        mock_instance = MagicMock()
        mock_instance.get_session.return_value = {"id": "test-session"}
        mock.return_value = mock_instance
        yield mock

# Usage
async def test_memory_storage(mock_honcho):
    # Fast, isolated test
    pass
```

### 10. CI Pipeline Optimization

**Current Pipeline:**

1. Checkout
2. Install uv
3. Install Python
4. Install deps
5. Run tests

**Optimized Pipeline (with caching):**

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          version: "0.5.x"

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Cache uv packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/uv
          key: ${{ runner.os }}-uv-${{ hashFiles('**/pyproject.toml') }}

      - name: Install dependencies
        run: |
          uv venv .venv
          uv pip install -e ".[all,dev]"

      - name: Run fast tests
        run: |
          source .venv/bin/activate
          pytest -m "not integration and not slow" -n auto --tb=short

      - name: Run slow tests
        if: github.event_name == 'pull_request'
        run: |
          source .venv/bin/activate
          pytest -m "slow" -n 2 --tb=short
```

Dependencies are installed with uv, so setup-python's pip cache is unnecessary; the dedicated uv cache step covers it.

---

## Quick Wins (Implement First)

### 1. Add Duration Reporting (5 minutes)

```yaml
--durations=10
```

### 2. Mark Slow Tests (30 minutes)

Add `@pytest.mark.slow` to tests taking more than 5 seconds.

### 3. Split Largest Test File (2 hours)

Split `test_run_agent.py` into:

- `test_run_agent_core.py`
- `test_run_agent_tools.py`
- `test_run_agent_memory.py`
- `test_run_agent_messaging.py`

### 4. Add Coverage Baseline (1 hour)

```bash
pytest --cov=agent --cov=tools --cov=gateway tests/ --cov-report=html
```

### 5. Optimize Fixture Scopes (1 hour)

Review and optimize the 5 most-used fixtures.
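Marking slow tests can be made mechanical rather than manual. The `conftest.py` hook sketched below times each test and attaches a report note for unmarked candidates; the 5-second threshold mirrors the `slow` marker definition, and the hook body and helper name are illustrative, not part of the existing suite.

```python
# conftest.py (sketch): surface tests that exceed the slow threshold
# but are not yet marked @pytest.mark.slow
import time

import pytest

SLOW_THRESHOLD = 5.0  # seconds, matching the `slow` marker definition


def is_slow_candidate(elapsed: float, already_marked: bool) -> bool:
    """True if a test ran past the threshold without the slow marker."""
    return elapsed > SLOW_THRESHOLD and not already_marked


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_call(item):
    start = time.monotonic()
    yield  # run the actual test
    elapsed = time.monotonic() - start
    if is_slow_candidate(elapsed, "slow" in item.keywords):
        # Attach a note to the test report instead of failing the run
        item.add_report_section(
            "call",
            "slow-candidate",
            f"{item.nodeid} took {elapsed:.1f}s; consider @pytest.mark.slow",
        )
```

Run the suite once with `-rA` so report sections are printed, then grep the output for `slow-candidate` to get the list of tests to mark.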
---

## Long-term Improvements

### Test Data Generation

```python
# Implement hypothesis-based testing
from hypothesis import given, strategies as st

@given(st.lists(st.text(), min_size=1))
def test_message_batching(messages):
    # Property-based testing
    pass
```

### Performance Regression Testing

```python
@pytest.mark.benchmark
def test_message_processing_speed(benchmark):
    # benchmark() runs the callable repeatedly, records timing stats,
    # and returns the callable's own return value
    result = benchmark(process_messages, sample_data)
    assert result is not None  # sanity check; timings live in benchmark.stats
```

### Contract Testing

```python
# Verify API contracts between components
@pytest.mark.contract
def test_agent_tool_contract():
    """Verify the agent sends the correct format to tools."""
    pass
```

---

## Measurement Checklist

After implementing optimizations, verify:

- [ ] Test suite execution time < 5 minutes
- [ ] No individual test > 10 seconds (except integration)
- [ ] Code coverage > 70%
- [ ] All flaky tests marked and retried
- [ ] CI passes consistently (>95% success rate)
- [ ] Memory usage stable (no leaks in test suite)

---

## Tools to Add

```toml
[project.optional-dependencies]
dev = [
    "pytest>=9.0.2,<10",
    "pytest-asyncio>=1.3.0,<2",
    "pytest-xdist>=3.0,<4",
    "pytest-cov>=5.0,<6",
    "pytest-rerunfailures>=14.0,<15",
    "pytest-benchmark>=4.0,<5",    # Performance testing
    "pytest-mock>=3.12,<4",        # Enhanced mocking
    "hypothesis>=6.100,<7",        # Property-based testing
    "factory-boy>=3.3,<4",         # Test data factories
]
```
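The memory-stability item on the checklist above can be spot-checked without any new dependencies. Below is a minimal sketch using the standard library's `tracemalloc`; the 10 MB budget and the fixture name are assumptions to be tuned against a measured baseline, not values from the existing suite.

```python
# conftest.py (sketch): fail the session if traced allocations grow past
# a budget. MEMORY_BUDGET is an assumed figure, not a measured baseline.
import tracemalloc

import pytest

MEMORY_BUDGET = 10 * 1024 * 1024  # 10 MB of net allocation growth


def growth_within_budget(before: int, after: int, budget: int = MEMORY_BUDGET) -> bool:
    """True if net traced-memory growth stayed within the budget."""
    return (after - before) <= budget


@pytest.fixture(scope="session", autouse=True)
def memory_watchdog():
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    yield  # the whole test session runs here
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    assert growth_within_budget(before, after), (
        f"test session grew traced memory by {after - before} bytes"
    )
```

Note that `tracemalloc` only sees Python-level allocations, so leaks in C extensions would need a separate tool; this catches the common case of fixtures accumulating state across tests.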