[loop-generated] [test] Add unit tests for agentic_loop.py — 330 lines of untested async code #343

Closed
opened 2026-03-19 00:44:15 +00:00 by hermes · 1 comment
Collaborator

Context

src/timmy/agentic_loop.py is 330 lines of complex async code (multi-step task execution with planning, adaptation, and WebSocket broadcasting). It has ZERO dedicated tests.

This is the engine behind plan_and_execute — one of Timmy's most important capabilities.

What to test

  1. _parse_steps() — numbered list parsing, fallback to newlines, edge cases
  2. AgenticStep / AgenticResult dataclass behavior
  3. run_agentic_loop() happy path — mock agent, verify 3 phases execute
  4. Planning failure — agent.run raises, loop returns failed status
  5. Step failure + adaptation — step fails, adaptation succeeds
  6. Step failure + adaptation failure — both fail, status is partial
  7. Max steps truncation — more steps planned than max_steps allows
  8. WebSocket broadcast — mock ws_manager, verify events sent
  9. Progress callback — verify on_progress is called with correct args

Files

  • Test file: tests/timmy/test_agentic_loop.py (new)
  • Source: src/timmy/agentic_loop.py (read-only reference)

Notes

  • Mock agent.run() so no real LLM is called
  • Mock ws_manager.broadcast so no real WebSocket is needed
  • Use pytest-asyncio for async test functions
  • Patch _get_loop_agent() to return the mock agent

Acceptance criteria

  • All 9 test categories above covered
  • tox -e unit -- tests/timmy/test_agentic_loop.py -v passes
  • No calls to real LLM or WebSocket
Author
Collaborator

Instructions for Kimi

Create tests/timmy/test_agentic_loop.py with comprehensive unit tests.

Source to read first

  • src/timmy/agentic_loop.py (330 lines, the full source)

Key mocking strategy

  1. Patch timmy.agentic_loop._get_loop_agent to return a mock agent
  2. The mock agent needs a .run() method that returns an object with .content attribute
  3. Patch timmy.agentic_loop._broadcast_progress to be a no-op async function
  4. Use pytest-asyncio (@pytest.mark.asyncio decorator)
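
A minimal sketch of the mocking strategy above, using only the standard library. The helper names make_mock_agent and no_op_broadcast are hypothetical, and whether the real loop awaits agent.run() or calls it synchronously is an assumption to check against the source (hence the awaitable switch).

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


def make_mock_agent(*contents, awaitable=True):
    """Build a mock agent whose run() yields objects with a .content attribute.

    Each positional string becomes the .content of one successive run() call.
    ASSUMPTION: set awaitable=False if the loop calls agent.run() synchronously.
    """
    responses = [SimpleNamespace(content=c) for c in contents]
    agent = MagicMock()
    mock_cls = AsyncMock if awaitable else MagicMock
    agent.run = mock_cls(side_effect=responses)
    return agent


async def no_op_broadcast(*args, **kwargs):
    """Replacement for _broadcast_progress that accepts anything and sends nothing."""
    return None


# In the real test file these would be wired in with unittest.mock.patch, e.g.:
#   with patch("timmy.agentic_loop._get_loop_agent", return_value=agent), \
#        patch("timmy.agentic_loop._broadcast_progress", new=no_op_broadcast):
#       result = await run_agentic_loop(...)   # arguments depend on the source
```

Because side_effect is a list, the order of the strings passed to make_mock_agent has to match the order in which the loop calls agent.run() (plan first, then one response per step).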

Tests to write

_parse_steps tests:

  1. Numbered list like "1. Step one\n2. Step two" returns ["Step one", "Step two"]
  2. Numbered with parens "1) Step one" also works
  3. Fallback: plain lines "Step one\nStep two" returns both
  4. Empty string returns empty list
  5. Leading whitespace on numbers still parses
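
The five cases above can be written as a small input/expected table. In the sketch below, parse_steps_stand_in is a hypothetical placeholder so that the example runs on its own; it is not the real _parse_steps, which the actual tests should import from timmy.agentic_loop. The expected values for cases 2 and 5 are assumptions, since the list only says they "work" or "parse".

```python
import re


def parse_steps_stand_in(text):
    """HYPOTHETICAL stand-in for _parse_steps, used only to exercise the table."""
    # Match "1. foo" or "1) foo", allowing leading whitespace before the number.
    numbered = re.findall(r"^\s*\d+[.)]\s*(.+)$", text, flags=re.MULTILINE)
    if numbered:
        return [step.strip() for step in numbered]
    # Fallback: every non-empty line is a step.
    return [line.strip() for line in text.splitlines() if line.strip()]


# (input, expected) pairs mirroring tests 1-5.
PARSE_CASES = [
    ("1. Step one\n2. Step two", ["Step one", "Step two"]),
    ("1) Step one", ["Step one"]),                                 # assumed result
    ("Step one\nStep two", ["Step one", "Step two"]),
    ("", []),
    ("  1. Step one\n   2. Step two", ["Step one", "Step two"]),   # assumed result
]


def check_parse_cases(parse):
    """Run every case against the given parser and report mismatches."""
    failures = []
    for text, expected in PARSE_CASES:
        actual = parse(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures
```

In a pytest file, the same table could be passed to pytest.mark.parametrize so that each case is reported separately.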

Dataclass tests:

  6. AgenticStep has all required fields
  7. AgenticResult defaults: steps=[], status="completed", total_duration_ms=0
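
One way to check the dataclass defaults is to inspect fields directly. In the sketch below, ResultStandIn is a hypothetical mirror of only the three defaults listed for AgenticResult; the real class may have more fields, and the real tests should import AgenticStep and AgenticResult from the source.

```python
from dataclasses import MISSING, dataclass, field, fields


@dataclass
class ResultStandIn:
    """HYPOTHETICAL stand-in carrying only the defaults named in test 7."""
    steps: list = field(default_factory=list)
    status: str = "completed"
    total_duration_ms: int = 0


def required_field_names(cls):
    """Names of dataclass fields that have neither a default nor a default_factory."""
    return [
        f.name
        for f in fields(cls)
        if f.default is MISSING and f.default_factory is MISSING
    ]


def defaults_are_independent(cls, **required):
    """True if two fresh instances do not share the same mutable steps list."""
    first, second = cls(**required), cls(**required)
    first.steps.append("marker")
    return second.steps == []
```

required_field_names can be pointed at AgenticStep to find out which constructor arguments test 6 has to supply, and defaults_are_independent guards against a shared mutable default on steps.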

run_agentic_loop happy path:

  8. Mock agent.run to return a plan with 3 steps, then 3 step results. Verify result.status == "completed" and len(result.steps) == 3

Failure tests:

  9. Planning fails (agent.run raises Exception): result.status == "failed"
  10. Step fails, adaptation succeeds: step.status == "adapted"
  11. Step + adaptation both fail: step.status == "failed", result.status == "partial"
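
The failure cases depend on making individual agent.run() calls raise. A sketch of one way to script that with AsyncMock follows: exception instances placed in side_effect are raised on that call instead of being returned. scripted_run and drain are hypothetical helper names, and the call order shown (plan, failing step, adaptation) is an assumption about how the loop sequences its calls.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


def scripted_run(*outcomes):
    """Build an awaitable run() mock from a script of outcomes.

    Strings become response objects with .content; exception instances
    are raised when their turn comes.
    """
    effects = [
        o if isinstance(o, BaseException) else SimpleNamespace(content=o)
        for o in outcomes
    ]
    return AsyncMock(side_effect=effects)


async def drain(run, count):
    """Call run() count times, recording each .content or the exception type name."""
    seen = []
    for i in range(count):
        try:
            response = await run(f"call {i}")
            seen.append(response.content)
        except Exception as exc:
            seen.append(type(exc).__name__)
    return seen


# Script for test 10 (ASSUMED call order): plan, failing step, adaptation.
run_for_adaptation = scripted_run("1. Only step", RuntimeError("step failed"), "adapted output")
```

For test 9 the script would be a single exception, and for test 11 two exceptions in a row after the plan.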

Truncation test:

  12. Plan returns 20 steps, max_steps=5: only 5 executed, status == "partial"

Progress callback:

  13. Pass on_progress callback, verify it is called with correct (description, step_num, total_steps)
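
A sketch of how the callback assertion could look. It assumes on_progress is awaited and that step_num counts from 1; neither is confirmed here, so check both against the source (for a synchronous callback, use Mock and assert_has_calls instead). simulate_progress is a hypothetical stand-in for the loop, included only so that the example runs on its own.

```python
import asyncio
from unittest.mock import AsyncMock, call


def expected_progress_calls(descriptions):
    """One expected call per step: (description, step_num, total_steps)."""
    total = len(descriptions)
    # ASSUMPTION: step numbers start at 1.
    return [call(desc, num, total) for num, desc in enumerate(descriptions, start=1)]


async def simulate_progress(on_progress, descriptions):
    """HYPOTHETICAL stand-in for the loop: reports each step to the callback."""
    total = len(descriptions)
    for num, desc in enumerate(descriptions, start=1):
        await on_progress(desc, num, total)


def run_callback_check(descriptions):
    """Drive the stand-in with a mock callback and verify the recorded awaits."""
    on_progress = AsyncMock()
    asyncio.run(simulate_progress(on_progress, descriptions))
    on_progress.assert_has_awaits(expected_progress_calls(descriptions))
    return on_progress.await_count
```

In the real test, simulate_progress would be replaced by run_agentic_loop with the mock passed as on_progress.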

Verify

tox -e unit -- tests/timmy/test_agentic_loop.py -v
kimi was assigned by hermes 2026-03-19 00:45:56 +00:00

Reference: Rockachopa/Timmy-time-dashboard#343