Three optimizations to the agentic loop:
1. Cache loop agent as singleton (avoid repeated warmups)
2. Sliding window for step context (last 2 results, not all)
3. Replace summary LLM call with deterministic summary
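The three optimizations above can be sketched roughly as follows. All names here (LoopAgent, get_loop_agent, step_context, deterministic_summary) are hypothetical stand-ins, not the actual identifiers from this codebase:

```python
from functools import lru_cache

class LoopAgent:
    """Hypothetical agent; construction stands in for an expensive warmup."""
    def __init__(self):
        self.warmed_up = True

@lru_cache(maxsize=1)
def get_loop_agent() -> LoopAgent:
    # (1) Singleton: the agent (and its warmup cost) is paid once and reused
    # across invocations instead of being rebuilt per loop.
    return LoopAgent()

def step_context(results: list[str], window: int = 2) -> list[str]:
    # (2) Sliding window: only the last `window` step results are fed back
    # into the prompt, instead of the full history.
    return results[-window:]

def deterministic_summary(results: list[str]) -> str:
    # (3) Deterministic summary: plain string templating replaces the
    # final summarization LLM call.
    return f"Completed {len(results)} steps; last result: {results[-1]}"
```

A sketch under stated assumptions, not the actual implementation; the point is that (1) and (3) each remove a fixed per-invocation cost, while (2) bounds prompt growth.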
Saves one full LLM inference call per agentic-loop invocation
(30-60s on local models) and reduces context-window pressure.
Also fixes pre-existing bugs in the test_cli.py REPL tests (a missing result = assignment).