The tests hit a real vLLM server (Qwen/Qwen3-4B-Thinking-2507) via
ManagedServer Phase 2 and are skipped automatically if the server
isn't running.
The tests verify:
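
For illustration, the auto-skip could be implemented roughly as below. This is a minimal sketch and not the code in this change: the host, port, and the names server_is_running and requires_vllm are hypothetical, and stdlib unittest is used only to keep the example dependency-free.

```python
import socket
import unittest

# Hypothetical address; this message does not say where the server listens.
VLLM_HOST = "localhost"
VLLM_PORT = 8000


def server_is_running(host: str = VLLM_HOST, port: int = VLLM_PORT,
                      timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the server as down.
        return False


# Decorator that skips a test when no server answers (evaluated at import time).
requires_vllm = unittest.skipUnless(
    server_is_running(), "vLLM server is not running"
)


@requires_vllm
def test_placeholder():
    # Placeholder body; a real test would drive the agent loop here.
    assert True
```

A plain TCP probe only shows that something is listening on the port; a stricter check could query the server's HTTP API before running the tests.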
- Single tool call through full agent loop
- Multi-tool calls across turns
- ManagedServer produces SequenceNodes with tokens/logprobs
- Direct response without tools
- Thinking model produces <think> blocks
Also adds a fallback parser in agent_loop.py: when ManagedServer's
ToolCallTranslator can't parse tool calls (vLLM is not installed),
hermes-agent's standalone parsers extract <tool_call> tags from the
raw content.
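
The idea behind the fallback can be sketched as follows. This is a simplified stand-in, not hermes-agent's actual parser: the function name extract_tool_calls is hypothetical, and it assumes each <tool_call> tag wraps a JSON object with "name" and "arguments" keys.

```python
import json
import re

# Non-greedy match so several <tool_call> blocks in one response stay separate.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(content: str) -> list[dict]:
    """Extract tool calls from raw model output (hypothetical helper).

    Assumes the payload inside each tag is a JSON object with a "name"
    key and an optional "arguments" key; malformed payloads are skipped.
    """
    calls = []
    for match in TOOL_CALL_RE.finditer(content):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # Not valid JSON: ignore this block.
        if isinstance(payload, dict) and "name" in payload:
            calls.append({
                "name": payload["name"],
                "arguments": payload.get("arguments", {}),
            })
    return calls


raw = (
    "Let me check.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
    "</tool_call>"
)
print(extract_tool_calls(raw))
```

Skipping malformed blocks rather than raising keeps the agent loop running when the model emits a partially formed tag; whether the real parsers behave this way is not stated here.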