[BUG] MCP tools fail with cancel scope error when thinking scheduler calls chat() #72

Closed
opened 2026-03-14 20:34:04 +00:00 by hermes · 0 comments
Collaborator

Problem

The thinking scheduler (_thinking_scheduler in app.py:154) periodically calls thinking_engine.think_once(), which calls session.chat(), which creates a Timmy agent with MCP tools (Gitea + Filesystem). These MCP tools are MCPTools instances from Agno that use stdio transport — they spawn child processes and manage async connections.

The connection fails with:

Failed to connect to <MCPTools name=MCPTools functions=["issue_read", ...]>:
Cancelled via cancel scope 117706e10 by <Task pending name="Task-4"
coro=<_thinking_scheduler() running at app.py:154>>

Root Cause

Agno's MCPTools uses an async context manager lifecycle — it needs async with tools: to establish the stdio connection, and the connection must remain open for the duration of the agent run. The lifecycle looks like:

  1. MCPTools.__aenter__() — spawns the MCP server subprocess, connects via stdio
  2. Agent runs, calls tools
  3. MCPTools.__aexit__() — tears down the connection
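The lifecycle above can be sketched with a minimal stand-in class (FakeMCPTools is illustrative only, not the real Agno API; the real `__aenter__` spawns a subprocess and opens a stdio transport):

```python
import asyncio

class FakeMCPTools:
    """Stand-in for Agno's MCPTools async context manager lifecycle."""

    def __init__(self):
        self.connected = False

    async def __aenter__(self):
        # Step 1: would spawn the MCP server subprocess and connect via stdio
        self.connected = True
        return self

    async def __aexit__(self, *exc):
        # Step 3: tear down the stdio connection
        self.connected = False

async def run_agent():
    tools = FakeMCPTools()
    async with tools:
        # Step 2: the agent runs and calls tools while the connection is open
        assert tools.connected
    return tools.connected

print(asyncio.run(run_agent()))  # → False (connection torn down on exit)
```

The key point is that the connection only exists between `__aenter__` and `__aexit__`, so whichever task drives that `async with` owns the connection's cancel scope.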

When create_timmy() creates MCPTools instances (agent.py:261-267), they are added to the agent's tools list as lazy-connect objects. Agno's Agent.arun() calls MCPTools.__aenter__() internally, which opens the connection inside an anyio cancel scope tied to the calling task.

The problem: _thinking_scheduler is an asyncio.create_task() background task (app.py:296). When the thinking engine calls chat() → create_timmy() → agent.arun(), the MCP connection is established inside the thinking task's cancel scope. If:

  • The thinking interval fires again while a previous cycle is still connecting
  • The uvicorn server reloads (WatchFiles) and cancels background tasks
  • The thinking engine's asyncio.sleep() gets cancelled

...the cancel scope propagates to the MCP stdio connection, killing it mid-handshake.

Additionally, create_timmy() creates new MCPTools instances on every call (agent.py:261-267). Each thinking cycle spawns fresh gitea-mcp and npx @anthropic/mcp-filesystem subprocesses. These are never reused or pooled.

Impact

  • Noisy error logs every thinking cycle (default: every 120s)
  • Orphaned MCP server subprocesses if teardown fails
  • Thinking engine falls back to bare agent (no tools), reducing capability
  • Resource waste: spawning 2 subprocesses per thinking cycle

Proposed Fix

Option A (minimal): Catch the cancel scope error in _call_agent() and skip MCP tools for the thinking session. The thinking engine doesn't actually need Gitea/filesystem tools for generating thoughts — it only uses them in the _maybe_file_issue() post-hook, which already has its own MCP session.

# In thinking.py _call_agent(), create agent without MCP tools:
agent = create_timmy(tools_override=[toolkit_only])  # no MCPTools

Option B (better): Create a singleton MCP connection pool that create_timmy() draws from, with proper lifecycle management tied to the app's lifespan (startup/shutdown hooks) rather than individual task cancel scopes.
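A sketch of Option B, using a stand-in FakeTool and hypothetical names (mcp_pool, startup, shutdown, get_mcp_tools are illustrative, not existing project APIs): connections are entered once into an AsyncExitStack owned by the app lifespan, so their cancel scopes belong to startup/shutdown rather than to per-cycle thinking tasks.

```python
import asyncio
import contextlib

class FakeTool:
    """Stand-in for an MCPTools instance."""

    def __init__(self, name):
        self.name = name
        self.open = False

    async def __aenter__(self):
        self.open = True  # would spawn subprocess + connect via stdio
        return self

    async def __aexit__(self, *exc):
        self.open = False  # would tear down the connection

mcp_pool: list = []
_stack = contextlib.AsyncExitStack()

async def startup():
    # called once from the app lifespan: __aenter__ runs in this scope
    for tool in (FakeTool("gitea"), FakeTool("filesystem")):
        mcp_pool.append(await _stack.enter_async_context(tool))

async def shutdown():
    # called once at app shutdown: closes all connections in reverse order
    await _stack.aclose()

def get_mcp_tools():
    # create_timmy() would draw from the pool instead of constructing new instances
    return list(mcp_pool)

async def demo():
    await startup()
    assert all(t.open for t in get_mcp_tools())
    await shutdown()
    assert not any(t.open for t in mcp_pool)
    print("pool lifecycle ok")

asyncio.run(demo())
```

This also fixes the resource-waste point: the two subprocesses are spawned once per app lifetime instead of twice per thinking cycle.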

Option C (simplest): Pass use_tools=False or a skip_mcp=True flag when creating the thinking agent, since thoughts don't need external tool access.
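Option C could look like the following sketch, where local_toolkit, make_mcp_tools, and the skip_mcp parameter are all hypothetical stand-ins (the real create_timmy() signature lives in src/timmy/agent.py):

```python
def local_toolkit():
    # stand-in for the agent's non-MCP toolkit
    return "toolkit"

def make_mcp_tools():
    # stand-in for the MCPTools instances built in agent.py:261-267
    return ["gitea_mcp", "filesystem_mcp"]

def create_timmy(skip_mcp=False):
    tools = [local_toolkit()]
    if not skip_mcp:
        tools.extend(make_mcp_tools())
    return tools

print(create_timmy(skip_mcp=True))  # → ['toolkit']
print(create_timmy())               # → ['toolkit', 'gitea_mcp', 'filesystem_mcp']
```

The thinking engine would pass skip_mcp=True, so no stdio connection is ever opened inside the scheduler's cancel scope.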

Reproduction

  1. Start the dashboard: timmy serve
  2. Wait for thinking scheduler to fire (~5s after startup)
  3. Observe the MCP cancel scope errors in the log
  4. Especially reproducible during hot-reload (WatchFiles triggers)

Files Involved

  • src/dashboard/app.py:145-158 — _thinking_scheduler background task
  • src/timmy/thinking.py:821-836 — _call_agent() creates agent via session.chat()
  • src/timmy/agent.py:256-269 — create_timmy() creates fresh MCPTools per call
  • src/timmy/mcp_tools.py:80-155 — MCPTools factory functions
  • src/timmy/session.py — chat() singleton agent (but thinking uses separate session_id)
Reference: Rockachopa/Timmy-time-dashboard#72