feat(delegate): add observability metadata to subagent results (#1175)
* fix: Home Assistant event filtering now closed by default Previously, when no watch_domains or watch_entities were configured, ALL state_changed events passed through to the agent, causing users to be flooded with notifications for every HA entity change. Now events are dropped by default unless the user explicitly configures: - watch_domains: list of domains to monitor (e.g. climate, light) - watch_entities: list of specific entity IDs to monitor - watch_all: true (new option — opt-in to receive all events) A warning is logged at connect time if no filters are configured, guiding users to set up their HA platform config. All 49 gateway HA tests + 52 HA tool tests pass. * docs: update Home Assistant integration documentation - homeassistant.md: Fix event filtering docs to reflect closed-by-default behavior. Add watch_all option. Replace Python dict config example with YAML. Fix defaults table (was incorrectly showing 'all'). Add required configuration warning admonition. - environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section. - messaging/index.md: Add Home Assistant to description, architecture diagram, platform toolsets table, and Next Steps links. * fix(terminal): strip provider env vars from background and PTY subprocesses Extends the env var blocklist from #1157 to also cover the two remaining leaky paths in process_registry.py: - spawn_local() PTY path (line 156) - spawn_local() background Popen path (line 197) Both were still using raw os.environ, leaking provider vars to background processes and interactive PTY sessions. Now uses the same dynamic _HERMES_PROVIDER_ENV_BLOCKLIST from local.py. Explicit env_vars passed to spawn_local() still override the blocklist, matching the existing behavior for callers that intentionally need these. Gap identified by PR #1004 (@PeterFile). * feat(delegate): add observability metadata to subagent results Enrich delegate_task results with metadata from the child AIAgent: - model: which model the child used - exit_reason: completed | interrupted | max_iterations - tokens.input / tokens.output: token counts - tool_trace: per-tool-call trace with byte sizes and ok/error status Tool trace uses tool_call_id matching to correctly pair parallel tool calls with their results, with a fallback for messages without IDs. Cherry-picked from PR #872 by @omerkaz, with fixes: - Fixed parallel tool call trace pairing (was always updating last entry) - Removed redundant 'iterations' field (identical to existing 'api_calls') - Added test for parallel tool call trace correctness Co-authored-by: omerkaz <omerkaz@users.noreply.github.com> --------- Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>
This commit is contained in:
@@ -276,12 +276,70 @@ def _run_single_child(
|
||||
else:
|
||||
status = "failed"
|
||||
|
||||
# Build tool trace from conversation messages (already in memory).
|
||||
# Uses tool_call_id to correctly pair parallel tool calls with results.
|
||||
tool_trace: list[Dict[str, Any]] = []
|
||||
trace_by_id: Dict[str, Dict[str, Any]] = {}
|
||||
messages = result.get("messages") or []
|
||||
if isinstance(messages, list):
|
||||
for msg in messages:
|
||||
if not isinstance(msg, dict):
|
||||
continue
|
||||
if msg.get("role") == "assistant":
|
||||
for tc in (msg.get("tool_calls") or []):
|
||||
fn = tc.get("function", {})
|
||||
entry_t = {
|
||||
"tool": fn.get("name", "unknown"),
|
||||
"args_bytes": len(fn.get("arguments", "")),
|
||||
}
|
||||
tool_trace.append(entry_t)
|
||||
tc_id = tc.get("id")
|
||||
if tc_id:
|
||||
trace_by_id[tc_id] = entry_t
|
||||
elif msg.get("role") == "tool":
|
||||
content = msg.get("content", "")
|
||||
is_error = bool(
|
||||
content and "error" in content[:80].lower()
|
||||
)
|
||||
result_meta = {
|
||||
"result_bytes": len(content),
|
||||
"status": "error" if is_error else "ok",
|
||||
}
|
||||
# Match by tool_call_id for parallel calls
|
||||
tc_id = msg.get("tool_call_id")
|
||||
target = trace_by_id.get(tc_id) if tc_id else None
|
||||
if target is not None:
|
||||
target.update(result_meta)
|
||||
elif tool_trace:
|
||||
# Fallback for messages without tool_call_id
|
||||
tool_trace[-1].update(result_meta)
|
||||
|
||||
# Determine exit reason
|
||||
if interrupted:
|
||||
exit_reason = "interrupted"
|
||||
elif completed:
|
||||
exit_reason = "completed"
|
||||
else:
|
||||
exit_reason = "max_iterations"
|
||||
|
||||
# Extract token counts (safe for mock objects)
|
||||
_input_tokens = getattr(child, "session_prompt_tokens", 0)
|
||||
_output_tokens = getattr(child, "session_completion_tokens", 0)
|
||||
_model = getattr(child, "model", None)
|
||||
|
||||
entry: Dict[str, Any] = {
|
||||
"task_index": task_index,
|
||||
"status": status,
|
||||
"summary": summary,
|
||||
"api_calls": api_calls,
|
||||
"duration_seconds": duration,
|
||||
"model": _model if isinstance(_model, str) else None,
|
||||
"exit_reason": exit_reason,
|
||||
"tokens": {
|
||||
"input": _input_tokens if isinstance(_input_tokens, (int, float)) else 0,
|
||||
"output": _output_tokens if isinstance(_output_tokens, (int, float)) else 0,
|
||||
},
|
||||
"tool_trace": tool_trace,
|
||||
}
|
||||
if status == "failed":
|
||||
entry["error"] = result.get("error", "Subagent did not produce a response.")
|
||||
|
||||
Reference in New Issue
Block a user