# Browser Integration Analysis: Browser Use + Graphify + Multica
**Issue:** #262 — Investigation: Browser Use + Graphify + Multica — Hermes Integration Analysis

**Date:** 2026-04-10

**Author:** Hermes Agent (burn branch)
## Executive Summary
This document evaluates three browser-related projects for integration with
hermes-agent. Each tool is assessed on capability, integration complexity,
security posture, and strategic fit with Hermes's existing browser stack.

| Tool | Recommendation | Integration Path |
|-------------------|-------------------------|-------------------------|
| Browser Use | **Integrate** (PoC) | Tool + MCP server |
| Graphify | Investigate further | MCP server or tool |
| Multica | Skip (for now) | N/A — premature |
---
## 1. Browser Use (`browser-use`)
### What It Does
Browser Use is a Python library that wraps Playwright to provide LLM-driven
browser automation. An agent describes a task in natural language, and
browser-use autonomously navigates, clicks, types, and extracts data by
feeding the page's accessibility tree to an LLM and executing the resulting
actions in a loop.
Key capabilities:
- Autonomous multi-step browser workflows from a single text instruction
- Accessibility tree extraction (DOM + ARIA snapshot)
- Screenshot and visual context for multimodal models
- Form filling, navigation, data extraction, file downloads
- Custom actions (register callable Python functions the LLM can invoke)
- Parallel agent execution (multiple browser agents simultaneously)
- Cloud execution via browser-use.com API (no local browser needed)
### Integration with Hermes
**Primary path: Custom Hermes tool** wrapping `browser-use` as a high-level
"automated browsing" capability alongside the existing `browser_tool.py`
(low-level, agent-controlled) tools.

**Why a separate tool rather than replacing browser_tool.py:**
- Hermes's existing browser tools (navigate, snapshot, click, type) give the
LLM fine-grained step-by-step control — this is valuable for interactive
tasks and debugging.
- browser-use gives coarse-grained "do this task for me" autonomy — better
for multi-step extraction workflows where the LLM would otherwise need
10+ tool calls.
- Both modes have legitimate use cases. Offer both.

**Integration architecture:**
```
hermes-agent/
  tools/
    browser_tool.py        # Existing — low-level agent-controlled browsing
    browser_use_tool.py    # NEW — high-level autonomous browsing (PoC)
      +-- browser_use.run()      # Wraps browser-use Agent class
      +-- browser_use.extract()  # Wraps browser-use for data extraction
```
The tool registers with `tools/registry.py` as toolset `browser_use` with
a `check_fn` that verifies `browser-use` is installed.
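
A minimal sketch of what that tool module might look like. The `check_fn` contract and function names are assumptions based on the description above; the `Agent` usage follows browser-use's documented API, but real model wiring is omitted:

```python
import importlib.util


def browser_use_available() -> bool:
    """check_fn for the registry: True only if browser-use is importable."""
    return importlib.util.find_spec("browser_use") is not None


def browser_use_run(task: str, max_steps: int = 25) -> str:
    """High-level autonomous browsing: one Hermes tool call, many internal steps.

    Imports lazily so the toolset degrades gracefully when the optional
    dependency is missing (the registry's check_fn gates registration).
    """
    import asyncio
    from browser_use import Agent  # deferred: optional dependency

    async def _run() -> str:
        # Model wiring omitted; a real tool would reuse Hermes's configured
        # LLM here (Agent typically also takes an `llm` argument).
        agent = Agent(task=task)
        history = await agent.run(max_steps=max_steps)  # caps the internal loop
        return str(history.final_result())

    return asyncio.run(_run())
```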

**Alternative: MCP server** — browser-use could also be exposed as an MCP
server for multi-agent setups where subagents need independent browser
access. This is a follow-up, not the initial integration.
### Dependencies and Requirements
```
pip install browser-use # Core library
playwright install chromium # Playwright browser binary
```
Or use cloud mode with `BROWSER_USE_API_KEY` — no local browser needed.

Requires Python 3.11+ and Playwright; no exotic system dependencies beyond
what Hermes already needs for its existing browser tool.
### Security Considerations
| Concern | Mitigation |
|----------------------------|---------------------------------------------------------|
| Arbitrary URL access | Reuse Hermes's `website_policy` and `url_safety` modules |
| Data exfiltration | Browser-use agents run in isolated Playwright contexts; no access to Hermes filesystem |
| Prompt injection via page | browser-use feeds page content to LLM — same risk as existing browser_snapshot; already handled by Hermes prompt hardening |
| Credential leakage | Do not pass API keys to untrusted pages; cloud mode keeps credentials server-side |
| Resource exhaustion | Set max_steps on browser-use Agent to prevent infinite loops |
| Downloaded files | Playwright download path is sandboxed; tool should restrict to temp directory |
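
The download-path mitigation can be sketched with the standard library; the `downloads_path` launch option in the comment is documented Playwright behavior, though the exact hook inside browser-use is an assumption:

```python
import tempfile
from pathlib import Path


def sandboxed_download_dir() -> Path:
    """Create a throwaway directory for browser downloads, kept outside
    the Hermes workspace so fetched files cannot land in sensitive paths."""
    return Path(tempfile.mkdtemp(prefix="hermes-browser-dl-"))


# Hypothetical wiring: pass the directory to Playwright at launch, e.g.
#   browser = await playwright.chromium.launch(
#       downloads_path=str(sandboxed_download_dir()))
```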
**Key security property:** browser-use executes within Playwright's sandboxed
browser context. The LLM controlling browser-use is Hermes itself (or a
configured auxiliary model), not the page content. This is equivalent to the
existing browser tool's security model.
### Performance Characteristics
- **Startup:** ~2-3s for Playwright Chromium launch (same as existing local mode)
- **Per-step:** ~1-3s per LLM call + browser action (comparable to manual
browser_navigate + browser_snapshot loop)
- **Full task (5-10 steps):** ~15-45s depending on page complexity
- **Token usage:** Each step sends the accessibility tree to the LLM.
Browser-use supports vision mode (screenshots) which is more token-heavy.
- **Parallelism:** Supports multiple concurrent browser agents

**Comparison to existing tools:**
For a 10-step browser task, the existing approach requires 10+ Hermes API
calls (navigate, snapshot, click, type, snapshot, click, ...). Browser-use
consolidates this into a single Hermes tool call that internally runs its
own LLM loop. This reduces Hermes API round-trips but shifts the LLM cost
to browser-use's internal model calls.
### Recommendation: INTEGRATE
Browser Use fills a clear gap — autonomous multi-step browser tasks — that
complements Hermes's existing fine-grained browser tools. The integration
is straightforward (Python library, same security model). A PoC tool is
provided in `tools/browser_use_tool.py`.

---
## 2. Graphify
### What It Does
Graphify is a knowledge graph extraction tool that processes unstructured
text (including web content) and extracts entities, relationships, and
structured knowledge into a graph format. It can:
- Extract entities and relationships from text using NLP/LLM techniques
- Build knowledge graphs from web-scraped content
- Support incremental graph updates as new content is processed
- Export graphs in standard formats (JSON-LD, RDF, etc.)

(Note: several unrelated tools use the name "Graphify." The relevant concept
for browser integration is extracting structured knowledge graphs from web
content during or after browsing.)
### Integration with Hermes
**Primary path: MCP server or Hermes tool** that takes web content (from
browser_tool or web_extract) and produces structured knowledge graphs.

**Integration architecture:**
```
hermes-agent/
  tools/
    graphify_tool.py    # NEW — knowledge graph extraction from text
      +-- graphify.extract()  # Extract entities/relations from text
      +-- graphify.merge()    # Merge into existing graph
      +-- graphify.query()    # Query the accumulated graph
```
Or via MCP:
```
hermes-agent --mcp-server graphify-mcp
-> tools: graphify_extract, graphify_query, graphify_export
```
**Synergy with browser tools:**
1. `browser_navigate` + `browser_snapshot` to get page content
2. `graphify_extract` to pull entities and relationships
3. Repeat across multiple pages to build a domain knowledge graph
4. `graphify_query` to answer questions about accumulated knowledge
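
A lightweight sketch of the accumulate-and-query loop in steps 2-4, using a plain adjacency map in place of a real graph store. The function names mirror the architecture above, but the triples are illustrative; a real `graphify_extract` would run an LLM or spaCy pipeline over the snapshot text:

```python
from collections import defaultdict


def graphify_merge(graph, triples):
    """Merge (subject, relation, object) triples into the accumulated graph."""
    for subj, rel, obj in triples:
        graph[subj].append((rel, obj))
    return graph


def graphify_query(graph, entity):
    """Answer 'what do we know about X?' from the accumulated graph."""
    return list(graph.get(entity, []))


graph = defaultdict(list)
# Triples an extraction pass might emit for two browsed pages:
graphify_merge(graph, [("Hermes", "uses", "Playwright"),
                       ("Hermes", "integrates", "browser-use")])
graphify_merge(graph, [("browser-use", "wraps", "Playwright")])
```

`graphify_query(graph, "Hermes")` then returns `[('uses', 'Playwright'), ('integrates', 'browser-use')]`; swapping the dict for NetworkX or Neo4j changes the storage layer, not the loop.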
### Dependencies and Requirements
Varies significantly depending on the specific Graphify implementation.
Typical requirements:
- Python 3.11+
- spaCy or similar NLP library for entity extraction
- Optional: Neo4j or NetworkX for graph storage
- LLM access (can reuse Hermes's existing model configuration)
### Security Considerations
| Concern | Mitigation |
|----------------------------|---------------------------------------------------------|
| Processing untrusted text | NLP extraction is read-only; no code execution |
| Graph data persistence | Store in Hermes's data directory with appropriate permissions |
| Information aggregation | Knowledge graphs could accumulate sensitive data; provide clear/delete commands |
| External graph DB access | If using Neo4j, require authentication and restrict to localhost |
### Performance Characteristics
- **Extraction:** ~0.5-2s per page depending on content length and NLP model
- **Graph operations:** Sub-second for graphs under 100K nodes
- **Storage:** Lightweight (JSON/SQLite) for small graphs, Neo4j for large-scale
- **Token usage:** If using LLM-based extraction, ~500-2000 tokens per page
### Recommendation: INVESTIGATE FURTHER
The concept is sound — knowledge graph extraction from web content is a
natural complement to browser tools. However:
1. **Multiple competing tools** exist under this name; need to identify the
best-maintained option
2. **Value proposition unclear** vs. Hermes's existing memory system and
file-based knowledge storage
3. **NLP dependency** adds complexity (spaCy models are ~500MB)

**Suggested next steps:**
- Evaluate specific Graphify implementations (graphify.ai, custom NLP pipelines)
- Prototype with a lightweight approach: LLM-based entity extraction + NetworkX
- Assess whether Hermes's existing memory/graph_store.py can serve this role
---
## 3. Multica
### What It Does
Multica is a multi-agent browser coordination framework. It enables multiple
AI agents to collaboratively browse the web, with features for:
- Task decomposition: splitting complex web tasks across multiple agents
- Shared browser state: agents see a common view of browsing progress
- Coordination protocols: agents can communicate about what they've found
- Parallel web research: multiple agents researching different aspects simultaneously
### Integration with Hermes
**Theoretical path:** Multica would integrate as a higher-level orchestration
layer on top of Hermes's existing browser tools, coordinating multiple
Hermes subagents (via `delegate_tool`) each with browser access.

**Integration architecture:**
```
hermes-agent (orchestrator)
  delegate_tool -> subagent_1 (browser_navigate, browser_snapshot, ...)
  delegate_tool -> subagent_2 (browser_navigate, browser_snapshot, ...)
  delegate_tool -> subagent_3 (browser_navigate, browser_snapshot, ...)
          |
  Multica coordination layer (shared state, task splitting)
```
### Dependencies and Requirements
- Complex multi-agent orchestration infrastructure
- Shared state management between agents
- Potentially a custom runtime for agent coordination
- Likely requires significant architectural changes to Hermes's delegation model
### Security Considerations
| Concern | Mitigation |
|----------------------------|---------------------------------------------------------|
| Multiple agents on same browser | Session isolation per agent (Hermes already does this) |
| Coordinated exfiltration | Same per-agent restrictions apply |
| Amplified prompt injection | Each agent processes its own pages independently |
| Resource multiplication | N agents = N browser instances = Nx resource usage |
### Performance Characteristics
- **Scaling:** Near-linear improvement for embarrassingly parallel tasks
(e.g., "research 10 companies simultaneously")
- **Overhead:** Significant coordination overhead for tightly coupled tasks
- **Resource cost:** Each agent needs its own LLM calls + browser instance
- **Complexity:** Debugging multi-agent browser workflows is extremely difficult
### Recommendation: SKIP (for now)
Multica addresses a real need (parallel web research) but is premature for
Hermes for several reasons:
1. **Hermes already has subagent delegation** (`delegate_tool`) — agents can
already do parallel browser work without Multica
2. **No mature implementation** — Multica is more of a concept than a
production-ready tool
3. **Complexity vs. benefit** — the coordination overhead and debugging
difficulty outweigh the benefits for most use cases
4. **Better alternatives exist** — for parallel research, simply delegating
multiple subagents with browser tools is simpler and already works
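
The simpler alternative in point 4 is plain fan-out over subagents. `delegate` below is a stand-in for `delegate_tool`, whose real call signature may differ:

```python
import asyncio


async def delegate(task: str) -> str:
    """Stand-in for delegate_tool: spawn a subagent with browser tools
    and return its summary (the real subagent does the browsing here)."""
    await asyncio.sleep(0)  # placeholder for the subagent's work
    return f"summary: {task}"


async def parallel_research(topics):
    # One independent subagent per topic: no shared state, no coordination
    # layer, which is exactly the embarrassingly parallel case.
    return await asyncio.gather(*(delegate(f"research {t}") for t in topics))


results = asyncio.run(parallel_research(["company A", "company B"]))
```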

**Revisit when:** Hermes's delegation model supports shared state between
subagents, or a mature Multica implementation emerges.

---
## Integration Roadmap
### Phase 1: Browser Use PoC (this PR)
- [x] Create `tools/browser_use_tool.py` wrapping browser-use as Hermes tool
- [x] Create `docs/browser-integration-analysis.md` (this document)
- [ ] Test with real browser tasks
- [ ] Add to toolset configuration
### Phase 2: Browser Use Production (follow-up)
- [ ] Add `browser_use` to `toolsets.py` toolset definitions
- [ ] Add configuration options in `config.yaml`
- [ ] Add tests in `tests/test_browser_use_tool.py`
- [ ] Consider MCP server variant for subagent use
### Phase 3: Graphify Investigation (follow-up)
- [ ] Evaluate specific Graphify implementations
- [ ] Prototype lightweight LLM-based entity extraction tool
- [ ] Assess integration with existing `graph_store.py`
- [ ] Create PoC if investigation is positive
### Phase 4: Multi-Agent Browser (future)
- [ ] Monitor Multica ecosystem maturity
- [ ] Evaluate when delegation model supports shared state
- [ ] Consider simpler parallel delegation patterns first
---
## Appendix: Existing Browser Stack
Hermes already has a comprehensive browser tool stack:

| Component | Description |
|-----------------------|--------------------------------------------------|
| `browser_tool.py` | Low-level agent-controlled browser (navigate, click, type, snapshot) |
| `browser_camofox.py` | Anti-detection browser via Camofox REST API |
| `browser_providers/` | Cloud providers (Browserbase, Browser Use API, Firecrawl) |
| `web_tools.py` | Web search (Parallel) and extraction (Firecrawl) |
| `mcp_tool.py` | MCP client for connecting external tool servers |
The existing stack covers:
- **Local browsing:** Headless Chromium via agent-browser CLI
- **Cloud browsing:** Browserbase, Browser Use cloud, Firecrawl
- **Anti-detection:** Camofox (local) or Browserbase advanced stealth
- **Content extraction:** Firecrawl for clean markdown extraction
- **Search:** Parallel AI web search

New browser integrations should complement rather than replace these tools.