Compare commits

...

1 Commits

Author SHA1 Message Date
Alexander Whitestone
3238cf4eb1 feat: Tool investigation report + Mem0 local provider (#842)
Some checks failed
Contributor Attribution Check / check-attribution (pull_request) Successful in 38s
Docker Build and Publish / build-and-push (pull_request) Has been skipped
Supply Chain Audit / Scan PR for supply chain risks (pull_request) Successful in 32s
Tests / test (pull_request) Failing after 43m54s
Tests / e2e (pull_request) Successful in 2m5s
## Investigation Report
- docs/tool-investigation-2026-04-15.md: Full report analyzing 414 tools
  from awesome-ai-tools. Top 5 recommendations with integration paths.
- docs/plans/awesome-ai-tools-integration.md: Implementation tracking plan.

## Mem0 Local Provider (P1)
- plugins/memory/mem0_local/: New ChromaDB-backed memory provider.
  No API key required - fully sovereign. Tool schemas are compatible with
  cloud Mem0 (mem0_profile, mem0_search, mem0_conclude).
- Pattern-based fact extraction from conversations.
- Deterministic dedup via content hashing.
- Circuit breaker for resilience.
- tests/plugins/memory/test_mem0_local.py: Full test coverage.
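The dedup scheme above can be sketched in a few lines: the 16-hex-char SHA-256 prefix is what the provider uses as a ChromaDB document ID, so upserting the same fact twice overwrites one record instead of creating two.

```python
import hashlib

def doc_id(content: str) -> str:
    """Deterministic ID from a content hash; identical facts map to one ID."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]

# Upserting the same fact twice yields the same ID, so only one copy is stored.
a = doc_id("Project uses SQLite")
b = doc_id("Project uses SQLite")
c = doc_id("Project uses Postgres")
print(a == b, a == c)  # True False
```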

## Issues Filed
- #857: LightRAG integration (P2)
- #858: n8n workflow orchestration (P3)
- #859: RAGFlow document understanding (P4)
- #860: tensorzero LLMOps evaluation (P3)

Closes #842
2026-04-15 23:04:41 -04:00
6 changed files with 814 additions and 0 deletions

View File

@@ -0,0 +1,44 @@
# awesome-ai-tools Integration Plan
**Tracking:** #842
**Source report:** docs/tool-investigation-2026-04-15.md
**Date:** 2026-04-16
---
## Status Dashboard
| # | Tool | Category | Impact | Effort | Status | Issue |
|---|------|----------|--------|--------|--------|-------|
| 1 | Mem0 | Memory | 5/5 | 3/5 | Cloud + Local done | #842 |
| 2 | LightRAG | RAG | 4/5 | 3/5 | Not started | #857 |
| 3 | n8n | Orchestration | 5/5 | 4/5 | Not started | #858 |
| 4 | RAGFlow | RAG | 4/5 | 4/5 | Not started | #859 |
| 5 | tensorzero | LLMOps | 4/5 | 3/5 | Not started | #860 |
---
## #1: Mem0 — DONE
Cloud: `plugins/memory/mem0/` (MEM0_API_KEY required)
Local: `plugins/memory/mem0_local/` (ChromaDB, no API key)
## #2: LightRAG (P2)
Create `plugins/rag/lightrag/` plugin. Index skill docs. Use local Ollama embeddings.
## #3: n8n (P3)
Deploy as Docker service. Create workflow templates for Hermes patterns.
## #4: RAGFlow (P4)
Deploy as Docker service. Integrate via HTTP API for document understanding.
## #5: tensorzero (P3)
Evaluate as provider routing replacement. Canary migration (10% traffic first).
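One way to implement the 10%-first canary (an illustrative sketch, not part of this PR): hash each session ID into 100 buckets and route the first `percent` buckets to the new gateway, so routing is deterministic per session.

```python
import hashlib

def use_canary(session_id: str, percent: int = 10) -> bool:
    """Deterministically send `percent`% of sessions down the canary path."""
    bucket = int(hashlib.sha256(session_id.encode("utf-8")).hexdigest(), 16) % 100
    return bucket < percent

# Routing is sticky: a given session always lands on the same side of the split.
assert use_canary("session-42") == use_canary("session-42")
```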
---
*Last updated: 2026-04-16*

View File

@@ -0,0 +1,151 @@
# Tool Investigation Report: Top 5 Recommendations from awesome-ai-tools
**Source:** [formatho/awesome-ai-tools](https://github.com/formatho/awesome-ai-tools)
**Date:** 2026-04-15
**Tools Analyzed:** 414 across 9 categories
**Agent:** Timmy
---
## Analysis Summary
Scanned 414 tools from the awesome-ai-tools repository. Evaluated each against Hermes integration potential across five categories: Memory/Context, Inference Optimization, Agent Orchestration, Workflow Automation, and Retrieval/RAG.
### Evaluation Criteria
- **Stars:** GitHub community validation (stability signal)
- **Freshness:** Active development (Fresh = updated within the last 7 days)
- **Integration Fit:** How well it complements Hermes' existing architecture (skills, memory, tools)
- **Integration Effort:** 1 (trivial drop-in) to 5 (major refactor required)
- **Impact:** 1 (incremental) to 5 (transformative)
---
## Top 5 Recommended Tools
### #1: Mem0 — Universal Memory Layer for AI Agents
| Metric | Value |
|--------|-------|
| **Category** | Memory/Context |
| **GitHub** | [mem0ai/mem0](https://github.com/mem0ai/mem0) |
| **Stars** | 53.1k |
| **Freshness** | Fresh |
| **Integration Effort** | 3/5 |
| **Impact** | 5/5 |
| **Hermes Status** | IMPLEMENTED (plugins/memory/mem0/) + LOCAL MODE (plugins/memory/mem0_local/) |
**Why it fits Hermes:**
Hermes currently has session_search (transcript recall) and memory (persistent facts), but lacks a unified memory layer that bridges sessions with semantic understanding. Mem0 provides exactly this: automatic memory extraction from conversations, deduplication, and cross-session retrieval with semantic search.
**Integration path:**
- Cloud: plugins/memory/mem0/ (requires MEM0_API_KEY)
- Local: plugins/memory/mem0_local/ (ChromaDB-backed, no API key)
- Auto-extract facts from session transcripts
- Query before session_search for richer contextual recall
**Key risk:** Mem0 is freemium — core is open-source but advanced features require paid tier. Local mode mitigates this entirely.
---
### #2: LightRAG — Simple and Fast Retrieval-Augmented Generation
| Metric | Value |
|--------|-------|
| **Category** | Retrieval/RAG |
| **GitHub** | [HKUDS/LightRAG](https://github.com/HKUDS/LightRAG) |
| **Stars** | 33.1k |
| **Freshness** | Fresh |
| **Integration Effort** | 3/5 |
| **Impact** | 4/5 |
| **Hermes Status** | NOT IMPLEMENTED — Issue #857 |
**Why it fits Hermes:**
Hermes has 190+ skills but no unified knowledge retrieval system. LightRAG adds graph-based RAG that understands relationships between concepts, not just keyword matches. It's lightweight, runs locally, and has a simple API.
**Integration path:**
- LightRAG as a local knowledge base for skill references
- Index GENOME.md files, README.md, and key codebase files
- Use local Ollama models for embeddings
- Complements existing search_files without replacing it
---
### #3: n8n — Workflow Automation Platform
| Metric | Value |
|--------|-------|
| **Category** | Workflow Automation / Agent Orchestration |
| **GitHub** | [n8n-io/n8n](https://github.com/n8n-io/n8n) |
| **Stars** | 183.9k |
| **Freshness** | Fresh |
| **Integration Effort** | 4/5 |
| **Impact** | 5/5 |
| **Hermes Status** | NOT IMPLEMENTED — Issue #858 |
**Why it fits Hermes:**
n8n provides a self-hosted, fair-code workflow platform with 400+ integrations. Rather than replacing Hermes' agent loop, n8n sits above it: trigger Hermes agents from external events, chain multi-agent workflows, and visualize execution.
---
### #4: RAGFlow — Open-Source RAG Engine
| Metric | Value |
|--------|-------|
| **Category** | Retrieval/RAG |
| **GitHub** | [infiniflow/ragflow](https://github.com/infiniflow/ragflow) |
| **Stars** | 77.9k |
| **Freshness** | Fresh |
| **Integration Effort** | 4/5 |
| **Impact** | 4/5 |
| **Hermes Status** | NOT IMPLEMENTED — Issue #859 |
**Why it fits Hermes:**
RAGFlow handles document parsing (PDF, Word, images via OCR), chunking, embedding, and retrieval with a web UI. Enables "document understanding" as a first-class capability.
---
### #5: tensorzero — LLMOps Platform
| Metric | Value |
|--------|-------|
| **Category** | Inference Optimization / LLMOps |
| **GitHub** | [tensorzero/tensorzero](https://github.com/tensorzero/tensorzero) |
| **Stars** | 11.2k |
| **Freshness** | Fresh |
| **Integration Effort** | 3/5 |
| **Impact** | 4/5 |
| **Hermes Status** | NOT IMPLEMENTED — Issue #860 |
**Why it fits Hermes:**
TensorZero unifies LLM gateway, observability, evaluation, and optimization. Replaces custom provider routing with a maintained, battle-tested platform.
---
## Honorable Mentions
| Tool | Stars | Category | Why Not Top 5 |
|------|-------|----------|---------------|
| memvid | 14.9k | Memory | Newer; Mem0 is more mature |
| mempalace | 44.8k | Memory | Already evaluated; Mem0 has broader API |
| Everything Claude Code | 154.3k | Agent | Too Claude-specific |
| Portkey AI Gateway | 11.3k | Gateway | TensorZero is OSS; Portkey is freemium |
---
## Implementation Priority
| Priority | Tool | Action | Status | Issue |
|----------|------|--------|--------|-------|
| P1 | Mem0 | Local-only mode (ChromaDB) | DONE | #842 |
| P2 | LightRAG | Set up local instance, index skills | Not started | #857 |
| P3 | tensorzero | Evaluate as provider routing | Not started | #860 |
| P4 | RAGFlow | Deploy Docker, test docs | Not started | #859 |
| P5 | n8n | Deploy for workflow viz | Not started | #858 |
---
## References
- Source: https://github.com/formatho/awesome-ai-tools
- Total tools: 414 across 9 categories
- Last updated: April 16, 2026
- Tracking issue: Timmy_Foundation/hermes-agent#842

View File

@@ -0,0 +1,60 @@
# Mem0 Local - Sovereign Memory Provider
Local-only memory provider using ChromaDB. No API key required - all data stays on your machine.
## How It Differs from Cloud Mem0
| Feature | Cloud Mem0 | Local Mem0 |
|---------|-----------|------------|
| API key | Required | Not needed |
| Data location | Mem0 servers | Your machine |
| Fact extraction | Server-side LLM | Pattern-based heuristics |
| Reranking | Yes | No |
| Cost | Freemium | Free forever |
## Setup
```bash
pip install chromadb
hermes config set memory.provider mem0-local
```
Or manually in ~/.hermes/config.yaml:
```yaml
memory:
  provider: mem0-local
```
## Config
Config file: $HERMES_HOME/mem0-local.json
| Key | Default | Description |
|-----|---------|-------------|
| storage_path | ~/.hermes/mem0-local/ | ChromaDB storage directory |
| collection_prefix | mem0 | Collection name prefix |
| max_memories | 10000 | Maximum stored memories |
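A complete mem0-local.json might look like this (the storage_path value is an example; the other values are the defaults):

```json
{
  "storage_path": "/data/hermes/mem0-local",
  "collection_prefix": "mem0",
  "max_memories": 10000
}
```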
## Tools
Same interface as cloud Mem0:
| Tool | Description |
|------|-------------|
| mem0_profile | All stored memories about the user |
| mem0_search | Semantic search by meaning |
| mem0_conclude | Store a fact verbatim |
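For reference, the tools return compact JSON. The shapes below are illustrative (memory text, score, and id are example values):

```json
{"results": [{"memory": "User prefers Python over JavaScript", "score": 0.812}], "count": 1}
```

```json
{"result": "Fact stored locally.", "id": "a1b2c3d4e5f60718"}
```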
## Data Sovereignty
All data is stored in $HERMES_HOME/mem0-local/ as a ChromaDB persistent database. No network calls are made.
To back up: copy the mem0-local/ directory.
To reset: delete the mem0-local/ directory.
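For example (a minimal sketch; assumes the default $HERMES_HOME of ~/.hermes when unset):

```shell
# HERMES_HOME defaults to ~/.hermes if not set
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
mkdir -p "$HERMES_HOME/mem0-local"  # no-op if the store already exists

# Back up: ChromaDB persists as a plain directory, so a copy is a full backup
cp -r "$HERMES_HOME/mem0-local" "$HERMES_HOME/mem0-local.bak"

# Reset: delete the directory; the provider recreates it empty on next use
rm -rf "$HERMES_HOME/mem0-local"
```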
## Limitations
- Fact extraction is pattern-based (not LLM-powered)
- No reranking - results ranked by embedding similarity only
- No cross-device sync (by design)
- Requires chromadb pip dependency (~50MB)
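On the ranking point above: the score returned by mem0_search is a simple rescaling of ChromaDB's cosine distance (which ranges 0 to 2) into a 0-1 similarity, as this sketch shows:

```python
def similarity(cosine_distance: float) -> float:
    """Map ChromaDB cosine distance (0 = identical, 2 = opposite) to a 0-1 score."""
    return max(0.0, 1.0 - cosine_distance / 2)

print(similarity(0.0))  # 1.0
print(similarity(2.0))  # 0.0
```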

View File

@@ -0,0 +1,381 @@
"""Mem0 Local memory provider - ChromaDB-backed, no API key required.

Sovereign deployment: all data stays on the user's machine. Uses ChromaDB
for vector storage and simple heuristic fact extraction (no server-side LLM).

Tool schemas are compatible with the cloud Mem0 provider:
    mem0_profile  - retrieve all stored memories
    mem0_search   - semantic search by meaning
    mem0_conclude - store a fact verbatim

Config via $HERMES_HOME/mem0-local.json or environment variables:
    MEM0_LOCAL_PATH - storage directory (default: $HERMES_HOME/mem0-local/)
"""
from __future__ import annotations
import hashlib
import json
import logging
import os
import re
import threading
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from agent.memory_provider import MemoryProvider
from tools.registry import tool_error
logger = logging.getLogger(__name__)
# Circuit breaker
_BREAKER_THRESHOLD = 5
_BREAKER_COOLDOWN_SECS = 120
def _load_config() -> dict:
    """Load local config from env vars, with $HERMES_HOME/mem0-local.json overrides."""
    from hermes_constants import get_hermes_home

    config = {
        "storage_path": os.environ.get("MEM0_LOCAL_PATH", ""),
        "collection_prefix": "mem0",
        "max_memories": 10000,
    }
    config_path = get_hermes_home() / "mem0-local.json"
    if config_path.exists():
        try:
            file_cfg = json.loads(config_path.read_text(encoding="utf-8"))
            config.update({k: v for k, v in file_cfg.items()
                           if v is not None and v != ""})
        except Exception:
            pass
    if not config["storage_path"]:
        config["storage_path"] = str(get_hermes_home() / "mem0-local")
    return config
# Simple fact extraction patterns (no LLM required)
_FACT_PATTERNS = [
    (r"(?:my|the user'?s?)\s+(?:name|username)\s+(?:is|=)\s+(.+?)(?:\.|$)", "user.name"),
    (r"(?:i|user)\s+(?:prefer|like|use|want|need)s?\s+(.+?)(?:\.|$)", "preference"),
    (r"(?:i|user)\s+(?:work|am)\s+(?:at|as|on|with)\s+(.+?)(?:\.|$)", "context"),
    (r"(?:remember|note|save|store)[:\s]+(.+?)(?:\.|$)", "explicit"),
    (r"(?:my|the)\s+(?:timezone|tz)\s+(?:is|=)\s+(.+?)(?:\.|$)", "user.timezone"),
    (r"(?:my|the)\s+(?:project|repo|codebase)\s+(?:is|=|called)\s+(.+?)(?:\.|$)", "project"),
    (r"(?:actually|correction|instead)[:\s]+(.+?)(?:\.|$)", "correction"),
]


def _extract_facts(text: str) -> List[Dict[str, str]]:
    """Extract structured facts from conversation text using pattern matching."""
    facts = []
    if not text or len(text) < 10:
        return facts
    text_lower = text.lower().strip()
    for pattern, category in _FACT_PATTERNS:
        matches = re.findall(pattern, text_lower, re.IGNORECASE)
        for match in matches:
            fact_text = match.strip() if isinstance(match, str) else match[0].strip()
            if 3 < len(fact_text) < 500:
                facts.append({
                    "content": fact_text,
                    "category": category,
                    "source_text": text[:200],
                })
    return facts
# Tool schemas (compatible with cloud Mem0)
PROFILE_SCHEMA = {
    "name": "mem0_profile",
    "description": (
        "Retrieve all stored memories about the user - preferences, facts, "
        "project context. Fast, no reranking. Use at conversation start."
    ),
    "parameters": {"type": "object", "properties": {}, "required": []},
}

SEARCH_SCHEMA = {
    "name": "mem0_search",
    "description": (
        "Search memories by meaning. Returns relevant facts ranked by similarity. "
        "Local-only - no API calls."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "What to search for."},
            "top_k": {"type": "integer", "description": "Max results (default: 10, max: 50)."},
        },
        "required": ["query"],
    },
}

CONCLUDE_SCHEMA = {
    "name": "mem0_conclude",
    "description": (
        "Store a durable fact about the user. Stored verbatim (no LLM extraction). "
        "Use for explicit preferences, corrections, or decisions. Local-only."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "conclusion": {"type": "string", "description": "The fact to store."},
        },
        "required": ["conclusion"],
    },
}
class Mem0LocalProvider(MemoryProvider):
    """Local ChromaDB-backed memory provider. No API key required."""

    def __init__(self):
        self._config = None
        self._client = None
        self._collection = None
        self._client_lock = threading.Lock()
        self._user_id = "hermes-user"
        self._storage_path = ""
        self._max_memories = 10000
        self._consecutive_failures = 0
        self._breaker_open_until = 0.0

    @property
    def name(self) -> str:
        return "mem0-local"

    def is_available(self) -> bool:
        try:
            import chromadb  # noqa: F401 -- availability check only
            return True
        except ImportError:
            return False

    def save_config(self, values, hermes_home):
        config_path = Path(hermes_home) / "mem0-local.json"
        existing = {}
        if config_path.exists():
            try:
                existing = json.loads(config_path.read_text())
            except Exception:
                pass
        existing.update(values)
        config_path.write_text(json.dumps(existing, indent=2))

    def get_config_schema(self):
        return [
            {"key": "storage_path", "description": "Storage directory for ChromaDB", "default": "~/.hermes/mem0-local/"},
            {"key": "collection_prefix", "description": "Collection name prefix", "default": "mem0"},
            {"key": "max_memories", "description": "Maximum stored memories", "default": "10000"},
        ]
    def _get_collection(self):
        """Thread-safe ChromaDB collection accessor with lazy init."""
        with self._client_lock:
            if self._collection is not None:
                return self._collection
            try:
                import chromadb
                from chromadb.config import Settings
            except ImportError:
                raise RuntimeError("chromadb package not installed. Run: pip install chromadb")
            Path(self._storage_path).mkdir(parents=True, exist_ok=True)
            self._client = chromadb.PersistentClient(
                path=self._storage_path,
                settings=Settings(anonymized_telemetry=False),
            )
            collection_name = f"{self._config.get('collection_prefix', 'mem0')}_memories"
            self._collection = self._client.get_or_create_collection(
                name=collection_name,
                metadata={"hnsw:space": "cosine"},
            )
            logger.info(
                "Mem0 local: ChromaDB collection '%s' at %s (%d docs)",
                collection_name, self._storage_path, self._collection.count(),
            )
            return self._collection

    def _doc_id(self, content: str) -> str:
        """Deterministic ID from content hash (for dedup)."""
        return hashlib.sha256(content.encode("utf-8")).hexdigest()[:16]

    def _is_breaker_open(self) -> bool:
        if self._consecutive_failures < _BREAKER_THRESHOLD:
            return False
        if time.monotonic() >= self._breaker_open_until:
            self._consecutive_failures = 0
            return False
        return True

    def _record_success(self):
        self._consecutive_failures = 0

    def _record_failure(self):
        self._consecutive_failures += 1
        if self._consecutive_failures >= _BREAKER_THRESHOLD:
            self._breaker_open_until = time.monotonic() + _BREAKER_COOLDOWN_SECS
    def initialize(self, session_id: str, **kwargs) -> None:
        self._config = _load_config()
        self._storage_path = self._config.get("storage_path", "")
        self._max_memories = int(self._config.get("max_memories", 10000))
        self._user_id = kwargs.get("user_id") or self._config.get("user_id", "hermes-user")

    def system_prompt_block(self) -> str:
        count = 0
        try:
            col = self._get_collection()
            count = col.count()
        except Exception:
            pass
        return (
            "# Mem0 Local Memory\n"
            f"Active. {count} memories stored locally. "
            "Use mem0_search to find memories, mem0_conclude to store facts, "
            "mem0_profile for a full overview."
        )

    def prefetch(self, query: str, *, session_id: str = "") -> str:
        return ""

    def queue_prefetch(self, query: str, *, session_id: str = "") -> None:
        pass
    def sync_turn(self, user_content: str, assistant_content: str, *, session_id: str = "") -> None:
        """Extract and store facts from the conversation turn."""
        if self._is_breaker_open():
            return
        try:
            col = self._get_collection()
        except Exception:
            return
        for content in [user_content, assistant_content]:
            if not content or len(content) < 10:
                continue
            facts = _extract_facts(content)
            for fact in facts:
                doc_id = self._doc_id(fact["content"])
                try:
                    col.upsert(
                        ids=[doc_id],
                        documents=[fact["content"]],
                        metadatas=[{
                            "category": fact["category"],
                            "user_id": self._user_id,
                            "timestamp": datetime.now(timezone.utc).isoformat(),
                            "source": "extracted",
                        }],
                    )
                    self._record_success()
                except Exception as e:
                    self._record_failure()
                    logger.debug("Mem0 local: failed to upsert fact: %s", e)
    def get_tool_schemas(self) -> List[Dict[str, Any]]:
        return [PROFILE_SCHEMA, SEARCH_SCHEMA, CONCLUDE_SCHEMA]

    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
        if self._is_breaker_open():
            return json.dumps({"error": "Local memory temporarily unavailable. Will retry automatically."})
        try:
            col = self._get_collection()
        except Exception as e:
            return tool_error(f"ChromaDB not available: {e}")
        if tool_name == "mem0_profile":
            try:
                results = col.get(
                    where={"user_id": self._user_id} if self._user_id else None,
                    limit=500,
                )
                documents = results.get("documents", [])
                if not documents:
                    return json.dumps({"result": "No memories stored yet."})
                lines = [d for d in documents if d]
                self._record_success()
                return json.dumps({"result": "\n".join(f"- {l}" for l in lines), "count": len(lines)})
            except Exception as e:
                self._record_failure()
                return tool_error(f"Failed to fetch profile: {e}")
        elif tool_name == "mem0_search":
            query = args.get("query", "")
            if not query:
                return tool_error("Missing required parameter: query")
            top_k = min(int(args.get("top_k", 10)), 50)
            try:
                results = col.query(
                    query_texts=[query],
                    n_results=top_k,
                    where={"user_id": self._user_id} if self._user_id else None,
                )
                documents = results.get("documents", [[]])[0]
                distances = results.get("distances", [[]])[0]
                if not documents:
                    return json.dumps({"result": "No relevant memories found."})
                items = []
                for doc, dist in zip(documents, distances):
                    score = max(0, 1 - (dist / 2))
                    items.append({"memory": doc, "score": round(score, 3)})
                self._record_success()
                return json.dumps({"results": items, "count": len(items)})
            except Exception as e:
                self._record_failure()
                return tool_error(f"Search failed: {e}")
        elif tool_name == "mem0_conclude":
            conclusion = args.get("conclusion", "")
            if not conclusion:
                return tool_error("Missing required parameter: conclusion")
            try:
                doc_id = self._doc_id(conclusion)
                col.upsert(
                    ids=[doc_id],
                    documents=[conclusion],
                    metadatas=[{
                        "category": "explicit",
                        "user_id": self._user_id,
                        "timestamp": datetime.now(timezone.utc).isoformat(),
                        "source": "conclude",
                    }],
                )
                self._record_success()
                return json.dumps({"result": "Fact stored locally.", "id": doc_id})
            except Exception as e:
                self._record_failure()
                return tool_error(f"Failed to store: {e}")
        return tool_error(f"Unknown tool: {tool_name}")

    def shutdown(self) -> None:
        with self._client_lock:
            self._collection = None
            self._client = None


def register(ctx) -> None:
    """Register Mem0 Local as a memory provider plugin."""
    ctx.register_memory_provider(Mem0LocalProvider())

View File

@@ -0,0 +1,5 @@
name: mem0_local
version: 1.0.0
description: "Mem0 local mode — ChromaDB-backed memory with no API key required. Sovereign deployment."
pip_dependencies:
  - chromadb

View File

@@ -0,0 +1,173 @@
"""Tests for Mem0 Local memory provider - ChromaDB-backed, no API key."""
import json
import os
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
# Fact extraction tests
class TestFactExtraction:
    """Test the regex-based fact extraction."""

    def _extract(self, text):
        from plugins.memory.mem0_local import _extract_facts
        return _extract_facts(text)

    def test_name_extraction(self):
        facts = self._extract("My name is Alexander Whitestone.")
        assert any("alexander whitestone" in f["content"].lower() for f in facts)

    def test_preference_extraction(self):
        facts = self._extract("I prefer using vim for editing.")
        assert any("vim" in f["content"].lower() for f in facts)

    def test_timezone_extraction(self):
        facts = self._extract("My timezone is America/New_York.")
        assert any("america/new_york" in f["content"].lower() for f in facts)

    def test_explicit_remember(self):
        facts = self._extract("Remember: always use f-strings in Python.")
        assert len(facts) > 0

    def test_correction_extraction(self):
        facts = self._extract("Actually: the port is 8080, not 3000.")
        assert len(facts) > 0

    def test_empty_input(self):
        facts = self._extract("")
        assert facts == []

    def test_short_input_ignored(self):
        facts = self._extract("Hi")
        assert facts == []

    def test_no_crash_on_random_text(self):
        facts = self._extract("The quick brown fox jumps over the lazy dog. " * 10)
        assert isinstance(facts, list)
# Config tests
class TestConfig:
    """Test configuration loading."""

    def test_default_storage_path(self, tmp_path, monkeypatch):
        monkeypatch.setenv("HERMES_HOME", str(tmp_path / ".hermes"))
        from plugins.memory.mem0_local import _load_config
        config = _load_config()
        assert "mem0-local" in config["storage_path"]

    def test_env_override(self, tmp_path, monkeypatch):
        custom_path = str(tmp_path / "custom-mem0")
        monkeypatch.setenv("MEM0_LOCAL_PATH", custom_path)
        from plugins.memory.mem0_local import _load_config
        config = _load_config()
        assert config["storage_path"] == custom_path
# Provider interface tests
class TestProviderInterface:
    """Test provider interface methods."""

    def test_name(self):
        from plugins.memory.mem0_local import Mem0LocalProvider
        provider = Mem0LocalProvider()
        assert provider.name == "mem0-local"

    def test_tool_schemas(self):
        from plugins.memory.mem0_local import Mem0LocalProvider
        provider = Mem0LocalProvider()
        schemas = provider.get_tool_schemas()
        names = {s["name"] for s in schemas}
        assert names == {"mem0_profile", "mem0_search", "mem0_conclude"}

    def test_schema_required_params(self):
        from plugins.memory.mem0_local import Mem0LocalProvider
        provider = Mem0LocalProvider()
        schemas = {s["name"]: s for s in provider.get_tool_schemas()}
        assert "query" in schemas["mem0_search"]["parameters"]["required"]
        assert "conclusion" in schemas["mem0_conclude"]["parameters"]["required"]
# ChromaDB integration tests
chromadb = None
try:
    import chromadb
except ImportError:
    pass


@pytest.mark.skipif(chromadb is None, reason="chromadb not installed")
class TestChromaDBIntegration:
    """Integration tests with real ChromaDB."""

    @pytest.fixture
    def provider(self, tmp_path, monkeypatch):
        from plugins.memory.mem0_local import Mem0LocalProvider
        monkeypatch.setenv("HERMES_HOME", str(tmp_path / ".hermes"))
        provider = Mem0LocalProvider()
        provider.initialize("test-session")
        provider._storage_path = str(tmp_path / "mem0-test")
        return provider

    def test_store_and_search(self, provider):
        result = provider.handle_tool_call("mem0_conclude", {"conclusion": "User prefers Python over JavaScript"})
        data = json.loads(result)
        assert data.get("result") == "Fact stored locally."
        result = provider.handle_tool_call("mem0_search", {"query": "programming language preference"})
        data = json.loads(result)
        assert data["count"] > 0
        assert any("python" in item["memory"].lower() for item in data["results"])

    def test_profile_empty(self, provider):
        result = provider.handle_tool_call("mem0_profile", {})
        data = json.loads(result)
        assert "No memories" in data.get("result", "") or data.get("count", 0) == 0

    def test_profile_after_store(self, provider):
        provider.handle_tool_call("mem0_conclude", {"conclusion": "User name is Alexander"})
        provider.handle_tool_call("mem0_conclude", {"conclusion": "User timezone is UTC"})
        result = provider.handle_tool_call("mem0_profile", {})
        data = json.loads(result)
        assert data["count"] >= 2

    def test_dedup(self, provider):
        provider.handle_tool_call("mem0_conclude", {"conclusion": "Project uses SQLite"})
        provider.handle_tool_call("mem0_conclude", {"conclusion": "Project uses SQLite"})
        result = provider.handle_tool_call("mem0_profile", {})
        data = json.loads(result)
        assert data["count"] == 1

    def test_search_no_results(self, provider):
        result = provider.handle_tool_call("mem0_search", {"query": "nonexistent topic xyz123"})
        data = json.loads(result)
        assert data.get("result") == "No relevant memories found." or data.get("count", 0) == 0

    def test_sync_turn_extraction(self, provider):
        provider.sync_turn(
            "My name is TestUser and I prefer dark mode.",
            "Hello TestUser! I'll remember your preference.",
        )
        result = provider.handle_tool_call("mem0_profile", {})
        data = json.loads(result)
        assert "count" in data

    def test_conclude_missing_param(self, provider):
        result = provider.handle_tool_call("mem0_conclude", {})
        data = json.loads(result)
        assert "error" in data

    def test_search_missing_query(self, provider):
        result = provider.handle_tool_call("mem0_search", {})
        data = json.loads(result)
        assert "error" in data