From bf048c8aecf0a3d7801ecf9f32f766e97046179b Mon Sep 17 00:00:00 2001
From: teknium1
Date: Sat, 7 Mar 2026 20:39:05 -0800
Subject: [PATCH] =?UTF-8?q?feat:=20add=20qmd=20optional=20skill=20?=
 =?UTF-8?q?=E2=80=94=20local=20knowledge=20base=20search?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add official optional skill for qmd (tobi/qmd), a local on-device search
engine for personal knowledge bases, notes, docs, and meeting transcripts.

Covers:
- Installation and setup for macOS and Linux
- Collection management and context annotations
- All search modes: BM25, vector, hybrid with reranking
- MCP integration (stdio and HTTP daemon modes)
- Structured query patterns and best practices
- systemd/launchd service configs for daemon persistence

Placed in optional-skills/ due to heavyweight requirements (Node >= 22,
~2GB local models).
---
 optional-skills/research/qmd/SKILL.md | 441 ++++++++++++++++++++++++++
 1 file changed, 441 insertions(+)
 create mode 100644 optional-skills/research/qmd/SKILL.md

diff --git a/optional-skills/research/qmd/SKILL.md b/optional-skills/research/qmd/SKILL.md
new file mode 100644
index 000000000..9dce442ed
--- /dev/null
+++ b/optional-skills/research/qmd/SKILL.md
@@ -0,0 +1,441 @@
+---
+name: qmd
+description: Search personal knowledge bases, notes, docs, and meeting transcripts locally using qmd — a hybrid retrieval engine with BM25, vector search, and LLM reranking. Supports CLI and MCP integration.
+version: 1.0.0
+author: Hermes Agent + Teknium
+license: MIT
+platforms: [macos, linux]
+metadata:
+  hermes:
+    tags: [Search, Knowledge-Base, RAG, Notes, MCP, Local-AI]
+    related_skills: [obsidian, native-mcp, arxiv]
+---
+
+# QMD — Query Markup Documents
+
+Local, on-device search engine for personal knowledge bases.
+Indexes markdown notes, meeting transcripts, documentation, and any text-based
+files, then provides hybrid search combining keyword matching, semantic
+understanding, and LLM-powered reranking — all running locally with no cloud
+dependencies.
+
+Created by [Tobi Lütke](https://github.com/tobi/qmd). MIT licensed.
+
+## When to Use
+
+- User asks to search their notes, docs, knowledge base, or meeting transcripts
+- User wants to find something across a large collection of markdown/text files
+- User wants semantic search ("find notes about X concept") not just keyword grep
+- User has already set up qmd collections and wants to query them
+- User asks to set up a local knowledge base or document search system
+- Keywords: "search my notes", "find in my docs", "knowledge base", "qmd"
+
+## Prerequisites
+
+### Node.js >= 22 (required)
+
+```bash
+# Check version
+node --version  # must be >= 22
+
+# macOS — install or upgrade via Homebrew
+brew install node@22
+
+# Linux — use NodeSource or nvm
+curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
+sudo apt-get install -y nodejs
+# or with nvm:
+nvm install 22 && nvm use 22
+```
+
+### SQLite with Extension Support (macOS only)
+
+macOS system SQLite lacks extension loading.
+Install via Homebrew:
+
+```bash
+brew install sqlite
+```
+
+### Install qmd
+
+```bash
+npm install -g @tobilu/qmd
+# or with Bun:
+bun install -g @tobilu/qmd
+```
+
+First run auto-downloads 3 local GGUF models (~2GB total):
+
+| Model | Purpose | Size |
+|-------|---------|------|
+| embeddinggemma-300M-Q8_0 | Vector embeddings | ~300MB |
+| qwen3-reranker-0.6b-q8_0 | Result reranking | ~640MB |
+| qmd-query-expansion-1.7B | Query expansion | ~1.1GB |
+
+### Verify Installation
+
+```bash
+qmd --version
+qmd status
+```
+
+## Quick Reference
+
+| Command | What It Does | Speed |
+|---------|-------------|-------|
+| `qmd search "query"` | BM25 keyword search (no models) | ~0.2s |
+| `qmd vsearch "query"` | Semantic vector search (1 model) | ~3s |
+| `qmd query "query"` | Hybrid + reranking (all 3 models) | ~2-3s warm, ~19s cold |
+| `qmd get <id-or-path>` | Retrieve full document content | instant |
+| `qmd multi-get "glob"` | Retrieve multiple files | instant |
+| `qmd collection add <path> --name <name>` | Add a directory as a collection | instant |
+| `qmd context add <collection-uri> "description"` | Add context metadata to improve retrieval | instant |
+| `qmd embed` | Generate/update vector embeddings | varies |
+| `qmd status` | Show index health and collection info | instant |
+| `qmd mcp` | Start MCP server (stdio) | persistent |
+| `qmd mcp --http --daemon` | Start MCP server (HTTP, warm models) | persistent |
+
+## Setup Workflow
+
+### 1. Add Collections
+
+Point qmd at directories containing your documents:
+
+```bash
+# Add a notes directory
+qmd collection add ~/notes --name notes
+
+# Add project docs
+qmd collection add ~/projects/myproject/docs --name project-docs
+
+# Add meeting transcripts
+qmd collection add ~/meetings --name meetings
+
+# List all collections
+qmd collection list
+```
+
+### 2. Add Context Descriptions
+
+Context metadata helps the search engine understand what each collection
+contains.
+This significantly improves retrieval quality:
+
+```bash
+qmd context add qmd://notes "Personal notes, ideas, and journal entries"
+qmd context add qmd://project-docs "Technical documentation for the main project"
+qmd context add qmd://meetings "Meeting transcripts and action items from team syncs"
+```
+
+### 3. Generate Embeddings
+
+```bash
+qmd embed
+```
+
+This processes all documents in all collections and generates vector
+embeddings. Re-run after adding new documents or collections.
+
+### 4. Verify
+
+```bash
+qmd status  # shows index health, collection stats, model info
+```
+
+## Search Patterns
+
+### Fast Keyword Search (BM25)
+
+Best for: exact terms, code identifiers, names, known phrases.
+No models loaded — near-instant results.
+
+```bash
+qmd search "authentication middleware"
+qmd search "handleError async"
+```
+
+### Semantic Vector Search
+
+Best for: natural language questions, conceptual queries.
+Loads embedding model (~3s first query).
+
+```bash
+qmd vsearch "how does the rate limiter handle burst traffic"
+qmd vsearch "ideas for improving onboarding flow"
+```
+
+### Hybrid Search with Reranking (Best Quality)
+
+Best for: important queries where quality matters most.
+Uses all 3 models — query expansion, parallel BM25+vector, reranking.
+
+```bash
+qmd query "what decisions were made about the database migration"
+```
+
+### Structured Multi-Mode Queries
+
+Combine different search types in a single query for precision:
+
+```bash
+# BM25 for exact term + vector for concept
+qmd query $'lex: rate limiter\nvec: how does throttling work under load'
+
+# With query expansion
+qmd query $'expand: database migration plan\nlex: "schema change"'
+```
+
+### Query Syntax (lex/BM25 mode)
+
+| Syntax | Effect | Example |
+|--------|--------|---------|
+| `term` | Prefix match | `perf` matches "performance" |
+| `"phrase"` | Exact phrase | `"rate limiter"` |
+| `-term` | Exclude term | `performance -sports` |
+
+### HyDE (Hypothetical Document Embeddings)
+
+For complex topics, write what you expect the answer to look like:
+
+```bash
+qmd query $'hyde: The migration plan involves three phases. First, we add the new columns without dropping the old ones. Then we backfill data. Finally we cut over and remove legacy columns.'
+```
+
+### Scoping to Collections
+
+```bash
+qmd search "query" --collection notes
+qmd query "query" --collection project-docs
+```
+
+### Output Formats
+
+```bash
+qmd search "query" --json              # JSON output (best for parsing)
+qmd search "query" --limit 5           # Limit results
+qmd get "#abc123"                      # Get by document ID
+qmd get "path/to/file.md"              # Get by file path
+qmd get "file.md:50" -l 100            # Get specific line range
+qmd multi-get "journals/*.md" --json   # Batch retrieve by glob
+```
+
+## MCP Integration (Recommended)
+
+qmd exposes an MCP server that provides search tools directly to
+Hermes Agent via the native MCP client. This is the preferred
+integration — once configured, the agent gets qmd tools automatically
+without needing to load this skill.
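
Whichever integration path is used, the structured query text shown under Search Patterns (one `mode: text` pair per line) can be assembled programmatically before it is handed to the CLI. The following is a minimal sketch, not part of qmd: the helper names `build_structured_query` and `build_command` are hypothetical, and combining `--limit` with `qmd query` is an assumption, since the examples above only show `--limit` with `qmd search`.

```python
# Prefixes that appear in the structured-query examples above.
VALID_MODES = {"lex", "vec", "expand", "hyde"}


def build_structured_query(parts):
    """Join (mode, text) pairs into newline-separated 'mode: text' lines."""
    lines = []
    for mode, text in parts:
        if mode not in VALID_MODES:
            raise ValueError(f"unknown search mode: {mode!r}")
        lines.append(f"{mode}: {text}")
    return "\n".join(lines)


def build_command(parts, collection=None, limit=None):
    """Return an argv list suitable for subprocess.run; nothing is executed here."""
    argv = ["qmd", "query", build_structured_query(parts), "--json"]
    if collection is not None:
        argv += ["--collection", collection]
    if limit is not None:
        # Assumption: `qmd query` accepts --limit the same way `qmd search` does.
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    cmd = build_command(
        [("lex", "rate limiter"), ("vec", "how does throttling work under load")],
        collection="project-docs",
    )
    print(cmd)
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems with the embedded newline, which is why the sketch returns an argv list rather than a single command string.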
+
+### Option A: Stdio Mode (Simple)
+
+Add to `~/.hermes/config.yaml`:
+
+```yaml
+mcp_servers:
+  qmd:
+    command: "qmd"
+    args: ["mcp"]
+    timeout: 30
+    connect_timeout: 45
+```
+
+This registers tools: `mcp_qmd_search`, `mcp_qmd_vsearch`,
+`mcp_qmd_deep_search`, `mcp_qmd_get`, `mcp_qmd_status`.
+
+**Tradeoff:** Models load on first search call (~19s cold start),
+then stay warm for the session. Acceptable for occasional use.
+
+### Option B: HTTP Daemon Mode (Fast, Recommended for Heavy Use)
+
+Start the qmd daemon separately — it keeps models warm in memory:
+
+```bash
+# Start daemon (persists across agent restarts)
+qmd mcp --http --daemon
+
+# Runs on http://localhost:8181 by default
+```
+
+Then configure Hermes Agent to connect via HTTP:
+
+```yaml
+mcp_servers:
+  qmd:
+    url: "http://localhost:8181/mcp"
+    timeout: 30
+```
+
+**Tradeoff:** Uses ~2GB RAM while running, but every query is fast
+(~2-3s). Best for users who search frequently.
+
+### Keeping the Daemon Running
+
+#### macOS (launchd)
+
+```bash
+cat > ~/Library/LaunchAgents/com.qmd.daemon.plist << 'EOF'
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+<dict>
+  <key>Label</key>
+  <string>com.qmd.daemon</string>
+  <key>ProgramArguments</key>
+  <array>
+    <string>qmd</string>
+    <string>mcp</string>
+    <string>--http</string>
+    <string>--daemon</string>
+  </array>
+  <key>RunAtLoad</key>
+  <true/>
+  <key>KeepAlive</key>
+  <true/>
+  <key>StandardOutPath</key>
+  <string>/tmp/qmd-daemon.log</string>
+  <key>StandardErrorPath</key>
+  <string>/tmp/qmd-daemon.log</string>
+</dict>
+</plist>
+EOF
+
+launchctl load ~/Library/LaunchAgents/com.qmd.daemon.plist
+```
+
+#### Linux (systemd user service)
+
+```bash
+mkdir -p ~/.config/systemd/user
+
+cat > ~/.config/systemd/user/qmd-daemon.service << 'EOF'
+[Unit]
+Description=QMD MCP Daemon
+After=network.target
+
+[Service]
+ExecStart=qmd mcp --http --daemon
+Restart=on-failure
+RestartSec=10
+Environment=PATH=/usr/local/bin:/usr/bin:/bin
+
+[Install]
+WantedBy=default.target
+EOF
+
+systemctl --user daemon-reload
+systemctl --user enable --now qmd-daemon
+systemctl --user status qmd-daemon
+```
+
+### MCP Tools Reference
+
+Once connected, these tools are available as `mcp_qmd_*`:
+
+| MCP Tool | Maps To | Description |
+|----------|---------|-------------|
+| `mcp_qmd_search` | `qmd search` | BM25 keyword search |
+| `mcp_qmd_vsearch` | `qmd vsearch` | Semantic vector search |
+| `mcp_qmd_deep_search` | `qmd query` | Hybrid search + reranking |
+| `mcp_qmd_get` | `qmd get` | Retrieve document by ID or path |
+| `mcp_qmd_status` | `qmd status` | Index health and stats |
+
+The MCP tools accept structured JSON queries for multi-mode search:
+
+```json
+{
+  "searches": [
+    {"type": "lex", "query": "authentication middleware"},
+    {"type": "vec", "query": "how user login is verified"}
+  ],
+  "collections": ["project-docs"],
+  "limit": 10
+}
+```
+
+## CLI Usage (Without MCP)
+
+When MCP is not configured, use qmd directly via terminal:
+
+```
+terminal(command="qmd query 'what was decided about the API redesign' --json", timeout=30)
+```
+
+For setup and management tasks, always use terminal:
+
+```
+terminal(command="qmd collection add ~/Documents/notes --name notes")
+terminal(command="qmd context add qmd://notes 'Personal research notes and ideas'")
+terminal(command="qmd embed")
+terminal(command="qmd status")
+```
+
+## How the Search Pipeline Works
+
+Understanding the internals helps choose the right search mode:
+
+1. **Query Expansion** — A fine-tuned 1.7B model generates 2 alternative
+   queries. The original gets 2x weight in fusion.
+2. **Parallel Retrieval** — BM25 (SQLite FTS5) and vector search run
+   simultaneously across all query variants.
+3. **RRF Fusion** — Reciprocal Rank Fusion (k=60) merges results.
+   Top-rank bonus: #1 gets +0.05, #2-3 get +0.02.
+4. **LLM Reranking** — qwen3-reranker scores top 30 candidates (0.0-1.0).
+5. **Position-Aware Blending** — Ranks 1-3: 75% retrieval / 25% reranker.
+   Ranks 4-10: 60/40. Ranks 11+: 40/60 (trusts reranker more for long tail).
+
+**Smart Chunking:** Documents are split at natural break points (headings,
+code blocks, blank lines) targeting ~900 tokens with 15% overlap. Code
+blocks are never split mid-block.
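
Steps 3 and 5 of the pipeline above can be sketched in a few lines of Python. This is an illustrative reading of the description, not qmd's actual implementation. The function names (`rrf_fuse`, `blend_weights`, `blend`) are hypothetical, the top-rank bonus is left out because the text does not say how it is applied, and retrieval scores are assumed to be normalized to the 0.0-1.0 range before blending.

```python
from collections import defaultdict

RRF_K = 60  # fusion constant quoted in step 3


def rrf_fuse(ranked_lists, weights=None, k=RRF_K):
    """Weighted Reciprocal Rank Fusion over lists of document IDs (best first).

    Each document receives weight / (k + rank) from every list it appears in.
    """
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    scores = defaultdict(float)
    for doc_ids, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    return dict(scores)


def blend_weights(rank):
    """Return (retrieval, reranker) weights for a fused rank, per step 5."""
    if rank <= 3:
        return 0.75, 0.25
    if rank <= 10:
        return 0.60, 0.40
    return 0.40, 0.60


def blend(rank, retrieval_score, reranker_score):
    """Combine a normalized retrieval score with a reranker score (both 0.0-1.0)."""
    w_retrieval, w_reranker = blend_weights(rank)
    return w_retrieval * retrieval_score + w_reranker * reranker_score


# Toy example: the original query's result list gets 2x weight, as in step 1.
original = ["a", "b", "c"]
variant_1 = ["b", "a", "d"]
variant_2 = ["c", "b", "e"]
fused = rrf_fuse([original, variant_1, variant_2], weights=[2.0, 1.0, 1.0])
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

In the toy example, document `b` never ranks first for the original query but appears near the top of all three lists, so it overtakes `a`. Rewarding agreement across query variants in this way is the usual motivation for rank fusion.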
+
+## Best Practices
+
+1. **Always add context descriptions** — `qmd context add` dramatically
+   improves retrieval accuracy. Describe what each collection contains.
+2. **Re-embed after adding documents** — `qmd embed` must be re-run when
+   new files are added to collections.
+3. **Use `qmd search` for speed** — when you need fast keyword lookup
+   (code identifiers, exact names), BM25 is instant and needs no models.
+4. **Use `qmd query` for quality** — when the question is conceptual or
+   the user needs the best possible results, use hybrid search.
+5. **Prefer MCP integration** — once configured, the agent gets native
+   tools without needing to load this skill each time.
+6. **Daemon mode for frequent users** — if the user searches their
+   knowledge base regularly, recommend the HTTP daemon setup.
+7. **First query in structured search gets 2x weight** — put the most
+   important/certain query first when combining lex and vec.
+
+## Troubleshooting
+
+### "Models downloading on first run"
+Normal — qmd auto-downloads ~2GB of GGUF models on first use.
+This is a one-time operation.
+
+### Cold start latency (~19s)
+This happens when models aren't loaded in memory. Solutions:
+- Use HTTP daemon mode (`qmd mcp --http --daemon`) to keep warm
+- Use `qmd search` (BM25 only) when models aren't needed
+- MCP stdio mode loads models on first search, stays warm for session
+
+### macOS: "unable to load extension"
+Install Homebrew SQLite: `brew install sqlite`
+Then ensure it's on PATH before system SQLite.
+
+### "No collections found"
+Run `qmd collection add <path> --name <name>` to add directories,
+then `qmd embed` to index them.
+
+### Embedding model override (CJK/multilingual)
+Set `QMD_EMBED_MODEL` environment variable for non-English content:
+```bash
+export QMD_EMBED_MODEL="your-multilingual-model"
+```
+
+## Data Storage
+
+- **Index & vectors:** `~/.cache/qmd/index.sqlite`
+- **Models:** Auto-downloaded to local cache on first run
+- **No cloud dependencies** — everything runs locally
+
+## References
+
+- [GitHub: tobi/qmd](https://github.com/tobi/qmd)
+- [QMD Changelog](https://github.com/tobi/qmd/blob/main/CHANGELOG.md)