diff --git a/skills/research/duckduckgo-search/SKILL.md b/skills/research/duckduckgo-search/SKILL.md
index 0bfc64739..ea14e6b30 100644
--- a/skills/research/duckduckgo-search/SKILL.md
+++ b/skills/research/duckduckgo-search/SKILL.md
@@ -1,7 +1,7 @@
 ---
 name: duckduckgo-search
-description: Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content.
-version: 1.2.0
+description: Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Prefer the `ddgs` CLI when installed; use the Python DDGS library only after verifying that `ddgs` is available in the current runtime.
+version: 1.3.0
 author: gamedevCloudy
 license: MIT
 metadata:
@@ -9,26 +9,96 @@ metadata:
     tags: [search, duckduckgo, web-search, free, fallback]
     related_skills: [arxiv]
     fallback_for_toolsets: [web]
-prerequisites:
-  commands: [ddgs]
 ---
 
 # DuckDuckGo Search
 
 Free web search using DuckDuckGo. **No API key required.**
 
-Preferred when `web_search` tool is unavailable or unsuitable (no `FIRECRAWL_API_KEY` set). Can also be used as a standalone search tool.
+Preferred when `web_search` is unavailable or unsuitable (for example when `FIRECRAWL_API_KEY` is not set). Can also be used as a standalone search path when DuckDuckGo results are specifically desired.
 
-## Setup
+## Detection Flow
+
+Check what is actually available before choosing an approach:
 
 ```bash
-# Install the ddgs package (one-time)
-pip install ddgs
+# Check CLI availability
+command -v ddgs >/dev/null && echo "DDGS_CLI=installed" || echo "DDGS_CLI=missing"
 ```
 
-## Python API (Primary)
+Decision tree:
+1. If `ddgs` CLI is installed, prefer `terminal` + `ddgs`
+2. If `ddgs` CLI is missing, do not assume `execute_code` can import `ddgs`
+3. If the user wants DuckDuckGo specifically, install `ddgs` first in the relevant environment
+4. Otherwise fall back to built-in web/browser tools
 
-Use the `DDGS` class in `execute_code` for structured results with typed fields.
+Important runtime note:
+- Terminal and `execute_code` are separate runtimes
+- A successful shell install does not guarantee `execute_code` can import `ddgs`
+- Never assume third-party Python packages are preinstalled inside `execute_code`
+
+## Installation
+
+Install `ddgs` only when DuckDuckGo search is specifically needed and the runtime does not already provide it.
+
+```bash
+# Python package + CLI entrypoint
+pip install ddgs
+
+# Verify CLI
+ddgs --help
+```
+
+If a workflow depends on Python imports, verify that same runtime can import `ddgs` before using `from ddgs import DDGS`.
+
+## Method 1: CLI Search (Preferred)
+
+Use the `ddgs` command via `terminal` when it exists. This is the preferred path because it avoids assuming the `execute_code` sandbox has the `ddgs` Python package installed.
+
+```bash
+# Text search
+ddgs text -k "python async programming" -m 5
+
+# News search
+ddgs news -k "artificial intelligence" -m 5
+
+# Image search
+ddgs images -k "landscape photography" -m 10
+
+# Video search
+ddgs videos -k "python tutorial" -m 5
+
+# With region filter
+ddgs text -k "best restaurants" -m 5 -r us-en
+
+# Recent results only (d=day, w=week, m=month, y=year)
+ddgs text -k "latest AI news" -m 5 -t w
+
+# JSON output for parsing
+ddgs text -k "fastapi tutorial" -m 5 -o json
+```
+
+### CLI Flags
+
+| Flag | Description | Example |
+|------|-------------|---------|
+| `-k` | Keywords (query) — **required** | `-k "search terms"` |
+| `-m` | Max results | `-m 5` |
+| `-r` | Region | `-r us-en` |
+| `-t` | Time limit | `-t w` (week) |
+| `-s` | Safe search | `-s off` |
+| `-o` | Output format | `-o json` |
+
+## Method 2: Python API (Only After Verification)
+
+Use the `DDGS` class in `execute_code` or another Python runtime only after verifying that `ddgs` is installed there. Do not assume `execute_code` includes third-party packages by default.
+
+Safe wording:
+- "Use `execute_code` with `ddgs` after installing or verifying the package if needed"
+
+Avoid saying:
+- "`execute_code` includes `ddgs`"
+- "DuckDuckGo search works by default in `execute_code`"
 
 **Important:** `max_results` must always be passed as a **keyword argument** — positional usage raises an error on all methods.
 
@@ -76,7 +146,7 @@ from ddgs import DDGS
 with DDGS() as ddgs:
     for r in ddgs.images("semiconductor chip", max_results=5):
         print(r["title"])
-        print(r["image"]) # direct image URL
+        print(r["image"])
         print(r.get("thumbnail", ""))
         print(r.get("source", ""))
         print()
@@ -94,9 +164,9 @@ from ddgs import DDGS
 with DDGS() as ddgs:
     for r in ddgs.videos("FastAPI tutorial", max_results=5):
         print(r["title"])
-        print(r.get("content", "")) # video URL
-        print(r.get("duration", "")) # e.g. "26:03"
-        print(r.get("provider", "")) # YouTube, etc.
+        print(r.get("content", ""))
+        print(r.get("duration", ""))
+        print(r.get("provider", ""))
         print(r.get("published", ""))
         print()
 ```
@@ -112,50 +182,17 @@ Returns: `title`, `content`, `description`, `duration`, `provider`, `published`,
 | `images()` | Visuals, diagrams | title, image, thumbnail, url |
 | `videos()` | Tutorials, demos | title, content, duration, provider |
 
-## CLI (Alternative)
-
-Use the `ddgs` command via terminal when you don't need structured field access.
-
-```bash
-# Text search
-ddgs text -k "python async programming" -m 5
-
-# News search
-ddgs news -k "artificial intelligence" -m 5
-
-# Image search
-ddgs images -k "landscape photography" -m 10
-
-# Video search
-ddgs videos -k "python tutorial" -m 5
-
-# With region filter
-ddgs text -k "best restaurants" -m 5 -r us-en
-
-# Recent results only (d=day, w=week, m=month, y=year)
-ddgs text -k "latest AI news" -m 5 -t w
-
-# JSON output for parsing
-ddgs text -k "fastapi tutorial" -m 5 -o json
-```
-
-### CLI Flags
-
-| Flag | Description | Example |
-|------|-------------|---------|
-| `-k` | Keywords (query) — **required** | `-k "search terms"` |
-| `-m` | Max results | `-m 5` |
-| `-r` | Region | `-r us-en` |
-| `-t` | Time limit | `-t w` (week) |
-| `-s` | Safe search | `-s off` |
-| `-o` | Output format | `-o json` |
-
 ## Workflow: Search then Extract
 
-DuckDuckGo returns titles, URLs, and snippets — not full page content. To get full content, follow up with `web_extract`:
+DuckDuckGo returns titles, URLs, and snippets — not full page content. To get full page content, search first and then extract the most relevant URL with `web_extract`, browser tools, or curl.
 
-1. **Search** with ddgs to find relevant URLs
-2. **Extract** content using the `web_extract` tool (if available) or curl
+CLI example:
+
+```bash
+ddgs text -k "fastapi deployment guide" -m 3 -o json
+```
+
+Python example, only after verifying `ddgs` is installed in that runtime:
 
 ```python
 from ddgs import DDGS
@@ -164,25 +201,37 @@ with DDGS() as ddgs:
     results = list(ddgs.text("fastapi deployment guide", max_results=3))
     for r in results:
         print(r["title"], "->", r["href"])
-
-# Then use web_extract tool on the best URL
 ```
 
+Then extract the best URL with `web_extract` or another content-retrieval tool.
+
 ## Limitations
 
 - **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add a short delay between searches if needed.
-- **No content extraction**: ddgs returns snippets, not full page content. Use `web_extract` or curl for that.
+- **No content extraction**: `ddgs` returns snippets, not full page content. Use `web_extract`, browser tools, or curl for the full article/page.
 - **Results quality**: Generally good but less configurable than Firecrawl's search.
 - **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or wait a few seconds.
-- **Field variability**: Return fields may vary between results or ddgs versions. Use `.get()` for optional fields to avoid KeyError.
+- **Field variability**: Return fields may vary between results or `ddgs` versions. Use `.get()` for optional fields to avoid `KeyError`.
+- **Separate runtimes**: A successful `ddgs` install in terminal does not automatically mean `execute_code` can import it.
+
+## Troubleshooting
+
+| Problem | Likely Cause | What To Do |
+|---------|--------------|------------|
+| `ddgs: command not found` | CLI not installed in the shell environment | Install `ddgs`, or use built-in web/browser tools instead |
+| `ModuleNotFoundError: No module named 'ddgs'` | Python runtime does not have the package installed | Do not use Python DDGS there until that runtime is prepared |
+| Search returns nothing | Temporary rate limiting or poor query | Wait a few seconds, retry, or adjust the query |
+| CLI works but `execute_code` import fails | Terminal and `execute_code` are different runtimes | Keep using CLI, or separately prepare the Python runtime |
 
 ## Pitfalls
 
 - **`max_results` is keyword-only**: `ddgs.text("query", 5)` raises an error. Use `ddgs.text("query", max_results=5)`.
+- **Do not assume the CLI exists**: Check `command -v ddgs` before using it.
+- **Do not assume `execute_code` can import `ddgs`**: `from ddgs import DDGS` may fail with `ModuleNotFoundError` unless that runtime was prepared separately.
+- **Package name**: The package is `ddgs` (previously `duckduckgo-search`). Install with `pip install ddgs`.
 - **Don't confuse `-k` and `-m`** (CLI): `-k` is for keywords, `-m` is for max results count.
-- **Package name**: The package is `ddgs` (was previously `duckduckgo-search`). Install with `pip install ddgs`.
-- **Empty results**: If ddgs returns nothing, it may be rate-limited. Wait a few seconds and retry.
+- **Empty results**: If `ddgs` returns nothing, it may be rate-limited. Wait a few seconds and retry.
 
 ## Validated With
 
-Smoke-tested with `ddgs==9.11.2` on Python 3.13. All four methods (text, news, images, videos) confirmed working with keyword `max_results`.
+Validated examples against `ddgs==9.11.2` semantics. Skill guidance now treats CLI availability and Python import availability as separate concerns so the documented workflow matches actual runtime behavior.
diff --git a/website/docs/reference/skills-catalog.md b/website/docs/reference/skills-catalog.md
index 8fb22f397..4a1ecf629 100644
--- a/website/docs/reference/skills-catalog.md
+++ b/website/docs/reference/skills-catalog.md
@@ -253,7 +253,7 @@ Skills for academic research, paper discovery, literature review, domain reconna
 | `arxiv` | Search and retrieve academic papers from arXiv using their free REST API. No API key needed. Search by keyword, author, category, or ID. Combine with web_extract or the ocr-and-documents skill to read full paper content. | `research/arxiv` |
 | `blogwatcher` | Monitor blogs and RSS/Atom feeds for updates using the blogwatcher CLI. Add blogs, scan for new articles, and track what you've read. | `research/blogwatcher` |
 | `domain-intel` | Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required. | `research/domain-intel` |
-| `duckduckgo-search` | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content. | `research/duckduckgo-search` |
+| `duckduckgo-search` | Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Prefer the `ddgs` CLI when installed; use the Python DDGS library only after verifying that `ddgs` is available in the current runtime. | `research/duckduckgo-search` |
 | `ml-paper-writing` | Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. Includes LaTeX templates, reviewer guidelines, and citation verificatio… | `research/ml-paper-writing` |
 | `polymarket` | Query Polymarket prediction market data — search markets, get prices, orderbooks, and price history. Read-only via public REST APIs, no API key needed. | `research/polymarket` |
 
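Review note: the detection flow that the SKILL.md change documents (prefer the CLI, use the Python import only after verifying it, otherwise fall back) could be sketched in Python roughly as follows. This is an illustration and not part of the patch. The helper name `choose_search_path` and its return labels are hypothetical; only `shutil.which` and `importlib.util.find_spec` are real standard-library calls. The check applies only to the runtime that executes it, which matches the patch's point that terminal and `execute_code` are separate runtimes.

```python
import importlib.util
import shutil


def choose_search_path(which=shutil.which, find_spec=importlib.util.find_spec):
    """Pick a search path for the current runtime (hypothetical helper).

    Returns "cli" when a `ddgs` executable is on PATH, "python" when only the
    `ddgs` module is importable in this interpreter, and "fallback" otherwise.
    The lookups are injectable so the decision logic can be tested offline.
    """
    # 1. Prefer the CLI when it is installed in this environment.
    if which("ddgs") is not None:
        return "cli"
    # 2. Use the Python API only after verifying the import is available here.
    if find_spec("ddgs") is not None:
        return "python"
    # 3. Otherwise fall back to built-in web/browser tools.
    return "fallback"


if __name__ == "__main__":
    print(choose_search_path())
```

Running the check inside each runtime separately, once in the shell-side Python and once inside `execute_code`, avoids inferring one runtime's state from the other.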