Merge PR #598: feat(skill): expand duckduckgo-search with DDGS Python API coverage
Authored by areu01or00. Adds Python DDGS library examples for text, news, images, and video search with structured return field docs.
@@ -1,7 +1,7 @@
---
name: duckduckgo-search
description: Free web search via DuckDuckGo when Firecrawl is unavailable. No API key needed. Use ddgs CLI or Python library to find URLs, then web_extract for content.
version: 1.1.0
description: Free web search via DuckDuckGo — text, news, images, videos. No API key needed. Use the Python DDGS library or CLI to search, then web_extract for full content.
version: 1.2.0
author: gamedevCloudy
license: MIT
metadata:
@@ -10,17 +10,11 @@ metadata:
related_skills: [arxiv]
---

# DuckDuckGo Search (Firecrawl Fallback)
# DuckDuckGo Search

Free web search using DuckDuckGo. **No API key required.**

## When to Use This

Use this skill ONLY when the `web_search` tool is not available (i.e., `FIRECRAWL_API_KEY` is not set). If `web_search` works, prefer it — it returns richer results with built-in content extraction.

Signs you need this fallback:
- `web_search` tool is not listed in your available tools
- `web_search` returns an error about missing FIRECRAWL_API_KEY
Preferred when `web_search` tool is unavailable or unsuitable (no `FIRECRAWL_API_KEY` set). Can also be used as a standalone search tool.

## Setup

@@ -29,14 +23,109 @@ Signs you need this fallback:
pip install ddgs
```

## Web Search (Primary Use Case)
## Python API (Primary)

### Via Terminal (ddgs CLI)
Use the `DDGS` class in `execute_code` for structured results with typed fields.

**Important:** `max_results` must always be passed as a **keyword argument** — positional usage raises an error on all methods.
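
To illustrate what keyword-only means here, the sketch below uses a hypothetical stand-in function, `fake_text`. Its signature is an assumption modelled on the rule above, not the real `DDGS.text` signature, and the `TypeError` it raises is what plain Python does for such a stand-in. Passing the limit positionally fails; passing it by name works.

```python
def fake_text(query, *, max_results=10):
    """Hypothetical stand-in for a search method with a keyword-only limit."""
    return [f"{query} #{i}" for i in range(1, max_results + 1)]


def call_positionally(query, limit):
    """Return the error message produced by passing the limit positionally."""
    try:
        fake_text(query, limit)  # the limit is rejected as a positional argument
    except TypeError as exc:
        return str(exc)
    return None


positional_error = call_positionally("python async programming", 5)
keyword_results = fake_text("python async programming", max_results=5)
print(positional_error is not None, len(keyword_results))
```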

### Text Search

Best for: general research, companies, documentation.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.text("python async programming", max_results=5):
        print(r["title"])
        print(r["href"])
        print(r.get("body", "")[:200])
        print()
```

Returns: `title`, `href`, `body`
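
Once the three fields are in hand, a small formatter can turn results into something easier to scan. The following is a minimal sketch, assuming each result is a dict with the fields listed above; `format_text_results` and the sample data are hypothetical and not part of `ddgs`.

```python
def format_text_results(results, snippet_chars=200):
    """Render text-search result dicts as Markdown bullet lines."""
    lines = []
    for r in results:
        # .get() keeps the formatter tolerant of results that lack a field.
        title = r.get("title", "(untitled)")
        href = r.get("href", "")
        body = r.get("body", "")[:snippet_chars]
        lines.append(f"- [{title}]({href}): {body}")
    return "\n".join(lines)


# Made-up sample data in the documented shape; not real search output.
sample = [
    {"title": "Asyncio docs", "href": "https://example.com/asyncio", "body": "Coroutines and tasks."},
    {"title": "No snippet", "href": "https://example.com/empty"},
]
print(format_text_results(sample))
```

With a live client, the same function could be fed `ddgs.text("query", max_results=5)` directly.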

### News Search

Best for: current events, breaking news, latest updates.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.news("AI regulation 2026", max_results=5):
        print(r["date"], "-", r["title"])
        print(r.get("source", ""), "|", r["url"])
        print(r.get("body", "")[:200])
        print()
```

Returns: `date`, `title`, `body`, `url`, `image`, `source`
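
News results are often more useful sorted by recency. The sketch below assumes the `date` field is an ISO 8601 string, which is an assumption about the library's output and should be checked against real results; `parse_date`, `newest_first`, and the sample items are hypothetical.

```python
from datetime import datetime, timezone


def parse_date(value):
    """Parse an ISO 8601 string; return None when missing or unparseable."""
    try:
        parsed = datetime.fromisoformat(value)
    except (TypeError, ValueError):
        return None
    # Treat naive timestamps as UTC so all values are comparable.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def newest_first(results):
    """Sort news dicts by date, newest first; undated items go last."""
    oldest = datetime.min.replace(tzinfo=timezone.utc)
    return sorted(results, key=lambda r: parse_date(r.get("date")) or oldest, reverse=True)


# Made-up sample data; the date format is assumed, not confirmed.
sample_news = [
    {"date": "2026-01-10T08:00:00+00:00", "title": "Older"},
    {"title": "Undated"},
    {"date": "2026-01-12T08:00:00+00:00", "title": "Newer"},
]
ordered_titles = [r["title"] for r in newest_first(sample_news)]
print(ordered_titles)
```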

### Image Search

Best for: visual references, product images, diagrams.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.images("semiconductor chip", max_results=5):
        print(r["title"])
        print(r["image"])          # direct image URL
        print(r.get("thumbnail", ""))
        print(r.get("source", ""))
        print()
```

Returns: `title`, `image`, `thumbnail`, `url`, `height`, `width`, `source`
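
The `width` and `height` fields make it possible to drop thumbnails and icons before fetching anything. This is a minimal sketch, assuming those fields hold integers or numeric strings (an assumption); `large_images` and the sample entries are hypothetical.

```python
def large_images(results, min_width=800, min_height=600):
    """Keep image results whose reported size meets both minimums."""
    kept = []
    for r in results:
        try:
            width = int(r.get("width", 0))
            height = int(r.get("height", 0))
        except (TypeError, ValueError):
            continue  # skip results with unusable size values
        if width >= min_width and height >= min_height:
            kept.append(r)
    return kept


# Made-up sample data in the documented shape; not real search output.
sample_images = [
    {"title": "Die shot", "image": "https://example.com/a.jpg", "width": 1920, "height": 1080},
    {"title": "Icon", "image": "https://example.com/b.png", "width": 64, "height": 64},
    {"title": "Unknown size", "image": "https://example.com/c.jpg", "width": None},
]
kept_titles = [r["title"] for r in large_images(sample_images)]
print(kept_titles)
```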

### Video Search

Best for: tutorials, demos, explainers.

```python
from ddgs import DDGS

with DDGS() as ddgs:
    for r in ddgs.videos("FastAPI tutorial", max_results=5):
        print(r["title"])
        print(r.get("content", ""))    # video URL
        print(r.get("duration", ""))   # e.g. "26:03"
        print(r.get("provider", ""))   # YouTube, etc.
        print(r.get("published", ""))
        print()
```

Returns: `title`, `content`, `description`, `duration`, `provider`, `published`, `statistics`, `uploader`
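
Because `duration` arrives as a string such as "26:03", filtering by length needs a conversion step. The helper below is a hypothetical sketch that assumes colon-separated `MM:SS` or `HH:MM:SS` values and returns `None` for anything else.

```python
def duration_to_seconds(duration):
    """Convert "MM:SS" or "HH:MM:SS" to seconds; None if empty or malformed."""
    if not duration:
        return None
    seconds = 0
    for part in duration.split(":"):
        if not part.isdigit():
            return None
        # Each further field is worth 1/60 of the previous one.
        seconds = seconds * 60 + int(part)
    return seconds


short_video = duration_to_seconds("26:03")
long_video = duration_to_seconds("1:02:03")
print(short_video, long_video)
```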

### Quick Reference

| Method | Use When | Key Fields |
|--------|----------|------------|
| `text()` | General research, companies | title, href, body |
| `news()` | Current events, updates | date, title, source, body, url |
| `images()` | Visuals, diagrams | title, image, thumbnail, url |
| `videos()` | Tutorials, demos | title, content, duration, provider |
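
The table above maps naturally onto a small dispatcher that picks the method by name. This is a sketch under stated assumptions: `run_search` and `FakeClient` are hypothetical, and `FakeClient` exists only so the example runs offline. With the real library, an open `DDGS()` instance would be passed as `client`.

```python
SEARCH_KINDS = ("text", "news", "images", "videos")


def run_search(client, kind, query, max_results=5):
    """Call the method named by `kind` on a DDGS-like client."""
    if kind not in SEARCH_KINDS:
        raise ValueError(f"unknown search kind: {kind!r}")
    method = getattr(client, kind)
    # max_results is always passed by keyword, as the methods require.
    return list(method(query, max_results=max_results))


class FakeClient:
    """Hypothetical offline stand-in so the sketch runs without network access."""

    def text(self, query, *, max_results=None):
        hits = [
            {"title": f"{query} {i}", "href": f"https://example.com/{i}", "body": ""}
            for i in range(10)
        ]
        return hits[:max_results]


hits = run_search(FakeClient(), "text", "fastapi", max_results=3)
print([h["title"] for h in hits])
```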

## CLI (Alternative)

Use the `ddgs` command via terminal when you don't need structured field access.

```bash
# Basic search — returns titles, URLs, and snippets
# Text search
ddgs text -k "python async programming" -m 5

# News search
ddgs news -k "artificial intelligence" -m 5

# Image search
ddgs images -k "landscape photography" -m 10

# Video search
ddgs videos -k "python tutorial" -m 5

# With region filter
ddgs text -k "best restaurants" -m 5 -r us-en

@@ -47,16 +136,6 @@ ddgs text -k "latest AI news" -m 5 -t w
ddgs text -k "fastapi tutorial" -m 5 -o json
```

### Via Python (in execute_code)

```python
from hermes_tools import terminal

# Search and get results
result = terminal("ddgs text -k 'python web framework comparison' -m 5")
print(result["output"])
```

### CLI Flags

| Flag | Description | Example |
@@ -68,44 +147,39 @@ print(result["output"])
| `-s` | Safe search | `-s off` |
| `-o` | Output format | `-o json` |

## Other Search Types
## Workflow: Search then Extract

```bash
# Image search
ddgs images -k "landscape photography" -m 10

# News search
ddgs news -k "artificial intelligence" -m 5

# Video search
ddgs videos -k "python tutorial" -m 5
```

## Workflow: Search → Extract

DuckDuckGo finds URLs. To get full page content, follow up with `web_extract`:
DuckDuckGo returns titles, URLs, and snippets — not full page content. To get full content, follow up with `web_extract`:

1. **Search** with ddgs to find relevant URLs
2. **Extract** content using the `web_extract` tool (if available) or curl

```bash
# Step 1: Find URLs
ddgs text -k "fastapi tutorial" -m 3
```python
from ddgs import DDGS

# Step 2: Extract full content from a result URL
# (use web_extract tool if available, otherwise curl)
curl -s "https://example.com/article" | head -200
with DDGS() as ddgs:
    results = list(ddgs.text("fastapi deployment guide", max_results=3))
    for r in results:
        print(r["title"], "->", r["href"])

# Then use web_extract tool on the best URL
```
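
Between the search step and the extract step, it can help to reduce the results to a short list of unique URLs. The sketch below covers only that hand-off; `pick_urls` and the sample data are hypothetical, and the extraction itself is still left to `web_extract` or curl.

```python
def pick_urls(results, limit=1):
    """Return up to `limit` unique, non-empty hrefs in result order."""
    seen = []
    for r in results:
        href = r.get("href")
        if href and href not in seen:
            seen.append(href)
        if len(seen) == limit:
            break
    return seen


# Made-up sample data in the documented shape; not real search output.
sample_results = [
    {"title": "Deploy guide", "href": "https://example.com/deploy"},
    {"title": "Mirror", "href": "https://example.com/deploy"},
    {"title": "No link"},
    {"title": "Docker notes", "href": "https://example.com/docker"},
]
urls = pick_urls(sample_results, limit=2)
print(urls)
```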

## Limitations

- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add `sleep 1` between searches if needed.
- **No content extraction**: ddgs only returns titles, URLs, and snippets — not full page content. Use `web_extract` or curl for that.
- **Rate limiting**: DuckDuckGo may throttle after many rapid requests. Add a short delay between searches if needed.
- **No content extraction**: ddgs returns snippets, not full page content. Use `web_extract` or curl for that.
- **Results quality**: Generally good but less configurable than Firecrawl's search.
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or add a short delay.
- **Availability**: DuckDuckGo may block requests from some cloud IPs. If searches return empty, try different keywords or wait a few seconds.
- **Field variability**: Return fields may vary between results or ddgs versions. Use `.get()` for optional fields to avoid KeyError.
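
The advice on throttling and empty results can be folded into a small retry wrapper. This is a sketch, not the library's own mechanism: `search_with_retry` is hypothetical, it reacts only to empty result lists, and it leaves any exceptions the library may raise unhandled. The `sleep` parameter is injectable so the example can run without actually waiting.

```python
import time


def search_with_retry(search, query, max_results=5, attempts=3, delay=1.0, sleep=time.sleep):
    """Retry a search callable while it returns nothing, waiting longer each time."""
    for attempt in range(1, attempts + 1):
        results = list(search(query, max_results=max_results))
        if results:
            return results
        if attempt < attempts:
            sleep(delay * attempt)  # linear backoff: delay, 2 * delay, ...
    return []


# Offline demonstration with a fake search that fails twice, then succeeds.
calls = {"count": 0}
waits = []


def flaky_search(query, *, max_results=None):
    calls["count"] += 1
    if calls["count"] < 3:
        return []
    return [{"title": query, "href": "https://example.com/hit"}][:max_results]


found = search_with_retry(flaky_search, "fastapi", sleep=waits.append)
print(len(found), waits)
```

With a live client, `ddgs.text` (or any of the other three methods) would take the place of `flaky_search`.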

## Pitfalls

- **Don't confuse `-k` and `-m`**: `-k` is for keywords (the query), `-m` is for max results count.
- **`max_results` is keyword-only**: `ddgs.text("query", 5)` raises an error. Use `ddgs.text("query", max_results=5)`.
- **Don't confuse `-k` and `-m`** (CLI): `-k` is for keywords, `-m` is for max results count.
- **Package name**: The package is `ddgs` (was previously `duckduckgo-search`). Install with `pip install ddgs`.
- **Empty results**: If ddgs returns nothing, it may be rate-limited. Wait a few seconds and retry.

## Validated With

Smoke-tested with `ddgs==9.11.2` on Python 3.13. All four methods (text, news, images, videos) confirmed working with keyword `max_results`.