[claude] Add web_fetch tool (trafilatura) for full-page content extraction (#973) #1004

Merged

claude merged 1 commit from claude/issue-973 into main

2026-03-22 23:03:38 +00:00

Author: Alexander Whitestone

SHA1: 0c5bbb1b4b

feat: add web_fetch tool for full-page content extraction (trafilatura)

Tests / lint (pull_request) Failing after 4s

Tests / test (pull_request) Has been skipped

Implements web_fetch(url, max_tokens) tool that downloads a URL,
extracts clean readable text via trafilatura, and truncates to a
token budget. Registered as an Agno tool in the full toolkit.

- Validates URL scheme before attempting fetch
- Uses requests with 15s timeout and TimmyResearchBot/1.0 user-agent
- Graceful degradation: missing packages, timeouts, HTTP errors, empty pages
- Added trafilatura as optional dependency with 'research' extra
- 11 unit tests covering all acceptance criteria
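
For readers without the diff at hand, the behaviour listed above could look roughly like the sketch below. This is not the merged code: the helper name truncate_to_tokens, the 4-characters-per-token heuristic, the default max_tokens value, and the error message wording are all assumptions made for illustration. Only the web_fetch(url, max_tokens) signature, the 15s timeout, and the TimmyResearchBot/1.0 user-agent come from the commit message.

```python
from urllib.parse import urlparse

# Assumption: a rough characters-per-token heuristic; the real tool may count tokens differently.
CHARS_PER_TOKEN = 4


def truncate_to_tokens(text: str, max_tokens: int) -> str:
    """Cut text to an approximate token budget (hypothetical helper)."""
    limit = max(max_tokens, 0) * CHARS_PER_TOKEN
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "\n[truncated]"


def web_fetch(url: str, max_tokens: int = 4000) -> str:
    """Download a URL, extract readable text, and truncate it to a token budget."""
    # Validate the URL scheme before attempting any network access.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return f"Error: invalid URL (expected http or https): {url}"

    # Optional dependencies: degrade gracefully when they are not installed.
    try:
        import requests
        import trafilatura
    except ImportError:
        return "Error: web_fetch requires the 'requests' and 'trafilatura' packages."

    try:
        response = requests.get(
            url,
            timeout=15,
            headers={"User-Agent": "TimmyResearchBot/1.0"},
        )
        response.raise_for_status()
    except requests.exceptions.Timeout:
        return f"Error: request timed out after 15 seconds: {url}"
    except requests.exceptions.RequestException as exc:
        # Covers HTTP error statuses, connection failures, and similar problems.
        return f"Error: could not fetch {url}: {exc}"

    # trafilatura.extract returns None when no main content is found.
    text = trafilatura.extract(response.text)
    if not text:
        return f"Error: no readable content extracted from {url}"

    return truncate_to_tokens(text, max_tokens)
```

Returning error strings instead of raising keeps the tool usable from an agent loop, where the message is read by the model as the tool result; whether the merged implementation does the same is not stated in the commit message.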

Fixes #973

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-22 19:03:08 -04:00
