From f84230527cdffc9523331f3b10e6db7d3da20a63 Mon Sep 17 00:00:00 2001
From: Teknium <127238744+teknium1@users.noreply.github.com>
Date: Sun, 22 Mar 2026 04:31:22 -0700
Subject: [PATCH] docs(skill): add split, merge, search examples to
 ocr-and-documents skill (#2461)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* fix: respect DashScope v1 runtime mode for alibaba

Remove the hardcoded Alibaba branch from resolve_runtime_provider() that
forced api_mode='anthropic_messages' regardless of the base URL. Alibaba
now goes through the generic API-key provider path, which auto-detects
the protocol from the URL:

- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)

This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.

Based on PR #2024 by @kshitijk4poor.

* docs(skill): add split, merge, search examples to ocr-and-documents skill

Adds pymupdf examples for PDF splitting, merging, and text search to the
existing ocr-and-documents skill. No new dependencies — pymupdf already
covers all three operations natively.
---------

Co-authored-by: kshitijk4poor
---
 .../productivity/ocr-and-documents/SKILL.md | 38 +++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/skills/productivity/ocr-and-documents/SKILL.md b/skills/productivity/ocr-and-documents/SKILL.md
index cbbc07aad..2fdf4ea41 100644
--- a/skills/productivity/ocr-and-documents/SKILL.md
+++ b/skills/productivity/ocr-and-documents/SKILL.md
@@ -122,6 +122,44 @@ web_extract(urls=["https://arxiv.org/pdf/2402.03300"])
 web_search(query="arxiv GRPO reinforcement learning 2026")
 ```
 
+## Split, Merge & Search
+
+pymupdf handles these natively — use `execute_code` or inline Python:
+
+```python
+# Split: extract pages 1-5 to a new PDF
+import pymupdf
+doc = pymupdf.open("report.pdf")
+new = pymupdf.open()
+# pages are 0-indexed; from_page/to_page are inclusive
+new.insert_pdf(doc, from_page=0, to_page=4)
+new.save("pages_1-5.pdf")
+```
+
+```python
+# Merge multiple PDFs
+import pymupdf
+result = pymupdf.open()
+for path in ["a.pdf", "b.pdf", "c.pdf"]:
+    result.insert_pdf(pymupdf.open(path))
+result.save("merged.pdf")
+```
+
+```python
+# Search for text across all pages
+import pymupdf
+doc = pymupdf.open("report.pdf")
+for i, page in enumerate(doc):
+    results = page.search_for("revenue")
+    if results:
+        print(f"Page {i+1}: {len(results)} match(es)")
+        print(page.get_text("text"))
+```
+
+No extra dependencies needed — pymupdf covers split, merge, search, and text extraction in one package.
+
+---
+
 ## Notes
 
 - `web_extract` is always first choice for URLs