hermes-agent/__pycache__/web_tools.cpython-310.pyc

2025-07-22 18:32:44 -07:00
"""
Standalone Web Tools Module
This module provides generic web tools that work with multiple backend providers.
Currently uses Tavily as the backend, but the interface makes it easy to swap
to other providers like Firecrawl without changing the function signatures.
Available tools:
- web_search_tool: Search the web for information
- web_extract_tool: Extract content from specific web pages
- web_crawl_tool: Crawl websites with specific instructions
Backend compatibility:
- Tavily: https://docs.tavily.com/
- Firecrawl: https://docs.firecrawl.dev/features/search
Usage:
from web_tools import web_search_tool, web_extract_tool, web_crawl_tool
# Search the web
results = web_search_tool("Python machine learning libraries", limit=3)
# Extract content from URLs
content = web_extract_tool(["https://example.com"], format="markdown")
# Crawl a website
crawl_data = web_crawl_tool("example.com", "Find contact information")
"""

import json
import os
import re
from typing import List

from tavily import TavilyClient

# Backend client; expects TAVILY_API_KEY to be set in the environment
tavily_client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))


def clean_base64_images(text: str) -> str:
    """
Remove base64 encoded images from text to reduce token count and clutter.
This function finds and removes base64 encoded images in various formats:
- (data:image/png;base64,...)
- (data:image/jpeg;base64,...)
- (data:image/svg+xml;base64,...)
- data:image/[type];base64,... (without parentheses)
Args:
text: The text content to clean
Returns:
Cleaned text with base64 images replaced with placeholders
    """
    # Data URIs wrapped in parentheses (e.g. markdown image targets)
    base64_with_parens_pattern = r"\(data:image/[^;]+;base64,[A-Za-z0-9+/=]+\)"
    # Bare data URIs without surrounding parentheses
    base64_pattern = r"data:image/[^;]+;base64,[A-Za-z0-9+/=]+"
    cleaned_text = re.sub(base64_with_parens_pattern, "[BASE64_IMAGE_REMOVED]", text)
    cleaned_text = re.sub(base64_pattern, "[BASE64_IMAGE_REMOVED]", cleaned_text)
    return cleaned_text
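

# Illustrative sketch of the parenthesised substitution above. The sample
# string is hypothetical, and the pattern and `import re` are duplicated
# here only so this fragment reads and runs on its own.
import re

_example_md = "![logo](data:image/png;base64,iVBORw0KGgo=) and some text"
_example_cleaned = re.sub(
    r"\(data:image/[^;]+;base64,[A-Za-z0-9+/=]+\)",
    "[BASE64_IMAGE_REMOVED]",
    _example_md,
)
# _example_cleaned == "![logo][BASE64_IMAGE_REMOVED] and some text"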


def web_search_tool(query: str, limit: int = 5) -> str:
    """
Search the web for information using available search API backend.
This function provides a generic interface for web search that can work
with multiple backends. Currently uses Tavily but can be easily swapped.
Args:
query (str): The search query to look up
limit (int): Maximum number of results to return (default: 5)
Returns:
str: JSON string containing search results with the following structure:
{
"query": str,
"results": [
{
"title": str,
"url": str,
"content": str,
"score": float
},
...
]
}
Raises:
Exception: If search fails or API key is not set
    """
    try:
        print(f"🔍 Searching the web for: '{query}' (limit: {limit})")
        response = tavily_client.search(
            query=query, max_results=limit, search_depth="advanced"
        )
        print(f"✅ Found {len(response.get('results', []))} results")
        result_json = json.dumps(response, indent=2)
        return clean_base64_images(result_json)
    except Exception as e:
        error_msg = f"Error searching web: {str(e)}"
        print(f"❌ {error_msg}")
        return json.dumps({"error": error_msg})
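

# Offline sketch of consuming the JSON shape documented above. The response
# dict is hand-made for illustration; no search API is called.
import json

_sample_response = json.dumps(
    {
        "query": "Python machine learning libraries",
        "results": [
            {
                "title": "scikit-learn homepage",
                "url": "https://scikit-learn.org",
                "content": "Machine learning in Python",
                "score": 0.97,
            }
        ],
    },
    indent=2,
)
_parsed = json.loads(_sample_response)
_top_urls = [r["url"] for r in _parsed["results"]]
# _top_urls == ["https://scikit-learn.org"]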


def web_extract_tool(urls: List[str], format: str = None) -> str:
    """
Extract content from specific web pages using available extraction API backend.
This function provides a generic interface for web content extraction that
can work with multiple backends. Currently uses Tavily but can be easily swapped.
Args:
urls (List[str]): List of URLs to extract content from
format (str): Desired output format ("markdown" or "html", optional)
Returns:
str: JSON string containing extracted content with the following structure:
{
"results": [
{
"url": str,
"title": str,
"raw_content": str,
"content": str
},
...
]
}
Raises:
Exception: If extraction fails or API key is not set
    """
    try:
        print(f"📄 Extracting content from {len(urls)} URL(s)")
        response = tavily_client.extract(urls=urls, format=format)
        print(f"✅ Extracted content from {len(response.get('results', []))} pages")
        for result in response.get("results", []):
            url = result.get("url", "Unknown URL")
            content_length = len(result.get("raw_content", ""))
            print(f"  📝 {url} ({content_length} characters)")
        result_json = json.dumps(response, indent=2)
        return clean_base64_images(result_json)
    except Exception as e:
        error_msg = f"Error extracting content: {str(e)}"
        print(f"❌ {error_msg}")
        return json.dumps({"error": error_msg})
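

# Every tool in this module reports failure as json.dumps({"error": msg}),
# so a caller can distinguish success from failure with one key check.
# Offline sketch; `_is_tool_error` is an illustrative helper, not part of
# the module's public interface.
import json

def _is_tool_error(result_json: str) -> bool:
    data = json.loads(result_json)
    return isinstance(data, dict) and "error" in data

_err = _is_tool_error(json.dumps({"error": "Error extracting content: timeout"}))
_ok = _is_tool_error(json.dumps({"results": []}))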


def web_crawl_tool(url: str, instructions: str = None, depth: str = "basic") -> str:
    """
Crawl a website with specific instructions using available crawling API backend.
This function provides a generic interface for web crawling that can work
with multiple backends. Currently uses Tavily but can be easily swapped.
Args:
url (str): The base URL to crawl (can include or exclude https://)
instructions (str): Instructions for what to crawl/extract using LLM intelligence (optional)
depth (str): Depth of extraction ("basic" or "advanced", default: "basic")
Returns:
str: JSON string containing crawled content with the following structure:
{
"results": [
{
"url": str,
"title": str,
"content": str
},
...
]
}
Raises:
Exception: If crawling fails or API key is not set
    """
    try:
        instructions_text = f" with instructions: '{instructions}'" if instructions else ""
        print(f"🕷️ Crawling {url}{instructions_text}")
        response = tavily_client.crawl(
            url=url,
            instructions=instructions or "Get all available content",
            extract_depth=depth,
        )
        print(f"✅ Crawled {len(response.get('results', []))} pages")
        for result in response.get("results", []):
            page_url = result.get("url", "Unknown URL")
            content_length = len(result.get("content", ""))
            print(f"  🌐 {page_url} ({content_length} characters)")
        result_json = json.dumps(response, indent=2)
        return clean_base64_images(result_json)
    except Exception as e:
        error_msg = f"Error crawling website: {str(e)}"
        print(f"❌ {error_msg}")
        return json.dumps({"error": error_msg})


def check_tavily_api_key() -> bool:
    """
Check if the Tavily API key is available in environment variables.
Returns:
bool: True if API key is set, False otherwise
    """
    return bool(os.getenv("TAVILY_API_KEY"))


if __name__ == "__main__":
    print("🌐 Standalone Web Tools Module")
    print("=" * 40)
    if not check_tavily_api_key():
        print("❌ TAVILY_API_KEY environment variable not set")
        print("Please set your API key: export TAVILY_API_KEY='your-key-here'")
        print("Get API key at: https://tavily.com/")
        exit(1)
    else:
        print("✅ Tavily API key found")
        print("🛠️ Web tools ready for use!")
        print("\nExample usage:")
        print("  from web_tools import web_search_tool, web_extract_tool, web_crawl_tool")
        print("  results = web_search_tool('Python tutorials')")
        print("  content = web_extract_tool(['https://example.com'])")
        print("  crawl_data = web_crawl_tool('example.com', 'Find documentation')")
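

# Backend-swap sketch. The module docstring notes the interface is meant to
# survive a move from Tavily to another provider such as Firecrawl; one way
# is a thin adapter exposing the same call shape. `FirecrawlStub` below is a
# hypothetical illustration, not the real Firecrawl client API.
class FirecrawlStub:
    def search(self, query, max_results=5, search_depth="advanced"):
        # A real adapter would call the provider here and translate its
        # payload into the {"query", "results": [...]} shape used above.
        return {"query": query, "results": []}

_backend = FirecrawlStub()
_swap_demo = _backend.search("hello", max_results=1)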