
teknium1 732c66b0f3 refactor: reorganize skills into sub-categories

The skills directory was getting disorganized — mlops alone had 40
skills in a flat list, and 12 categories were singletons with just
one skill each.

Code change:
- prompt_builder.py: Support sub-categories in skill scanner.
  skills/mlops/training/axolotl/SKILL.md now shows as category
  'mlops/training' instead of just 'mlops'. Backwards-compatible
  with existing flat structure.

Split mlops (40 skills) into 7 sub-categories:
- mlops/training (12): accelerate, axolotl, flash-attention,
  grpo-rl-training, peft, pytorch-fsdp, pytorch-lightning,
  simpo, slime, torchtitan, trl-fine-tuning, unsloth
- mlops/inference (8): gguf, guidance, instructor, llama-cpp,
  obliteratus, outlines, tensorrt-llm, vllm
- mlops/models (6): audiocraft, clip, llava, segment-anything,
  stable-diffusion, whisper
- mlops/vector-databases (4): chroma, faiss, pinecone, qdrant
- mlops/evaluation (5): huggingface-tokenizers,
  lm-evaluation-harness, nemo-curator, saelens, weights-and-biases
- mlops/cloud (2): lambda-labs, modal
- mlops/research (1): dspy

Merged singleton categories:
- gifs → media (gif-search joins youtube-content)
- music-creation → media (heartmula, songsee)
- diagramming → creative (excalidraw joins ascii-art)
- ocr-and-documents → productivity
- domain → research (domain-intel)
- feeds → research (blogwatcher)
- market-data → research (polymarket)

Fixed misplaced skills:
- mlops/code-review → software-development (not ML-specific)
- mlops/ml-paper-writing → research (academic writing)

Added DESCRIPTION.md files for all new/updated categories.

2026-03-09 03:35:53 -07:00

4.4 KiB



name	description
domain-intel	Passive domain reconnaissance using Python stdlib. Subdomain discovery, SSL certificate inspection, WHOIS lookups, DNS records, domain availability checks, and bulk multi-domain analysis. No API keys required.

Domain Intelligence — Passive OSINT

Passive domain reconnaissance using only Python stdlib. Zero dependencies. Zero API keys. Works on Linux, macOS, and Windows.

Helper script

This skill includes scripts/domain_intel.py — a complete CLI tool for all domain intelligence operations.

# Subdomain discovery via Certificate Transparency logs
python3 SKILL_DIR/scripts/domain_intel.py subdomains example.com

# SSL certificate inspection (expiry, cipher, SANs, issuer)
python3 SKILL_DIR/scripts/domain_intel.py ssl example.com

# WHOIS lookup (registrar, dates, name servers — 100+ TLDs)
python3 SKILL_DIR/scripts/domain_intel.py whois example.com

# DNS records (A, AAAA, MX, NS, TXT, CNAME)
python3 SKILL_DIR/scripts/domain_intel.py dns example.com

# Domain availability check (passive: DNS + WHOIS + SSL signals)
python3 SKILL_DIR/scripts/domain_intel.py available coolstartup.io

# Bulk analysis — multiple domains, multiple checks in parallel
python3 SKILL_DIR/scripts/domain_intel.py bulk example.com github.com google.com
python3 SKILL_DIR/scripts/domain_intel.py bulk example.com github.com --checks ssl,dns

SKILL_DIR is the directory containing this SKILL.md file. All output is structured JSON.
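Because every command prints JSON, the script can also be driven from Python rather than a shell. The following is a minimal sketch, not part of the skill itself: the helper name run_domain_intel, the 120-second timeout, and the assumption that the script writes a single JSON document to stdout and exits with code 0 on success are all illustrative.

```python
import json
import subprocess
import sys
from pathlib import Path


def run_domain_intel(skill_dir, command, *args, timeout=120):
    """Run scripts/domain_intel.py and return its parsed JSON output.

    Hypothetical helper: assumes one JSON document on stdout and a zero
    exit code on success.
    """
    script = Path(skill_dir) / "scripts" / "domain_intel.py"
    completed = subprocess.run(
        [sys.executable, str(script), command, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # The skill states that all output is structured JSON.
    return json.loads(completed.stdout)
```

A call such as run_domain_intel(skill_dir, "bulk", "example.com", "github.com", "--checks", "ssl,dns") mirrors the bulk command line shown above. The shape of the returned object is whatever the script emits and is not assumed here.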

Available commands

Command	What it does	Data source
`subdomains`	Find subdomains from certificate logs	crt.sh (HTTPS)
`ssl`	Inspect TLS certificate details	Direct TCP:443 to target
`whois`	Registration info, registrar, dates	WHOIS servers (TCP:43)
`dns`	A, AAAA, MX, NS, TXT, CNAME records	System DNS + Google DoH
`available`	Check if domain is registered	DNS + WHOIS + SSL signals
`bulk`	Run multiple checks on multiple domains	All of the above
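To illustrate the first row of the table, here is a rough sketch of subdomain discovery from crt.sh using only the stdlib. It assumes crt.sh's JSON output mode (output=json) and its name_value field, which can hold several newline-separated names per certificate. The function names are hypothetical and this is not necessarily how domain_intel.py implements the command.

```python
import json
import urllib.parse
import urllib.request


def crtsh_url(domain):
    """Build a crt.sh JSON query URL for certificates under *.domain."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def extract_subdomains(entries, domain):
    """Collect unique hostnames under `domain` from crt.sh JSON entries."""
    suffix = "." + domain.lower()
    found = set()
    for entry in entries:
        # name_value may hold several newline-separated names per certificate.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name.endswith(suffix):
                found.add(name)
    return sorted(found)


def fetch_subdomains(domain, timeout=30):
    """Query crt.sh over HTTPS and return the sorted subdomain list."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=timeout) as resp:
        return extract_subdomains(json.load(resp), domain)
```

Keeping the parsing in extract_subdomains separate from the network call makes the logic easy to check offline. The apex domain itself is excluded because only names ending in ".domain" are kept.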

When to use this vs built-in tools

Use this skill for infrastructure questions: subdomains, SSL certs, WHOIS, DNS records, availability
Use web_search for general research about what a domain/company does
Use web_extract to get the actual content of a webpage
Use terminal with curl -I for a simple "is this URL reachable" check

Task	Better tool	Why
"What does example.com do?"	`web_extract`	Gets page content, not DNS/WHOIS data
"Find info about a company"	`web_search`	General research, not domain-specific
"Is this website safe?"	`web_search`	Reputation checks need web context
"Check if a URL is reachable"	`terminal` with `curl -I`	Simple HTTP check
"Find subdomains of X"	This skill	The only passive source of subdomain data among these tools
"When does the SSL cert expire?"	This skill	Built-in tools can't inspect TLS
"Who registered this domain?"	This skill	WHOIS data not in web search
"Is coolstartup.io available?"	This skill	Passive availability via DNS+WHOIS+SSL
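The certificate-expiry row can be illustrated with a short stdlib sketch. It relies on ssl.create_default_context, SSLSocket.getpeercert, and ssl.cert_time_to_seconds, all of which are standard. The function names and the 10-second timeout are hypothetical, and the real script may report more fields (cipher, issuer) than shown here.

```python
import socket
import ssl
import time


def fetch_certificate(host, port=443, timeout=10):
    """Open a verified TLS connection and return the peer certificate dict."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def days_until_expiry(cert, now=None):
    """Return whole days left before the certificate's notAfter date.

    The result is negative once the certificate has expired.
    """
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    now = time.time() if now is None else now
    return int((expires - now) // 86400)


def subject_alt_names(cert):
    """List the DNS names from the certificate's subjectAltName field."""
    return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
```

For example, days_until_expiry(fetch_certificate("example.com")) gives the remaining validity in days. Because the default context verifies the chain, an invalid or self-signed certificate raises ssl.SSLCertVerificationError instead of returning data.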

Platform compatibility

Pure Python stdlib (socket, ssl, urllib, json, concurrent.futures). Works identically on Linux, macOS, and Windows with no dependencies.

crt.sh queries use HTTPS (port 443) — works behind most firewalls
WHOIS queries use TCP port 43 — may be blocked on restrictive networks
DNS queries use Google DoH (HTTPS) for MX/NS/TXT/CNAME, which is firewall-friendly
SSL checks connect to the target on port 443 — the only "active" operation
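The WHOIS item above comes down to a plain TCP exchange on port 43: send the domain followed by CRLF, then read until the server closes the connection. The sketch below shows that exchange. Asking whois.iana.org first and following its "refer:" line is an assumption made for illustration; the skill advertises support for 100+ TLDs and may use its own server table instead. Both function names are hypothetical.

```python
import socket


def whois_query(domain, server="whois.iana.org", port=43, timeout=10):
    """Send one WHOIS query over TCP and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode("idna") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def referral_server(iana_response):
    """Extract the TLD WHOIS server from an IANA 'refer:' line, if any."""
    for line in iana_response.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "refer":
            return value.strip() or None
    return None
```

A two-step lookup would call whois_query(domain), pass the result to referral_server, and repeat the query against the returned server. On networks that block port 43, socket.create_connection raises an OSError (typically a timeout), which is the failure mode the note above warns about.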

Data sources

No port scanning and no vulnerability testing. Every query except the SSL check is passive:

crt.sh: Certificate Transparency logs (subdomain discovery, HTTPS only)
WHOIS servers: direct TCP to the authoritative WHOIS servers for 100+ TLDs
Google DNS-over-HTTPS: MX, NS, TXT, CNAME resolution (firewall-friendly)
System DNS: A/AAAA record resolution
SSL check: the only "active" operation (a TCP connection to target:443)
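As an illustration of the DNS-over-HTTPS source, the sketch below queries Google's public JSON endpoint (https://dns.google/resolve) with urllib and pulls the record data out of the Answer section. The function names and the 10-second timeout are hypothetical, and the skill's own request handling may differ.

```python
import json
import urllib.parse
import urllib.request


def doh_url(name, record_type):
    """Build a Google DNS-over-HTTPS JSON API URL."""
    query = urllib.parse.urlencode({"name": name, "type": record_type})
    return f"https://dns.google/resolve?{query}"


def answer_data(response, type_code=None):
    """Return the data strings from a DoH JSON response's Answer section.

    type_code is the numeric record type (for example 15 for MX); when
    given, answers of other types such as intermediate CNAMEs are skipped.
    """
    answers = response.get("Answer", [])  # absent when there are no records
    return [a["data"] for a in answers if type_code is None or a.get("type") == type_code]


def resolve(name, record_type, timeout=10):
    """Fetch and parse records of one type for `name` over HTTPS."""
    with urllib.request.urlopen(doh_url(name, record_type), timeout=timeout) as resp:
        return answer_data(json.load(resp))
```

Since the request is ordinary HTTPS on port 443, it works on networks where direct DNS queries to external resolvers are blocked.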

Notes

WHOIS queries use TCP port 43 — may be blocked on restrictive networks
Some WHOIS servers redact registrant info (GDPR) — mention this to the user
crt.sh can be slow for very popular domains (thousands of certs) — set reasonable expectations
The availability check is heuristic-based (3 passive signals) — not authoritative like a registrar API
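To make the last note concrete, here is one way three passive signals could be combined into a hedged verdict. The rules and labels below are purely hypothetical and are not taken from domain_intel.py; they only show why the result is a heuristic rather than an authoritative answer.

```python
def availability_verdict(has_dns_records, whois_registered, has_ssl_cert):
    """Combine three passive signals into a hedged availability verdict.

    Hypothetical rules for illustration only. whois_registered may be
    None when the WHOIS lookup failed or port 43 was blocked.
    """
    if whois_registered is True:
        return "registered"
    if has_dns_records or has_ssl_cert:
        # Something answers for the name, so it is very likely taken.
        return "likely registered"
    if whois_registered is False:
        return "likely available"
    # No signal either way: WHOIS was inconclusive and nothing resolved.
    return "unknown"
```

Even a "likely available" result should be confirmed with a registrar, since reserved or recently expired names can show no DNS, WHOIS, or TLS footprint.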

Contributed by @FurkanL0
