scripts/multimodal/requirements.txt

# Multimodal Meaning Kernel Extraction Pipeline
# Required Python dependencies

# Image processing
Pillow>=10.0.0

# OCR (Optical Character Recognition); needs the Tesseract binary installed separately
pytesseract>=0.3.10

# PDF processing; pdf2image needs Poppler installed separately
pdf2image>=1.16.3

# Optional: Enhanced computer vision
# opencv-python>=4.8.0
# numpy>=1.24.0

# Optional: Machine learning for diagram classification
# scikit-learn>=1.3.0
# torch>=2.0.0
# torchvision>=0.15.0

# Development and testing
# pytest>=7.4.0
# black>=23.0.0
# flake8>=6.0.0
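
Several of the packages listed above are imported under a different name than the one pip installs (Pillow imports as `PIL`, opencv-python as `cv2`, scikit-learn as `sklearn`). The following is a minimal sketch of how a script might check which of them are importable before running. It is not part of the repository; the names `REQUIRED`, `OPTIONAL`, `check_modules`, and `missing` are hypothetical and chosen only for illustration.

```python
from importlib.util import find_spec

# Distribution name (as written in requirements.txt) -> top-level import name.
# These dictionaries are illustrative; they mirror the list above.
REQUIRED = {
    "Pillow": "PIL",
    "pytesseract": "pytesseract",
    "pdf2image": "pdf2image",
}
OPTIONAL = {
    "opencv-python": "cv2",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "torch": "torch",
    "torchvision": "torchvision",
}


def check_modules(modules):
    """Map each distribution name to True if its module can be located."""
    return {dist: find_spec(mod) is not None for dist, mod in modules.items()}


def missing(modules):
    """Return the sorted distribution names whose module cannot be located."""
    return sorted(dist for dist, ok in check_modules(modules).items() if not ok)


if __name__ == "__main__":
    absent = missing(REQUIRED)
    if absent:
        print("Missing required packages:", ", ".join(absent))
    for dist, ok in check_modules(OPTIONAL).items():
        print(f"optional {dist}: {'found' if ok else 'not installed'}")
```

This only confirms that the Python packages can be found. It does not verify that the external Tesseract or Poppler programs are installed, which pytesseract and pdf2image call at run time.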
Fix #493: Add multimodal meaning kernel extraction pipeline

- Added extract_meaning_kernels.py for processing PDF diagrams
- Extracts text using OCR (Tesseract) when available
- Analyzes diagram structure (type, dimensions, orientation)
- Generates structured meaning kernels with metadata
- Outputs JSON (machine-readable) and Markdown (human-readable)
- Includes test pipeline and documentation
- Supports single files and batch processing

Pipeline components:
- DiagramProcessor: Main processing engine
- MeaningKernel: Structured kernel representation
- PDF to image conversion
- OCR text extraction
- Structure analysis
- Kernel generation with confidence scoring

Acceptance criteria met:
✓ Processes academic PDF diagrams
✓ Extracts structured text meaning kernels
✓ Generates machine-readable JSON output
✓ Includes human-readable reports
✓ Supports batch processing
✓ Provides confidence scoring

2026-04-13 21:20:42 -04:00