timmy-config/scripts/multimodal/requirements.txt
Alexander Whitestone 0a52cff8a7 Fix #493: Add multimodal meaning kernel extraction pipeline
- Added extract_meaning_kernels.py for processing PDF diagrams
- Extracts text using OCR (Tesseract) when available
- Analyzes diagram structure (type, dimensions, orientation)
- Generates structured meaning kernels with metadata
- Outputs JSON (machine-readable) and Markdown (human-readable)
- Includes test pipeline and documentation
- Supports single files and batch processing

Pipeline components:
- DiagramProcessor: Main processing engine
- MeaningKernel: Structured kernel representation
- PDF to image conversion
- OCR text extraction
- Structure analysis
- Kernel generation with confidence scoring
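
The components listed above could fit together roughly as in the sketch below. It is a minimal illustration, not the code in extract_meaning_kernels.py: the field names on MeaningKernel, the helper functions, and the confidence heuristic are all assumptions made for the example, and the OCR and PDF conversion steps are replaced by plain arguments so the block runs with the standard library only.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class MeaningKernel:
    """Hypothetical kernel layout; the real class may use different fields."""
    source: str
    page: int
    text: str
    diagram_type: str
    width: int
    height: int
    orientation: str
    confidence: float
    metadata: dict = field(default_factory=dict)


def orientation_of(width: int, height: int) -> str:
    """Classify page orientation from pixel dimensions."""
    if width > height:
        return "landscape"
    if height > width:
        return "portrait"
    return "square"


def score_confidence(text: str, ocr_available: bool) -> float:
    """Toy heuristic (an assumption): more recognized words, higher score."""
    if not ocr_available:
        return 0.2
    words = len(text.split())
    return round(min(1.0, 0.4 + 0.06 * words), 2)


def build_kernel(source, page, text, width, height,
                 ocr_available=True, diagram_type="unknown"):
    """Assemble one kernel from already-extracted text and image dimensions."""
    return MeaningKernel(
        source=source,
        page=page,
        text=text,
        diagram_type=diagram_type,
        width=width,
        height=height,
        orientation=orientation_of(width, height),
        confidence=score_confidence(text, ocr_available),
    )


def to_json(kernel: MeaningKernel) -> str:
    """Machine-readable output."""
    return json.dumps(asdict(kernel), indent=2, sort_keys=True)


def to_markdown(kernel: MeaningKernel) -> str:
    """Human-readable report for a single kernel."""
    return "\n".join([
        f"## {kernel.source} (page {kernel.page})",
        f"- Type: {kernel.diagram_type}",
        f"- Size: {kernel.width}x{kernel.height} ({kernel.orientation})",
        f"- Confidence: {kernel.confidence:.2f}",
        "",
        kernel.text,
    ])
```

Calling build_kernel("paper.pdf", 1, "encoder decoder attention", 1200, 800) and passing the result to to_json or to_markdown yields the two output forms the commit describes. Batch processing would amount to looping this over pages and files.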

Acceptance criteria met:
✓ Processes academic PDF diagrams
✓ Extracts structured text meaning kernels
✓ Generates machine-readable JSON output
✓ Includes human-readable reports
✓ Supports batch processing
✓ Provides confidence scoring
2026-04-13 21:20:42 -04:00


# Multimodal Meaning Kernel Extraction Pipeline
# Required Python dependencies

# Image processing
Pillow>=10.0.0

# OCR (Optical Character Recognition)
# Note: pytesseract is a wrapper; the Tesseract binary must be installed separately.
pytesseract>=0.3.10

# PDF processing
# Note: pdf2image needs the Poppler utilities (pdftoppm) installed on the system.
pdf2image>=1.16.3

# Optional: Enhanced computer vision
# opencv-python>=4.8.0
# numpy>=1.24.0

# Optional: Machine learning for diagram classification
# scikit-learn>=1.3.0
# torch>=2.0.0
# torchvision>=0.15.0

# Development and testing
# pytest>=7.4.0
# black>=23.0.0
# flake8>=6.0.0
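
Because OCR is only used "when available", a pipeline built on these requirements may want to probe its environment before processing. The following is a minimal sketch of such a check, assuming hypothetical names (REQUIRED_MODULES, SYSTEM_BINARIES, check_environment, ocr_available) that do not come from the repository. It relies on two facts about the packages: pytesseract calls the external tesseract executable, and pdf2image calls Poppler's pdftoppm.

```python
import importlib.util
import shutil

# PyPI package name -> import name (Pillow is imported as PIL).
REQUIRED_MODULES = {
    "Pillow": "PIL",
    "pytesseract": "pytesseract",
    "pdf2image": "pdf2image",
}

# Label -> executable that must be on PATH.
SYSTEM_BINARIES = {
    "tesseract": "tesseract",  # used by pytesseract
    "poppler": "pdftoppm",     # used by pdf2image
}


def check_environment(modules=REQUIRED_MODULES, binaries=SYSTEM_BINARIES):
    """Return a dict mapping each dependency to True if it can be found."""
    report = {}
    for package, module in modules.items():
        # find_spec returns None for a missing top-level module.
        report[package] = importlib.util.find_spec(module) is not None
    for label, executable in binaries.items():
        # shutil.which returns None when the executable is not on PATH.
        report[label] = shutil.which(executable) is not None
    return report


def ocr_available(report):
    """OCR needs both the Python wrapper and the Tesseract binary."""
    return bool(report.get("pytesseract")) and bool(report.get("tesseract"))
```

A caller could run check_environment() once at startup and fall back to structure-only analysis when ocr_available(report) is false, rather than failing partway through a batch.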