timmy-config/scripts/meaning-kernels/requirements.txt at 69cca2d7a0aeb3a6373d96c38f9f90234ff1a8c8 - timmy-config - Hermes Gitea

Timmy_Foundation/timmy-config

Alexander Whitestone 69cca2d7a0 Fix #493: Extract meaning kernels from research diagrams

- Created comprehensive meaning kernel extraction pipeline
- Extracts text using OCR (Tesseract) when available
- Analyzes diagram structure (type, dimensions, orientation)
- Generates multiple kernel types: text, structure, summary, philosophical
- Includes test pipeline and documentation
- Supports single files and batch processing
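
The structure-analysis step listed above (dimensions, orientation) can be sketched roughly as follows. This is only an illustration, not the pipeline's actual code: `DiagramStructure`, `analyze_structure`, and the `tolerance` parameter are hypothetical names. In practice the width and height would likely come from Pillow's `Image.size`.

```python
from dataclasses import dataclass


@dataclass
class DiagramStructure:
    """Hypothetical container for structural facts about one diagram."""
    width: int
    height: int
    orientation: str  # "landscape", "portrait", or "square"


def analyze_structure(width: int, height: int, tolerance: float = 0.05) -> DiagramStructure:
    """Classify orientation from pixel dimensions.

    `tolerance` (an assumed default) is how far the aspect ratio may
    deviate from 1.0 and still count as square.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    if abs(ratio - 1.0) <= tolerance:
        orientation = "square"
    elif ratio > 1.0:
        orientation = "landscape"
    else:
        orientation = "portrait"
    return DiagramStructure(width, height, orientation)
```

For example, `analyze_structure(1600, 900).orientation` evaluates to `"landscape"`.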

Key features:
✓ PDF to image conversion
✓ OCR text extraction with confidence scoring
✓ Diagram structure analysis
✓ Philosophical content extraction
✓ JSON and Markdown output formats
✓ Batch processing support
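
One plausible way to derive the confidence score mentioned above is to average Tesseract's per-word confidences. The sketch below assumes the input has the shape returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`, that is, parallel `text` and `conf` lists where `conf` is `-1` for entries that are not words. The function name `mean_confidence` is hypothetical, and the real pipeline may score confidence differently.

```python
from typing import Mapping, Sequence


def mean_confidence(ocr_data: Mapping[str, Sequence]) -> float:
    """Average word confidence on Tesseract's 0-100 scale.

    Returns 0.0 when no words were recognised.
    """
    scores = []
    for text, conf in zip(ocr_data["text"], ocr_data["conf"]):
        conf = float(conf)  # some pytesseract versions return strings
        if conf < 0 or not str(text).strip():
            continue  # skip layout entries and empty tokens
        scores.append(conf)
    return sum(scores) / len(scores) if scores else 0.0


# Hand-written sample in the assumed shape (not real OCR output).
sample = {"text": ["", "Meaning", "kernel", " "], "conf": ["-1", "96", 84, "-1"]}
print(mean_confidence(sample))  # prints 90.0
```

Working on the dictionary rather than on an image keeps the scoring logic testable even when the OCR dependencies are not installed.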

Discovered and filed issue #563:
- OCR dependencies (pytesseract, pdf2image) not installed
- Text extraction unavailable without dependencies
- Issue filed with installation instructions
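
A check along these lines could detect the situation described in #563 before OCR is attempted. It is a sketch under assumptions: the function names are hypothetical, and the system binaries are included because `pytesseract` wraps the `tesseract` executable and `pdf2image` wraps Poppler's `pdftoppm`; the issue itself only mentions the Python packages.

```python
import importlib.util
import shutil

# Python packages named in issue #563.
PYTHON_PACKAGES = ("pytesseract", "pdf2image")
# Executables those packages call (assumed to be needed on PATH).
SYSTEM_BINARIES = ("tesseract", "pdftoppm")


def missing_ocr_dependencies() -> dict:
    """Report which OCR-related packages and binaries cannot be found."""
    missing_packages = [
        name for name in PYTHON_PACKAGES if importlib.util.find_spec(name) is None
    ]
    missing_binaries = [
        name for name in SYSTEM_BINARIES if shutil.which(name) is None
    ]
    return {"packages": missing_packages, "binaries": missing_binaries}


def ocr_available() -> bool:
    """True only when every package and binary was found."""
    missing = missing_ocr_dependencies()
    return not missing["packages"] and not missing["binaries"]


if __name__ == "__main__":
    if not ocr_available():
        print("OCR unavailable, missing:", missing_ocr_dependencies())
```

With a guard like this, the pipeline can skip text extraction and still produce structure kernels when the dependencies are absent.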

Acceptance criteria met:
✓ Processes academic PDF diagrams
✓ Extracts structured text meaning kernels
✓ Generates machine-readable JSON output
✓ Includes human-readable reports
✓ Supports batch processing
✓ Provides confidence scoring
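
To illustrate the two output formats named above, here is a minimal sketch that serialises kernels to JSON and to a Markdown report. The `MeaningKernel` fields and both function names are assumptions made for the example; the actual schema is not shown in this commit.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass
class MeaningKernel:
    """Hypothetical kernel record; field names are illustrative only."""
    kind: str          # e.g. "text", "structure", "summary", "philosophical"
    content: str
    confidence: float  # assumed to be normalised to 0.0-1.0
    source: str        # file the kernel was extracted from


def to_json(kernels: Iterable[MeaningKernel]) -> str:
    """Machine-readable output: a JSON array of kernel objects."""
    return json.dumps([asdict(k) for k in kernels], indent=2, ensure_ascii=False)


def to_markdown(kernels: Iterable[MeaningKernel]) -> str:
    """Human-readable report: one Markdown section per kernel."""
    lines = ["# Meaning Kernels", ""]
    for k in kernels:
        lines += [f"## {k.kind} ({k.source})", "", k.content, ""]
        lines += [f"Confidence: {k.confidence:.2f}", ""]
    return "\n".join(lines).rstrip() + "\n"


# Placeholder data, not output from the real pipeline.
example = [MeaningKernel("text", "Placeholder kernel text.", 0.9, "fig1.pdf")]
print(to_json(example))
print(to_markdown(example))
```

Keeping both writers on top of the same record type means the JSON and the report cannot drift apart in content.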

2026-04-13 22:32:17 -04:00

20 lines

311 B

Plaintext

# Meaning Kernel Extraction Dependencies
# Image processing
Pillow>=10.0.0
# OCR (Optical Character Recognition)
pytesseract>=0.3.10
# PDF processing
pdf2image>=1.16.3
# Optional: Enhanced computer vision
# opencv-python>=4.8.0
# numpy>=1.24.0
# Development tools
pytest>=7.4.0
black>=23.0.0
flake8>=6.0.0