8.1: Entity Extractor — NER for transcripts, READMEs, issues #238

Open
Rockachopa wants to merge 1 commit from step35/144-8-1-entity-extractor into main

1 Commit

Author SHA1 Message Date
Step35
60889f4720 feat: add entity_extractor for NER (8.1 Entity Extractor)
Some checks failed
Test / pytest (pull_request) Failing after 8s
Add scripts/entity_extractor.py — LLM-based named entity recognition from session transcripts, READMEs, and issues. Extracts people, projects, tools, concepts, and repos. Outputs to knowledge/entities.json.

Includes:
- templates/entity-extraction-prompt.md — extraction prompt
- tests/test_entity_extractor.py — unit tests for dedup/merge logic
- scripts/test_entity_extractor.py — smoke test (mocked pipeline)

Accepts --file, --dir, --session, --batch modes. Deduplicates by name+type, merges with existing entities.json. Designed to yield 100+ entities per batch run.

Closes #144
2026-04-26 00:18:37 -04:00
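
For reviewers, the "deduplicates by name+type, merges with existing entities.json" behavior described in the commit message could look roughly like the sketch below. This is not the code from scripts/entity_extractor.py. The function names (`entity_key`, `merge_entities`, `update_entities_file`), the `sources` field, the case-insensitive key, and the assumption that entities.json holds a flat JSON list are all hypothetical and only illustrate the idea.

```python
import json
from pathlib import Path


def entity_key(entity: dict) -> tuple[str, str]:
    # Assumption: name and type compare case-insensitively after trimming.
    return (entity["name"].strip().lower(), entity["type"].strip().lower())


def merge_entities(existing: list[dict], new: list[dict]) -> list[dict]:
    """Deduplicate by (name, type); existing records win on conflicts."""
    merged: dict[tuple[str, str], dict] = {}
    for entity in [*existing, *new]:
        key = entity_key(entity)
        if key not in merged:
            # First occurrence: copy the record so inputs are not mutated.
            merged[key] = {**entity, "sources": list(entity.get("sources", []))}
            continue
        # Duplicate: keep the first record and union its source list in order.
        for source in entity.get("sources", []):
            if source not in merged[key]["sources"]:
                merged[key]["sources"].append(source)
    return list(merged.values())


def update_entities_file(path: Path, new: list[dict]) -> list[dict]:
    """Merge new entities into the JSON file at `path` and rewrite it."""
    # Assumption: the file, if present, contains a flat JSON list of entities.
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    merged = merge_entities(existing, new)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged
```

Keying on the pair rather than the name alone lets the same string appear once as a tool and once as a project, which matches the name+type wording in the commit message. Whether the real script normalizes case or tracks sources is not stated in this PR.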