hermes-agent/tools/fuzzy_match.py at 3576f44a577fcbc03a65e5fc3193b0d51dae45ea

Teknium 352980311b feat: permissive block_anchor thresholds and unicode normalization (#1539)

Salvaged from PR #1528 by an420eth. Closes #517.

Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
  non-breaking spaces → ASCII) so LLM-produced unicode artifacts
  don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
  multiple candidates — if first/last lines match exactly, the
  block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
  preserve correct character positions
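
A minimal sketch of the normalization and offset ideas listed above, not the actual code in fuzzy_match.py. The names UNICODE_MAP, normalize_unicode, find_anchor_line and line_start_offset are hypothetical, and the exact replacement table (for example, whether an em-dash becomes "--" or "-") is an assumption:

```python
# Hypothetical replacement table; the real mapping in fuzzy_match.py may differ.
UNICODE_MAP = {
    "\u201c": '"',   # left double smart quote
    "\u201d": '"',   # right double smart quote
    "\u2018": "'",   # left single smart quote
    "\u2019": "'",   # right single smart quote
    "\u2014": "--",  # em-dash (assumed ASCII form)
    "\u2013": "-",   # en-dash
    "\u2026": "...", # ellipsis
    "\u00a0": " ",   # non-breaking space
}


def normalize_unicode(text: str) -> str:
    """Replace common LLM-produced unicode artifacts with ASCII."""
    for src, dst in UNICODE_MAP.items():
        text = text.replace(src, dst)
    return text


def find_anchor_line(content_lines: list[str], anchor: str) -> int:
    """Return the index of the first line equal to `anchor` after
    normalization and stripping, or -1 if no line matches."""
    target = normalize_unicode(anchor).strip()
    for i, line in enumerate(content_lines):
        if normalize_unicode(line).strip() == target:
            return i
    return -1


def line_start_offset(original_lines: list[str], index: int) -> int:
    """Character offset of line `index`, computed on the ORIGINAL lines.

    Normalization can change line lengths (an ellipsis becomes three
    characters), so offsets taken from normalized text would drift.
    Assumes lines were split on a single newline character.
    """
    return sum(len(line) + 1 for line in original_lines[:index])
```

Matching runs on normalized copies, while offsets are summed over the untouched lines, so a replacement can still be applied at the right character position in the original file.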

Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.
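
The threshold rule behind the "very-low-similarity unique matches" scenario can be sketched as follows. This is an illustration under assumptions, not the code from _strategy_block_anchor: the function name accept_candidates and the candidate tuple layout are hypothetical, and difflib.SequenceMatcher is assumed as the similarity measure. Only the two threshold values come from the commit message.

```python
from difflib import SequenceMatcher

UNIQUE_THRESHOLD = 0.10  # from the commit message: single candidate
MULTI_THRESHOLD = 0.30   # from the commit message: multiple candidates


def accept_candidates(candidates, pattern_middle):
    """Filter candidate blocks whose first/last lines already matched.

    `candidates` is a list of (start_line, end_line, middle_text) tuples
    (hypothetical layout). A lone candidate only needs to clear the
    lower threshold, since matching anchors already make it likely
    to be the intended block.
    """
    threshold = UNIQUE_THRESHOLD if len(candidates) == 1 else MULTI_THRESHOLD
    accepted = []
    for start, end, middle in candidates:
        score = SequenceMatcher(None, middle, pattern_middle).ratio()
        if score >= threshold:
            accepted.append((start, end))
    return accepted
```

With a single candidate, a middle section that shares only a small fraction of its characters with the pattern is still accepted; the same similarity would be rejected once a second candidate makes the choice ambiguous.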

Co-authored-by: an420eth <an420eth@users.noreply.github.com>

2026-03-16 05:29:25 -07:00

17 KiB
