[Mnemosyne] Add entry update + content deduplication #1239

Closed
opened 2026-04-11 23:36:52 +00:00 by Rockachopa · 2 comments
Owner

Problem

The MnemosyneArchive currently has no way to update an existing entry's title/content/metadata after creation. The only mutation operations are tag management (add_tags, remove_tags, retag) and full removal.

Additionally, add() performs no content deduplication — calling ingest_from_mempalace twice on the same data creates duplicate entries with different UUIDs.

Proposed Changes

  1. update_entry(entry_id, title=None, content=None, metadata=None) — Update specific fields on an existing entry, bump updated_at timestamp, and re-run auto-linking since content changed.

  2. Content hash dedup — Compute a content hash (SHA-256 of title+content) on each entry. Before adding, check if an entry with the same hash exists. Add find_duplicate() method.

  3. updated_at field — Add to ArchiveEntry dataclass (default: same as created_at). Updated on any mutation.
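For illustration, the hash and the dedup check in items 2 and 3 could look roughly like the sketch below. This is not the actual Mnemosyne code. `Entry`, `Archive`, and `compute_content_hash` are hypothetical stand-ins, and the NUL separator between title and content is an assumption, since the proposal only says "title+content".

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


def compute_content_hash(title: str, content: str) -> str:
    """SHA-256 over title and content (hypothetical helper)."""
    # Assumption: a NUL separator keeps ("ab", "c") and ("a", "bc") distinct.
    payload = f"{title}\x00{content}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Entry:
    """Hypothetical stand-in for ArchiveEntry."""
    title: str
    content: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(default_factory=_now)
    updated_at: Optional[str] = None
    content_hash: str = ""

    def __post_init__(self) -> None:
        if not self.content_hash:
            self.content_hash = compute_content_hash(self.title, self.content)
        # Item 3 of the proposal: updated_at defaults to created_at.
        if self.updated_at is None:
            self.updated_at = self.created_at


class Archive:
    """Hypothetical stand-in for MnemosyneArchive."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def find_duplicate(self, entry: Entry) -> Optional[Entry]:
        # Read-only lookup: return a stored entry with the same hash, if any.
        for existing in self._entries.values():
            if existing.content_hash == entry.content_hash and existing.id != entry.id:
                return existing
        return None

    def add(self, entry: Entry) -> Entry:
        # Dedup check before insertion: hand back the stored entry instead.
        duplicate = self.find_duplicate(entry)
        if duplicate is not None:
            return duplicate
        self._entries[entry.id] = entry
        return entry
```

With this shape, ingesting the same title and content twice returns the first entry on the second call instead of storing a copy under a new UUID. The linear scan is only for brevity. A real archive would likely keep a hash-to-id index.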

Files

  • nexus/mnemosyne/entry.py — add content_hash, updated_at
  • nexus/mnemosyne/archive.py — add update_entry(), find_duplicates(), dedup check in add()
  • nexus/mnemosyne/tests/test_archive.py — tests for new functionality
claude self-assigned this 2026-04-11 23:39:45 +00:00
Member

PR created: #1241

Summary of changes:

  • ArchiveEntry now carries a content_hash (SHA-256 of title+content) and an updated_at timestamp (defaults to None, set on any mutation). Both fields round-trip through to_dict/from_dict.
  • add() deduplicates on content_hash — calling ingest_from_mempalace twice on the same data now returns the existing entry rather than creating a ghost UUID.
  • New find_duplicate(entry) method checks the archive for a hash collision without side-effects.
  • New update_entry(entry_id, title, content, metadata, auto_link=True) updates any combination of fields, refreshes the hash when content changes, bumps updated_at, and re-runs the holographic linker (clearing stale links first).
  • 18 new tests; all 57 pass.
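As a rough sketch of the `update_entry` flow described above (not the code from the PR): the `Entry` and `Archive` classes, the `_hash` helper, and the `linker` callable are hypothetical, and the real holographic linker interface may look quite different.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional


def _hash(title: str, content: str) -> str:
    # Assumption: NUL-separated title and content, hashed with SHA-256.
    return hashlib.sha256(f"{title}\x00{content}".encode("utf-8")).hexdigest()


@dataclass
class Entry:
    """Hypothetical stand-in for ArchiveEntry."""
    id: str
    title: str
    content: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    links: List[str] = field(default_factory=list)
    updated_at: Optional[str] = None
    content_hash: str = ""

    def __post_init__(self) -> None:
        self.content_hash = self.content_hash or _hash(self.title, self.content)


class Archive:
    """Hypothetical stand-in for MnemosyneArchive."""

    def __init__(self, linker: Optional[Callable[[Entry, "Archive"], List[str]]] = None):
        self._entries: Dict[str, Entry] = {}
        self._linker = linker  # assumed shape: (entry, archive) -> list of linked ids

    def update_entry(self, entry_id, title=None, content=None,
                     metadata=None, auto_link=True) -> Entry:
        entry = self._entries[entry_id]  # raises KeyError for unknown ids
        text_changed = False
        if title is not None and title != entry.title:
            entry.title = title
            text_changed = True
        if content is not None and content != entry.content:
            entry.content = content
            text_changed = True
        if metadata is not None:
            entry.metadata = dict(metadata)
        if text_changed:
            # Refresh the hash so later dedup checks see the new text.
            entry.content_hash = _hash(entry.title, entry.content)
            if auto_link and self._linker is not None:
                entry.links.clear()  # drop stale links before re-linking
                entry.links.extend(self._linker(entry, self))
        # In this sketch the timestamp is bumped on every call.
        entry.updated_at = datetime.now(timezone.utc).isoformat()
        return entry
```

Passing `None` for a field leaves it untouched, so a metadata-only update keeps the hash and the existing links as they are.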
Author
Owner

Implemented in PR #1240.

Adds update_entry(), find_by_hash(), find_duplicates(), content hash dedup via add(skip_dups=True), and updated_at tracking. 18 new tests, all passing.
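The comment does not spell out how these methods behave, so the following is only a guess at one plausible shape. It assumes `find_by_hash` returns the first stored entry with a given digest, `find_duplicates` returns groups of ids sharing a hash, and `skip_dups=False` bypasses the check. The dict-based entries and the `add` signature are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Optional


def content_hash(title: str, content: str) -> str:
    # Assumption: NUL-separated title and content, hashed with SHA-256.
    return hashlib.sha256(f"{title}\x00{content}".encode("utf-8")).hexdigest()


class Archive:
    """Hypothetical stand-in for MnemosyneArchive; entries are plain dicts here."""

    def __init__(self) -> None:
        self._entries: Dict[str, dict] = {}

    def find_by_hash(self, digest: str) -> Optional[dict]:
        # First stored entry whose hash matches, or None.
        for entry in self._entries.values():
            if entry["content_hash"] == digest:
                return entry
        return None

    def add(self, entry_id: str, title: str, content: str, skip_dups: bool = True) -> dict:
        digest = content_hash(title, content)
        if skip_dups:
            existing = self.find_by_hash(digest)
            if existing is not None:
                return existing  # reuse the stored entry instead of duplicating it
        entry = {"id": entry_id, "title": title, "content": content,
                 "content_hash": digest}
        self._entries[entry_id] = entry
        return entry

    def find_duplicates(self) -> List[List[str]]:
        # Group ids by hash and keep only groups with more than one member.
        groups = defaultdict(list)
        for entry in self._entries.values():
            groups[entry["content_hash"]].append(entry["id"])
        return [ids for ids in groups.values() if len(ids) > 1]
```

Under these assumptions, `find_duplicates()` stays empty as long as every `add()` uses the default, and only reports groups for entries that predate the hash or were added with `skip_dups=False`.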


Reference: Timmy_Foundation/the-nexus#1239