Project Notes

Maintained by Hermes — last updated June 2025


1. Kandinsky (Multimodal Transformer)

  • Repo: https://github.com/samherring99/kandinsky
  • Local path: ~/Desktop/Projects/kandinsky
  • Description: An anything-to-anything transformer combining text, image, and audio modalities. Trains on Pokemon BLIP captions paired with Gen 1 Pokemon audio cries. Uses audio tokenization adapted from nanoGPT.
  • Status: Early POC. Training code (model.py) exists and dataset creation (create_dataset.py) works. The audio heads produce the same sound; it is unclear whether this is a training issue or a data issue.
  • TODO:
    • Debug why audio heads produce identical output
    • Investigate if model needs more training time
    • Design a data pipeline for better/more training data
    • General repo cleanup (requirements.txt, proper CLI, etc.)
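
A possible first step for the identical-output TODO is to measure how much the generated audio actually varies. The sketch below is not code from the repo: `output_diversity` is a hypothetical helper, and it assumes the generated audio for several different prompts can be collected as lists of integer token IDs.

```python
from collections import Counter
from typing import Sequence


def output_diversity(samples: Sequence[Sequence[int]]) -> dict:
    """Summarize how much a set of generated token sequences differ.

    `samples` is assumed to hold one generated audio token sequence per prompt.
    """
    if not samples:
        raise ValueError("need at least one sample")
    as_tuples = [tuple(s) for s in samples]
    counts = Counter(as_tuples)
    most_common_count = counts.most_common(1)[0][1]
    return {
        "n_samples": len(as_tuples),
        "n_distinct": len(counts),
        # True when more than one sample was given and all are identical.
        "collapsed": len(as_tuples) > 1 and len(counts) == 1,
        # Share of samples equal to the single most frequent sequence.
        "top_share": most_common_count / len(as_tuples),
    }
```

Running the same helper on the tokenized cries in the training set may help separate the two hypotheses: if the dataset itself shows little diversity the problem is likely in the data, whereas diverse data with collapsed generations points toward training or conditioning.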

2. NightwingGameSim (LLM → GameBoy ROM Generator)

  • Repo: https://github.com/samherring99/NightwingGameSim
  • Local path: ~/Desktop/Projects/NightwingGameSim
  • Description: AI-powered pipeline that turns natural language prompts into playable GameBoy ROM files. Generates C code, compiles with GBDK, outputs .gb files. Supports Claude API, local Llama, and RAG backends.
  • Status: Functional — generation pipeline works end-to-end with Claude 4 system prompt. Has tests, docs, examples, and retry logic.
  • TODO:
    • Harden the repo, clean up structure
    • Build a better testing pipeline
    • Come up with better prompt ideas / examples
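
The generate, compile, and retry flow described above can be sketched as a small loop. This is an illustration under assumptions, not the repo's implementation: `generate_rom_source`, `CompileResult`, and the two callables are hypothetical names, and the LLM backend and the GBDK compile step are passed in as functions so the loop stays backend-agnostic.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CompileResult:
    ok: bool
    log: str  # compiler output, fed back to the model on failure


def generate_rom_source(
    prompt: str,
    generate: Callable[[str, Optional[str]], str],
    compile_c: Callable[[str], CompileResult],
    max_attempts: int = 3,
) -> str:
    """Return C source that compiled, retrying with the error log as feedback."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        # `generate` receives the previous compiler log (None on the first try).
        source = generate(prompt, feedback)
        result = compile_c(source)
        if result.ok:
            return source
        feedback = result.log
    raise RuntimeError(
        f"no compilable source after {max_attempts} attempts: {feedback}"
    )
```

Injecting `generate` and `compile_c` also makes the loop easy to test with fakes, which fits the "better testing pipeline" TODO: no API calls or compiler installs are needed to exercise the retry path.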

3. ContentBasedMIR (Music Information Retrieval)

  • Repo: https://github.com/samherring99/ContentBasedMIR
  • Local path: ~/Desktop/Projects/ContentBasedMIR
  • Description: Music similarity analysis using Spotify API track data. Extracts 54 audio features per song and visualizes similarity matrices for music recommendation.
  • Status: Early stage. Can download Spotify track analysis data and plot similarity matrices. Needs significant expansion.
  • TODO:
    • Expand analysis pipeline with more features
    • Integrate with text message data for personalized recommendations
    • Build out visualization and exploration tools
    • General modernization (dependencies, structure)
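
For reference, a similarity matrix over per-song feature vectors can be computed as below. The notes do not say which metric the project uses, so cosine similarity is an assumption here, and `cosine_similarity_matrix` is a hypothetical name. Each row stands for one song's feature vector (54 values in this project).

```python
import math
from typing import List, Sequence


def cosine_similarity_matrix(features: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return an n x n matrix of pairwise cosine similarities between rows."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in features]
    n = len(features)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if norms[i] == 0.0 or norms[j] == 0.0:
                sim = 0.0  # treat all-zero vectors as similar to nothing
            else:
                dot = sum(a * b for a, b in zip(features[i], features[j]))
                sim = dot / (norms[i] * norms[j])
            # The matrix is symmetric, so fill both halves at once.
            matrix[i][j] = matrix[j][i] = sim
    return matrix
```

If the features mix units (for example tempo in BPM next to values in the 0 to 1 range), standardizing each feature column before computing similarities is probably worthwhile, since otherwise the largest-magnitude features dominate the result.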

4. MessageRetrieval (iMessage RAG/SQL)

  • Repo: https://github.com/samherring99/MessageRetrieval
  • Local path: ~/Desktop/Projects/MessageRetrieval
  • Description: Natural language querying over iMessage data using SQL generation (text2SQL) instead of vector embeddings. Uses LLM-as-Judge pattern for scoring and ranking retrieved messages.
  • Status: Has initial text2SQL pipeline and summarization tool. Recently worked on with Claude Code. Needs testing.
  • TODO:
    • Test out the recent Claude Code work
    • Build "iMessage Jarvis" — answer questions about texts
    • Improve SQL generation prompts and accuracy
    • Better error handling and UX
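
The text2SQL plus LLM-as-Judge flow described above could look roughly like this. It is a minimal sketch under assumptions, not the repo's code: `answer_query`, `generate_sql`, and `judge` are hypothetical names, both LLM calls are injected as plain functions, and the query is assumed to return the message text in its first column.

```python
import sqlite3
from typing import Callable, List, Tuple


def answer_query(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str], str],
    judge: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Generate SQL for a question, run it, and rank rows by judge score."""
    sql = generate_sql(question)
    # Guard: only run read-only statements produced by the model.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError(f"refusing non-SELECT statement: {sql!r}")
    rows = conn.execute(sql).fetchall()
    # Assumes the first selected column is the message text.
    scored = [(judge(question, row[0]), row[0]) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

The SELECT-only check is a deliberately crude safety net; opening the database in read-only mode would be a stronger guarantee against a generated statement modifying message data.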

5. Grailed Embedding Search

  • Repo: https://github.com/samherring99/grailed-embedding-search
  • Local path: ~/Desktop/Projects/grailed-embedding-search
  • Description: Semantic similarity search over Grailed fashion listings using CLIP embeddings and FAISS. Search by image URL or text description to find visually similar products.
  • Status: Functional core pipeline. CLIP ViT-B/32 embeds product cover photos into 512-dim vectors, indexed with FAISS cosine similarity. Has CLI, batch embedding, persistent index save/load, and logging.
  • Recent work (June 2025):
    • PR #1 — Initial cleanup: docstrings, type hints, .gitignore, requirements.txt, README rewrite
    • PR #2 — Feature improvements: persistent FAISS save/load, batch embedding, CLI (cli.py), proper logging throughout, lazy Grailed client, fetch_details toggle
  • TODO:
    • Embedding cache (avoid re-embedding known product URLs)
    • Async/threaded image downloads for faster batch indexing
    • Search result visualization (matplotlib grid of cover photos)
    • Filter by category, designer, price range before search
    • Web UI (Gradio or Streamlit)
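
The embedding cache TODO could start from something like the sketch below. It is an assumption-laden illustration, not the repo's design: `EmbeddingCache` is a hypothetical class, the CLIP embedding step is injected as an `embed` function, and vectors are stored as JSON files keyed by a hash of the product URL.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


class EmbeddingCache:
    """Disk cache that avoids re-embedding URLs that were already processed."""

    def __init__(self, cache_dir: Path, embed: Callable[[str], List[float]]):
        self.cache_dir = cache_dir
        self.embed = embed
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, url: str) -> Path:
        # Hash the URL so it is always a safe, fixed-length file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get(self, url: str) -> List[float]:
        path = self._path_for(url)
        if path.exists():
            return json.loads(path.read_text())
        vector = self.embed(url)  # cache miss: compute once, then persist
        path.write_text(json.dumps(vector))
        return vector
```

JSON keeps the sketch dependency-free; for 512-dim vectors at scale, a binary format or a single key-value store would likely be more compact and faster to load.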

6. NightwingNBA (Sports Analytics)

  • Repo: https://github.com/samherring99/NightwingNBA
  • Local path: ~/Desktop/Projects/NightwingNBA
  • Description: NBA game prediction system. Builds a database of game data, trains a PyTorch model, and makes daily predictions. Has full pipeline: build DB → write data → train → predict.
  • Status: Functional pipeline exists. Has database building, training, prediction, and daily update scripts.
  • TODO:
    • Explore and potentially revive
    • Update data sources if stale
    • Improve model accuracy
    • Add visualization/reporting

7. Stable Audio Sample Explorer


8. NightwingArt (Art Tools)

  • Repo: https://github.com/samherring99/NightwingArt
  • Local path: ~/Desktop/Projects/NightwingArt
  • Description: Collection of art tooling scripts — video editing, clip splicing with beat matching, damage effects, and general image manipulation.
  • Status: Maintenance mode. Tools exist for various effects. Work happens as-needed.
  • TODO:
    • Add tools as needed for new art projects

9. Claude-based VST Building ⚠️ Needs new repo

  • Description: Generate VST audio plugins for DAWs from English language prompts. LLM-powered audio plugin creation.
  • Status: Concept only — no repo exists yet.
  • TODO:
    • Create repo
    • Research VST SDK / JUCE framework
    • Design prompt → code → compile pipeline

10. Government Auction Site Scraper ⚠️ Needs new repo

  • Description: Tool that monitors and scrapes government auction sites in San Francisco for deals.
  • Status: Concept only — no repo exists yet.
  • TODO:
    • Create repo
    • Research SF government auction sites and their structure
    • Build scraper + notification system

Priority Assessment

| Project | Activity Level | Suggested Priority |
|---|---|---|
| NightwingGameSim | Active | 🔴 High |
| MessageRetrieval | Active | 🔴 High |
| Kandinsky | Active | 🟡 Medium |
| ContentBasedMIR | Exploratory | 🟡 Medium |
| Grailed Embedding Search | Early | 🟡 Medium |
| NightwingNBA | Dormant | 🟢 Low |
| NightwingArt | As-needed | 🟢 Low |
| VST Builder | Concept | 🔵 Future |
| Gov Auction Scraper | Concept | 🔵 Future |
| Stable Audio Explorer | Dead | None |