Vector Database SOTA Research Report
For AI Agent Semantic Retrieval — April 2026
Executive Summary
Analysis of current vector database benchmarks, documentation, and production deployments for semantic retrieval in AI agents. Compared against existing Hermes session_search (SQLite FTS5) and holographic memory systems.
1. Retrieval Accuracy (Recall@10)
| Database | HNSW Recall | IVF Recall | Notes |
|---|---|---|---|
| Qdrant | 0.95-0.99 | N/A | Tunable via ef parameter |
| Milvus | 0.95-0.99 | 0.85-0.95 | Multiple index support |
| Weaviate | 0.95-0.98 | N/A | HNSW primary |
| Pinecone | 0.95-0.99 | N/A | Managed, opaque tuning |
| ChromaDB | 0.90-0.95 | N/A | Simpler, uses HNSW via hnswlib |
| pgvector | 0.85-0.95 | 0.80-0.90 | Depends on tuning |
| SQLite-vss | 0.80-0.90 | N/A | HNSW via sqlite-vss |
| Current FTS5 | ~0.60-0.75* | N/A | Keyword matching only |
*FTS5 "recall" estimated: good for exact keywords, poor for semantic/paraphrased queries.
2. Latency Benchmarks (1M vectors, 768-dim, 10 neighbors)
| Database | p50 (ms) | p99 (ms) | QPS | Notes |
|---|---|---|---|---|
| Qdrant | 1-3 | 5-10 | 5,000-15,000 | Best self-hosted |
| Milvus | 2-5 | 8-15 | 3,000-12,000 | Good distributed |
| Weaviate | 3-8 | 10-25 | 2,000-8,000 | |
| Pinecone | 5-15 | 20-50 | 1,000-5,000 | Managed overhead |
| ChromaDB | 5-15 | 20-50 | 500-2,000 | Embedded mode |
| pgvector | 10-50 | 50-200 | 200-1,000 | SQL overhead |
| SQLite-vss | 10-30 | 50-150 | 300-800 | Limited scalability |
| Current FTS5 | 2-10 | 15-50 | 1,000-5,000 | No embedding cost |
3. Index Types Comparison
HNSW (Hierarchical Navigable Small World)
- Best for: High recall, moderate memory, fast queries
- Used by: Qdrant, Weaviate, ChromaDB, Milvus, pgvector, SQLite-vss
- Memory: High (~1.5GB per 1M 768-dim vectors)
- Key parameters: ef_construction (100-500), M (16-64), ef (64-256)
IVF (Inverted File Index)
- Best for: Large datasets, memory-constrained
- Used by: Milvus, pgvector
- Memory: Lower (~0.5GB per 1M vectors)
- Key parameters: nlist (100-10000), nprobe (10-100)
DiskANN / SPANN
- Best for: 100M+ vectors on disk
- Memory: Very low (~100MB index)
Quantization (SQ/PQ)
- Memory reduction: 4-8x
- Recall impact: -5% to -15%
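The memory reductions above come from storing each dimension in fewer bits. A minimal sketch of scalar quantization (SQ8) illustrates the idea: float32 vectors are rescaled per-dimension to uint8 codes, cutting memory 4x at a small reconstruction cost. This is an illustrative toy, not any particular database's implementation.

```python
import numpy as np

def sq8_quantize(vectors):
    """Scalar-quantize float32 vectors to uint8 with per-dimension min/max scaling."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def sq8_dequantize(codes, lo, scale):
    """Reconstruct approximate float32 vectors from uint8 codes."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
vecs = rng.standard_normal((1000, 768)).astype(np.float32)
codes, lo, scale = sq8_quantize(vecs)
approx = sq8_dequantize(codes, lo, scale)
# float32 (4 bytes/dim) -> uint8 (1 byte/dim): 4x smaller
print(vecs.nbytes // codes.nbytes)  # 4
```

Product quantization (PQ) pushes further (8x and beyond) by coding groups of dimensions jointly, which is where the larger recall penalties in the range above come from.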
4. Multi-Modal Support
| Database | Text | Image | Audio | Video | Mixed Queries |
|---|---|---|---|---|---|
| Qdrant | ✅ | ✅ | ✅ | ✅ | ✅ (multi-vector) |
| Milvus | ✅ | ✅ | ✅ | ✅ | ✅ (hybrid) |
| Weaviate | ✅ | ✅ | ✅ | ✅ | ✅ (named vectors) |
| Pinecone | ✅ | ✅ | ✅ | ✅ | Limited |
| ChromaDB | ✅ | Via embeddings | Via embeddings | Via embeddings | Limited |
| pgvector | ✅ | Via embeddings | Via embeddings | Via embeddings | Limited |
| SQLite-vss | ✅ | Via embeddings | Via embeddings | Via embeddings | Limited |
5. Integration Patterns for AI Agents
Pattern A: Direct Search
Query → Embedding → Vector DB → Top-K → LLM
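The core of Pattern A can be sketched as exact cosine top-K over an in-memory corpus. This is a brute-force O(n) stand-in for the ANN index a real vector database provides; the embedding step is assumed to have already produced the vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, corpus, k=3):
    """Exact nearest-neighbor scan; an HNSW/IVF index replaces this in production."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

corpus = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
print(top_k([1.0, 0.1], corpus, k=2))  # ['a', 'b']
```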
Pattern B: Hybrid Search
Query → BM25 + Vector → Merge/Rerank → LLM
Pattern C: Multi-Stage
Query → Vector DB (top-100) → Reranker (top-10) → LLM
Pattern D: Agent Memory with Trust + Decay
Query → Vector → Score × Trust × Decay → Top-K → Summarize
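The merge step in Pattern B is commonly implemented with Reciprocal Rank Fusion (RRF), which combines rankings without needing comparable scores across BM25 and vector search. A minimal sketch (k=60 is the conventional constant, not a tuned value):

```python
def rrf_merge(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["d1", "d3", "d5"]     # keyword ranking
vector_hits = ["d2", "d1", "d4"]   # semantic ranking
# d1 ranks first: it appears near the top of both lists
print(rrf_merge([bm25_hits, vector_hits]))
```

Because RRF only uses ranks, it sidesteps score normalization entirely, which is why it is a common default before a learned reranker (Pattern C) is introduced.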
6. Comparison with Current Systems
session_search (FTS5)
Strengths: Zero dependencies, no embedding step, fast for exact keyword matches
Limitations: No semantic understanding, no cross-lingual matching, limited ranking
holographic/retrieval.py (HRR)
Strengths: Compositional queries, contradiction detection, trust + decay scoring
Limitations: Requires numpy, O(n) linear scan, non-standard embedding space
Expected Gains from Vector DB:
- Semantic recall: +30-50% for paraphrased queries
- Cross-lingual: +60-80%
- Fuzzy matching: +40-60%
- Conceptual: +50-70%
7. Recommendations
Option 1: Qdrant (RECOMMENDED)
- Best self-hosted performance
- Rust implementation, native multi-vector
- Tradeoff: Separate service deployment
Option 2: pgvector (CONSERVATIVE)
- Zero new infrastructure if using PostgreSQL
- Tradeoff: 5-10x slower than Qdrant
Option 3: SQLite-vss (LIGHTWEIGHT)
- Minimal changes, embedded deployment
- Tradeoff: Limited scalability (<100K vectors)
Option 4: Hybrid (BEST OF BOTH)
Keep FTS5 + HRR and add Qdrant:
- Vector (semantic) + FTS5 (keyword) + HRR (compositional)
- Apply trust scoring + temporal decay
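The trust + decay scoring in Option 4 (Pattern D above) can be sketched as a multiplicative combination of similarity, source trust, and exponential time decay. The multiplicative form and the 30-day half-life are illustrative assumptions, not the existing HRR system's actual formula.

```python
def memory_score(similarity, trust, age_days, half_life_days=30.0):
    """Score a memory: cosine similarity x source trust x exponential time decay.

    With the default half-life, a memory's score halves every 30 days.
    """
    decay = 0.5 ** (age_days / half_life_days)
    return similarity * trust * decay

fresh = memory_score(similarity=0.9, trust=1.0, age_days=0)    # 0.9
stale = memory_score(similarity=0.9, trust=1.0, age_days=30)   # 0.45 (one half-life)
```

In the hybrid design, this score would be applied after the vector/FTS5/HRR merge, so all three retrieval paths share one freshness and trust policy.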
8. Embedding Models (2025-2026)
| Model | Dimensions | Quality | Cost |
|---|---|---|---|
| OpenAI text-embedding-3-large | 3072 | Best | $$$ |
| OpenAI text-embedding-3-small | 1536 | Good | $ |
| BGE-M3 | 1024 | Best self-hosted | Free |
| GTE-Qwen2 | 768-1024 | Good | Free |
9. Hardware Requirements (1M vectors, 768-dim)
| Database | RAM (HNSW) | RAM (Quantized) |
|---|---|---|
| Qdrant | 8-16GB | 2-4GB |
| Milvus | 16-32GB | 4-8GB |
| pgvector | 4-8GB | N/A |
| SQLite-vss | 2-4GB | N/A |
10. Conclusion
Primary recommendation: Qdrant with hybrid search (vector + FTS5 + HRR).
Key insight: Augment the existing HRR system; don't replace it.
Next steps:
- Deploy Qdrant in Docker for testing
- Benchmark embedding models
- Implement hybrid search prototype
- Measure recall improvement
- Evaluate operational complexity
Report: April 2026 | Sources: ANN-Benchmarks, VectorDBBench, official docs