[SUBSTRATUM-005] Memory and Performance Benchmark #206

Open
opened 2026-03-31 22:33:04 +00:00 by Timmy · 1 comment

Objective

Complete performance evaluation of Claw vs Hermes.

Acceptance Criteria

  • Memory usage under load (Claw vs Hermes)
  • Task completion rate (20 task corpus)
  • Token efficiency comparison
  • Document results in /root/wizards/substrate/docs/
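The completion-rate and token-efficiency criteria could be summarized along the following lines. This is a minimal sketch, not the harness's actual code: `TaskResult`, `completion_rate`, and `token_ratio` are hypothetical names, and reading "token efficiency" as total input tokens divided by total output tokens is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TaskResult:
    """Outcome of one corpus task (hypothetical record layout)."""
    task_id: str
    completed: bool
    input_tokens: int
    output_tokens: int


def completion_rate(results: Sequence[TaskResult]) -> float:
    """Fraction of tasks that completed, between 0.0 and 1.0."""
    if not results:
        raise ValueError("no task results to summarize")
    return sum(1 for r in results if r.completed) / len(results)


def token_ratio(results: Sequence[TaskResult]) -> float:
    """Total input tokens divided by total output tokens (assumed definition)."""
    total_in = sum(r.input_tokens for r in results)
    total_out = sum(r.output_tokens for r in results)
    if total_out == 0:
        raise ValueError("no output tokens recorded")
    return total_in / total_out
```

Running both functions over the same 20 task corpus for Claw and for Hermes would give directly comparable numbers for the results document.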

Dependencies

  • Requires ANTHROPIC_API_KEY for live testing

Assignee

@substratum

substratum was assigned by Timmy 2026-03-31 22:33:04 +00:00

Memory and Performance Benchmark

Current Results (Cold Start)

| Metric | Claw | Hermes | Ratio |
|--------|------|--------|-------|
| Cold Start | 5.4ms | 3,000ms | 550x faster ✅ |
| Binary Size | 11MB | ~500MB | 45x smaller ✅ |
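How the cold-start figures were collected is not stated here. One way to reproduce a comparable number is to time a full launch-to-exit of each binary several times and take the median. The sketch below assumes that approach; `cold_start_ms` and `speedup` are hypothetical helpers, and repeated runs will hit a warm page cache, so the result is closer to a startup time than a strictly cold start.

```python
import statistics
import subprocess
import time
from typing import Sequence


def cold_start_ms(cmd: Sequence[str], runs: int = 5) -> float:
    """Median wall-clock time in milliseconds to launch cmd and wait for exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            list(cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(slow_ms: float, fast_ms: float) -> float:
    """How many times faster the quick measurement is than the slow one."""
    return slow_ms / fast_ms


# Using the figures reported in the table:
# speedup(3000, 5.4) is about 555.6, which the table rounds to 550x.
```

The same division applied to the binary sizes (500 / 11, about 45.5) matches the 45x figure in the table.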

Pending Measurements

  • Memory usage under load (RSS tracking)
  • Task completion rate (20 task corpus)
  • Token efficiency (input/output ratio)
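For the memory measurement, RSS tracking can be done by sampling a child process while it runs. The following is a sketch under stated assumptions: it is Linux-only (it reads `VmRSS` from `/proc/<pid>/status`), the function names are hypothetical, and it is not taken from `benchmark.py`.

```python
import subprocess
import time
from pathlib import Path
from typing import Optional, Sequence


def read_rss_kib(pid: int) -> Optional[int]:
    """Current resident set size in KiB, or None if it cannot be read."""
    try:
        text = Path(f"/proc/{pid}/status").read_text()
    except OSError:
        # Process already gone, or /proc is unavailable on this platform.
        return None
    for line in text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])  # value is reported in kB
    return None  # e.g. a zombie process has no VmRSS line


def peak_rss_kib(cmd: Sequence[str], interval_s: float = 0.01) -> int:
    """Run cmd to completion and return the highest RSS sample seen, in KiB."""
    proc = subprocess.Popen(
        list(cmd), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    peak = 0
    while proc.poll() is None:
        rss = read_rss_kib(proc.pid)
        if rss is not None:
            peak = max(peak, rss)
        time.sleep(interval_s)
    return peak
```

Because this samples at intervals, short spikes between samples can be missed; a smaller `interval_s` trades overhead for resolution.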

Blocker

Requires ANTHROPIC_API_KEY for live benchmarking.
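A preflight check can make this blocker fail fast instead of partway through a run. The sketch below assumes the key is supplied through the `ANTHROPIC_API_KEY` environment variable; `require_api_key` is a hypothetical helper, and how `benchmark.py` actually reads the key is not documented here.

```python
import os
from typing import Mapping, Optional


def require_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "ANTHROPIC_API_KEY",
) -> str:
    """Return the API key from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; live benchmarks cannot run")
    return key
```

Calling `require_api_key()` before the first live task would report the missing credential immediately.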

Benchmark Harness

  • Tool: /root/wizards/substrate/harness/benchmark.py
  • Ready to run once API key available

Acceptance Criteria Status

  • Memory usage measured ⚠️ (needs API key)
  • Task completion rate tested ⚠️ (needs API key)
  • Token efficiency compared ⚠️ (needs API key)
  • Results documented (template ready)

Benchmark infrastructure ready. Execution pending API credentials.


Reference: Timmy_Foundation/timmy-home#206