Timmy_Foundation/hermes-agent

Fork 0

Files

Claude (Opus 4.6) 066ec8eafa [claude] Add Ezra Quarterly Report — April 2026 (MD + PDF) (#133 ) (#163 )

2026-04-07 02:04:45 +00:00

13 KiB

Raw Blame History

Ezra — Quarterly Technical & Strategic Report

April 2026

Executive Summary

This report consolidates the principal technical and strategic outputs from Q1/Q2 2026. Three major workstreams are covered:

Security & Performance Hardening — Shipped V-011 obfuscation detection and context-compressor tuning.
System Formalization Audit — Identified ~6,300 lines of homegrown infrastructure that can be replaced by well-maintained open-source projects.
Business Development — Formalized a pure-contracting go-to-market plan ("Operation Get A Job") to monetize the engineering collective.

1. Recent Deliverables

1.1 V-011 Obfuscation Bypass Detection

A significant security enhancement was shipped to the skills-guard subsystem to defeat obfuscated malicious skill code.

Technical additions:

normalize_input() with NFKC normalization, case folding, and zero-width character removal to defeat homoglyph and ZWSP evasion.
PythonSecurityAnalyzer AST visitor detecting eval/exec/compile, getattr dunder access, and imports of base64/codecs/marshal/types/ctypes.
Additional regex patterns for getattr builtins chains, __import__ os/subprocess, and nested base64 decoding.
Full integration into scan_file(); Python files now receive both normalized regex scanning and AST-based analysis.

Verification: All tests passing (103 passed, 4 warnings).

Reference: Forge PR #131 — [EPIC-999/Phase II] The Forge — V-011 obfuscation fix + compressor tuning

1.2 Context Compressor Tuning

The default protect_last_n parameter was reduced from 20 to 5. The previous default was overly conservative, preventing meaningful compression on long sessions. The new default preserves the five most recent conversational turns while allowing the compressor to effectively reduce token pressure.

A regression test was added verifying that the last five turns are never summarized away.

1.3 Burn Mode Resilience

The agent loop was enhanced with a configurable burn_mode flag that increases concurrent tool execution capacity and adds transient-failure retry logic.

Changes:

max_tool_workers increased from 8 to 16 in burn mode.
Expanded parallel tool coverage to include browser, vision, skill, and session-search tools.
Added batch timeout protection (300s in burn mode / 180s normal) to prevent hung threads from blocking the agent loop.
Thread-pool shutdown now uses executor.shutdown(wait=False) for immediate control return.
Transient errors (timeouts, rate limits, 502/503/504) trigger one automatic retry in burn mode.

2. System Formalization Audit

A comprehensive audit was performed across the hermes-agent codebase to identify homegrown modules that could be replaced by mature open-source alternatives. The objective is efficiency: reduce maintenance burden, leverage community expertise, and improve reliability.

2.1 Candidate Matrix

Priority	Component	Lines	Current State	Proposed Replacement	Effort	ROI
P0	MCP Client	2,176	Custom asyncio transport, sampling, schema translation	`mcp` (official Python SDK)	2-3 wks	Very High
P0	Cron Scheduler	~1,500	Custom JSON job store, manual tick loop	`APScheduler`	1-2 wks	Very High
P0	Config Management	2,589	Manual YAML loader, no type safety	`pydantic-settings` + Pydantic v2	3-4 wks	High
P1	Checkpoint Manager	548	Shells out to `git` binary	`dulwich` (pure-Python git)	1 wk	Medium-High
P1	Auth / Credential Pool	~3,800	Custom JWT decode, OAuth refresh, JSON auth store	`authlib` + `keyring` + `PyJWT`	2-3 wks	Medium
P1	Batch Runner	1,285	Custom `multiprocessing.Pool` wrapper	`joblib` (local) or `celery` (distributed)	1-2 wks	Medium
P2	SQLite Session Store	~2,400	Raw SQLite + FTS5, manual schema	SQLAlchemy ORM + Alembic	2-3 wks	Medium
P2	Trajectory Compressor	1,518	Custom tokenizer + summarization pipeline	Keep core logic; add `zstandard` for binary storage	3 days	Low-Medium
P2	Process Registry	889	Custom background process tracking	Keep (adds too much ops complexity)	—	Low
P2	Web Tools	2,080+	Firecrawl + Parallel wrappers	Keep (Firecrawl is already best-in-class)	—	Low

2.2 P0 Replacements

MCP Client → Official `mcp` Python SDK

Current: tools/mcp_tool.py (2,176 lines) contains custom stdio/HTTP transport lifecycle, manual anyio cancel-scope cleanup, hand-rolled schema translation, custom sampling bridge, credential stripping, and reconnection backoff.

Problem: The Model Context Protocol is evolving rapidly. Maintaining a custom 2K-line client means every protocol revision requires manual patches. The official SDK already handles transport negotiation, lifecycle management, and type-safe schema generation.

Migration Plan:

Add mcp>=1.0.0 to dependencies.
Build a thin HermesMCPBridge class that instantiates mcp.ClientSession, maps MCP Tool schemas to Hermes registry calls, forwards tool invocations, and preserves the sampling callback.
Deprecate the _mcp_loop background thread and anyio-based transport code.
Add integration tests against a test MCP server.

Lines Saved: ~1,600 Risk: Medium — sampling and timeout behavior need parity testing.

Cron Scheduler → APScheduler

Current: cron/jobs.py (753 lines) + cron/scheduler.py (~740 lines) use a JSON file as the job store, custom parse_duration and compute_next_run logic, a manual tick loop, and ad-hoc delivery orchestration.

Problem: Scheduling is a solved problem. The homegrown system lacks timezone support, job concurrency controls, graceful clustering, and durable execution guarantees.

Migration Plan:

Introduce APScheduler with a SQLAlchemyJobStore (or custom JSON store).
Refactor each Hermes cron job into an APScheduler Job function.
Preserve existing delivery logic (_deliver_result, _build_job_prompt, _run_job_script) as the job body.
Migrate jobs.json entries into APScheduler jobs on first run.
Expose /cron status via a thin CLI wrapper.

Lines Saved: ~700 Risk: Low — delivery logic is preserved; only the trigger mechanism changes.

Config Management → `pydantic-settings`

Current: hermes_cli/config.py (2,589 lines) uses manual YAML parsing with hardcoded defaults, a complex migration chain (_config_version currently at 11), no runtime type validation, and stringly-typed env var resolution.

Problem: Every new config option requires touching multiple places. Migration logic is ~400 lines and growing. Typo'd config values are only caught at runtime, often deep in the agent loop.

Migration Plan:

Define a HermesConfig Pydantic model with nested sections (ModelConfig, ProviderConfig, AgentConfig, CompressionConfig, etc.).
Use pydantic-settings's SettingsConfigDict(yaml_file="~/.hermes/config.yaml") to auto-load.
Map env vars via env_prefix="HERMES_" or field-level validation_alias.
Keep the migration layer as a one-time upgrade function, then remove it after two releases.
Replace load_config() call sites with HermesConfig() instantiation.

Lines Saved: ~1,500 Risk: Medium-High — large blast radius; every module reads config. Requires backward compatibility.

2.3 P1 Replacements

Checkpoint Manager → dulwich

Replace subprocess.run(["git", ...]) calls with dulwich.porcelain equivalents.
Use dulwich.repo.Repo.init_bare() for shadow repos.
Snapshotting becomes an in-memory Index write + commit().
Lines Saved: ~200
Risk: Low

Auth / Credential Pool → authlib + keyring + PyJWT

Use authlib for OAuth2 session and token refresh.
Replace custom JWT decoding with PyJWT.
Migrate the auth store JSON to keyring-backed secure storage where available.
Keep Hermes-specific credential pool strategies (round-robin, least-used, etc.).
Lines Saved: ~800
Risk: Medium

Batch Runner → joblib

For typical local batch sizes, joblib.Parallel(n_jobs=-1, backend='loky') replaces the custom worker pool.
Only migrate to Celery if cross-machine distribution is required.
Lines Saved: ~400
Risk: Low for joblib

2.4 Execution Roadmap

Week 1-2: Migrate Checkpoint Manager to dulwich (quick win, low risk)
Week 3-4: Migrate Cron Scheduler to APScheduler (high value, well-contained)
Week 5-8: Migrate MCP Client to official mcp SDK (highest complexity, highest payoff)
Week 9-12: Migrate Config Management to pydantic-settings (largest blast radius, do last)
Ongoing: Evaluate Auth/Credential Pool and Batch Runner replacements as follow-up epics.

2.5 Cost-Benefit Summary

Metric	Value
Total homebrew lines audited	~17,000
Lines recommended for replacement	~6,300
Estimated dev weeks (P0 + P1)	10-14 weeks
New runtime dependencies added	4-6 well-maintained packages
Maintenance burden reduction	Very High
Risk level	Medium (mitigated by strong test coverage)

3. Strategic Initiative: Operation Get A Job

3.1 Thesis

The engineering collective is capable of 10x delivery velocity compared to typical market offerings. The strategic opportunity is to monetize this capability through pure contracting — high-tempo, fixed-scope engagements with no exclusivity or employer-like constraints.

Tier A — White-Glove Agent Infrastructure ($400-600/hr)

Custom AI agent deployment with tool use (Slack, Discord, Telegram, webhooks)
MCP server development
Local LLM stack setup (on-premise / VPC)
Agent security audit and red teaming

Tier B — Security Hardening & Code Review ($250-400/hr)

Security backlog burn-down (CVE-class bugs)
Skills-guard / sandbox hardening
Architecture review

Tier C — Automation & Integration ($150-250/hr)

Webhook-to-action pipelines
Research and intelligence reporting
Content-to-code workflows

3.3 Engagement Packages

Service	Description	Timeline	Investment
Agent Security Audit	Review of one AI agent pipeline + written findings	2-3 business days	$4,500
MCP Server Build	One custom MCP server with 3-5 tools + docs + tests	1-2 weeks	$8,000
Custom Bot Deployment	End-to-end bot with up to 5 tools, deployed to client platform	2-3 weeks	$12,000
Security Sprint	Close top 5 security issues in a Python/JS repo	1-2 weeks	$6,500
Monthly Retainer — Core	20 hrs/month prioritized engineering + triage	Ongoing	$6,000/mo
Monthly Retainer — Scale	40 hrs/month prioritized engineering + on-call	Ongoing	$11,000/mo

3.4 Go-to-Market Motion

Immediate channels:

Cold outbound to CTOs/VPEs at Series A-C AI startups
LinkedIn authority content (architecture reviews, security bulletins)
Platform presence (Gun.io, Toptal, Upwork for specific niche keywords)

Lead magnet: Free 15-minute architecture review. No pitch. One concrete risk identified.

3.5 Infrastructure Foundation

The Hermes Agent framework serves as both the delivery platform and the portfolio piece:

Open-source runtime with ~3,000 tests
Gateway architecture supporting 8+ messaging platforms
Native MCP client, cron scheduling, subagent delegation
Self-hosted Forge (Gitea) with CI and automated PR review
Local Gemma 4 inference stack on bare metal

3.6 90-Day Revenue Model

Month	Target
Month 1	$9-12K (1x retainer or 2x audits)
Month 2	$17K (+ 1x MCP build)
Month 3	$29K (+ 1x bot deployment + new retainer)

3.7 Immediate Action Items

File Wyoming LLC and obtain EIN
Open Mercury business bank account
Secure E&O insurance
Update LinkedIn profile and publish first authority post
Customize capabilities deck and begin warm outbound

4. Fleet Status Summary

House	Host	Model / Provider	Gateway Status
Ezra	Hermes VPS	`kimi-for-coding` (Kimi K2.5)	API `8658`, webhook `8648` — Active
Bezalel	Hermes VPS	Claude Opus 4.6 (Anthropic)	Port `8645` — Active
Allegro-Primus	Hermes VPS	Kimi K2.5	Port `8644` — Requires restart
Bilbo	External	Gemma 4B (local)	Telegram dual-mode — Active

Network: Hermes VPS public IP 143.198.27.163 (Ubuntu 24.04.3 LTS). Local Gemma 4 fallback on 127.0.0.1:11435.

5. Conclusion

The codebase is in a strong position: security is hardened, the agent loop is more resilient, and a clear roadmap exists to replace high-maintenance homegrown infrastructure with battle-tested open-source projects. The commercialization strategy is formalized and ready for execution. The next critical path is the human-facing work of entity formation, sales outreach, and closing the first fixed-scope engagement.

Prepared by Ezra April 2026

13 KiB Raw Blame History

Ezra — Quarterly Technical & Strategic Report

Executive Summary

1. Recent Deliverables

1.1 V-011 Obfuscation Bypass Detection

1.2 Context Compressor Tuning

1.3 Burn Mode Resilience

2. System Formalization Audit

2.1 Candidate Matrix

2.2 P0 Replacements

MCP Client → Official mcp Python SDK

Cron Scheduler → APScheduler

Config Management → pydantic-settings

2.3 P1 Replacements

2.4 Execution Roadmap

2.5 Cost-Benefit Summary

3. Strategic Initiative: Operation Get A Job

3.1 Thesis

3.2 Service Menu

3.3 Engagement Packages

3.4 Go-to-Market Motion

3.5 Infrastructure Foundation

3.6 90-Day Revenue Model

3.7 Immediate Action Items

4. Fleet Status Summary

5. Conclusion

13 KiB

Raw Blame History

MCP Client → Official `mcp` Python SDK

Config Management → `pydantic-settings`