Files
hermes-agent/reports/ezra-quarterly-report-april-2026.md

13 KiB

Ezra — Quarterly Technical & Strategic Report

April 2026


Executive Summary

This report consolidates the principal technical and strategic outputs from Q1/Q2 2026. Three major workstreams are covered:

  1. Security & Performance Hardening — Shipped V-011 obfuscation detection and context-compressor tuning.
  2. System Formalization Audit — Identified ~6,300 lines of homegrown infrastructure that can be replaced by well-maintained open-source projects.
  3. Business Development — Formalized a pure-contracting go-to-market plan ("Operation Get A Job") to monetize the engineering collective.

1. Recent Deliverables

1.1 V-011 Obfuscation Bypass Detection

A significant security enhancement was shipped to the skills-guard subsystem to defeat obfuscated malicious skill code.

Technical additions:

  • normalize_input() with NFKC normalization, case folding, and zero-width character removal to defeat homoglyph and ZWSP evasion.
  • PythonSecurityAnalyzer AST visitor detecting eval/exec/compile, getattr dunder access, and imports of base64/codecs/marshal/types/ctypes.
  • Additional regex patterns for getattr builtins chains, __import__ os/subprocess, and nested base64 decoding.
  • Full integration into scan_file(); Python files now receive both normalized regex scanning and AST-based analysis.

Verification: All tests passing (103 passed, 4 warnings).

Reference: Forge PR #131 — [EPIC-999/Phase II] The Forge — V-011 obfuscation fix + compressor tuning

1.2 Context Compressor Tuning

The default protect_last_n parameter was reduced from 20 to 5. The previous default was overly conservative, preventing meaningful compression on long sessions. The new default preserves the five most recent conversational turns while allowing the compressor to effectively reduce token pressure.

A regression test was added verifying that the last five turns are never summarized away.

1.3 Burn Mode Resilience

The agent loop was enhanced with a configurable burn_mode flag that increases concurrent tool execution capacity and adds transient-failure retry logic.

Changes:

  • max_tool_workers increased from 8 to 16 in burn mode.
  • Expanded parallel tool coverage to include browser, vision, skill, and session-search tools.
  • Added batch timeout protection (300s in burn mode / 180s normal) to prevent hung threads from blocking the agent loop.
  • Thread-pool shutdown now uses executor.shutdown(wait=False) for immediate control return.
  • Transient errors (timeouts, rate limits, 502/503/504) trigger one automatic retry in burn mode.

2. System Formalization Audit

A comprehensive audit was performed across the hermes-agent codebase to identify homegrown modules that could be replaced by mature open-source alternatives. The objective is efficiency: reduce maintenance burden, leverage community expertise, and improve reliability.

2.1 Candidate Matrix

Priority Component Lines Current State Proposed Replacement Effort ROI
P0 MCP Client 2,176 Custom asyncio transport, sampling, schema translation mcp (official Python SDK) 2-3 wks Very High
P0 Cron Scheduler ~1,500 Custom JSON job store, manual tick loop APScheduler 1-2 wks Very High
P0 Config Management 2,589 Manual YAML loader, no type safety pydantic-settings + Pydantic v2 3-4 wks High
P1 Checkpoint Manager 548 Shells out to git binary dulwich (pure-Python git) 1 wk Medium-High
P1 Auth / Credential Pool ~3,800 Custom JWT decode, OAuth refresh, JSON auth store authlib + keyring + PyJWT 2-3 wks Medium
P1 Batch Runner 1,285 Custom multiprocessing.Pool wrapper joblib (local) or celery (distributed) 1-2 wks Medium
P2 SQLite Session Store ~2,400 Raw SQLite + FTS5, manual schema SQLAlchemy ORM + Alembic 2-3 wks Medium
P2 Trajectory Compressor 1,518 Custom tokenizer + summarization pipeline Keep core logic; add zstandard for binary storage 3 days Low-Medium
P2 Process Registry 889 Custom background process tracking Keep (adds too much ops complexity) Low
P2 Web Tools 2,080+ Firecrawl + Parallel wrappers Keep (Firecrawl is already best-in-class) Low

2.2 P0 Replacements

MCP Client → Official mcp Python SDK

Current: tools/mcp_tool.py (2,176 lines) contains custom stdio/HTTP transport lifecycle, manual anyio cancel-scope cleanup, hand-rolled schema translation, custom sampling bridge, credential stripping, and reconnection backoff.

Problem: The Model Context Protocol is evolving rapidly. Maintaining a custom 2K-line client means every protocol revision requires manual patches. The official SDK already handles transport negotiation, lifecycle management, and type-safe schema generation.

Migration Plan:

  1. Add mcp>=1.0.0 to dependencies.
  2. Build a thin HermesMCPBridge class that instantiates mcp.ClientSession, maps MCP Tool schemas to Hermes registry calls, forwards tool invocations, and preserves the sampling callback.
  3. Deprecate the _mcp_loop background thread and anyio-based transport code.
  4. Add integration tests against a test MCP server.

Lines Saved: ~1,600 Risk: Medium — sampling and timeout behavior need parity testing.

Cron Scheduler → APScheduler

Current: cron/jobs.py (753 lines) + cron/scheduler.py (~740 lines) use a JSON file as the job store, custom parse_duration and compute_next_run logic, a manual tick loop, and ad-hoc delivery orchestration.

Problem: Scheduling is a solved problem. The homegrown system lacks timezone support, job concurrency controls, graceful clustering, and durable execution guarantees.

Migration Plan:

  1. Introduce APScheduler with a SQLAlchemyJobStore (or custom JSON store).
  2. Refactor each Hermes cron job into an APScheduler Job function.
  3. Preserve existing delivery logic (_deliver_result, _build_job_prompt, _run_job_script) as the job body.
  4. Migrate jobs.json entries into APScheduler jobs on first run.
  5. Expose /cron status via a thin CLI wrapper.

Lines Saved: ~700 Risk: Low — delivery logic is preserved; only the trigger mechanism changes.

Config Management → pydantic-settings

Current: hermes_cli/config.py (2,589 lines) uses manual YAML parsing with hardcoded defaults, a complex migration chain (_config_version currently at 11), no runtime type validation, and stringly-typed env var resolution.

Problem: Every new config option requires touching multiple places. Migration logic is ~400 lines and growing. Typo'd config values are only caught at runtime, often deep in the agent loop.

Migration Plan:

  1. Define a HermesConfig Pydantic model with nested sections (ModelConfig, ProviderConfig, AgentConfig, CompressionConfig, etc.).
  2. Use pydantic-settings's SettingsConfigDict(yaml_file="~/.hermes/config.yaml") to auto-load.
  3. Map env vars via env_prefix="HERMES_" or field-level validation_alias.
  4. Keep the migration layer as a one-time upgrade function, then remove it after two releases.
  5. Replace load_config() call sites with HermesConfig() instantiation.

Lines Saved: ~1,500 Risk: Medium-High — large blast radius; every module reads config. Requires backward compatibility.

2.3 P1 Replacements

Checkpoint Manager → dulwich

  • Replace subprocess.run(["git", ...]) calls with dulwich.porcelain equivalents.
  • Use dulwich.repo.Repo.init_bare() for shadow repos.
  • Snapshotting becomes an in-memory Index write + commit().
  • Lines Saved: ~200
  • Risk: Low

Auth / Credential Pool → authlib + keyring + PyJWT

  • Use authlib for OAuth2 session and token refresh.
  • Replace custom JWT decoding with PyJWT.
  • Migrate the auth store JSON to keyring-backed secure storage where available.
  • Keep Hermes-specific credential pool strategies (round-robin, least-used, etc.).
  • Lines Saved: ~800
  • Risk: Medium

Batch Runner → joblib

  • For typical local batch sizes, joblib.Parallel(n_jobs=-1, backend='loky') replaces the custom worker pool.
  • Only migrate to Celery if cross-machine distribution is required.
  • Lines Saved: ~400
  • Risk: Low for joblib

2.4 Execution Roadmap

  1. Week 1-2: Migrate Checkpoint Manager to dulwich (quick win, low risk)
  2. Week 3-4: Migrate Cron Scheduler to APScheduler (high value, well-contained)
  3. Week 5-8: Migrate MCP Client to official mcp SDK (highest complexity, highest payoff)
  4. Week 9-12: Migrate Config Management to pydantic-settings (largest blast radius, do last)
  5. Ongoing: Evaluate Auth/Credential Pool and Batch Runner replacements as follow-up epics.

2.5 Cost-Benefit Summary

Metric Value
Total homebrew lines audited ~17,000
Lines recommended for replacement ~6,300
Estimated dev weeks (P0 + P1) 10-14 weeks
New runtime dependencies added 4-6 well-maintained packages
Maintenance burden reduction Very High
Risk level Medium (mitigated by strong test coverage)

3. Strategic Initiative: Operation Get A Job

3.1 Thesis

The engineering collective is capable of 10x delivery velocity compared to typical market offerings. The strategic opportunity is to monetize this capability through pure contracting — high-tempo, fixed-scope engagements with no exclusivity or employer-like constraints.

3.2 Service Menu

Tier A — White-Glove Agent Infrastructure ($400-600/hr)

  • Custom AI agent deployment with tool use (Slack, Discord, Telegram, webhooks)
  • MCP server development
  • Local LLM stack setup (on-premise / VPC)
  • Agent security audit and red teaming

Tier B — Security Hardening & Code Review ($250-400/hr)

  • Security backlog burn-down (CVE-class bugs)
  • Skills-guard / sandbox hardening
  • Architecture review

Tier C — Automation & Integration ($150-250/hr)

  • Webhook-to-action pipelines
  • Research and intelligence reporting
  • Content-to-code workflows

3.3 Engagement Packages

Service Description Timeline Investment
Agent Security Audit Review of one AI agent pipeline + written findings 2-3 business days $4,500
MCP Server Build One custom MCP server with 3-5 tools + docs + tests 1-2 weeks $8,000
Custom Bot Deployment End-to-end bot with up to 5 tools, deployed to client platform 2-3 weeks $12,000
Security Sprint Close top 5 security issues in a Python/JS repo 1-2 weeks $6,500
Monthly Retainer — Core 20 hrs/month prioritized engineering + triage Ongoing $6,000/mo
Monthly Retainer — Scale 40 hrs/month prioritized engineering + on-call Ongoing $11,000/mo

3.4 Go-to-Market Motion

Immediate channels:

  • Cold outbound to CTOs/VPEs at Series A-C AI startups
  • LinkedIn authority content (architecture reviews, security bulletins)
  • Platform presence (Gun.io, Toptal, Upwork for specific niche keywords)

Lead magnet: Free 15-minute architecture review. No pitch. One concrete risk identified.

3.5 Infrastructure Foundation

The Hermes Agent framework serves as both the delivery platform and the portfolio piece:

  • Open-source runtime with ~3,000 tests
  • Gateway architecture supporting 8+ messaging platforms
  • Native MCP client, cron scheduling, subagent delegation
  • Self-hosted Forge (Gitea) with CI and automated PR review
  • Local Gemma 4 inference stack on bare metal

3.6 90-Day Revenue Model

Month Target
Month 1 $9-12K (1x retainer or 2x audits)
Month 2 $17K (+ 1x MCP build)
Month 3 $29K (+ 1x bot deployment + new retainer)

3.7 Immediate Action Items

  • File Wyoming LLC and obtain EIN
  • Open Mercury business bank account
  • Secure E&O insurance
  • Update LinkedIn profile and publish first authority post
  • Customize capabilities deck and begin warm outbound

4. Fleet Status Summary

House Host Model / Provider Gateway Status
Ezra Hermes VPS kimi-for-coding (Kimi K2.5) API 8658, webhook 8648 — Active
Bezalel Hermes VPS Claude Opus 4.6 (Anthropic) Port 8645 — Active
Allegro-Primus Hermes VPS Kimi K2.5 Port 8644 — Requires restart
Bilbo External Gemma 4B (local) Telegram dual-mode — Active

Network: Hermes VPS public IP 143.198.27.163 (Ubuntu 24.04.3 LTS). Local Gemma 4 fallback on 127.0.0.1:11435.


5. Conclusion

The codebase is in a strong position: security is hardened, the agent loop is more resilient, and a clear roadmap exists to replace high-maintenance homegrown infrastructure with battle-tested open-source projects. The commercialization strategy is formalized and ready for execution. The next critical path is the human-facing work of entity formation, sales outreach, and closing the first fixed-scope engagement.

Prepared by Ezra April 2026