hermes-agent/scripts/gen_agent_cert.sh
Alexander Whitestone 4214082fb6
feat: A2A auth — mutual TLS between fleet agents
Implements mTLS for securing agent-to-agent communication in the Hermes
fleet. Fixes #806.

Changes:
- scripts/gen_fleet_ca.sh: generate a self-signed Fleet CA (4096-bit RSA,
  10-year validity) that signs all agent certificates
- scripts/gen_agent_cert.sh: generate per-agent certs (Timmy, Allegro,
  Ezra) signed by the fleet CA with SAN entries and clientAuth/serverAuth
  extended key usage
- agent/mtls.py: new module providing:
  - build_server_ssl_context() — TLS_SERVER context with CERT_REQUIRED,
    enforces client cert against Fleet CA
  - build_client_ssl_context() — TLS_CLIENT context for outbound A2A calls
  - MTLSMiddleware — ASGI middleware that rejects unauthenticated requests
    to A2A routes (/.well-known/agent-card*, /api/agent-card, /a2a/) with
    HTTP 403 when mTLS is enabled
  - is_mtls_configured() — checks HERMES_MTLS_CERT/KEY/CA env vars
- hermes_cli/web_server.py: wire MTLSMiddleware into the FastAPI app;
  pass SSL context to uvicorn when HERMES_MTLS_* env vars are set so
  the server runs TLS with mandatory client cert verification
- ansible/roles/hermes_mtls/: Ansible role to distribute Fleet CA cert,
  agent cert, and agent key to fleet nodes; writes an env file with
  HERMES_MTLS_* vars and restarts the hermes-gateway service
- ansible/fleet_mtls.yml: fleet-wide playbook referencing the role for
  Timmy, Allegro, and Ezra nodes
- tests/test_mtls.py: 15 tests covering is_mtls_configured, SSL context
  creation with real cryptography-generated certs, and MTLSMiddleware
  (unauthorized agent rejected → 403, authorized agent accepted → 200)

mTLS is opt-in: set HERMES_MTLS_CERT, HERMES_MTLS_KEY, and HERMES_MTLS_CA
to enable. When unset, the server behaves exactly as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 18:04:00 -04:00
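
To make the opt-in concrete, here is a minimal sketch of enabling mTLS for an agent named timmy. It assumes the default output paths used by scripts/gen_agent_cert.sh and that gen_fleet_ca.sh writes fleet-ca.crt into ~/.hermes/pki/ca. The helper mtls_env_configured is hypothetical: it only mirrors the documented behavior of is_mtls_configured() (all three variables set) and is not the module's code.

```shell
#!/usr/bin/env bash
# Assumed paths: the defaults produced by gen_fleet_ca.sh and gen_agent_cert.sh.
export HERMES_MTLS_CERT="${HOME}/.hermes/pki/agents/timmy/timmy.crt"
export HERMES_MTLS_KEY="${HOME}/.hermes/pki/agents/timmy/timmy.key"
export HERMES_MTLS_CA="${HOME}/.hermes/pki/ca/fleet-ca.crt"

# Hypothetical helper: succeeds only when all three variables are non-empty.
mtls_env_configured() {
    [[ -n "${HERMES_MTLS_CERT:-}" && -n "${HERMES_MTLS_KEY:-}" && -n "${HERMES_MTLS_CA:-}" ]]
}

if mtls_env_configured; then
    echo "mTLS env vars set"
else
    echo "mTLS env vars missing; server would behave as before"
fi
```

Leaving any one of the three variables unset should keep the server on its previous, non-mTLS behavior, per the commit message above.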

130 lines
3.5 KiB
Bash

#!/usr/bin/env bash
# gen_agent_cert.sh — Generate a TLS certificate for a fleet agent.
#
# Usage:
# ./scripts/gen_agent_cert.sh --agent <name> [--ca-dir <dir>] [--out-dir <dir>]
#
# Known agents: timmy, allegro, ezra (case-insensitive; any name is accepted)
#
# Outputs (default: ~/.hermes/pki/agents/<name>/):
# <name>.key — agent private key (chmod 600, stays on the agent host)
# <name>.crt — agent certificate (signed by the fleet CA)
#
# Run gen_fleet_ca.sh first if you haven't already.
# Refs #806
set -euo pipefail
CERT_DAYS=365 # 1 year; rotate annually
KEY_BITS=2048
# ---------------------------------------------------------------------------
# Parse args
# ---------------------------------------------------------------------------
AGENT_NAME=""
CA_DIR="${HOME}/.hermes/pki/ca"
OUT_DIR=""
while [[ $# -gt 0 ]]; do
    case "$1" in
        --agent)   AGENT_NAME="${2,,}"; shift 2 ;;  # lower-case
        --ca-dir)  CA_DIR="$2"; shift 2 ;;
        --out-dir) OUT_DIR="$2"; shift 2 ;;
        -h|--help)
            echo "Usage: $0 --agent <name> [--ca-dir <dir>] [--out-dir <dir>]"
            echo " Known agents: timmy, allegro, ezra"
            exit 0
            ;;
        *)
            echo "Unknown option: $1" >&2
            exit 1
            ;;
    esac
done
if [[ -z "$AGENT_NAME" ]]; then
    echo "ERROR: --agent <name> is required." >&2
    exit 1
fi
OUT_DIR="${OUT_DIR:-${HOME}/.hermes/pki/agents/${AGENT_NAME}}"
# ---------------------------------------------------------------------------
# Prereq check
# ---------------------------------------------------------------------------
if ! command -v openssl &>/dev/null; then
    echo "ERROR: openssl not found." >&2
    exit 1
fi
CA_KEY="$CA_DIR/fleet-ca.key"
CA_CRT="$CA_DIR/fleet-ca.crt"
if [[ ! -f "$CA_KEY" || ! -f "$CA_CRT" ]]; then
    echo "ERROR: Fleet CA not found in $CA_DIR" >&2
    echo " Run scripts/gen_fleet_ca.sh first." >&2
    exit 1
fi
mkdir -p "$OUT_DIR"
chmod 700 "$OUT_DIR"
AGENT_KEY="$OUT_DIR/${AGENT_NAME}.key"
AGENT_CRT="$OUT_DIR/${AGENT_NAME}.crt"
AGENT_CSR="$OUT_DIR/${AGENT_NAME}.csr"
if [[ -f "$AGENT_KEY" || -f "$AGENT_CRT" ]]; then
    echo "Cert for agent '$AGENT_NAME' already exists in $OUT_DIR"
    echo " $AGENT_KEY"
    echo " $AGENT_CRT"
    echo "Delete them manually if you want to regenerate."
    exit 0
fi
echo "Generating cert for agent '$AGENT_NAME' ..."
SUBJECT="/CN=${AGENT_NAME}.fleet.hermes/O=Hermes/OU=Fleet Agent"
# Agent private key
openssl genrsa -out "$AGENT_KEY" "$KEY_BITS" 2>/dev/null
chmod 600 "$AGENT_KEY"
# Register cleanup before creating the CSR so a failed step leaves no CSR behind
EXT_CONF=$(mktemp)
trap 'rm -f "$EXT_CONF" "$AGENT_CSR"' EXIT
# Certificate Signing Request
openssl req -new \
    -key "$AGENT_KEY" \
    -out "$AGENT_CSR" \
    -subj "$SUBJECT" 2>/dev/null
# Sign with fleet CA; include SAN so modern TLS stacks accept it
cat > "$EXT_CONF" <<EOF
[v3_agent]
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer
subjectAltName = DNS:${AGENT_NAME}.fleet.hermes, DNS:${AGENT_NAME}
EOF
openssl x509 -req \
    -in "$AGENT_CSR" \
    -CA "$CA_CRT" \
    -CAkey "$CA_KEY" \
    -CAcreateserial \
    -out "$AGENT_CRT" \
    -days "$CERT_DAYS" \
    -extfile "$EXT_CONF" \
    -extensions v3_agent 2>/dev/null
chmod 644 "$AGENT_CRT"
echo ""
echo "Agent cert generated:"
echo " Private key : $AGENT_KEY"
echo " Certificate : $AGENT_CRT"
echo ""
openssl x509 -in "$AGENT_CRT" -noout -subject -issuer -dates
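
After generating a cert, it can help to confirm that it chains to the CA and carries the expected extensions. The sketch below is self-contained and runs against a throwaway CA in a temp directory, so it does not touch ~/.hermes/pki. The demo CA, its config file, and the file names are assumptions for illustration, not the output of gen_fleet_ca.sh. The signing steps follow the same pattern as the script above.

```shell
#!/usr/bin/env bash
set -euo pipefail

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Demo-only config so the sketch does not depend on a system openssl.cnf.
cat > "$work/req.cnf" <<EOF
[req]
distinguished_name = dn
x509_extensions = v3_ca
prompt = no
[dn]
CN = Demo Fleet CA
[v3_ca]
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
EOF

# Throwaway CA standing in for fleet-ca.key / fleet-ca.crt.
openssl req -x509 -newkey rsa:2048 -nodes -config "$work/req.cnf" \
    -keyout "$work/ca.key" -out "$work/ca.crt" -days 1 2>/dev/null

# Agent key, CSR, and extension file, mirroring the script's steps.
openssl genrsa -out "$work/timmy.key" 2048 2>/dev/null
openssl req -new -config "$work/req.cnf" -key "$work/timmy.key" \
    -out "$work/timmy.csr" -subj "/CN=timmy.fleet.hermes/O=Hermes/OU=Fleet Agent"
cat > "$work/ext.cnf" <<EOF
[v3_agent]
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth
subjectAltName = DNS:timmy.fleet.hermes, DNS:timmy
EOF
openssl x509 -req -in "$work/timmy.csr" \
    -CA "$work/ca.crt" -CAkey "$work/ca.key" -CAcreateserial \
    -out "$work/timmy.crt" -days 1 \
    -extfile "$work/ext.cnf" -extensions v3_agent 2>/dev/null

# 1. Chain check: the agent cert must verify against the CA.
verify_out=$(openssl verify -CAfile "$work/ca.crt" "$work/timmy.crt")

# 2. Extension check: look for the SAN entry and the clientAuth usage.
cert_text=$(openssl x509 -in "$work/timmy.crt" -noout -text)
san_ok=no; eku_ok=no; expiry_ok=no
grep -q "DNS:timmy.fleet.hermes" <<<"$cert_text" && san_ok=yes
grep -q "TLS Web Client Authentication" <<<"$cert_text" && eku_ok=yes

# 3. Expiry check: succeeds if the cert is still valid one hour from now.
if openssl x509 -in "$work/timmy.crt" -noout -checkend 3600 >/dev/null; then
    expiry_ok=yes
fi

echo "$verify_out"
echo "SAN: $san_ok, clientAuth: $eku_ok, valid for the next hour: $expiry_ok"
```

Against a real fleet, the same three checks would point at the fleet CA cert and the generated agent cert instead of the temp files. A larger -checkend window (in seconds) could flag certs that are approaching the annual rotation.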