Multi-Agent Coordination SOTA Research Report
Fleet Knowledge Graph — Architecture Patterns & Integration Recommendations
Date: 2025-04-14
Scope: Agent-to-agent communication, shared memory, task delegation, consensus protocols, conflict resolution
Frameworks Analyzed: CrewAI, AutoGen, MetaGPT, ChatDev, CAMEL, LangGraph
Target Fleet: Hermes (orchestrator), Timmy, Claude Code, Gemini, Kimi
1. EXECUTIVE SUMMARY
Six major multi-agent frameworks each solve coordination differently. The SOTA converges on four core patterns: role-based delegation with capability matching, shared state via publish-subscribe messaging, directed-graph task flows with conditional routing, and layered memory (short-term context + long-term knowledge graph). For our fleet, the optimal architecture combines AutoGen's GraphFlow (DAG-based task routing), CrewAI's hierarchical memory (short-term RAG + long-term SQLite + entity memory), MetaGPT's standardized output contracts (typed task artifacts), and CAMEL's role-playing delegation protocol (inception-prompted agent negotiation).
2. FRAMEWORK-BY-FRAMEWORK ANALYSIS
2.1 CrewAI (v1.14.x) — Role-Based Crews with Hierarchical Orchestration
Core Architecture:
- Process modes: `Process.sequential` (tasks execute in order), `Process.hierarchical` (manager agent delegates to workers)
- Agent delegation: `allow_delegation=True` enables agents to call other agents as tools, selecting the best agent for subtasks
- Memory system: crew-level `memory=True` enables UnifiedMemory with:
  - Short-term: RAG-backed (embeddings → vector store) for recent task context
  - Long-term: SQLite-backed for persistent task outcomes
  - Entity memory: tracks entities (people, companies, concepts) across tasks
  - User memory: per-user preference tracking
  - Embedder: configurable (OpenAI, Cohere, Jina, local ONNX, etc.)
- Knowledge sources: `knowledge_sources=[StringKnowledgeSource(...)]` for RAG-grounded context per agent or crew
- Flows: `@start`, `@listen`, `@router` decorators for DAG orchestration across crews; `or_()` and `and_()` combinators for conditional triggers
- Callbacks: `before_kickoff_callbacks`, `after_kickoff_callbacks`, `step_callback`, `task_callback`
Key Patterns for Fleet:
- Delegation-as-tool: Agents can invoke other agents by role → our fleet agents could expose themselves as callable tools to each other
- Sequential handoff: Task output from Agent A feeds directly as input to Agent B → pipeline pattern
- Hierarchical manager: A manager LLM decomposes goals and assigns tasks → matches Hermes-as-orchestrator pattern
- Shared memory with scopes: Crew-level memory visible to all agents, agent-level memory private
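CrewAI's delegation-as-tool idea can be shown framework-free: each agent exposes its peers as callables keyed by role name. A minimal sketch (the `FleetAgent` class and its methods are invented here for illustration, not CrewAI's API):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class FleetAgent:
    """Toy agent that can expose peers as callable tools (illustrative only)."""
    name: str
    handler: Callable[[str], str]              # the agent's own task logic
    peers: Dict[str, "FleetAgent"] = field(default_factory=dict)

    def register_peer(self, agent: "FleetAgent") -> None:
        self.peers[agent.name] = agent

    def delegate(self, peer_name: str, task: str) -> str:
        # Invoke another agent by role name, the way allow_delegation
        # lets a CrewAI agent call a colleague as a tool.
        return self.peers[peer_name].handler(task)

coder = FleetAgent("claude_code", handler=lambda t: f"code for: {t}")
orchestrator = FleetAgent("hermes", handler=lambda t: t)
orchestrator.register_peer(coder)
result = orchestrator.delegate("claude_code", "parse config")
```

In a real crew the `handler` would be an LLM call; the point is only that delegation reduces to a registry of role-keyed callables.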
Limitations:
- No native inter-process communication — all agents live in the same process
- Manager/hierarchical mode requires an LLM call just for delegation decisions (extra latency/cost)
- No built-in conflict resolution for concurrent writes to shared memory
2.2 AutoGen (v0.7.5) — Flexible Team Topologies with Graph-Based Coordination
Core Architecture:
- Team topologies (5 types):
  - `RoundRobinGroupChat`: sequential turn-taking, each agent speaks in order
  - `SelectorGroupChat`: LLM selects the next speaker based on conversation context (`selector_prompt` template)
  - `MagenticOneGroupChat`: orchestrator-driven (from Microsoft's Magentic-One paper), with stall detection and replanning
  - `Swarm`: handoff-based — the current speaker explicitly hands off to a target via `HandoffMessage`
  - `GraphFlow`: directed-graph execution — agents execute along graph edges with conditional routing, fan-out, join patterns, and loop support
- Agent types:
  - `AssistantAgent`: standard LLM agent with tools
  - `CodeExecutorAgent`: runs code in isolated environments
  - `UserProxyAgent`: human-in-the-loop proxy
  - `SocietyOfMindAgent`: meta-agent — wraps an inner team and summarizes its output as a single response (composable nesting)
  - `MessageFilterAgent`: filters/transforms messages between agents
- Termination conditions: `TextMentionTermination`, `MaxMessageTermination`, `SourceMatchTermination`, `HandoffTermination`, `TimeoutTermination`, `FunctionCallTermination`, `TokenUsageTermination`, `ExternalTermination` (programmatic control), `FunctionalTermination` (custom function)
- Memory: `Sequence[Memory]` on agents — per-agent memory stores (RAG-backed)
- GraphFlow specifics:
  - `DiGraphBuilder.add_node(agent, activation='all'|'any')`
  - `DiGraphBuilder.add_edge(source, target, condition=callable|str)` — conditional edges
  - `set_entry_point(agent)` — defines the graph root
  - Supports sequential, parallel fan-out, conditional branching, join patterns, and loops with exit conditions
- Node activation: `'all'` (wait for all incoming edges) vs `'any'` (trigger on first)
Key Patterns for Fleet:
- GraphFlow is the SOTA pattern for multi-agent orchestration — DAG-based, conditional, supports parallel branches and joins
- SocietyOfMindAgent enables hierarchical composition — a team of agents wrapped as a single agent that can participate in a larger team
- Selector pattern (LLM picks next speaker) is elegant for heterogeneous fleets where capability matching matters
- Swarm handoff maps directly to our ACP handoff mechanism
- Termination conditions are composable — `termination_a | termination_b` (OR), `termination_a & termination_b` (AND)
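The `'all'`/`'any'` activation semantics are easiest to see in a toy executor. Below is a pure-Python sketch of GraphFlow-style fan-out/join; `run_graph` and its signature are our illustration, not AutoGen's API:

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

def run_graph(nodes: Dict[str, Callable[[list], str]],
              edges: List[Tuple[str, str]],
              entry: str,
              activation: Dict[str, str]) -> Dict[str, str]:
    """Tiny DAG executor: 'all' joins wait for every parent result,
    'any' joins fire on the first completed parent. Illustration only."""
    parents, children = defaultdict(set), defaultdict(list)
    for src, dst in edges:
        parents[dst].add(src)
        children[src].append(dst)

    results: Dict[str, str] = {}
    ready, fired = [entry], set()
    while ready:
        node = ready.pop(0)
        if node in fired:
            continue
        fired.add(node)
        inputs = [results[p] for p in parents[node] if p in results]
        results[node] = nodes[node](inputs)
        for child in children[node]:
            done = sum(1 for p in parents[child] if p in results)
            need = len(parents[child]) if activation.get(child, "all") == "all" else 1
            if done >= need:
                ready.append(child)
    return results

# Fan-out from an orchestrator to two workers, then an 'all' join.
nodes = {
    "hermes": lambda _: "plan",
    "code":   lambda inp: "code<-" + inp[0],
    "docs":   lambda inp: "docs<-" + inp[0],
    "join":   lambda inp: "+".join(sorted(inp)),
}
edges = [("hermes", "code"), ("hermes", "docs"), ("code", "join"), ("docs", "join")]
out = run_graph(nodes, edges, "hermes", {"join": "all"})
```

The `join` node only runs once both branches have produced results; switching its activation to `'any'` would make it fire on whichever branch finishes first.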
2.3 MetaGPT — SOP-Driven Multi-Agent with Standardized Artifacts
Core Architecture (from paper + codebase):
- SOP (Standard Operating Procedure): Tasks decomposed into phases, each with specific roles and required artifacts
- Role-based agents: each role has `name`, `profile`, `goal`, `constraints`, and `actions` (specific output types)
- Shared message environment: all agents publish to and subscribe from a shared `Environment` object
- Publish-subscribe: agents subscribe to the message types/topics they care about and ignore the rest
- Standardized output: each action produces a typed artifact (e.g., `SystemDesign`, `Task`, `Code`) — structured contracts between agents
- Memory: a `Memory` class stores all messages, retrievable by relevance; `Role.react()` calls `observe()` then `act()` based on observed messages
- Communication: asynchronous message passing — agents publish results to the environment, and interested agents react
Key Patterns for Fleet:
- Typed artifact contracts: Each agent publishes structured outputs (not free-form text) → reduces ambiguity in inter-agent communication
- Pub-sub messaging: Decouples sender from receiver — agents don't need to know about each other, just subscribe to relevant topics
- SOP-driven phases: Define workflow phases (e.g., "analysis" → "implementation" → "review") with specific agents per phase
- Environment as blackboard: Shared state all agents can read/write — classic blackboard architecture for AI systems
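The environment-as-blackboard pattern reduces to very little code. A hedged sketch (class and method names are ours, not MetaGPT's):

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List

@dataclass(frozen=True)
class Message:
    topic: str          # artifact type, e.g. "SystemDesign", "Code"
    sender: str
    content: str

class Environment:
    """Blackboard: agents publish typed messages; subscribers react. Sketch only."""
    def __init__(self) -> None:
        self.history: List[Message] = []
        self.subscribers: DefaultDict[str, List[Callable[[Message], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Message], None]) -> None:
        self.subscribers[topic].append(callback)

    def publish(self, msg: Message) -> None:
        self.history.append(msg)                 # everything lands on the blackboard
        for cb in self.subscribers[msg.topic]:   # only interested agents react
            cb(msg)

env = Environment()
received: List[Message] = []
env.subscribe("Code", received.append)           # a reviewer only cares about Code
env.publish(Message("SystemDesign", "architect", "schema v1"))
env.publish(Message("Code", "coder", "def f(): ..."))
```

The decoupling is the point: the coder never names the reviewer, it just publishes to a topic.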
2.4 ChatDev — Chat-Chain Architecture for Software Development
Core Architecture:
- Chat Chain: Sequential phases (design → code → test → document), each phase is a two-agent conversation
- Role pairing: Each phase pairs complementary roles (e.g., CEO ↔ CTO, Programmer ↔ Reviewer)
- Communicative dehallucination: Agents communicate through structured prompts that constrain outputs to prevent hallucination
- Phase transitions: Phase completion triggers next phase, output from one phase seeds the next
- Memory: Conversation history within each phase; phase outputs stored as artifacts
Key Patterns for Fleet:
- Phase-gated pipeline: Each phase must produce a specific artifact type before proceeding
- Complementary role pairing: Pair agents with opposing perspectives (creator ↔ reviewer) for higher quality
- Communicative protocols: Structured conversation templates reduce free-form ambiguity
2.5 CAMEL — Role-Playing Autonomous Multi-Agent Communication
Core Architecture:
- RolePlaying society: Two agents (assistant + user) collaborate with inception prompting
- Task specification: `with_task_specify=True` uses a task-specify agent to refine the initial prompt into a concrete task
- Task planning: `with_task_planner=True` adds a planning agent that decomposes the task
- Critic-in-the-loop: `with_critic_in_the_loop=True` adds a critic agent that evaluates and approves/rejects
- Inception prompting: both agents receive system messages that establish their roles, goals, and communication protocol
- Termination: Agents signal completion via specific tokens or phrases
Key Patterns for Fleet:
- Inception prompting: Agents negotiate a shared understanding of the task before executing
- Critic-in-the-loop: A dedicated reviewer agent validates outputs before acceptance
- Role-playing protocol: Structured back-and-forth between complementary agents
- Task refinement chain: Raw goal → specified task → planned subtasks → executed
2.6 LangGraph — Graph-Based Stateful Agent Workflows
Core Architecture (from documentation/paper):
- StateGraph: Typed state schema shared across all nodes (agents/tools)
- Nodes: Functions (agents, tools, transforms) that read/modify shared state
- Edges: Conditional routing based on state or agent decisions
- Checkpointer: Persistent state snapshots (SQLite, Postgres, in-memory) — enables pause/resume
- Human-in-the-loop: Interrupt nodes for approval, edit, review
- Streaming: Real-time node-by-node or token-by-token output
- Subgraphs: Composable graph composition — subgraph as a node in parent graph
- State channels: Multiple state namespaces for different aspects of the workflow
Key Patterns for Fleet:
- Shared typed state: All agents operate on a well-defined state schema — eliminates ambiguity about what data each agent sees
- Checkpoint persistence: Workflow can be paused, resumed, forked — critical for long-running agent tasks
- Conditional edges: Route based on agent output type or state values
- Subgraph composition: Each fleet agent could be a subgraph, composed into larger workflows
- Command-based routing: nodes return `Command(goto="node_name", update={...})` for explicit control flow
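Command-based routing can be mimicked with a small loop over a shared state dict. A sketch using a simplified `Command` type modeled loosely on LangGraph's (not the real API):

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Command:
    goto: Optional[str]                         # next node name, or None to stop
    update: Dict[str, str] = field(default_factory=dict)

def run(nodes: Dict[str, Callable[[dict], Command]],
        entry: str, state: dict, max_steps: int = 20) -> dict:
    """Each node reads the shared state and returns a Command that both
    patches the state and names the next node. Illustration only."""
    current: Optional[str] = entry
    for _ in range(max_steps):                  # guard against routing loops
        if current is None:
            break
        cmd = nodes[current](state)
        state.update(cmd.update)
        current = cmd.goto
    return state

nodes = {
    "plan":   lambda s: Command(goto="code",   update={"plan": "split into 2 steps"}),
    "code":   lambda s: Command(goto="review", update={"code": "done"}),
    "review": lambda s: Command(goto=None,     update={"approved": "yes"}),
}
final = run(nodes, "plan", {})
```

Because routing decisions live in node return values rather than a static edge list, a node can branch on state at runtime (e.g. `goto="code"` again if review fails).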
3. CROSS-CUTTING PATTERNS ANALYSIS
3.1 Agent-to-Agent Communication
| Pattern | Frameworks | Latency | Decoupling | Structured |
|---|---|---|---|---|
| Direct tool invocation | CrewAI, AutoGen | Low | Low | Medium |
| Pub-sub messaging | MetaGPT | Medium | High | High |
| Handoff messages | AutoGen Swarm | Low | Medium | High |
| Chat-chain conversations | ChatDev, CAMEL | High | Low | Medium |
| Shared state graph | LangGraph, AutoGen GraphFlow | Low | Medium | High |
Recommendation: Use handoff + shared state pattern. Agents communicate via typed handoff messages (what task was completed, what artifacts produced) while sharing a typed state object (knowledge graph entries).
3.2 Shared Memory Patterns
| Pattern | Frameworks | Persistence | Scope | Query Method |
|---|---|---|---|---|
| RAG-backed short-term | CrewAI, AutoGen | Session | Crew/Team | Embedding similarity |
| SQLite long-term | CrewAI | Cross-session | Global | SQL + embeddings |
| Entity memory | CrewAI | Cross-session | Global | Entity lookup |
| Message store | MetaGPT | Session | Environment | Relevance search |
| Typed state channels | LangGraph | Checkpointed | Graph | State field access |
| Frozen snapshot | Hermes (current) | Cross-session | Agent | System prompt injection |
Recommendation: Implement three-tier memory:
- Session state (LangGraph-style typed state graph) — shared within a workflow
- Fleet knowledge graph (new) — structured triples/relations between entities, projects, decisions
- Agent-local memory (existing MEMORY.md pattern) — per-agent persistent notes
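The three tiers could be held in one structure per workflow. A sketch with illustrative placeholder names (this is not an existing Hermes API):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class ThreeTierMemory:
    """Sketch of the recommended layering; names are placeholders."""
    session_state: Dict[str, str] = field(default_factory=dict)        # volatile, per-workflow
    knowledge_graph: List[Tuple[str, str, str]] = field(default_factory=list)  # (subject, relation, object) triples
    agent_notes: Dict[str, List[str]] = field(default_factory=dict)    # per-agent MEMORY.md analogue

    def record_decision(self, agent: str, subject: str, relation: str, obj: str) -> None:
        # A decision lands in both the shared graph and the author's local notes.
        self.knowledge_graph.append((subject, relation, obj))
        self.agent_notes.setdefault(agent, []).append(f"{subject} {relation} {obj}")

mem = ThreeTierMemory()
mem.session_state["current_task"] = "refactor auth"
mem.record_decision("hermes", "auth-module", "depends-on", "session-store")
```

In the real design the graph tier would be the SQLite-backed FKG of §4.5 rather than an in-memory list.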
3.3 Task Delegation
| Pattern | Frameworks | Decision Maker | Granularity |
|---|---|---|---|
| Manager decomposition | CrewAI hierarchical | Manager LLM | Task-level |
| Delegation-as-tool | CrewAI | Self-selecting | Subtask |
| Selector-based | AutoGen SelectorGroupChat | LLM selector | Turn-level |
| Handoff-based | AutoGen Swarm | Current agent | Message-level |
| Graph-defined | AutoGen GraphFlow, LangGraph | Pre-defined DAG | Node-level |
| SOP-based | MetaGPT | Phase rules | Phase-level |
Recommendation: Use hybrid delegation:
- Graph-based for known workflows (CI/CD, code review pipelines) — pre-defined DAGs
- Selector-based for exploratory tasks (research, debugging) — LLM picks best agent
- Handoff-based for agent-initiated delegation — current agent explicitly hands off
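The non-LLM half of selector-based routing is plain capability matching. A sketch with a hypothetical capability registry (the real one would come from the fleet agent registry):

```python
from typing import Dict, List

# Hypothetical registry: agent name -> declared capabilities.
AGENT_CAPABILITIES: Dict[str, List[str]] = {
    "claude_code": ["code", "debugging", "architecture"],
    "gemini":      ["vision", "analysis", "large_context"],
    "kimi":        ["code", "long_context"],
}

def select_agent(task_tags: List[str]) -> str:
    """Heuristic fallback: pick the agent with the largest capability overlap.
    A deployed selector would put an LLM decision in front of this."""
    def score(agent: str) -> int:
        return len(set(task_tags) & set(AGENT_CAPABILITIES[agent]))
    return max(AGENT_CAPABILITIES, key=score)

choice = select_agent(["vision", "analysis"])
```

Ties fall back to registry order here; a production router would also weigh availability and past performance (§5.4).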
3.4 Consensus Protocols
No framework implements true consensus protocols (Raft, PBFT). Instead:
| Pattern | What It Solves |
|---|---|
| Critic-in-the-loop (CAMEL) | Single reviewer approves/rejects |
| Aggregator synthesis (MoA/Mixture-of-Agents) | Multiple responses synthesized into one |
| Hierarchical manager (CrewAI) | Manager makes final decision |
| MagenticOne orchestrator (AutoGen) | Orchestrator plans and replans |
Recommendation for Fleet: Implement weighted ensemble consensus:
- Multiple agents produce independent solutions
- A synthesis agent aggregates (like MoA pattern already in Hermes)
- For critical decisions, require 2-of-3 agreement from designated expert agents
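The 2-of-3 rule amounts to a few lines of counting. A sketch (agent names and answers are placeholders):

```python
from collections import Counter
from typing import Dict, Optional

def consensus(votes: Dict[str, str], min_agreement: int = 2) -> Optional[str]:
    """k-of-n agreement over expert agents' answers.
    Returns None when no answer reaches the threshold, signalling
    escalation to the synthesis agent (or a human)."""
    answer, count = Counter(votes.values()).most_common(1)[0]
    return answer if count >= min_agreement else None

votes = {"claude_code": "approach A", "gemini": "approach A", "kimi": "approach B"}
decision = consensus(votes)        # 2-of-3 agree on "approach A"
```

Weighted voting would replace the plain `Counter` with per-agent expertise weights, but the threshold-or-escalate shape stays the same.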
3.5 Conflict Resolution
| Conflict Type | Resolution Strategy |
|---|---|
| Concurrent memory writes | File locking + atomic rename (Hermes already does this) |
| Conflicting agent outputs | Critic/validator agent evaluates both |
| Task assignment conflicts | Single orchestrator (Hermes) assigns, no self-assignment |
| State graph race conditions | LangGraph checkpoint + merge strategies |
Recommendation:
- Write conflicts: Atomic operations with optimistic locking (existing pattern)
- Output conflicts: Dedicate one agent as "judge" for each workflow
- Assignment conflicts: Centralized orchestrator (Hermes) — no agent self-delegation to other fleet members without approval
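Optimistic locking for FKG writes needs only a version column and a conditional UPDATE. A self-contained sqlite3 sketch (the table and column names are illustrative, not the real Hermes schema):

```python
import sqlite3

# In-memory stand-in for the FKG database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, value TEXT, version INTEGER)")
db.execute("INSERT INTO entities VALUES ('project-x', 'v0', 1)")

def optimistic_write(conn: sqlite3.Connection, entity_id: str,
                     new_value: str, expected_version: int) -> bool:
    """Succeeds only if no other agent bumped the version since we read it."""
    cur = conn.execute(
        "UPDATE entities SET value = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_value, entity_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1    # 0 rows updated means conflict: caller re-reads and retries

first = optimistic_write(db, "project-x", "v1", expected_version=1)   # wins
second = optimistic_write(db, "project-x", "v2", expected_version=1)  # stale, rejected
```

The losing writer re-reads the row (now at version 2) and decides whether to retry or raise a `concurrent_write` conflict for the critic to resolve.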
4. FLEET ARCHITECTURE RECOMMENDATION
4.1 Proposed Architecture: "Fleet Knowledge Graph" (FKG)
┌─────────────────────────────────────────────────────────────┐
│ FLEET KNOWLEDGE GRAPH │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Entities │ │ Relations│ │ Artifacts│ │ Decisions│ │
│ │ (nodes) │──│ (edges) │──│ (typed) │──│ (history)│ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ Storage: SQLite + FTS5 (existing hermes_state.py pattern) │
│ Schema: RDF-lite triples with typed properties │
└─────────────────────┬───────────────────────────────────────┘
│
┌───────────┼───────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌───▼─────┐
│ Session │ │ Agent │ │ Workflow│
│ State │ │ Memory │ │ History │
│ (shared)│ │ (local) │ │ (audit) │
└─────────┘ └─────────┘ └─────────┘
4.2 Fleet Member Roles
| Agent | Role | Strengths | Delegation Style |
|---|---|---|---|
| Hermes | Orchestrator | Planning, tool use, multi-platform | Delegator (spawns others) |
| Claude Code | Code specialist | Deep code reasoning, ACP integration | Executor (receives tasks) |
| Gemini | Multimodal analyst | Vision, large context, fast | Executor (receives tasks) |
| Kimi | Coding assistant | Code generation, long context | Executor (receives tasks) |
| Timmy | (Details TBD) | TBD | Executor (receives tasks) |
4.3 Communication Protocol
Inter-Agent Message Format (inspired by MetaGPT's typed artifacts):
{
"message_type": "task_request|task_response|handoff|knowledge_update|conflict",
"source_agent": "hermes",
"target_agent": "claude_code",
"task_id": "uuid",
"parent_task_id": "uuid|null",
"payload": {
"goal": "...",
"context": "...",
"artifacts": [{"type": "code", "path": "..."}, {"type": "analysis", "content": "..."}],
"constraints": ["..."],
"priority": "high|medium|low"
},
"knowledge_graph_refs": ["entity:project-x", "relation:depends-on"],
"timestamp": "ISO8601",
"signature": "hmac-or-uuid"
}
4.4 Task Flow Patterns
Pattern 1: Pipeline (ChatDev-style)
Hermes → [Analyze] → Claude Code → [Implement] → Gemini → [Review] → Hermes → [Deliver]
Pattern 2: Fan-out/Fan-in (AutoGen GraphFlow-style)
┌→ Claude Code (code) ──┐
Hermes ──┼→ Gemini (analysis) ───┼→ Hermes (synthesize)
└→ Kimi (docs) ─────────┘
Pattern 3: Debate (CAMEL-style)
Claude Code (proposal) ↔ Gemini (critic) → Hermes (judge)
Pattern 4: Selector (AutoGen SelectorGroupChat)
Hermes (orchestrator) → LLM selects best agent → Agent executes → Result → Repeat
4.5 Knowledge Graph Schema
-- Core entities
CREATE TABLE fkg_entities (
id TEXT PRIMARY KEY,
entity_type TEXT NOT NULL, -- 'project', 'file', 'agent', 'task', 'concept', 'decision'
name TEXT NOT NULL,
properties JSON, -- Flexible typed properties
created_by TEXT, -- Agent that created this
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Relations between entities
CREATE TABLE fkg_relations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
source_entity TEXT REFERENCES fkg_entities(id),
target_entity TEXT REFERENCES fkg_entities(id),
relation_type TEXT NOT NULL, -- 'depends-on', 'created-by', 'reviewed-by', 'part-of', 'conflicts-with'
properties JSON,
confidence REAL DEFAULT 1.0,
created_by TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Task execution history
CREATE TABLE fkg_task_history (
task_id TEXT PRIMARY KEY,
parent_task_id TEXT,
goal TEXT,
assigned_agent TEXT,
status TEXT, -- 'pending', 'running', 'completed', 'failed', 'conflict'
result_summary TEXT,
artifacts JSON, -- List of produced artifacts
knowledge_refs JSON, -- Entities/relations this task touched
started_at TIMESTAMP,
completed_at TIMESTAMP
);
-- Conflict tracking
CREATE TABLE fkg_conflicts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
entity_id TEXT REFERENCES fkg_entities(id),
conflict_type TEXT, -- 'concurrent_write', 'contradictory_output', 'resource_contention'
agent_a TEXT,
agent_b TEXT,
resolution TEXT,
resolved_by TEXT,
resolved_at TIMESTAMP
);
-- Full-text search across entities. Note: an external-content FTS5 table must
-- name columns that exist in the content table, and needs triggers (or manual
-- inserts) to stay in sync with fkg_entities.
CREATE VIRTUAL TABLE fkg_search USING fts5(
name, entity_type, properties,
content='fkg_entities', content_rowid='rowid'
);
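The schema can be exercised with stdlib sqlite3. The sketch below simplifies the table shape and uses a standalone FTS5 table rather than the external-content form, so it runs with no sync triggers; helper names are ours:

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE fkg_entities (
    id TEXT PRIMARY KEY, entity_type TEXT NOT NULL,
    name TEXT NOT NULL, properties JSON)""")
# Standalone FTS5 index; id is stored but not tokenized.
db.execute("CREATE VIRTUAL TABLE fkg_search USING fts5(id UNINDEXED, name, properties_text)")

def add_entity(entity_id: str, entity_type: str, name: str, properties: dict) -> None:
    """Insert an entity and mirror it into the search index."""
    props = json.dumps(properties)
    db.execute("INSERT INTO fkg_entities VALUES (?, ?, ?, ?)",
               (entity_id, entity_type, name, props))
    db.execute("INSERT INTO fkg_search VALUES (?, ?, ?)", (entity_id, name, props))

add_entity("entity:project-x", "project", "Project X", {"lang": "python"})
hits = db.execute("SELECT id FROM fkg_search WHERE fkg_search MATCH 'python'").fetchall()
```

This keeps writes dual (base table + index), which is exactly the bookkeeping the external-content form plus triggers would automate.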
5. INTEGRATION RECOMMENDATIONS
5.1 Phase 1: Foundation (Immediate — 1-2 weeks)
- Implement the FKG SQLite database at `~/.hermes/fleet_knowledge.db`
  - Extend the existing `hermes_state.py` pattern (already uses SQLite + FTS5)
  - Add the schema from §4.5
  - Create `tools/fleet_knowledge_tool.py` with CRUD operations
- Create a fleet agent registry in `agent/fleet_registry.py`
  - Map agent names → transport (ACP, API, subprocess)
  - Store capabilities, specializations, and availability status
  - Integrate with the existing `acp_adapter/` and `delegate_tool.py`
- Define the message protocol as typed Python dataclasses
  - `FleetMessage`, `TaskRequest`, `TaskResponse`, `KnowledgeUpdate`
  - Validation via Pydantic (already a CrewAI dependency)
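A stdlib-dataclass version of the §4.3 envelope might look as follows (the plan calls for Pydantic validation; this sketch keeps only stdlib, and the `to_json` helper is illustrative):

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class FleetMessage:
    """Sketch of the inter-agent envelope; field names follow §4.3."""
    message_type: str                    # task_request | task_response | handoff | ...
    source_agent: str
    target_agent: str
    payload: dict
    task_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    parent_task_id: Optional[str] = None
    knowledge_graph_refs: List[str] = field(default_factory=list)
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def to_json(self) -> str:
        """Serialize for transport (ACP stdio or HTTP body)."""
        return json.dumps(asdict(self))

msg = FleetMessage("task_request", "hermes", "claude_code",
                   payload={"goal": "refactor auth", "priority": "high"},
                   knowledge_graph_refs=["entity:project-x"])
wire = msg.to_json()
```

Pydantic would add the enum/UUID validation on receipt that plain dataclasses skip.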
5.2 Phase 2: Communication Layer (2-4 weeks)
- Build fleet delegation on top of the existing `delegate_tool.py`
  - Extend it to support cross-agent delegation (not just child subagents)
  - ACP transport for Claude Code (already supported via `acp_command`)
  - OpenRouter/OpenAI-compatible API for Gemini and Kimi
  - Reuse the existing credential pool and provider resolution
- Implement selector-based task routing (AutoGen SelectorGroupChat pattern)
  - LLM-based agent selection from the task description + agent capabilities
  - Hermes acts as the selector/orchestrator
  - Simple heuristic fallback (code → Claude Code, vision → Gemini, etc.)
- Add typed artifact contracts (MetaGPT pattern)
  - Each task produces a typed artifact (code, analysis, docs, review)
  - Artifacts stored in the FKG with entity relations
  - Downstream agents consume typed inputs, not free-form text
5.3 Phase 3: Advanced Patterns (4-6 weeks)
- Implement workflow DAGs (AutoGen GraphFlow pattern)
  - Pre-defined workflows as directed graphs (code review pipeline, research pipeline)
  - Conditional routing based on artifact types or agent decisions
  - Fan-out/fan-in for parallel execution across fleet agents
- Add conflict resolution
  - Detect concurrent writes to the same FKG entities
  - Critic agent validates contradictory outputs
  - Track resolution history for learning
- Build a consensus mechanism for critical decisions
  - Weighted voting based on agent expertise
  - MoA-style aggregation (already implemented in `mixture_of_agents_tool.py`)
  - Escalation to a human for irreconcilable conflicts
5.4 Phase 4: Intelligence (6-8 weeks)
- Learning from delegation history
  - Track which agent performs best for which task types
  - Adjust routing weights over time
  - RL-style improvement of delegation decisions
- Fleet-level memory evolution
  - Entities and relations in the FKG become the "shared brain"
  - Agents contribute knowledge as they work
  - Cross-agent knowledge synthesis (one agent's discovery benefits all)
6. BENCHMARKS & PERFORMANCE CONSIDERATIONS
6.1 Latency Estimates
| Pattern | Overhead | Notes |
|---|---|---|
| Direct delegation (current) | ~30s per subagent | Spawn + run + collect |
| ACP transport (Claude Code) | ~2-5s connection + task time | Subprocess handshake |
| API-based (Gemini/Kimi) | ~1-2s + task time | Standard HTTP |
| Selector routing | +1 LLM call (~2-5s) | For agent selection |
| GraphFlow routing | +state overhead (~100ms) | Pre-defined, no LLM call |
| FKG query | ~1-5ms | SQLite indexed query |
| MoA consensus | ~15-30s (4 parallel + 1 aggregator) | Already implemented |
6.2 Recommended Configuration
# Fleet coordination config (add to config.yaml)
fleet:
enabled: true
knowledge_db: "~/.hermes/fleet_knowledge.db"
agents:
hermes:
role: orchestrator
transport: local
claude_code:
role: code_specialist
transport: acp
acp_command: "claude"
acp_args: ["--acp", "--stdio"]
capabilities: ["code", "debugging", "architecture"]
gemini:
role: multimodal_analyst
transport: api
provider: openrouter
model: "google/gemini-3-pro-preview"
capabilities: ["vision", "analysis", "large_context"]
kimi:
role: coding_assistant
transport: api
provider: kimi-coding
capabilities: ["code", "long_context"]
delegation:
strategy: selector # selector | pipeline | graph
max_concurrent: 3
timeout_seconds: 300
consensus:
enabled: true
min_agreement: 2 # 2-of-3 for critical decisions
escalation_agent: hermes
knowledge:
auto_extract: true # Extract entities from task results
relation_confidence_threshold: 0.7
search_provider: fts5 # fts5 | vector | hybrid
7. EXISTING HERMES INFRASTRUCTURE TO LEVERAGE
| Component | What It Provides | Reuse For |
|---|---|---|
| `delegate_tool.py` | Subagent spawning, isolated contexts | Fleet delegation transport |
| `mixture_of_agents_tool.py` | Multi-model consensus/aggregation | Fleet consensus protocol |
| `memory_tool.py` | Bounded persistent memory with atomic writes | Pattern for FKG writes |
| `acp_adapter/` | ACP server for IDE integration | Claude Code transport |
| `hermes_state.py` | SQLite + FTS5 session store | FKG database foundation |
| `tools/registry.py` | Central tool registry | Fleet knowledge tool registration |
| `agent/credential_pool.py` | Credential rotation | Multi-provider auth |
| `hermes_cli/runtime_provider.py` | Provider resolution | Fleet agent connection |
8. KEY TAKEAWAYS
- GraphFlow (AutoGen) is the SOTA orchestration pattern — DAG-based execution with conditional routing beats sequential chains and pure LLM delegation for structured workflows
- Three-tier memory is essential — session state (volatile), knowledge graph (persistent, structured), agent memory (persistent per-agent notes)
- Typed artifacts over free-form text — MetaGPT's standardized output contracts dramatically reduce inter-agent ambiguity
- Hybrid delegation beats any single pattern — pre-defined DAGs for known workflows, LLM selection for exploratory tasks, handoff for agent-initiated delegation
- Critic-in-the-loop is the practical consensus mechanism — don't implement Byzantine fault tolerance; a dedicated reviewer agent with clear acceptance criteria is sufficient
- Our existing infrastructure covers ~60% of what's needed — delegate_tool, MoA, memory_tool, the ACP adapter, and the SQLite patterns are solid foundations to build on
- The fleet knowledge graph is the differentiator — no existing framework persists a proper shared knowledge graph across agent interactions; building this gives us a unique advantage
Report generated from analysis of CrewAI v1.14.1, AutoGen v0.7.5, CAMEL v0.2.90 (installed locally), plus MetaGPT, ChatDev, and LangGraph documentation.