docs/plans/fleet-knowledge-graph-sota-research.md

# SOTA Research: Multi-Agent Coordination & Fleet Knowledge Graphs

**Date:** 2026-04-14  
**Scope:** Agent-to-agent communication, shared memory, task delegation, consensus protocols  
**Frameworks Analyzed:** CrewAI, AutoGen, MetaGPT, ChatDev, CAMEL

---

## 1. Architecture Pattern Summary

### 1.1 CrewAI — Role-Based Crew Orchestration

**Core Pattern:** Agents organized into "Crews" with explicit roles, goals, and backstories. Tasks are assigned to agents, executed via sequential or hierarchical process flows.

**Agent-to-Agent Communication:**
- **Sequential:** Agent A completes Task A → output injected into Task B's context for Agent B
- **Hierarchical:** Manager agent delegates to worker agents, collects results, synthesizes
- **Context passing:** Tasks can declare `context: [other_tasks]` — outputs from dependent tasks are automatically injected into the current task's prompt
- **No direct agent-to-agent messaging** — communication is mediated through task outputs

**Shared Memory (v2 — Unified Memory):**
- `Memory` class with `remember()` / `recall()` using vector embeddings (LanceDB/ChromaDB)
- **Scope-based isolation:** `MemoryScope` provides path-based namespacing (`/crew/research/agent-foo`)
- **Composite scoring:** semantic similarity (0.5) + recency (0.3) + importance (0.2)
- **RecallFlow:** LLM-driven deep recall with adaptive query expansion
- **Privacy flags:** Private memories only visible to the source that created them
- **Background saves:** ThreadPoolExecutor with write barrier (drain_writes before recall)
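
The composite score above amounts to a weighted sum. The sketch below uses the three weights listed; the exponential recency decay, its one-day half-life, and all function names are assumptions for illustration, not CrewAI's actual implementation.

```python
import math

# Weights as listed above; everything else here is an illustrative assumption.
W_SEMANTIC, W_RECENCY, W_IMPORTANCE = 0.5, 0.3, 0.2


def recency_score(age_seconds: float, half_life_seconds: float = 86_400.0) -> float:
    """Hypothetical decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    return math.pow(0.5, max(age_seconds, 0.0) / half_life_seconds)


def composite_score(similarity: float, age_seconds: float, importance: float) -> float:
    """Weighted sum of semantic similarity, recency, and importance (each in [0, 1])."""
    return (
        W_SEMANTIC * similarity
        + W_RECENCY * recency_score(age_seconds)
        + W_IMPORTANCE * importance
    )
```

With these weights, a fresh, maximally relevant, maximally important memory scores 1.0, and an old memory can still outrank a new one if it is sufficiently similar and important.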

**Task Delegation:**
- Agent tools include `Delegate Work to Co-worker` and `Ask Question to Co-worker`
- Delegation creates a new task for another agent; results are returned to the delegator
- Depth-limited (no infinite delegation chains)

**State & Checkpointing:**
- `SqliteProvider` / `JsonProvider` for state checkpoint persistence
- `CheckpointConfig` with event-driven persistence
- Flow state is Pydantic models with serialization

**Cache:**
- Thread-safe in-memory tool result cache with RWLock
- Key: `{tool_name}-{input}` → cached output
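
A minimal sketch of a cache with this key scheme. The class and method names are hypothetical, and a plain `threading.Lock` stands in for the read/write lock, since the Python standard library has no RWLock.

```python
import threading
from typing import Any, Optional


class ToolResultCache:
    """In-memory cache keyed by '{tool_name}-{input}' (illustrative only)."""

    def __init__(self) -> None:
        # Assumption: a mutex instead of a reader/writer lock, for brevity.
        self._lock = threading.Lock()
        self._store: dict[str, Any] = {}

    @staticmethod
    def _key(tool_name: str, tool_input: str) -> str:
        return f"{tool_name}-{tool_input}"

    def add(self, tool_name: str, tool_input: str, output: Any) -> None:
        with self._lock:
            self._store[self._key(tool_name, tool_input)] = output

    def read(self, tool_name: str, tool_input: str) -> Optional[Any]:
        """Return the cached output, or None on a cache miss."""
        with self._lock:
            return self._store.get(self._key(tool_name, tool_input))
```

A real RWLock would let concurrent readers proceed in parallel; the mutex here serializes them, which is simpler but slower under read-heavy load.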

### 1.2 AutoGen (Microsoft) — Conversation-Centric Teams

**Core Pattern:** Agents communicate through shared conversation threads. A "Group Chat Manager" controls turn-taking and speaker selection.

**Agent-to-Agent Communication:**
- **Shared message thread** — all agents see all messages (like a group chat)
- **Team patterns:**
  - `RoundRobinGroupChat`: Fixed order cycling through participants
  - `SelectorGroupChat`: LLM-based speaker selection with candidate filtering
  - `Swarm`: Handoff-based routing (agent sends `HandoffMessage` to next agent)
  - `GraphFlow` (DiGraph): Directed-graph execution with conditional edges, parallel fan-out, and loops
  - `MagenticOneOrchestrator`: Ledger-based orchestration with task planning, progress tracking, stall detection

**Shared State:**
- `ChatCompletionContext` — manages message history per agent (can be unbounded or windowed)
- `ModelContext` shared across agents in a team
- State serialization: `save_state()` / `load_state()` for all managers
- **No built-in vector memory** — context is purely conversational

**Task Delegation:**
- `Swarm`: Agents use `HandoffMessage` to explicitly route control
- `GraphFlow`: Conditional edges route based on message content (keyword or callable)
- `MagenticOne`: Orchestrator maintains a "task ledger" (facts + plan) and dynamically re-plans on stalls

**Consensus / Termination:**
- `TerminationCondition` — composable conditions (text match, max messages, source-based)
- No explicit consensus protocols — termination is manager-decided
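
Composable stop conditions can be sketched as plain predicates over the message history. The helpers below are hypothetical and deliberately do not reproduce AutoGen's `TerminationCondition` classes; they only show the composition idea.

```python
from typing import Callable, Sequence

# A condition inspects the message history and returns True when the run should stop.
Condition = Callable[[Sequence[str]], bool]


def text_mention(text: str) -> Condition:
    """Stop once any message contains `text`."""
    return lambda messages: any(text in m for m in messages)


def max_messages(limit: int) -> Condition:
    """Stop once the history holds at least `limit` messages."""
    return lambda messages: len(messages) >= limit


def any_of(*conditions: Condition) -> Condition:
    """Stop when at least one condition fires (logical OR)."""
    return lambda messages: any(c(messages) for c in conditions)


def all_of(*conditions: Condition) -> Condition:
    """Stop only when every condition fires (logical AND)."""
    return lambda messages: all(c(messages) for c in conditions)
```

For example, `any_of(text_mention("TERMINATE"), max_messages(20))` stops on an explicit signal but also bounds a runaway conversation.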

**Key Insight:** AutoGen's `ChatCompletionContext` is the closest analog to shared memory, but it's purely sequential message history, not a knowledge base.

### 1.3 MetaGPT — SOP-Driven Software Teams

**Core Pattern:** Agents follow Standard Operating Procedures (SOPs). Each agent has a defined role (Product Manager, Architect, Engineer, QA) and produces structured artifacts.

**Agent-to-Agent Communication:**
- **Publish-Subscribe via Environment:** Agents publish "actions" to a shared Environment, subscribers react
- **Structured outputs:** Each role produces specific artifact types (PRD, design doc, code, test cases)
- **Message routing:** Environment acts as a message bus, filtering by subscriber interest
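
The publish-subscribe routing described above can be sketched as a small message bus. This is a toy model with hypothetical names (`Message`, `Environment.subscribe`, `Environment.publish`), not MetaGPT's actual classes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Message:
    kind: str      # artifact type, e.g. "PRD" or "design_doc"
    sender: str
    content: str


class Environment:
    """Toy message bus: delivers each message to handlers subscribed to its kind."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[Message], None]]] = defaultdict(list)

    def subscribe(self, kind: str, handler: Callable[[Message], None]) -> None:
        self._subscribers[kind].append(handler)

    def publish(self, message: Message) -> int:
        """Deliver to interested subscribers and return how many were notified."""
        handlers = self._subscribers.get(message.kind, [])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

An Architect role would subscribe to `"PRD"` messages and publish `"design_doc"` messages, which is how the SOP ordering emerges from subscriptions rather than from explicit calls between agents.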

**Shared Memory:**
- `Environment` class maintains shared state (project workspace)
- File-based shared memory: agents write/read from a shared filesystem
- `SharedMemory` for cross-agent context (structured data, not free-form text)

**Task Delegation:**
- Implicit through SOP stages: PM → Architect → Engineer → QA
- Each agent's output is the next agent's input
- No dynamic re-delegation

**Consensus:**
- Sequential SOP execution (no parallel agents)
- QA agent can trigger re-work loops back to Engineer

### 1.4 ChatDev — Chat-Chain Software Development

**Core Pattern:** Agents follow a "chat chain" — a sequence of chat phases (designing, coding, testing, documenting). Each phase involves a pair of agents (CEO↔CTO, Programmer↔Reviewer, etc.).

**Agent-to-Agent Communication:**
- **Paired chat sessions:** Two agents communicate in each phase (role-play between instructor and assistant)
- **Chain propagation:** Phase N's output (code, design doc) becomes Phase N+1's input
- **No broadcast** — communication is strictly pairwise within phases

**Shared Memory:**
- Software-centric: shared code repository is the "memory"
- Each phase modifies/inherits the codebase
- No explicit vector memory or knowledge graph

**Task Delegation:**
- Hardcoded phase sequence: Design → Code → Test → Document
- Each phase delegates to a specific agent pair
- No dynamic task re-assignment

**Consensus:**
- Phase-level termination: when both agents agree the phase is complete
- "Thought" tokens for chain-of-thought within chat

### 1.5 CAMEL — Role-Playing & Workforce

**Core Pattern:** Two primary modes:
1. **RolePlaying:** Two-agent conversation with task specification and optional critic
2. **Workforce:** Multi-agent with coordinator, task planner, and worker pool

**Agent-to-Agent Communication:**
- **RolePlaying:** Structured turn-taking between assistant and user agents
- **Workforce:** Coordinator assigns tasks via `TaskChannel`, workers return results
- **Worker types:** `SingleAgentWorker` (single ChatAgent), `RolePlayingWorker` (two-agent pair)

**Shared Memory / Task Channel:**
- `TaskChannel` — async queue-based task dispatch with packet tracking
  - States: SENT → PROCESSING → RETURNED → ARCHIVED
  - O(1) lookup by task ID, status-based filtering, assignee/publisher queues
- `WorkflowMemoryManager` — persists workflow patterns as markdown files
  - Role-based organization: workflows stored by `role_identifier`
  - Agent-based intelligent selection: LLM picks relevant past workflows
  - Versioned: metadata tracks creation time and version numbers
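
The packet lifecycle above is a linear state machine over a dictionary keyed by task ID. The sketch below is synchronous and uses hypothetical names; CAMEL's real `TaskChannel` is async and tracks more metadata.

```python
from enum import Enum


class PacketState(Enum):
    SENT = "SENT"
    PROCESSING = "PROCESSING"
    RETURNED = "RETURNED"
    ARCHIVED = "ARCHIVED"


# Each state may only advance to the next one in the lifecycle.
_NEXT = {
    PacketState.SENT: PacketState.PROCESSING,
    PacketState.PROCESSING: PacketState.RETURNED,
    PacketState.RETURNED: PacketState.ARCHIVED,
}


class SimpleTaskChannel:
    def __init__(self) -> None:
        self._states: dict[str, PacketState] = {}  # task_id -> state, O(1) lookup

    def post(self, task_id: str) -> None:
        if task_id in self._states:
            raise ValueError(f"task {task_id!r} already posted")
        self._states[task_id] = PacketState.SENT

    def advance(self, task_id: str) -> PacketState:
        """Move a packet one step forward; ARCHIVED is terminal."""
        current = self._states[task_id]
        if current not in _NEXT:
            raise ValueError(f"task {task_id!r} is already {current.value}")
        self._states[task_id] = _NEXT[current]
        return self._states[task_id]

    def state_of(self, task_id: str) -> PacketState:
        return self._states[task_id]

    def with_state(self, state: PacketState) -> list[str]:
        """Status filter; a linear scan here, unlike the per-status queues noted above."""
        return [t for t, s in self._states.items() if s is state]
```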

**Task Delegation:**
- Coordinator agent decomposes complex tasks using LLM analysis
- Tasks assigned to workers based on capability matching
- Failed tasks trigger: retry, create new worker, or further decomposition
- `FailureHandlingConfig` with configurable `RecoveryStrategy`

**Consensus / Quality:**
- Quality evaluation via structured output (response format enforced)
- Task dependencies tracked (worker receives dependency tasks as context)
- `WorkforceMetrics` for tracking execution statistics

---

## 2. Key Architectural Patterns for Fleet Knowledge Graph

### 2.1 Communication Topology Patterns

| Pattern | Used By | Description |
|---------|---------|-------------|
| **Sequential Chain** | CrewAI, ChatDev, MetaGPT | A→B→C linear flow, output feeds next |
| **Shared Thread** | AutoGen | All agents see all messages |
| **Publish-Subscribe** | MetaGPT | Environment-based message bus |
| **Paired Chat** | ChatDev, CAMEL | Two-agent conversation pairs |
| **Handoff Routing** | AutoGen Swarm | Agent explicitly names next speaker |
| **DAG Execution** | AutoGen GraphFlow | Conditional edges, parallel fan-out, loops |
| **Ledger Orchestration** | AutoGen MagenticOne | Maintains task ledger, re-plans |
| **Task Channel** | CAMEL | Async queue with packet states |

### 2.2 Shared State Patterns

| Pattern | Used By | Description |
|---------|---------|-------------|
| **Vector Memory** | CrewAI | Embeddings + scope-based namespacing |
| **Message History** | AutoGen | Sequential conversation context |
| **File System** | MetaGPT, ChatDev | Agents read/write shared files |
| **Task Channel** | CAMEL | Async packet-based task dispatch |
| **Workflow Files** | CAMEL | Markdown-based workflow memory |
| **Tool Cache** | CrewAI | In-memory RWLock tool result cache |
| **State Checkpoint** | CrewAI, AutoGen | Serialized Pydantic/SQLite checkpoints |

### 2.3 Task Delegation Patterns

| Pattern | Used By | Description |
|---------|---------|-------------|
| **Role Assignment** | CrewAI | Fixed agent per task |
| **Manager Delegation** | CrewAI Hierarchical | Manager assigns tasks dynamically |
| **Speaker Selection** | AutoGen Selector | LLM picks next agent |
| **Handoff** | AutoGen Swarm | Agent explicitly transfers control |
| **SOP Routing** | MetaGPT | Stage-based implicit delegation |
| **Coordinator** | CAMEL Workforce | LLM-based task decomposition + assignment |
| **Dynamic Worker Creation** | CAMEL Workforce | Create new workers on failure |

### 2.4 Conflict Resolution Patterns

| Pattern | Used By | Description |
|---------|---------|-------------|
| **Manager Arbitration** | CrewAI Hierarchical | Manager resolves conflicts |
| **Critic-in-the-loop** | CAMEL | Critic agent evaluates and selects |
| **Quality Gate** | CAMEL Workforce | Structured quality evaluation |
| **Termination Conditions** | AutoGen | Composable stop conditions |
| **Stall Detection** | AutoGen MagenticOne | Re-plans when progress stalls |

---

## 3. Recommendations for Hermes Fleet Knowledge Graph

### 3.1 Architecture: Hybrid Graph + Memory

Based on the SOTA analysis, the fleet knowledge graph should combine four building blocks:

1. **CrewAI's scoped memory** for hierarchical knowledge organization
   - Path-based namespaces: `/fleet/{fleet_id}/agent/{agent_id}/diary`
   - Composite scoring: semantic + recency + importance
   - Background writes with read barriers

2. **CAMEL's TaskChannel** for task dispatch and tracking
   - Packet states (SENT → PROCESSING → RETURNED → ARCHIVED)
   - O(1) lookup by task ID
   - Assignee/publisher tracking

3. **AutoGen's DiGraph** for execution flow definition
   - DAG with conditional edges for complex workflows
   - Parallel fan-out for independent tasks
   - Activation conditions (all vs any) for synchronization points

4. **AutoGen MagenticOne's ledger** for shared task context
   - Maintained facts, plan, and progress ledger
   - Dynamic re-planning on stalls
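
To make the "all vs any" activation conditions in item 3 concrete, here is a minimal scheduling sketch. `Node` and `ready_nodes` are hypothetical names and this is not AutoGen's `DiGraph` API; it only shows how a synchronization point differs from a first-arrival trigger.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    depends_on: list[str] = field(default_factory=list)
    activation: str = "all"  # "all": wait for every parent; "any": one parent suffices


def ready_nodes(nodes: list[Node], done: set[str]) -> list[str]:
    """Return names of unfinished nodes whose activation condition is met."""
    ready = []
    for node in nodes:
        if node.name in done:
            continue
        if not node.depends_on:
            ok = True  # root nodes are always runnable
        else:
            finished = [parent in done for parent in node.depends_on]
            ok = all(finished) if node.activation == "all" else any(finished)
        if ok:
            ready.append(node.name)
    return ready
```

Every name returned in one call is independent of the others, so a scheduler could run them in parallel (the fan-out case) and call `ready_nodes` again as results arrive.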

### 3.2 Fleet Knowledge Graph Schema

```
/fleet/{fleet_id}/
  ├── shared/              # Shared knowledge (all agents read)
  │   ├── facts/           # Known facts, constraints
  │   ├── decisions/       # Record of decisions made
  │   └── context/         # Active task context
  ├── agent/{agent_id}/
  │   ├── diary/           # Agent's personal experience log
  │   ├── capabilities/    # What this agent can do
  │   └── state/           # Current task state
  ├── tasks/
  │   ├── {task_id}/       # Task metadata, dependencies, status
  │   └── graph/           # DAG definition for task dependencies
  └── consensus/
      ├── proposals/       # Pending proposals
      └── decisions/       # Resolved consensus decisions
```
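
The paths in this schema can be built and compared with a few helpers. The function names below are hypothetical; the point of the sketch is that scope membership should be checked per path segment, because a plain string-prefix test would wrongly place `agent/a70` under `agent/a7`.

```python
from pathlib import PurePosixPath


def agent_scope(fleet_id: str, agent_id: str, area: str) -> str:
    """Path of one agent's area, e.g. its diary, capabilities, or state."""
    return f"/fleet/{fleet_id}/agent/{agent_id}/{area}"


def shared_scope(fleet_id: str, area: str) -> str:
    """Path of a fleet-wide shared area, e.g. facts or decisions."""
    return f"/fleet/{fleet_id}/shared/{area}"


def in_scope(path: str, scope: str) -> bool:
    """True if `path` equals `scope` or lies beneath it, compared segment by segment."""
    p, s = PurePosixPath(path), PurePosixPath(scope)
    return p == s or s in p.parents
```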

### 3.3 Key Design Decisions

1. **Diary System (Agent Memory):**
   - Each agent writes to its own scoped memory after every significant action
   - LLM-analyzed importance scoring (like CrewAI's unified memory)
   - Cross-agent recall: agents can query other agents' diaries for relevant experiences
   - Decay: old low-importance memories expire

2. **Shared State (Fleet Knowledge):**
   - SQLite-backed (like Hermes' existing `state.db`) with FTS5 search
   - Hierarchical scopes (like CrewAI's MemoryScope)
   - Write-ahead log for concurrent access
   - Read barriers before queries (like CrewAI's `drain_writes`)

3. **Task Delegation:**
   - Coordinator pattern (like CAMEL's Workforce)
   - Task decomposition via LLM
   - Failed task → retry, reassign, or decompose
   - Max depth limit (like Hermes' existing MAX_DEPTH=2)

4. **Consensus Protocol:**
   - Proposal-based: agent proposes, others vote/acknowledge
   - Timeout-based fallback: if no response within N seconds, proceed
   - Manager override: designated manager can break ties
   - Simple majority for non-critical, unanimity for critical decisions

5. **Conflict Resolution:**
   - Last-write-wins for non-critical state
   - Optimistic locking with version numbers
   - Manager arbitration for task assignment conflicts
   - Quality gates (like CAMEL) for output validation
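
The consensus rules in item 4 leave some cases open, so the following tally is only one possible reading. Assumptions: after a timeout only the votes actually cast are counted, an empty ballot after a timeout means "proceed", and a tie with no manager vote is rejected. All names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    PENDING = "pending"


def tally(
    votes: dict[str, bool],
    voters: set[str],
    critical: bool = False,
    timed_out: bool = False,
    manager_vote: Optional[bool] = None,
) -> Outcome:
    cast = {agent: v for agent, v in votes.items() if agent in voters}
    yes = sum(cast.values())
    no = len(cast) - yes
    # Before the timeout, silent voters still count toward the electorate.
    electorate = len(cast) if timed_out else len(voters)

    if critical:  # unanimity required
        if no > 0:
            return Outcome.REJECTED
        return Outcome.ACCEPTED if yes == electorate else Outcome.PENDING

    if yes * 2 > electorate:  # simple majority
        return Outcome.ACCEPTED
    if no * 2 > electorate:
        return Outcome.REJECTED
    if not timed_out and len(cast) < len(voters):
        return Outcome.PENDING  # still waiting for votes
    if manager_vote is not None:  # manager breaks the tie
        return Outcome.ACCEPTED if manager_vote else Outcome.REJECTED
    return Outcome.REJECTED if cast else Outcome.ACCEPTED
```

Treating silence as consent on critical decisions is a risky default; a stricter variant would return `PENDING` or `REJECTED` when a critical proposal times out with missing votes.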

### 3.4 Integration with Existing Hermes Architecture

Hermes already has strong foundations:
- **Delegation system** (`delegate_tool.py`): Isolated child agents, parallel execution, depth limits
- **State DB** (`hermes_state.py`): SQLite + FTS5, WAL mode, session tracking, message history
- **Credential pools**: Shared credentials with rotation

The fleet knowledge graph should extend these patterns:
- **Session DB → Fleet DB:** Add tables for fleet metadata, agent registrations, task graphs
- **Memory tool → Fleet Memory:** Scoped vector memory shared across fleet agents
- **Delegate tool → Fleet Delegation:** Task channel with persistence, quality evaluation
- **New: Consensus module:** Proposal/vote protocol with timeout handling
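
As a starting point for the Fleet DB extension, the sketch below shows a WAL-mode SQLite table with version-based optimistic locking. The table name `fleet_state`, its columns, and both function names are assumptions and do not reflect the existing `hermes_state.py` schema; FTS5 indexing is left out because not every SQLite build ships it.

```python
import sqlite3


def open_fleet_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # has no effect on in-memory databases
    conn.execute(
        """CREATE TABLE IF NOT EXISTS fleet_state (
               scope   TEXT PRIMARY KEY,
               value   TEXT NOT NULL,
               version INTEGER NOT NULL DEFAULT 1
           )"""
    )
    return conn


def put_if_unchanged(
    conn: sqlite3.Connection, scope: str, value: str, expected_version: int
) -> bool:
    """Write only if the row is still at `expected_version`; 0 means 'create new'."""
    try:
        with conn:  # commits on success, rolls back on error
            if expected_version == 0:
                conn.execute(
                    "INSERT INTO fleet_state (scope, value) VALUES (?, ?)",
                    (scope, value),
                )
                return True
            cur = conn.execute(
                "UPDATE fleet_state SET value = ?, version = version + 1 "
                "WHERE scope = ? AND version = ?",
                (value, scope, expected_version),
            )
            return cur.rowcount == 1  # 0 rows means another writer got there first
    except sqlite3.IntegrityError:
        return False  # the row already exists
```

A caller that gets `False` back re-reads the row, merges its change, and retries with the new version number. Non-critical state can skip the version check and simply overwrite, which gives last-write-wins.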

---

## 4. Reference Implementations

| Component | Best Reference | Key Takeaway |
|-----------|---------------|--------------|
| Scoped Memory | CrewAI `Memory` + `MemoryScope` | Path-based namespaces, composite scoring, background writes |
| Task Dispatch | CAMEL `TaskChannel` | Packet-based with state machine, O(1) lookup |
| Execution DAG | AutoGen `DiGraphBuilder` | Fluent builder, conditional edges, activation groups |
| Orchestration | AutoGen `MagenticOneOrchestrator` | Ledger-based planning, stall detection, re-planning |
| Agent Communication | AutoGen `SelectorGroupChat` | LLM-based speaker selection, shared message thread |
| Quality Evaluation | CAMEL Workforce | Structured output for quality scoring |
| Workflow Memory | CAMEL `WorkflowMemoryManager` | Markdown-based, role-organized, versioned |
| State Checkpoint | CrewAI `SqliteProvider` | JSONB checkpoints, WAL mode |
| Tool Cache | CrewAI `CacheHandler` | RWLock-based concurrent tool result cache |

---

## 5. Open Questions

1. **Graph vs Vector for knowledge:** Should fleet knowledge use a proper graph DB (e.g., Neo4j) or stick with vector + SQLite?
   - Recommendation: Start with SQLite + vectors (existing stack), add graph later if needed

2. **Real-time vs Batch:** Should agents receive updates in real-time or batched?
   - Recommendation: Event-driven for critical updates, batched for diary entries

3. **Security model:** How should cross-agent access be controlled?
   - Recommendation: Role-based ACLs on scope paths, similar to CrewAI's privacy flags

4. **Scalability:** How many agents can a single fleet support?
   - Recommendation: Start with 10-agent fleets, optimize SQLite concurrency first
