# Bannerlord Feudal Multi-Agent Hierarchy Design

**Issue:** #1099
**Parent Epic:** #1091 (Project Bannerlord)
**Date:** 2026-03-23
**Status:** Draft

---

## Overview

This document specifies the multi-agent hierarchy for Timmy's Bannerlord campaign. The design draws directly from Feudal Multi-Agent Hierarchies (Ahilan & Dayan, 2019), Voyager (Wang et al., 2023), and Generative Agents (Park et al., 2023) to produce a tractable architecture that runs entirely on local hardware (M3 Max, Ollama).

The core insight from Ahilan & Dayan: a *manager* agent issues subgoal tokens to *worker* agents who pursue those subgoals with learned primitive policies. Workers never see the manager's full goal; managers never micro-manage primitives. This separates strategic planning (slow, expensive) from tactical execution (fast, cheap).

---

## 1. King-Level Timmy: Subgoal Vocabulary

Timmy is the King agent. He operates on the **campaign map** timescale (days to weeks of in-game time). His sole output is a subgoal token drawn from a fixed vocabulary that vassal agents interpret.
### Subgoal Token Schema

```python
from pydantic import BaseModel


class KingSubgoal(BaseModel):
    token: str                         # One of the vocabulary entries below
    target: str | None = None          # Named target (settlement, lord, faction)
    quantity: int | None = None        # For RECRUIT, TRADE
    priority: float = 1.0              # 0.0–2.0, scales vassal reward
    deadline_days: int | None = None   # Campaign-map days to complete
    context: str | None = None         # Free-text hint (not parsed by workers)
```

### Vocabulary (v1)

| Token | Meaning | Primary Vassal |
|---|---|---|
| `EXPAND_TERRITORY` | Take or secure a fief | War Vassal |
| `RAID_ECONOMY` | Raid enemy villages for denars | War Vassal |
| `FORTIFY` | Upgrade or repair a settlement | Economy Vassal |
| `RECRUIT` | Fill party to capacity | Logistics Companion |
| `TRADE` | Execute profitable trade route | Caravan Companion |
| `ALLY` | Pursue a non-aggression or alliance deal | Diplomacy Vassal |
| `SPY` | Gain information on target faction | Scout Companion |
| `HEAL` | Rest party until wounds recovered | Logistics Companion |
| `CONSOLIDATE` | Hold territory, no expansion | Economy Vassal |
| `TRAIN` | Level troops via auto-resolve bandits | War Vassal |

King updates the active subgoal at most once per **campaign tick** (configurable, default 1 in-game day). He reads the full `GameState` but emits only a single subgoal token plus optional parameters, not a prose plan.

### King Decision Loop

```
while campaign_running:
    state = gabs.get_state()            # Full kingdom + map snapshot
    subgoal = king_llm.decide(state)    # Qwen3:32b, temp=0.1, JSON mode
    emit_subgoal(subgoal)               # Written to subgoal_queue
    await campaign_tick()               # ~1 game-day real-time pause
```

King uses **Qwen3:32b** (the most capable local model) for strategic reasoning. Subgoal generation is batch, not streaming; the latency budget is 5–15 seconds per tick.

---

## 2. Vassal Agents: Reward Functions

Vassals are mid-tier agents responsible for a domain of the kingdom. Each vassal has a defined reward function.
Vassals run on **Qwen3:14b** (balanced capability vs. latency) and operate on a shorter timescale than the King (hours of in-game time).

### 2a. War Vassal

**Domain:** Military operations: sieges, field battles, raids, defensive maneuvers.

**Reward function:**

```
R_war = w1 * ΔTerritoryValue
      + w2 * ΔArmyStrength_ratio
      - w3 * CasualtyCost
      - w4 * SupplyCost
      + w5 * SubgoalBonus(active_subgoal ∈ {EXPAND_TERRITORY, RAID_ECONOMY, TRAIN})
```

| Weight | Default | Rationale |
|---|---|---|
| w1 | 0.40 | Territory is the primary long-term asset |
| w2 | 0.25 | Army ratio relative to nearest rival |
| w3 | 0.20 | Casualties are expensive to replace |
| w4 | 0.10 | Supply burn limits campaign duration |
| w5 | 0.05 | King alignment bonus |

**Primitive actions available:** `move_party`, `siege_settlement`, `raid_village`, `retreat`, `auto_resolve_battle`, `hire_mercenaries`.

### 2b. Economy Vassal

**Domain:** Settlement management, tax collection, construction, food supply.

**Reward function:**

```
R_econ = w1 * DailyDenarsIncome
       + w2 * FoodStockBuffer
       + w3 * LoyaltyAverage
       - w4 * ConstructionQueueLength
       + w5 * SubgoalBonus(active_subgoal ∈ {FORTIFY, CONSOLIDATE})
```

| Weight | Default | Rationale |
|---|---|---|
| w1 | 0.35 | Income is the fuel for everything |
| w2 | 0.25 | Starvation causes immediate loyalty crash |
| w3 | 0.20 | Low loyalty triggers revolt |
| w4 | 0.15 | Idle construction is opportunity cost |
| w5 | 0.05 | King alignment bonus |

**Primitive actions available:** `set_tax_policy`, `build_project`, `distribute_food`, `appoint_governor`, `upgrade_garrison`.

### 2c. Diplomacy Vassal

**Domain:** Relations management: alliances, peace deals, tribute, marriage.

**Reward function:**

```
R_diplo = w1 * AlliesCount
        + w2 * TruceDurationValue
        + w3 * RelationsScore_weighted
        - w4 * ActiveWarsFront
        + w5 * SubgoalBonus(active_subgoal ∈ {ALLY})
```

**Primitive actions available:** `send_envoy`, `propose_peace`, `offer_tribute`, `request_military_access`, `arrange_marriage`.

---

## 3. Companion Worker Task Primitives

Companions are the lowest tier: fast, specialized, single-purpose workers. They run on **Qwen3:8b** (or smaller) for sub-2-second response times. Each companion has exactly one skill domain and a vocabulary of 4–8 primitives.

### 3a. Logistics Companion (Party Management)

**Skill:** Scouting / Steward / Medicine hybrid role.

| Primitive | Effect | Trigger |
|---|---|---|
| `recruit_troop(type, qty)` | Buy troops at nearest town | RECRUIT subgoal |
| `buy_supplies(qty)` | Purchase food for march | Party food < 3 days |
| `rest_party(days)` | Idle in friendly town | Wound % > 30% or HEAL subgoal |
| `sell_prisoners(loc)` | Convert prisoners to denars | Prison > capacity |
| `upgrade_troops()` | Spend XP on troop upgrades | After battle or TRAIN |

### 3b. Caravan Companion (Trade)

**Skill:** Trade / Charm.

| Primitive | Effect | Trigger |
|---|---|---|
| `assess_prices(town)` | Query buy/sell prices | Entry to settlement |
| `buy_goods(item, qty)` | Purchase trade goods | Positive margin ≥ 15% |
| `sell_goods(item, qty)` | Sell at target settlement | Reached destination |
| `establish_caravan(town)` | Deploy caravan NPC | TRADE subgoal + denars > 10k |
| `abandon_route()` | Return to main party | Caravan threatened |

### 3c. Scout Companion (Intelligence)

**Skill:** Scouting / Roguery.

| Primitive | Effect | Trigger |
|---|---|---|
| `track_lord(name)` | Shadow enemy lord | SPY subgoal |
| `assess_garrison(settlement)` | Estimate defender count | Before siege proposal |
| `map_patrol_routes(region)` | Log enemy movement | Territorial expansion prep |
| `report_intel()` | Push findings to King | Scheduled or on demand |

---

## 4. Communication Protocol Between Hierarchy Levels

All agents communicate through a shared **Subgoal Queue** and **State Broadcast** bus, implemented as in-process Python asyncio queues backed by SQLite for persistence.

### Message Types

```python
from datetime import datetime
from typing import Any, Literal

from pydantic import BaseModel

# KingSubgoal is the model defined under "Subgoal Token Schema".


class SubgoalMessage(BaseModel):
    """King → Vassal direction"""
    msg_type: Literal["subgoal"] = "subgoal"
    from_agent: Literal["king"]
    to_agent: str                     # "war_vassal", "economy_vassal", etc.
    subgoal: KingSubgoal
    issued_at: datetime


class TaskMessage(BaseModel):
    """Vassal → Companion direction"""
    msg_type: Literal["task"] = "task"
    from_agent: str                   # "war_vassal", etc.
    to_agent: str                     # "logistics_companion", etc.
    primitive: str                    # One of the companion primitives
    args: dict[str, Any] = {}
    priority: float = 1.0
    issued_at: datetime


class ResultMessage(BaseModel):
    """Companion/Vassal → Parent direction"""
    msg_type: Literal["result"] = "result"
    from_agent: str
    to_agent: str
    success: bool
    outcome: dict[str, Any]           # Primitive-specific result data
    reward_delta: float               # Computed reward contribution
    completed_at: datetime


class StateUpdateMessage(BaseModel):
    """GABS → All agents (broadcast)"""
    msg_type: Literal["state"] = "state"
    game_state: dict[str, Any]        # Full GABS state snapshot
    tick: int
    timestamp: datetime
```

### Protocol Flow

```
GABS ──state_update──► King
                         │
                    subgoal_msg
                         │
         ┌───────────────┼───────────────┐
         ▼               ▼               ▼
    War Vassal      Econ Vassal    Diplo Vassal
         │               │               │
     task_msg        task_msg        task_msg
         │               │               │
     Logistics        Caravan          Scout
     Companion       Companion       Companion
         │               │               │
    result_msg      result_msg      result_msg
         │               │               │
         └───────────────┼───────────────┘
                         ▼
             King (reward aggregation)
```

### Timing Constraints

| Level | Decision Frequency | LLM Budget |
|---|---|---|
| King | 1× per campaign day | 5–15 s |
| Vassal | 4× per campaign day | 2–5 s |
| Companion | On-demand / event-driven | < 2 s |

State updates from GABS arrive continuously; agents consume them at their own cadence. No agent blocks another's queue.

### Conflict Resolution

If two vassals propose conflicting actions (e.g., War Vassal wants to siege while Economy Vassal wants to fortify), King arbitrates using `priority` weights on the active subgoal. The highest-priority active subgoal wins resource contention.

---

## 5. Sovereign Agent Properties

The King agent (Timmy) has sovereign properties that distinguish it from ordinary worker agents. These map directly to Timmy's existing identity architecture.

### 5a. Decentralized Identifier (DID)

```
did:key:z6Mk
```

The King's DID is persisted in `~/.timmy/identity.json` (existing SOUL.md pattern).
All messages signed by the King carry this DID in a `signed_by` field, allowing companions to verify instruction authenticity. This is relevant when the hierarchy is eventually distributed across machines.

### 5b. Asset Control

| Asset Class | Storage | Control Level |
|---|---|---|
| Kingdom treasury (denars) | GABS game state | King exclusive |
| Settlement ownership | GABS game state | King exclusive |
| Troop assignments | King → Vassal delegation | Delegated, revocable |
| Trade goods (caravan) | Companion-local | Companion autonomous within budget |
| Intel reports | `~/.timmy/bannerlord/intel/` | Read-all, write-companion |

Asset delegation is explicit. Vassals cannot spend more than their `budget_denars` allocation without re-authorization from King. Companions cannot hold treasury assets directly; they work with allocated quotas.

### 5c. Non-Terminability

The King agent cannot be terminated by vassal or companion agents. Termination authority is reserved for:

1. The human operator (Ctrl+C or `timmy stop`)
2. A `SHUTDOWN` signal from the top-level orchestrator

Vassals can pause themselves (e.g., awaiting GABS state) but cannot signal the King to stop. This prevents a misbehaving military vassal from ending the campaign.

Implementation: King runs in the main asyncio event loop. Vassals and companions run in `asyncio.TaskGroup` subgroups. Only the King's task holds a reference to the TaskGroup and its child tasks, so only the King can cancel them.

---

## Implementation Path

This design connects directly to the existing Timmy codebase:

| Component | Maps to | Notes |
|---|---|---|
| King LLM calls | `infrastructure/llm_router/` | Cascade router for model selection |
| Subgoal Queue | `infrastructure/event_bus/` | Existing pub/sub pattern |
| Companion primitives | New `src/bannerlord/agents/` package | One module per companion |
| GABS state updates | `src/bannerlord/gabs_client.py` | TCP JSON-RPC, port 4825 |
| Asset ledger | `src/bannerlord/ledger.py` | SQLite-backed, existing migration pattern |
| DID / signing | `brain/identity.py` | Extends existing SOUL.md |

The next concrete step is implementing the GABS TCP client and the `KingSubgoal` schema. Everything else in this document depends on readable game state first.

---

## References

- Ahilan, S. & Dayan, P. (2019). Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning. https://arxiv.org/abs/1901.08492
- Rood, S. (2022). Scaling Reinforcement Learning through Feudal Hierarchy (NPS thesis).
- Wang, G. et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. https://arxiv.org/abs/2305.16291
- Park, J.S. et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. https://arxiv.org/abs/2304.03442
- Silveira, T. (2022). CiF-Bannerlord: Social AI Integration in Bannerlord.