Bannerlord Feudal Multi-Agent Hierarchy Design
Issue: #1099
Parent Epic: #1091 (Project Bannerlord)
Date: 2026-03-23
Status: Draft
Overview
This document specifies the multi-agent hierarchy for Timmy's Bannerlord campaign. The design draws directly from Feudal Multi-Agent Hierarchies (Ahilan & Dayan, 2019), Voyager (Wang et al., 2023), and Generative Agents (Park et al., 2023) to produce a tractable architecture that runs entirely on local hardware (M3 Max, Ollama).
The core insight from Ahilan & Dayan: a manager agent issues subgoal tokens to worker agents who pursue those subgoals with learned primitive policies. Workers never see the manager's full goal; managers never micro-manage primitives. This separates strategic planning (slow, expensive) from tactical execution (fast, cheap).
1. King-Level Timmy — Subgoal Vocabulary
Timmy is the King agent. He operates on the campaign map timescale (days to weeks of in-game time). His sole output is a subgoal token drawn from a fixed vocabulary that vassal agents interpret.
Subgoal Token Schema
from pydantic import BaseModel

class KingSubgoal(BaseModel):
    token: str                          # One of the vocabulary entries below
    target: str | None = None           # Named target (settlement, lord, faction)
    quantity: int | None = None         # For RECRUIT, TRADE
    priority: float = 1.0               # 0.0–2.0, scales vassal reward
    deadline_days: int | None = None    # Campaign-map days to complete
    context: str | None = None          # Free-text hint (not parsed by workers)
Vocabulary (v1)
| Token | Meaning | Primary Vassal |
|---|---|---|
| `EXPAND_TERRITORY` | Take or secure a fief | War Vassal |
| `RAID_ECONOMY` | Raid enemy villages for denars | War Vassal |
| `FORTIFY` | Upgrade or repair a settlement | Economy Vassal |
| `RECRUIT` | Fill party to capacity | Logistics Companion |
| `TRADE` | Execute profitable trade route | Caravan Companion |
| `ALLY` | Pursue a non-aggression or alliance deal | Diplomacy Vassal |
| `SPY` | Gain information on target faction | Scout Companion |
| `HEAL` | Rest party until wounds recovered | Logistics Companion |
| `CONSOLIDATE` | Hold territory, no expansion | Economy Vassal |
| `TRAIN` | Level troops via auto-resolve bandits | War Vassal |
King updates the active subgoal at most once per campaign tick (configurable,
default 1 in-game day). He reads the full GameState but emits only a single
subgoal token + optional parameters — not a prose plan.
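Because the King's output is constrained to a fixed vocabulary, it can be validated before it reaches the queue. The following is a minimal standard-library sketch, not the project's implementation: `SubgoalToken`, `ParsedSubgoal`, and `parse_subgoal` are hypothetical names, and the design itself uses the Pydantic `KingSubgoal` model for this job.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SubgoalToken(str, Enum):
    """The ten v1 vocabulary entries."""
    EXPAND_TERRITORY = "EXPAND_TERRITORY"
    RAID_ECONOMY = "RAID_ECONOMY"
    FORTIFY = "FORTIFY"
    RECRUIT = "RECRUIT"
    TRADE = "TRADE"
    ALLY = "ALLY"
    SPY = "SPY"
    HEAL = "HEAL"
    CONSOLIDATE = "CONSOLIDATE"
    TRAIN = "TRAIN"


@dataclass(frozen=True)
class ParsedSubgoal:
    """Hypothetical, trimmed-down stand-in for KingSubgoal."""
    token: SubgoalToken
    target: Optional[str] = None
    quantity: Optional[int] = None
    priority: float = 1.0


def parse_subgoal(raw: str) -> ParsedSubgoal:
    """Parse the King's JSON output, rejecting anything outside the vocabulary."""
    data = json.loads(raw)
    token = SubgoalToken(data["token"])  # raises ValueError for unknown tokens
    priority = float(data.get("priority", 1.0))
    if not 0.0 <= priority <= 2.0:
        raise ValueError(f"priority {priority} is outside the 0.0-2.0 range")
    return ParsedSubgoal(token, data.get("target"), data.get("quantity"), priority)
```

A call such as `parse_subgoal('{"token": "RECRUIT", "quantity": 20}')` yields a typed subgoal, while an off-vocabulary token or an out-of-range priority raises `ValueError` instead of propagating to the vassals.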
King Decision Loop
async def king_loop() -> None:
    while campaign_running:
        state = gabs.get_state()            # Full kingdom + map snapshot
        subgoal = king_llm.decide(state)    # Qwen3:32b, temp=0.1, JSON mode
        emit_subgoal(subgoal)               # Written to subgoal_queue
        await campaign_tick()               # ~1 game-day real-time pause
The King uses Qwen3:32b (the most capable of the local models in this design) for strategic reasoning. Each subgoal is generated as a single non-streaming request, with a latency budget of 5–15 seconds per tick.
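One way to enforce the per-tick latency budget is to wrap the LLM call in `asyncio.wait_for`. The sketch below is illustrative only: `FakeGabs` and `fake_decide` are hypothetical stand-ins for the GABS client and the King's LLM call, and skipping the tick on timeout (leaving the previous subgoal active) is an assumption, since this document does not specify over-budget behaviour.

```python
import asyncio


class FakeGabs:
    """Hypothetical stand-in for the GABS client; returns a canned snapshot."""

    def get_state(self) -> dict:
        return {"day": 1, "denars": 5000}


async def fake_decide(state: dict) -> dict:
    """Hypothetical stand-in for the King's LLM call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"token": "CONSOLIDATE", "priority": 1.0}


async def king_loop(gabs, decide, queue: asyncio.Queue, ticks: int,
                    budget_s: float = 15.0, tick_s: float = 0.0) -> None:
    """Run a fixed number of campaign ticks, emitting at most one subgoal each."""
    for _ in range(ticks):
        state = gabs.get_state()
        try:
            subgoal = await asyncio.wait_for(decide(state), timeout=budget_s)
        except asyncio.TimeoutError:
            subgoal = None  # assumption: over budget means no new subgoal this tick
        if subgoal is not None:
            await queue.put(subgoal)
        await asyncio.sleep(tick_s)  # stands in for campaign_tick()


async def demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    await king_loop(FakeGabs(), fake_decide, queue, ticks=3)
    return [queue.get_nowait() for _ in range(queue.qsize())]
```

Running `asyncio.run(demo())` emits one subgoal per tick; a decision function that exceeds `budget_s` emits none.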
2. Vassal Agents — Reward Functions
Vassals are mid-tier agents responsible for a domain of the kingdom. Each vassal has a defined reward function. Vassals run on Qwen3:14b (balanced capability vs. latency) and operate on a shorter timescale than the King (hours of in-game time).
2a. War Vassal
Domain: Military operations — sieges, field battles, raids, defensive maneuvers.
Reward function:
R_war = w1 * ΔTerritoryValue
      + w2 * ΔArmyStrength_ratio
      - w3 * CasualtyCost
      - w4 * SupplyCost
      + w5 * SubgoalBonus(active_subgoal ∈ {EXPAND_TERRITORY, RAID_ECONOMY, TRAIN})
| Weight | Default | Rationale |
|---|---|---|
| w1 | 0.40 | Territory is the primary long-term asset |
| w2 | 0.25 | Army ratio relative to nearest rival |
| w3 | 0.20 | Casualties are expensive to replace |
| w4 | 0.10 | Supply burn limits campaign duration |
| w5 | 0.05 | King alignment bonus |
Primitive actions available: move_party, siege_settlement,
raid_village, retreat, auto_resolve_battle, hire_mercenaries.
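As a worked illustration of the War Vassal reward, the sketch below evaluates `R_war` with the default weights. Two assumptions are made that the document does not state: `SubgoalBonus` is treated as a 0/1 indicator, and all inputs are taken to be pre-normalised to comparable scales. The names `WarWeights` and `war_reward` are hypothetical.

```python
from dataclasses import dataclass

# Subgoals that earn the War Vassal its King-alignment bonus.
WAR_SUBGOALS = {"EXPAND_TERRITORY", "RAID_ECONOMY", "TRAIN"}


@dataclass(frozen=True)
class WarWeights:
    """Default weights from the table above."""
    w1: float = 0.40  # territory value
    w2: float = 0.25  # army strength ratio
    w3: float = 0.20  # casualty cost
    w4: float = 0.10  # supply cost
    w5: float = 0.05  # King alignment bonus


def war_reward(d_territory_value: float, d_army_ratio: float,
               casualty_cost: float, supply_cost: float,
               active_subgoal: str, w: WarWeights = WarWeights()) -> float:
    """Evaluate R_war; the bonus is assumed to be 1.0 or 0.0."""
    bonus = 1.0 if active_subgoal in WAR_SUBGOALS else 0.0
    return (w.w1 * d_territory_value
            + w.w2 * d_army_ratio
            - w.w3 * casualty_cost
            - w.w4 * supply_cost
            + w.w5 * bonus)
```

For example, `war_reward(1.0, 0.2, 0.5, 0.3, "TRAIN")` works out to 0.40 + 0.05 - 0.10 - 0.03 + 0.05 = 0.37, while the same inputs under `FORTIFY` lose the 0.05 alignment bonus. The Economy and Diplomacy rewards follow the same weighted-sum shape.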
2b. Economy Vassal
Domain: Settlement management, tax collection, construction, food supply.
Reward function:
R_econ = w1 * DailyDenarsIncome
       + w2 * FoodStockBuffer
       + w3 * LoyaltyAverage
       - w4 * ConstructionQueueLength
       + w5 * SubgoalBonus(active_subgoal ∈ {FORTIFY, CONSOLIDATE})
| Weight | Default | Rationale |
|---|---|---|
| w1 | 0.35 | Income is the fuel for everything |
| w2 | 0.25 | Starvation causes immediate loyalty crash |
| w3 | 0.20 | Low loyalty triggers revolt |
| w4 | 0.15 | Idle construction is opportunity cost |
| w5 | 0.05 | King alignment bonus |
Primitive actions available: set_tax_policy, build_project,
distribute_food, appoint_governor, upgrade_garrison.
2c. Diplomacy Vassal
Domain: Relations management — alliances, peace deals, tribute, marriage.
Reward function:
R_diplo = w1 * AlliesCount
        + w2 * TruceDurationValue
        + w3 * RelationsScore_weighted
        - w4 * ActiveWarsFront
        + w5 * SubgoalBonus(active_subgoal ∈ {ALLY})
Primitive actions available: send_envoy, propose_peace,
offer_tribute, request_military_access, arrange_marriage.
3. Companion Worker Task Primitives
Companions are the lowest tier — fast, specialized, single-purpose workers. They run on Qwen3:8b (or smaller) for sub-2-second response times. Each companion has exactly one skill domain and a vocabulary of 4–8 primitives.
3a. Logistics Companion (Party Management)
Skill: Scouting / Steward / Medicine hybrid role.
| Primitive | Effect | Trigger |
|---|---|---|
| `recruit_troop(type, qty)` | Buy troops at nearest town | RECRUIT subgoal |
| `buy_supplies(qty)` | Purchase food for march | Party food < 3 days |
| `rest_party(days)` | Idle in friendly town | Wound % > 30% or HEAL subgoal |
| `sell_prisoners(loc)` | Convert prisoners to denars | Prison > capacity |
| `upgrade_troops()` | Spend XP on troop upgrades | After battle or TRAIN |
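The trigger column above can be read as a set of simple predicates over party state. The sketch below is one possible encoding, not the project's code: the function name, its parameters, and the ordering of the returned primitives are hypothetical, and it returns primitive names only rather than full `TaskMessage` objects.

```python
from typing import Optional


def logistics_actions(
    food_days: float,
    wounded_fraction: float,
    prisoners: int,
    prisoner_capacity: int,
    active_subgoal: Optional[str] = None,
    after_battle: bool = False,
) -> list[str]:
    """Return the names of primitives whose trigger condition currently holds."""
    actions = []
    if active_subgoal == "RECRUIT":
        actions.append("recruit_troop")
    if food_days < 3:  # party food below 3 days
        actions.append("buy_supplies")
    if wounded_fraction > 0.30 or active_subgoal == "HEAL":
        actions.append("rest_party")
    if prisoners > prisoner_capacity:
        actions.append("sell_prisoners")
    if after_battle or active_subgoal == "TRAIN":
        actions.append("upgrade_troops")
    return actions
```

With two days of food and no other pressure, `logistics_actions(2.0, 0.1, 0, 10)` returns only `buy_supplies`; several triggers can fire at once, in which case the companion would still need a rule for ordering them.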
3b. Caravan Companion (Trade)
Skill: Trade / Charm.
| Primitive | Effect | Trigger |
|---|---|---|
| `assess_prices(town)` | Query buy/sell prices | Entry to settlement |
| `buy_goods(item, qty)` | Purchase trade goods | Positive margin ≥ 15% |
| `sell_goods(item, qty)` | Sell at target settlement | Reached destination |
| `establish_caravan(town)` | Deploy caravan NPC | TRADE subgoal + denars > 10k |
| `abandon_route()` | Return to main party | Caravan threatened |
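The 15% margin trigger for `buy_goods` can be made concrete with a small helper. This sketch assumes the margin is measured relative to the buy price, which the table does not specify; `trade_margin` and `should_buy` are hypothetical names.

```python
def trade_margin(buy_price: float, sell_price: float) -> float:
    """Margin relative to the buy price (an assumption about the definition)."""
    if buy_price <= 0:
        raise ValueError("buy_price must be positive")
    return (sell_price - buy_price) / buy_price


def should_buy(buy_price: float, sell_price: float, min_margin: float = 0.15) -> bool:
    """True when the expected margin meets the buy_goods trigger threshold."""
    return trade_margin(buy_price, sell_price) >= min_margin
```

Buying at 100 denars to sell at 120 gives a 20% margin and passes the check; selling at 110 gives 10% and does not.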
3c. Scout Companion (Intelligence)
Skill: Scouting / Roguery.
| Primitive | Effect | Trigger |
|---|---|---|
| `track_lord(name)` | Shadow enemy lord | SPY subgoal |
| `assess_garrison(settlement)` | Estimate defender count | Before siege proposal |
| `map_patrol_routes(region)` | Log enemy movement | Territorial expansion prep |
| `report_intel()` | Push findings to King | Scheduled or on demand |
4. Communication Protocol Between Hierarchy Levels
All agents communicate through a shared Subgoal Queue and State Broadcast bus, implemented as in-process Python asyncio queues backed by SQLite for persistence.
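A minimal sketch of an asyncio queue mirrored to SQLite is shown below. It is an assumption about how the two could be combined, not the existing event bus: the class name `PersistentQueue`, the `messages` table, and the delete-on-get policy are all hypothetical. Note that the `sqlite3` calls are synchronous and briefly block the event loop.

```python
import asyncio
import json
import sqlite3


class PersistentQueue:
    """asyncio.Queue whose pending messages are mirrored to a SQLite table."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()
        self._queue: asyncio.Queue = asyncio.Queue()
        # Reload anything left pending by a previous run.
        rows = self._db.execute("SELECT id, body FROM messages ORDER BY id").fetchall()
        for row_id, body in rows:
            self._queue.put_nowait((row_id, json.loads(body)))

    async def put(self, message: dict) -> None:
        """Persist first, then enqueue, so a crash cannot lose the message."""
        cur = self._db.execute(
            "INSERT INTO messages (body) VALUES (?)", (json.dumps(message),)
        )
        self._db.commit()
        await self._queue.put((cur.lastrowid, message))

    async def get(self) -> dict:
        """Dequeue and drop the persisted copy (no separate acknowledgement)."""
        row_id, message = await self._queue.get()
        self._db.execute("DELETE FROM messages WHERE id = ?", (row_id,))
        self._db.commit()
        return message


async def demo() -> list:
    queue = PersistentQueue()  # created inside the running event loop
    await queue.put({"msg_type": "subgoal", "token": "FORTIFY"})
    await queue.put({"msg_type": "subgoal", "token": "TRADE"})
    return [await queue.get(), await queue.get()]
```

Deleting the row on `get` keeps the sketch short; a variant that deletes only after the consumer reports completion would survive a crash mid-task at the cost of possible redelivery.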
Message Types
from datetime import datetime
from typing import Any, Literal

from pydantic import BaseModel


class SubgoalMessage(BaseModel):
    """King → Vassal direction"""
    msg_type: Literal["subgoal"] = "subgoal"
    from_agent: Literal["king"]
    to_agent: str                   # "war_vassal", "economy_vassal", etc.
    subgoal: KingSubgoal
    issued_at: datetime


class TaskMessage(BaseModel):
    """Vassal → Companion direction"""
    msg_type: Literal["task"] = "task"
    from_agent: str                 # "war_vassal", etc.
    to_agent: str                   # "logistics_companion", etc.
    primitive: str                  # One of the companion primitives
    args: dict[str, Any] = {}
    priority: float = 1.0
    issued_at: datetime


class ResultMessage(BaseModel):
    """Companion/Vassal → Parent direction"""
    msg_type: Literal["result"] = "result"
    from_agent: str
    to_agent: str
    success: bool
    outcome: dict[str, Any]         # Primitive-specific result data
    reward_delta: float             # Computed reward contribution
    completed_at: datetime


class StateUpdateMessage(BaseModel):
    """GABS → All agents (broadcast)"""
    msg_type: Literal["state"] = "state"
    game_state: dict[str, Any]      # Full GABS state snapshot
    tick: int
    timestamp: datetime
Protocol Flow
GABS ──state_update──► King
                         │
                    subgoal_msg
                         │
            ┌────────────┼────────────┐
            ▼            ▼            ▼
       War Vassal   Econ Vassal  Diplo Vassal
            │            │            │
        task_msg     task_msg     task_msg
            │            │            │
        Logistics     Caravan       Scout
        Companion    Companion    Companion
            │            │            │
       result_msg   result_msg   result_msg
            │            │            │
            └────────────┼────────────┘
                         ▼
             King (reward aggregation)
Timing Constraints
| Level | Decision Frequency | LLM Budget |
|---|---|---|
| King | 1× per campaign day | 5–15 s |
| Vassal | 4× per campaign day | 2–5 s |
| Companion | On-demand / event-driven | < 2 s |
State updates from GABS arrive continuously; agents consume them at their own cadence. No agent blocks another's queue.
Conflict Resolution
If two vassals propose conflicting actions (e.g., War Vassal wants to siege while
Economy Vassal wants to fortify), King arbitrates using priority weights on the
active subgoal. The highest-priority active subgoal wins resource contention.
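The arbitration rule can be sketched as a single `max` over competing proposals. This is a hedged illustration: it assumes more than one subgoal may be active at a time, each with its own priority, and the `Proposal` type and `arbitrate` function are hypothetical. Ties go to the first proposal listed, which is a property of Python's `max` rather than a rule from this design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A vassal's requested action and the subgoal it claims to serve."""
    vassal: str
    action: str
    subgoal_token: str


def arbitrate(proposals: list, active_priorities: dict) -> Proposal:
    """Pick the proposal whose subgoal carries the highest active priority.

    Proposals serving a subgoal that is not active are scored 0.0.
    """
    if not proposals:
        raise ValueError("no proposals to arbitrate")
    return max(proposals, key=lambda p: active_priorities.get(p.subgoal_token, 0.0))


siege = Proposal("war_vassal", "siege_settlement", "EXPAND_TERRITORY")
fortify = Proposal("economy_vassal", "build_project", "FORTIFY")
winner = arbitrate([siege, fortify], {"EXPAND_TERRITORY": 1.5, "FORTIFY": 1.0})
```

Here `winner` is the siege proposal because `EXPAND_TERRITORY` carries priority 1.5 against 1.0 for `FORTIFY`.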
5. Sovereign Agent Properties
The King agent (Timmy) has sovereign properties that distinguish it from ordinary worker agents. These map directly to Timmy's existing identity architecture.
5a. Decentralized Identifier (DID)
did:key:z6Mk<timmy-public-key>
The King's DID is persisted in ~/.timmy/identity.json (existing SOUL.md pattern).
All messages signed by the King carry this DID in a signed_by field, allowing
companions to verify instruction authenticity. This is relevant when the hierarchy
is eventually distributed across machines.
5b. Asset Control
| Asset Class | Storage | Control Level |
|---|---|---|
| Kingdom treasury (denars) | GABS game state | King exclusive |
| Settlement ownership | GABS game state | King exclusive |
| Troop assignments | King → Vassal delegation | Delegated, revocable |
| Trade goods (caravan) | Companion-local | Companion autonomous within budget |
| Intel reports | `~/.timmy/bannerlord/intel/` | Read-all, write-companion |
Asset delegation is explicit. Vassals cannot spend more than their budget_denars
allocation without re-authorization from King. Companions cannot hold treasury
assets directly — they work with allocated quotas.
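The budget rule can be illustrated with a small in-memory ledger. This is a sketch under stated assumptions, not the planned `ledger.py`: only the `budget_denars` name comes from this document, while `VassalBudget`, `BudgetExceeded`, `spend`, and `reauthorize` are hypothetical.

```python
class BudgetExceeded(Exception):
    """Raised when a vassal tries to spend past its allocation."""


class VassalBudget:
    """Tracks spending against a King-granted budget_denars allocation."""

    def __init__(self, budget_denars: int) -> None:
        self.budget_denars = budget_denars
        self.spent = 0

    def spend(self, amount: int) -> None:
        """Record a spend, refusing anything that would exceed the allocation."""
        if amount < 0:
            raise ValueError("amount must be non-negative")
        if self.spent + amount > self.budget_denars:
            raise BudgetExceeded(
                f"{self.spent + amount} denars exceeds budget of {self.budget_denars}"
            )
        self.spent += amount

    def reauthorize(self, extra_denars: int) -> None:
        """Called on the King's behalf to raise the allocation."""
        self.budget_denars += extra_denars
```

A vassal with a 1,000-denar allocation can spend 600 and then 400, but a further spend raises `BudgetExceeded` until the King calls `reauthorize`.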
5c. Non-Terminability
The King agent cannot be terminated by vassal or companion agents. Termination authority is reserved for:
- The human operator (Ctrl+C or `timmy stop`)
- A `SHUTDOWN` signal from the top-level orchestrator
Vassals can pause themselves (e.g., awaiting GABS state) but cannot signal the King to stop. This prevents a misbehaving military vassal from ending the campaign.
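Implementation: King runs in the main asyncio event loop. Vassals and companions
run in asyncio.TaskGroup subgroups. Only the King's task holds a reference to
the TaskGroup and its child tasks, so only the King can cancel them.
(asyncio.TaskGroup exposes no separate cancel-scope object; cancelling the task
that entered the group cancels its children.)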
Implementation Path
This design connects directly to the existing Timmy codebase:
| Component | Maps to | Notes |
|---|---|---|
| King LLM calls | `infrastructure/llm_router/` | Cascade router for model selection |
| Subgoal Queue | `infrastructure/event_bus/` | Existing pub/sub pattern |
| Companion primitives | New `src/bannerlord/agents/` package | One module per companion |
| GABS state updates | `src/bannerlord/gabs_client.py` | TCP JSON-RPC, port 4825 |
| Asset ledger | `src/bannerlord/ledger.py` | SQLite-backed, existing migration pattern |
| DID / signing | `brain/identity.py` | Extends existing SOUL.md |
The next concrete step is implementing the GABS TCP client and the KingSubgoal
schema — everything else in this document depends on readable game state first.
References
- Ahilan, S. & Dayan, P. (2019). Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning. https://arxiv.org/abs/1901.08492
- Rood, S. (2022). Scaling Reinforcement Learning through Feudal Hierarchy (NPS thesis).
- Wang, G. et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. https://arxiv.org/abs/2305.16291
- Park, J.S. et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. https://arxiv.org/abs/2304.03442
- Silveira, T. (2022). CiF-Bannerlord: Social AI Integration in Bannerlord.