Phase 9: Simulation Orchestration - Devlog
Summary
Phase 9 implements the top-level simulation orchestrator — the master loop that ties all 8 prior phases into coherent multi-scale campaign simulations. This is the first time the simulation can run end-to-end: scenario load → environment evolving → AI commanders deciding → orders propagating → units moving → detection → combat → morale → logistics → victory evaluation.
Test count: 372 new tests (3,586 total: 3,214 prior + 372 Phase 9)
Source files: 8 modules in stochastic_warfare/simulation/
YAML data: 4 new campaign scenario files
Dependencies: None new
What Was Built
Source Modules (8 files)
| Module | Purpose | Tests |
|---|---|---|
| simulation/__init__.py | Package init | - |
| simulation/scenario.py | Pydantic campaign YAML schema, SimulationContext, ScenarioLoader | 78 |
| simulation/victory.py | VictoryEvaluator with 5 condition types, ObjectiveState tracking | 55 |
| simulation/recorder.py | SimulationRecorder subscribing to EventBus, state snapshots | 37 |
| simulation/metrics.py | CampaignMetrics static analysis (time series, summaries) | 35 |
| simulation/battle.py | BattleManager: tactical loop, engagement detection, deferred damage | 42 |
| simulation/campaign.py | CampaignManager: strategic ticks, reinforcements, supply | 23 |
| simulation/engine.py | SimulationEngine: master loop, resolution switching, checkpoint | 62 |
Integration Tests
tests/integration/test_phase9_integration.py: 40 tests covering full end-to-end scenarios
YAML Scenario Files (4 new)
| File | Purpose |
|---|---|
| data/scenarios/test_campaign/scenario.yaml | Minimal: 2 sides, 1 objective, 24h |
| data/scenarios/test_campaign_multi/scenario.yaml | Multiple engagement points, 2 objectives, 48h |
| data/scenarios/test_campaign_reinforce/scenario.yaml | 3 reinforcement waves, 24h |
| data/scenarios/test_campaign_logistics/scenario.yaml | Supply chain emphasis, multiple depots, 72h |
Performance Optimizations
- LOS result caching: Per-tick cache in terrain/los.py keyed on (obs_row, obs_col, tgt_row, tgt_col, obs_height_cm, tgt_height_cm). Cleared at tick start via clear_los_cache().
- Pathfinding threat cost caching: A threat cost cache scoped to each find_path() call complements the existing terrain difficulty cache.
- Engine integration: LOS cache cleared at start of each tick in SimulationEngine.step().
Design Decisions
DD-1: No New ModuleId Value
Same reasoning as Phases 7-8. RNGManager._initialize() spawns len(ModuleId) child SeedSequences. Adding values changes spawn count, breaking deterministic replay. The orchestrator coordinates; domain modules provide randomness. Campaign manager reuses ModuleId.CORE.
DD-2: Shared SimulationContext (No Sub-Simulation)
No sub-simulation spawn. The engine shares one master clock and switches tick resolution. Data flows through the shared SimulationContext — no boundary serialization needed. This keeps deterministic replay simple and avoids RNG stream splitting complexity.
DD-3: Tick Resolution via Clock
TickResolution enum (STRATEGIC/OPERATIONAL/TACTICAL) with automatic switching:
- STRATEGIC (3600s): No forces in contact
- OPERATIONAL (300s): Transitional after battle concludes
- TACTICAL (5s): Active engagements in progress
Switching is automatic based on BattleManager.active_battles.
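A minimal sketch of the selection rule, assuming the enum values are the tick durations in seconds. The select_resolution function and its battle_just_ended flag are hypothetical: the devlog does not say how the engine recognises the transitional state, so that input is an assumption.

```python
from enum import Enum


class TickResolution(Enum):
    # Durations in seconds, as listed in DD-3.
    STRATEGIC = 3600.0
    OPERATIONAL = 300.0
    TACTICAL = 5.0


def select_resolution(active_battles: int, battle_just_ended: bool) -> TickResolution:
    """Pick a tick resolution from battle state (hypothetical helper)."""
    if active_battles > 0:
        return TickResolution.TACTICAL      # engagements in progress
    if battle_just_ended:
        return TickResolution.OPERATIONAL   # transitional period (assumed flag)
    return TickResolution.STRATEGIC         # no forces in contact


resolution = select_resolution(active_battles=2, battle_just_ended=False)
tick_seconds = resolution.value
```

In the engine the chosen duration would then be passed to SimulationClock.set_tick_duration(), which the devlog names as the existing hook.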
DD-4: ScenarioLoader Reuses Phase 7 Patterns
ScenarioLoader delegates force building and weapon/sensor assignment to existing validation/scenario_runner.py functions (build_terrain, build_forces, _assign_weapons, _assign_sensors). This avoids duplicating complex wiring logic.
DD-5: No Domain Logic in simulation/
Strictly enforced. simulation/ contains only: sequencing (which module updates in what order), data routing (passing outputs from one module as inputs to the next), resolution management, and state collection.
DD-6: Recorder Subscribes to Event Base Class
MRO dispatch ensures all event subtypes are captured. Recording is always on during a run; filtering is post-processing.
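To illustrate why a single subscription to the base class is enough, here is a toy bus that dispatches along the event's method resolution order. MiniEventBus, DamageEvent, and their fields are hypothetical stand-ins; the project's actual EventBus API is not shown in this devlog.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    timestamp_s: float


@dataclass(frozen=True)
class DamageEvent(Event):
    target_id: str


class MiniEventBus:
    """Toy bus (assumption): handlers are keyed by event class."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: Event) -> None:
        # Walk the MRO so subscribers to a base class also receive subclasses.
        for cls in type(event).__mro__:
            for handler in self._handlers.get(cls, ()):
                handler(event)


recorded: list[Event] = []
bus = MiniEventBus()
bus.subscribe(Event, recorded.append)  # recorder listens on the base class only
bus.publish(DamageEvent(timestamp_s=5.0, target_id="tank_1"))
```

The recorder never needs to know which subtypes exist, which is what makes always-on recording with post-hoc filtering cheap to maintain.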
DD-7: Fixed Reinforcement Schedule
Scenario YAML defines exact arrival times. The campaign manager checks the schedule each tick and spawns arriving units via UnitLoader. Poisson random arrivals deferred.
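The per-tick schedule check can be sketched as below. The Reinforcement dataclass, its field names, and due_reinforcements are hypothetical; the real schema and the UnitLoader call are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Reinforcement:
    # Hypothetical fields; the scenario YAML schema may differ.
    arrival_time_s: float
    unit_type: str
    arrived: bool = False


def due_reinforcements(schedule: list[Reinforcement], now_s: float) -> list[Reinforcement]:
    """Return entries whose arrival time has passed and mark them as arrived."""
    due = [r for r in schedule if not r.arrived and r.arrival_time_s <= now_s]
    for entry in due:
        entry.arrived = True  # prevents spawning the same wave twice
    return due


schedule = [
    Reinforcement(3600.0, "infantry_platoon"),
    Reinforcement(7200.0, "tank_platoon"),
]
first = due_reinforcements(schedule, now_s=3600.0)
repeat = due_reinforcements(schedule, now_s=3600.0)
later = due_reinforcements(schedule, now_s=10800.0)
```

Comparing with <= rather than == matters because strategic ticks are coarse: a wave scheduled between two tick boundaries still arrives on the next tick.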
Deviations from Plan
- Test counts differ from plan estimates: Plan estimated ~370 tests, achieved 372. Close match.
- LOSEngine wired into SimulationContext: Performance optimization agent added los_engine field to SimulationContext and wired it in ScenarioLoader. Not in original plan but essential for LOS caching integration.
- Viewshed vectorization deferred: Plan marked this as lower priority; confirmed as such during implementation.
- STRtree still deferred: Infrastructure spatial queries continue using brute-force. Profile data needed to justify complexity.
- Force aggregation/disaggregation deferred: Original development-phases.md scope included this. Not attempted; all units remain at individual resolution. Documented as known limitation #1.
- Multi-scale spatial transitions deferred: Original scope mentioned strategic graph ↔ tactical grid ↔ unit continuous transitions. Implementation uses temporal resolution switching (tick duration) only, not spatial scale transitions. The single shared SimulationContext operates at one spatial resolution.
- Campaign loop profiling deferred: Full cProfile-based optimization deferred to Phase 10 when real campaign scenarios provide meaningful workloads.
Issues & Fixes
- Heightmap.data private attribute: Tests referenced hm.data.shape but Heightmap stores data as _data. Fixed by using hm.shape (public property).
- ScenarioRunner import in _create_engines: Referenced ScenarioRunner._build_morale_config() without importing. Fixed with local import.
- 1-hour campaign with 3600s tick: Test step_returns_false_when_not_over failed because first tick reached time limit. Fixed by using 24-hour campaign.
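The arithmetic behind the last fix is easy to check. This sketch assumes the campaign counts as over once elapsed time reaches the configured duration; ticks_to_time_limit is a hypothetical helper, not part of the engine.

```python
import math


def ticks_to_time_limit(campaign_duration_s: float, tick_s: float) -> int:
    """Number of ticks before elapsed time reaches the campaign duration."""
    return math.ceil(campaign_duration_s / tick_s)


# With a 3600 s strategic tick, a 1-hour campaign ends on its very first step,
# so a test expecting "not over" after one step cannot pass.
one_hour = ticks_to_time_limit(1 * 3600, 3600)

# A 24-hour campaign leaves 23 steps that still report "not over".
one_day = ticks_to_time_limit(24 * 3600, 3600)
```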
Known Limitations / Post-MVP Refinements
- No force aggregation/disaggregation: all units at individual resolution, no strategic "force blobs"
- Single-threaded simulation loop: required for deterministic PRNG replay
- No auto-resolve option: every engagement runs full tactical resolution
- Simplified strategic movement: units move without detailed operational pathfinding
- Fixed reinforcement schedule: no Poisson/stochastic arrivals
- No naval campaign management: structurally supported but not tested with naval scenarios
- Synthetic terrain only: programmatic heightmaps, not real topographic data
- LOS cache is per-tick only: cleared each tick after movement, no multi-tick memoization
- No weather evolution mid-campaign beyond what WeatherEngine.step() provides when wired
- Viewshed vectorization deferred: lower priority per plan
- STRtree for infrastructure spatial queries still deferred: waiting for profiling data
Lessons Learned
- ScenarioLoader is the most complex single function: Wiring 11 domain modules requires careful import ordering and parameter threading. Reusing Phase 7 patterns (build_forces, build_terrain) prevented significant duplication.
- Resolution switching is simple once clock supports it: SimulationClock.set_tick_duration() was already in Phase 0. The engine just calls it when battle state changes.
- Mock contexts are essential for fast unit tests: Full ScenarioLoader.load() takes ~0.5s (YAML parsing, terrain generation). Mock SimulationContext with only needed fields makes engine tests run in <0.3s total.
- Deferred damage pattern carries forward: Battle manager's tactical loop reuses the same deferred damage pattern from Phase 7's scenario runner.
- Victory evaluator must update objectives before checking conditions: Calling update_objective_control() before evaluate() ensures territory control reflects current unit positions.
- Per-tick LOS caching requires engine integration: The cache must be cleared at tick start (after movement), which means the engine must know about the LOS engine. This is why the los_engine field was added to SimulationContext.
- Background agents work well for independent tasks: Integration tests and performance optimization ran in parallel with no conflicts.
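The ordering lesson about the victory evaluator can be shown with a toy example. Only the method names update_objective_control() and evaluate() come from this devlog; the Objective fields, the argument shapes, and the rule that control changes when exactly one side is inside the radius are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Objective:
    # Hypothetical fields for a circular objective area.
    x: float
    y: float
    radius: float
    controller: Optional[str] = None


class MiniVictoryEvaluator:
    """Toy evaluator (assumption): one objective, control decided by presence."""

    def __init__(self, objective: Objective) -> None:
        self.objective = objective

    def update_objective_control(self, units: list[tuple[str, float, float]]) -> None:
        obj = self.objective
        sides_present = {
            side for side, x, y in units
            if (x - obj.x) ** 2 + (y - obj.y) ** 2 <= obj.radius ** 2
        }
        # Assumed rule: control changes only when a single side is present.
        if len(sides_present) == 1:
            obj.controller = next(iter(sides_present))

    def evaluate(self, side: str) -> bool:
        return self.objective.controller == side


evaluator = MiniVictoryEvaluator(Objective(x=0.0, y=0.0, radius=100.0))
units = [("blue", 10.0, 20.0), ("red", 900.0, 900.0)]

stale_result = evaluator.evaluate("blue")   # control not refreshed yet
evaluator.update_objective_control(units)
fresh_result = evaluator.evaluate("blue")   # now reflects unit positions
```

Evaluating first reads control state left over from the previous tick, which is the failure mode the lesson warns about.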