How it works

"Clean-room" here means exactly what it means in the ScummVM / OpenRCT2 / OpenMW / OpenLara projects: every line of code in the engine port is fresh Rust, written from format documentation + decompile-then-rewrite logic, not by auto-translating MIPS assembly. We read the Ghidra dumps to understand what each function does; we write the Rust to do that thing idiomatically. The decompiled C in ghidra/scripts/funcs/*.txt is reference material, not committable engine code.

The legal posture: zero Sony bytes ship in the repo or in any released binary. That covers the game executable, asset data, decompressed Sony strings, and decompiled-C dumps with literal data; all of these are gitignored. The engine binary contains no game data until you point it at your own disc image. CI runs without disc data, so disc-dependent tests skip when LEGAIA_DISC_BIN is unset. This is the same model used by the projects above and is well-established legal territory.
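A minimal sketch of the skip-when-unset pattern for disc-dependent tests. Only the LEGAIA_DISC_BIN variable name comes from this document; the helper names and the test body are hypothetical, not the project's actual code.

```rust
use std::env;
use std::path::PathBuf;

/// Pure helper so the gating rule can be tested without touching the
/// process environment. An unset or empty value means "no disc available".
fn disc_path_from(var: Option<String>) -> Option<PathBuf> {
    var.filter(|v| !v.is_empty()).map(PathBuf::from)
}

/// Reads the disc image path from `LEGAIA_DISC_BIN`, if present.
fn disc_path() -> Option<PathBuf> {
    disc_path_from(env::var("LEGAIA_DISC_BIN").ok())
}

#[test]
fn disc_image_is_readable_when_configured() {
    // Hypothetical disc-gated test: returns early (passes vacuously)
    // when no disc is configured, which is the CI case.
    let Some(path) = disc_path() else {
        eprintln!("LEGAIA_DISC_BIN unset; skipping disc-dependent test");
        return;
    };
    assert!(path.exists(), "LEGAIA_DISC_BIN points at a missing file");
}
```

Keeping the decision in a pure function (disc_path_from) means the gate itself can be unit-tested on CI, where no disc is ever present.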

Goal & non-goals

Goal: a playable port of Legend of Legaia (NA SCUS-94254) on modern systems via Rust + wgpu, with optional WASM/web target. JP/EU regions land after NA is solid.

Non-goals:

  • Improving the game (no HD remaster, no balance changes, no QoL beyond what the original supported).
  • Modding kit (useful as a side-effect, not as a designed deliverable).
  • Translation work.
  • Static recompilation of SCUS_942.54. The engine is clean-room from documented specs and decompile-then-rewrite logic - not auto-translated MIPS.

Architectural principles

  • Asset crates stay engine-agnostic. crates/tim, crates/tmd, etc. don't depend on wgpu / SDL3 / cpal. They produce typed in-memory representations; the engine layer turns those into GPU resources / audio buffers.
  • Mockable I/O for tests. The disc read path is abstracted via crates/iso::RawDisc; the same pattern extends to file-system extraction so tests can run without a disc.
  • Deterministic gameplay. RNG seeded from a known value; physics tick on a fixed timestep. Required for any future TAS / verification work.
  • No "fix the bug" temptation. If the original game has quirky damage rounding or oddly-timed cutscenes, replicate them. Behavioural fidelity is in scope; QoL is not.
  • Behaviour tests against runtime traces. Long-term, capture inputs + RNG + frame outputs from the original game, replay through the engine, diff. The asset-viewer phase landed enough infrastructure to make this possible later.
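The determinism principle above can be sketched as a seeded generator plus a fixed-timestep accumulator. Everything here is illustrative: the LCG is a textbook generator (not claimed to be the game's RNG), and Lcg, FixedStep, Sim, and replay are hypothetical names.

```rust
/// Textbook 32-bit LCG (Numerical Recipes constants). Illustrative only.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Lcg(pub u32);

impl Lcg {
    pub fn next_u32(&mut self) -> u32 {
        // Wrapping arithmetic gives identical results on every platform.
        self.0 = self.0.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
        self.0
    }
}

/// Converts elapsed wall-clock time into a whole number of fixed ticks,
/// so frame-rate jitter never changes what a single tick computes.
pub struct FixedStep {
    pub step_us: u64,
    pub accumulator_us: u64,
}

impl FixedStep {
    pub fn new(step_us: u64) -> Self {
        assert!(step_us > 0, "timestep must be non-zero");
        Self { step_us, accumulator_us: 0 }
    }

    /// Returns how many simulation ticks to run for this elapsed time.
    pub fn advance(&mut self, elapsed_us: u64) -> u64 {
        self.accumulator_us += elapsed_us;
        let ticks = self.accumulator_us / self.step_us;
        self.accumulator_us %= self.step_us;
        ticks
    }
}

/// Hypothetical stand-in for gameplay state: it depends only on the seed
/// and the per-tick input bytes.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Sim {
    pub rng: Lcg,
    pub frame: u64,
    pub checksum: u32,
}

impl Sim {
    pub fn new(seed: u32) -> Self {
        Self { rng: Lcg(seed), frame: 0, checksum: 0 }
    }

    pub fn tick(&mut self, pad: u8) {
        let roll = self.rng.next_u32();
        self.checksum = self.checksum.rotate_left(5) ^ roll ^ u32::from(pad);
        self.frame += 1;
    }
}

/// Replays a recorded input log from a known seed.
pub fn replay(seed: u32, inputs: &[u8]) -> Sim {
    let mut sim = Sim::new(seed);
    for &pad in inputs {
        sim.tick(pad);
    }
    sim
}
```

Two replays of the same seed and input log end in identical state, which is the property a trace-diffing harness would rely on.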

Crate layering

iso          ← (none)
prot         → iso (conceptual)
lzs          ← (none)
asset        → lzs, prot
tmd          ← (none)
tim          ← (none)
xa           ← (none)
vab          → xa  (shares SPU-ADPCM F0/F1 filter constants)
mdt          ← (none)
mes          ← (none)
anm          ← (none)
extract      → all of the above

engine-core  ← (none)
engine-render → engine-core
engine-audio → engine-core
engine-vm    → engine-core
asset-viewer → engine-*, all parser crates

A future sound crate (sequencer playback for .spk sequences and the .dpk / .MAP / .PCH family) would depend on vab. A future battle / menu module belongs inside engine-vm next to the actor + field VMs rather than as a separate crate.

Phase 1 - asset viewer (de-risks integration)

A standalone binary that loads the disc, lets the user navigate PROT entries, and renders / plays them. Render API: winit + wgpu (Vulkan / Metal / DX12 / WebGPU backends). Audio: cpal-backed mixer.

Implemented

Crate | What's there
engine-core | Vfs trait + three backends: DirVfs (extracted-dir), DiscVfs (reads PROT.DAT / CDNAME.TXT directly from a .bin ISO9660 tree, no extraction step needed), MemoryVfs (WASM in-memory). AssetCache, FrameTime. SceneHost::open_disc(path) bootstraps the engine from a disc image; BootSession::open_disc(path, cfg) wraps it for the runtime. Every legaia-engine subcommand (info, list-scenes, play, play-window, save) accepts --disc PATH as an alternative to --extracted-root. Engine-agnostic, no GPU deps.
engine-render | Renderer (wgpu device + surface + textured-quad pipeline + flat / textured-mesh pipelines + lines pipeline). Aspect-preserving letterbox. Software PSX VRAM emulation (1024×512 R16Uint, per-prim CBA/TSB + 4/8/15bpp + CLUT decoded in fragment shader).
engine-audio | AudioOut (cpal-backed) + clean-room PSX SPU model (24-voice mixer, streaming ADPCM, ADSR, 512 KB SPU RAM, libspu-shaped transfer engine). VabBank::upload drops VAB bodies into SPU RAM; play_note translates a MIDI key into voice config + key-on. Sequencer drives a SEQ + VAB pair from the cpal callback.
asset-viewer | winit binary with subcommands: tim, tmd, stage, vab, prot.
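To illustrate the Vfs-with-backends idea, here is a minimal sketch using only the standard library. The trait shape, method names, and entry_len helper are assumptions made for this example; the real engine-core trait may look quite different, and DiscVfs is omitted because it needs an ISO9660 reader.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical minimal read-only virtual file system.
pub trait Vfs {
    fn read(&self, name: &str) -> io::Result<Vec<u8>>;
}

/// Directory-backed backend: serves files from an extracted tree.
pub struct DirVfs {
    root: PathBuf,
}

impl DirVfs {
    pub fn new(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }
}

impl Vfs for DirVfs {
    fn read(&self, name: &str) -> io::Result<Vec<u8>> {
        fs::read(self.root.join(name))
    }
}

/// In-memory backend: suits tests and a WASM build where the disc bytes
/// arrive as an uploaded buffer.
#[derive(Default)]
pub struct MemoryVfs {
    files: HashMap<String, Vec<u8>>,
}

impl MemoryVfs {
    pub fn insert(&mut self, name: &str, bytes: Vec<u8>) {
        self.files.insert(name.to_owned(), bytes);
    }
}

impl Vfs for MemoryVfs {
    fn read(&self, name: &str) -> io::Result<Vec<u8>> {
        self.files
            .get(name)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, name.to_owned()))
    }
}

/// Engine-side code depends only on the trait, never on a backend.
pub fn entry_len(vfs: &dyn Vfs, name: &str) -> io::Result<usize> {
    Ok(vfs.read(name)?.len())
}
```

Because callers take &dyn Vfs, a test can hand them a MemoryVfs filled with synthetic bytes and never touch a disc.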

The PROT browser dispatch handles tim_passthrough, tim_pack, data_field_streaming, scene_tmd_stream, scene_vab_stream, and a VAB byte-search fallback for any class with embedded banks.

Open Phase 1 milestones

  • XA stream playback (streaming voice in engine-audio).
  • Multi-voice mixer (the PSX SPU runs 24 voices; current mixer plays one).
  • ADSR shaping for VAB tones.
  • Per-vertex normals from the TMD per-object normal table (currently the renderer derives normals via screen-space derivatives, which is flat-shading).

Phase 2 - runtime port

Port the script VM, field-loader chain, and effect VM. Handler-by-handler translation: dump each opcode handler from Ghidra, hand-port to Rust, unit-test against captured runtime traces. Aim for behavioural fidelity per opcode, not byte-exactness of the VM internals.

Implemented

  • Actor VM - crates/engine-vm/src/lib.rs. All 13 opcodes ported, full unit-test coverage. Drives the title screen sprite cluster.
  • Field VM - crates/engine-vm/src/field.rs. All 43 explicit opcodes of FUN_801DE840 are ported with a FieldHost trait abstracting every SCUS callback. Cross-context dispatch (extended-bit prefix), YIELD caller-propagation, Op49State tristate (with the inline-MES walker for sub-0), the 0x4C outer-nibble dispatcher, the 0x38 halt-acquire path, and the 0x5x/0x6x/0x7x default-route fourth-flag-bank dispatchers are all wired.
  • Move VM - crates/engine-vm/src/move_vm.rs. All 71 main opcodes (0x00..=0x46) of FUN_80023070 ported, plus the 0x2F extension dispatcher (61 sub-opcodes via FUN_801D362C). Per-frame entry is actor_tick, mirroring the gate at FUN_80021DF4 + 0x80022B94.
  • Motion VM - crates/engine-vm/src/motion_vm.rs. All 6 opcodes ported including the 12-bit fixed-point angle-math opcodes 0x38 RotateToAngle and 0x4C FaceTarget.
  • Effect VM - crates/engine-vm/src/effect_vm.rs. Slot pool (32 master + 128 child slots), Pool::init / Pool::spawn / Pool::tick ports of FUN_801DE914 / FUN_801DFDF8 / FUN_801E0088, Pool::spawn_by_ui_id + EffectCatalog for UI-element routing.
  • Battle action state machine - crates/engine-vm/src/battle_action.rs. Port of FUN_801E295C (16 KB, the largest function in the battle overlay) as a per-frame edge-triggered state machine across 47 explicit states in 7 bands. Attack chain fires apply_damage at the swing-apex byte. The Tactical-Arts strike band additionally calls apply_art_strike(ArtStrikeInfo) with the per-strike power byte, dmg_timing, status effect, and hit cue resolved from the active actor's chosen art via BattleActionHost::art_record.
  • Title-overlay sub-mode dispatcher - crates/engine-vm/src/title_overlay.rs. 25-entry JT at 0x801CF244 (the per-frame FUN_801DD35C tick), state-struct field offsets, observed state[+0x204] = N transitions. Four modes are semantically labelled (Init, Idle, AttractIdle, AttractDelay); the other 21 carry Phase0xNN placeholders. Standout pin: Phase06 writes _DAT_8007B83C = 0x02 at 0x801DFC00 - the title-screen → main-game master-mode transition (exported as MASTER_GAME_MODE_FIELD_LAUNCH + PHASE06_LAUNCH_GAME_PC).
  • SCUS sprite-emit primitives - crates/engine-vm/src/title_prim.rs. Clean-room ports of the three SCUS helpers the title tick calls into: FUN_80058298 (ClearImage fill-rect), FUN_80058490 (MoveImage VRAM-copy), FUN_800198E0 (sprite-descriptor dispatcher with tag-0x11 + alpha-OR pre-pass + width-divisor variants). PrimHost trait abstracts the four engine callbacks. Overlay-side helpers (FUN_801E1C1C etc., shared across menu / battle / shop / save UI overlays) are deferred to their own port.
  • Composite world / actor system - crates/engine-core/src/world.rs. World owns the actor table, battle ctx, effect pool, field-VM ctx, per-actor move-VM buffers, shop/inn/level-up session state, tactical-arts tracker, and ANM AnimPlayer instances. World::tick drives all of them in order per frame.
  • Clean-room SPU mixer - crates/engine-audio/src/spu/. 24-voice SPU model with streaming ADPCM, ADSR, 512 KB SPU RAM, libspu-shaped transfer engine. BGM cross-fade (30-frame volume ramp) and sequencer pause gating. WASM path uses WebAudioOut (ScriptProcessorNode). See audio.
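As an illustration of the 12-bit angle math mentioned for the Motion VM, the sketch below turns toward a target along the shortest arc. It assumes the usual PSX convention that 4096 units make a full turn; the function names and the per-tick step limit are hypothetical, and this is not a port of the retail 0x38 handler.

```rust
/// Angles are 12-bit: 0x000..=0xFFF covers one full turn (assumed).
pub const ANGLE_MASK: i32 = 0x0FFF;
pub const HALF_TURN: i32 = 0x0800;
pub const FULL_TURN: i32 = 0x1000;

/// Signed shortest delta from `from` to `to`, in -2048..=2047 units.
pub fn shortest_delta(from: u16, to: u16) -> i32 {
    // Masking a negative i32 wraps it into 0..=4095 (two's complement).
    let raw = (i32::from(to) - i32::from(from)) & ANGLE_MASK;
    if raw >= HALF_TURN { raw - FULL_TURN } else { raw }
}

/// Moves `current` toward `target` by at most `max_step` units,
/// wrapping through zero when that is the shorter way round.
pub fn rotate_toward(current: u16, target: u16, max_step: u16) -> u16 {
    let limit = i32::from(max_step);
    let step = shortest_delta(current, target).clamp(-limit, limit);
    ((i32::from(current) + step) & ANGLE_MASK) as u16
}
```

For example, rotating from 0x0F00 toward 0x0100 with a step of 0x80 moves forward through the wrap point to 0x0F80 rather than sweeping backwards across most of the circle.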

Phase 3 - gameplay assembly

All major gameplay systems are wired into engine-core::World and driven from engine-shell::BootSession.

  • Shop / Inn / Level-up - ShopSession, InnSession, LevelUpTracker in engine-core. MenuRuntime routes buy/sell/quantity/confirm/exit through session state; HP/MP restore wired on inn commit; XP distribution fires BattleEvent::LevelUp per character per level. LevelUpBanner (180-frame countdown, same shape as ArtLearnedBanner) set by apply_battle_xp, ticked by World::tick. level_up_draws_for() in engine-render produces a two-line yellow/green overlay (title + HP/MP gains); wired into play-window HUD at anchor (8, 60). Exact XP and per-level stat tables remain placeholder until full level-up overlay capture.
  • Tactical Arts learning UI - TacticalArtsTracker tracks per-char / per-art use counts; ArtLearnedBanner counts down in World::tick; BattleEvent::TacticalArtLearned fires when the threshold is crossed.
  • Status effects - crates/engine-vm/src/status_effects.rs tracks the eight retail conditions, named with the game's in-game ailment terms (Toxic / Numb / Venom / Sleep / Confuse / Curse / Stone / Faint), with per-instance turn counters and damage-over-time formulas (Toxic = max_hp / 16, Venom = current_hp / 8). World::tick_status_effects folds tick damage into BattleActor::hp; fold_battle_event pushes EnemyEffect bytes from art strikes into the tracker.
  • AP / Spirit gauge - crates/engine-core/src/ap_gauge.rs models the per-character AP budget (base 4, +1 per 10 levels capped at 10) plus the +5 Spirit-press bonus. art_ap_cost(action) mirrors the per-action-byte cost table; the world carries [ApGauge; 3] and resets all three at turn start.
  • Battle stat aggregator - crates/engine-core/src/battle_stats.rs. Clean-room port of FUN_80042558: walks 8 equipment slots, sums per-item modifiers (EquipmentTable), ORs ability bits into a 256-bit mask, folds in status-effect modifiers (Toxic -ATK/-DEF, Confuse halves accuracy, immobilising statuses zero evasion, Curse / Faint block Magic).
  • Item catalog - crates/engine-core/src/items.rs. Typed ItemEffect enum (Heal / Cure / Revive / StatBoost / Spirit / Capture / Escape / Damage / KeyItem); apply_effect(effect, &TargetSnapshot) -> ItemOutcome resolves the side-effect pure-functionally. Vanilla catalog ships 19 entries. World::use_item(item_id, target_slot) wraps the resolver and folds outcomes back into world state - HP / MP gains capped at the actor's max, status cure / cure_all clears the matching tracker entries, Spirit-restore items refund AP via ApGauge::refund.
  • Battle round lifecycle - crates/engine-core/src/battle_round.rs. BattleRound::begin(&mut world, &[StatRecord; 8], &EquipmentTable, &StatusModifiers) orchestrates per-round bookkeeping: resets every party AP gauge, recomputes per-slot BattleStats, and writes the resolved attack / UDF / LDF back into World::battle_attack / battle_defense_split. BattleRound::end(&mut world) ticks every actor's status, drains tick damage into BattleActor::hp, and returns the death count. The returned BattleRound carries action_blocked / magic_blocked arrays the action validator filters command input against.
  • Per-actor animation runtime - crates/engine-vm/src/anim_vm.rs. AnimRuntime::with_slots(N) manages a fixed actor pool that wraps AnimPlayer for the keyframe path and surfaces a Host::on_opaque_record hook for record-level side-effects (sprite swaps, voice cues). Per tick the runtime emits an AnimEvent stream (PoseUpdated / OpaqueTick / Finished / Replaced) so engines drive renderer / SFX side effects without polling per-actor state.
  • Per-actor physics tick - crates/engine-vm/src/actor_tick.rs. Layered port of FUN_80021DF4 (4732 bytes, 1183 instructions). ActorPhysics models the retail actor record's tick-relevant fields with offset annotations; tick_actor(physics, scalars, listener) runs the dispatch ladder. Each dispatch byte (0x01..=0x07) selects a layered subset of side-effects: common pre-update, keyframe accel (0x02 / 0x06), positional SFX emitter (0x05), path interpolation (0x03), default movement (every byte except 0x05), and the common late-update (env clamps, render submissions, keyframe pose write for 0x06). Cross-cutting effects surface as TickEvent entries (SfxUpdate / SfxRelease / SplineDraw / DampDraw / MoveVmKick / UnlinkRequest / KeyframePoseWritten) so engines drive their audio mixer / scene graph / move-VM driver from a single typed event stream.
  • Battle command runner - crates/engine-core/src/battle_runner.rs. BattleRunner sits between player input and the action SM: begin_round delegates to BattleRound::begin for AP refresh + stat recompute, push_command / push_chained_art gate input against ApGauge, commit_turn resolves the per-slot queue through resolve_action_queue (Miracle / Super expansion), and end_round drives BattleRound::end for tick-damage drainage. Per-slot buffers + chained-art lists let the player switch between party members mid-turn without losing state.
  • Battle HUD model - crates/engine-core/src/battle_hud.rs. Renderer-agnostic BattleHud holds per-slot HP / MP / AP / status icons, a queue of DamagePopups with fade timers, and a ringed log column. engine-render::battle_hud_draws_for turns it into a Vec<TextDraw>; engines feed the HUD from BattleEvent::ApplyArtStrike (popups), StatusEvent (icons), and BattleRound::begin / end (slot panels).
  • Inventory item-use session - crates/engine-core/src/inventory_use.rs. InventoryUseSession drives the "open inventory → pick item → pick target → use it" flow shared between the field menu and the battle command menu. Filters items by InventoryContext (battle vs field), validates target compatibility (Revive needs a dead target; everything else needs a live one), and folds the resolved ItemOutcome into world state via World::use_item.
  • SFX bank + scheduler - crates/engine-audio/src/sfx.rs. SfxBank maps cue IDs (the HitCue::kind byte from art records, plus engine-extended slots for menu blips / footsteps) to per-cue SfxEntry descriptors that delegate to VabBank::play_note. SfxScheduler::tick_frame drains a queue of PendingCues with retail-style timing_frames offsets so cues fire on the correct anim frame relative to the strike. The live battle loop drives it: the bank is decoded from the user's executable at boot and each resolved BattleSfxCue keys on through the per-scene VAB.
  • Menu sub-screens - MenuRuntime handles StatusCharacter / StatusEquipment / StatusInventory with cursor input, data-view methods, and commit side-effects (unequip slot, decrement inventory item).
  • Save / load - LGSF v1 self-describing binary (magic + story_flags + money + inventory pairs + party records); World::save_full / load_full; memory-card writeback via legaia_save::card::write_block.
  • BGM + audio - AudioBgmDirector cross-fades between tracks over 30 frames; sequencer pause gating; input::Mapping persists key bindings to TOML.
  • Windowed engine binary - legaia-engine play-window opens 960×720 via winit; play-str plays back PSX STR + XA in a window; config set --binding edits key maps.
  • WASM disc-bytes Vfs - MemoryVfs, Archive::from_bytes, SceneHost::from_prot_bytes, and LegaiaRuntime::load_disc / enter_scene / disc_loaded drive the in-browser engine from uploaded disc bytes. The viewer “Run engine” tab wires it in JS.
  • Region support - legaia_prot::Region enum (NA / EU / JP); ProtIndex::with_region(); documented in docs/reference/builds.md.
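Two of the formulas quoted in the list above are small enough to sketch directly: the AP budget and the damage-over-time ticks. The numbers come from this document; the function names, integer types, truncating division, and the choice to apply the cap before the Spirit bonus are assumptions of this sketch.

```rust
/// AP budget: base 4, +1 per 10 levels, capped at 10.
/// Assumption: the cap applies before the +5 Spirit-press bonus.
pub fn ap_budget(level: u8, spirit_pressed: bool) -> u8 {
    let base = (4 + level / 10).min(10);
    if spirit_pressed { base + 5 } else { base }
}

/// The two damage-over-time ailments with formulas given above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Dot {
    Toxic,
    Venom,
}

/// Per-tick damage: Toxic scales with max HP, Venom with current HP.
/// Integer division truncates (assumed rounding behaviour).
pub fn tick_damage(kind: Dot, current_hp: u16, max_hp: u16) -> u16 {
    match kind {
        Dot::Toxic => max_hp / 16,
        Dot::Venom => current_hp / 8,
    }
}

/// Folds one tick into the actor's HP, never dropping below zero.
pub fn apply_tick(kind: Dot, current_hp: u16, max_hp: u16) -> u16 {
    current_hp.saturating_sub(tick_damage(kind, current_hp, max_hp))
}
```

Note the qualitative difference: Toxic deals a constant amount per tick and can finish an actor off, while Venom shrinks as HP falls.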

Open Phase 3 items

  • Shop / inn exact item-price data pending shop overlay capture; current prices are synthetic placeholders.
  • Exact XP / stat-gain tables from the level-up overlay (current placeholder: 100×n² curve); banner render layer is wired, only the tables need the capture.
  • Scene-init ANM binding per-actor (blocked on tracing the 0x8007C018 pointer-table registration order).
  • legaia-engine play --scene cutsceneN PROT-scene routing (direct play-str <file> works; scene-entry routing pending STR-entry trace).
  • Exact field-VM WARP map_id → scene-name table (7 destinations; pre-WARP handler that sets DAT_80084548 not yet traced; DefaultMapIdResolver uses CDNAME sequential order as approximation).
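For reference, the placeholder 100×n² curve could be applied as follows. This sketch assumes n is the character level and that the value is the cumulative XP needed to reach that level; both readings, and the names xp_threshold and apply_xp, are hypothetical.

```rust
/// Placeholder curve: cumulative XP needed to reach `level`.
pub fn xp_threshold(level: u32) -> u64 {
    100 * u64::from(level) * u64::from(level)
}

/// Returns the new level and how many level-ups occurred, so the caller
/// can fire one level-up event per level gained.
pub fn apply_xp(level: u32, total_xp: u64, max_level: u32) -> (u32, u32) {
    let mut new_level = level;
    while new_level < max_level && total_xp >= xp_threshold(new_level + 1) {
        new_level += 1;
    }
    (new_level, new_level - level)
}
```

With this curve a level-1 character at 900 total XP crosses the level-2 (400) and level-3 (900) thresholds in one go, so two level-up events would fire.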

Provenance + memory hygiene

The decompiled C dumps under ghidra/scripts/funcs/ are reference material. Engine code in crates/engine-vm/ is fresh Rust written from the decompile - never paste, always rewrite from the documented spec.

Per-opcode tests live next to the port; they use synthetic bytecode (no Sony bytes) so the test suite stays clean-room.
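A sketch of what a per-opcode test on synthetic bytecode might look like. The three opcodes, the Host trait, and the Vm struct are invented for illustration and do not correspond to any Legaia opcode numbers or to the real FieldHost / PrimHost traits.

```rust
/// Hypothetical host callbacks a VM needs from the engine.
pub trait Host {
    fn play_sfx(&mut self, cue: u8);
}

#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    Continue,
    Yield,
    Halt,
}

/// Toy VM with an invented three-opcode instruction set.
#[derive(Default)]
pub struct Vm {
    pub pc: usize,
    pub acc: u8,
}

impl Vm {
    pub fn step(&mut self, code: &[u8], host: &mut dyn Host) -> Step {
        // Running off the end of the buffer halts instead of panicking.
        let Some(&op) = code.get(self.pc) else { return Step::Halt };
        match op {
            // 0x00 YIELD: hand control back until the next frame.
            0x00 => {
                self.pc += 1;
                Step::Yield
            }
            // 0x01 LOAD imm8: load the operand byte into the accumulator.
            0x01 => {
                let Some(&imm) = code.get(self.pc + 1) else { return Step::Halt };
                self.acc = imm;
                self.pc += 2;
                Step::Continue
            }
            // 0x02 SFX: ask the host to play the cue in the accumulator.
            0x02 => {
                host.play_sfx(self.acc);
                self.pc += 1;
                Step::Continue
            }
            _ => Step::Halt,
        }
    }
}

/// Test double that records every callback for later assertions.
#[derive(Default)]
pub struct RecordingHost {
    pub cues: Vec<u8>,
}

impl Host for RecordingHost {
    fn play_sfx(&mut self, cue: u8) {
        self.cues.push(cue);
    }
}
```

The bytecode in such a test is written by hand (for example [0x01, 0x2A, 0x02, 0x00]), so the suite exercises handler behaviour without containing any bytes from the game.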

Engine integration scenarios

scripts/engine/scenarios.toml declares scenarios that drive the headless BootSession for a fixed frame count and assert the SHA-256 of the resulting SaveFile byte stream matches a recorded baseline. Mirrors the byte-level mednafen scenarios manifest - both files live side by side so a feature touching either layer is forced to consider regression coverage on the other.

Schema lives in crates/engine-shell/src/scenarios.rs; the disc-gated runner in crates/engine-shell/tests/scenarios.rs exercises every entry. The CLI runner is legaia-engine scenarios [--bless]; the --bless flag rewrites the manifest in place with the observed hashes.

A scenario row whose expected_save_sha256 is empty is "unblessed" - the test reports the observed hash and skips assertion; the CLI runner exits non-zero unless --bless is on. That forces every new scenario to be reviewed once before it can drift silently.
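The blessed / unblessed rules above can be sketched as a small decision function. Hash computation is left out (it would need a SHA-256 implementation from outside the standard library); Verdict, judge, and cli_exit_code are hypothetical names, and the sketch assumes --bless overwrites every entry rather than only the empty ones.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    /// Recorded baseline matches the observed hash.
    Pass,
    /// Recorded baseline exists and differs from the observed hash.
    Mismatch,
    /// No baseline recorded yet; nothing to assert against.
    Unblessed,
    /// Baseline was (re)written with the observed hash.
    Blessed,
}

/// Compares one scenario's expected hash with the observed one.
/// With `bless` set, the expected value is overwritten in place.
pub fn judge(expected: &mut String, observed: &str, bless: bool) -> Verdict {
    if bless {
        *expected = observed.to_owned();
        return Verdict::Blessed;
    }
    if expected.is_empty() {
        Verdict::Unblessed
    } else if expected.eq_ignore_ascii_case(observed) {
        Verdict::Pass
    } else {
        Verdict::Mismatch
    }
}

/// CLI policy: an unblessed row is a failure unless --bless was given.
pub fn cli_exit_code(verdict: &Verdict) -> i32 {
    match verdict {
        Verdict::Pass | Verdict::Blessed => 0,
        Verdict::Mismatch | Verdict::Unblessed => 1,
    }
}

/// Test-runner policy: an unblessed row is reported but not asserted.
pub fn test_should_fail(verdict: &Verdict) -> bool {
    matches!(verdict, Verdict::Mismatch)
}
```

Splitting the verdict from the two policies makes the asymmetry explicit: the test suite tolerates an unblessed row, the CLI does not.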

VRAM diff harness

legaia-engine info --runtime-vram <bin> --vram-diff-png <path> and legaia-engine vram-oracle --runtime-vram <bin> already compare engine VRAM (built via SceneResources::build_targeted) against a runtime VRAM blob captured from a save state. The vram-oracle subcommand also exposes:

  • --rows-csv <path> - per-Y row CSV of pixel-level diff stats (y, runtime_nz, engine_nz, overlap, runtime_only, engine_only). Drift in any single row above a threshold (e.g. row 479 NPC CLUT) shows up as a high runtime_only count for that row only, which is the regression signature of a missed targeted-upload pass.
  • --clut-regions - one-line health report per documented CLUT band (NPC palette row 479, character / texture-page CLUT rows). A "<-- gap" marker flags the engine-missing case.
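A sketch of how the per-row statistics might be computed. The column names follow the --rows-csv description; the interpretation is an assumption of this example: "nz" counts non-zero 16-bit words, "overlap" counts positions non-zero in both images, and the "_only" columns count positions non-zero in one image but zero in the other.

```rust
#[derive(Debug, Default, PartialEq, Eq)]
pub struct RowStats {
    pub y: usize,
    pub runtime_nz: u32,
    pub engine_nz: u32,
    pub overlap: u32,
    pub runtime_only: u32,
    pub engine_only: u32,
}

/// Computes one `RowStats` per row of two equally sized word images.
/// For PSX VRAM the width would be 1024 words over 512 rows.
pub fn row_stats(runtime: &[u16], engine: &[u16], width: usize) -> Vec<RowStats> {
    assert_eq!(runtime.len(), engine.len(), "images must be the same size");
    assert!(width > 0 && runtime.len() % width == 0, "bad row width");
    runtime
        .chunks_exact(width)
        .zip(engine.chunks_exact(width))
        .enumerate()
        .map(|(y, (r_row, e_row))| {
            let mut s = RowStats { y, ..Default::default() };
            for (&r, &e) in r_row.iter().zip(e_row) {
                match (r != 0, e != 0) {
                    (true, true) => {
                        s.runtime_nz += 1;
                        s.engine_nz += 1;
                        s.overlap += 1;
                    }
                    (true, false) => {
                        s.runtime_nz += 1;
                        s.runtime_only += 1;
                    }
                    (false, true) => {
                        s.engine_nz += 1;
                        s.engine_only += 1;
                    }
                    (false, false) => {}
                }
            }
            s
        })
        .collect()
}

/// Renders the stats with the header order quoted in the text.
pub fn to_csv(rows: &[RowStats]) -> String {
    let mut out = String::from("y,runtime_nz,engine_nz,overlap,runtime_only,engine_only\n");
    for s in rows {
        out.push_str(&format!(
            "{},{},{},{},{},{}\n",
            s.y, s.runtime_nz, s.engine_nz, s.overlap, s.runtime_only, s.engine_only
        ));
    }
    out
}
```

Under this reading, a missed upload shows up as a row whose runtime_only count is high while its neighbours stay near zero.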

Pair with mednafen-state vram-dump --out-bin to get the runtime ground-truth blob, and with mednafen-state prim-dispatch-survey to confirm the per-prim renderer dispatch tables haven't drifted between the saves you're comparing.

Static-mask parity (vram_oracle_e1)

A save state's VRAM is a live snapshot: much of the texpage region is dynamic / residual state (animation frames, battle leftovers, scroll position). Comparing two captures of the same scene (town01 pre- vs post-battle) shows ~40% of the primary texture band differs between them, so a stateless engine pre-pass can never be byte-exact against a single snapshot. The disc-gated vram_oracle_e1 test therefore asserts against the static mask - the words identical across every same-scene capture (the scene's genuine static VRAM). For each scene with ≥ 2 captures it builds the engine VRAM with the field-mode DMA-every-TIM pre-pass (upload_all_tims) and asserts the engine never uploads a wrong texel on a static pixel in the texpage region, excluding the runtime-managed NPC / character CLUT band (vram_oracle::NPC_CLUT_BAND_ROWS, row 479 ±). Incompleteness is not flagged - the engine doesn't yet assemble every boot-resident texture (font / menu atlases) - but the correctness of what it does upload is.
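The static-mask idea can be sketched in a few lines. This is not the vram_oracle_e1 implementation: the function names are hypothetical, and the sketch assumes an engine word of 0 means "nothing uploaded here", which is how it avoids flagging incompleteness.

```rust
/// Builds the static mask: `mask[i]` is true when every capture of the
/// scene holds the same word at index `i`. Returns `None` with fewer
/// than two captures, since one snapshot cannot show what is stable.
pub fn static_mask(captures: &[&[u16]]) -> Option<Vec<bool>> {
    if captures.len() < 2 {
        return None;
    }
    let first = captures[0];
    assert!(
        captures.iter().all(|c| c.len() == first.len()),
        "captures must be the same size"
    );
    Some(
        (0..first.len())
            .map(|i| captures.iter().all(|c| c[i] == first[i]))
            .collect(),
    )
}

/// Returns the indices where the engine uploaded a word that disagrees
/// with the static reference. Zero engine words are skipped (assumed to
/// mean "not uploaded"), and `excluded` carves out runtime-managed
/// regions such as a CLUT band.
pub fn wrong_static_texels(
    engine: &[u16],
    reference: &[u16],
    mask: &[bool],
    excluded: impl Fn(usize) -> bool,
) -> Vec<usize> {
    assert!(engine.len() == reference.len() && engine.len() == mask.len());
    (0..engine.len())
        .filter(|&i| mask[i] && !excluded(i) && engine[i] != 0 && engine[i] != reference[i])
        .collect()
}
```

An empty result means every texel the engine did upload onto a static pixel is correct; missing uploads are deliberately not counted.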

The per-scene mask premise (“stable across same-scene captures = genuinely static”) has two capture-pinned failure modes, each with its own refinement:

  • Global shared bands are history-dependent. The befect_data effect-texture band (one disc source, resident across every field scene) carries a handful of pixels whose boot-resident value differs from the disc copy until a battle re-uploads the disc bytes. This is pinned at (853, 271): pre-battle / menu captures hold 0xFFFF words where the disc TIM, and every post-battle capture, holds 0x3333. When a scene's captures share battle history, the per-scene mask misclassifies those pixels as static, so refine_mask_with_shared_band demands staticity across all scenes' captures for cells inside scene::effect_texture_image_rects.
  • World-map CLUT palette cycling. Row 506's head is the 13-frame ocean CLUT animation (a capture holds an arbitrary phase, never the disc base CLUT), rows 508/509 each animate a few entries, row 508's entries 32..47 mirror its own 0..15 head, and row 506's tail holds a runtime-generated palette found in no disc bundle. WORLD_MAP_CLUT_CYCLE_ROWS / clear_world_map_clut_cycle_rows exclude the three rows for world-map scenes only (row 507, a non-animated terrain CLUT, stays asserted).

See also