How it works

At runtime, "load this scene" doesn't mean "load one file". A single battle scene needs character meshes, monster meshes, the active CLUTs (palettes), the per-monster sound bank, the visual-effect script archive, plus all the text the battle dialog box might print. They're scattered across dozens of PROT entries.

Each scene type has its own loader function whose job is to go fetch all that. They all share the same shape:

  • The loader is a small state machine with one numbered case per asset slot.
  • Each case calls the asset-type dispatcher FUN_8001F05C with the bytes for one entry.
  • Some cases also kick off async CD reads to stage things into memory while the rest of the scene is initialising.
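The shared shape can be sketched as follows. This is a minimal illustration, not the retail code: the slot names, the `SceneLoader` and `LoaderLog` types, and `dispatch_asset` (a stand-in for FUN_8001F05C that only counts calls) are all hypothetical, and the async CD reads are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the asset-type dispatcher (FUN_8001F05C in the
 * text). Here it only counts calls; the real one decodes and installs data. */
typedef struct {
    int dispatched; /* number of dispatcher calls made so far */
} LoaderLog;

static void dispatch_asset(LoaderLog *log, const uint8_t *bytes, size_t len)
{
    (void)bytes;
    (void)len;
    log->dispatched++;
}

/* Hypothetical slot numbering: one numbered case per asset slot. */
enum { SLOT_MESHES = 0, SLOT_CLUTS = 1, SLOT_SOUND_BANK = 2, SLOT_DONE = 3 };

typedef struct {
    int state;                /* which case runs on the next step */
    const uint8_t *entry[3];  /* bytes of the entry backing each slot */
    size_t entry_len[3];
} SceneLoader;

/* Runs one case and advances the state.
 * Returns 1 if a case ran, 0 once every slot has been handled. */
static int loader_step(SceneLoader *ld, LoaderLog *log)
{
    switch (ld->state) {
    case SLOT_MESHES:
    case SLOT_CLUTS:
    case SLOT_SOUND_BANK:
        dispatch_asset(log, ld->entry[ld->state], ld->entry_len[ld->state]);
        ld->state++;
        return 1;
    default:
        return 0;
    }
}
```

A caller would invoke `loader_step` repeatedly (for instance once per frame, which is an assumption here) until it returns 0.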

There's a dev/retail split that runs through nearly every loader. Dev builds load by PROT index directly - the index numbers are baked in. Retail builds resolve dev-style paths (h:\PROT\FIELD\<scene>\name) through FUN_8003E6BC, which uses the CDNAME.TXT name map to find the corresponding PROT index. Both branches end up at the same files; only the lookup mechanism differs. That's why we don't see a literal "scene N uses PROT entries [a, b, c]" table anywhere in the executable - the table is the CDNAME block layout itself.

The asset descriptor walker (FUN_80020224) is the loop that walks a per-scene asset descriptor and calls the dispatcher per descriptor. It has zero static callers in SCUS_942.54, which would suggest "dead code" - except its sole runtime caller is the town overlay's MAIN_INIT, at the start of every field session. So it is exercised by retail gameplay; the call just lives in RAM.

Battle bundle

Function: FUN_800520F0
Shape: 11-case state machine

Notable cases:

Case    Loads
6       The befect_data bundle (PROT 0x369–0x36B)
0xE     Initialises the runtime effect 2-pack wrapper via FUN_801DE914
0xFF    Dispatches 0x801F17F8, the side-band streaming-effect handler that streams summon.dat and readef.dat

Two cases call FUN_8003E104(monster_idx, slot, dst_buf) to populate slots 7 and 8 with the active battle's monster sound banks - the per-monster body of h:\mpack\monster.snd. Each monster has a (start_lba, end_lba+1) entry pair in the TOC at 0x801C8980 - 0x10. See Audio → "Monster sound bank" for the full loader contract.

The asset-viewer's --bundle battle mode overlays the extraction 865–890 tim_scan window - an empirically-tuned VRAM set spanning the player battle files, the befect cluster, and the sound_data2 streams (the directory labels carry the +2 filename shift) - so character meshes have the right CLUT bindings.

Per-PROT walker (FUN_8001FE70)

One battle-scene state (in FUN_800513F0, around 0x80051A50) calls FUN_8001FA88(scene_base + slot, 0, dst_buf) to load a per-scene PROT entry into the working buffer, then FUN_8001FE70(dst_buf) to walk its chunk list. The walker is the dispatch path for the scene_tmd_stream layout - leading TMD body followed by streaming chunks - and is different from the standard FUN_8002541C streaming walker:

  • First chunk: read chunk0_header = u32 LE at offset 0. Low 24 bits = TMD body size. Round up to 32-byte alignment, allocate a buffer of that size at _DAT_8007B864, copy the TMD body in via FUN_8003D26C.
  • Loop: advance by (prev_size & ~3) + 4 to the next chunk header. Read header; if header & 0xFFFFFF == 0, exit (terminator). Otherwise:
    • If (header >> 24) == 0x01 -> call FUN_800198E0(payload_ptr) (LoadImage).
    • If (header >> 24) == 0x02 -> exit (explicit terminator).
    • Other types are skipped silently (the loop advances without uploading).
  • Returns once the terminator is hit. It returns param_1 + 1 - a pointer to the word just past the terminating header, i.e. the start of the next region. The FUN_80026B4C TMD register call that follows hooks the parsed TMD into the per-scene mesh pointer table at 0x8007C018 + idx*4.
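The walk described above can be sketched over a plain byte buffer. This is an illustration under stated assumptions, not the decompiled routine: `walk_sub_stream` and `WalkResult` are hypothetical names, the LoadImage upload and the TMD copy are replaced by counters, and bounds checks are added that the retail code is not known to perform.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

typedef struct {
    uint32_t tmd_size;   /* low 24 bits of chunk 0's header */
    uint32_t tmd_alloc;  /* tmd_size rounded up to a 32-byte multiple */
    int tim_chunks;      /* type-0x01 chunks (would go to LoadImage) */
    int skipped_chunks;  /* other chunk types, skipped silently */
    size_t end_offset;   /* offset just past the terminating header */
} WalkResult;

/* Walks one sub-stream. Returns 0 on success, -1 if the buffer ends early. */
static int walk_sub_stream(const uint8_t *buf, size_t len, WalkResult *out)
{
    size_t pos = 0;
    uint32_t header, size, type;

    if (len < 4)
        return -1;
    header = read_u32_le(buf);
    size = header & 0x00FFFFFFu;
    out->tmd_size = size;
    out->tmd_alloc = (size + 31u) & ~31u;
    out->tim_chunks = 0;
    out->skipped_chunks = 0;

    for (;;) {
        /* Advance by (prev_size & ~3) + 4 to the next chunk header. */
        pos += (size_t)(size & ~3u) + 4;
        if (pos > len - 4)
            return -1;
        header = read_u32_le(buf + pos);
        size = header & 0x00FFFFFFu;
        type = header >> 24;
        if (size == 0 || type == 0x02)
            break;                  /* zero-size or explicit terminator */
        if (type == 0x01)
            out->tim_chunks++;      /* payload starts at buf + pos + 4 */
        else
            out->skipped_chunks++;
    }
    out->end_offset = pos + 4;      /* the "param_1 + 1" return in the text */
    return 0;
}
```

`end_offset` plays the role of the returned pointer: it is where a caller would look for whatever follows the terminator.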

This is the path that uploads field NPC palettes to VRAM row 479 - they're plain PSX TIMs wrapped in type-0x01 chunks, dispatched only during battle init. See npc-palette.html for the cross-save corroboration and scene-bundles.html → "Streaming tail" for the type-byte table.

Concatenated sub-streams (the “two-list” / continuation case)

Some scene_tmd_stream entries hold more than one complete [chunk0 TMD][type-0x01 TIM chunks][terminator] sub-stream concatenated, each on a 0x800 (sector) boundary (0006_town01: sub-stream 0 at 0x0, sub-stream 1 with its own leading TMD at 0x14000; verified across the town01 / town0b / town0c clusters). The bytes earlier notes called a “continuation TIM list” are really the second sub-stream's TIM chunks - it is self-contained, not a bare tail of sub-stream 0.

FUN_8001FE70 walks exactly one sub-stream; its return (param_1 + 1, past the terminator) lands on the next sub-stream, so a sector/slot-indexed caller can walk the rest. The single static caller FUN_800513F0 (battle init) calls it once (the s3 < 4 loop above the call is 4-party-member setup, not a sub-stream loop), so battle uploads only sub-stream 0. The multi-sub-stream caller is the per-scene field/town dispatch (FUN_8001F7C0 → FUN_80020224 → FUN_8001F05C, overlay-resident, capture-blocked). scene_tmd_stream::sub_streams enumerates the blocks properly; the engine's field-mode loader already skips all these battle-only TIMs (row-479 palettes aren't field-resident).

Field / town scene loader

Functions: FUN_8001F7C0 + FUN_800255B8
Path roots: DATA\FIELD\ and h:\PROT\FIELD\<scene>\

Each scene reserves six file types in CDNAME's per-scene block, and the loader walks the scene asset table at the leading PROT entry to pull each file in turn. The on-disc form is the canonical 7-typed-asset bundle (07 00 00 00 lead).

The descriptor offsets past the first are file-relative against the loaded raw footprint (= the bundle entry's extended on-disc footprint, Archive::read_entry), not relative to a decompressed working buffer: the walker hands base + data_offset to the dispatcher as the source of an independent LZS stream, which it then decompresses into a separate target. The offsets routinely run past the TOC-indexed end into the trailing-overlay sectors the per-PROT TOC crops off - e.g. 0588_juui1.BIN's indexed view is 67584 B but desc[4].data_offset is 177413, valid against the 186368 B extended footprint.
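The juui1 numbers make the point concrete. The check below is a sketch; `offset_in_footprint` is a hypothetical helper, not a function from the engine or the extraction crates.

```c
#include <stdint.h>

/* Returns 1 when data_offset points inside a view of footprint_len bytes. */
static int offset_in_footprint(uint32_t data_offset, uint32_t footprint_len)
{
    return data_offset < footprint_len;
}

/* Figures quoted in the text for 0588_juui1.BIN. */
enum {
    JUUI1_INDEXED_LEN  = 67584,   /* TOC-indexed view */
    JUUI1_EXTENDED_LEN = 186368,  /* extended on-disc footprint */
    JUUI1_DESC4_OFFSET = 177413   /* desc[4].data_offset */
};
```

Against the indexed view the offset is out of range; against the extended footprint it is valid, which is why the walker has to be fed the extended read.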

The asset chain shape is "load the scene asset table, walk each descriptor, then load each typed sub-asset via the dispatcher". The slot→asset mapping itself is positional + offset-based and fully pinned (see the descriptor-walker section below); what remains partial is the runtime cross-reference stitching between already-loaded sub-assets (e.g. a placed actor in the MAN naming a TMD-pack index), which the loader resolves from live pointers.

WARP opcode → scene transition flow

Field-VM opcode: 0x3E with op0 ≥ 100
Scene handler: FUN_80025980

When the field VM executes opcode 0x3E with op0 ≥ 100, it stores map_id = op0 - 100 in DAT_8007ba34 and switches game mode to 0xe (SCENE_TRANSITION). The mode handler FUN_80025980 then loads a code overlay at PROT index map_id + 0x4d (or map_id + 0x4f when map_id ≥ 6).

Only 7 distinct WARP destinations exist (map_ids 0–6), each loading a scene-type overlay at PROT 0x4D–0x55 whose entry function resides at an overlay-resident address (0x801CF070, 0x801CE8A0, etc.). A genuine warp's op0 is therefore always 100..=106; the placement classifier uses this range (plus the absence of the 0x80 cross-context prefix) to reject text-desync phantom warps - see the World map entity-classification notes.
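The op0 → overlay mapping is small enough to write out. A minimal sketch; the function names are hypothetical, and the range check covers only the 100..=106 rule, not the 0x80 cross-context prefix test.

```c
#include <stdint.h>

/* A genuine warp's op0 is always 100..=106 (map_ids 0-6). */
static int is_genuine_warp_op0(uint32_t op0)
{
    return op0 >= 100 && op0 <= 106;
}

/* PROT index of the scene-type overlay for a genuine warp op0:
 * map_id + 0x4D, or map_id + 0x4F when map_id >= 6. */
static uint32_t warp_overlay_prot_index(uint32_t op0)
{
    uint32_t map_id = op0 - 100;
    return map_id + (map_id >= 6 ? 0x4Fu : 0x4Du);
}
```

With these rules map_ids 0 to 5 land on PROT 0x4D to 0x52 and map_id 6 lands on 0x55, so the 0x4D–0x55 span quoted above is not contiguous.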

The scene name (stored at DAT_80084548, max 8 chars) is pre-set before FUN_80025980 executes. The overlay entry function reads this buffer and passes it to FUN_8001F7C0 and FUN_80020118. The mechanism that writes the scene name before the WARP fires is in a pre-transition handler not yet fully traced.

Global        Role
DAT_80084548  Scene name string (pre-set before WARP fires)
DAT_80084540  Current scene PROT base index (short)
DAT_8007b768  Pending destination PROT index; 0xffff = none
DAT_8007ba34  Pending warp map_id (0–6); read by FUN_80025980

The DefaultMapIdResolver in engine-core::scene uses CDNAME blocks in ascending PROT-index order as a positional approximation. The actual retail warp only supports 7 destinations and the scene name is determined by a pre-WARP state-machine path still to be captured.

Asset descriptor walker

Function: FUN_80020224
Sole runtime caller: Town overlay's FUN_801D6704 (MAIN_INIT) at 0x801D6B0C with a0 = 0
Result stored at: 0x80087AF8

Walks the asset descriptor format and calls the asset-type dispatcher per descriptor. So the walker IS exercised by retail gameplay, just not from a static call site inside SCUS_942.54.

The mapping is positional - there is no separate slot→asset indirection table; the descriptor's data_offset field is the indirection. The chain, traced from the field init at FUN_801D6704:

  1. FUN_8001E1B4 allocates a single 0x62C00-byte asset buffer once at boot and stores its base at _DAT_8007b85c.
  2. FUN_8001F7C0 reads the per-scene field FILE into a 0x14000-byte scratch (_DAT_1f8003ec); the decoded table is relocated so its count word lands at the asset-buffer base.
  3. FUN_80020224 reads count = *base, then for slot in 0..count calls dispatch(base + descriptor[slot].data_offset, type_size, …), descriptors at base + 8 + slot*8, OR-ing the per-slot return codes into a status word.
  4. FUN_8001F05C splits type = type_size >> 24 and size = type_size & 0x00FF_FFFF, then jumps via the table at 0x80010638 + type*4 (type bound < 0x15).

So slot i ⇒ the i-th 8-byte descriptor; payload at base + data_offset; handler keyed by type_size >> 24. scene_asset_table::resolve returns the table plus the base it is relative to for both the bare variant (base 0) and the prescript-prefixed scripted variant (base at a 0x800-aligned offset past the event prescript); SceneAssetTable::slots reproduces the walk and payload_range(slot, base) resolves a slot's payload span. A disc-gated corpus test (scene_asset_table_walk_real) verifies the walk against every classified entry (88 bare + 79 scripted). The file relocation into _DAT_8007b85c and the exact base the walker receives for the scripted variant are runtime values (capture-blocked); the static resolver reconstructs the base structurally.
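Steps 3 and 4 of the chain can be sketched as a single slot reader. Two things here are assumptions rather than facts from the text: the order of the two words inside each 8-byte descriptor (data_offset first, then type_size) and the names `read_slot` and `Slot`. The count word, the base + 8 + slot*8 stride, the type/size split, and the < 0x15 bound come from the chain above.

```c
#include <stddef.h>
#include <stdint.h>

enum { ASSET_TYPE_LIMIT = 0x15 }; /* dispatcher jump-table bound */

typedef struct {
    uint32_t data_offset; /* payload lives at base + data_offset */
    uint32_t type;        /* type_size >> 24 */
    uint32_t size;        /* type_size & 0x00FFFFFF */
} Slot;

static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads descriptor `slot` from a table of `len` bytes starting at `base`.
 * Returns 0 on success, -1 if the slot is out of range, truncated, or
 * carries a type the dispatcher table would not cover. */
static int read_slot(const uint8_t *base, size_t len, uint32_t slot, Slot *out)
{
    uint32_t count, type_size;
    size_t desc;

    if (len < 8)
        return -1;
    count = read_u32_le(base);           /* count = *base */
    if (slot >= count)
        return -1;
    desc = 8 + (size_t)slot * 8;         /* descriptors at base + 8 + slot*8 */
    if (desc + 8 > len)
        return -1;
    out->data_offset = read_u32_le(base + desc);   /* assumed field order */
    type_size = read_u32_le(base + desc + 4);      /* assumed field order */
    out->type = type_size >> 24;
    out->size = type_size & 0x00FFFFFFu;
    return out->type < ASSET_TYPE_LIMIT ? 0 : -1;
}
```

A full walk would call `read_slot` for each slot in 0..count and hand base + data_offset to the per-type handler.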

CLUT-data scattering

Many character meshes reference CLUT (palette) rows that live in different PROT entries from their TMD source. The runtime asset chain stitches them together - the loader puts the relevant TIMs into VRAM before the TMD is rendered.

Engines drive this from SceneResources::build_targeted: parse every TMD in the scene's CDNAME block, collect the union of all prim-target rectangles (CLUT rows + texture-page UV bboxes the meshes sample), then walk every TIM and decide per-block whether to upload it - suppressing the image block when it would land on a CLUT row another mesh references, and vice-versa. This matches the retail field loader's "DMA only the texture bytes the current scene's meshes need" pattern and avoids the 4bpp-vs-256-wide CLUT collisions that previously dropped 80%+ of textured prims through the prim filter. See renderer → Engine-side targeted upload + shared blocks for the engine-side wiring.

Field-shared CDNAME blocks

FIELD_SHARED_BLOCKS = ["init_data", "player_data"] is the set of CDNAME blocks the retail field engine keeps resident in VRAM across scene transitions. The field-character meshes and textures both originate from PROT 0874 (the retail player_data / player.lzs container FUN_8001E890 loads by disc index 0x36c; the extraction file named player_data is 0876 - the +2 filename shift - a VAB + empty TIM_LIST + SEQ trailer with no TMDs): §0 = the 5-TMD character mesh pack that populates DAT_8007C018[0..4], §1 = effect / vdf models, §2 = the field-character texture pack - eight TIMs uploaded to VRAM (the three Vahn/Noa/Gala atlas pages at texpage (832, 256) with per-character CLUTs on row 478). See character-mesh § textures and world-map-overlay § Disc-side source of [0..4]. init_data (PROT 0) holds shared UI / sprite tiles. SceneHost::enter_field_scene passes both blocks to build_targeted so the player atlas survives every town / dungeon transition without being re-uploaded per scene.

The legacy SceneResources::build (no shared blocks, unfiltered upload) is preserved for tests + diagnostic surfaces; engines should prefer build_targeted for production scene loads. The asset-viewer's --vram-extra-dir flag remains the manual workaround for browsing extracted tim_scan/ dirs that aren't tied to a CDNAME scene.

Field vs battle dispatch (SceneLoadKind)

SceneResources::build_targeted_with_options(scene, shared, BuildOptions { kind }) lets callers pick the dispatch path the build mimics:

  • SceneLoadKind::Battle (legacy default of build_targeted): uploads every TIM the scanner finds AND parses every TMD the scanner finds. Includes the leading TMD + type-0x01 TIM chunks inside scene_tmd_stream PROT entries (which FUN_8001FE70 walks during battle init) and every TMD + TIM embedded inside battle_data records (which the FUN_8001E890 chain loads at boot for battle init). Town01 keep ratio: 99.3% under disc-gated test.
  • SceneLoadKind::Field: skips both sources. scene_tmd_stream entries are excluded entirely - their leading TMDs go to the battle character TMD register (_DAT_8007B864) and are never rendered from a field scene, and their type-0x01 TIM chunks upload the same mesh's textures. battle_data records are skipped at parse + upload time - the pack is battle-init resident, not part of field VRAM. This agrees with retail field saves, where row 479 (fb_x = 0..256) is all zero.

SceneHost::enter_field_scene uses SceneLoadKind::Field so the engine port matches the retail dispatch boundary. BuildOptions::default stays Battle so legacy build_targeted calls keep their previous semantics.

Battle-boot pre-load (build_battle_boot_vram)

SceneResources::build_battle_boot_vram(battle_data_scenes) builds a VRAM blob from the player battle files (BATTLE_BOOT_BLOCKS = ["edstati3", "battle_data"] - the two extraction labels covering the retail battle_data block, extraction 863..866; non-pack entries in either block fail detection and are skipped). It walks every record's LZS stream, uploads any standard-PSX-TIM textures it finds, and invokes the descriptor-driven CLUT pass via battle_data_pack::clut_uploads. The retail engine performs an equivalent pre-pass via FUN_8001E890 at boot or first-battle entry so battle-init has the character meshes resident before the scene-specific FUN_8001FE70 walk fires.

Today the CLUT pass is a documented no-op until the battle_data post-TMD descriptor at u32[3..0x20] is pinned (see battle-data-pack). Engines that want the API in place can call build_battle_boot_vram to walk the pack and accept any TIM-shaped textures it does carry; once descriptor decoding lands, the same call also surfaces the per-record (fb_x, fb_y) CLUT placements without further wiring changes. The returned VRAM is intentionally separate from the scene's field VRAM - battle init merges it with the scene-specific upload pass; field rendering does not.

Diagnosing missing CLUTs

To find which PROT entry provides a missing CLUT row, run:

asset clut-finder <vram_x> <vram_y> --extracted-root extracted

This walks extracted/tim_scan/<entry>/*.tim and reports every TIM whose CLUT covers the requested VRAM cell.
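The "covers" test amounts to a point-in-rectangle check on the TIM's CLUT block rectangle. A sketch only; `VramRect` and `rect_covers_cell` are hypothetical names, not the tool's internals.

```c
#include <stdint.h>

/* A VRAM rectangle in 16-bit cells, as carried by a TIM's CLUT block
 * header (x, y, width, height). */
typedef struct {
    uint16_t x, y, w, h;
} VramRect;

/* Returns 1 when the cell (x, y) falls inside r (right/bottom exclusive). */
static int rect_covers_cell(VramRect r, uint16_t x, uint16_t y)
{
    return x >= r.x && x < r.x + r.w &&
           y >= r.y && y < r.y + r.h;
}
```

For example, a 16-entry CLUT at (128, 479) covers cells x = 128..143 on row 479 and nothing on row 478.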

For scene-level diagnostics straight off the disc image (no extracted tree required), legaia-engine clut-trace --scene <name> --disc <bin> walks every dropping MissingClut prim in a CDNAME scene, groups by (cba, depth), and reports the suppliers found across the whole PROT corpus by rectangle containment. Pair with --runtime-vram <bin> (mednafen save state captured via mednafen-state vram-dump --out-bin) to mark which missing rows are populated at runtime (engine loader gap) vs absent everywhere (mesh references unreachable CLUT - likely needs a sub-pack walker port).

Row-479 NPC CLUTs: scene_tmd_stream type-0x01, not battle_data

The four town01 NPC TMDs at field intersections sample CLUT row y=479 slots x=128..240 (CBA = 0x77C8..0x77CF). An earlier hypothesis was that those palettes lived inside the battle_data block (the player battle files, extraction 863..866) and would land in VRAM via a boot pre-load - the byte-match corpus on battle-data-pack refutes that. The actual source is the matching scene_tmd_stream entries in town01's own CDNAME block, wrapped in type-0x01 chunk headers that FUN_8001FE70 dispatches during battle init (see row-479 NPC CLUTs). Retail field saves carry row 479 = zero because retail field-mode rendering never has those CLUTs resident either.
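The CBA values quoted above decode with the standard PSX CLUT attribute layout (bits 0-5 = x / 16, bits 6-14 = y). A small sketch; the helper names are hypothetical.

```c
#include <stdint.h>

/* VRAM x of the CLUT, in 16-bit cells: bits 0-5 hold x / 16. */
static uint32_t cba_x(uint16_t cba)
{
    return (uint32_t)(cba & 0x3F) * 16;
}

/* VRAM row of the CLUT: bits 6-14. */
static uint32_t cba_y(uint16_t cba)
{
    return (uint32_t)(cba >> 6) & 0x1FF;
}
```

0x77C8 decodes to (128, 479) and 0x77CF to (240, 479), which is the x = 128..240 span on row 479.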

SceneResources::build_targeted_with_options(.., SceneLoadKind::Field) matches this dispatch boundary: it excludes both the leading TMD and the type-0x01 TIMs from every scene_tmd_stream entry, so the field-mode TMD pool drops the battle character meshes that retail wouldn't render either. The 388-prim "MissingClut" measurement that surfaced under the previous battle-mode default disappears - those prims belonged to meshes that simply aren't loaded in field mode.

The player battle file parser remains the entry point for battle-init: legaia_asset::battle_data_pack decompresses every TMD slot's LZS stream and exposes the embedded Legaia TMDs + 32-byte layout header. The post-TMD texture/CLUT pool layout is partially TBD - the descriptor at u32[3..0x20] points at specific palette positions but the encoding isn't pinned. build_battle_boot_vram wires the API in place so once descriptor decoding lands, battle scenes pick up the per-record (fb_x, fb_y) CLUTs without further integration work.

VRAM oracle (engine vs runtime)

legaia-engine vram-oracle --scene <name> --disc <bin> --runtime-vram <bin> rebuilds the scene's targeted VRAM and reports per-band overlap counts (top half / texpage primary / texpage CLUTs) against the runtime ground truth. --diff-png <path> writes a 1024x512 colour-coded RGBA8 diff (red = runtime-only / gap, green = engine-only / extras, blue = both non-zero with different content, greyscale = exact match). --tiles adds a 64x64 tile-by-tile breakdown so a specific page region's coverage can be inspected.
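The four diff colours correspond to a four-way classification of each VRAM cell. The sketch below assumes the comparison is per 16-bit cell and that zero means "empty"; the tool's actual granularity is not stated here, and `DiffClass` / `classify_cell` are hypothetical names.

```c
#include <stdint.h>

typedef enum {
    DIFF_MATCH,        /* greyscale: exact match (including both zero) */
    DIFF_RUNTIME_ONLY, /* red: runtime has data, engine does not (gap) */
    DIFF_ENGINE_ONLY,  /* green: engine has data, runtime does not (extra) */
    DIFF_CONFLICT      /* blue: both non-zero with different content */
} DiffClass;

static DiffClass classify_cell(uint16_t engine, uint16_t runtime)
{
    if (engine == runtime)
        return DIFF_MATCH;
    if (engine == 0)
        return DIFF_RUNTIME_ONLY;
    if (runtime == 0)
        return DIFF_ENGINE_ONLY;
    return DIFF_CONFLICT;
}
```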

Music / SFX selection (BGM lookup)

Documented in detail under the field VM → "BGM lookup table" section. The short version: the BGM ID is a PROT-relative offset, not a literal table lookup.

// FUN_800243F0
if (bgm_id < 2000) {
    prot_idx = scene_local_base + 6 + bgm_id;     // _DAT_80084540 + 6 + bgm_id
} else {
    prot_idx = global_pool_base + (bgm_id - 2000); // _DAT_8007BC64 + bgm_id - 2000
}

The "BGM table" is the CDNAME.TXT per-scene block layout. There's no separate BGM index in SCUS_942.54.

Sound bank loader and streaming-asset loader

Function      Role
FUN_8001FA88  Sound subsystem init / .dpk loader. Loads bse.dat master bank once at boot, then per-scene .dpk files via the path-based opener with h:\main\bg\domepack\<name>.dpk. Dev builds bypass the path-builder and load PROT index 0x37A (sound_data2) plus param_1 + 5 directly.
FUN_8001FC00  Streaming-asset loader. Builds paths under the sound\ prefix; the XA / .pac / STR consumer. Same dev/retail split as the field loader.

Full breakdown in Audio.

Top-level extraction pipeline

legaia-extract (the binary in crates/extract) drives the offline preservation pipeline:

verify → disc → PROT → categorize → streaming-format extract → TIM → PNG

See the extraction tooling page for per-stage CLI invocations.

See also