Overview

Implementation: crates/asset/src/lib.rs::parse_streaming.

> Scope note. This doc covers the typed-chunk streaming shape that FUN_8002541C consumes - not the wider question of "what does the per-scene CDNAME block actually carry?" Per-scene field bundles use multiple shapes; the typed map below identifies which shapes show up where.

Layout

[u32 type_size] [size_bytes of raw data]
[u32 type_size] [size_bytes of raw data]
...
[u32 terminator]    // type_size with low 24 bits all zero

Where:

  • type_size = (type_byte << 24) | (size_bytes & 0x00FFFFFF).
  • type_byte matches the asset type table.
  • The next chunk header starts at current_pos + 4 + (size_bytes & ~3) - i.e., the 4-byte header plus the size rounded down to a multiple of 4. Sizes are always 4-aligned in practice, so the rounding is normally a no-op.
  • Terminator: any header u32 whose low 24 bits are zero.
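The rules above can be sketched as a small walker. This is a minimal illustration, not the actual parse_streaming code: the names Chunk, pack_header, and walk_chunks are hypothetical, and the header word is assumed to be little-endian (the PlayStation's native byte order).

```rust
/// One decoded chunk: the type byte from the header and a borrowed payload.
/// (Hypothetical type for illustration only.)
#[derive(Debug, PartialEq)]
pub struct Chunk<'a> {
    pub type_byte: u8,
    pub data: &'a [u8],
}

/// Packs a header word: type in the top 8 bits, size in the low 24 bits.
pub fn pack_header(type_byte: u8, size_bytes: u32) -> u32 {
    ((type_byte as u32) << 24) | (size_bytes & 0x00FF_FFFF)
}

/// Walks the stream until the terminator. Returns the chunks plus the offset
/// of the first byte after the terminator, or None if the input is truncated.
pub fn walk_chunks(buf: &[u8]) -> Option<(Vec<Chunk<'_>>, usize)> {
    let mut pos = 0usize;
    let mut chunks = Vec::new();
    loop {
        let b = buf.get(pos..pos + 4)?;
        // Assumption: header words are stored little-endian.
        let header = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        let size = (header & 0x00FF_FFFF) as usize;
        if size == 0 {
            // Low 24 bits all zero: terminator, whatever the type byte is.
            return Some((chunks, pos + 4));
        }
        // The payload is `size` bytes; the cursor advances by the size
        // rounded down to a multiple of 4, as described above.
        let data = buf.get(pos + 4..pos + 4 + size)?;
        chunks.push(Chunk { type_byte: (header >> 24) as u8, data });
        pos += 4 + (size & !3);
    }
}
```

Returning None on truncated input keeps the walker usable as a strict validator: an entry that runs off the end before reaching a terminator is simply not a streaming entry.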

What's in the wild

Strictly validating PROT entries with asset scan-stream finds 26 hits, concentrated in the _other5 cluster (entries 1214–1219+), plus a few in dolk2, rikuroa2, and rayman. Each hit has three chunks, each of single-asset shape:

chunk[0]: TIM   (single, magic 0x10)        - sprite atlas / texture
chunk[1]: TMD2  (single, magic 0x80000002)  - single Legaia TMD
chunk[2]: MOVE2 (single, magic 0x08)        - animation data
terminator

Each chunk here holds a single asset (one TIM, one TMD2, one MOVE2), not a pack. Other clusters in the corpus do use pack-shaped TIM_LIST / TMD chunks; the pack format handles that case.
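A rough way to check an entry for this three-chunk shape is to sniff the magic at the start of each payload. The sketch below assumes each magic is the payload's leading little-endian u32 (the source only lists the magic values, so the width of the MOVE2 magic in particular is a guess), and the names SingleAsset, sniff_magic, and is_single_asset_triple are hypothetical.

```rust
/// Asset kinds seen in the single-asset entries described above.
/// (Hypothetical enum for illustration only.)
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum SingleAsset {
    Tim,
    Tmd2,
    Move2,
}

/// Guesses the asset kind from the first four payload bytes.
/// Assumption: each magic is the payload's leading little-endian u32.
pub fn sniff_magic(payload: &[u8]) -> Option<SingleAsset> {
    let b = payload.get(..4)?;
    match u32::from_le_bytes([b[0], b[1], b[2], b[3]]) {
        0x0000_0010 => Some(SingleAsset::Tim),
        0x8000_0002 => Some(SingleAsset::Tmd2),
        0x0000_0008 => Some(SingleAsset::Move2),
        _ => None,
    }
}

/// True when the payloads are exactly TIM, TMD2, MOVE2 in that order.
pub fn is_single_asset_triple(payloads: &[&[u8]]) -> bool {
    let kinds: Vec<_> = payloads.iter().map(|p| sniff_magic(p)).collect();
    kinds == [
        Some(SingleAsset::Tim),
        Some(SingleAsset::Tmd2),
        Some(SingleAsset::Move2),
    ]
}
```

A magic check like this is only a first filter; a strict validator would also parse each asset's own header before accepting the entry.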

Trailer data

Some entries contain bytes past the streaming terminator. asset extract preserves these as _trailer.bin next to the extracted chunks. The function that consumes the trailer hasn't been located; tracing the caller of FUN_8002541C in the field/town overlay is the next move if a specific entry's trailer looks structured.
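Locating the trailer only requires walking the chunk headers from the Layout section until the terminator. The following is a minimal sketch under the same little-endian assumption; split_trailer is a hypothetical helper, not the code behind asset extract.

```rust
/// Splits an entry into (stream, trailer) right after the terminator word.
/// Returns None if no terminator is reached before the data runs out.
/// (Hypothetical helper for illustration only.)
pub fn split_trailer(entry: &[u8]) -> Option<(&[u8], &[u8])> {
    let mut pos = 0usize;
    loop {
        let b = entry.get(pos..pos + 4)?;
        // Assumption: header words are stored little-endian.
        let header = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        let size = (header & 0x00FF_FFFF) as usize;
        pos += 4; // step over the header word itself
        if size == 0 {
            // Terminator: everything after it is trailer data.
            return Some(entry.split_at(pos));
        }
        pos += size & !3; // skip the payload, rounded down to 4 bytes
    }
}
```

An empty second slice means the entry ends exactly at the terminator; anything else is what would be written out as _trailer.bin.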

Per-scene field bundles - what's still open

The CDNAME block for a typical field/town scene (e.g. town01, bubu1) carries 8–12 PROT entries. Categorize identifies several known shapes per block:

Common slot | Class                                                     | Typical content
0 / 1       | SceneTmdStream or TmdSizePrefix                           | scene mesh (room geometry)
1 / 2       | Pack (TIM-pack)                                           | scene textures / sprite atlas
2 / 3       | SceneEventScripts or SceneScriptedAssetTable              | per-event field-VM bytecode
3 / 4       | MesContainer                                              | dialog text
4 / 5       | Pack (ANM-pack)                                           | per-actor animation sets
5..7        | PochiFiller                                               | reserved-but-unused dev fillers
6..8        | SceneVabStream (rare; only on scenes with custom audio)   | per-scene VAB + SEQ

What's NOT modelled yet:

  • Cross-entry pointers (NPC references that point into other PROT entries - the asset chain in asset-loader.md is best-effort).
  • The runtime-reconstructed slot-to-asset mapping inside field-pack containers (magic 0x01059B84, 124 entries) - known shape, unknown slot semantics.
  • The retail engine's per-scene "asset table" indirection (the SceneAssetTable and SceneScriptedAssetTable detectors fire on a small fraction of entries; fully reversing it needs an overlay capture of FUN_8001f7c0 at scene-load time).

The categorize sweep covers the bulk of bytes - every PROT entry classifies to something, and ~95% of bytes fall into known classes. Refining the residual classes is the work tracked under "Reverse-engineer DATA_FIELD per-scene layout" in docs/subsystems/engine.md.

See also