Dialog font (proportional Latin) Confirmed
The proportional sans-serif font used by the dialog box, the field menu, and most in-game UI text. It lives in VRAM at runtime and is referenced by every text-rendering primitive the engine emits.
Overview
The font has three pieces of static data, all in SCUS_942.54:
- A 256-byte width table at 0x80073F1C, indexed by character byte.
- A 38-entry escape-sequence table at 0x80074050, indexed by the byte that follows a 0xCE runtime escape.
- The glyph bitmaps, which sit in VRAM at (896, 0)..(960, 256) (a 4bpp tile-page covering 256×256 source pixels). They're loaded from disc into VRAM by an overlay-resident routine; the on-disc PROT entry that carries them has not yet been classified by the static categorizer.
Glyph layout in VRAM
Format: 4bpp indexed, 16-pixel × 16-pixel cells, 16 columns × 14 rows = 224 cells. Cell c (for character byte c in the range 0x20..=0xFF) lives at:
U = (c & 0x0F) * 16
V = (c & 0xF0) - 0x20
Drawn region within each cell is 14 pixels wide × 15 pixels tall (W=0x0E, H=0x0F in the GP0 0x64 packet). The remaining 2 pixels of width and 1 pixel of height per cell are inter-glyph guard space.
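Character codes 0x00..=0x1F have no glyph cells (the V formula would go negative for them). The control bytes the renderer intercepts (0x00 terminator, 0x20 space, 0x7C newline, 0xCE escape prefix, 0xCF color change) do not draw a glyph either, even though all of them except 0x00 fall inside the 0x20..=0xFF cell range.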
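The cell formula can be checked with a few lines of Python. This is a minimal sketch that restates the U/V arithmetic given above; the function name `glyph_cell` and the two constants are illustrative, not names from the game or the extractor.

```python
DRAWN_W, DRAWN_H = 14, 15   # drawn region inside each 16x16 cell


def glyph_cell(c: int) -> tuple[int, int]:
    """Return the (U, V) texel origin of the cell for character byte c."""
    if not 0x20 <= c <= 0xFF:
        raise ValueError(f"no glyph cell for byte {c:#04x}")
    u = (c & 0x0F) * 16     # column = low nibble, 16 px per cell
    v = (c & 0xF0) - 0x20   # high nibble is already row * 16; rows start at 0x20
    return u, v


print(glyph_cell(ord("A")))  # byte 0x41
```

The last cell (byte 0xFF) starts at V = 208, so the 14 rows occupy texels 0..223 of the 256-row tile-page.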
Width table (advance lookup)
0x80073F1C u8 widths[256]
256 bytes, indexed by character byte. The advance for character c is computed as:
advance = widths[c] + DAT_800740E8 + 1
where DAT_800740E8 is a per-string padding override that's normally zero (and is reset to zero at the end of each render call). The trailing +1 is a fixed inter-character gap.
Bytes widths[0x00..=0x1F] overlap with three actor-name strings ("Meta", "Terra", "Ozma") that live at 0x80073F24..0x80073F3B; only entries 0x20..=0xFF are meaningful for glyph advance.
Sample widths from the table:
| c | char | width |
|---|---|---|
| 0x20 | (space) | 4 |
| 0x21 | ! | 4 |
| 0x41 | A | 7 |
| 0x49 | I | 3 |
| 0x4D | M | 8 |
| 0x57 | W | 9 |
| 0x69 | i | 3 |
| 0x6D | m | 8 |
| 0x7E | ~ | 9 |
The full table is dumped to extracted/font/dialog_font_widths.csv and extracted/font/dialog_font_metadata.json as part of the extraction step.
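As a worked example of the advance formula, the sketch below measures a short run of glyphs. It uses only the sample widths listed in this section, not the full 256-entry table; `SAMPLE_WIDTHS`, `advance`, and `run_width` are hypothetical names, and `padding` stands in for DAT_800740E8.

```python
# Sample widths copied from this section; the real table has 256 entries.
SAMPLE_WIDTHS = {
    0x20: 4, 0x21: 4, 0x41: 7, 0x49: 3, 0x4D: 8,
    0x57: 9, 0x69: 3, 0x6D: 8, 0x7E: 9,
}


def advance(c: int, padding: int = 0, widths=SAMPLE_WIDTHS) -> int:
    """advance = widths[c] + padding + 1 (the +1 is the fixed inter-character gap)."""
    return widths[c] + padding + 1


def run_width(text: bytes, padding: int = 0, widths=SAMPLE_WIDTHS) -> int:
    """Total horizontal advance of a run of plain glyph bytes."""
    return sum(advance(c, padding, widths) for c in text)


print(run_width(b"MAI"))  # (8+1) + (7+1) + (3+1)
```

With the default padding of zero, "MAI" advances 21 pixels; each unit of padding adds one pixel per character.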
CLUT
VRAM (96, 510) // CLUT 0 - dialog grayscale (white-on-transparent text)
VRAM (96 + 16*i, 510) // CLUT i - colored variants for status text, system prompts
Sixteen 16-color CLUTs are placed end-to-end across VRAM Y=510, one every 16 horizontal pixels. CLUT 0 is the canonical dialog palette: index 2 = transparent black, index 3 = white, indices 0/1/4..7 = mid-tone grays for anti-aliasing.
The runtime selects which CLUT to use via DAT_8007B454, modifiable inline by the 0xCF color-change escape (see below). The CLUT word written into the GP0 packet is DAT_8007B454 + 0x7F86; the constant 0x7F86 decodes as VRAM CLUT-coords (96, 510), so DAT_8007B454 is just an additive index 0..15.
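The decoding of 0x7F86 follows from the standard PlayStation CLUT attribute layout (bits 0..5 hold X/16, bits 6..14 hold Y). A small sketch, with hypothetical helper names, that reproduces the arithmetic:

```python
CLUT_BASE_WORD = 0x7F86  # constant added to DAT_8007B454 by the renderer


def decode_clut_word(word: int) -> tuple[int, int]:
    """Split a GP0 CLUT attribute into VRAM (x, y) coordinates."""
    x = (word & 0x3F) * 16      # bits 0..5: X in 16-halfword steps
    y = (word >> 6) & 0x1FF     # bits 6..14: Y
    return x, y


def clut_word(index: int) -> int:
    """CLUT attribute for additive index 0..15."""
    if not 0 <= index <= 15:
        raise ValueError("CLUT index out of range")
    return CLUT_BASE_WORD + index


print(decode_clut_word(clut_word(0)))
print(decode_clut_word(clut_word(3)))
```

Because the index is added to the low bits of the word, each step moves the CLUT origin 16 pixels to the right along Y=510, which matches the end-to-end layout described above.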
Escape table (0x80074050)
Triggered by byte 0xCE in the rendered string. The byte that follows indexes a 4-byte record:
struct EscapeEntry {
i16 string_id; // 0 = render runtime variable; nonzero = look up a string
u8 advance_px; // pixel advance after rendering this escape
i8 y_offset; // Y offset (or variable index when string_id == 0)
};
There are 38 entries (table indices 0x00..=0x25).
| Index | string_id | advance | y_offset | Meaning |
|---|---|---|---|---|
| 0x00..=0x07 | 55..62 | 16 | -2 | Icon strings (likely controller-button glyphs / currency icon) |
| 0x08 | 98 | 12 | +2 | String 98 |
| 0x09..=0x0A | 132,133 | 12 | 0 | Strings 132/133 |
| 0x0B..=0x0E | 0 | 32 | 0..3 | Variable substitution - y_offset is the variable index (HP/MP/gold/exp slot), renderer calls FUN_80034B78 to format the integer |
| 0x0F | 137 | 38 | 0 | String 137 (longest single-shot escape - ~6 chars wide) |
| 0x10..=0x13 | 36,34,35,37 | 12 | 0 | Active actor name - string IDs 34/35/36/37 align with the in-SCUS actor name strings ("Meta"/"Terra"/"Ozma"/...) |
| 0x14..=0x1C | 139..147 | 20 | 0 | Strings 139..147 |
| 0x1D..=0x25 | 148..156 | 28 | 0 | Strings 148..156 |
When string_id != 0, the renderer calls FUN_8002C488(x, y + y_offset, string_id) to draw the looked-up string. When string_id == 0, y_offset < 4 selects which scratch variable (the four runtime-tracked numbers) and the renderer calls FUN_80034B78 to format and draw it.
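The record layout is small enough to parse with Python's `struct` module. The sketch below assumes the 152 table bytes have already been sliced out of the executable (the file-offset mapping is not covered here) and that fields are little-endian, as on the PlayStation's MIPS CPU. `parse_escape_table` and `describe` are illustrative names.

```python
import struct
from typing import NamedTuple


class EscapeEntry(NamedTuple):
    string_id: int   # i16: 0 = runtime variable, nonzero = string lookup
    advance_px: int  # u8
    y_offset: int    # i8: Y offset, or variable index when string_id == 0


ENTRY = struct.Struct("<hBb")  # 4 bytes per record
ENTRY_COUNT = 38               # indices 0x00..=0x25


def parse_escape_table(blob: bytes) -> list[EscapeEntry]:
    """Decode the 38 records from a byte slice starting at the table base."""
    if len(blob) < ENTRY.size * ENTRY_COUNT:
        raise ValueError("escape table slice is too short")
    return [EscapeEntry(*ENTRY.unpack_from(blob, i * ENTRY.size))
            for i in range(ENTRY_COUNT)]


def describe(entry: EscapeEntry) -> str:
    """Summarise which of the two renderer paths an entry takes."""
    if entry.string_id != 0:
        return f"string {entry.string_id} at y{entry.y_offset:+d}"
    return f"variable slot {entry.y_offset}"
```

Note that 38 records of 4 bytes end at 0x80074050 + 0x98 = 0x800740E8, the address of the padding override used by the width lookup.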
Rendering pipeline
| Step | Function | Notes |
|---|---|---|
| Source preprocessor | FUN_80036514 | Expands authoring-time ^X (0x5E) escapes into runtime 0xCE (X-0x2D) escape stream. |
| Word-wrap pre-pass | FUN_80036044 | Called from FUN_8003CC98. Wraps lines to fit the dialog box width. |
| Single-line renderer | FUN_80036888 | Iterates bytes, dispatches escapes, emits one GP0 0x64 sprite per glyph. |
| Multi-line wrapper | FUN_8003CC98 | FUN_80036044 + FUN_80036888. Used by the field dialogue renderer chain. |
| Text-actor tick | FUN_80031D00 | Per-actor text rendering; uses an alternate width-bucketed glyph layout for HUD/status numbers (column-0 stride 8 px, height 12 px) - see DAT_80073DCC. |
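The preprocessor step can be illustrated with a short sketch. It implements only the mapping stated in the table (a 0x5E caret followed by byte X becomes 0xCE followed by X-0x2D); how FUN_80036514 treats a trailing caret or an out-of-range X is not documented here, so the handling of those cases below is an assumption.

```python
CARET = 0x5E    # authoring-time '^'
ESCAPE = 0xCE   # runtime escape prefix
BIAS = 0x2D     # subtracted from the byte after '^'


def expand_authoring_escapes(src: bytes) -> bytes:
    """Rewrite ^X pairs into the runtime 0xCE (X-0x2D) form."""
    out = bytearray()
    i = 0
    while i < len(src):
        b = src[i]
        if b == CARET and i + 1 < len(src):
            # Assumption: the operand wraps modulo 256 if X < 0x2D.
            out += bytes([ESCAPE, (src[i + 1] - BIAS) & 0xFF])
            i += 2
        else:
            # Assumption: a caret with no following byte is copied through.
            out.append(b)
            i += 1
    return bytes(out)


print(expand_authoring_escapes(b"Hi ^0!"))
```

Under this mapping the 38 table indices 0x00..=0x25 correspond to authoring characters 0x2D..=0x52.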
Per-glyph GP0 packet (variable-size textured rectangle, opaque, texture-blended; the 0x808080 shade is the neutral blend value, so texels are drawn unmodified):
[0x04 00 00 00] // OT-list terminator
[0x64 80 80 80] // cmd 0x64 + RGB shading
[i16 X][i16 Y] // top-left in screen coords
[u8 U][u8 V][u16 CLUT] // U,V within texture page; CLUT word
[u16 W=14][u16 H=15] // sprite size in pixels
The texture page is set earlier by a separate GP0 0xE1 (DRAWMODE) primitive - it is not embedded in the per-glyph packet.
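To make the field packing concrete, the sketch below assembles the four payload words of one glyph packet from the pieces documented on this page (U/V formula, CLUT base 0x7F86, W=14, H=15). The leading tag word is omitted, and `glyph_packet` is a hypothetical helper, not a function from the game or the extractor.

```python
import struct


def glyph_packet(c: int, x: int, y: int, clut_index: int = 0) -> bytes:
    """Pack the 16 payload bytes of a GP0 0x64 sprite for character byte c."""
    if not 0x20 <= c <= 0xFF:
        raise ValueError("byte has no glyph cell")
    u = (c & 0x0F) * 16
    v = (c & 0xF0) - 0x20
    clut = 0x7F86 + clut_index
    return struct.pack(
        "<IhhBBHHH",
        0x64808080,   # command 0x64 + neutral RGB shade
        x, y,         # top-left in screen coordinates (i16 each)
        u, v, clut,   # texel origin and CLUT word
        14, 15,       # drawn width and height
    )


words = struct.unpack("<4I", glyph_packet(ord("A"), 10, 20))
print([f"{w:08X}" for w in words])
```

Reading the result back as 32-bit words shows the usual GP0 layout: Y in the high half of the position word, and the CLUT in the high half of the texture-coordinate word.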
Inline control bytes
| Byte | Operand | Meaning |
|---|---|---|
| 0x20 | - | Space. No glyph; advance X by widths[0x20] (=4). |
| 0x7C | - | Newline. Advance Y by 14 px; reset X to line-start. |
| 0xCE | u8 | Escape - index into the table at 0x80074050. |
| 0xCF | u8 | Color change. Sets DAT_8007B454 (CLUT additive index 0..15). |
| 0x00 | - | String terminator. |
| any other 0x21..=0xFF | - | Glyph: emit one sprite via the formula above. |
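The control-byte table translates into a simple byte walk. The following is a sketch of that dispatch, not a transcription of FUN_80036888: it records glyph placements instead of emitting sprites, takes the space advance literally as widths[0x20], and assumes an escape moves X by its table advance without drawing anything. The `widths` and `escape_advance` mappings are caller-supplied stand-ins for the two static tables.

```python
def layout(text: bytes, widths, escape_advance, x0: int = 0,
           line_height: int = 14, padding: int = 0):
    """Return (x, y, byte, clut_index) for each glyph the string would draw."""
    x, y, clut = x0, 0, 0
    placed = []
    i = 0
    while i < len(text):
        b = text[i]
        i += 1
        if b == 0x00:                       # string terminator
            break
        if b == 0x20:                       # space: advance only
            x += widths[0x20]
        elif b == 0x7C:                     # newline
            x, y = x0, y + line_height
        elif b in (0xCE, 0xCF):             # both take a one-byte operand
            if i >= len(text):
                break                       # assumption: truncated operand ends the walk
            operand = text[i]
            i += 1
            if b == 0xCE:
                x += escape_advance[operand]
            else:
                clut = operand
        else:                               # ordinary glyph
            placed.append((x, y, b, clut))
            x += widths[b] + padding + 1
    return placed
```

For example, with the sample widths from this page, `layout(b"A I|M\x00", {0x20: 4, 0x41: 7, 0x49: 3, 0x4D: 8}, {})` places A at (0, 0), I at (12, 0), and M at (0, 14).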
Provenance (Ghidra trace)
| Subject | Source |
|---|---|
| Width table location + indexing | ghidra/scripts/funcs/80036888.txt line 345 (+ (uint)*(byte *)((int)&DAT_80073f1c + (uint)bVar1)) |
| Glyph U/V formula | ghidra/scripts/funcs/80036888.txt lines 332-335 (*pbVar4 << 4 for U, (bVar1 & 0xf0) - 0x20 for V) |
| GP0 packet shape | ghidra/scripts/funcs/8003c11c.txt (the simpler text-actor renderer with the same packet layout) |
| Escape table location + entry layout | ghidra/scripts/funcs/80036888.txt lines 282-321 |
| CLUT base | ghidra/scripts/funcs/80036888.txt lines 195-196 (addiu v1,v1,0x7f86) |
| Color-change escape | ghidra/scripts/funcs/80036888.txt lines 278-280 (case 0xCF) |
| Author-time ^X preprocessor | ghidra/scripts/funcs/80036514.txt lines 246-249 |
| Multi-line wrapper | ghidra/scripts/funcs/8003cc98.txt |
This renderer chain draws field dialogue, which has no dedicated opcode. A field NPC's text is its inline interaction-script MES (retail actor[+0x90]); it is shown by the per-frame actor-dialog SM FUN_80039b7c together with the dialog pager FUN_801D84D0, and is triggered by the field-interact op (0x3E op0<100). See field script VM § Field dialogue. FUN_8001FD44 is not the opener: it is the scene-change packet reached by the 0x3F named scene-change, and the _DAT_1F800394 |= 0x40 it sets is a scene-transition-pending flag, not a "dialog active" lock.
What's still open
- On-disc carrier of the glyph bitmap. The static categorizer in crates/asset doesn't yet recognise the PROT entry that carries the font - the bitmap is reachable only from a save-state VRAM dump. The font extractor writes dialog_font_vram_4bpp.bin (the raw 32 KB tile-page bytes) alongside the PNG, and scripts/find-font-carrier.py searches every PROT entry for matching slices. The script runs three search strategies:
  - Direct slice match (4 fixed offsets, 64-byte windows).
  - Glyph-row signature match (per-cell 8-byte rows concatenated for 4–8 adjacent cells; survives row-major byte permutations).
  - LZS-decompressed match (decompresses every entry that parses as a valid LZS container and re-runs both probe sets against the decoded bytes).

  All three strategies return zero hits across PROT.DAT, DMY.DAT, and SCUS_942.54. The font is therefore neither raw nor LZS-resident in the static asset corpus. Two remaining unblock paths:
  - Trace the LoadImage (GP0 0xA0) DMA call that uploads the tile-page at (896, 0) and identify which PROT entry it pulls from. The find_lui_writers.py Ghidra script can locate the LUI+ADDIU pair that loads the source pointer; the destination is the GPU FIFO at 0x1F801810 so the search target is "writes to a struct that ultimately reaches _DAT_1F801810".
  - Diff a save state captured before the title screen finishes booting against one captured during a dialog - the font region transitions from zero to populated, so the disc read that fills it sits in the boot sequence somewhere between FUN_8003E4E8 (PROT TOC loader) and the first dialog open. The analyze-overlay.sh --label dialog capture (already committed under ghidra/scripts/funcs/overlay_dialog_*.txt) holds the dialog overlay's text-rendering routines; the font upload itself sits earlier in boot, in a routine that's still SCUS-resident or in the title overlay.
- String IDs in the escape table. Entries 0x00..=0x07 (advance 16, y_offset = -2) likely render multi-character icon strings from the same string pool that backs FUN_8002C488. The pool itself isn't yet decoded - its index 34..37 entries match the SCUS-resident actor name strings, suggesting the pool's first ~150 entries are mostly UI strings + actor names.
- 0xCC opcode. The text-actor renderer at FUN_80031D00 recognises a small handful of single-byte ops (0xCC..=0xCF) inside its glyph stream that are distinct from the dialog renderer's 0xCE/0xCF. They're outside the dialog font's scope and tracked under the field script VM docs.
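For readers who want to reproduce the first search strategy, here is a rough sketch of a direct slice match. It is not the code in scripts/find-font-carrier.py: the four offsets below are placeholders (the real ones are not listed on this page), and skipping all-zero windows is an added assumption to avoid trivial matches against blank cells.

```python
def direct_slice_hits(font: bytes, haystack: bytes,
                      offsets=(0x0000, 0x2000, 0x4000, 0x6000),  # placeholder offsets
                      window: int = 64):
    """Return (font_offset, haystack_position) for every exact window match."""
    hits = []
    for off in offsets:
        probe = font[off:off + window]
        if len(probe) < window or not any(probe):
            continue  # too short, or an all-zero window that would match anywhere
        pos = haystack.find(probe)
        while pos != -1:
            hits.append((off, pos))
            pos = haystack.find(probe, pos + 1)
    return hits
```

Here `font` would be the contents of dialog_font_vram_4bpp.bin and `haystack` the bytes of one candidate entry; an empty result for every entry corresponds to the "zero hits" outcome reported above.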
Extraction tools
extracted/font/ (gitignored - Sony pixel data) is produced by the font-extraction step:
| File | What it is |
|---|---|
| dialog_font_sheet.png | The full 256×256 source-pixel font tile-page, 4bpp expanded with CLUT 0 |
| dialog_font_atlas.png | Per-glyph atlas, 14×15 cells laid out in 16 columns × 14 rows (224 glyphs total) |
| dialog_font_metadata.json | Width table + escape table + VRAM source rect, in machine-readable form |
| dialog_font_widths.csv | Just the width table as CSV |
| dialog_font_vram_4bpp.bin | Raw 32 KB 4bpp VRAM bytes (downstream tooling can hash + search PROT for the carrier) |
The extractor reads the SCUS executable for the static tables, and a mednafen save state's &GPURAM[0][0] section for the live VRAM bytes. The font region is byte-stable across all captured save states (with cosmetic differences only in cells touched by transient UI elements that share the tile-page).
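The 32 KB figure follows from the VRAM rectangle: 64 halfwords × 256 rows × 2 bytes. The sketch below shows how such a slice could be cut from a raw VRAM image and how 4bpp texels unpack. It assumes the dump is the full 1024×512 array of 16-bit little-endian halfwords in row-major order; both function names are illustrative and this is not the code in font-extract.rs.

```python
VRAM_WIDTH = 1024   # halfwords per row
VRAM_HEIGHT = 512   # rows


def extract_tile_page(vram: bytes, x: int = 896, y: int = 0,
                      w: int = 64, h: int = 256) -> bytes:
    """Copy a w x h halfword rectangle out of a raw VRAM image."""
    stride = VRAM_WIDTH * 2
    if len(vram) != stride * VRAM_HEIGHT:
        raise ValueError("expected a 1 MiB VRAM image")
    rows = []
    for r in range(h):
        start = (y + r) * stride + x * 2
        rows.append(vram[start:start + w * 2])
    return b"".join(rows)


def unpack_4bpp(data: bytes) -> list[int]:
    """Expand packed 4bpp bytes into palette indices, leftmost texel first."""
    out = []
    for b in data:
        out.append(b & 0x0F)  # low nibble is the left texel
        out.append(b >> 4)    # high nibble is the right texel
    return out
```

Each 128-byte row of the result expands to 256 palette indices, one per source pixel of the tile-page.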
The committed extractor is crates/font/src/bin/font-extract.rs:
cargo run -p legaia-font --bin font-extract -- \
--scus extracted/SCUS_942.54 \
--save "$HOME/.mednafen/mcs/Legend of Legaia (USA).<hash>.mcN" \
--out extracted/font
Save-state parsing locates VRAM by searching for the &GPURAM[0][0] variable header (mednafen uses a u8 name_len; bytes name; u32 size; bytes data; record format inside each section); no MDFNSVST section walk is required.
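The header search can be sketched as follows, using only the record layout quoted above (u8 name_len, name bytes, u32 size, data). Two assumptions are made: the buffer has already been decompressed if the save state is stored compressed on disk, and the size field is little-endian. `find_variable` is a hypothetical name, not the extractor's API.

```python
import struct


def find_variable(buf: bytes, name: bytes = b"&GPURAM[0][0]") -> bytes:
    """Return the data bytes of the first record whose name matches."""
    start = 0
    while True:
        idx = buf.find(name, start)
        if idx == -1:
            raise KeyError(name)
        size_off = idx + len(name)
        # Accept the hit only if the preceding byte is the name length
        # and a complete size field plus payload follows.
        if idx > 0 and buf[idx - 1] == len(name) and size_off + 4 <= len(buf):
            (size,) = struct.unpack_from("<I", buf, size_off)
            data = buf[size_off + 4:size_off + 4 + size]
            if len(data) == size:
                return data
        start = idx + 1
```

Checking the length byte in front of the name guards against accidental matches of the same text elsewhere in the file.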