Why PCSX-Redux

Three properties make it the right tool for runtime probes:

  • Open-source + scriptable. The Lua API exposes the CPU register file, main RAM as a file-like object, and a breakpoint manager.
  • Interpreter CPU + debug mode. The interpreter (-interpreter) is the only CPU back-end that hits Lua breakpoints, and it only invokes the debug-process hook when DebugSettings::Debug is set (-debugger). Both flags are required; with only one of them, Lua breakpoints silently never fire. (Source: psxinterpreter.cc:1652, if constexpr (debug).)
  • Save-state load from Lua. PCSX.loadSaveState(zReader(file)) loads a .sstate file at runtime, which lets the autorun script reach any captured game state without driving the GUI.

Mednafen's binary save-state format is supported for offline RAM scans via the mednafen-state crate, but its runtime debugger is GUI-only; PCSX-Redux is where the breakpoint probes run.

Setup

The expected on-disk layout (matches the run-script defaults):

~/Tools/pcsx-redux/pcsx-redux                  # locally-built binary
~/Tools/pcsx-redux/<TITLE_ID>.sstate<N>        # PCSX-Redux quicksave (F1..F10 in-emulator)
~/.mednafen/firmware/SCPH1001.BIN              # PSX BIOS, reused from mednafen
~/Downloads/Legend of Legaia (USA)/            # disc image

The <TITLE_ID> is the PSX disc's product code (e.g. SCUS94254 for the USA release of Legaia); PCSX-Redux writes one file per quicksave slot when you press the assigned F-key in the running emulator. Each probe's documentation calls out which game state the save needs to be in — pick a save you've prepared locally that matches.

Override any of these via env vars (PCSX_REDUX, LEGAIA_BIOS, LEGAIA_SSTATE, LEGAIA_ISO). The repo doesn't ship the binary, BIOS, or disc image; those stay local.

Save-state library (immutable backups)

PCSX-Redux quicksave slots (<TITLE_ID>.sstate<N>) and mednafen mc{N} cards are ephemeral — the next time you save in that slot, the bytes are gone, and a save you reverse-engineered against has to be recaptured from scratch. To stop that, back interesting states up into a fingerprint-named library:

scripts/manage-states.py backup pcsx-redux ~/Tools/pcsx-redux/SCUS94254.sstate6 \
    --label field_walled_collision_pin
scripts/manage-states.py library          # list what's backed up + catalogue status

backup copies the file to saves/library/<emulator>/<sha256>.<ext> (immutable; the sha256 is the filename, so it never collides or gets overwritten) and records the fingerprint on the named scripts/scenarios.toml scenario as backup_fingerprint. The library directory is gitignored (it holds Sony game RAM); the committed pointer is the manifest's backup_fingerprint field. When a scenario has one, both scripts/manage-states.py and run_probe.sh --scenario resolve the library copy in preference to the live slot — so probes keep working after you've saved over the original slot. See the field schema + workflow at the top of scripts/scenarios.toml.
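
The content-addressed naming described above can be sketched in a few lines of Python. This is a minimal illustration, not the code in scripts/manage-states.py: the function name backup_state and its parameters are hypothetical, and the scenarios.toml bookkeeping is left out.

```python
import hashlib
import shutil
from pathlib import Path


def backup_state(src: Path, library_root: Path, emulator: str) -> Path:
    """Copy src to library_root/<emulator>/<sha256><ext> and return the copy's path.

    Hypothetical helper: the real tool also records the fingerprint in a manifest.
    """
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    dest_dir = library_root / emulator
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"{digest}{src.suffix}"
    if not dest.exists():
        # Identical bytes always map to the same name, so there is
        # nothing to overwrite and nothing to collide with.
        shutil.copy2(src, dest)
    return dest
```

Because the file name is derived from the bytes, saving over the live slot and backing it up again produces a second library entry and leaves the first one untouched.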

The harness

scripts/pcsx-redux/run_probe.sh is the canonical wrapper. Although it defaults to the world-map probe, every other Lua autorun reuses it via the LEGAIA_LUA override:

LEGAIA_SSTATE=$HOME/Tools/pcsx-redux/<your-saved-state>.sstate \
LEGAIA_LUA=scripts/pcsx-redux/autorun_world_map_fog_probe.lua \
LEGAIA_OUT=/tmp/fog_probe.csv \
LEGAIA_FRAMES=600 \
    bash scripts/pcsx-redux/run_probe.sh

The wrapper:

  1. Verifies the binary / BIOS / save state / Lua file all exist (fails early with a clear error if any one is missing).
  2. Launches PCSX-Redux with -interpreter -debugger -run -bios <SCPH> -iso <bin> -dofile <lua> -stdout and pipes the emulator log to logs/pcsx_<probe>.log.
  3. Tails the log for a === summary === block on exit.

The -stdout flag is what makes the autorun's PCSX.log(...) calls visible to the parent shell.
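
As a rough model of step 2, the sketch below assembles the emulator command line from the flags listed above. The function build_argv and the example paths are hypothetical; the only behaviour taken from this page is the flag list itself and the fact that fast mode drops -interpreter -debugger.

```python
def build_argv(binary, bios, iso, lua, fast=False):
    """Return the PCSX-Redux argument vector the wrapper is described as using."""
    argv = [binary]
    if not fast:
        # Both flags are needed for Lua breakpoints to fire.
        argv += ["-interpreter", "-debugger"]
    argv += ["-run", "-bios", bios, "-iso", iso, "-dofile", lua, "-stdout"]
    return argv


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    print(" ".join(build_argv("pcsx-redux", "SCPH1001.BIN", "game.bin", "probe.lua")))
```

Keeping the two debug flags together in one branch makes it hard to pass only one of them by accident.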

The probe pattern

Every autorun script under scripts/pcsx-redux/ follows the same state machine:

  1. WAIT_BOOT — vsync listener counts up while the emulator boots the BIOS to a known state (typically 60 vsyncs = 1s).
  2. ARMED_LOADED — load the save state, read the register file, compute breakpoint addresses (often GP-relative), arm the probes, write an initial snapshot. Capture for LEGAIA_FRAMES vsyncs while breakpoints log hits to the CSV.
  3. DONE — disarm breakpoints, write a final snapshot, PCSX.quit(0).
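
The three states above can be modelled outside the emulator. The following Python sketch is only a model of the control flow, assuming one on_vsync call per delivered vsync event; the class and attribute names are hypothetical and the real implementation lives in the Lua library.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_BOOT = auto()
    ARMED_LOADED = auto()
    DONE = auto()


class ProbeMachine:
    """Vsync-driven model of WAIT_BOOT -> ARMED_LOADED -> DONE."""

    def __init__(self, boot_vsyncs=60, capture_frames=600):
        self.state = State.WAIT_BOOT
        self.boot_vsyncs = boot_vsyncs
        self.capture_frames = capture_frames
        self.ticks = 0
        self.events = []  # records what a real probe would do at each transition

    def on_vsync(self):
        if self.state is State.DONE:
            return
        self.ticks += 1
        if self.state is State.WAIT_BOOT and self.ticks >= self.boot_vsyncs:
            self.events.append("load save state, arm breakpoints, initial snapshot")
            self.state = State.ARMED_LOADED
            self.ticks = 0  # the capture window is counted from the load
        elif self.state is State.ARMED_LOADED and self.ticks >= self.capture_frames:
            self.events.append("disarm breakpoints, final snapshot, quit")
            self.state = State.DONE
```

Resetting the tick counter at the transition means the capture budget is measured from the save-state load, not from emulator start.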

This pattern is factored out as a shared library at scripts/pcsx-redux/lib/probe.lua, which is an umbrella that re-exports the per-concern submodules under scripts/pcsx-redux/lib/probe/: env, mem, sstate, pad, bp, csv, snapshot, sm, watch, and symbols. A new probe doesn't reimplement the state machine, the memory readers, the save-state loader, the pad-override helpers, the CSV writer, or the live-snapshot writer; it imports them:

package.path = package.path .. ";scripts/pcsx-redux/lib/?.lua"
local probe = require("probe")

local csv = probe.csv_open("/tmp/x.csv", "addr,pc,ra")

probe.run({
    sstate         = probe.getenv("LEGAIA_SSTATE", DEFAULT),  -- DEFAULT: placeholder for your fallback .sstate path
    capture_frames = probe.getenv_num("LEGAIA_FRAMES", 600),
    snapshot_path  = "/tmp/x.hits.txt",
    on_arm = function()
        local descs = {}
        for _, addr in ipairs({ 0x801E76D4 }) do
            local d = { addr = addr, name = string.format("0x%08X", addr),
                        hits_ref = { n = 0 } }
            probe.arm_breakpoint(addr, "Exec", 4, d.name, function()
                d.hits_ref.n = d.hits_ref.n + 1
                local r = PCSX.getRegisters()
                csv:row("0x%08X,0x%08X,0x%08X",
                    addr, tonumber(r.pc), tonumber(r.GPR.n.ra))
            end)
            descs[#descs + 1] = d
        end
        return descs
    end,
    on_done = function() csv:close() end,
})

probe.ram_offset(addr) is bit.band(addr, 0x1FFFFFFF) — strips the KSEG segment selector so KSEG0 (0x80xxxxxx) and KSEG1 (0xA0xxxxxx) map to the same physical byte. Always work in absolute PSX virtual addresses on input; convert at the boundary.
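
For offline tooling, the same conversion can be mirrored in Python. This is a sketch of the arithmetic only; the helper in_main_ram is a hypothetical name, and RAM_SIZE assumes the 2 MiB main RAM referred to elsewhere on this page.

```python
RAM_SIZE = 2 * 1024 * 1024  # 2 MiB of main RAM


def ram_offset(addr: int) -> int:
    """Strip the segment bits so KSEG0/KSEG1 addresses become physical offsets."""
    return addr & 0x1FFFFFFF


def in_main_ram(addr: int) -> bool:
    """True when the address falls inside main RAM after masking."""
    return ram_offset(addr) < RAM_SIZE


print(hex(ram_offset(0x8007B318)))  # 0x7b318
print(hex(ram_offset(0xA007B318)))  # 0x7b318
print(in_main_ram(0x1F8003EC))      # False (scratchpad, not main RAM)
```

Keeping virtual addresses everywhere except the final index into a RAM dump avoids mixing the two address spaces in one CSV.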

Call-context capture

probe.capture_call_context(label) returns a multi-line text snapshot of the CPU at the moment of a breakpoint hit:

  • All 32 GPRs by MIPS name (zero, at, v0, …, ra), four per row.
  • The instruction words around PC (pc-0x20..pc+0x60, i.e. 0x80 bytes or 32 words), printed as 8 rows of 16 bytes, with a <- pc marker on the row containing PC. This lets the reader see the calling instruction context without round-tripping through Ghidra.
  • The 32 stack words at sp (sp..sp+0x80), 4 per row. The MIPS calling convention saves ra into a sp-relative prologue slot for any non-leaf function, so this captures the visible ra-chain without DWARF unwind info. Walking the chain still requires reading the prologue offsets out of the disassembly post-hoc, but the bytes you need to do that are already in the snapshot.

probe.append_call_context(path, snap) is the matching writer; it opens the file in append mode so multi-shot probes can stack snapshots without overwriting earlier ones. The slot-4 reader and the XP-table probe both use this for first-hit detail dumps.
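
To make the code-window layout concrete, here is a Python sketch that renders a pc-0x20..pc+0x60 window (0x80 bytes, so 32 words in rows of 16 bytes) and marks the row containing PC. The exact text layout of the Lua snapshot is not reproduced here; the function name and column format are assumptions.

```python
def format_code_window(pc, words, base=None, per_row=4):
    """Render 32-bit words as rows of per_row words, flagging the row that holds pc.

    words is a list of ints read starting at base (default: pc - 0x20).
    """
    if base is None:
        base = pc - 0x20
    lines = []
    for i in range(0, len(words), per_row):
        row_addr = base + i * 4
        row = words[i:i + per_row]
        text = f"{row_addr:08X}: " + " ".join(f"{w:08X}" for w in row)
        if row_addr <= pc < row_addr + len(row) * 4:
            text += "  <- pc"
        lines.append(text)
    return lines


if __name__ == "__main__":
    # Dummy instruction words, for illustration only.
    for line in format_code_window(0x801DA51C, list(range(32))):
        print(line)
```

With the default base, PC always sits at the start of the third row, which makes the marker easy to spot when scanning a long detail file.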

Write-watchpoint logging (probe.watch)

The recurring “what writes this address?” probe arms a Write breakpoint and, in the callback, logs (elapsed, label, addr, pc, ra, new_value) to a CSV plus a first-N call-context dump. probe.watch factors that closure out; it composes bp + mem + snapshot and adds no new emulator interaction. Create a watcher with probe.watch.new{ csv=…, detail_path=…, elapsed=… }, then arm each address with w:arm(addr, width, label).

Early-quit signal

probe.run polls ctx.request_quit each vsync and exits the capture loop on the next tick if it's set. Probes use this to bail as soon as their stop condition is met (e.g. every probe in a sweep has hit at least once), instead of waiting for LEGAIA_FRAMES to elapse:

on_capture = function(ctx, _elapsed)
    if every_probe_hit() then
        ctx.request_quit = true
    end
end,

Symbolic breakpoint addresses

Hard-coded 0x801DA51C-style breakpoint targets break across overlay re-imports that shift function entry points. The symbol resolver accepts Ghidra-canonical names from two sources:

  • Function entry points (FUN_801DA51C, slot-4 k10_shared labels, named overlays). Source: per-function dump headers under ghidra/scripts/funcs/*.txt.
  • Global data labels (DAT_8007078C / _DAT_8007BCD0, both case forms accepted). Source: the same dump-header walk, plus a regex harvest of DAT_xxxxxxxx references from the decomp body content (so DAT names show up even before dump_globals.py has been run for a given program), plus a dedicated dump_globals.py Jython script for authoritative names + lengths.

Three ways to use it:

-- Bespoke autorun:
local symbols = require("probe.symbols").load()
probe.arm_breakpoint(symbols.FUN_801DA51C, "Exec", 4, "world_map_sm", cb)

# .probe.toml: addr/base accept either an int or a symbol-name string.
[[breakpoint]]
addr = "FUN_801DD35C"     # resolves at spec-load time
kind = "Exec"
[[breakpoint]]
addr  = "_DAT_801EF16C"
kind  = "Read"
width = 4

# Regenerate after adding new dumps (covers funcs/* dumps and globals_*).
python3 scripts/pcsx-redux/build-symbols.py
# Authoritative globals (one-time per program; optional but lossless):
docker compose exec ghidra /ghidra/support/analyzeHeadless /projects legaia \
    -process SCUS_942.54 -noanalysis -postScript /scripts/dump_globals.py
# ... or pass `-process overlay_<name>.bin` for per-overlay globals.
python3 scripts/pcsx-redux/build-symbols.py

The resolver fails loudly on a typo'd symbol name. Without that check, arming a breakpoint at nil silently captures zero hits and the probe runs to completion with no diagnostic. The hex portion of the name is case-insensitive: docs use FUN_801DD35C, Ghidra emits FUN_801dd35c, and both resolve identically.
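
The fail-loudly and case-insensitive behaviour can be sketched in Python as follows. The shape of symbols.json is not specified on this page, so the sketch assumes a flat name-to-address mapping; canonical and resolve are hypothetical names, not the project's API.

```python
import re

# Prefix such as FUN_, DAT_ or _DAT_, followed by eight hex digits.
_SYMBOL = re.compile(r"^(_?[A-Za-z]+_)([0-9A-Fa-f]{8})$")


def canonical(name: str) -> str:
    """Upper-case only the hex portion so FUN_801dd35c and FUN_801DD35C agree."""
    match = _SYMBOL.match(name)
    if match:
        return match.group(1) + match.group(2).upper()
    return name


def resolve(symbols: dict, name: str) -> int:
    """Return the address for name, or raise instead of handing back None."""
    table = {canonical(key): value for key, value in symbols.items()}
    key = canonical(name)
    if key not in table:
        raise KeyError(f"unknown symbol {name!r}: check the spelling or regenerate the symbol table")
    return table[key]
```

Raising at lookup time moves the failure to spec-load, before any breakpoint is armed.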

scripts/pcsx-redux/probes/_check_specs.py cross-validates every .probe.toml spec's symbol references against symbols.json so a typo'd symbol fails CI rather than the probe run.

Things that catch people out

  • Breakpoint width matters. lbu from a watched word triggers only when the width-1 byte falls inside the breakpoint's range. Arming a width-4 probe at an LW target works; arming a width-1 probe at an LBU target works; mismatches silently miss hits.
  • GP-relative addresses are decided at runtime. A naive hard-coded address can be wrong across overlay swaps. Read gp from PCSX.getRegisters() after the save-state load, then compute breakpoint addresses from there.
  • Sign-extended u64s in Lua. PCSX-Redux returns CPU register values as signed Lua numbers (64-bit doubles). gp = 0xFFFFFFFF8007B318 is the sign-extended display of 0x8007B318. Use bit.band(v, 0xFFFFFFFF) to normalise before formatting.
  • In-RAM guard predicates. Pure bitwise comparisons against literals like 0x80000000 interact with Lua's 32-bit signed return shape from bit.band — the literal is the unsigned 2147483648 while the bit-result is the signed -2147483648, so ~= returns true even when the addresses match. Use the explicit bit.band(addr, 0x1FFFFFFF) < RAM_SIZE form from the existing helpers; don't reinvent it.
  • GPU::Vsync events fire on game-driven VSync(0) calls, not 60 Hz hardware. PCSX-Redux delivers GPU::Vsync when the game calls the PsyQ VSync(0) library routine (libetc), which is sparse during boot init / CD-DMA phases. A probe waiting on vsync_count >= 600 to fire during boot can sit for minutes of wall time even when emulator-time has advanced past the target. For boot-phase timing use a memory watchpoint at a known transition register (e.g. _DAT_801EF16C title countdown) instead of a vsync-count target; the watchpoint fires precisely when the game writes the state transition.
  • Don't readAt(2 MiB, 0) inside a vsync callback. A single 2 MiB PCSX.getMemoryAsFile():readAt(...) call permanently degrades subsequent GPU::Vsync event delivery in the same emulator launch — subsequent callbacks fire rarely or not at all (probably a heavyweight Lua GC pass disrupts the event loop). One-shot full-RAM dumps work because the script transitions to a quit-soon state after the single read; multi-snapshot probes break. Workarounds: keep individual reads small (64 KiB at a time is safe), or take one dump per emulator launch (chained single-shots).
  • PCSX.quit(0) doesn't always exit the process. Wrap every probe invocation with timeout --kill-after=10s <budget> so a hung emulator gets reliably killed. The captured data is already on disk by the time PCSX.quit fires — the timeout-kill is purely cleanup.
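
The two signedness pitfalls in the list above come down to comparing values that share their low 32 bits but differ as numbers. The Python sketch below models only that arithmetic; Python integers are arbitrary precision, so it does not reproduce LuaJIT's bit library, and to_u32 and same_address are hypothetical helper names.

```python
def to_u32(value: int) -> int:
    """Keep only the low 32 bits, as an unsigned value."""
    return value & 0xFFFFFFFF


def same_address(a: int, b: int) -> bool:
    """Compare two register-style values by their low 32 bits."""
    return to_u32(a) == to_u32(b)


sign_extended_gp = 0xFFFFFFFF8007B318  # sign-extended display from the text
signed_result = -2147483648            # signed 32-bit shape of 0x80000000
literal = 0x80000000                   # unsigned literal, 2147483648

print(f"0x{to_u32(sign_extended_gp):08X}")  # 0x8007B318
print(signed_result != literal)              # True: the naive comparison mismatches
print(same_address(signed_result, literal))  # True: equal once both are normalised
```

Normalising both sides before comparing is the safe habit when post-processing CSV columns that were written from raw register values.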

Catalogue

The committed scripts live in scripts/pcsx-redux/. Each Lua file documents its purpose in a header comment block; the catalogue here is the high-level index.

Runtime probes (Lua autorun)

| Script | Probes | What it answered |
| --- | --- | --- |
| autorun_world_map_probe.lua | Reads at _DAT_8007BCD0..D8 (gate-arm params), gate flag _DAT_801F351C writes, and four FUN_801D7EA0 entries | Pins the world-map POLY_FT4 emitter's one-shot gate flag + the three-param block driving it. |
| autorun_world_map_fog_probe.lua | Reads at five fog fields (GP-relative -0x2E0 / -0x2DC / -0x2D1 / -0x2BC / +0x90) + 1 KiB LUT dump | Captures the per-Z fog-tint LUT the overlay leaves at 0x801F7644..0x801F8690 consult on every vertex. |
| autorun_prim_pool_writers.lua | Writes across the 341 KB GPU prim pool at 0x800AD400+ | Confirms the eight overlay-resident high-mode renderers are the ones writing the pool (matches FUN_80043390's dispatch table). |
| autorun_lzs_and_bundle_probe.lua | LZS decode entries + bundle dispatcher (FUN_8001F05C) during world-map load | Pins which PROT entries get LZS-decoded for the world-map bundle. |
| autorun_slot4_consumer_pcs.lua | Exec bps at the cluster-A + cluster-B LW PCs identified during the slot-4 RE | Kingdom-agnostic: hits the same SCUS function PCs regardless of where slot 4 lives in RAM for the destination kingdom. Confirmed cross-kingdom: cluster A and B fire on Drake, Sebucus (town → map02) and Karisto (town → map03) with the same caller RAs (cluster B's RA 0x80059C00 is byte-identical across all three; cluster A's RAs 0x8001B47C inside FUN_8001ada4 + 0x801F78D4 world-map overlay are present in every kingdom). Hit-count scales with per-kingdom record count. Output CSV is probe_idx, cluster, pc, name, ra, a0..a3, s8; .detail.txt sidecar captures first-hit call-context per PC. LEGAIA_PC_CAP=N raises the default 200-hit-per-PC cap for uncapped totals. |
| autorun_slot4_dispatcher_args.lua | Exec bp at 0x80043390 (cluster A dispatcher entry) | Captures the original call args before the kind handlers clobber a1 / a2: caller RA, descriptor pointer a0, packed cmd_flags (a1), fade_flags (a2), and the first command word's kind / count. Use this to classify which of the four dispatcher banks (0x00 / 0x50 / 0xA0 / 0xF0) each call lands in. LEGAIA_DISP_CAP=N raises the default 200000-hit cap. |
| autorun_dump_slot4.lua | Dumps the slot-4 RAM region directly | Produces the ground-truth byte buffer for verify_slot4_in_ram.py. |
| autorun_xp_table_reader.lua | Read bps tiled across 0x8007123C..0x80071300 | Originally written to pin the runtime XP-table reader. Superseded — the real XP curve is DAT_80076AF4, read by the overlay applier FUN_801E9504; the old 0x8007123C target is an off-by-0x800 artefact over a sin-LUT slice (see level-up XP table). Re-target the bps to 0x80076AF4 before re-running. The CSV / detail-sidecar shape of the probe is generic and reusable for any tiled-read-bp scan. |
| autorun_field_pack_projection.lua | Exec bp at FUN_8001F7C0 (scene asset loader) entry; one-shot Exec bp at the loader's return address; dumps post-load RAM window | Captures the loader's on-disc → RAM projection that a single save state can't observe. LEGAIA_HOLD_BUTTON / LEGAIA_HOLD drive the warp-tile input from inside the probe; the run quits ~30 vsyncs after the first post-load dump. Diff via scripts/pcsx-redux/diff_field_pack_projection.py against the on-disc PROT bytes. World-map scenes (map01 / map02 / map03) are not field-pack-formatted — running against them produces a 75 KB GP0-primitive pool projection at _DAT_8007B8D0 - 0x12800 instead. |
| autorun_dump_full_ram.lua | Dumps the full 2 MiB main RAM | One-shot snapshot for downstream analysis. One dump per launch only — see the readAt(2 MiB) caveat above. |
| autorun_boot_walk_snapshots.lua | Multi-snapshot RAM-and-register probe; dumps at each emulator vsync in LEGAIA_TARGETS (comma-separated) with chunked reads spread across vsync callbacks | Walks a save state through several timeline points in one emulator launch. Known limitation: the chunked-read workaround works for ~2-4 close-together snapshots but degrades past ~10 chunks; for high-vsync targets prefer chained single-shots of autorun_dump_full_ram.lua. |
| autorun_countdown_trigger.lua | Memory write-watchpoint at LEGAIA_WATCH_ADDR (default 0x801EF16C, the title-attract countdown); width-2 Write BP. Optional screenshot via PCSX.GPU.takeScreenShot() taken inside the BP callback before the deferred RAM dump. | Watchpoint-driven RAM + screenshot snapshot: fires the dump at the exact moment the game writes the watched register. LEGAIA_HIT_SKIP ignores the first N hits before snapshotting (default 1 to skip the boot-time DMA write). LEGAIA_DUMP_BASE / LEGAIA_DUMP_LEN restrict the dump window (default 0x801C0000 / 0x40000 = overlay window). Decode the screen to PNG via scripts/pcsx-redux/decode_pcsx_screen.py. Pinned FUN_801DD35C as the title-overlay tick — see boot — tick function. |
| autorun_player_pos_watch.lua | Write-watchpoint on the player actor world-position fields (*(0x8007C364) + 0x14 X / +0x18 Z), armed lazily in on_capture after the save loads (the target is a runtime pointer deref). Cycles the four d-pad directions (camera facing unknown) so at least one produces a position write. | Pinned the town/field free-movement integrator: hits land in FUN_801d01b0 (overlay 0897) at the four sh player[+0x14/0x18] stores 0x801D0684/06E4/0744/07B4, with collision via FUN_801cfe4c. CSV columns tick, axis, write_addr, pc, ra, new_val + a .detail.txt call-context sidecar. Run against a save parked in a walkable field/town. See subsystems/field-locomotion.md. |
| autorun_man_source.lua | Exec breakpoint at the asset-type dispatcher FUN_8001F05C, filtered to the MAN dispatch (a1 >> 24 == 3). On hit logs a0 (source pointer), size, a2/a3 flags, caller RA, and the resulting _DAT_8007b898 buffer, captures call context, and dumps the source bytes; also dumps the resident MAN at capture start. Drive a transition with LEGAIA_HOLD_BUTTON / LEGAIA_HOLD. | Pinned a field scene's runtime MAN source (_DAT_8007b898). Caller is FUN_80020224, the scene_asset_table walker that reads the table base from _DAT_8007b85c and feeds the dispatcher source = table_base + descriptor.data_offset. Captured a standalone-town load: the MAN's LZS stream byte-matches a count=6 scene_asset_table descriptor in the town's own PROT block - the variant a strict count-7 detector skipped. Run against the overworld_into_town_man_load scenario (Down ~0.75s into a town entrance). |
| autorun_title_overlay_writer_hunt.lua | Write bps at 8 anchor addresses across the title-overlay code region (0x801CC000..0x801EF018) | Pins the SCUS-side title-overlay loader: any write into the overlay window fires a BP whose pc + ra + call-context dump identify the writer function. Run cold-boot (LEGAIA_NO_SSTATE=1) since in-game saves are past the load point. |
| autorun_monster_record_source.lua | Exec bps at the monster init FUN_80054CB0 (logs the live record: name / HP / MP / stats), the battle archive loader FUN_800542C8, the relative disc-seek FUN_8003E964 (a0 = (id-1)*40 sectors → monster id), the generic disc read FUN_8003E800 (logs the CdlLOC → disc LBA → PROT.DAT offset for 40-sector reads), and the retail host-trap open FUN_800608F0. | Pinned the monster stat archive to PROT entry 0867_battle_data (extended footprint): per-id 0x14000 LZS slot at (id-1)*0x14000. Run against a battle save (Rim Elm scripted fights). Three decoded records match the live actor stats byte-for-byte. The monster_data label (PROT 869) is a stub. See battle — monster archive. |
| autorun_battle_reward_source.lua | Write breakpoints on the XP accumulator 0x80084440, party gold 0x8008459C, party XP bank 0x800845A4, and a candidate gold accumulator; each hit logs the writing PC + all GPRs + the new value, and the staged totals are snapshotted each second. Exec bps at the commit FUN_80026018 and monster-init FUN_80054CB0. | Confirmed the victory reward path. Run against the rim_elm_gimard_victory scenario (a lone-enemy fight captured mid-combo so it resolves without input). Gimard's gold went 500 → 515 (+15) via a write at FUN_8004E568, matching the record's base gold (+0x44=60) through the lone-enemy floor((gold>>1)/2) formula. Pinned the reward fields to record +0x44..+0x49 (gold / EXP / drop id / drop %). See battle-formulas — victory spoils. |
| autorun_title_staging_capture.lua | Exec bp at FUN_8001A55C (LZS decoder); per-decode src buffer dump | Pins the PROT source of the title overlay. Each fired decode dumps the compressed source bytes to <OUT_DIR>/decode_NNN_*.bin; an offline script byte-matches against PROT entries. Run cold-boot. |
| autorun_load_screen_dump.lua | Loads sstate9 (parked on the Continue → Load screen), settles LEGAIA_FRAMES vsyncs, then dumps the rendered framebuffer via PCSX.GPU.takeScreenShot() + full 2 MiB main RAM | Ground-truth capture for pinning the load-screen panel border + slot-pill source sprites. Output load_screen_fb.raw + .meta decode to PNG via scripts/pcsx-redux/decode_load_screen.py. The framebuffer pixels match PSX 320×240 coords 1:1, so sprite-rect dst positions can be measured directly. For full ground-truth VRAM (not just the rendered framebuffer), pair with extract_vram_from_sstate.py + decode_vram.py on the same save state — that pipeline pinned the load-screen panel CLUT to row 2 of the system-UI TIM at PROT.DAT[0x018E0]. The probe arms no breakpoints, so it runs with --fast for ~30s end-to-end. See save-screen — sprite asset sources. |
| autorun_town01_script_flow.lua | Exec bps at the scene-load init FUN_8003aeb0, the system-script prologue runner FUN_8003ab2c, the per-frame VM step FUN_801de840 (deduped into a per-context table keyed by a2 = ctx ptr: script_id ctx+0x50, bytecode ctx+0x90, pc range, hits), and the three nibble-7 collision-grid write sites 0x801e1d00 / 0x801e1d74 / 0x801e1e84. Dumps the live collision grid (*_DAT_1f8003ec + 0x4000, scratchpad-resolved) at first + last frame with a wall-tile count + ASCII map. | Pins a field scene's script execution model — which contexts run, their scripts, and whether walls are painted per-frame or only at load. On the field_walled_collision_pin scenario it showed: 7455 painted wall tiles, a single steady-state context (script_id 0xFB, bytecode 0x8010F092, looping pc 0x102..0x297 — matching the clean-room engine's static trace), and zero nibble-7 paints while standing still (walls are load-time only). To capture the load-time paint flow, replay a pre-transition save / drive a step into a scene exit so FUN_8003aeb0 + the nibble-7 BPs fire. See subsystems/field-locomotion.md. |
| autorun_audio_trace.lua | Calls PCSX.createSaveState() every LEGAIA_INTERVAL vsyncs; walks the protobuf in-place via FFI pointer arithmetic; slices out only the SPU sub-message (~600 KiB per capture vs. 20 MiB for the full state); appends to one binary stream prefixed with LEGSPU01 | Multi-frame retail-trace input for the I1b(b) audio-trace parity oracle. Pair with extract_audio_trace_from_sstates.py to decode into the JSONL AudioTraceFrame shape that legaia-engine audio-trace --retail-jsonl consumes. The probe runs against any save state — best signal comes from one parked mid-BGM. PCSX-Redux's Lua API does not expose the SPU register file directly, so createSaveState is the load-bearing primitive; the FFI walk avoids materialising the full 20 MiB state per vsync (which would degrade GPU::Vsync delivery via Lua GC pressure, same shape as the readAt(2 MiB) caveat above). |

Save-state to Python (offline analysis)

| Script | Input | Output |
| --- | --- | --- |
| dump_kingdom_ram_layout.py | .sstate files for the three kingdoms | Per-kingdom RAM-layout JSON used by the world-overview page. |
| walk_actor_lists.py | .sstate for a world-map session | Walks the seven actor-list heads + dumps per-actor records (used by resolve_actor_tmds.py). |
| resolve_actor_tmds.py | .sstate + the kingdom slot-1 TMD pack | Walks actor[+0x44] mesh-head chains, finds the containing TMD via backward magic-word search, maps to a pack slot. Output is site/world-overview-live.json. |
| verify_slot4_in_ram.py | autorun_dump_slot4.lua output | Confirms the live RAM region matches the disc-decoded slot-4 sub-bodies byte-for-byte. |
| diff_slot4_ram_vs_disc.py | Live + disc slot-4 bytes | Generates the byte-level diff visualisation. |
| match_prim_groups_to_disc.py | Live prim-pool dump + disc TMD pack | Matches POLY_FT4 prim groups back to their source TMD bodies. |
| diff_field_pack_projection.py | .post.NN.bin + .meta from the field-pack projection probe; on-disc LZS-decoded PROT entry | Walks the canonical 97-slot field-pack schema; for each slot, compares runtime RAM bytes against on-disc bytes and prints a per-slot diff sorted by changed-byte count, plus a hex preview of the first divergence per slot. |
| decode_pcsx_screen.py | <OUT>.screen + .screen.meta from autorun_countdown_trigger.lua (or any probe that calls PCSX.GPU.takeScreenShot()) | PNG of the visible framebuffer at the capture moment. Decodes BGR555 (bpp=16) or BGR888 (bpp=24). Pillow required for PNG output; falls back to raw RGB888 if Pillow is missing. |
| decode_load_screen.py | load_screen_fb.raw + .meta from autorun_load_screen_dump.lua | PNG of the rendered load-screen framebuffer. Dependency-free (uses stdlib zlib + manual PNG chunks); pixel coordinates match PSX 320×240 framebuffer 1:1. Pairs with the panel-source RE in subsystems/save-screen.md. |
| extract_audio_trace_from_sstates.py | The LEGSPU01-magic binary stream from autorun_audio_trace.lua | JSONL stream of AudioTraceFrame records consumed by legaia-engine audio-trace --retail-jsonl and the disc-gated audio_trace_multi integration test. Walks PCSX-Redux's SPU protobuf schema: 24 × Channel sub-messages (Chan::Data + ADSRInfo + ADSRInfoEx) plus the 512-byte SPU register file (MainVol_L / MainVol_R at offset 0x180/0x182, Reverb_Mode at 0x1AA). Voice "audible" = Chan::Data.on or Chan::Data.stop; ADSRInfoEx.state is the configured envelope shape and reads as Sustain for unused voices, so it is not a reliable audibility signal. |
| extract_vram_from_sstate.py | A PCSX-Redux .sstate* file | 1 MiB raw BGR555 VRAM blob (vram.bin). Gunzips the save state and finds the GPU.vram protobuf field (canonical tag 0x1A 0x80 0x80 0x40 = field 3, wire-type 2, length 0x100000). Dependency-free. The PCSX-Redux equivalent of mednafen-state vram-dump: ground-truth VRAM at any parked state, useful for back-referencing sprite sources and CLUT rows against the extracted TIM corpus. |
| decode_vram.py | vram.bin from extract_vram_from_sstate.py | 1024×512 PNG of the BGR555 VRAM. Stdlib-only. Pixel coords map 1:1 to PSX VRAM (fb_x, fb_y), so CLUT rows at fb_y=480+ and texture pages at fb_x≥640 are visible at a glance. |
| scan_panel_prims.py | A 2 MiB main-RAM dump (e.g. load_screen_ram.bin) + optional --rect X0 Y0 X1 Y1 framebuffer rect | Lists every GP0 textured-sprite primitive (cmd byte 0x64..0x67) whose dst falls in the rect, decoded into (dst_x, dst_y, u, v, clut_x, clut_y, w, h). Groups by CLUT so the unique source tiles each CLUT references stand out. Used to pin the 9-slice tile geometry of the load-screen panel (14 prims sampling CLUT row 2 of the system-UI TIM) — see subsystems/save-screen.md. |
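
The four-byte tag quoted for the GPU.vram field can be checked by hand with a small protobuf header decoder. The sketch below implements the standard varint and key encoding only; it is not the extraction script, and the function names are hypothetical.

```python
def read_varint(buf: bytes, pos: int = 0):
    """Decode one base-128 varint; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def parse_field_header(buf: bytes, pos: int = 0):
    """Return (field_number, wire_type, length_or_None, next_position)."""
    key, pos = read_varint(buf, pos)
    field, wire_type = key >> 3, key & 0x07
    length = None
    if wire_type == 2:  # length-delimited: a varint length follows the key
        length, pos = read_varint(buf, pos)
    return field, wire_type, length, pos


field, wire_type, length, _ = parse_field_header(bytes([0x1A, 0x80, 0x80, 0x40]))
print(field, wire_type, hex(length))  # 3 2 0x100000
```

The decoded length, 0x100000 bytes, equals 1024 × 512 pixels at two bytes each, which is the VRAM size the decode_vram.py row refers to.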

One-shot wrappers

run_probe.sh is the single canonical shell harness for every probe. It accepts both env vars (LEGAIA_LUA, LEGAIA_SSTATE, LEGAIA_OUT, …) and matching --lua / --sstate / --out / --scenario / --fast flags. Output defaults to captures/<probe-stem>/<iso-timestamp>/ so each run gets a fresh per-run subtree.

# Default world-map probe (interpreter mode, Lua BPs fire).
bash scripts/pcsx-redux/run_probe.sh

# Pick a different probe.
bash scripts/pcsx-redux/run_probe.sh --lua scripts/pcsx-redux/autorun_dump_slot4.lua

# Resolve the save state via a named scenario from scripts/scenarios.toml.
bash scripts/pcsx-redux/run_probe.sh --scenario cold_boot_pre_init \
    --lua scripts/pcsx-redux/autorun_countdown_trigger.lua

# Fast (recompiler) mode - drops `-interpreter -debugger`. Lua **BPs do
# NOT fire** under the recompiler, so this is only useful for
# vsync-event-only probes (e.g. autorun_dump_full_ram.lua).
bash scripts/pcsx-redux/run_probe.sh --fast \
    --lua scripts/pcsx-redux/autorun_dump_full_ram.lua

The earlier run_world_map_probe.sh / run_fast_probe.sh / run_dump_slot4.sh wrappers were folded into this one runner.

GDB-stub bridge (gdb_probe.py)

gdb_probe.py is the one-shot escape hatch. PCSX-Redux exposes a GDB Remote Serial Protocol stub on TCP port 3333 (settings: Emulator → GDB server port); this script speaks the protocol directly. Use it when the .probe.toml state machine is overkill — ad-hoc reads, single-shot "break-here-read-there" investigations, register dumps.

| Subcommand | Use |
| --- | --- |
| read-mem ADDR LEN [--out F] | Hex dump or raw bytes to file. ADDR is hex or a Ghidra symbol. |
| read-regs | Dump the 38-register MIPS set: the 32 GPRs plus special registers, including PC. |
| write-mem ADDR HEXBYTES | Patch memory in-flight. |
| when-pc-hits ADDR --read-mem A,L [--out F] | One-shot: arm exec BP, continue, read on hit, disarm. |
| watch ADDR LEN --kind {read,write,access} | Insert a watchpoint, print the stop reply when it fires. |
| selftest | Run protocol-codec + client self-tests against an in-process mock server (no live emulator needed). |

When to use this vs .probe.toml:

  • .probe.toml for repeatable captures that produce a CSV which probe.py regress can gate on.
  • gdb_probe.py for one-shot ad-hoc queries — no schema, no scenario, no state machine to author.

# Read 512 bytes of the kingdom slot-4 region in-flight:
scripts/pcsx-redux/gdb_probe.py read-mem 0x8011A624 512

# Dump registers right now:
scripts/pcsx-redux/gdb_probe.py read-regs

# One-shot break-and-read: when the title overlay tick fires, dump the
# attract-countdown register:
scripts/pcsx-redux/gdb_probe.py when-pc-hits FUN_801DD35C \
    --read-mem _DAT_801EF16C,16

Symbol names resolve via the same ghidra/scripts/symbols.json the Lua probe layer uses; a miss raises an error that carries the regenerate-via hint. Hex (0x801DE840, 801de840) is always accepted.
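
For readers unfamiliar with the wire format, the sketch below shows GDB Remote Serial Protocol packet framing: a payload wrapped as $payload#cc, where cc is the payload's byte sum modulo 256 in two hex digits. It illustrates the protocol, not the internals of gdb_probe.py; the function names are hypothetical, and escaping of special bytes in binary payloads is deliberately omitted.

```python
def rsp_checksum(payload: str) -> int:
    """Sum of the payload bytes, modulo 256."""
    return sum(payload.encode("ascii")) % 256


def rsp_frame(payload: str) -> str:
    """Wrap a payload as $<payload>#<two hex checksum digits>."""
    return f"${payload}#{rsp_checksum(payload):02x}"


def rsp_unframe(packet: str) -> str:
    """Validate the framing and checksum, then return the payload."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("not an RSP packet")
    payload, check = packet[1:-3], packet[-2:]
    if int(check, 16) != rsp_checksum(payload):
        raise ValueError("checksum mismatch")
    return payload


def read_mem_packet(addr: int, length: int) -> str:
    """Build the 'm addr,length' read-memory request (both fields in hex)."""
    return rsp_frame(f"m{addr:x},{length:x}")


print(rsp_frame("g"))   # $g#67  (read all registers)
print(rsp_frame("OK"))  # $OK#9a
```

A read of 512 bytes at 0x8011A624 is therefore sent with the payload m8011a624,200, since the length field is hexadecimal as well.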

Analysing probe outputs (probe.py)

probe.py is the Python-side companion to a .probe.toml run. It operates on the CSV outputs and provides four operations the Lua side intentionally doesn't try to do in-emulator:

| Subcommand | Use |
| --- | --- |
| probe.py summary RUN | Header + row count + canonical fingerprint. |
| probe.py fingerprint RUN | SHA-256 over canonicalised rows. Independent of row order and of any columns dropped with --ignore. |
| probe.py diff BASELINE CURRENT | Set-diff: added / removed rows. Useful for inspecting why two runs differ. |
| probe.py regress BASELINE CURRENT | Fingerprint compare. Exits 0 on match, 1 on regression. Foundation for Phase G CI gating. |

--ignore COL[,COL...] drops named columns before comparison / hashing. Use it for fields that naturally vary between runs without representing a regression — most commonly tick (the per-bp hit counter is order-dependent) and sometimes pc (when the same code path gets reached via different inlining decisions across overlay rebuilds).
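
One way to get a fingerprint that ignores row order and dropped columns is to hash the sorted, column-filtered rows. The sketch below shows that idea under stated assumptions: the canonicalisation probe.py actually applies is not documented here, so the whitespace stripping, the header being hashed, and the function name are all illustrative.

```python
import csv
import hashlib
import io


def fingerprint(csv_text: str, ignore=()) -> str:
    """SHA-256 over sorted rows, after dropping the ignored columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    keep = [col for col in reader.fieldnames if col not in ignore]
    rows = sorted(",".join(row[col].strip() for col in keep) for row in reader)
    digest = hashlib.sha256()
    digest.update((",".join(keep) + "\n").encode("utf-8"))
    for row in rows:
        digest.update((row + "\n").encode("utf-8"))
    return digest.hexdigest()


baseline = "tick,addr,pc\n1,0x10,0xA\n2,0x20,0xB\n"
current = "tick,addr,pc\n7,0x20,0xB\n9,0x10,0xA\n"
print(fingerprint(baseline, ignore=("tick",)) == fingerprint(current, ignore=("tick",)))  # True
print(fingerprint(baseline) == fingerprint(current))                                      # False
```

Sorting before hashing is what makes the result independent of hit order; dropping tick first is what makes it independent of the counter values.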

# Re-run a probe spec, compare against a committed baseline:
bash scripts/pcsx-redux/run_probe.sh --spec scripts/pcsx-redux/probes/xp_table_readers.probe.toml
scripts/pcsx-redux/probe.py regress \
    captures/baselines/xp_table_readers.csv \
    captures/xp_table_readers/<latest>/xp_table_readers.csv \
    --ignore tick

Authoring a new probe

Two shapes are supported, in order of preference:

Declarative .probe.toml (simple probes)

For "arm N breakpoints, dump K columns to CSV" or "settle then dump a RAM region", the probe is a single TOML file under scripts/pcsx-redux/probes/ with no Lua code at all. The shared probes/_runner.lua parses the spec via lib/probe/toml.lua and dispatches into lib/probe/spec.lua.

Schema (see probes/xp_table_readers.probe.toml for the breakpoint-fan-out case and probes/dump_full_ram.probe.toml for the RAM-dump case):

scenario        = "title_attract"   # informational; LEGAIA_SSTATE wins
capture_frames  = 600
output_path     = "my_probe.csv"
capture_columns = ["tick", "addr", "pc", "ra", "value_u32"]

[detail]                            # optional: first N hits get full
hits = 8                            # register/code/stack snapshots in a
path = "my_probe.detail.txt"        # .detail.txt sidecar

[[breakpoint]]                      # individual breakpoint
addr  = 0x80017EC8
kind  = "Exec"                      # "Exec" | "Read" | "Write"
width = 4
name  = "world_map_tick"

[[breakpoint_range]]                # fan out N adjacent breakpoints
base     = 0x8007123C
length   = 196                      # bytes
stride   = 4                        # bytes per bp
kind     = "Read"
name_fmt = "xp+0x%03X"              # %X / %x / %d = byte offset from base

Capture-column vocab (built into lib/probe/spec.lua): tick, addr, offset, pc, ra, sp, width, value_u8 / value_u16 / value_u32.
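
To see what a [[breakpoint_range]] entry amounts to, the sketch below expands base / length / stride into individual (address, name) pairs. It assumes one breakpoint per stride bytes over [base, base + length) with name_fmt applied to the byte offset; the actual expansion in lib/probe/spec.lua may differ, and expand_range is a hypothetical name.

```python
def expand_range(base: int, length: int, stride: int, name_fmt: str):
    """Return [(address, name), ...] for every stride-sized step in the range."""
    return [(base + offset, name_fmt % offset) for offset in range(0, length, stride)]


bps = expand_range(0x8007123C, 196, 4, "xp+0x%03X")
print(len(bps))                      # 49
print(hex(bps[0][0]), bps[0][1])     # 0x8007123c xp+0x000
print(hex(bps[-1][0]), bps[-1][1])   # 0x800712fc xp+0x0C0
```

With the schema's example values, the 196-byte range yields 49 four-byte breakpoints, the last of which ends exactly at 0x80071300.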

Run it:

bash scripts/pcsx-redux/run_probe.sh \
    --spec scripts/pcsx-redux/probes/my_probe.probe.toml \
    --scenario title_attract     # or --sstate /path/to/state.sstate

Validate the schema (without launching PCSX-Redux):

python3 scripts/pcsx-redux/probes/_check_specs.py

If lua5.1 is available, the validator also parses each spec via lib/probe/toml.lua and asserts the structural output matches Python's tomllib — catches divergence between the Lua TOML reader and the canonical TOML spec.

Lua autorun (bespoke probes)

For anything more elaborate (per-hit logic that depends on register state, multi-state-machine probes, dynamic breakpoint arming, etc.), write a Lua autorun. The fastest path:

  1. Start from scripts/pcsx-redux/autorun_slot4_consumer_pcs.lua — the canonical thin probe (~145 lines) that uses the shared library for everything except the per-probe breakpoint body.
  2. Edit the PROBE_OFFSETS (or your own probe-address list), the CSV header, and the per-hit row written from inside the breakpoint callback. The boot-delay / capture-vsync / disarm state machine comes from probe.run({...}) — don't reimplement it.
  3. Run with the harness:
    LEGAIA_LUA=scripts/pcsx-redux/autorun_your_thing.lua \
    LEGAIA_OUT=/tmp/your_probe.csv \
        bash scripts/pcsx-redux/run_probe.sh
  4. Iterate on the live CSV. The harness re-launches the emulator per run; the CSV is overwritten each time. While the probe is running, the snapshot file (<probe>.hits.txt next to the CSV) is rewritten every 60 vsyncs — tail it from another shell to watch hit counts climb live.

When the probe surfaces a useful signal, commit the Lua file under scripts/pcsx-redux/ and update the catalogue table above. The CSV output itself is gitignored: it's a per-run artifact, not project state.

See also