The Epic, As It Was Written
The browser
backlog carried a long-standing epic called “PSP JavaScript
integration”. Its scope was ambitious: vendor
QuickJS-NG via the
cc crate, hand-write a thin FFI wrapper (explicitly
not rquickjs), wire it through oasis-js,
enable the javascript feature on oasis-browser
for the PSP build, and ship JavaScript on a 2004 handheld.
Two assumptions in that plan turned out to be wrong. The first was
that rquickjs couldn’t be used because it depends on
std::time::Instant, which the project memory insisted
“crashes on PSP Allegrex (confirmed by testing).” The second
was that the QuickJS C source would be the most tractable starting
point. Fixing the first assumption also made the second one
unnecessary.
This entry is the story of those two assumptions, a stack overflow that looked like a kernel fault, and a pure-Rust JavaScript engine running on hardware Sony shipped the year the Nintendo DS came out.
Symptom 1: “Instant Crashes on Allegrex” (Except It Doesn’t)
Before writing any new JavaScript integration code, I started with
a sanity check: was the claim in the project memory about
std::time::Instant still true? A two-minute test in
the TCP command server would tell us — just call
Instant::now() twice with log writes between every
step, and observe where it dies.
fn run_instant_timetest(cfd: i32) {
    log_msg("[timetest] start");
    let t0 = std::time::Instant::now();
    log_msg("[timetest] Instant::now() #1 ok");
    psp::thread::sleep_ms(10);
    let t1 = std::time::Instant::now();
    let elapsed = t1.duration_since(t0).as_micros();
    log_msg(&format!("[timetest] elapsed={elapsed}us"));
}
Built, deployed, rebooted. And then the PSP froze during boot at the
“Loading config…” splash, before
run_instant_timetest had any chance to run. The bug was
somewhere else entirely.
The first step in any PSP debugging session is to decide whether
the regression came from the oasis-os branch or from
one of its dependencies. The rust-psp submodule had
three recent commits:
- #28: eliminating static mut from kernel and dialog modules
- #29: arena allocator + strided CSC + BorrowedBuf init revert
- #30: adopt new BorrowedCursor API (advance_checked)
The arena allocator (#29) looked suspicious — it
replaced the old “one kernel block per Rust allocation”
scheme with a single 8 MB block backed by
linked_list_allocator. I rolled rust-psp back to
before #29, rebuilt, deployed via recovery mode,
rebooted… and the freeze was exactly the same.
Not the allocator.
Reading eboot.log Through a Bricked EBOOT
Diagnosing a boot-time freeze on the PSP means getting
eboot.log off the memory stick while the device
won’t boot. The recovery EBOOT
(Entry 08) supported
upload but not readfile, so I added one:
// oasis-recovery-psp: read file and stream over TCP
fn read_file(cfd: i32, path: &[u8]) {
    // path_buf: presumably a NUL-terminated copy of `path`; its setup is elided in this excerpt
    let fd = unsafe { psp::sys::sceIoOpen(path_buf.as_ptr(), RD_ONLY, 0) };
    let size = unsafe { psp::sys::sceIoLseek(fd, 0, IoWhence::End) };
    unsafe { psp::sys::sceIoLseek(fd, 0, IoWhence::Set) };
    send(cfd, format!("{size}\n").as_bytes());
    // ... stream the file back in 4 KB chunks ...
}
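The wire format here is just a decimal size line followed by the raw bytes. A minimal host-side sketch of that framing, using std::io traits instead of the PSP's sceIo calls, might look like the following. The function name stream_with_size_header and the CHUNK constant are hypothetical; only the "size, newline, 4 KB chunks" shape comes from the snippet above.

```rust
use std::io::{self, Read, Write};

/// Chunk size for streaming; 4 KB matches the figure in the excerpt above.
const CHUNK: usize = 4096;

/// Writes "<size>\n" and then copies up to `size` bytes from `src` to `out`.
/// Returns the number of payload bytes actually sent.
/// (Hypothetical helper: a host-side model, not the recovery EBOOT's code.)
fn stream_with_size_header<R: Read, W: Write>(
    mut src: R,
    size: u64,
    mut out: W,
) -> io::Result<u64> {
    out.write_all(format!("{size}\n").as_bytes())?;
    let mut buf = [0u8; CHUNK];
    let mut sent = 0u64;
    while sent < size {
        // Never ask for more than one chunk or more than what remains.
        let want = (size - sent).min(CHUNK as u64) as usize;
        let n = src.read(&mut buf[..want])?;
        if n == 0 {
            break; // source ended early; the caller can compare against `size`
        }
        out.write_all(&buf[..n])?;
        sent += n as u64;
    }
    Ok(sent)
}

fn main() -> io::Result<()> {
    let log = b"[EBOOT] creating theme...\n";
    let mut wire = Vec::new();
    let sent = stream_with_size_header(&log[..], log.len() as u64, &mut wire)?;
    println!("sent {sent} payload bytes, {} bytes on the wire", wire.len());
    Ok(())
}
```

Sending the size first lets the client know when the transfer is complete without relying on the connection closing.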
Rebuilt the recovery binary, uploaded it over WiFi into
ms0:/PSP/SAVEDATA/ARK_01234/RECOVERY.PBP, and on the
next R-trigger boot ARK-4 loaded the enhanced recovery. I could
now read eboot.log from a device that refused to boot
into the main OS. The log showed a consistent pattern:
[EBOOT] creating theme...
[EBOOT] skin_key=psix
[EBOOT] preset resolved
[theme] to_active_theme entered
[theme] base_colors extracted
[theme] probe alloc ok: len=5
[theme] probe dropped
[theme] probe2 alloc ok: len=11
[theme] probe2 dropped
[theme] probe3: creating HashMap
<crash>
Interesting. Simple String::to_string allocations
worked. HashMap::new() did not. And crucially,
HashMap::new() calls RandomState::new(), which
seeds itself on first use, and on PSP that seeding routes through
the rust-psp std overlay's
sys/random/psp.rs.
Root Cause 1: A [u8] Where a [u32] Was Required
The PSP std overlay backed the Mersenne Twister context with a plain byte array:
// BEFORE — rust-std-src/library/std/src/sys/random/psp.rs
static mut MT_CTX: [u8; 2504] = [0u8; 2504];
Straightforward, right? Except the C ABI helpers in
psp/src/std_support/random.rs cast that pointer to
*mut SceKernelUtilsMt19937Context, whose definition
is:
#[repr(C)]
pub struct SceKernelUtilsMt19937Context {
    pub count: u32,
    pub state: [u32; 624usize],
}
That struct requires 4-byte alignment. The
byte-array static gets 1-byte alignment by default. On x86, ARM,
and PPSSPP's HLE nobody cares — unaligned word loads are
tolerated or silently fixed up. On MIPS Allegrex,
lw/sw on a non-4-aligned address
traps, and sceKernelUtilsMt19937Init starts
issuing sw instructions the moment you call it.
So: the very first HashMap::new() call on the device
seeded RandomState, which called getrandom,
which called the PSP std overlay's
fill_bytes, which called
__psp_mt19937_init on a misaligned context, which
trapped the kernel the moment it tried to touch state[0].
The fix was nine lines:
// AFTER — aligned to the struct’s actual requirement
#[repr(align(4))]
struct MtCtx([u32; 626]);
static mut MT_CTX: MtCtx = MtCtx([0u32; 626]);
fn mt_ctx_ptr() -> *mut u8 {
    unsafe { (&raw mut MT_CTX) as *mut u8 }
}
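The mismatch is easy to see on the host, even though the host never traps on it. The sketch below restates the two types and adds compile-time guards, so a backing store that is too small or under-aligned fails the build instead of the boot. The can_back helper is hypothetical, and size plus alignment are necessary conditions for the cast, not a full soundness proof.

```rust
use std::mem::{align_of, size_of};

/// Same layout as the struct quoted above.
#[allow(dead_code)]
#[repr(C)]
pub struct SceKernelUtilsMt19937Context {
    pub count: u32,
    pub state: [u32; 624],
}

/// The fixed backing store from the AFTER snippet.
#[allow(dead_code)]
#[repr(align(4))]
struct MtCtx([u32; 626]);

// Build-time guards: compilation fails if the backing store can no longer
// hold the context struct.
const _: () = assert!(align_of::<MtCtx>() >= align_of::<SceKernelUtilsMt19937Context>());
const _: () = assert!(size_of::<MtCtx>() >= size_of::<SceKernelUtilsMt19937Context>());

/// True when buffer type `B` meets the size and alignment needs of `T`.
/// (Hypothetical helper for illustration.)
fn can_back<B, T>() -> bool {
    align_of::<B>() >= align_of::<T>() && size_of::<B>() >= size_of::<T>()
}

fn main() {
    println!("[u8; 2504] align: {}", align_of::<[u8; 2504]>());
    println!(
        "context align: {}, size: {}",
        align_of::<SceKernelUtilsMt19937Context>(),
        size_of::<SceKernelUtilsMt19937Context>()
    );
    println!(
        "byte array ok? {}",
        can_back::<[u8; 2504], SceKernelUtilsMt19937Context>()
    );
    println!("MtCtx ok? {}", can_back::<MtCtx, SceKernelUtilsMt19937Context>());
}
```

A byte array of the right size still fails the check, because its alignment is 1 regardless of its length.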
Rebuilt. Deployed via recovery. Rebooted. The device booted into the dashboard for the first time in hours. Main loop running, frame counter incrementing, voronoi shader rendering the retro-cga wallpaper at 2.5 ms/frame. The bricking regression was fixed. Time to run the original experiment.
Symptom 2: Instant::now() Still Crashes
$ echo "timetest" | nc 192.168.0.249 9293
timetest: start
$ echo "ping" | nc 192.168.0.249 9293
<silence>
The alignment fix got boot working, but
std::time::Instant::now() still crashed. The cmd_server
thread reported “start”, and then nothing — the
whole EBOOT watchdog-reset before the next log line could be
written. The original memory note about Instant was, somehow, still
pointing at something real.
A look at the generated binary told the whole story:
$ llvm-nm target/mipsel-sony-psp-std/release/oasis-backend-psp | grep Instant
002996a8 t _RNvMNtNtNtCs4sz6D5RTzsi_3std3sys4time11unsupportedNtB2_7Instant3now
sys::time::unsupported::Instant::now.
The linker was resolving std::time::Instant::now to
the unsupported shim — the one whose implementation
is literally:
// rust-std-src/library/std/src/sys/time/unsupported.rs
pub fn now() -> Instant {
    panic!("time not implemented on this platform")
}
But the rust-psp overlay had a perfectly functional
Instant backed by
sceKernelGetSystemTimeWide in
rust-std-src/library/std/src/sys/pal/psp/time.rs.
Why wasn't the linker picking it up?
The answer was an upstream restructure that the overlay hadn’t
followed. Current nightly std moved platform time implementations
from sys/pal/<target>/time.rs to
sys/time/<target>.rs, with selection driven by a
top-level cfg_select! in
sys/time/mod.rs:
cfg_select! {
    target_os = "windows" => { mod windows; use windows as imp; }
    target_family = "unix" => { mod unix; use unix as imp; }
    target_os = "hermit" => { mod hermit; use hermit as imp; }
    // ... many more arms ...
    _ => { mod unsupported; use unsupported as imp; }
}
No target_os = "psp" arm. PSP fell through to the
_ catch-all, got the panicking
unsupported::Instant, and the project memory inherited
three months of incorrect lore about “Instant crashes on
Allegrex.” The rust-psp overlay's
pal/psp/time.rs had been orphaned since the
upstream restructure and nobody noticed because the only
symptom was Instant::now() — which everyone had
already been told not to call.
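One way to catch this kind of drift earlier would be a smoke test that actually calls the API rather than trusting that it links. The sketch below is a host-side version, and probe_instant is a hypothetical name. It assumes panics unwind, which is the default on desktop targets; on a target built with panic = "abort" the catch would never fire, and breadcrumb logging like the timetest above would be the signal instead.

```rust
use std::panic;
use std::time::{Duration, Instant};

/// Calls Instant::now() twice around a short sleep and returns the measured
/// gap, or None if the platform implementation panicked.
/// (Hypothetical helper; assumes unwinding panics.)
fn probe_instant(pause: Duration) -> Option<Duration> {
    panic::catch_unwind(|| {
        let t0 = Instant::now();
        std::thread::sleep(pause);
        Instant::now().duration_since(t0)
    })
    .ok()
}

fn main() {
    match probe_instant(Duration::from_millis(10)) {
        Some(gap) => println!("[timetest] elapsed={}us", gap.as_micros()),
        None => println!("[timetest] Instant::now() panicked (unsupported shim?)"),
    }
}
```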
Fix 2: Wire the Overlay Back In
Two files. Copy the overlay implementation into the new location, then add the missing cfg arm:
// rust-std-src/library/std/src/sys/time/psp.rs
// (unchanged from the orphaned pal/psp/time.rs, just moved)
pub struct Instant(Duration);
impl Instant {
    pub fn now() -> Instant {
        let us = unsafe { __psp_get_system_time_wide() } as u64;
        Instant(Duration::from_micros(us))
    }
    // ...
}
// rust-std-src/library/std/src/sys/time/mod.rs
cfg_select! {
    // ... existing arms ...
    target_os = "psp" => {
        mod psp;
        use psp as imp;
    }
    // ... catch-all ...
}
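The Instant(Duration) representation above is simple enough to model on the host. In this sketch, TickInstant is a hypothetical stand-in and the raw microsecond values are passed in by hand rather than read from the system-time call, so it only illustrates the arithmetic, not the overlay's actual code.

```rust
use std::time::Duration;

/// Host-side model of an Instant stored as a Duration since some fixed
/// starting point. (Hypothetical type for illustration.)
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct TickInstant(Duration);

impl TickInstant {
    /// `raw_us` stands in for the microsecond counter the platform returns.
    fn from_micros(raw_us: u64) -> Self {
        TickInstant(Duration::from_micros(raw_us))
    }

    /// Elapsed time between two readings, or None if `earlier` is later.
    fn checked_duration_since(self, earlier: TickInstant) -> Option<Duration> {
        self.0.checked_sub(earlier.0)
    }
}

fn main() {
    // Arbitrary example readings, not values from the device.
    let t0 = TickInstant::from_micros(1_000_000);
    let t1 = TickInstant::from_micros(1_010_250);
    let gap = t1.checked_duration_since(t0);
    println!("elapsed = {:?}", gap);
    println!("reversed = {:?}", t0.checked_duration_since(t1));
}
```

Because the counter is a plain u64 of microseconds, duration arithmetic reduces to a checked subtraction.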
Rebuilt. Deployed via the (now bumped from 8 MB to 24 MB)
cmd_server deploy command. Cycled the device. Ran the
timetest again:
$ echo "timetest" | nc 192.168.0.249 9293
timetest: ok elapsed=247324us t0_elapsed=934979us
$ echo "ping" | nc 192.168.0.249 9293
pong
Two microsecond-precision timestamps, a correctly computed
duration_since, an elapsed() measurement,
and a device that stays alive after the call. The memory note I
had inherited for months was empirically wrong.
rquickjs’s supposed use of Instant
had never been the real blocker — it was just the most
confident-sounding wrong answer. Time to go back to the original
plan and build some JavaScript.
Why boa Instead of QuickJS
The original epic called for vendoring QuickJS-NG via the
cc crate, but that path has a different blocker:
cc needs a C cross-compiler for
mipsel-sony-psp, and that means pspdev, newlib, and
a toolchain build that doesn’t ship prebuilt for aarch64
Linux. With the std overlay fixed, a pure-Rust JavaScript engine
became feasible for the first time — no C toolchain, no
header vendoring, no MIPS libc.
boa_engine is a full
ES2023+ interpreter in safe Rust with no native dependencies. It
fit on PSP the moment HashMap, Instant,
and the global allocator started working. The tradeoff is
performance: boa is an interpreted reference implementation rather
than a JIT. On Allegrex I expect it to be ~10× slower than
QuickJS, which is already ~500× slower than desktop V8.
Small bootstrap scripts for inert pages will work; React SPAs will
not. That was always the realistic scope.
oasis-js (the existing crate, built on rquickjs) got
restructured around two independent, optional feature flags:
[features]
default = ["rquickjs-engine"]
rquickjs-engine = ["dep:rquickjs"]
boa = ["dep:boa_engine"]
Shared JsValue / JsError types moved into
a new types.rs so both backends can return the same
enum. The rquickjs-backed JsEngine — with its
console buffering, fetch handler, local storage, timer queue, and
the raw with_context escape hatch the browser DOM
glue depends on — stayed intact. Desktop, WASM, and UE5
still take the default and pay no boa cost. PSP takes
default-features = false, features = ["boa"] and gets
a separate, smaller BoaJsEngine type with only
new / eval:
pub struct BoaJsEngine {
    context: Context,
}
impl BoaJsEngine {
    pub fn new(_max_memory_bytes: usize) -> Result<Self, JsError> {
        Ok(Self { context: Context::default() })
    }
    pub fn eval(&mut self, script: &str) -> Result<JsValue, JsError> {
        let source = Source::from_bytes(script.as_bytes());
        match self.context.eval(source) {
            Ok(value) => Ok(boa_value_to_js(&value, &mut self.context)),
            Err(err) => Err(JsError { message: format!("{err}"), stack: None }),
        }
    }
}
The six-variant JsValue enum is the one contact point.
boa_value_to_js collapses boa's value type to that
enum, special-casing numbers that fit losslessly in i32
(matching the rquickjs backend's behavior) and rendering objects /
arrays / BigInts via JS-side String(value) so callers
always see something printable. Six unit tests cover it.
All green on the host.
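The "fits losslessly in i32" rule is the fiddliest part of that conversion, so here is a minimal sketch of one way to express it without pulling in boa. The Num enum and classify_number are hypothetical stand-ins for the numeric slice of JsValue, and treating negative zero as a float is a choice made for this sketch, not something confirmed about either backend.

```rust
/// Hypothetical two-variant stand-in for the numeric part of JsValue.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Num {
    Int(i32),
    Float(f64),
}

/// Returns Int only when the f64 survives a round trip through i32.
fn classify_number(n: f64) -> Num {
    // `as` saturates on overflow and maps NaN to 0, so the comparison
    // below rejects out-of-range values, infinities, and NaN.
    let as_int = n as i32;
    let round_trips = (as_int as f64) == n;
    // -0.0 compares equal to 0.0 but would lose its sign as an integer.
    let negative_zero = n == 0.0 && n.is_sign_negative();
    if round_trips && !negative_zero {
        Num::Int(as_int)
    } else {
        Num::Float(n)
    }
}

fn main() {
    for n in [6.0, 0.5, 2147483648.0, -0.0, f64::NAN] {
        println!("{n:?} -> {:?}", classify_number(n));
    }
}
```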
Symptom 3: “Help, My Stack Is Only 16 KB”
boa compiled for mipsel-sony-psp in 31 seconds on the
first try. No codegen issues, no missing headers, no panics during
link. The EBOOT grew from 5 MB to 14 MB
(opt-level = "z" on the boa subcrates trimmed 1.4 MB
back off), well within the 24 MB user partition. Deployed via
recovery mode since it now exceeded the cmd_server's old 8 MB
deploy cap. Device booted cleanly, arena at 8 MB, 5.4 MB free for
everything else. Time for the moment of truth:
$ echo "js 1 + 2 + 3" | nc -w 15 192.168.0.249 9293
<silence>
$ echo "ping" | nc -w 3 192.168.0.249 9293
<silence>
Crash. Watchdog reset. The third bring-up bug of the day, hiding in plain sight:
// oasis-backend-psp/src/cmd_server.rs — the old configuration
pub fn spawn() {
    if let Ok(handle) = psp::thread::ThreadBuilder::new(b"cmd_srv\0")
        .priority(40)
        .stack_size(16384) // <-- 16 KB
        .spawn(move || { ... })
    // ...
}
cmd_srv runs on its own thread with a dedicated stack,
and that stack was 16 KB — sized for a network handler that
mostly calls recv and send. Running
boa's parser on it was like trying to park a bus in a garden shed.
boa's lexer and parser alone keep non-trivial state on the
stack for each token, plus interner state, plus the interpreter's
evaluation frames. 16 KB evaporates in the first few dozen
tokens.
Bumped to 512 KB with a comment explaining why:
// 512 KB is large for a network handler thread, but the `js`
// command runs `boa_engine` inline on this thread and the boa
// parser builds a sizable AST + interner state on the stack.
// Empirically 16 KB (the previous value) crashed the first `eval`
// call on real PSP hardware.
.stack_size(512 * 1024)
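The same idea can be tried on the host with std::thread::Builder, which is the desktop analogue of the PSP ThreadBuilder used above. This is a rough model under stated assumptions: deep is a made-up recursive function with about 512 bytes of locals per frame standing in for a recursive parser, and run_with_stack is a hypothetical helper. The depth of 200 is chosen to fit comfortably in 512 KB.

```rust
use std::hint::black_box;
use std::thread;

/// Recursive function with roughly 512 bytes of locals per frame, a crude
/// stand-in for a recursive-descent parser. Returns the depth reached.
fn deep(depth: u32) -> u32 {
    // black_box keeps the optimizer from removing the scratch buffer.
    let scratch = black_box([0u8; 512]);
    if depth == 0 {
        return scratch[0] as u32;
    }
    // Using `scratch` after the call keeps the frame alive across recursion.
    deep(depth - 1) + 1 + scratch[511] as u32
}

/// Runs `f` on a fresh thread with an explicit stack size and returns its
/// result. (Hypothetical helper for illustration.)
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    thread::Builder::new()
        .name("cmd_srv_model".to_string())
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let reached = run_with_stack(512 * 1024, || deep(200));
    println!("reached depth {reached}");
}
```

Passing a much smaller stack size to the same call would be expected to overflow and abort the process, which is why the sketch does not try it.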
Rebuilt. Redeployed via recovery. Rebooted. And then:
$ echo "js 1 + 2 + 3" | nc -w 15 192.168.0.249 9293
js: 6
Six. JavaScript on a 2004 handheld, returning the correct answer, via a TCP command.
A Tour of What Works
$ echo "js 'hello ' + 'world'" | nc 192.168.0.249 9293
js: hello world
$ echo "js [1,2,3].reduce((a,b) => a + b, 0)" | nc 192.168.0.249 9293
js: 6
$ echo 'js JSON.stringify({x:1,y:2})' | nc 192.168.0.249 9293
js: {"x":1,"y":2}
$ echo "js (function fib(n){ return n<2 ? n : fib(n-1)+fib(n-2); })(10)" | nc 192.168.0.249 9293
js: 55
$ echo "js Math.sqrt(144)" | nc 192.168.0.249 9293
js: 12
$ echo "js typeof globalThis" | nc 192.168.0.249 9293
js: object
Arrow functions. Array.prototype.reduce.
JSON.stringify. Recursive Fibonacci through an inline
function expression. The Math library.
globalThis resolving correctly. Every modern JS
feature boa supports, running on the same hardware that shipped
with Lumines.
After all seven test calls the device is still at
frame: 2310, main loop humming, voronoi shader
rendering the retro-cga wallpaper. The BoaJsEngine
instance is created and destroyed inside each js
command — no shared state across calls — which keeps
the memory footprint bounded.
What Didn’t Ship (Yet)
The javascript feature on oasis-browser
is still disabled for the PSP build. The DOM bindings in
oasis-browser/src/js_dom.rs are tightly coupled to
rquickjs's Ctx/Function/Object
API, and porting them to boa requires a parallel implementation
— a substantial chunk of work that belongs in a follow-up
PR. For now, PSP can evaluate standalone JavaScript expressions
via the js TCP command and any future terminal /
script surfaces that don't need DOM access. Browser scripts on PSP
still drop silently at parse time, same as before.
The timer-based interrupt handler from the rquickjs backend is
also not yet wired to boa. boa exposes an instruction-counting
cancellation mechanism that's different enough to deserve its own
design pass. Scripts that run forever will hang the
cmd_srv thread; callers should treat eval
as blocking and only run expressions known to terminate. A runaway
while (true) {} on PSP hardware is not a good time.
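Until a real cancellation hook is wired up, one stopgap a caller could consider is running the evaluation on a helper thread and giving up on waiting after a deadline. The sketch below is a host-side illustration with a hypothetical run_with_deadline helper. It does not stop the runaway work: the helper thread keeps spinning after the timeout, and on PSP it would need its own large stack, so this only keeps the command thread responsive.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `job` on a helper thread and waits at most `limit` for its result.
/// Returns None on timeout (or if the job panicked). The helper thread is
/// NOT cancelled; it keeps running in the background.
/// (Hypothetical helper for illustration.)
fn run_with_deadline<T, F>(limit: Duration, job: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already have given up; ignore the send error.
        let _ = tx.send(job());
    });
    rx.recv_timeout(limit).ok()
}

fn main() {
    let quick = run_with_deadline(Duration::from_millis(500), || 1 + 2 + 3);
    println!("quick job: {quick:?}");

    let slow = run_with_deadline(Duration::from_millis(20), || {
        thread::sleep(Duration::from_millis(300)); // stands in for a long script
        0
    });
    println!("slow job: {slow:?}");
}
```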
Bugs That Only Live on Hardware
All three bring-up bugs in this story share a property that makes
them especially painful: none of them reproduced in
PPSSPP. The emulator ran the exact same EBOOT cleanly
through the theme creation path, the Instant::now
path, and eventually the boa parser. Every fix required a physical
deploy + reboot + observe cycle, and two of the three bricked the
device until R-trigger recovery.
There is a clear pattern:
- MIPS alignment traps are a real-hardware-only failure mode. PPSSPP's HLE is lenient about unaligned word accesses. Real Allegrex is not. Anywhere a Rust-side type cast reinterprets a byte buffer as a word-aligned struct needs #[repr(align(N))] on the backing store, not just the target type.
- std overlays drift silently when upstream restructures. sys/pal/psp/time.rs was a perfectly good file that nothing was linking to, because the arm in sys/time/mod.rs that would have selected it never got added. The compiler was happy. Every test passed. And every Instant::now call since nightly moved the files had been routing through a panic!. The fix is trivial once you find it; the finding is the hard part.
- Thread stacks are contracts with the code that runs on them. 16 KB was generous for what the original cmd_srv thread actually did. It was nowhere near enough the moment we asked it to parse JavaScript. The same BoaJsEngine instance runs fine with a 512 KB stack and crashes deterministically with a 16 KB one.
- Memory notes decay. The project memory confidently said “Instant crashes on Allegrex (confirmed by testing)” because someone had confirmed a crash by testing. They just hadn’t confirmed the cause. Three months later, after an upstream std restructure, the memory was still pointing at the wrong thing. Every false memory rules out a working solution until someone goes and looks.
The Footnote Neither Roadmap Predicted
The original epic budgeted one to three weeks at “high risk” on the assumption that the hard part was MIPS codegen for QuickJS's C source. It turned out the hard part was three unrelated bugs: two in rust-psp's std overlay that had been hiding under a misdiagnosed premise, and an undersized thread stack in our own command server. boa itself compiled on the first try and ran as soon as the last of those was fixed. The entire “PSP JavaScript integration” epic, minus the browser DOM glue, closed in a single session.
It’s tempting to call that lucky. I think it’s closer
to what happens every time you take a “known impossible”
constraint seriously enough to try it on real hardware: the
impossibility turns out to be a specific, solvable bug, and once
the bug is gone you’re surprised at how much was blocked
behind it. The PSP will never run React, but it now runs
1 + 2 + 3, and everything downstream from that is
just scope.