Runtime Model (Proposal)

AXL Runtime — Design Proposal

Status: proposal / thinking document. Not current behavior. Do not treat any code sketch here as canonical until implemented and cross-linked from docs/AXL-Concurrency.md.

This doc captures the higher-level runtime model we’d like AXL to grow into: CRT0 owning a default event loop, Linux-style signal handling for Ctrl-C, axl_yield() as a first-class cooperative escape hatch, and a coherent story for resource cleanup when main returns or exits early. It also calls out the hard limits we can’t paper over (UEFI has no preemption).

1. Motivation

Today, Ctrl-C handling is scattered and cooperative-in-a-bad-way:

The event loop observes the shell break event and returns -1.
axl_wait_* / axl_event_wait_* map that to AXL_CANCELLED.
Every caller has to notice the magic return code and unwind.
Apps that don’t use these primitives (pure CPU loops, or naive reads from a file) aren’t interruptible at all.
There is no centralized “on Ctrl-C, clean up and exit” path; each app reinvents it.

Linux developers reaching for AXL will expect:

Ctrl-C ends the program by default.
A signal-install API for apps that want custom cleanup.
Long-running operations feel responsive to interruption.
Resources get freed when the program exits.

We can’t give them POSIX signals: the UEFI BSP has no preemption, so a tight CPU loop is inherently uninterruptible. But we can give them a cooperative runtime that feels Linux-shaped for any app that uses AXL APIs, which is effectively all of them (consumers link against libaxl.a for almost everything: printf, malloc, file I/O, and networking all yield through AXL).

The key insight: we control every AXL API. If every slow API checks a flag, the app gets Linux-like responsiveness without needing preemption.

2. Runtime model

2.1 Who owns what

_AxlEntry (CRT0)
  ├─ _axl_init()
  │    ├─ initialize memory, console, backend
  │    ├─ install shell-break notify → sets g_axl_interrupted
  │    ├─ install UEFI watchdog (60s livelock guard) [optional, opt-in]
  │    ├─ create axl_loop_default() — library-wide default loop
  │    └─ initialize atexit registry
  ├─ _axl_get_args() → argc/argv
  ├─ main(argc, argv)                       ← app runs here
  └─ _axl_cleanup()
       ├─ run atexit callbacks in reverse order
       ├─ axl_loop_free(default_loop)
       ├─ close break notify
       ├─ cancel watchdog
       └─ memory leak report (AXL_MEM_DEBUG)

CRT0 owns: the default loop, the break notify, the atexit registry, the watchdog timer. These exist from _axl_init through _axl_cleanup — for the app’s entire lifetime.

The app owns: anything it allocates. It can register axl_atexit handlers to free them automatically.

2.2 Break / signal subsystem

Shell break handling moves out of axl_loop_run and the wait helpers. Instead:

CRT0 registers a notify callback on the shell break event during _axl_init.
The notify sets g_axl_interrupted = true and invokes any user handler registered via axl_signal_install.
Default policy (no handler installed): interrupted flag is set, next yield point observes it and initiates clean exit (_axl_cleanup + gBS->Exit).

Public API:

/* Signal handler runs in a limited context — set flags, log,
 * return. Do not allocate, do not call Boot Services that mutate
 * state. Any cleanup should happen at the next yield point or in
 * an axl_atexit handler. */
typedef void (*AxlSignalHandler)(void);

void axl_signal_install(AxlSignalHandler on_interrupt);
void axl_signal_default(void);               /* restore auto-exit */
bool axl_interrupted(void);                  /* poll the flag */

Rationale: matches Linux signal(SIGINT, handler) shape while acknowledging the UEFI constraint that handlers run at raised TPL and can’t do much. In practice, a handler typically sets a per-app “please unwind” flag and returns; the main thread’s next yield exits through the normal path.
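
To make the handler contract concrete, here is a minimal, self-contained sketch. The axl_signal_* functions below are local stand-ins written for this example (the proposed API is not implemented yet), and demo_fire_break() and demo_run() are hypothetical helpers; demo_fire_break() plays the role of the CRT0 break notify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the proposed API; nothing here exists in AXL yet. */
typedef void (*AxlSignalHandler)(void);

static volatile bool    g_axl_interrupted;
static AxlSignalHandler g_handler;          /* NULL selects the default policy */

void axl_signal_install(AxlSignalHandler on_interrupt) { g_handler = on_interrupt; }
void axl_signal_default(void)                          { g_handler = NULL; }
bool axl_interrupted(void)                             { return g_axl_interrupted; }

/* Hypothetical: what the CRT0 break notify would do at raised TPL. */
void demo_fire_break(void) {
    g_axl_interrupted = true;
    if (g_handler != NULL) g_handler();
}

/* App side: the handler only records the request and returns. */
static volatile bool g_app_stop;
static void on_interrupt(void) { g_app_stop = true; }

/* Processes up to `total` items and returns how many finished before
 * the stop request. `break_at` simulates Ctrl-C arriving at that item. */
size_t demo_run(size_t total, size_t break_at) {
    size_t done = 0;
    g_axl_interrupted = false;
    g_app_stop = false;
    axl_signal_install(on_interrupt);
    for (size_t i = 0; i < total; i++) {
        if (i == break_at) demo_fire_break();
        if (g_app_stop) break;      /* unwind normally; no cleanup in the handler */
        done++;
    }
    axl_signal_default();
    assert(done <= total);
    return done;
}
```

The handler does nothing beyond setting a flag; the loop notices the flag at its next check and leaves through the ordinary return path.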

2.3 The default loop

CRT0 creates axl_loop_default() during init. It exists for the app’s lifetime but is not automatically running — an app that needs an event loop calls axl_loop_run(axl_loop_default()), or ignores it entirely.

Why have it at all? Three use cases:

Apps that need a loop but don’t want to manage it — much like asyncio.get_event_loop().
Library-internal background work — e.g., the atexit registry, periodic watchdog pets if we add one.
Anchor for yield-point dispatching — axl_yield() can opportunistically dispatch pending sources on the default loop, letting timers/defers advance during CPU work.

Apps can still create their own loops (and frequently will, for scope/resource-management reasons). Nested loops are a real concern — see §3.

3. `axl_yield()`: cooperative escape hatch

The new public API:

/**
 * @brief Cooperative yield point.
 *
 * Call inside tight loops to make them interruptible. Three
 * things can happen, in order:
 *
 * 1. If axl_interrupted() is true and no handler is installed,
 *    the default policy applies: the program exits via
 *    _axl_cleanup + gBS->Exit and axl_yield does not return.
 *    If a handler is installed (it already ran from the break
 *    notify), axl_yield returns; the caller then sees
 *    axl_interrupted() == true and can react.
 *
 * 2. If the default loop has pending events (timers, defers,
 *    signals), they dispatch — bounded by a single iteration.
 *    This keeps scheduled work alive during app CPU loops.
 *
 * 3. Otherwise, returns immediately (one flag read).
 *
 * Cost: ~nanoseconds when idle. Safe from any context except
 * raised-TPL notify handlers. Inline-friendly (single volatile-bool read).
 */
void axl_yield(void);
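
A minimal sketch of how the three steps could fit together, assuming the user handler has already run inside the break notify (as §2.2 describes). Every demo_* name and g_* counter is a hypothetical stand-in invented for this example; the real axl_exit would not return, so the sketch only counts the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*AxlSignalHandler)(void);

static volatile bool    g_axl_interrupted;  /* set by the break notify */
static AxlSignalHandler g_user_handler;     /* NULL selects the default policy */
static int g_exit_calls;   /* stand-in: counts what would be axl_exit() calls */
static int g_pending;      /* stand-in: sources ready on the default loop */
static int g_dispatched;   /* stand-in: callbacks dispatched so far */

/* Stub for axl_exit(); the real function would never return. */
static void demo_exit(void) { g_exit_calls++; }

/* Stub for one non-blocking pass over the default loop. */
static void demo_loop_dispatch_ready(void) {
    assert(g_pending > 0);
    g_dispatched += g_pending;
    g_pending = 0;
}

void demo_yield(void) {
    /* Step 1: break pending and no handler installed, so exit cleanly. */
    if (g_axl_interrupted && g_user_handler == NULL) {
        demo_exit();
        return;             /* reachable only because demo_exit is a stub */
    }
    /* Step 2: dispatch ready work, bounded to a single pass. */
    if (g_pending > 0)
        demo_loop_dispatch_ready();
    /* Step 3: nothing else to do; return to the caller. */
}
```

With a handler installed, this sketch still dispatches ready work and then returns, leaving the caller to poll axl_interrupted(); whether dispatch should happen in that case is one reading of the comment above, not a settled decision.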

3.1 Where AXL APIs inject yields automatically

Every AXL public API that can take noticeable time should call axl_yield(). The guideline:

If the function can execute for longer than a few microseconds under reasonable inputs, and it doesn’t already use axl_loop_* internally, instrument it with axl_yield().

Initial targets:

Area	Functions	Pattern
File I/O	`axl_fread`, `axl_fwrite`, `axl_file_read_all`, `axl_file_write_all`, directory iteration	yield per N bytes or per directory entry
Network blocking	already use waits	cancel path already exists; only need to funnel through the new flag
HTTP upload/download	`axl_http_get`, large-body POST, body-streamed responses	yield per chunk
Data operations	`axl_array_sort` (if ever added), hash-table growth, digest update on large buffers	yield per N elements/bytes
Format / printf	Very long format strings, stream writes to slow backends	yield after N chars
Task pool polling	already loop-driven	no change needed
Shell / SMBIOS / SMBUS / IPMI	Inherently slow hardware ops	yield on retry boundaries

Not worth instrumenting:

O(1) or short-O(log n) operations (hash-table insert, list push, str-copy-small) — overhead would dwarf the work.
Pure arithmetic helpers.
Anything under a few µs typical.
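
As an illustration of the "yield per N bytes" pattern, here is a small sketch of a long-running helper that yields once per chunk. demo_checksum, demo_yield, and DEMO_YIELD_EVERY are hypothetical names for this example; demo_yield only counts calls so the cadence is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_YIELD_EVERY 4096u   /* assumption: chunk size would be tuned per API */

static unsigned g_yield_calls;

/* Stand-in for the proposed axl_yield(); here it only counts calls. */
static void demo_yield(void) { g_yield_calls++; }

/* Sums the bytes of a buffer, yielding once per chunk rather than
 * once per byte so the yield overhead stays negligible. */
uint32_t demo_checksum(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    uint32_t sum = 0;
    size_t off = 0;
    while (off < len) {
        size_t n = len - off;
        if (n > DEMO_YIELD_EVERY) n = DEMO_YIELD_EVERY;
        for (size_t i = 0; i < n; i++) sum += buf[off + i];
        off += n;
        demo_yield();            /* chunk boundary: a safe point to be interrupted */
    }
    return sum;
}
```

A 10,000-byte buffer produces three yields (4096 + 4096 + 1808 bytes), which bounds the interruption latency by the time one chunk takes.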

3.2 App code using `axl_yield`

int main(int argc, char **argv) {
    /* CPU-heavy scan with no AXL calls in the hot loop */
    for (size_t i = 0; i < huge; i++) {
        crunch(&state, i);
        if ((i & 0xFFF) == 0) axl_yield();   /* every 4k iterations */
    }
    return 0;
}

Callers choose their own cadence. AXL never demands a minimum — it’s the same contract Rust’s .await and Node’s microtask queue expose: “the runtime can act at your yield points, and only there.”

4. Resource cleanup when `main` returns

4.1 UEFI vs POSIX exit semantics

This is where UEFI diverges sharply from Linux. On Linux, when main returns or the process calls exit(), the kernel reclaims the entire address space — heap, file descriptors, signal registrations, everything. Sloppy programs don’t crash the OS; they just waste memory until exit.

UEFI has no process model. There is no per-application address space. There is no teardown. When an AXL app returns, control flows back to the Shell (or BDS), which has no knowledge of what the app allocated. Specifically:

Resource	On Linux `exit()`	On UEFI app return
Heap (`axl_malloc`)	kernel reclaims	leaks until reboot — each allocation is a separate `gBS->AllocatePool` call from a firmware-global pool
`EFI_EVENT` / `AxlEvent`	closed	crash hazard — firmware keeps the event registered; if a later `SignalEvent` calls a notify function whose code pages were unloaded with your image, system crashes post-exit
Installed protocols	N/A	crash hazard — firmware holds the vtable forever
File handles	closed	filesystem driver keeps state pinned
Loaded child images	N/A	stay in memory
UEFI variables, network handles, registered callbacks	N/A	all leak

The firmware-facing resources (events, protocols, registered callbacks) are the dangerous class. A crash two minutes after the app exits — triggered by a timer firing into unloaded code — is one of the harder UEFI bugs to diagnose.

Today _axl_cleanup (src/posix/axl-app.c:92) only:

Frees the argv/argc it allocated in _axl_init.
Under AXL_MEM_DEBUG, calls axl_mem_dump_leaks() — a diagnostic report, not cleanup. It names what leaked; it doesn’t free anything.

Phase A7 fixes this by making the library responsible for firmware-facing resources it handed to the user, and for running a guaranteed cleanup path on every exit type.

4.2 The internal resource registry

Design principle: every library function that creates a firmware-facing resource registers it. On exit, a sweep closes whatever’s left. This is not garbage collection or refcounting — it’s a safety net for sloppy app code.

Two-tier policy

Tier 1 — firmware-facing or container-owned (always tracked, always swept).

Creator	What enters the registry	Removed by
`axl_event_new`	one event (crash hazard if leaked)	`axl_event_free` (removes before teardown) or `_axl_cleanup` sweep
`axl_cancellable_new`	the wrapped event	`axl_cancellable_free` or sweep
`axl_loop_new`	the loop + each internal event it creates	`axl_loop_free` or sweep
`axl_arena_new`	the arena (covers all sub-allocations inside it — see below)	`axl_arena_free` or sweep
(future) `axl_file_open`, `axl_http_client_new`, `axl_tcp_*`	respective handle	respective `_free` or sweep

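On sweep, each remaining entry’s type determines its teardown call. Sweep order is LIFO (reverse registration order), matching atexit semantics. A container (loop) is torn down before its contents (events registered as its sources) only if it was registered after them; when the contents were registered later, they are swept first, and the magic-number guards described under “Double-close safety” turn any resulting second free into a logged no-op.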

Tier 2 — heap (axl_malloc et al.).

axl_malloc already tracks every allocation under AXL_MEM_DEBUG via a doubly-linked list (see src/mem/axl-mem.c:100). Extend the cleanup path:

Under AXL_MEM_DEBUG: keep current behavior — report on cleanup, don’t free. Dev sees bugs and fixes them.
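In release builds: walk the same list and axl_free each entry, so the heap returns to the firmware pool cleanly. This assumes the tracking list is also maintained in release builds, not only under AXL_MEM_DEBUG.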

Rationale: heap leaks waste memory but don’t crash firmware. Auto-freeing in debug would hide bugs; auto-freeing in release is the production safety net. Tier 1 is different — leaks there can crash the system, so safety wins in every mode.
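
The release-mode walk can be sketched with a header-per-allocation list. This is an illustration only: DemoHdr, demo_malloc, demo_free, and demo_sweep_free_all are hypothetical names, hosted malloc/free stand in for the firmware pool, and the layout is assumed rather than taken from src/mem/axl-mem.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header prepended to every tracked allocation (assumed layout;
 * payload alignment beyond pointer size is ignored in this sketch). */
typedef struct DemoHdr {
    struct DemoHdr *prev, *next;
    size_t          size;
} DemoHdr;

static DemoHdr *g_head;

void *demo_malloc(size_t size) {
    DemoHdr *h = malloc(sizeof *h + size);
    if (h == NULL) return NULL;
    h->size = size;
    h->prev = NULL;
    h->next = g_head;
    if (g_head != NULL) g_head->prev = h;
    g_head = h;
    return h + 1;                /* payload starts right after the header */
}

void demo_free(void *p) {
    if (p == NULL) return;
    DemoHdr *h = (DemoHdr *)p - 1;
    assert(h->prev != NULL || g_head == h);   /* list invariant */
    if (h->prev != NULL) h->prev->next = h->next; else g_head = h->next;
    if (h->next != NULL) h->next->prev = h->prev;
    free(h);
}

/* Release-mode safety net: free whatever is still on the list.
 * Returns the number of payload bytes reclaimed. */
size_t demo_sweep_free_all(void) {
    size_t bytes = 0;
    while (g_head != NULL) {
        bytes += g_head->size;
        demo_free(g_head + 1);   /* unlinks the head, so the loop advances */
    }
    return bytes;
}
```

The returned byte count is what a release build could print in its one-line "heap auto-freed on exit" warning.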

Arena sub-allocations (axl_arena_alloc) do not produce individual tracker entries — they’re pure bump-pointer offsets into the arena’s backing buffer, not separate heap blocks. The arena itself is what gets tracked (tier-1 registry above), and freeing it reclaims every sub-allocation it handed out at once. Callers who lean on AxlArena for scoped lifetimes get implicit coverage: thousands of sub-allocations, one registry entry, one sweep call clears them all.
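
For readers unfamiliar with bump allocation, a minimal sketch of why one registry entry is enough. DemoArena and the demo_arena_* functions are hypothetical and use hosted malloc; they are not the AxlArena implementation, and the 16-byte alignment is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint8_t *base;   /* single backing buffer */
    size_t   cap;
    size_t   used;   /* bump offset */
} DemoArena;

DemoArena *demo_arena_new(size_t cap) {
    DemoArena *a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->base = malloc(cap);
    if (a->base == NULL) { free(a); return NULL; }
    a->cap = cap;
    a->used = 0;
    return a;
}

/* Sub-allocation is only an offset bump: no separate heap block and
 * nothing for a tracker to record. */
void *demo_arena_alloc(DemoArena *a, size_t size) {
    assert(a != NULL);
    size_t aligned = (a->used + 15u) & ~(size_t)15u;
    if (aligned > a->cap || size > a->cap - aligned) return NULL;
    a->used = aligned + size;
    return a->base + aligned;
}

/* Freeing the arena releases every sub-allocation at once. */
void demo_arena_free(DemoArena *a) {
    if (a == NULL) return;
    free(a->base);
    free(a);
}
```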

4.2.1 Caller attribution for sweep warnings

Sweep warnings are most useful when they name user-code file:line, not the library wrapper. Today, axl_calloc inside axl_arena_new records src/mem/axl-arena.c as the alloc site — technically accurate, practically useless for debugging.

The fix is the same trick the allocator already uses: the public APIs become macros that capture __FILE__ / __LINE__ at the user call site and forward them to an _impl function:

/* include/axl/axl-arena.h */
#define axl_arena_new(cap)  axl_arena_new_impl((cap), __FILE__, __LINE__)
AxlArena *axl_arena_new_impl(size_t capacity, const char *file, int line);

Extend to axl_event_new, axl_loop_new, axl_cancellable_new, and the future file/http wrappers. Sweep output goes from:

[WARN] runtime: 1 MB heap leaked  (src/mem/axl-arena.c:48)

to:

[WARN] runtime: auto-closing 1 leaked AxlArena  (main.c:17, 1 MB)

Much more actionable.

Registry structure (sketch)

/* src/runtime/axl-registry.c (new under Phase A7) */

typedef enum {
    AXL_RES_EVENT,
    AXL_RES_LOOP,
    AXL_RES_FILE,
    /* grows as more library wrappers are added */
} AxlResourceKind;

/* Called by library wrappers in their new/free functions */
uint32_t _axl_registry_add(AxlResourceKind kind, void *resource,
                           const char *file, int line);
void     _axl_registry_remove(uint32_t handle);

/* Called from _axl_cleanup after user atexit handlers have run */
void     _axl_registry_sweep(void);

Each tier-1 wrapper changes from:

AxlEvent *axl_event_new(void) {
    /* ...existing init... */
    return e;
}

to:

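/* Reached through an axl_event_new() macro (see §4.2.1) that passes
 * the caller's __FILE__ / __LINE__; using them here directly would
 * record the library source file instead. */
AxlEvent *axl_event_new_impl(const char *file, int line) {
    /* ...existing init... */
    e->_registry_handle = _axl_registry_add(AXL_RES_EVENT, e,
                                            file, line);
    return e;
}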

void axl_event_free(AxlEvent *e) {
    if (e == NULL || e->magic != AXL_EVENT_MAGIC) return;
    _axl_registry_remove(e->_registry_handle);
    /* ...existing teardown... */
}
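
One possible shape for the registry behind those three calls, shown as a self-contained sketch. The fixed capacity, the handle scheme (0 meaning "not tracked"), the close callback parameter, and all demo_* names are assumptions made for the example; storage strategy is still an open question in §10.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { AXL_RES_EVENT, AXL_RES_LOOP, AXL_RES_FILE } AxlResourceKind;

typedef struct {
    AxlResourceKind kind;
    void           *resource;
    const char     *file;
    int             line;
    uint32_t        handle;
} DemoEntry;

#define DEMO_REGISTRY_CAP 64          /* assumption: fixed capacity */
static DemoEntry g_entries[DEMO_REGISTRY_CAP];
static size_t    g_count;
static uint32_t  g_next_handle = 1;   /* assumption: 0 means "not tracked" */

uint32_t demo_registry_add(AxlResourceKind kind, void *resource,
                           const char *file, int line) {
    if (g_count == DEMO_REGISTRY_CAP) return 0;
    g_entries[g_count++] = (DemoEntry){ kind, resource, file, line, g_next_handle };
    return g_next_handle++;
}

void demo_registry_remove(uint32_t handle) {
    for (size_t i = 0; i < g_count; i++) {
        if (g_entries[i].handle != handle) continue;
        /* Shift down so registration order survives for the LIFO sweep. */
        for (size_t j = i + 1; j < g_count; j++) g_entries[j - 1] = g_entries[j];
        g_count--;
        return;
    }
}

/* Closes leftovers newest-first; returns how many entries were swept. */
size_t demo_registry_sweep(void (*close_fn)(AxlResourceKind, void *)) {
    assert(close_fn != NULL);
    size_t closed = 0;
    while (g_count > 0) {
        DemoEntry e = g_entries[--g_count];
        fprintf(stderr, "[WARN] runtime: auto-closing leaked resource (%s:%d)\n",
                e.file, e.line);
        close_fn(e.kind, e.resource);
        closed++;
    }
    return closed;
}

/* Example teardown for the demo: stamps the closing order into an int. */
static int g_close_seq;
static void demo_close(AxlResourceKind kind, void *resource) {
    (void)kind;
    *(int *)resource = ++g_close_seq;
}
```

Popping each entry before calling its teardown keeps the sweep well-defined even if a teardown function removes other entries as a side effect.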

Sweep logging

When the sweep finds anything, loudly log it — the user’s code should be fixed, not silently rescued:

[WARN] runtime: auto-closing 3 leaked AxlEvent instances
   event@0x7FE12340  allocated at src/myapp.c:42 by axl_event_new
   event@0x7FE12380  allocated at src/myapp.c:58 by axl_loop_new
   event@0x7FE12400  allocated at src/myapp.c:91 by axl_tcp_accept_async
[WARN] runtime: 1024 bytes of heap auto-freed on exit (set
       AXL_MEM_DEBUG to get per-allocation detail)

Same pattern axl_mem_dump_leaks uses; just extend to tier-1 resources.

Double-close safety

The sweep walks resources that slipped past explicit _free calls. Magic-number guards on AxlEvent and AxlCancellable catch any ordering bug (loop frees before its child events are swept, etc.) by no-oping on dead magic with a logged warning.

4.3 `axl_atexit` — POSIX-flavored cleanup registry

/**
 * @brief Register a callback to run during _axl_cleanup.
 *
 * Callbacks fire in LIFO order (last-registered-first-run), which
 * matches C's atexit() and matches stack-unwinding intuition for
 * "tear down the newest thing first." Each callback receives the
 * user data pointer supplied at registration.
 *
 * Use cases: free top-level resources (loops, caches, HTTP
 * clients, open files) that would leak if not explicitly released.
 *
 * Limits: registry is fixed-size (default 16 entries). Returns
 * a handle so handlers can be removed early via axl_atexit_remove.
 */
typedef void (*AxlAtexitFn)(void *data);

uint32_t axl_atexit(AxlAtexitFn fn, void *data);
void     axl_atexit_remove(uint32_t handle);
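
A sketch of how a fixed-size LIFO registry with removable handles might look. The demo_* names, the use of 0 as a failure handle, and the logging callback are assumptions for illustration; the proposal does not yet specify what axl_atexit returns when the registry is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*AxlAtexitFn)(void *data);

#define DEMO_ATEXIT_MAX 16            /* matches the proposed default size */

static struct { AxlAtexitFn fn; void *data; uint32_t handle; } g_slots[DEMO_ATEXIT_MAX];
static size_t   g_used;
static uint32_t g_next = 1;           /* assumption: 0 signals "registration failed" */

uint32_t demo_atexit(AxlAtexitFn fn, void *data) {
    if (fn == NULL || g_used == DEMO_ATEXIT_MAX) return 0;
    g_slots[g_used].fn = fn;
    g_slots[g_used].data = data;
    g_slots[g_used].handle = g_next;
    g_used++;
    return g_next++;
}

void demo_atexit_remove(uint32_t handle) {
    for (size_t i = 0; i < g_used; i++) {
        if (g_slots[i].handle != handle) continue;
        for (size_t j = i + 1; j < g_used; j++) g_slots[j - 1] = g_slots[j];
        g_used--;
        return;
    }
}

/* Runs callbacks last-registered-first. Each slot is popped and copied
 * before the call, so a callback may register or remove handlers. */
void demo_atexit_run_all(void) {
    while (g_used > 0) {
        g_used--;
        AxlAtexitFn fn = g_slots[g_used].fn;
        void *data = g_slots[g_used].data;
        fn(data);
    }
}

/* Example callback: appends a one-character tag so ordering is visible. */
static char   g_log[DEMO_ATEXIT_MAX + 1];
static size_t g_log_len;
static void demo_log_tag(void *data) {
    assert(data != NULL);
    if (g_log_len < DEMO_ATEXIT_MAX) g_log[g_log_len++] = *(const char *)data;
}
```

Registering tags A, B, C, removing B, and running the registry logs C then A, which is the LIFO order the doc comment promises.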

4.4 `axl_exit(rc)` — the guaranteed-cleanup exit path

Today, app code that calls gBS->Exit directly (or aborts through some other path) bypasses _axl_cleanup entirely — argv isn’t freed, leak report doesn’t fire, and once the registry lands, events won’t be swept either. This is a landmine.

Phase A7 introduces:

/**
 * @brief Terminate the application with cleanup guaranteed.
 *
 * Runs atexit callbacks (LIFO), sweeps the resource registry,
 * runs heap cleanup per build mode (debug: report; release: free),
 * then calls gBS->Exit(image, status, 0, NULL). Does not return.
 *
 * This is the ONLY blessed exit path. Apps that return from main
 * take the same path via the AXL_APP entry wrapper. Apps that
 * call gBS->Exit directly bypass cleanup -- don't.
 */
AXL_NORETURN void axl_exit(int rc);

All the exit flows funnel through it:

Entry	Path
`main` returns	`AXL_APP` wrapper → `_axl_cleanup` → `gBS->Exit`
App calls `axl_exit(rc)`	`_axl_cleanup` → `gBS->Exit`
App calls `exit(rc)` (POSIX compat)	thin wrapper to `axl_exit`
Default break handler fires	`_axl_cleanup` → `gBS->Exit(..., EFI_ABORTED, ...)`
Installed break handler returns	flag set → next wait returns `AXL_CANCELLED` (after a yield, the caller checks `axl_interrupted()`) → caller unwinds → `main` returns → wrapper path

_axl_cleanup itself becomes:

void _axl_cleanup(void) {
    _axl_atexit_run_all();      /* user callbacks, LIFO */
    _axl_registry_sweep();       /* tier-1 firmware resources */
#ifdef AXL_MEM_DEBUG
    axl_mem_dump_leaks();        /* diagnose */
#else
    axl_mem_sweep_free_all();    /* safety net (new) */
#endif
    /* argv, io streams, break-notify teardown, watchdog cancel */
    ...
}

4.5 What fires when

Normal exit path (main returns):

Entry wrapper captures rc from main.
Calls axl_exit(rc) (or inlines the body).
axl_exit runs _axl_cleanup, calls gBS->Exit.

Explicit exit (axl_exit(rc) or exit(rc)):

Same as above from step 2. The stack above the call is not unwound, so AXL_AUTOPTR variables in outer scopes are not released. Apps that need that cleanup must register it via axl_atexit.

Break-driven exit, default handler (no axl_signal_install):

Break notify fires at raised TPL → sets g_axl_interrupted, calls registered default handler.
Default handler returns; next yield/wait observes the flag.
Yield path calls axl_exit(EFI_ABORTED).

Break-driven exit, user handler installed:

Break notify fires → sets flag → calls user handler.
User handler does limited work (set local flag, log) and returns.
Next wait returns AXL_CANCELLED to the caller; after an axl_yield (which returns void), the caller checks axl_interrupted() instead.
Caller unwinds normally through AXL_AUTOPTR etc.
main returns; entry wrapper path runs.

The user-installed handler is never expected to do cleanup itself. It can’t reliably — it runs at raised TPL with limited services available. Cleanup happens on the normal unwind path, same as any other exit.

4.6 What AXL_AUTOPTR handles already

Scope-bound resources (declared with AXL_AUTOPTR(AxlEvent) etc.) automatically free on scope exit — including when a wait returns AXL_CANCELLED and the caller unwinds back through the scope. No atexit entry needed for those.

axl_atexit is specifically for long-lived resources that outlive function scope and would leak at process exit.
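
The scope-exit behavior can be illustrated with the GCC/Clang cleanup attribute, which is the usual mechanism behind autoptr-style macros. That AXL_AUTOPTR is built this way is an assumption here; DemoEvent, DEMO_AUTOPTR_EVENT, and demo_wait are hypothetical names for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int id; } DemoEvent;
static int g_live_events;             /* counts events not yet freed */

static DemoEvent *demo_event_new(void) {
    DemoEvent *e = malloc(sizeof *e);
    if (e != NULL) g_live_events++;
    return e;
}

static void demo_event_free(DemoEvent *e) {
    if (e == NULL) return;
    g_live_events--;
    free(e);
}

/* Cleanup handlers receive a pointer to the variable leaving scope. */
static void demo_event_cleanup(DemoEvent **pe) {
    assert(pe != NULL);
    demo_event_free(*pe);
}

/* GCC/Clang extension; a stand-in for what AXL_AUTOPTR(AxlEvent) might expand to. */
#define DEMO_AUTOPTR_EVENT __attribute__((cleanup(demo_event_cleanup))) DemoEvent *

/* Returns -1 on the simulated cancel path and 0 otherwise. The event is
 * released on every return without an explicit free. */
int demo_wait(int cancelled) {
    DEMO_AUTOPTR_EVENT e = demo_event_new();
    if (e == NULL) return -2;
    if (cancelled) return -1;         /* early return still runs the cleanup */
    return 0;
}
```

The cleanup runs only when the scope is left by normal control flow, which is why an exit call that never returns skips it.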

5. Nested loops

“What happens when a user embeds an Axl main loop within our CRT0 created loop?”

Scenarios and their semantics:

5.1 App doesn’t use the default loop at all

int main(int argc, char **argv) {
    AxlLoop *loop = axl_loop_new();
    /* ... register sources ... */
    axl_loop_run(loop);
    axl_loop_free(loop);
    return 0;
}

Semantics: fine. Default loop exists idle in CRT0 but nothing drives it. Break is still detected (via notify callback, not via loop dispatch). App’s loop is the active one; it picks up the break flag via its own sources (the break-event poll continues to register there too, under the hood).

5.2 App uses default loop directly

int main(int argc, char **argv) {
    AxlLoop *loop = axl_loop_default();
    axl_loop_add_timer(loop, 1000, on_tick, NULL);
    axl_loop_run(loop);
    /* no axl_loop_free — CRT0 owns this one */
    return 0;
}

Semantics: fine. One loop, no nesting. Default loop is torn down in _axl_cleanup by CRT0.

5.3 App creates its own loop alongside the default

int main(int argc, char **argv) {
    /* default loop exists, idle */
    AxlLoop *my_loop = axl_loop_new();
    axl_loop_add_timer(my_loop, 1000, on_tick, NULL);
    axl_loop_run(my_loop);  /* drives my_loop, not the default */
    axl_loop_free(my_loop);
    return 0;
}

Semantics: the two loops are independent. The running one dispatches its sources; the default sits idle. Sources registered with the default loop (e.g., if CRT0 has a watchdog timer there) do not fire while my_loop is running. This is OK because CRT0 shouldn’t rely on the default loop being driven — break is notify-based, not loop-based.

5.4 True nested loops (inner loop runs while outer is running)

This happens inside the library today: axl_wait_* and axl_event_wait_* spin up a throwaway loop to wait, while the caller’s outer loop is blocked in a callback.

outer axl_loop_run
  └─ source fires → cb is running
       └─ cb calls axl_wait_for_flag(...)
            └─ creates throwaway inner loop, runs it
                 └─ inner dispatches inner sources until flag is true
            └─ inner freed, axl_wait_for_flag returns
       └─ cb returns
  └─ outer resumes dispatch

Semantics: fine. Throwaway loops are a known pattern. Inner loop has its own sources (event, timeout, cancel event). Outer loop’s sources are not dispatched during the inner run — that’s the nesting cost, accepted.

5.5 Rule

The default loop is never used as a wait-helper throwaway. Wait/event-wait always create their own ephemeral loops. This prevents source leaks between unrelated waits, and keeps the default loop’s invariants (for CRT0’s own use) intact.

6. Proposed public API surface

Summary of new symbols:

/* axl-signal.h — new file */
typedef void (*AxlSignalHandler)(void);

void axl_signal_install(AxlSignalHandler on_interrupt);
void axl_signal_default(void);
bool axl_interrupted(void);

/* axl-runtime.h — new file, or fold into axl-loop.h */
AxlLoop *axl_loop_default(void);
void     axl_yield(void);

AXL_NORETURN void axl_exit(int rc);

/* axl-atexit.h — new file, or fold into axl-signal.h */
typedef void (*AxlAtexitFn)(void *data);

uint32_t axl_atexit(AxlAtexitFn fn, void *data);
void     axl_atexit_remove(uint32_t handle);

/* Resource registry -- internal API, called by library wrappers.
 * Exposed publicly only as an INSPECTION/STATS surface (optional). */
size_t axl_registry_count(void);           /* debug: how many live? */

Split vs combined: the design doesn’t care. Most likely axl-runtime.h as a single header, with AxlSignalHandler, AxlAtexitFn, axl_yield, axl_loop_default, axl_interrupted, axl_signal_install, axl_atexit all together.

Source layout: new module src/runtime/ owning axl-signal.c, axl-yield.c, axl-atexit.c, axl-runtime-init.c (called from _axl_init).

7. What we are not doing

setjmp/longjmp from the break notify. Classic footgun; skips all destructors and leaks resources; corrupts invariants.
UEFI watchdog as a signal mechanism. Watchdog is reset-only; can’t be repurposed. Optional use as a library-livelock guard only.
NMIs, hardware interrupts, or firmware-specific preemption hooks. Platform-dependent, unreliable, out of AXL’s scope.
Any claim that CPU-bound app code that ignores AXL is interruptible. It isn’t, and that’s honest. Document loudly.

8. Prototype plan

Agreement: prototype this in sdk/examples/ before landing in CRT0. Lets us validate the API shape end-to-end without touching production code paths.

8.1 Scope of the prototype

sdk/examples/runtime-demo.c — a single SDK example that sketches what each piece looks like from the consumer perspective. The demo implements its own mini-runtime (not CRT0-integrated) to exercise the API shape. If the shape feels right, we promote it into CRT0.

What the prototype demonstrates:

axl_signal_install — registers a handler that logs “got Ctrl-C, will exit cleanly” and returns.
axl_atexit — registers a cleanup fn that frees a mock resource; verified to fire LIFO on both normal and break exit.
axl_yield — called inside a long CPU loop (~10 s of counting). User hits Ctrl-C mid-loop and the demo exits within one yield interval.
Default loop — demo calls axl_loop_default() and adds a timer source, shows the source gets dispatched and torn down in _axl_cleanup.
Nested loop — demo inside a callback calls axl_wait_for_flag, proves the inner throwaway loop doesn’t interfere with the outer.
Registry sweep — demo deliberately leaks an AxlEvent (creates it but never calls axl_event_free), then exits. Verifies the sweep warning fires with file:line info and the event is closed before gBS->Exit. Variants: leaked loop, leaked cancellable, leaked heap (debug vs release behavior).
axl_exit vs return — same program exited both ways; prove both paths run atexit, sweep, leak report.

Each scenario runs as a subcommand (./runtime-demo signal, ... atexit, ... yield, ... leak-event, etc.) so we can exercise them individually in QEMU.

8.2 Mini-runtime inside the demo

Since CRT0 isn’t actually integrated yet, the prototype defines stand-alone implementations at file scope:

/* Private to runtime-demo.c — becomes CRT0 if shape proves out. */
static volatile bool g_demo_interrupted;
static AxlSignalHandler g_demo_handler;
static struct { AxlAtexitFn fn; void *data; } g_demo_atexit[16];
static size_t g_demo_atexit_count;

static void demo_yield(void);
static void demo_signal_install(AxlSignalHandler fn);
static bool demo_interrupted(void);
static uint32_t demo_atexit(AxlAtexitFn fn, void *data);
/* ... */

Break-event wiring in the demo: manually register a notify on the shell’s break event (the ExecutionBreak member of EFI_SHELL_PROTOCOL, reached as gEfiShellProtocol->ExecutionBreak when using EDK2’s ShellLib), set the flag, and invoke the handler. The demo exits via return from main (no forced gBS->Exit; we’re in a demo, not CRT0).

8.3 Success criteria for the prototype

Before promoting to CRT0:

[ ] All 7 scenarios run cleanly in QEMU X64 + AARCH64.
[ ] Break during the yield scenario exits within ~100 ms.
[ ] atexit callbacks run in LIFO order on both normal + break exits; proven via printf ordering.
[ ] Default loop’s sources fire at least once, then tear down without leak warnings.
[ ] Nested wait-inside-callback returns correctly without corrupting the outer loop.
[ ] Registry sweep catches every leaked tier-1 resource and logs file:line. Deliberately-leaked AxlEvent is closed before gBS->Exit returns control to the Shell.
[ ] axl_exit(rc) and return rc produce identical cleanup behavior (same atexit order, same sweep output).
[ ] No memory leaks on any successful path (AXL_MEM_DEBUG clean).
[ ] CPU-idle regression test still passes (ratio < 0.60) — the prototype must not introduce busy-polling.

If any of these fails, we iterate on the API shape in the prototype before committing to CRT0 integration.

8.4 After prototype

Once the API shape is proved:

Move the mini-runtime implementations into src/runtime/.
Add public headers.
Hook _axl_init / _axl_cleanup to call runtime setup/teardown.
Instrument the initial set of AXL APIs with axl_yield (one PR per module so blast radius is small).
Update docs/AXL-Concurrency.md to cross-link this doc and reference the runtime model.
Add a SIGINT-aware test to test/integration/ — spawn a QEMU instance that sends break mid-test, verify clean exit and no leaks.

9. Design decisions locked in

Captured here so they don’t re-surface as questions during implementation:

Registry is always on. No AXL_NO_RUNTIME_REGISTRY escape hatch. Drivers and runtime images rarely create resources through the axl_event_* public API — they work directly with backend or EDK2 primitives — so the registry cost falls on the app-level consumers who benefit from it.
Heap sweep is mode-dependent: debug reports, release frees. Debug must not auto-free or developers never see their bugs.
Sweep order is LIFO (reverse registration order). Matches atexit; a container tears down before its contents when it was registered after them.
axl_exit is the only blessed exit path. Bypassing it (raw gBS->Exit, explicit PE return) is documented as unsafe and skips all cleanup.
User break handlers don’t do cleanup. They run at raised TPL where cleanup isn’t safe; cleanup runs on the unwind.

10. Open questions

Registry storage. Fixed-size array (16? 64?) or dynamic (AxlArray-backed)? Fixed avoids alloc in cleanup path which is nice; dynamic handles edge cases (apps with hundreds of live resources). Lean dynamic — the cost is one arena-backed AxlArray and it rarely grows past initial capacity.
Break during axl_yield with an installed handler that returns normally. Does yield return, or does it force exit? Proposal: returns; caller’s responsibility to react to axl_interrupted() afterward. Matches POSIX signal-handler semantics.
How does Ctrl-C interact with AxlCancellable? Currently they both map to AXL_CANCELLED. Under the new model, should a cancellable see a separate flag? Proposal: keep the current unified AXL_CANCELLED return, but axl_interrupted() only reports Ctrl-C (not cancellables). Distinguishes “user wants out” from “internal subsystem wants me to stop.”
Should axl_yield dispatch on the default loop even if no break is pending? The §3 sketch says yes. Counterargument: surprising side effects (timer callbacks fire at unpredictable times). Proposal: dispatch only if pending work is immediately ready (non-blocking poll). No wait, no iteration count.
Watchdog default: on or off? Leaning off. Opt-in via axl_watchdog_enable(60) for apps that want the livelock guard.

11. What this doesn’t help with

CPU-bound app code with no axl_yield and no AXL calls: still uninterruptible. Document with a specific example.
Code hung inside a firmware call (UEFI protocol deadlock): not our problem; watchdog reset is the only option.
Bugs in firmware event handling: platform-specific; document workarounds as they come up.

Appendix: Decision log

Captures the high-level choices made in our design conversations so future contributors don’t re-litigate them.

No longjmp. Rejected in the signals discussion for async-signal-unsafety reasons. See §7.
No watchdog repurpose. Watchdog is reset-only on every platform; not useful for signal-like semantics. See §7.
Yes CRT0-owned runtime. Controlling every AXL API is the right leverage point — cooperative yields in library code approximate POSIX signal responsiveness. See §1 and §3.
Default loop is optional, not mandatory. Apps that already manage their own don’t have to change. See §5.
Sleep is Ctrl-C interruptible. Landed in commit 72ae173, documented in axl-wait.h. This doc builds on that assumption.