# AXL Concurrency Model
AXL targets UEFI, which is cooperatively concurrent and single-threaded on the BSP. There are no OS threads, no preemption, and no shared-memory locking to design around. Concurrency comes from callbacks (via `AxlLoop`) and, where available, work offload onto other cores (via `AxlTask`).
This doc is the single source of truth for which primitive to reach for. It also records why AXL picked the libuv / GLib / asyncio style of callbacks-plus-stop-tokens over GIL-style threading, stackful coroutines, and protothread macros — so future contributors don’t re-litigate the design.
See `src/event/README.md` for the prose on the event primitives themselves, and `src/loop/README.md` for event-loop mechanics.
Also in design: `AXL-Runtime.md` — a proposal for a CRT0-owned runtime (default loop, Linux-style signal handling, `axl_yield()` for tight loops, `axl_atexit` for cleanup). Not yet implemented; current Ctrl-C handling still flows through `axl_loop_run`’s return code.
## A note on naming
“Event” appears three times in AXL docs:

1. The event loop (`AxlLoop`) – the dispatcher.
2. An event source – a thing registered with the loop (timer, idle, raw event, …).
3. `AxlEvent` – one kind of source: a one-shot latch.

This mirrors UEFI’s own overload. An `AxlEvent` is a one-shot latch backed by a UEFI event, and the event loop dispatches them.
## The four-axis taxonomy
Every AXL concurrency primitive answers exactly one of four questions. Overlap is minimal and deliberate.
| Axis | Primitive | Purpose | Loop integration |
|---|---|---|---|
| Dispatch – “when does my code run?” | `AxlLoop` | The event reactor | is the loop |
| | defer (`axl-defer.c`) | “Run this soon, on next tick” – escape a constrained callback | requires a running loop |
| Coordination – “how do I wait for X?” | `axl_wait_*` | Interruptible poll of memory (MMIO status, hardware) or a predicate | spins up a throwaway loop |
| | `AxlEvent` | Producer signals → waiter resumes (zero polling, UEFI-event-driven) | spins up a throwaway loop |
| | sleep helper (`axl-wait.c`) | Interruptible sleep | spins up a throwaway loop |
| Notification – “how do I tell others?” | stop token (`axl-cancellable.c`) | Stop token shared across async ops; cancel once, many ops abort | typed wrapper over `AxlEvent` |
| | pub/sub bus | Pub/sub bus – decoupled, many subscribers | delivery is deferred via the loop |
| | direct callback | Coupled point-to-point | caller-defined |
| | raw event source | Foreign-event interop (TCP completion tokens, protocol-notify) | hand a raw UEFI event to the loop |
| Work offload – “run where?” | `AxlTask` | Real parallelism on APs (other cores); falls back single-core | AP dispatch, polled via the loop |
| | async offload (`axl-async.c`) | Fire-and-forget AP work with a BSP callback | registers an idle source on the caller’s loop |
## Decision guide
Pick by what you need to do, not by what primitive looks closest:
| I need to… | Use |
|---|---|
| Run code every N ms | a repeating timer source |
| Run code once after a delay | a one-shot timer source |
| React to keyboard input | a raw event source over the console key event |
| Do background work between events | an idle source |
| React when a UEFI protocol appears | a raw event source (protocol-notify) |
| Integrate a firmware-owned `EFI_EVENT` | a raw event source |
| Integrate an `AxlEvent` I own | register it with the loop as a source |
| Run code safely from a constrained context | defer (`axl-defer.c`) |
| Let my async callback wake the main thread | `AxlEvent` (signal → waiter resumes) |
| Let a caller abort any number of async ops | a shared stop token (`axl-cancellable.c`) |
| Poll a hardware status register (CPU idles) | an `axl_wait_*` poll helper |
| Interruptible sleep | an `axl_wait_*` sleep helper |
| Wait on a complex condition, driving a state machine | an `axl_wait_*` predicate wait |
| Decouple two modules with named events | the pub/sub bus |
| Offload CPU-heavy work to another core | `AxlTask` |
## Why this model, not another
AXL’s shape is event loop + callbacks + stop-tokens. That’s the model Node.js (libuv), Nginx, pre-`await` Python asyncio, libev, and GLib all chose — the standard answer for cooperative I/O concurrency in a single-threaded runtime. This section records the alternatives considered and why they don’t fit UEFI.
### Why not Python’s GIL model
The GIL is a lock that exists only because CPython has real OS threads and needs to serialize interpreter state. UEFI has no OS threads on the BSP, so there is nothing to lock. The GIL is a workaround born of legacy threading; borrowing the name without the problem would add ceremony without value. Where AXL does touch real parallelism (APs via `AxlTask`), the primitive is an explicit submit / poll queue — no shared mutable state, no lock needed.
### Why not stackful coroutines (fibers / green threads)
Each coroutine gets its own stack. A firmware app might juggle 20 async ops; at 16 KB/stack that’s 320 KB, on a system where every KB matters. Debuggers choke on stack swaps. Lifetime management (what owns the coroutine, when is it reaped) would become a whole new story AXL doesn’t have today. The memory and complexity cost doesn’t buy enough.
### Why not stackless coroutines / protothreads
Protothreads fake yield points with macro tricks (`switch`/`case` or computed `goto`). Local variables don’t survive yields, so porting existing C code is painful and error-prone, and debugging a protothread means reading a generated state machine by hand. Worth considering if the codebase were greenfield and async flows were deep — ours are rarely more than three callbacks deep.
### Why not async/await via macros
C doesn’t have async/await natively. Adding a macro-based emulation (à la `libasync` or `asyncify`) buys brevity at the cost of legibility, and surprises UEFI developers. The counterweight in AXL is the sync wrappers built on top of the callback primitives: `axl_tcp_connect`, `axl_http_get`, `axl_wait_*`, `axl_event_wait_timeout`. They let the common flat case be written synchronously; callback nesting only appears where truly async composition is needed. That’s the right pressure valve for firmware.
## Where this breaks down
Three-level async flows (connect → TLS handshake → HTTP request) become scope soup. The sync wrappers are the near-term answer. If a concrete pain point surfaces later, a thin `AxlFuture` / promise layer on top of `AxlEvent` could compose them with `.then()` / `.all()`. Don’t build it speculatively.
## Where the primitives live
| Directory | Axis | Files |
|---|---|---|
| `src/loop/` | dispatch | `axl-loop.c`, `axl-defer.c`, `axl-signal.c` |
| `src/event/` | coordination | `axl-event.c`, `axl-cancellable.c`, `axl-wait.c` |
| `src/task/` | offload | `axl-task-pool.c`, `axl-async.c`, `axl-buf-pool.c` |
This layout is intentional: each directory corresponds to one axis of the taxonomy. Adding a new concurrency primitive? Pick an axis. If it doesn’t fit any of the four, reconsider whether the primitive earns its weight.
## Background reading
- GLib `GMainLoop` – the closest cousin. Same shape: loop + sources + `GCancellable`.
- libuv design – the single-threaded event loop behind Node.js.
- Python asyncio, pre-`await` era – protocol + transport callbacks. What `async`/`await` replaced.
- Linux kernel `struct completion` (docs) – the historical name for what AXL calls `AxlEvent`.
- C++ `std::latch` – the closest C++ analogue of `AxlEvent`’s one-shot latch semantics.