AxlFind — Byte-Substring Search Engine

See AxlData — Data Structures for an overview of all data modules.

A Boyer-Moore-Horspool byte-substring search (with case-insensitive and whole-word variants, forward and backward) that runs over an abstract random-access byte source, an AxlByteReader. The same engine drives a flat memory block (the built-in AxlMemReader), an AxlTextBuffer (gap buffer), and an AxlPieceTree (out-of-core piece table); the axl_text_buffer_find and axl_piece_tree_find wrappers build the appropriate reader and call axl_find_in_source.
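For readers unfamiliar with the algorithm, the following is a minimal sketch of a forward Boyer-Moore-Horspool scan over a flat buffer. It is not the library's implementation: sketch_horspool and SKETCH_NOT_FOUND are hypothetical names, and the case-insensitive, whole-word, and backward variants are left out.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NOT_FOUND ((size_t)-1)   /* hypothetical sentinel, not an AXL name */

/* Forward Boyer-Moore-Horspool over a flat buffer.
 * Returns the lowest match offset, or SKETCH_NOT_FOUND. */
static size_t sketch_horspool(const unsigned char *hay, size_t hay_len,
                              const unsigned char *needle, size_t needle_len)
{
    size_t shift[256];
    size_t i, pos;

    if (needle_len == 0 || needle_len > hay_len)
        return SKETCH_NOT_FOUND;

    /* A byte that does not occur in the needle allows a full-length skip. */
    for (i = 0; i < 256; i++)
        shift[i] = needle_len;

    /* Every needle byte except the last shifts by its distance from the end. */
    for (i = 0; i + 1 < needle_len; i++)
        shift[needle[i]] = needle_len - 1 - i;

    pos = 0;
    while (pos + needle_len <= hay_len) {
        if (memcmp(hay + pos, needle, needle_len) == 0)
            return pos;
        /* Skip according to the haystack byte aligned with the needle's last byte. */
        pos += shift[hay[pos + needle_len - 1]];
    }
    return SKETCH_NOT_FOUND;
}
```

The shift table is what lets the scan skip ahead by up to a full needle length after a mismatch instead of advancing one byte at a time.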

The reader pulls overlapping windows so a match straddling the source’s internal boundaries (a piece boundary, the gap) is never missed; a contiguous source that supplies the optional peek is scanned in place with no copy. A successful find reports an AxlMatch (start + length); length is carried explicitly so the result shape already fits variable-length matchers.
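The overlap idea can be illustrated with a small sketch. It assumes a tiny fixed window and a plain memcmp scan inside each window; the window size, the callback type sketch_read_fn, and all sketch_* names are hypothetical, and the real engine's window size and overlap policy are not documented here. Each new window starts needle_len - 1 bytes before the end of the previous one, so a match that straddles a window edge is still seen whole.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NOT_FOUND ((size_t)-1)
#define SKETCH_WINDOW 8   /* deliberately tiny so matches straddle windows */

/* Hypothetical read callback: copies up to len bytes at offset, returns the count. */
typedef size_t (*sketch_read_fn)(void *ctx, size_t offset, size_t len, void *buf);

/* Flat source used only to drive the sketch. */
struct sketch_flat { const char *data; size_t len; };

static size_t sketch_flat_read(void *ctx, size_t offset, size_t len, void *buf)
{
    const struct sketch_flat *f = ctx;
    if (offset >= f->len)
        return 0;
    if (len > f->len - offset)
        len = f->len - offset;
    memcpy(buf, f->data + offset, len);
    return len;
}

static size_t sketch_windowed_find(sketch_read_fn read, void *ctx, size_t total,
                                   const char *needle, size_t needle_len)
{
    char window[SKETCH_WINDOW];
    size_t base = 0;

    if (needle_len == 0 || needle_len > SKETCH_WINDOW || needle_len > total)
        return SKETCH_NOT_FOUND;

    while (base + needle_len <= total) {
        size_t got = read(ctx, base, SKETCH_WINDOW, window);
        size_t i;

        if (got < needle_len)
            break;
        for (i = 0; i + needle_len <= got; i++)
            if (memcmp(window + i, needle, needle_len) == 0)
                return base + i;
        /* Re-read the last needle_len - 1 bytes at the start of the next window. */
        base += got - (needle_len - 1);
    }
    return SKETCH_NOT_FOUND;
}
```

With an 8-byte window over "0123456789abcdef", the needle "789a" crosses the first window's edge and is found in the second window, which begins at offset 5.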

Header: <axl/axl-find.h>

API Reference

Enums

enum AxlFindFlags

Flags for axl_find_in_source and the axl_*_find wrappers.

axl-find.h:

Byte-substring search over an abstract random-access byte source.

The search engine (Boyer-Moore-Horspool, with case-insensitive and whole-word variants) reads through an AxlByteReader — a tiny function-table over whatever holds the bytes — so the same engine drives a flat memory block, an AxlTextBuffer (gap buffer), and an AxlPieceTree (out-of-core piece table). The reader’s read pulls windowed chunks (with overlap to catch matches spanning the source’s internal boundaries); an optional peek lets a contiguous source be scanned in place with no copy.

axl_find_in_source is the single engine; the per-type axl_*_find wrappers (axl_text_buffer_find, axl_piece_tree_find) build the appropriate reader and call it. A successful find reports an AxlMatch (start + length); length is carried explicitly so the result shape already fits variable-length matchers (a future regex / fuzzy engine slots in behind the same reader without reshaping callers).

Byte-oriented; the only “word” notion is for AXL_FIND_WHOLE_WORD, where a word byte is [A-Za-z0-9_]. Single-threaded (UEFI).

Values:

enumerator AXL_FIND_DEFAULT

forward, case-sensitive search with no word-boundary check

enumerator AXL_FIND_CASE_INSENSITIVE

ASCII case fold.

enumerator AXL_FIND_BACKWARD

search toward offset 0

enumerator AXL_FIND_WHOLE_WORD

match only at word boundaries
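The two byte-level rules behind these flags can be sketched as follows. The helper names are hypothetical, and one detail is an assumption rather than documented behavior: the sketch treats the start and end of the source as word boundaries.

```c
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only case fold: bytes outside 'A'..'Z' pass through unchanged. */
static unsigned char sketch_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* A word byte is [A-Za-z0-9_], as stated for AXL_FIND_WHOLE_WORD. */
static bool sketch_is_word_byte(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '_';
}

/* True when the candidate match [start, start + len) has a non-word byte
 * on each side. ASSUMPTION: the edges of the source count as boundaries. */
static bool sketch_is_whole_word(const char *hay, size_t hay_len,
                                 size_t start, size_t len)
{
    bool left_ok  = (start == 0) ||
                    !sketch_is_word_byte((unsigned char)hay[start - 1]);
    bool right_ok = (start + len >= hay_len) ||
                    !sketch_is_word_byte((unsigned char)hay[start + len]);
    return left_ok && right_ok;
}
```

Under these rules, "bar" inside "foo_bar baz" is not a whole-word match because the preceding underscore is a word byte, while "baz" is.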

Functions

void axl_mem_reader_init(AxlMemReader *mem, const void *data, size_t len)

Initialize a contiguous in-memory reader over data / len.

Parameters:
  • mem – reader to initialize (caller-owned)

  • data – borrowed bytes (must outlive the search)

  • len – number of bytes

bool axl_find_in_source(const AxlByteReader *reader, const char *needle, size_t needle_len, size_t from_offset, uint32_t flags, AxlMatch *out)

Search reader for the needle_len bytes at needle, scanning from from_offset. Forward (default) returns the lowest match with start >= from_offset; AXL_FIND_BACKWARD returns the highest match with start <= from_offset. AXL_FIND_CASE_INSENSITIVE folds ASCII case; AXL_FIND_WHOLE_WORD requires non-word bytes on both sides. Matches spanning the source’s internal boundaries are handled. Wrap-around is the caller’s job.

Parameters:
  • reader – byte source to scan

  • needle – bytes to find

  • needle_len – length of needle

  • from_offset – where to start scanning

  • flags – AxlFindFlags

  • out – [out] match on success

Returns:

true and fills out on a match; false if not found (or needle_len is 0, or any required argument is NULL).
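The direction rules and the caller-side wrap-around can be made concrete with a naive reference sketch over a flat buffer. This is not axl_find_in_source: SketchMatch, sketch_find, and sketch_find_wrap are hypothetical stand-ins that only mirror the documented semantics (lowest start >= from_offset going forward, highest start <= from_offset going backward).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the documented result shape (start + length). */
typedef struct { size_t start; size_t length; } SketchMatch;

static bool sketch_find(const char *hay, size_t hay_len,
                        const char *needle, size_t needle_len,
                        size_t from, bool backward, SketchMatch *out)
{
    size_t last, pos;

    if (!hay || !needle || !out || needle_len == 0 || needle_len > hay_len)
        return false;
    last = hay_len - needle_len;   /* highest offset where the needle still fits */

    if (!backward) {
        /* Lowest match with start >= from. */
        for (pos = from; pos <= last; pos++) {
            if (memcmp(hay + pos, needle, needle_len) == 0) {
                out->start = pos;
                out->length = needle_len;
                return true;
            }
        }
        return false;
    }

    /* Highest match with start <= from. */
    pos = from < last ? from : last;
    for (;;) {
        if (memcmp(hay + pos, needle, needle_len) == 0) {
            out->start = pos;
            out->length = needle_len;
            return true;
        }
        if (pos == 0)
            return false;
        pos--;
    }
}

/* Caller-side wrap-around for a forward search: retry from offset 0
 * when nothing is found at or after from. */
static bool sketch_find_wrap(const char *hay, size_t hay_len,
                             const char *needle, size_t needle_len,
                             size_t from, SketchMatch *out)
{
    if (sketch_find(hay, hay_len, needle, needle_len, from, false, out))
        return true;
    return from > 0 && sketch_find(hay, hay_len, needle, needle_len, 0, false, out);
}
```

A caller of the real engine would apply the same retry pattern around axl_find_in_source. A backward wrap would retry from the end of the source instead.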

struct AxlMatch
#include <axl/axl-find.h>

A successful match: the bytes [start, start + length). length equals the needle length for a literal find, but is reported explicitly so the result also fits variable-length matchers.

Public Members

size_t start

byte offset of the match

size_t length

match length in bytes
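One practical consequence of the explicit length is that a caller iterating over matches can advance by the reported length rather than by the needle length. The sketch below counts non-overlapping matches that way. It uses a naive stand-in finder over a flat buffer, and every sketch_* name and the SketchMatch type are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the documented result shape (start + length). */
typedef struct { size_t start; size_t length; } SketchMatch;

/* Naive forward finder used only to drive the loop below. */
static bool sketch_find_fwd(const char *hay, size_t hay_len,
                            const char *needle, size_t needle_len,
                            size_t from, SketchMatch *out)
{
    size_t pos;

    if (needle_len == 0 || needle_len > hay_len)
        return false;
    for (pos = from; pos <= hay_len - needle_len; pos++) {
        if (memcmp(hay + pos, needle, needle_len) == 0) {
            out->start = pos;
            out->length = needle_len;
            return true;
        }
    }
    return false;
}

/* Count non-overlapping matches, resuming after each reported match. */
static size_t sketch_count_matches(const char *hay, size_t hay_len,
                                   const char *needle, size_t needle_len)
{
    SketchMatch m;
    size_t count = 0;
    size_t from = 0;

    while (sketch_find_fwd(hay, hay_len, needle, needle_len, from, &m)) {
        count++;
        from = m.start + m.length;   /* relies on the match, not on needle_len */
    }
    return count;
}
```

Because the loop never refers to needle_len when advancing, it would keep working unchanged if the finder were replaced by one that reports variable-length matches.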

struct AxlByteReader

Random-access byte source the search engine reads through. An implementation fills the function pointers and ctx; the engine never inspects ctx itself.

Public Members

size_t (*length)(const AxlByteReader *r)

Total number of bytes the reader can serve.

size_t (*read)(const AxlByteReader *r, size_t offset, size_t len, void *buf)

Copy up to len bytes starting at logical offset into buf and return the number actually copied, which is fewer than len only when offset + len runs past the end of the source. Required: unlike peek, this callback is always present.

const char *(*peek)(const AxlByteReader *r, size_t offset, size_t len)

Optional zero-copy fast path: if [offset, offset + len) is stored contiguously, return a direct pointer to those bytes; otherwise return NULL. Returning NULL is always safe, because the engine then falls back to read. The peek pointer itself may be NULL when the source offers no fast path.

void *ctx

implementation data (opaque to the engine)
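To show how the members fit together, here is a sketch of a reader over a source stored in two separate pieces, loosely in the spirit of a gap buffer. SketchByteReader is a local stand-in that mirrors the documented members so the example compiles on its own; it is not the library's type, and SketchSplit and the split_* callbacks are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in mirroring the documented members of AxlByteReader. */
typedef struct SketchByteReader SketchByteReader;
struct SketchByteReader {
    size_t (*length)(const SketchByteReader *r);
    size_t (*read)(const SketchByteReader *r, size_t offset, size_t len, void *buf);
    const char *(*peek)(const SketchByteReader *r, size_t offset, size_t len);
    void *ctx;
};

/* Hypothetical source: the logical bytes are head followed by tail. */
typedef struct {
    const char *head; size_t head_len;
    const char *tail; size_t tail_len;
} SketchSplit;

static size_t split_length(const SketchByteReader *r)
{
    const SketchSplit *s = r->ctx;
    return s->head_len + s->tail_len;
}

static size_t split_read(const SketchByteReader *r, size_t offset, size_t len, void *buf)
{
    const SketchSplit *s = r->ctx;
    size_t total = s->head_len + s->tail_len;
    size_t copied = 0;
    char *dst = buf;

    if (offset >= total)
        return 0;
    if (len > total - offset)
        len = total - offset;               /* clamp at the end of the source */

    if (offset < s->head_len) {             /* part (or all) comes from head */
        size_t n = s->head_len - offset;
        if (n > len)
            n = len;
        memcpy(dst, s->head + offset, n);
        copied = n;
    }
    if (copied < len) {                     /* the remainder comes from tail */
        size_t tail_off = offset + copied - s->head_len;
        memcpy(dst + copied, s->tail + tail_off, len - copied);
        copied = len;
    }
    return copied;
}

static const char *split_peek(const SketchByteReader *r, size_t offset, size_t len)
{
    const SketchSplit *s = r->ctx;

    if (offset <= s->head_len && len <= s->head_len - offset)
        return s->head + offset;            /* entirely inside head */
    if (offset >= s->head_len && offset - s->head_len <= s->tail_len &&
        len <= s->tail_len - (offset - s->head_len))
        return s->tail + (offset - s->head_len);   /* entirely inside tail */
    return NULL;   /* straddles the boundary or is out of range: use read */
}
```

The design point is that peek only has to answer the easy case. Any range that crosses the head/tail boundary returns NULL, and read stitches the two pieces together into the caller's buffer.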

struct AxlMemReader
#include <axl/axl-find.h>

Built-in AxlByteReader over a flat, contiguous memory block. The reader supports the zero-copy peek path, so searching a memory block performs no copying. Initialize with axl_mem_reader_init and pass &mem.reader to axl_find_in_source. The bytes are borrowed — data must outlive the search.

Public Members

AxlByteReader reader

pass &reader to axl_find_in_source

const char *data

borrowed bytes (must outlive the search)

size_t len

number of bytes