Quamina: Compilation

This series is an excuse to build a writing habit while exploring a codebase I’ve wanted to understand better.

Pattern matching systems face a fundamental trade-off. For each incoming event, you could iterate through every pattern, checking each one individually. This is simple to implement and reason about, but scales poorly. With thousands of patterns, you’re checking thousands of conditions for every event. The alternative is compilation: transform all patterns into a shared data structure that processes events in a single pass, independent of pattern count. Quamina takes the compilation approach, converting JSON patterns into a two-level finite automaton where multiple patterns share states and transitions.

The challenge lies in messiness of real world data structures. Take JSON as an example. Fields are unordered, values can be strings, numbers, or complex patterns like wildcards, and a single pattern can specify multiple alternative values for a field. A pattern like {"x": [1, 2], "y": ["foo"]} means “match events where x equals 1 or 2, AND y equals foo”. Managing these combinations while sharing structure across patterns requires careful automaton design. Quamina solves this with a field-level automaton that transitions on sorted field names, feeding into value-level automata that match field content byte-by-byte.

Two-Level Architecture Link to heading

The compilation produces two distinct types of automata working in tandem. The field matcher operates at the pattern structure level, tracking which fields appear and in what order. The value matcher operates at the byte level, processing field values as sequences of UTF-8 bytes. This separation allows field-name transitions to share structure across patterns while value transitions handle the complexity of finer details such as wildcards, or case-insensitive matching; and numeric encoding.

Consider how the system compiles {"status": ["active"], "priority": [1, 2]}. The field matcher creates two transitions: one for the path “priority”, another for “status” (lexicographically sorted). Each transition leads to a value matcher that checks the corresponding field’s content. The value matcher for “priority” contains automaton paths for both encoded representations of 1 and 2, merged into a single structure. When an event arrives, the field matcher processes sorted pathname/value pairs from the flattening layer, invoking the appropriate value matcher for each field.

The data structures reflect this division of responsibility. The fmFields struct (field_matcher.go) contains a map from field paths to valueMatcher instances, plus slices tracking which patterns have matched at this point in the traversal:

type fmFields struct {
    transitions map[string]*valueMatcher
    matches     []X
    existsTrue  map[string]*fieldMatcher
    existsFalse map[string]*fieldMatcher
}

The separate existsTrue and existsFalse maps handle patterns that check for field presence or absence without examining values, enabling early-exit optimizations when a field exists or doesn’t exist regardless of its content. It’s always felt like a hack in Ruler, and is done slightly more elegantly here.

The vmFields struct (value_matcher.go) encapsulates the byte-level automaton, along with metadata about whether it contains numeric patterns or requires nondeterministic traversal:

type vmFields struct {
    startTable          *smallTable
    singletonMatch      []byte
    singletonTransition *fieldMatcher
    hasNumbers          bool
    isNondeterministic  bool
}

The singletonMatch and singletonTransition fields implement an optimization for the common case where a field has only one allowed value. Instead of building a full automaton, the system stores the literal bytes to compare and the next state to transition to if they match. Only when a second value is added does the implementation upgrade to a full automaton structure.

Compilation Process Link to heading

Pattern compilation begins with parsing JSON into typed field/value pairs (pattern.go). Each field becomes a patternField containing a path (like “status” or “config\ntimeout” for nested fields) and an array of typed values to match. The parser uses json.UseNumber() to preserve numeric precision, since the Q-number encoding requires exact float64 values.

After parsing, the implementation sorts fields lexically (core_matcher.go). Comments in the codebase explain this explicitly:

Since the order of fields is generally not significant in encoded data objects, the fields are sorted by name before constructing the automaton. This sorting matches what the flattening layer does to events, enabling deterministic state transitions. Without it, the patterns {"a": 1, "b": 2} and {"b": 2, "a": 1} would follow different automaton paths despite being semantically equivalent.

The sorted fields then drive automaton construction through a state-accumulation loop (core_matcher.go). Starting from the root field matcher, each pattern field generates zero or more next states. For simple value matches, addTransition adds the values to the field’s value matcher and returns the resulting next states. For existence checks, addExists follows the appropriate branch without examining values. The loop accumulates states across all fields:

states := []*fieldMatcher{currentFields.state}
for _, field := range patternFields {
    var nextStates []*fieldMatcher
    for _, state := range states {
        switch field.vals[0].vType {
        case existsTrueType:
            ns = state.addExists(true, field)
        case existsFalseType:
            ns = state.addExists(false, field)
        default:
            ns = state.addTransition(field, printer)
        }
        nextStates = append(nextStates, ns...)
    }
    states = nextStates
}

This accumulation naturally handles multi-value fields. When a field specifies [1, 2] as allowed values, addTransition returns two states. One representing “we matched 1 and need to continue”, another for “we matched 2 and need to continue”. The next field in the pattern then processes from both states, creating the Cartesian product of all combinations.

After processing all fields, the implementation marks terminal states with the pattern’s identifier (core_matcher.go). These marks accumulate, so a state representing “matched x=1” might have multiple pattern IDs if several patterns share that prefix. During event matching, when traversal reaches a marked state, all associated patterns are returned as matches.

Value Automaton Construction Link to heading

Value matchers create byte-level automata using the smallTable representation (small_table.go), a space-efficient encoding of nondeterministic finite automata. Each table stores byte-range transitions using parallel arrays of ceilings and next states:

type smallTable struct {
    ceilings []byte
    steps    []*faState
    epsilons []*faState
    spinout  *faState
}

The ceilings array defines byte ranges. To find the transition for byte b, the code scans for the first ceiling greater than b, then indexes into steps at that position. This could be done simply with an array or a map, but both are size-inefficient as the comment in the code says. Instead, the ceiling-based representation uses space proportional to the number of distinct transitions, not the number of possible byte values (256).

For a simple string like “foo”, the system builds a linear chain of states: " → f → o → o → " → 0xF5 → terminal. Each transition has a single ceiling (the matched byte plus one) and a single next state. The final 0xF5 byte is Quamina’s value terminator, marking the boundary between one field value and the next in the flattened representation. Since Q-numbers always use bytes less than 0x80, and string characters in valid UTF-8 are safe, the 0xF5 terminator creates no ambiguity.

Numeric values generate two parallel paths that merge at the terminator. For the number 1, the implementation creates one path for the literal string “1” and another for the Q-encoded representation (value_matcher.go). The Q-encoding enables automaton-based numeric comparison. This dual-path construction allows patterns to match regardless of whether events represent numbers as JSON strings or JSON numbers.

Pattern compilation then merges these individual automata using mergeFAs() (in nfa.go). When adding a pattern with values [1, 2] to a value matcher that already has an automaton for [2, 3], the merge algorithm computes the union. For the shared value 2, the existing automaton path is reused. For the new value 1, a new path is created and spliced into the existing structure. The result is a single automaton that recognizes all values across all patterns targeting that field.

The key performance characteristic of this architecture is structural sharing. When thousands of patterns check the same field for different values, they all share the same field-level transition. The value matcher for that field contains a single unified automaton recognizing all allowed values. Consider what happens when compiling these patterns sequentially:

Pattern 1: {"status": ["active"]}
Pattern 2: {"status": ["inactive"]}
Pattern 3: {"status": ["pending"], "priority": [1]}

After pattern 1, there’s a single value matcher for “status” with an automaton matching “active”. Adding pattern 2 merges in a path for “inactive”, creating a trie-like structure where the two paths share the common prefix " before diverging at a vs i. Adding pattern 3 extends the status value matcher with “pending” and creates a new field-level transition for “priority”, but only from the states representing successful “pending” matches.

The merging algorithm creates product states recursively, handling epsilon transitions and spinout states specially. Epsilon transitions represent zero-width choices, used primarily for wildcard patterns. A pattern like "a*b" compiles to: " → a → spinout → b → " → terminal, where spinout has an epsilon self-loop allowing it to consume any bytes until seeing b. When merging two automata with epsilons, the algorithm computes the epsilon closure (all states reachable via zero or more epsilon transitions) to determine which states must be considered simultaneously during traversal.

Memoization prevents exponential blowup during merging (nfa.go). The algorithm maintains a map from state pairs to merged states, ensuring each unique combination is processed only once. Without this, merging n wildcard patterns could create 2^n states due to the combinatorial explosion of epsilon paths.

Implementation Optimizations Link to heading

Thread safety in the compilation layer relies on atomic pointers rather than locks. Both fieldMatcher and valueMatcher store their data in atomic pointers (fieldMatcher.updateable atomic.Pointer[fmFields]). The compilation code holds a write lock only during the brief window where it’s modifying the automaton. Readers see either the old automaton or the new one atomically, never a partial update. This design reflects the usage pattern where compilation is rare (patterns are added infrequently) but matching is frequent (every event triggers traversal).

The distinction between NFA and DFA traversal happens at runtime, not compilation time. The isNondeterministic flag in vmFields determines which traversal algorithm runs. Simple string and number patterns can use DFA-style traversal where each state has at most one successor per byte, enabling a single-state cursor that steps forward deterministically. Wildcard patterns and anything-but patterns require NFA traversal where multiple states are active simultaneously, tracked in a state set that expands and contracts as epsilon transitions are followed. This hybrid approach balances implementation simplicity (everything is stored as an NFA) with performance (use fast DFA traversal when possible), a consistent trait of Quamina.

I do wish here (and in ruler) that we didn’t do recursive automaton construction because it bends my mind, but switching to iterative process is hard. The recursion enables incremental building where each layer adds one transition and recursively constructs the remainder, simplifying the merge logic. This approach reduces memory churn and speeds up pattern processing rate compared to previous attempts at iteration. The recursion depth equals the length of the longest value being matched, typically under 100 bytes, making stack overflow unlikely in practice but I refuse to say where I’ve seen this assumption blow up.

The segments tree receives updates during compilation (core_matcher.go:80-82), tracking which field paths appear in any pattern. This tree drives the optimization in the flattening layer where irrelevant fields can be skipped entirely. Adding a pattern with path “config\ntimeout” updates the tree to mark both “config” and “config\ntimeout” as used, ensuring the flattener will extract these fields from events. The tree is copied on every pattern addition rather than shared [INFERRED: avoiding subtle aliasing bugs at the cost of some allocation overhead].

Advanced Patterns Link to heading

Shell-style wildcards like "user-*-action" compile to automata with spinout states (shell_style.go). The spinout state has two transitions: an epsilon self-loop allowing it to consume another byte, and a match transition for the character after the wildcard. This creates a nondeterministic choice point where the automaton must speculatively try both “consume more with the wildcard” and “match the next literal character”. Multiple wildcard patterns can share the same spinout state structure when their literal components match, though it wasn’t obvious in code how this works.

Anything-but patterns invert normal matching logic (anything_but.go). Instead of accepting when a specific sequence matches, they accept when the input does NOT match any of the specified alternatives. The implementation creates an automaton where forbidden values lead to dead states with no outgoing field transitions, while all other byte sequences lead to success states. For multi-value anything-but like {"anything-but": ["a", "b"]}, the semantics are intersection: the value must be neither “a” nor “b”. Some folks do interpret the last pattern as not “a” and “b” at the same time in an array, which is fair and shows limitations of the query language.

Case-insensitive matching uses Unicode case folding to duplicate paths (monocase.go). Pattern {"equals-ignore-case": "Cat"} compiles to an automaton with transitions for both uppercase and lowercase variants: " → (C|c) → (A|a) → (T|t) → " → terminal. The implementation applies “simple” case folding, which handles most scripts correctly but may miss edge cases like Turkish İ/i distinctions where case folding depends on locale. There’s a branch of Ruler somewhere in Amazon servers that fixes this.

What This Enables Link to heading

By compiling patterns into shared automaton structures, Quamina achieves matching performance that scales independently of pattern count for the common case where patterns share structure. Adding pattern number 10,000 that checks status = "active" reuses the existing field transition and value automaton for “status”, adding only a terminal state marker. The event processing cost depends on the number of unique fields referenced across all patterns, not the total number of patterns.

The compilation cost is paid upfront when patterns are added. Performance comments suggest patterns add at a rate measured in tens of thousands of fields per second on modern hardware. This batched-compilation model works well for systems where patterns change infrequently relative to event arrival rates. The next component in the architecture handles the matching phase: traversing these compiled automata against flattened events to collect matching pattern IDs in a single linear pass through the field list.