Couple of years back, my optimistic estimate was 4+ years for this but I now have quamina-rs, a Rust implementation of Tim Bray’s stupid-fast JSON pattern matching library quamina. It’s the same automata-based approach that powers Amazon EventBridge’s rule matching (among a staggering amount of other Amazon things) but now with Rust’s performance characteristics.

Why This Exists Link to heading

Mostly because I thought it’d be fun. I used to maintain Event Ruler at AWS, the Java library that was spiritual successor to quamina. With Tim continually working on his Go version, I wanted to see how far Rust could push the same algorithms. The answer ended up being significant.

Benchmark results (same patterns, same data):

Implementation Events/sec Relative
quamina-rs (Rust) ~428,000 1.0x
quamina (Go) ~254,000 0.59x
event-ruler (Java) ~204,000 0.48x

That’s 1.7x faster than the Go version and 2.1x faster than the Java one on roughly identical workloads on my laptop.

Between ruler and quamina-rs, the gap widens for certain pattern types even more:

Pattern Type Rust Java Speedup
Prefix (100 patterns) 219 ns 1,330 ns 6.1x
Anything-but 147 ns 560 ns 3.8x
Numeric range 159 ns 600 ns 3.8x
Wildcard (26 patterns) 496 ns 1,150 ns 2.3x

What It Does Link to heading

The readme does a much better job of this, but in brief; quamina-rs matches JSON events against patterns. You define patterns like:

let mut q = Quamina::new();

// Match events where region is us-east-1 AND status is error
q.add_pattern("alert", r#"{
    "region": ["us-east-1"],
    "status": ["error"]
}"#)?;

// Match events with high CPU
q.add_pattern("cpu-alert", r#"{
    "metric": ["cpu"],
    "value": [{"numeric": [">", 90]}]
}"#)?;

let matches = q.matches_for_event(event_json)?;

There’s a fair amount of pattern-matching support: exact, prefix, suffix, shellstyle wildcards, numeric ranges, anything-but, CIDR, equals-ignore-case, exists/absent, and regex (I-Regexp subset per RFC 9485). See the docs for details.

Compared to the Go version, quamina-rs adds suffix matching, numeric ranges, and CIDR. Compared to Event Ruler, it’s missing the anything-but variants (anything-but-prefix, anything-but-suffix, anything-but-wildcard) and case-insensitive prefix/suffix for now until I can think through the implications.

Where the performance comes from Link to heading

There’s a lot of tricks, but let me mention three for now:

  1. The most basic approach to state machines is a 256-entry (or fixed-sized) transition table per state with one slot for each possible byte value. When input byte comes in, you index directly into the table, and get the next state. This is fast, but memory usage explodes when you have thousands of states and most slots are empty. Similar to the go version, quamina-rs uses sparse tables instead to store only the transitions that exist. Costs a bit more at lookup time but saves ~95% memory.
  2. Suffix matching was interesting. The obvious implementation walks the NFA character by character like ruler. But most suffix patterns are looking for short needles in long strings. I found that using memchr to jump directly to potential match positions gives 40-62x speedups on long inputs with SIMD doing the heavy lifting.
  3. The trickiest part was arena allocation for cyclic NFAs. Rust really doesn’t want you to build graph structures with its ownership and borrowing rules. So an arena owns all states, and states reference each other by index instead of pointer. This reduced unsafe blocks I was originally scattering like nobody’s business.

Is it ready? Link to heading

You be the judge. Quamina-rs has 369 tests covering all edge cases that I could get from original Go version, and then I added some of my own whenever something broke. We have stress testing on rules and inputs along with benchmarks + profiling. There’s formal verification on the automata operations via Kani. Miri checks memory safety. There’s also fuzzing to throw arbitrary JSON and patterns at it for as long as you like.

The regex engine passes 652 of 992 XSD compatibility tests. quamina-rs gets you further on many fronts.

What’s next Link to heading

I’m not sure. I think talking about its internals is quite interesting, but there’s also a meta-story in how I built it. In the background while doing other things, with AI assistance and heavy reliance on the general principles of automation; which I think is just as interesting.

The code is at quamina-rs and you can also explore it in a playground. Feedback and contributions would be awesome.