As I was building the Rust port of Tim Bray’s Quamina, I was very surprised by how much progress I was able to make in a very short time. I had the automaton logic, the small tables, the field matching, and many of the matchers supported within weeks. Probably could have been days if I had let the automation loose by itself instead of letting it wait for me to review what had happened so far and then tell it to keep going. But getting a working NFA construction for wildcards and quantifiers was a completely different story.
In Go and Java you can create cyclic loops and let the garbage collector take care of the resulting memory management mess, since it traces for reachability. So if something is reachable from the root, it stays alive, cycles and all.
// In Go, this just works
state.next = state // cycle back to yourself, GC handles it
But in Rust, you can’t just write state.next = Box::new(state) because the ownership model needs to know who owns what. A cycle means A owns B and B owns A, which Rust can’t represent directly. The standard workaround is Arc (or atomic reference counting), but reference counting only tracks how many pointers an object has; it doesn’t check whether those objects are still reachable. If two objects point to each other, their counts never drop to zero, so they’re never freed and you leak memory.
// This compiles but leaks - Arc can't collect cycles
let a = Arc::new(RefCell::new(State::new()));
let b = Arc::new(RefCell::new(State::new()));
a.borrow_mut().next = Some(Arc::clone(&b));
b.borrow_mut().next = Some(Arc::clone(&a)); // leaked cycle
The workaround I landed on was avoiding cycles entirely by faking them with chains, which is a pattern I’d seen before in contexts where you can’t express recursion directly. Originally, for a pattern like [a-z]*, instead of one state looping back to itself, I naively tried to create a hundred empty states in sequence with each pointing to the next, assuming I wouldn’t need more than 101 ever. This passed the tests, but when I started running benchmarks against the Go version the numbers weren’t great.
It wasn’t dramatically off, but enough to smell fishy given that Rust should have been faster on this kind of work. Profiling pointed at the quantifier handling, which made sense once I thought about it more. I was creating hundred-state chains where Go had two-state cycles. The chain approach allocated more memory, traversed deeper, and fundamentally couldn’t represent what the pattern actually meant because I was encoding “repeat up to 100 times” when the pattern said “repeat any number of times.”
The fix was arena allocation, where states live in a central Vec and reference each other by index instead of pointer. A StateId is just a u32, so the borrow checker doesn’t care where it points since there’s nothing to borrow. Once I had that working, the pattern [a-z]* went from 200+ chained states to 4 states with a true loop, and benchmarks showed a 2.5x speedup for regexp matching compared to the Go version.
#[derive(Clone, Copy)] // can be copied freely, no ownership issues
struct StateId(u32);
struct StateArena {
states: Vec<State>,
}
impl StateArena {
fn alloc(&mut self) -> StateId {
let id = StateId(self.states.len() as u32);
self.states.push(State::new());
id
}
}
// cycles are just indices pointing wherever they want
let start = arena.alloc();
let loopback = arena.alloc();
arena[loopback].transitions.push(start); // points back to start - no problem
arena[start].transitions.push(loopback); // true cycle, 4 states total
I looked at existing arena crates like slotmap, typed-arena, generational-arena; but the performance was hit or miss and the abstractions didn’t fit cleanly into the automata traversal code. For something this central to the matching hot path, I wanted full control over memory layout and traversal patterns rather than working around someone else’s API. If I’m going to verify correctness obsessively anyway, I’d rather verify my own code than write exhaustive tests for a dependency’s invariants. Thinking from a “build vs buy” mentality, this is a core part of the process and critical enough to not offload to someone else until there’s trust (or in my case, until I know enough Rust to trust). For the same reason, I am still unsure of the few dependencies I did take before.
The integration wasn’t smooth though. My first attempt to use arena for all regexp patterns broke a bunch of the tests from Quamina’s samples. Some patterns needed multiple independent NFAs, and my initial Option<StateArena> got overwritten when adding a second pattern. Switching to Vec<(StateArena, StateId)> fixed that. Then I tried using arena for shellstyle wildcards too, which turned out to be architecturally wrong because shellstyle patterns merge into a single automaton for efficiency, but separate arenas would require N traversals instead of one. After that change, shellstyle was 1.8x slower than Go until I tracked it down to Arc allocation during traversal and fixed it with buffer reuse.
What I hadn’t realized was how much unsafe code I’d scattered elsewhere in the codebase while getting things to work. Coming from languages where memory safety isn’t really your problem to think about, I hadn’t internalized that Rust actually makes you prove it rather than just trusting you. I had unsafe blocks for performance shortcuts; from_utf8_unchecked and raw pointer casts to name a few. I was assuming the invariants held without actually having any way to verify them beyond “the unit tests seem to work.”
Then yerke opened the first ever issue pointing out the unsafe code and recommending Miri integration. That reframed how I was thinking about the whole project from a toy to something that could be close to production. Within days I’d added Miri for undefined behavior detection, cargo-fuzz for edge cases, Kani for formal model checking, plus cargo-audit and cargo-deny for dependency hygiene. If I had that scaffolding from day one, I would have found issues way sooner than the weeks of trial and error I went through.
I bet I’d have also found the arena pattern and learned that I’d have to worry about cyclic references much sooner had I started with someone who knew Rust and asked them nicely to look at the code first. You’d be surprised how often you can avoid pain and dead-ends by just talking with people in a very short time. There’s a lot of discussion of tight feedback loops, but not enough on high-signal in those feedback loops. And if you’re not sure, just ask a fellow electrical engineer who can speak much more eloquently about the joys of stronger signals.