Parabix Regular Expression Matching - a Tour
Parabix regular expression matching, like the Glushkov method, uses bitwise operations to accelerate regular expression matching. However, it uses an orthogonal approach in which the bit vectors are used to represent positions in the text being searched.
One can think of a two-dimensional array: each row is for one state in an automaton and each column represents one position in the text being searched. The Glushkov method proceeds column by column, process input characters one at a time, but computing all new state entries at once The Parabix method proceeds row by row, computing state entries one at a time, but using the bitwise parallelism to compute that state entry for many input characters. On current Intel processors, the AVX-512 instructions allow those calculations to be done for 512 character code units at once.
The following prototype implementation in Haskell was used to kick start the Parabix regular expression project: