[antlr-interest] greedy vs nongreedy lexer rules
Ron Burk
ronburk at gmail.com
Sun Apr 18 21:32:22 PDT 2010
> Multi-core systems are the norm now. In my job, we spend a LOT of time
> determining how best to extract maximum work in minimum time, and parallel
> programming is a big part of that.
Splitting the work of lexical analysis across cores without ending up
slower than a simple, single-threaded state machine (due to constant
stalling on shared data)... would be a challenge. It sure is easy to
make multi-threaded solutions that are slower than single-threaded
ones, especially when the single-threaded solution fits in L1 cache.
IME, the ultimate in lexing speed requires tokenizing to proceed
independently of parsing. A function call per token (in C/C++ anyway,
using a deterministic state-driven lexer) tends to swamp the performance
gains you get from letting the lexer proceed in batches; the parser
needs to just advance a pointer to get the next token, not call
anybody. But that's for ultimate speed.