[antlr-interest] greedy vs nongreedy lexer rules

Mon Apr 19 16:04:06 PDT 2010

> Out of curiosity; may i ask you why "function call per token" is
> something bad for multi-core performance?

That aspect is independent of multiple cores.
If you implement an efficient lexer (e.g., a minimized
DFA), then the overhead of function call tends to be
a significant percentage of the per-token work of the
lexer (based on my experience processing C/C++
source -- YMMV wrt language, though I doubt by
much).

> I am asking because the way
> i see this issue is that pursuing two different paths in NFA has no
> inherent coupling. It is simply matter of state replication and
> merging it back to single outcome state upon entering some barrier
> marking "end of work" state. Am I missing something?

I don't think so. Once you're using an NFA for lexing, you'll
likely be insanely less efficient than a minimized DFA,
so the overhead of a function call per token should be
unnoticed amidst the general slowness :-). If you were
trying to use multiple cores to follow paths in an NFA,
then "merging it back" sure sounds like a place to stall
on attempting to update memory that is shared between
cores. I wouldn't predict multiple cores would be faster for
that without measuring it first on the target CPU. If you
haven't looked at how ugly things have gotten inside
in the last few years, I recommend:

http://www.infoq.com/presentations/click-crash-course-modern-hardware

The root problem for getting anywhere close to the theoretically
possible speeds of putting lexing on a different core from
parsing is the constant stalling over the shared data (the
tokens). Last I looked, Intel did little beneath the covers
to help. Whatever else Google's Go! got wrong, they got
right that communicating sequential processes are the
the only hopeful paradigm for routinely using all that
extra core-ness in software that has any hope of getting
even modestly debugged. Intel must get some kind of
mechanism morally equivalent to a cross-core one-way
pipe working efficiently at some point.  IMHO. :-)