[antlr-interest] Any plans of next ANTLR Release

Ron Burk ronburk at gmail.com
Fri Apr 30 23:21:24 PDT 2010


> According to Cox, Thompson's original algorithm generated
> machine code in the 1960s.

A fair portion of the beauty of his paper was that he could
present mostly complete code in such a short space. However,
the architecture of the machine he was working on fit this
algorithm very well (and Thompson was clever in coding it).
I would not be as blithe as Cox in assuming that modern
machines are fast enough to forgo that advantage
(YAMV - Your App May Vary). Lexical analysis can easily
consume a nontrivial portion of the CPU cost of parsing.
E.g., http://tinyurl.com/yebzy5o claims:

  "From Intel XML Parsing Accelerator, we found that character
checking loop occupies more than 60% CPU cycles of the whole parsing
process, depending on the property of benchmark."

Of course, Thompson's algorithm (and Cox's analysis)
is oriented toward building lots of NFAs on the fly
for short-term use. Lexical analyzer generators have
the luxury of enormous CPU time to spit out optimized
(if not optimal -- though what one might be optimizing
for can vary) DFAs.
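To make the contrast concrete, here is a toy sketch (my own
hypothetical example, not ANTLR's or Thompson's actual output) of the
kind of table-driven scanner a lexer generator emits: all the DFA
construction cost is paid once at generation time, leaving only an
array lookup per input character in the scanning loop.

```c
#include <assert.h>

/* Toy precomputed DFA for the regex [a-z][a-z0-9]* (identifiers).
 * A lexer generator would compute a table like this ahead of time,
 * so the run-time inner loop is just one lookup per character. */

enum { REJECT = 0, START = 1, IDENT = 2, NSTATES = 3 };

/* Character classes: 0 = other, 1 = letter, 2 = digit. */
static int classify(unsigned char c) {
    if (c >= 'a' && c <= 'z') return 1;
    if (c >= '0' && c <= '9') return 2;
    return 0;
}

/* transition[state][class] -> next state, fully precomputed. */
static const int transition[NSTATES][3] = {
    /* REJECT */ { REJECT, REJECT, REJECT },
    /* START  */ { REJECT, IDENT,  REJECT },
    /* IDENT  */ { REJECT, IDENT,  IDENT  },
};

/* Returns 1 if s is a valid identifier, else 0. */
static int match_ident(const char *s) {
    int state = START;
    for (; *s; s++)
        state = transition[state][classify((unsigned char)*s)];
    return state == IDENT;
}
```

A Thompson-style matcher would instead build and simulate the NFA at
match time, which is exactly the per-use overhead that a generated,
precompiled table avoids.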
