[antlr-interest] Lexer speed comparison

Wed Feb 25 15:13:06 PST 2004

Hi Uli,

Thanks for running some tests.  Unfortunately, it reveals what I've 
suspected.  There are two problems areas with ANTLR's lexers:

(1) generally huge overhead since I'm mimicking LL parsing
(2) my nextToken generates a poor approximation to a DFA for predicting 
which rule will succeed

I have an interesting solution to this problem for ANTLR 3 that works 
in my prototype (that is, the algorithm computes the right stuff).  It 
merges DFAs to get speed and simplicity of specification and then uses 
full parsing when you need it :)

One of the big areas that will get attention in 3.0 is speed. :)

BTW, i'm building a lexer for Python at the moment and boy will it be 
SLOWWW! ;)

Ter

On Feb 25, 2004, at 2:59 PM, Uli Bubenheimer wrote:

> I ran some tests comparing Lexers generated by different tools in
> terms of speed. The sophistication of my Lexer specification that I
> used as input for the Lexer generators is about that of a lexer for C.
> The average numbers I found by repeatedly running the generated lexers
> on a large source file were these:
>
> ANTLR           8300ms
> JLex            6500ms
> JFlex 1.4pre5   2000ms
> Flex            1900ms
>
> I used current versions of all tools. For the ANTLR comparison I had
> to adjust the Lexer specification, but tried to make it as efficient
> as possible. For the Flex comparison, I basically called Flex's
> yylex() function from Java via JNI. Also in the case of Flex I
> generated fewer Java objects as output than for the others, and if I
> take that into account, Flex's time would probably be more similar to
> JFlex's.
>
> I was surprised to find that ANTLR generated the slowest lexer.
> Naturally, when interpreting these numbers keep in mind that the
> outcome may have been different on a different lexer specification or
> lexer input text. On the other hand, it makes sense to me that a DFA
> would be faster on average than a more powerful recursive-descent 
> parser.
>
> Are there other performance numbers available? I have not been able to
> dig up anything substantial. I also wonder how ANTLR-generated LL(k)
> parsers perform compared to LALR(1) parsers.
>
> Uli
>
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>
>
--
Professor Comp. Sci., University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com
Cofounder, http://www.knowspam.net enjoy email again!
Cofounder, http://www.peerscope.com pure link sharing

Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/