[antlr-interest] wow! antlr runtime just got faster!

Terence Parr parrt at cs.usfca.edu
Wed Jul 26 17:05:36 PDT 2006


Howdy folks,

So I've been playing with the DFA state tables that come out or ANTLR  
(only for complicated decisions) and I told it to avoid making if- 
then-else chains unless a predicate is present.  Turns out that the  
lexer for Java had a big if-then-else chain as the main entry point.   
ugh.  Turning that into an array lookup has *DRAMATICALLY* improved  
the efficiency.

My informal tests show that we shave almost 40% off the overall exec  
speed parsing JDK 1.4 source.  The new DFAs in a parser/lexer combo  
dropped from 7.3 seconds on a java grammar to 4.3.

Using the if-then-else chain:

java -Xmx100M org.antlr.Tool -Xmaxdfaedges 512 java.g

takes about 7.3s to parse 1.4 source tree:

time java -Xms200M -Xmx400M Main /usr/local/javasrc/java

I get about 4.3s to parse.  WOW!

Running a syntax-converted java grammar from Robert Grimm's cool  
Rats! PEG-based tool into ANTLR yields about 6.4s, but with the if- 
then-else it takes 9.4s.  Note that Robert's tool does amazingly  
well: I originally clocked it at around 7.x seconds on my box but  
somehow today it's taking 13s; weird, well assume it's 7.x seconds  
for now til I can investigate.  ANTLR v2 takes 8.8s but on that  
grammar, antlr 3 does like 3.X seconds (haven't checked that one in a  
while).  Javacc does about 4.3s using it's java grammar.  Note Rats!  
does no static analysis; the speed comes from clever optimizations at  
run-time.

The java grammar has lots of unicode so new DFA thing matters for  
it.  For C grammar it shouldn't matter too much if at all.  Anyway,  
interesting little tidbit.

Ter


More information about the antlr-interest mailing list