[antlr-interest] ANTLR performance

Jan Obdržálek obdrzalek at gmail.com
Thu Jan 15 07:20:27 PST 2009


Hello,

for our student research project we implemented a C grammar in ANTLR
3.1. (The grammar fully implements ANSI C99, and mostly implements the
GNU C extensions - the notable exception being the attributes, which
are a real pain.) However, we found the performance of the parser to be
a bit lacking, so we ran some tests. In addition to the default
Java target, we adapted the grammar for the C target. We then compared
the results with those obtained from the GNU C compiler. This is a fair
comparison, since gcc also uses a top-down recursive-descent parser
(although a hand-coded one). If anything, the test is tougher on gcc,
since gcc does full compilation, whereas the ANTLR parsers only build
the AST.

The results can be summarized as follows (more details about the test
at the end of this post):

 - ANTLR/Java is clearly the slowest (and has a serious
start-up/shut-down overhead)
 - ANTLR/C is faster, but still miles behind gcc
 - ANTLR/C uses the most system time (analysis with strace points to
many more memory-related system calls)
 - ANTLR/C allocates the most memory: several times as much as gcc
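
The post does not say how the memory figures were obtained. As one
possible way to collect user time, system time and peak memory for a
single run, here is a minimal Python sketch that reads the kernel's
accounting through getrusage. This is an assumption for illustration,
not the method used for these tests; the function name measure and the
default command are hypothetical.

```python
import resource
import subprocess
import sys


def measure(cmd):
    """Run cmd to completion and return CPU times and peak RSS of the child."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    # subprocess.run waits for the child, so its usage is accounted afterwards.
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "user_s": after.ru_utime - before.ru_utime,
        "sys_s": after.ru_stime - before.ru_stime,
        # ru_maxrss is a high-water mark over all children waited for so far
        # (kilobytes on Linux), so measure one command per invocation.
        "max_rss_kb": after.ru_maxrss,
    }


if __name__ == "__main__":
    # Example: python measure.py ./parser
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print(measure(command))
```

Running it once per program (gcc -S a.c, ./parser, java Test) gives
figures that can be compared side by side.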

The comparison isn't rigorous or scientific by any means, but the
times are representative, being roughly the average over many
different runs. I was quite surprised by how much slower ANTLR is
compared to GCC. I also tried to profile the C target; the
results from gprof are attached.

Any comments?

Best regards,
	Jan Obdržálek

--
Dr. Jan Obdržálek <obdrzalek at fi.muni.cz>
Faculty of Informatics, Masaryk University, Brno, Czech Republic


---------------

The grammar:
- backtracking is disabled
- only a few semantic/syntactic predicates
- the most obvious bottlenecks were fixed by left factorization
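
For readers unfamiliar with the term, left factorization pulls a prefix
shared by several alternatives out in front, so the parser decides
between them only after the prefix has been consumed. The toy Python
sketch below is not taken from the grammar; the rule, the token names
and the class Tokens are hypothetical. It models the unfactored rule
naively by retrying each alternative from the start, which is not how
ANTLR's generated code works; it is only meant to show why a shared
prefix costs extra token examinations.

```python
class Tokens:
    """A token stream that counts how often a token is examined."""

    def __init__(self, kinds):
        self.kinds = list(kinds)
        self.pos = 0
        self.reads = 0

    def accept(self, kind):
        self.reads += 1
        if self.pos < len(self.kinds) and self.kinds[self.pos] == kind:
            self.pos += 1
            return True
        return False


def match_seq(ts, kinds):
    """Match a fixed token sequence, rewinding on failure."""
    start = ts.pos
    for kind in kinds:
        if not ts.accept(kind):
            ts.pos = start
            return False
    return True


def stmt_unfactored(ts):
    # stmt : ID '.' ID '=' ID ';' | ID '.' ID '(' ')' ';'
    if match_seq(ts, ["ID", ".", "ID", "=", "ID", ";"]):
        return "assign"
    if match_seq(ts, ["ID", ".", "ID", "(", ")", ";"]):
        return "call"
    raise SyntaxError("expected statement")


def stmt_factored(ts):
    # stmt : ID '.' ID ( '=' ID | '(' ')' ) ';'
    for kind in ("ID", ".", "ID"):
        if not ts.accept(kind):
            raise SyntaxError("expected ID '.' ID")
    if ts.accept("="):
        result, ok = "assign", ts.accept("ID")
    elif ts.accept("("):
        result, ok = "call", ts.accept(")")
    else:
        result, ok = None, False
    if not (ok and ts.accept(";")):
        raise SyntaxError("expected '=' ID ';' or '(' ')' ';'")
    return result


if __name__ == "__main__":
    call = ["ID", ".", "ID", "(", ")", ";"]
    a, b = Tokens(call), Tokens(call)
    print(stmt_unfactored(a), a.reads)
    print(stmt_factored(b), b.reads)
```

For the call statement, the unfactored version examines tokens 10 times
and the factored version 7 times; the gap grows with the length of the
shared prefix.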

The input:
- preprocessed file from the Linux kernel, attributes removed
- ~20Kloc, ~700kB:
~/tmp/test$ wc a.c
 20604  80707 697787 a.c

The results
===========

GCC
---
~/tmp/test$ gcc --version
gcc (Ubuntu 4.3.2-1ubuntu11) 4.3.2

~/tmp/test$ time gcc -S a.c
a.c:1026: warning: conflicting types for built-in function 'vsprintf'
a.c:1030: warning: conflicting types for built-in function 'vsnprintf'
a.c:1041: warning: conflicting types for built-in function 'vsscanf'

real	0m0.425s
user	0m0.384s
sys	0m0.036s

ANTLR/C
-------
~/tmp/test$ make
java -cp lib/antlr-3.1.1.jar org.antlr.Tool  GNUCa.g
ANTLR Parser Generator  Version 3.1.1

~/tmp/test$ time ./parser
Reading a.c

real	0m1.958s
user	0m1.284s
sys	0m0.656s

ANTLR/Java
----------
~/xxx/tests$ java -version
java version "1.6.0_10"
Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)

~/xxx/tests$ time java Test

real	0m3.507s
user	0m5.504s
sys	0m0.332s

NOTE: The Java VM uses 2 CPU cores, which is why user + sys time
exceeds real time.
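
The ratios implied by the three time outputs above can be worked out
directly. The short Python sketch below only does arithmetic on the
figures quoted in this post; the names RUNS, to_seconds and summarize
are made up for the example.

```python
def to_seconds(stamp):
    """Convert a `time` stamp such as '0m1.958s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Timings copied from the results above.
RUNS = {
    "gcc -S": {"real": "0m0.425s", "user": "0m0.384s", "sys": "0m0.036s"},
    "ANTLR/C": {"real": "0m1.958s", "user": "0m1.284s", "sys": "0m0.656s"},
    "ANTLR/Java": {"real": "0m3.507s", "user": "0m5.504s", "sys": "0m0.332s"},
}


def summarize(runs, baseline="gcc -S"):
    """Return per-run ratios relative to the baseline's wall-clock time."""
    base_real = to_seconds(runs[baseline]["real"])
    rows = {}
    for name, t in runs.items():
        real, user, sys_ = (to_seconds(t[k]) for k in ("real", "user", "sys"))
        rows[name] = {
            "slowdown": real / base_real,            # wall-clock time vs. baseline
            "cpu_per_wall": (user + sys_) / real,    # average number of busy cores
            "sys_share": sys_ / (user + sys_),       # fraction of CPU time in kernel
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(RUNS).items():
        print(f"{name:11s} slowdown {row['slowdown']:.2f}x  "
              f"cpu/wall {row['cpu_per_wall']:.2f}  "
              f"sys share {row['sys_share']:.0%}")
```

On these figures ANTLR/C takes roughly 4.6 times the wall-clock time of
gcc and ANTLR/Java roughly 8.3 times. About a third of the CPU time of
ANTLR/C is spent in the kernel, against under a tenth for gcc, and the
Java run keeps about 1.7 cores busy on average, which matches the note
above.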
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parser.gprof
Type: application/octet-stream
Size: 1576 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20090115/0400c8eb/attachment.obj 

