[antlr-interest] Performance Issues

Ric Klaren ric.klaren at gmail.com
Fri Oct 21 21:29:18 PDT 2005


On 10/21/05, Bryan Ewbank <ewbank at gmail.com> wrote:
> I've been having some discussions with the "powers that be" where I
> work regarding whether ANTLR is up to the task we have set before it.
> We have it working (lex scanner; ANTLR parser, multiple tree parsers,
> C++/linux), but the performance is pretty bad (~3 seconds to clone the
> AST for a 10K line input file for a C-like language).  A first quick
> glance at gprof doesn't show any obvious outliers or idiocies;
> however, I know there's more to be done with profiling.
>
> I got the impression, several times, that people were pleased with the
> thruput of ANTLR for parsing and tree transformations.  Yes, there are
> a few "classic" tunings required - I'm working thru the information
> from this list over the past few years - but still...
>
> Our productivity is certainly higher with ANTLR, particularly for the
> tree-parsers; however, if customer perceived thruput is "bad"
> (whatever that means, right :-), it's a serious problem.
>
> So, has anyone been holding out on tunings, optimizations, and
> outright tricks that they are using to get ANTLR to eat trees faster?
> Again, I'm in the C++ world, so java suggestions don't help.

If you're not afraid to maintain your own backend library, you could
dumb the ASTs down. E.g. typedef RefAST to a real pointer, then get
rid of the virtuals (if you don't use polymorphism anywhere). Use some
allocator to allocate ASTs between passes, then throw complete trees
away as you go, once you don't need them anymore. Cut all the separate
strings out of the AST and hash them once, then store only ints in the
AST (or only pointers to the strings). (Or don't even carry the
strings around if you don't need them.) There are so many places where
stuff is copied that it's sick.
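
A rough sketch of what such a dumbed-down tree could look like: plain
nodes without virtuals or reference counting, one arena per pass, and
token text interned once so the nodes only carry an int. All names here
(StringTable, Node, NodeArena, cloneTree) are made up for illustration
and are not part of the ANTLR support library. It also assumes a
compiler with std::unordered_map; any hash map would do.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical helper: stores each distinct token text once and hands
// out a small integer id for it.
class StringTable {
public:
    int intern(const std::string& s) {
        auto it = ids_.find(s);
        if (it != ids_.end()) return it->second;   // already known
        int id = static_cast<int>(texts_.size());
        texts_.push_back(s);
        ids_.emplace(s, id);
        return id;
    }
    const std::string& text(int id) const { return texts_[id]; }
    std::size_t size() const { return texts_.size(); }
private:
    std::unordered_map<std::string, int> ids_;
    std::deque<std::string> texts_;   // deque: references stay valid on growth
};

// Hypothetical plain node: no virtuals, no ref counting, raw pointers.
struct Node {
    int type;
    int textId;          // index into a StringTable instead of a std::string
    Node* firstChild;
    Node* nextSibling;
};

// Owns every node of one pass; clearing it frees the whole tree at once.
class NodeArena {
public:
    Node* make(int type, int textId) {
        nodes_.push_back(Node{type, textId, nullptr, nullptr});
        return &nodes_.back();
    }
    std::size_t size() const { return nodes_.size(); }
    void clear() { nodes_.clear(); }
private:
    std::deque<Node> nodes_;   // push_back keeps existing addresses stable
};

// Copies n, its following siblings, and all their subtrees into `out`.
// Siblings are walked in a loop so a long statement list does not
// deepen the recursion; only nesting depth does.
Node* cloneTree(const Node* n, NodeArena& out) {
    Node* head = nullptr;
    Node** link = &head;
    for (; n; n = n->nextSibling) {
        Node* c = out.make(n->type, n->textId);
        c->firstChild = cloneTree(n->firstChild, out);
        *link = c;
        link = &c->nextSibling;
    }
    return head;
}
```

With this layout a clone copies four small fields per node and no
strings, and the previous pass's tree goes away with one clear() on
its arena. Whether that is enough of a win is something only
profiling the real trees can tell.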

There are really a lot of places where you can optimize. The codegen
might need some small massaging, or you could use a sed/tcl/perl
script to fix the generated code (should be easy to include in a
build system).

I guess you could 'prototype' such a hacked library pretty quickly (a
few days), then see if the gain is worth it. If you concentrate only
on the tree stuff it should be manageable, I think. Maybe you could
hack something together that talks native antlr tree on one side and
custom tree on the other.
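
One way such a bridge might look, as a sketch only: a template that
copies any tree exposing getType(), getText(), getFirstChild() and
getNextSibling() into a lean custom node. The assumption is that the
native AST handle offers those four accessors and tests false at the
end of a sibling list; check that against your ANTLR version. LeanNode,
LeanPool, importTree and MockAst are hypothetical names, and MockAst
is just a stand-in for the native node so the example is
self-contained.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Hypothetical lean node used by the hand-tuned tree passes.
struct LeanNode {
    int type = 0;
    std::string text;              // could be an interned id instead
    LeanNode* firstChild = nullptr;
    LeanNode* nextSibling = nullptr;
};

// Owns all lean nodes; a deque keeps addresses stable as it grows.
using LeanPool = std::deque<LeanNode>;

// Copies `src`, its following siblings, and their subtrees into `pool`.
// SrcPtr is any pointer-like type whose pointee has the four accessors
// used below.
template <class SrcPtr>
LeanNode* importTree(SrcPtr src, LeanPool& pool) {
    LeanNode* head = nullptr;
    LeanNode** link = &head;
    for (; src; src = src->getNextSibling()) {
        pool.emplace_back();
        LeanNode* n = &pool.back();
        n->type = src->getType();
        n->text = src->getText();
        n->firstChild = importTree(src->getFirstChild(), pool);
        *link = n;
        link = &n->nextSibling;
    }
    return head;
}

// Stand-in for the native tree node, only here to exercise the sketch.
struct MockAst {
    int type;
    std::string text;
    MockAst* child = nullptr;
    MockAst* sibling = nullptr;
    MockAst(int t, std::string s) : type(t), text(std::move(s)) {}
    int getType() const { return type; }
    std::string getText() const { return text; }
    MockAst* getFirstChild() const { return child; }
    MockAst* getNextSibling() const { return sibling; }
};
```

The parser keeps producing native trees, the import runs once, and
only the expensive tree passes work on the lean nodes. Going back the
other way would need a matching export function.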

Cheers,

Ric

