[antlr-interest] ANTLR performance
Jan Obdržálek
obdrzalek at gmail.com
Fri Jan 16 06:08:02 PST 2009
Hello,
thank you very much for your help. Here are the answers.
> 1) Did you tune the token and tree node allocation pool to reflect
> something like the number of tokens you might expect (this is really
> just small vs. large)? The default is usually good for most things. This
> grammar is creating 909856 tree nodes according to your stats, and this
> does not seem quite right, but possibly is - if it is, then you need to
> change the allocation pools to be much bigger, in which case the
> algorithm will make far fewer. Have you by chance turned on output=AST
> but not shaped the tree in any way, resulting in a flat tree that must
> be rewritten all the time? The other possibility is the tree building
> mechanism - currently the way that the code is generated is not optimal,
> even though the runtime mechanisms are - we do a lot of stream rewriting
> which isn't strictly speaking necessary, and the gcc compiler won't be
> doing this.
The grammar generates a fairly standard AST (definitely not a flat
one). Turning off the tree-building mechanism made all the difference
- it saved about 80% of the running time. Being on a par with gcc here
(although we still need to remember that gcc does a full compilation)
is quite a nice result for a generated parser.
<<<
~/tmp/test$ time ./parser
Reading a.c
real 0m0.481s
user 0m0.360s
sys 0m0.108s
>>>
The problem is I need the AST :(
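(For context, the toggle I used is just the grammar's options block.
This is a sketch of the standard ANTLR 3 options, not my exact file:)

```
options {
    language = C;    // generate the C target
    output   = AST;  // removing this line disables tree construction
}
```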
> 2) You don't say whether you are doing anything in the grammar, such as
> accessing text and so on - you have to take out any action code to see
> what the actual baseline is for the parser. For instance, accessing
> token text is a convenience function and you would not use it in a
> compiler that requires speed.
There is no action code except for handling typedefs. This is not a problem.
> 3) As per the examples, when you know that the input stream is 8 bit or
> 16 bit, you can set a macro in your grammar such that it traverses the
> input stream directly using pointers and not via function calls. Did
> you turn this on?
I tried
@lexer::header{
#define ANTLR3_INLINE_INPUT_ASCII
}
This actually makes the parser a tiny bit slower than without the #define.
> 4) I would need to see your grammar rather than a gprof (BTW - use
> kcachegrind for better views of this). Realistically, the gcc compiler has
> been worked on for years and is hand crafted, whereas if your grammar has
> predicates or other inefficient methodology, it isn't much of a comparison.
> It may not be so much that ANTLR generates slower code as that the grammar
> could be better, but there is no way to know without your grammar to look at
> (which I would like to see before looking further).
Thanks for suggesting kcachegrind. It is much better than gprof :) I
do not think valgrind existed the last time I did any serious
programming in plain C ... My only complaint is the cycle detection.
Having <cycle1> account for most of the time is not what I really
want. However, if I turn cycle detection off, kcachegrind grabs 100%
of one CPU core and keeps crunching ...
The profiling info in kcachegrind after turning off the tree-building
seems to be about what I expected. To give you an idea:
- 20.6% (inclusive time): mTokens [called 255,744 times] - this is the
lexer part
- 12% (inclusive time): antlr3dfapredict - that's not too bad. I can
definitely decrease this significantly.
- 12% (inclusive time): isTypeName [called 21,800 times] - this
function checks whether a given identifier is a type name. It calls
strcmp about 3,800,000 times (4.13%), which can also be improved.
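The strcmp pressure in isTypeName suggests the typedef table is being
scanned linearly. As a sketch (the data structure and names here are my
own, not taken from the actual grammar), a small chained hash set would
bring each lookup down to one short bucket scan:

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024

typedef struct entry {
    const char   *name;
    struct entry *next;
} entry;

static entry *buckets[NBUCKETS];

static unsigned hash(const char *s)
{
    unsigned h = 5381;                  /* djb2 string hash */
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Record a typedef name seen by the parser's action code. */
void addTypeName(const char *name)
{
    unsigned h = hash(name);
    entry *e = malloc(sizeof *e);
    e->name = name;
    e->next = buckets[h];
    buckets[h] = e;
}

/* One strcmp per bucket entry instead of one per known typedef. */
int isTypeName(const char *name)
{
    for (entry *e = buckets[hash(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return 1;
    return 0;
}
```

With ~22,000 isTypeName calls, hashing first should eliminate the bulk
of those 3.8 million strcmp calls.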
> 5) Are you taking out the time spent reading the file? Are you reading the
> file into memory the same way that gcc does it?
No, I'm not. This is what I do:
input = antlr3AsciiFileStreamNew(fName);
lxr = GNUCaLexerNew(input);
tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT,
TOKENSOURCE(lxr));
psr = GNUCaParserNew(tstream);
psr->translationUnit(psr);
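Regarding item 5, one way to take file I/O out of the measurement would
be to slurp the file into a buffer first and time only the parse. A
minimal, ANTLR-free sketch of the read step (the function name is mine):

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a whole file into a NUL-terminated heap buffer, returning NULL
 * on failure; the caller frees the buffer and gets the length in *len. */
char *readWholeFile(const char *fname, long *len)
{
    FILE *f = fopen(fname, "rb");
    if (!f)
        return NULL;
    fseek(f, 0, SEEK_END);
    *len = ftell(f);
    rewind(f);
    char *buf = malloc(*len + 1);
    if (buf && fread(buf, 1, *len, f) != (size_t)*len) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf)
        buf[*len] = '\0';
    return buf;
}
```

The ANTLR C runtime also has in-memory input stream constructors that
could take such a buffer instead of a filename, though I would have to
check the exact 3.1.1 names.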
> 6) I presume that this is the latest 3.1.1 C runtime?
yep
> At the end of the day though it is difficult to compete against hand crafted
> C code that has had a lot of effort put into optimizing it :-).
Sure, I'm not expecting miracles :) That time without the
tree-building seems reasonable to me.
> Publish your grammars so I can take a look at this.
Sure, the grammar will be published soon under GPL. It just needs some clean-up.
To sum up - the main culprit is the tree-building mechanism. Other
than that, the performance of the generated parser is quite reasonable.
The problem is that the tree-building/rewriting capabilities of ANTLR
are the main reason why we use ANTLR. Second, I need (or would prefer)
a fast Java target. Java does not have to be much slower than C. If
Java is that much slower, we will probably end up using the C target
for the parser and then translating the AST into a Java structure ...
I will turn off the tree-building in Java and see what happens.
Once again, thanks to everybody for their suggestions!
Regards,
Jan
--
Dr. Jan Obdržálek <obdrzalek at fi.muni.cz>
Faculty of Informatics, Masaryk University, Brno, Czech Republic