[antlr-interest] ANTLR v3 lexer performance

Tue Feb 20 11:25:30 PST 2007

Hi.  Egor Ushakov, from the NetBeans project, has some very
interesting lexer performance numbers.

Terence

Begin forwarded message:
> Hi Terence,
>
> New, interesting results below!
> I created a small, simple lexer grammar to test our grammar
> optimizations, and here are the results.
>
> This simple lexer grammar recognizes only these tokens (and
> whitespace):
> LESS : '<';
> LESSLESS : '<<';
> LESSEQUALS : '<=';
> LESSLESSEQUALS : '<<=';
>
> Actually, my intention was to find out why v3 is slower, but...
>
> our super-tuned :) ANTLR v2, with no exceptions and so on, seems to
> be 30% slower on this grammar than v3!
>
> (smaller is better)
> v2Sun          19859
> v2Sun tuned    17328
> v3             13000
> v3 tuned       12578
>
> "Tuned" here means that we use a combined grammar rule:
>
> FIRST_LESS
>    : '<' (      {$type=LESS;}
>          | '<'  {$type=LESSLESS;}
>          | '='  {$type=LESSEQUALS;}
>          | '<=' {$type=LESSLESSEQUALS;}
>          )
>    ;
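
For readers who want to see the idea outside of grammar syntax, here is a
minimal hand-written Java sketch of what a combined rule like FIRST_LESS
amounts to: match the shared '<' prefix once, then pick the token type from
the characters that follow. The class name, method name, and token type
constants are made up for illustration; this is not the code ANTLR generates.

```java
// Hypothetical hand-written equivalent of a left-factored '<' rule.
// All names and constant values here are illustrative assumptions.
final class LessLexerSketch {
    static final int LESS = 1;
    static final int LESSLESS = 2;
    static final int LESSEQUALS = 3;
    static final int LESSLESSEQUALS = 4;

    /**
     * Tries to match one of '<', '<<', '<=', '<<=' at position pos.
     * Returns {tokenType, tokenLength}, or null if input[pos] is not '<'.
     */
    static int[] matchLess(String input, int pos) {
        if (pos >= input.length() || input.charAt(pos) != '<') {
            return null; // the shared prefix is missing, so nothing matches
        }
        // Look ahead at most two characters; '\0' stands for end of input.
        char c1 = pos + 1 < input.length() ? input.charAt(pos + 1) : '\0';
        char c2 = pos + 2 < input.length() ? input.charAt(pos + 2) : '\0';

        if (c1 == '<') {
            if (c2 == '=') {
                return new int[] {LESSLESSEQUALS, 3}; // "<<="
            }
            return new int[] {LESSLESS, 2};           // "<<"
        }
        if (c1 == '=') {
            return new int[] {LESSEQUALS, 2};         // "<="
        }
        return new int[] {LESS, 1};                   // "<" alone
    }
}
```

The point of the sketch is only that the decision is made once, after the
common prefix, instead of trying four separate rules from the start.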
>
> These results show several things:
> 1. This kind of tuning can potentially give 10% in v2 (that's why
> we already use it :)
> 2. Even without this tuning, v3 is ~20% faster.
> 3. For v3, this kind of tuning gives less improvement, ~4%.
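
The rounded percentages can be checked against the four timings quoted
earlier in the message. The small Java sketch below just does that
arithmetic; the class and method names are made up, and the message does not
say which baseline each percentage refers to, so the pairings in main are an
assumption.

```java
// Recomputes relative improvements from the timings quoted in the message.
// Names are illustrative; which baseline each percentage uses is assumed.
final class TimingDeltas {
    /** Percentage by which 'after' is lower than 'before'. */
    static double percentFaster(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        double v2 = 19859, v2Tuned = 17328, v3 = 13000, v3Tuned = 12578;

        // Tuning in v2: about 12.7
        System.out.printf("v2 tuning gain:    %.1f%%%n", percentFaster(v2, v2Tuned));
        // Untuned v3 against tuned v2: about 25.0
        System.out.printf("v3 vs v2 tuned:    %.1f%%%n", percentFaster(v2Tuned, v3));
        // Untuned v3 against untuned v2: about 34.5
        System.out.printf("v3 vs v2 untuned:  %.1f%%%n", percentFaster(v2, v3));
        // Tuning in v3: about 3.2
        System.out.printf("v3 tuning gain:    %.1f%%%n", percentFaster(v3, v3Tuned));
    }
}
```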
>
> Looking deeper inside the generated code, we can find that v3 has:
> - no text saved in tokens (saving text reduces v3 performance to
> ~16000); the same could be done in v2
> - better-optimized alternative selection code
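
To make the first point concrete, here is a minimal Java sketch of a token
that keeps only offsets into the input and builds its text on demand, so a
lexer run that never asks for the text never pays for the copy. The class
LazyToken and its members are hypothetical names for illustration and are
not taken from either ANTLR runtime.

```java
// Hypothetical token that records offsets instead of copying its text.
final class LazyToken {
    private final CharSequence input; // shared reference to the whole input
    private final int start;          // inclusive start offset
    private final int stop;           // exclusive end offset
    private String text;              // built and cached on first request

    LazyToken(CharSequence input, int start, int stop) {
        this.input = input;
        this.start = start;
        this.stop = stop;
    }

    /** Copies the characters out of the input only when first asked. */
    String getText() {
        if (text == null) {
            text = input.subSequence(start, stop).toString();
        }
        return text;
    }

    /** True once getText() has been called and the text was copied. */
    boolean textMaterialized() {
        return text != null;
    }
}
```

Creating such a token costs three field assignments; the string allocation
is deferred until something actually calls getText().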
>
> Although this test case is very simple, it suggests to me that v3
> was 60% slower on our lexer grammar because of some big
> inefficiency in the code generated for predicates, or something
> similar.
> I will continue the analysis.
>
> Egor