[antlr-interest] Faster expression parsing

Thu Aug 28 13:26:57 PDT 2008

On Aug 28, 2008, at 1:18 PM, Sam Harwell wrote:

> The *giant* advantage in the code example I showed lies in something
> behind the scenes: no extra syntactic or semantic predicates were  
> added,
> so the prediction runs as fast as it did per-level before. On a  
> grammar

I only need 1 very fast semantic predicate,Which will be hidden from  
the user... should be very fast. The difference is that you get  
recursion in the parser that you defer until tree construction time.  
should balance out because neither of us is descending 20 levels any  
more.

> that doesn't require any predicates at the binary expression level,  
> this
> means a 5:1 performance difference for the function. I just checked,  
> and
> 26.96% of the parse time is in 'binary_expression', which is
> 'prefix_expression' separated by 'binary_operator'. 25.41% of the  
> parse
> time is in 'prefix_expression', so the overhead of parsing all 25
> non-assignment binary operations in 12 precedence groups in my program
> is now 1.5% of the parse time.
>

nice work!

You should make a faq entry :)

> Also, the functions build the tree properly accounting for both  
> operator
> associativity and precedence groups. The profiler showed the tree
> construction overhead (the call to createPrecedenceTree) given a flat
> input was 0.27% of the parse time. Keep in mind that my grammar  
> doesn't
> use backtracking or semantic predicates, and is <=LL(2) for most
> predictions. On a grammar that spends more time in those sections, the
> overhead would show as near 0.

My version would be the exact same thing as yours except that I have a  
predicate to dictate whether to recurse.  LL(1) most likely.

> In the grammar I'm performance testing, I don't have any lines that  
> look
> quite like the one you mentioned. The language I'm parsing allows some
> reserved words to be used as variable names, so I had this:
>
> a	: 	literal
> 		-> ^(literal)
> 	|	varName
> 		-> ^(varName)
> 	|
> 	...
> 	;

I cut-and-paste from your e-mail; must be there somewhere.

Also removes those unnecessary tree construction things:

a -> ^(a)

  is the same as

a -> a

  which is the same as

a

The latter should make you go even faster. try it out :)

> It really shows the power of ANTLR3 to sandwich a operator precedence
> parser in it that still allows a grammar-defined primary non-terminal,
> without any changes to the generator or runtime.

hooray!

Ter