[antlr-interest] faster expression parsing

Edwards, Waverly Waverly.Edwards at genesys.com
Wed Mar 19 06:30:21 PDT 2008


If I understand correctly...
If you don't specify right associativity then its assumed to be left.
Order or precedence indicated by the depth of primary expressions
subrules.


That is an outstanding idea.


W.


-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Gavin Lambert
Sent: Wednesday, March 19, 2008 6:55 AM
To: Terence Parr
Cc: antlr-interest Interest
Subject: Re: [antlr-interest] faster expression parsing

At 14:57 19/03/2008, Terence Parr wrote:
 >We'd just need to say something like you request.  I also
 >thought that just using an option would be ok...or, use
 >the left-recursive rule.
 >
 >e	:	e '*' e
 >	|	e '-' e
 >	|	e '+' e
 >	|	'-' e
 >	|	e '.' ID
 >	|	e '[' e ']'
 >	|	e '(' e (',' e)* ')'
 >	|	INT
 >	|	ID
 >	;
 >
 >That's nice 'cause it's explicit like a yacc grammar would be.
 >I'd recognize this pattern and build what i sent before.  Only
 >issue is precedence.  Order would work sort of but probably
 >not perfectly...for example a+b.x should not match as (a+b).x.

Well, for that case I think that's just because the alts are out 
of order :)

Though I don't think you can use alt order alone as an indicator 
of precedence, since there has to be some way to signal equal 
precedence (think "a * b / c".  If division is given higher 
precedence than multiplication then you'll get the wrong answer, 
using integer arithmetic).

Extending that a bit (and including subrule invocation), how about 
something like this:

e : ( ID | INT )
   | '(' e ')'
   | callArrayMember
   | unop
   | ( e '*' e | e '/' e )
   | ( e '+' e | e '-' e )
   | comparison
   | assignment
   ;

callArrayMember
   : e '(' e (',' e)* ')'
   | e '[' e ']'
   | e '.' ID
   ;

unop
options { associativity = right; }
   : '+' e
   | '-' e
   ;

comparison
   : e '==' e
   | e '!=' e
   | e '<' e
   | e '>' e
   | e '<=' e
   | e '>=' e
   ;

assignment
options { associativity = right; }
   : e '=' e
   | e '+=' e
   | e '-=' e
   | e '*=' e
   | e '/=' e
   ;

The idea is that in the "root" rule (which might need a special 
option to make it easily recognisable), each top-level alt 
represents an order of precedence, from highest to 
lowest.  Sub-alts (as in ID & INT, * & /) have equal 
precedence.  If a subrule is called, it's treated like a sub-alt; 
all alts within the subrule have equal precedence.  You can also 
use an option to specify the operator associativity.  (I think 
you're allowed to specify options at the alt level as well, but 
just to be safe [and because it seemed tidier that way] I put them 
in subrules.)

Of course, I'm not sure how hard all of this would be to produce; 
it's like a little mini-grammar in its own right.  And something 
like the callArrayMember subrule would probably be hard to 
generate a good AST automatically for.



More information about the antlr-interest mailing list