[antlr-interest] faster expression parsing

Gavin Lambert antlr at mirality.co.nz
Wed Mar 19 03:55:27 PDT 2008


At 14:57 19/03/2008, Terence Parr wrote:
 >We'd just need to say something like you request.  I also
 >thought that just using an option would be ok...or, use
 >the left-recursive rule.
 >
 >e	:	e '*' e
 >	|	e '-' e
 >	|	e '+' e
 >	|	'-' e
 >	|	e '.' ID
 >	|	e '[' e ']'
 >	|	e '(' e (',' e)* ')'
 >	|	INT
 >	|	ID
 >	;
 >
 >That's nice 'cause it's explicit like a yacc grammar would be.
 >I'd recognize this pattern and build what i sent before.  Only
 >issue is precedence.  Order would work sort of but probably
 >not perfectly...for example a+b.x should not match as (a+b).x.

Well, for that case I think that's just because the alts are out 
of order :)

Though I don't think you can use alt order alone as an indicator 
of precedence, since there has to be some way to signal equal 
precedence (think "a * b / c".  If division is given higher 
precedence than multiplication then you'll get the wrong answer, 
using integer arithmetic).

Extending that a bit (and including subrule invocation), how about 
something like this:

e : ( ID | INT )
   | '(' e ')'
   | callArrayMember
   | unop
   | ( e '*' e | e '/' e )
   | ( e '+' e | e '-' e )
   | comparison
   | assignment
   ;

callArrayMember
   : e '(' e (',' e)* ')'
   | e '[' e ']'
   | e '.' ID
   ;

unop
options { associativity = right; }
   : '+' e
   | '-' e
   ;

comparison
   : e '==' e
   | e '!=' e
   | e '<' e
   | e '>' e
   | e '<=' e
   | e '>=' e
   ;

assignment
options { associativity = right; }
   : e '=' e
   | e '+=' e
   | e '-=' e
   | e '*=' e
   | e '/=' e
   ;

The idea is that in the "root" rule (which might need a special 
option to make it easily recognisable), each top-level alt 
represents an order of precedence, from highest to 
lowest.  Sub-alts (as in ID & INT, * & /) have equal 
precedence.  If a subrule is called, it's treated like a sub-alt; 
all alts within the subrule have equal precedence.  You can also 
use an option to specify the operator associativity.  (I think 
you're allowed to specify options at the alt level as well, but 
just to be safe [and because it seemed tidier that way] I put them 
in subrules.)

Of course, I'm not sure how hard all of this would be to produce; 
it's like a little mini-grammar in its own right.  And something 
like the callArrayMember subrule would probably be hard to 
generate a good AST automatically for.



More information about the antlr-interest mailing list