[antlr-interest] MismatchedTokenException due to rule reference in rewriting rule

Stephanie Balzer stephanie.balzer at gmail.com
Tue Sep 8 05:33:00 PDT 2009


Hi all,
In my programming language, I support mathematical set operators such as
union, intersection, cartesian product, etc. To stay consistent with
mathematical convention, I give these operators the same precedence and
associativity that they have in mathematics.

The production dealing with sets therefore looks like:

setExpression
    : ( arithmeticExpression -> arithmeticExpression )
      (
          ( 'cartesianProduct' leftRightChild=arithmeticExpression
              -> ^( 'cartesianProduct' $setExpression $leftRightChild ) )
          (
              'cartesianProduct' rightRightChild=arithmeticExpression
              -> ^( 'cartesianProduct' $setExpression $rightRightChild )
          )*
      |   ( 'composition' leftRightChild=arithmeticExpression
              -> ^( 'composition' $setExpression $leftRightChild ) )
          (
              'composition' rightRightChild=arithmeticExpression
              -> ^( 'composition' $setExpression $rightRightChild )
          )*
      |   ( 'union' leftRightChild=arithmeticExpression
              -> ^( 'union' $setExpression $leftRightChild ) )
          (
              'union' rightRightChild=arithmeticExpression
              -> ^( 'union' $setExpression $rightRightChild )
          )*
      |   ( 'intersection' leftRightChild=arithmeticExpression
              -> ^( 'intersection' $setExpression $leftRightChild ) )
          (
              'intersection' rightRightChild=arithmeticExpression
              -> ^( 'intersection' $setExpression $rightRightChild )
          )*
          ( 'difference' rightestChild=arithmeticExpression )?
              -> ^( 'difference' $setExpression $rightestChild )
      )?
    ;

arithmeticExpression
: multiplicativeExpression ( ( '+'^ | '-'^ ) multiplicativeExpression )*
 ;
....

primaryExpression
        :       ....
        |       '(' expression ')'


The rule setExpression above probably looks quite complex, but it enforces the
required associativity (without producing a non-LL(*) decision error).
In particular, it guarantees that none of the operators (except for
intersection and difference) can be mixed with each other without
parentheses (i.e., "(A union B) intersection C" instead of "A union B
intersection C") and that cartesianProduct, composition, union, and
intersection are associative (i.e., A union B union C is fine). Note further
that I use the rule reference $setExpression in the rewrite rules to
build the AST in a left-associative manner.
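
The left-associative tree building can be pictured as a fold: each further
operand wraps the tree built so far as the left child of a new node, which is
the role $setExpression plays in the rewrite rules. A minimal Python sketch of
that idea (not ANTLR code; tuples stand in for AST nodes, and
build_left_assoc is a hypothetical helper introduced only for illustration):

```python
def build_left_assoc(op, operands):
    """Fold operands into a left-leaning tree of (op, left, right) tuples."""
    tree = operands[0]
    for right in operands[1:]:
        # The tree built so far becomes the left child of the new node,
        # much like $setExpression in the rewrite rules.
        tree = (op, tree, right)
    return tree


print(build_left_assoc("union", ["A", "B", "C"]))
# ('union', ('union', 'A', 'B'), 'C')
```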

The grammar and tree construction work as intended for expressions like:

A intersection B intersection C difference D

Unfortunately, I get a MismatchedTokenException as soon as I parenthesize
subsets. For instance, in the example

(A union B) difference C

the exception occurs on seeing 'difference'.

When I run the expression in the ANTLRWorks debugger, I can see that ANTLR
tries to match a parenthesized expression rather than the whole set
expression.
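
To check the shape of the rule outside ANTLR, here is a rough hand-written
transliteration of setExpression into a small recursive-descent parser in
Python. It is only a sketch under several assumptions, not ANTLR's generated
code: arithmeticExpression is reduced to an identifier or a parenthesized set
expression (assuming '(' expression ')' eventually reaches setExpression), the
'difference' node is built only when 'difference' is actually present, and a
plain SyntaxError stands in for MismatchedTokenException. The names tokenize,
Parser, and parse are hypothetical.

```python
import re

CHAIN_OPS = ("cartesianProduct", "composition", "union")
KEYWORDS = set(CHAIN_OPS) | {"intersection", "difference"}


def tokenize(text):
    # Identifiers (including operator keywords) and parentheses.
    return re.findall(r"[A-Za-z_]\w*|[()]", text)


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {self.peek()!r}")
        self.pos += 1

    def primary(self):
        # Stand-in for arithmeticExpression (assumption: identifier or parens).
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            tree = self.set_expression()
            self.expect(")")
            return tree
        if tok is None or tok == ")" or tok in KEYWORDS:
            raise SyntaxError(f"expected operand, found {tok!r}")
        self.pos += 1
        return tok

    def chain(self, op, tree):
        # op operand (op operand)*, building a left-leaning tree.
        while self.peek() == op:
            self.pos += 1
            tree = (op, tree, self.primary())
        return tree

    def set_expression(self):
        tree = self.primary()
        op = self.peek()
        if op in CHAIN_OPS:
            tree = self.chain(op, tree)
        elif op == "intersection":
            tree = self.chain(op, tree)
            # 'difference' is only reachable inside this alternative.
            if self.peek() == "difference":
                self.pos += 1
                tree = ("difference", tree, self.primary())
        return tree


def parse(text):
    parser = Parser(text)
    tree = parser.set_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


print(parse("A intersection B intersection C difference D"))
# ('difference', ('intersection', ('intersection', 'A', 'B'), 'C'), 'D')
try:
    parse("(A union B) difference C")
except SyntaxError as err:
    print(err)
# unexpected token 'difference'
```

In this transliteration, 'difference' can only be reached after at least one
'intersection', so the parenthesized operand is consumed and 'difference' is
left over. That would be consistent with the exception I see, although whether
ANTLR's generated parser fails for the same reason is an assumption.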

On the other hand, if I drop associativity and require the programmer to
always parenthesize, changing the rule setExpression to:

setExpression
    : arithmeticExpression
      ( ( 'cartesianProduct'^ | 'composition'^ | 'union'^ | 'intersection'^ | 'difference'^ )
        arithmeticExpression )?
    ;

I can successfully parse the expression (A union B) difference C.

Is there a bug in my rule for setExpression or in its rewrite rules? Any ideas
what the problem could be?

Thanks a lot for your help!

Stephanie