[antlr-interest] MismatchedTokenException due to rule reference in rewriting rule

Stephanie Balzer stephanie.balzer at gmail.com
Tue Sep 8 10:34:24 PDT 2009


Hi Jim!
Thanks a lot for your help!

On Tue, Sep 8, 2009 at 6:46 PM, Jim Idle <jimi at temporal-wave.com> wrote:

>  Stephanie Balzer wrote:
>
> Hi all,
>  In my programming language, I support mathematical set operators such as
> union, intersection, cartesian product, etc. To conform with mathematics, I
> assign the same precedence and associativity to those operators as defined
> in mathematics.
>
>  The production dealing with sets therefore looks like:
>
>  setExpression
>  : ( arithmeticExpression -> arithmeticExpression )
>  (
>  (  'cartesianProduct' leftRightChild=arithmeticExpression -> ^(
> 'cartesianProduct' $setExpression $leftRightChild ) )
>  (
>  'cartesianProduct' rightRightChild=arithmeticExpression
>  -> ^( 'cartesianProduct' $setExpression $rightRightChild )
>  )*
>  | (  'composition' leftRightChild=arithmeticExpression -> ^(
> 'composition' $setExpression $leftRightChild ) )
>  (
>  'composition' rightRightChild=arithmeticExpression
>  -> ^( 'composition' $setExpression $rightRightChild )
>  )*
>  | (  'union' leftRightChild=arithmeticExpression -> ^( 'union'
> $setExpression $leftRightChild ) )
>  (
>  'union' rightRightChild=arithmeticExpression
>  -> ^( 'union' $setExpression $rightRightChild )
>  )*
>  | (  'intersection' leftRightChild=arithmeticExpression -> ^(
> 'intersection' $setExpression $leftRightChild ) )
>  (
>  'intersection' rightRightChild=arithmeticExpression
>  -> ^( 'intersection' $setExpression $rightRightChild )
>  )*
>  ( 'difference' rightestChild=arithmeticExpression )?
>  -> ^( 'difference' $setExpression $rightestChild )
>  )?
>   ;
>
>  arithmeticExpression
>  : multiplicativeExpression ( ( '+'^ | '-'^ ) multiplicativeExpression )*
>  ;
>  ....
>
>  primaryExpression
>         :       ....
>         |       '(' expression ')'
>
>
>  The rule setExpression above probably looks quite complex, but it enforces
> the required associativity (and without generating a non-LL(*) decision
> error). In particular, it guarantees that none of the operators (except for
> intersection and difference) can be mixed with each other without using
> parentheses (i.e., "(A union B) intersection C" instead of "A union B
> intersection C") and that cartesianProduct, composition, union, and
> intersection are associative (i.e., A union B union C is fine). Note further
> that I use the rule reference $setExpression in the rewrite rules to
> build the AST in a left-associative manner.
>
>  The grammar and tree construction work as intended for expressions like:
>
>  A intersection B intersection C difference D
>
>  Unfortunately, I get a MismatchedTokenException as soon as I
> parenthesize subsets. I.e., in the example
>
>  (A union B) difference C
>
>  the exception occurs on seeing 'difference'.
>
>  When I run the expression in the ANTLRWorks debugger, I can see that
> ANTLR tries to match a parenthesized expression rather than the whole set
> expression.
>
>  On the other hand, if I drop associativity, require the programmer to
> always parenthesize, and thus change the rule setExpression to:
>
>  setExpression
>  : arithmeticExpression ( ('cartesianProduct'^ |  'composition'^ |
>  'union'^ | 'intersection'^ | 'difference' ^) arithmeticExpression)?
>  ;
>
>  I can successfully parse the expression (A union B) difference C.
>
>  Is there a bug in my rule for setExpression or the rewriting rule? Any
> ideas what could be the problem?
>
>
> The bug is in your setExpression if the syntax you show is meant to be
> valid. Upon seeing the '(', the rule will take arithmeticExpression and
> resolve a parenthesized expression. Now your rule can only take the
> keywords OTHER than 'difference' here and so throws a syntax error. Perhaps
> you meant to have an | operator before the 'difference' keyword? Check the
> syntax diagram in ANTLRWorks and you will see what I mean: as you have it,
> 'difference' can only follow one or more 'intersection' operations.
>

Oops, now I can see the problem. I fixed it (i.e., added another alternative
that allows difference to directly follow an arithmeticExpression), and I can
now successfully parse (A union B) difference C.
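
For readers following along, here is a minimal sketch of what such an extra
alternative might look like. It is not the exact rule from my grammar: it is
cut down to three operators, and the rule name operand, the grammar name
SetSketch, and the ID/WS lexer rules are assumptions added only to make the
example self-contained (operand stands in for arithmeticExpression).

```antlr
grammar SetSketch;   // hypothetical grammar name, for illustration only

options { output = AST; }

setExpression
    : ( operand -> operand )
      (
        // union may be chained with itself, but not mixed with other operators
        ( 'union' r=operand -> ^( 'union' $setExpression $r ) )+
        // intersection may be chained and optionally followed by one difference
      | ( 'intersection' r=operand -> ^( 'intersection' $setExpression $r ) )+
        ( 'difference' d=operand -> ^( 'difference' $setExpression $d ) )?
        // the added alternative: difference directly after the first operand
      | 'difference' d=operand -> ^( 'difference' $setExpression $d )
      )?
    ;

// Stand-in for arithmeticExpression / primaryExpression (assumption).
operand
    : ID
    | '('! setExpression ')'!
    ;

ID : ( 'a'..'z' | 'A'..'Z' )+ ;
WS : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;
```

With a rule of this shape, the intent is that (A union B) difference C builds
a 'difference' node whose first child is the 'union' subtree, while an
unparenthesized A union B difference C is still rejected.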

>
> Some other tips:
>
>  - Do you need such verbose operators as 'difference'?
>  - Take out these literals and make them lexer tokens. If you change your
> mind, you will only need to change the lexer token definition, not the
> grammar. And when you come to add error reporting, you will find it a lot
> easier to deal with an integer called DIFFERENCE than with T24.
>

Thanks for the hint, I will do that.
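
A sketch of how that might look with the tokens section of a combined
ANTLR 3 grammar. The token names and the rule name differenceTail are
hypothetical and chosen only for illustration; arithmeticExpression is the
rule from the grammar above.

```antlr
// Fragment of a combined ANTLR 3 grammar, not a complete file.
tokens {
    CARTESIAN_PRODUCT = 'cartesianProduct';
    COMPOSITION       = 'composition';
    UNION             = 'union';
    INTERSECTION      = 'intersection';
    DIFFERENCE        = 'difference';
}

// Parser rules then refer to the token name instead of the literal,
// so changing the keyword's spelling only touches the tokens section.
differenceTail
    : DIFFERENCE^ arithmeticExpression
    ;
```

The generated parser should then expose a named constant DIFFERENCE for the
token type rather than an auto-generated name, which is what makes error
reporting easier to write.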

Stephanie