[antlr-interest] MismatchedTokenException due to rule reference in rewriting rule
Stephanie Balzer
stephanie.balzer at gmail.com
Tue Sep 8 10:34:24 PDT 2009
Hi Jim!
Thanks a lot for your help!
On Tue, Sep 8, 2009 at 6:46 PM, Jim Idle <jimi at temporal-wave.com> wrote:
> Stephanie Balzer wrote:
>
> Hi all,
> In my programming language, I support mathematical set operators such as
> union, intersection, cartesian product, etc. To conform with mathematics, I
> assign the same precedence and associativity to those operators as defined
> in mathematics.
>
> The production dealing with sets therefore looks like:
>
> setExpression
> : ( arithmeticExpression -> arithmeticExpression )
> (
> ( 'cartesianProduct' leftRightChild=arithmeticExpression -> ^(
> 'cartesianProduct' $setExpression $leftRightChild ) )
> (
> 'cartesianProduct' rightRightChild=arithmeticExpression
> -> ^( 'cartesianProduct' $setExpression $rightRightChild )
> )*
> | ( 'composition' leftRightChild=arithmeticExpression -> ^(
> 'composition' $setExpression $leftRightChild ) )
> (
> 'composition' rightRightChild=arithmeticExpression
> -> ^( 'composition' $setExpression $rightRightChild )
> )*
> | ( 'union' leftRightChild=arithmeticExpression -> ^( 'union'
> $setExpression $leftRightChild ) )
> (
> 'union' rightRightChild=arithmeticExpression
> -> ^( 'union' $setExpression $rightRightChild )
> )*
> | ( 'intersection' leftRightChild=arithmeticExpression -> ^(
> 'intersection' $setExpression $leftRightChild ) )
> (
> 'intersection' rightRightChild=arithmeticExpression
> -> ^( 'intersection' $setExpression $rightRightChild )
> )*
> ( 'difference' rightestChild=arithmeticExpression )?
> -> ^( 'difference' $setExpression $rightestChild )
> )?
> ;
>
> arithmeticExpression
> : multiplicativeExpression ( ( '+'^ | '-'^ ) multiplicativeExpression )*
> ;
> ....
>
> primaryExpression
> : ....
> | '(' expression ')'
>
>
> The setExpression rule above probably looks quite complex, but it enforces
> the required associativity (without generating a non-LL(*) decision
> error). In particular, it guarantees that none of the operators (except for
> intersection and difference) can be mixed with each other without using
> parentheses (i.e., "(A union B) intersection C" instead of "A union B
> intersection C") and that cartesianProduct, composition, union, and
> intersection are associative (i.e., A union B union C is fine). Note further
> that I use the rule reference $setExpression in the rewrite rules to
> build the AST in a left-associative manner.
>
> The grammar and tree construction work as intended for expressions like:
>
> A intersection B intersection C difference D
>
> Unfortunately, I get a MismatchedTokenException as soon as I
> parenthesize subsets. For instance, in the example
>
> (A union B) difference C
>
> the exception occurs on seeing 'difference'.
>
> When I run the expression in the ANTLRWorks debugger, I can see that
> ANTLR tries to match a parenthesized expression rather than the whole set
> expression.
>
> On the other hand, if I drop associativity and require the programmer to
> always parenthesize, and thus change the rule setExpression to:
>
> setExpression
> : arithmeticExpression ( ('cartesianProduct'^ | 'composition'^ |
> 'union'^ | 'intersection'^ | 'difference' ^) arithmeticExpression)?
> ;
>
> I can successfully parse the expression (A union B) difference C.
>
> Is there a bug in my rule for setExpression or the rewriting rule? Any
> ideas what could be the problem?
>
>
> The bug is in your setExpression if the syntax you show is meant to be
> valid. Upon seeing the '(', the rule will take arithmeticExpression and
> resolve a parenthesized expression. At that point your rule can only accept
> keywords OTHER than 'difference', and so it throws a syntax error. Perhaps
> you meant to have an | operator before the 'difference' keyword? Check the
> syntax diagram in ANTLRWorks and you will see what I mean: as you have it,
> 'difference' can only follow one or more 'intersection' operators.
>
Oops, now I can see the problem. I fixed it (i.e., added another alternative
to allow difference to follow an arithmeticExpression), and I can now
successfully parse (A union B) difference C.
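
The changed rule itself was not posted, so the following is only a sketch of one shape the fix could take. It assumes an ANTLR 3 grammar with output=AST and the arithmeticExpression rule quoted above. The short labels (p, c, u, i, d1, d2) are made up for illustration, and each "first operand plus ( ... )*" pair from the original is folded into a single ( ... )+ loop, which is a simplification and not something taken from the original rule:

```
setExpression
    : ( arithmeticExpression -> arithmeticExpression )
      (
        // Each loop repeats a single operator, so operators still cannot
        // be mixed without parentheses. $setExpression is the tree built
        // so far, which makes the result left-associative.
        ( 'cartesianProduct' p=arithmeticExpression
            -> ^( 'cartesianProduct' $setExpression $p ) )+
      | ( 'composition' c=arithmeticExpression
            -> ^( 'composition' $setExpression $c ) )+
      | ( 'union' u=arithmeticExpression
            -> ^( 'union' $setExpression $u ) )+
      | ( 'intersection' i=arithmeticExpression
            -> ^( 'intersection' $setExpression $i ) )+
        // The rewrite sits inside the optional block, so it only applies
        // when 'difference' was actually matched.
        ( 'difference' d1=arithmeticExpression
            -> ^( 'difference' $setExpression $d1 ) )?
      | // Added alternative: 'difference' directly after the first operand.
        'difference' d2=arithmeticExpression
            -> ^( 'difference' $setExpression $d2 )
      )?
    ;
```

With a rule of this shape, "(A union B) difference C" would take the last alternative, while "A intersection B intersection C difference D" would still go through the intersection alternative.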
>
> Some other tips:
>
> - Do you need such verbose operators as 'difference'?
> - Take out these literals and make them lexer tokens. If you change your
> mind, you will only need to change the lexer token definition, not the
> grammar. And when you come to add error reporting, you will find it a lot
> easier to deal with an integer called DIFFERENCE than with T24.
>
Thanks for the hint, I will do that.
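
As a rough sketch of that change, assuming ANTLR 3 with output=AST: only the name DIFFERENCE comes from the tip above; the other token names are hypothetical. The parser rule shown is the simpler, always-parenthesized variant from my original mail, rewritten to use the tokens:

```
// Parser rule refers to named tokens instead of quoted literals.
setExpression
    : arithmeticExpression
      ( ( CARTESIAN_PRODUCT^ | COMPOSITION^ | UNION^ | INTERSECTION^ | DIFFERENCE^ )
        arithmeticExpression
      )?
    ;

// Lexer rules: the keyword spelling now lives in exactly one place.
// These would need to appear before any general identifier rule so that
// the keywords are not lexed as identifiers.
CARTESIAN_PRODUCT : 'cartesianProduct' ;
COMPOSITION       : 'composition' ;
UNION             : 'union' ;
INTERSECTION      : 'intersection' ;
DIFFERENCE        : 'difference' ;
```

Renaming an operator later (for example to a shorter spelling) would then touch only the corresponding lexer rule, and error messages can refer to DIFFERENCE rather than a generated name such as T24.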
Stephanie