[antlr-interest] Simple grammar with error
Gavin Lambert
antlr at mirality.co.nz
Sun Sep 16 04:07:08 PDT 2007
At 22:09 16/09/2007, Johannes Luber wrote:
>Additionally, in many languages a particular
>operator has overloaded meanings. An example:
>
>addressof_expression
> : BITWISE_AND unary_expression
> ;
>
>where
>
>BITWISE_AND : '&';
>
>Whenever I read BITWISE_AND I have to replace it with '&' and
>reparse it as OP_ADDRESS. I can't use it in the grammar itself
>because there won't be OP_ADDRESS tokens. Of course, I could
>do a rewrite, but it may not be worth the effort. In such cases,
>I'd wish ANTLR would allow to map BITWISE_AND and OP_ADDRESS to
>the same token (although the debugger may be confused).
This is really just a case of choosing the wrong name for the
token. Lexically, an '&' might be a "bitwise and" or it might be
an "address-of" operator, and the lexer has no way of knowing
which, so the best name for the token is something neutral like
AMP. Leave it to the parser rules to assign the semantic meaning.
Like so:
    tokens {
        AMP = '&';
        BITWISE_AND;
        OP_ADDRESS;
    }
    ...
    addressof_expression
        : AMP unary_expression -> ^(OP_ADDRESS unary_expression)
        ;
Giving the token a name is also cleaner than leaving it anonymous
(especially when looking at the generated code or at error message
output): you see the token referred to as AMP instead of by an
auto-generated name such as T31.
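
To make the division of labour concrete outside of ANTLR, here is a
minimal hand-written sketch in Python. It is only an illustration of
the idea, not what ANTLR generates: the lex, Parser and parse names
and the and_expression rule are made up for this example, and the
"language" is just identifiers and '&'. The lexer only ever emits
AMP; the parser decides from position whether that becomes
BITWISE_AND or OP_ADDRESS.

```python
import re

# Hypothetical token names mirroring the grammar: the lexer knows AMP only.
TOKEN_RE = re.compile(r"\s*(?:(?P<ID>[A-Za-z_]\w*)|(?P<AMP>&))")


def lex(text):
    """Split text into (type, text) pairs; '&' is always a plain AMP."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Type of the next token, or "EOF" once the input is used up.
        if self.pos < len(self.tokens):
            return self.tokens[self.pos][0]
        return "EOF"

    def take(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def and_expression(self):
        # AMP between two operands: the parser labels it BITWISE_AND.
        left = self.unary_expression()
        while self.peek() == "AMP":
            self.take("AMP")
            left = ("BITWISE_AND", left, self.unary_expression())
        return left

    def unary_expression(self):
        # AMP in front of an operand: the parser labels it OP_ADDRESS.
        if self.peek() == "AMP":
            self.take("AMP")
            return ("OP_ADDRESS", self.unary_expression())
        return self.take("ID")


def parse(text):
    parser = Parser(lex(text))
    tree = parser.and_expression()
    if parser.peek() != "EOF":
        raise SyntaxError(f"unexpected {parser.peek()} after expression")
    return tree


print(parse("a & &b"))  # ('BITWISE_AND', 'a', ('OP_ADDRESS', 'b'))
```

Both '&' characters in "a & &b" leave the lexer as identical AMP
tokens; only the rule that consumes them gives them different
meanings in the tree.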
(Also, if you're modelling a C++-like language that supports
operator overloading, even BITWISE_AND isn't necessarily a good
name, since that operator can be overloaded and so might end up
doing something completely different.)