[antlr-interest] Simple grammar with error

Johannes Luber jaluber at gmx.de
Sun Sep 16 05:30:10 PDT 2007


Gavin Lambert wrote:
> This is just an example of choosing the wrong name for the token.  Given
> that lexically an '&' might be a "bitwise and" or it might be an
> "address-of operator" (and the lexer will have no idea which one), the
> best name for the token would just be something like AMP.

That's what I get for stealing the token names from the Mono C# compiler. :(
Does this advice apply only to ambiguous token names? Or should tokens
for overloadable operators be named after the symbol as a rule, even if
they are used in only one place in the grammar?

> Leave it to
> parser rules to assign more semantic meaning to it.  Like so:
> 
> tokens {
>   AMP = '&';
>   BITWISE_AND;
>   OP_ADDRESS;
> }
> ...
> addressof_expression
>   : AMP unary_expression -> ^(OP_ADDRESS unary_expression)
>   ;

Isn't OP_ADDRESS[AMP] allowed? Nonetheless, this tip is now on my to-do list.
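
To make the question concrete, here is a minimal, untested sketch of the
idea in ANTLR v3 syntax. It assumes that the documented form for deriving
an imaginary node from a real token is OP_ADDRESS[$AMP] (with the `$`),
which should copy line and column information from the '&' token. The
grammar name AmpSketch, the and_expression rule, and the ID and WS lexer
rules are made up for illustration and are not part of the quoted grammar.

```
grammar AmpSketch;          // hypothetical name

options {
  output = AST;
}

tokens {
  AMP = '&';                // one lexical token, named after the symbol
  BITWISE_AND;              // imaginary: binary use of '&'
  OP_ADDRESS;               // imaginary: unary use of '&'
}

// Binary use: the parser, not the lexer, decides this is a bitwise and.
and_expression
  : a=unary_expression AMP b=unary_expression
      -> ^(BITWISE_AND[$AMP] $a $b)
  ;

unary_expression
  : addressof_expression
  | ID
  ;

// Unary use: the same AMP token becomes an address-of node.
addressof_expression
  : AMP unary_expression -> ^(OP_ADDRESS[$AMP] unary_expression)
  ;

ID : ('a'..'z' | 'A'..'Z' | '_')+ ;
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

With this, generated code and error messages would refer to AMP, while
the tree carries BITWISE_AND or OP_ADDRESS depending on which parser
rule matched.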
> 
> And it's still cleaner (especially when looking at the generated code or
> at error message outputs) to see the token being referred to as AMP
> instead of as T31.
> 
> (And also, if you're modelling a C++-like language that supports
> operator overloading, even BITWISE_AND isn't necessarily a good name,
> since that's an overridable operator and so might end up doing something
> completely different.)
> 
It is C#.

Best regards,
Johannes Luber

