[antlr-interest] identifiers that are not allowed to be like keywords

Fri Jun 9 08:35:17 PDT 2006

On Jun 9, 2006, at 7:19 AM, Martin Probst wrote:
> Don't worry, ANTLR does this for you automatically (in fact, it's  
> really difficult to get a different behaviour). Sequences of  
> characters will get the ID token type (or something similar). Then  
> they will be tested against the so called Literals table, which  
> contains stuff like 'true'. If it matches, the token type is  
> changed from ID to something different, e.g. LITERAL_true. A rule  
> like:
>
> identifier: ID;
>
> will not match that token then, and everything is fine for you. You  
> can read about that in the manual by looking for the option  
> "testLiterals".

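To make the v2 behaviour Martin describes concrete, here is a rough Python sketch of the idea, not ANTLR's actual code. The names LITERALS, ID_PATTERN and classify are made up for illustration, and the table holds only the one entry mentioned above.

```python
import re

# Hypothetical literals table: keyword text -> token type.
# A real grammar would contribute one entry per keyword.
LITERALS = {"true": "LITERAL_true"}

# Assumed identifier shape for this sketch: one or more lowercase letters.
ID_PATTERN = re.compile(r"[a-z]+")


def classify(text):
    """Return the token type for text, or None if it is not identifier-like."""
    if ID_PATTERN.fullmatch(text) is None:
        return None
    # The text is first treated as an ID, then tested against the literals
    # table; a hit changes the token type from ID to the literal's type.
    return LITERALS.get(text, "ID")
```

With this, classify("true") yields LITERAL_true, so a parser rule that only accepts ID would reject it, while classify("truth") stays an ID.
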
For ANTLR v3, it's even simpler than v2. There is no literals
table. All literals become rules in the lexer, and when two rules
match the same text, as ID and a keyword rule do, the ambiguity is
resolved in favor of the rule mentioned first. So

B : 'begin' ;

ID : 'a'..'z'+ ;

will do what you want. Just mention keywords as literals such as 'if'
in the grammar, and v3 will take care of the rest.
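
The effect of rule order can be approximated in a few lines of Python. This is only a sketch under assumptions: it models the outcome as "longest match wins, ties go to the rule listed first", which is not how v3's lexer is implemented internally, and RULES and next_token are hypothetical names for this example.

```python
import re

# Hypothetical rule table in grammar order: the keyword rule B comes
# before the general ID rule, mirroring the two rules above.
RULES = [
    ("B", re.compile(r"begin")),
    ("ID", re.compile(r"[a-z]+")),
]


def next_token(text, pos=0):
    """Return (rule_name, lexeme) for the best match at pos, or None."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # Only a strictly longer match replaces the current best, so on a
        # tie the rule mentioned first is kept.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best
```

Under this model next_token("begin") picks B because it is listed first, while next_token("beginner") picks ID because only ID can consume the whole word.
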

BTW, we will need this Java grammar built from the spec, so I'm
willing to help you out here. ;)

Ter