[antlr-interest] Languages where keywords can be used as identifiers

Tue Feb 7 15:13:14 PST 2006

I went through the same thing a long time ago. Here is roughly what I did:

The lexer would always recognize "loop" as a keyword token LOOP.

The grammar would have a rule like:
  unreservedkeyword: LOOP | etc | etc ;

The grammar would use a rule named "id":
  id: ID | unreservedkeyword ;

But enhance that last rule a bit, so that when you add it to the tree, you change the type from LOOP (or whatever keyword) to ID:
  id: ID | urk:unreservedkeyword { #urk.setType(ID); } ;
I probably have the syntax wrong for setType, sorry, this is off the top of my head.

Now your grammar can use:
  "goto" id
and
  datatype id
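
For anyone who wants to see the idea outside of ANTLR, here is a rough hand-written sketch in Python. It is not ANTLR output and not what my grammar looks like; the names (lex, Parser, UNRESERVED, goto_statement, declaration) are made up for illustration, and I am assuming "goto" stays fully reserved and that a datatype is just a plain ID. The point is only that the lexer always hands back the keyword token, and the id rule accepts it and retypes it to ID.

```python
# Token types: GOTO is fully reserved; LOOP and LENGTH are keywords that
# may also be used as identifiers.
ID, GOTO, LOOP, LENGTH, EOF = "ID", "GOTO", "LOOP", "LENGTH", "EOF"

KEYWORDS = {"goto": GOTO, "loop": LOOP, "length": LENGTH}
UNRESERVED = {LOOP, LENGTH}  # plays the role of the "unreservedkeyword" rule


def lex(text):
    """Always report a keyword as its keyword token, never as ID."""
    tokens = [(KEYWORDS.get(word, ID), word) for word in text.split()]
    return tokens + [(EOF, "")]


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def next_token(self):
        token = self.tokens[self.pos]
        if token[0] != EOF:
            self.pos += 1  # never advance past the EOF sentinel
        return token

    def match(self, expected):
        kind, text = self.next_token()
        if kind != expected:
            raise SyntaxError(f"expected {expected}, got {kind} {text!r}")
        return text

    def id(self):
        """id: ID | unreservedkeyword, with the result retyped to ID."""
        kind, text = self.next_token()
        if kind == ID or kind in UNRESERVED:
            return (ID, text)
        raise SyntaxError(f"expected an identifier, got {kind} {text!r}")

    def goto_statement(self):
        """goto_statement: "goto" id"""
        self.match(GOTO)
        return ("goto", self.id())

    def declaration(self):
        """declaration: datatype id (datatype is assumed to be a plain ID)"""
        datatype = self.match(ID)
        return ("decl", datatype, self.id())


print(Parser(lex("goto loop")).goto_statement())    # ('goto', ('ID', 'loop'))
print(Parser(lex("Number length")).declaration())   # ('decl', 'Number', ('ID', 'length'))
```

Something like "goto goto" still fails in this sketch, because GOTO is not listed in UNRESERVED. You decide keyword by keyword which ones are allowed to double as identifiers.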

HTH,
John
john at joanju dot com

Adam Bishop (DSLWN) wrote:
> I am parsing a language where “loop” is a keyword, however a label can 
> be named loop.  The rule for label expects an identifier token, but the 
> lexer will return a loop token.  Is there any way to switch testLiterals 
> for a particular rule?
>
> Ideally the Lexer shouldn’t be doing testLiterals for any usage of the
> token ID in the parser.
>
> NOTE:  To make things worse, I am having this problem wherever I have a
> rule in the parser that expects an identifier
>
> e.g.
>
>   "goto" ID
>
> Will fail for input “goto loop”
>
> And
>
>   datatype ID
>
> will fail for “Number length” (since length is a keyword in another rule)