[antlr-interest] Keywords not context-free

Tue Oct 16 04:56:33 PDT 2007

Gavin Lambert wrote:
> The simplest way to do this is to make a catch-all identifier rule, 
> similar to this:
> identifier
>   : IDENTIFIER | FOO | BAR | BAZ
>   ;
> Where FOO, BAR, and BAZ are tokens representing those specific "words", 
> and IDENTIFIER accepts any other sequence of letters strung together.  
> Consequently the identifier rule will accept any of these in an 
> identifier context, and you can also refer to the FOO, BAR, and BAZ 
> tokens as keywords in some other context.

Yes, that will probably work. It's a pain though, because in this context
all but three of the language's keywords need to be accepted as normal
identifiers, as part of an arbitrary-length string of such identifiers.
It's ugly, because I would need to list every other keyword of the language
in that rule, so I wondered whether there was a better way :-).
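
One way to take some of the pain out of the catch-all rule is to generate
it from the keyword list rather than maintain it by hand. The following is
only a sketch: the function name, the token names, and the idea of keeping
the keyword list in a script are all assumptions for illustration, not
anything ANTLR provides.

```python
def identifier_rule(all_keywords, reserved, rule_name="identifier"):
    """Build the text of a catch-all parser rule.

    all_keywords: token names for every keyword in the language (hypothetical).
    reserved:     token names that must NOT be accepted as identifiers here.
    """
    # Keep the original order so the generated rule is stable between runs.
    allowed = [kw for kw in all_keywords if kw not in reserved]
    alternatives = ["IDENTIFIER"] + allowed
    body = "\n  | ".join(alternatives)
    return f"{rule_name}\n  : {body}\n  ;"


if __name__ == "__main__":
    # Hypothetical token names; SELECT stays a true keyword in this context.
    print(identifier_rule(["FOO", "BAR", "BAZ", "SELECT"], {"SELECT"}))
```

The output can be pasted into the grammar, so the exceptions (three in
my case) are the only keywords that have to be named explicitly.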

I once saw an SQL parser (built on a modified YACC engine) that, on
reaching an error state, checked whether the last token was a keyword
and, if so, backed up one token and retried the parse with that token
treated as an identifier. That way you could have a table called
"select", and other nasty things like that. Given that ANTLR can do
backtracking, I thought that approach might be viable.
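
To make the retry idea concrete, here is a minimal standalone sketch in
Python. It is not ANTLR and not the SQL grammar I saw; the toy
"select <column> from <table>" grammar and all of the names in it are
made up for illustration. On a parse error it checks whether the
offending token is a keyword, reclassifies it as an identifier, and
parses again.

```python
KEYWORDS = {"select", "from"}  # hypothetical keyword set for the toy grammar


class ParseError(Exception):
    def __init__(self, pos, message):
        super().__init__(message)
        self.pos = pos  # index of the token that caused the error


def tokenize(text):
    # Classify each whitespace-separated word as a keyword or an identifier.
    return [("KEYWORD" if w.lower() in KEYWORDS else "IDENT", w)
            for w in text.split()]


def expect(tokens, pos, kind, text=None):
    # Return the token text at pos if it matches, otherwise raise ParseError.
    if pos >= len(tokens):
        raise ParseError(pos, "unexpected end of input")
    tok_kind, tok_text = tokens[pos]
    if tok_kind != kind or (text is not None and tok_text.lower() != text):
        raise ParseError(pos, f"unexpected {tok_text!r}")
    return tok_text


def parse_query(tokens):
    # query : 'select' IDENT 'from' IDENT ;
    expect(tokens, 0, "KEYWORD", "select")
    column = expect(tokens, 1, "IDENT")
    expect(tokens, 2, "KEYWORD", "from")
    table = expect(tokens, 3, "IDENT")
    if len(tokens) > 4:
        raise ParseError(4, "trailing input")
    return {"column": column, "table": table}


def parse_with_fallback(text):
    tokens = tokenize(text)
    while True:
        try:
            return parse_query(tokens)
        except ParseError as err:
            pos = err.pos
            # If the offending token is a keyword, demote it and retry.
            # Each retry demotes one keyword, so the loop terminates.
            if pos < len(tokens) and tokens[pos][0] == "KEYWORD":
                tokens[pos] = ("IDENT", tokens[pos][1])
                continue
            raise


if __name__ == "__main__":
    print(parse_with_fallback("select name from select"))
```

A real parser would restart from the saved state just before the bad
token rather than from the beginning, but the decision being made is
the same: the keyword reading is tried first, and the identifier
reading is used only when the keyword reading fails.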

Clifford Heath.