[antlr-interest] Why won't this match...

Sun Feb 24 23:51:51 PST 2008

At 12:27 25/02/2008, Mark Volkmann wrote:
 >I didn't see your post until after I sent my last post.
 >Does that mean it isn't possible to do what Alan wants with the
 >current lexer implementation or could the grammar be modified to
 >do what he wants?

One of the standard workarounds to the "overly optimistic lexer 
problem" (as I call it) is to merge all lexer rules with common 
prefixes:

fragment BIG_TOKEN : 'wibbled';   // just a placeholder
LITTLE_TOKEN
   :  'wi'
      ( ('bbled') => 'bbled' { $type = BIG_TOKEN; } )?
   ;
SEMI_TOKEN: 'bble';

This works because the syntactic predicate forces the lexer to look 
ahead over the whole of 'bbled' before it takes the optional branch; 
if that lookahead fails, the rule simply matches 'wi' as a 
LITTLE_TOKEN.  BIG_TOKEN is declared as a fragment because we need 
a token type id for it, and declaring it in the tokens section 
instead causes a warning message for some weird reason.  A throwaway 
fragment rule defines the type and avoids the warning.
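
To show what the predicate buys you outside of ANTLR, here is a 
minimal hand-written sketch in Python.  It is not what ANTLR 
generates; the function name tokenize is made up for illustration, 
and it only knows the three tokens from the grammar above.  After 
matching 'wi' it checks that all of 'bbled' follows before 
committing to BIG_TOKEN, and otherwise falls back to LITTLE_TOKEN.

```python
def tokenize(text):
    """Hand-rolled imitation of the merged LITTLE_TOKEN/BIG_TOKEN rule.

    Hypothetical helper for illustration only; not ANTLR output.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        if text.startswith("wi", pos):
            # Plays the role of the ('bbled') => predicate: inspect the
            # whole suffix before choosing the optional branch.
            if text.startswith("bbled", pos + 2):
                tokens.append(("BIG_TOKEN", "wibbled"))
                pos += len("wibbled")
            else:
                tokens.append(("LITTLE_TOKEN", "wi"))
                pos += len("wi")
        elif text.startswith("bble", pos):
            tokens.append(("SEMI_TOKEN", "bble"))
            pos += len("bble")
        else:
            raise ValueError("no rule matches at offset %d" % pos)
    return tokens
```

With this, tokenize("wibble") gives a LITTLE_TOKEN followed by a 
SEMI_TOKEN, while tokenize("wibbled") gives a single BIG_TOKEN.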

(The other standard workaround is to have a keyword matching table 
as Ter already suggested.)
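
For comparison, a rough sketch of the keyword table idea in Python.  
This is only my reading of the general technique, not necessarily 
the exact scheme Ter described: one general rule matches a whole 
word, and a table lookup then decides the token type.  The names 
KEYWORDS, WORD and tokenize_with_table are invented for the example, 
as is the assumption that words are separated by whitespace.

```python
import re

# Hypothetical keyword table: matched text -> token type name.
KEYWORDS = {
    "wibbled": "BIG_TOKEN",
    "wi": "LITTLE_TOKEN",
    "bble": "SEMI_TOKEN",
}

# The single general rule: a run of lowercase letters.
WORD = re.compile(r"[a-z]+")


def tokenize_with_table(text):
    """Match whole words first, then retype them via the table."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between words
            continue
        match = WORD.match(text, pos)
        if match is None:
            raise ValueError("no rule matches at offset %d" % pos)
        word = match.group()
        # Anything not in the table keeps the generic WORD type.
        tokens.append((KEYWORDS.get(word, "WORD"), word))
        pos = match.end()
    return tokens
```

Note that under this scheme the input "wibble" comes out as one 
generic WORD rather than 'wi' followed by 'bble', so it fits best 
when tokens are delimited by something other than each other.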