[antlr-interest] Semantic Predicates in a Lexer

Fri Mar 20 13:01:03 PDT 2009

At 07:29 21/03/2009, Sam Barnett-Cormack wrote:
 >as I work on the additional bits of ASN.1 I'm finding reasons
 >to really want to know what the last 2 tokens generated were,
 >and use those for gating in the lexer. If I could do that
 >easily (i.e. without adding an action to every single lexer
 >rule), it'd make my life easier. I suppose a lexer subclass
 >could do it - override the emit stuff and add functions to
 >access already-generated tokens.

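Well, if you were in the parser you could access 
previously-consumed tokens with input.LT(-1), input.LT(-2), etc. 
(or LA(-1), LA(-2) if you only need the token types).  You can do 
the same thing in the lexer, but there the input is a character 
stream, so it'll give you characters instead of tokens.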
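
To illustrate the lexer side of that, here is a minimal sketch in 
plain Python (not the ANTLR runtime; the class name CharLookback is 
made up for this example) of a character stream where LA(1) is the 
next unread character and LA(-1) is the one just consumed.  It only 
assumes the usual convention that LA(0) is undefined.

```python
class CharLookback:
    """Toy character stream with ANTLR-style relative lookahead/lookback.

    Hypothetical illustration only; this is not the ANTLR CharStream API.
    """

    def __init__(self, text):
        self.text = text
        self.p = 0  # index of the next character to be consumed

    def consume(self):
        # Advance past one character, stopping at end of input.
        if self.p < len(self.text):
            self.p += 1

    def LA(self, i):
        # LA(1) is the next character, LA(-1) the previously consumed one.
        # LA(0) is undefined; out-of-range positions return None.
        if i == 0:
            return None
        if i < 0:
            i += 1  # shift so that -1 maps to the character just before p
        idx = self.p + i - 1
        if idx < 0 or idx >= len(self.text):
            return None
        return self.text[idx]
```

After consuming "ab:" from the input "ab:=", LA(-1) is the character 
':' and LA(-2) is 'b'.  Nothing in there knows which token those 
characters ended up in, which is the limitation described above.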

It'd be hard to write that sort of code anyway.  You can't just 
look at "the single previously generated token", because there's a 
very high chance that it's a comment or whitespace, meaning you'd 
have to keep looking further back until you hit a real token, and 
that quickly gets unwieldy.
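
To make the problem concrete, here is a rough sketch (plain Python 
again, not ANTLR code) of what a "last 2 tokens" history would have 
to do if it were fed from an overridden emit: drop anything that is 
not on the default channel before remembering it.  The Token tuple, 
the TokenHistory class and the channel constants are assumptions for 
illustration; 0 and 99 are meant to mirror the usual default and 
hidden channel numbers.

```python
from collections import deque, namedtuple

# Hypothetical token shape for this sketch; not the ANTLR Token class.
Token = namedtuple("Token", "type text channel")

DEFAULT_CHANNEL = 0   # assumed: channel the parser actually sees
HIDDEN_CHANNEL = 99   # assumed: channel for whitespace and comments


class TokenHistory:
    """Remembers the last few on-channel tokens, ignoring hidden ones."""

    def __init__(self, depth=2):
        # deque(maxlen=...) discards the oldest entry automatically.
        self._recent = deque(maxlen=depth)

    def record(self, token):
        # Call this from wherever tokens are emitted.
        if token.channel == DEFAULT_CHANNEL:
            self._recent.append(token)

    def previous(self, k):
        # k=1 is the most recent significant token, k=2 the one before it.
        # Returns None if fewer than k such tokens have been recorded.
        if k < 1 or k > len(self._recent):
            return None
        return self._recent[-k]
```

A predicate could then ask history.previous(1) and 
history.previous(2), but every gate in the grammar would depend on 
that filtering being right, which is part of why this approach gets 
awkward.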

Probably a better idea is to follow the example in the wiki about 
generating multiple tokens from a single lexer rule.  That way you 
can keep everything under your own control and only emit the set 
once you're sure that's what you really want :)
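
The general shape of that technique, as I understand it, is a queue 
between emit and nextToken: a rule may emit several tokens, and 
nextToken hands them out one at a time.  The sketch below shows the 
idea in plain Python rather than as an ANTLR grammar; QueueingLexer, 
DemoLexer, match_rule and the "1..5" splitting rule are all 
hypothetical names and behaviour invented for the example, not the 
wiki's code.

```python
from collections import deque

EOF_TOKEN = ("EOF", "")  # sentinel for end of input in this sketch


class QueueingLexer:
    """Base class: rules push tokens onto a queue via emit()."""

    def __init__(self):
        self._pending = deque()

    def emit(self, token):
        # Queue the token instead of overwriting a single "current" slot.
        self._pending.append(token)

    def match_rule(self):
        # Subclasses match one rule, emit zero or more tokens, and
        # return False once the input is exhausted.
        raise NotImplementedError

    def next_token(self):
        # Keep matching rules until at least one token is queued.
        while not self._pending:
            if not self.match_rule():
                return EOF_TOKEN
        return self._pending.popleft()


class DemoLexer(QueueingLexer):
    """Toy lexer: one whitespace-separated chunk is one 'rule' match."""

    def __init__(self, text):
        super().__init__()
        self._chunks = deque(text.split())

    def match_rule(self):
        if not self._chunks:
            return False
        chunk = self._chunks.popleft()
        low, sep, high = chunk.partition("..")
        if sep:
            # The whole chunk has been seen, so emit the set together.
            self.emit(("INT", low))
            self.emit(("RANGE", sep))
            self.emit(("INT", high))
        else:
            self.emit(("WORD", chunk))
        return True
```

With the input "x 1..5", repeated calls to next_token() yield a WORD, 
then INT, RANGE, INT from the single second match, then the EOF 
sentinel.  The point is that the rule decides on the whole group 
before anything reaches the parser.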