[antlr-interest] Dumb newbie question dept: Any way to simulate lexical states?

Wed Jun 23 08:34:09 PDT 2010

I need to at least simulate the lexical states I had in JavaCC.  I've been 
banging my head against ANTLR trying to see whether semantic predicates can 
be used to simulate lexer states.  After several days of monkeying with 
things, my tentative conclusion is no: semantic predicates in lexer rules 
only throw FailedPredicateException when false, and do not help the lexer 
decide whether a token will be recognized.  But I remain a bit confused 
about the whole matter; the documentation is really hard to follow.  I've 
read and re-read Terence Parr's chapters on semantic and syntactic 
predicates, but what happens in the lexer vs. the parser is still unclear 
to me.  As far as I can tell, the same semantic predicate syntax does 
totally different things in parser rules vs. lexer rules.

I'm exploring creating a separate lexer for the XML part of the syntax, 
which is what Martin Probst's XQPretty does.  But there's no declarative 
syntax for using multiple lexers, so we'd have to write classes that do the 
pushing and popping at the Java and C++ level.  Also, multiple token 
streams don't seem to work with ANTLRWorks.  And, looking at Martin's 
code, handling multiple scanners does seem to carry a certain complexity: 
making sure one scanner isn't behind another, and the like.
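
To make the "pushing and popping" concrete, here is a minimal sketch in 
plain Java of the kind of thing I have in mind.  It is not ANTLR-generated 
code and it is not Martin's approach; the class name, the two modes, and 
the token names are all made up for illustration.  A single hand-written 
scanner keeps a stack of modes, so the same characters tokenize differently 
depending on which mode is on top:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: one scanner, a stack of lexical modes.
final class ModeStackSketch {
    enum Mode { CODE, XML }

    static List<String> tokenize(String input) {
        Deque<Mode> modes = new ArrayDeque<>();
        modes.push(Mode.CODE);               // start in CODE mode
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (modes.peek() == Mode.CODE) {
                if (Character.isWhitespace(c)) {
                    i++;                     // whitespace is skipped in CODE mode
                } else if (c == '<') {
                    tokens.add("LT");
                    modes.push(Mode.XML);    // enter XML mode
                    i++;
                } else if (c == '}' && modes.size() > 1) {
                    tokens.add("RBRACE");
                    modes.pop();             // return to the enclosing XML mode
                    i++;
                } else if (Character.isLetterOrDigit(c)) {
                    int start = i;
                    while (i < input.length()
                            && Character.isLetterOrDigit(input.charAt(i))) {
                        i++;
                    }
                    tokens.add("WORD(" + input.substring(start, i) + ")");
                } else {
                    throw new IllegalArgumentException(
                            "unexpected '" + c + "' at " + i);
                }
            } else {                         // XML mode: whitespace is significant
                if (c == '>') {
                    tokens.add("GT");
                    modes.pop();             // leave XML mode
                    i++;
                } else if (c == '{') {
                    tokens.add("LBRACE");
                    modes.push(Mode.CODE);   // embedded expression
                    i++;
                } else {
                    int start = i;
                    while (i < input.length()
                            && input.charAt(i) != '>'
                            && input.charAt(i) != '{') {
                        i++;
                    }
                    tokens.add("TEXT(" + input.substring(start, i) + ")");
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a b <a b{c d}>"));
    }
}
```

With that input, the "a b" outside the angle brackets comes out as two WORD 
tokens, while the "a b" inside comes out as a single TEXT token.  That is 
roughly the behavior I'm trying to reproduce.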

Another option is to generate a flex scanner when targeting C++, which is 
not an unattractive possibility (and, hmm... there is a JFlex out there, 
and JLex for that matter).  Is it possible this is my best option?

So, the concrete question is: is it possible in ANTLR3 to filter out sets 
of tokens based on predicates, or do I need ANTLR4, a flex variant, or 
multiple lexers?
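
To be concrete about what I mean by "filter out sets of tokens based on 
predicates", here is a small sketch, again in plain Java rather than ANTLR. 
Every name in it (GatedRulesSketch, the inTag flag, the rule names) is 
hypothetical.  Each token rule carries a guard, and a rule whose guard is 
false is simply invisible to the scanner instead of raising an error:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: token rules that are enabled or disabled by a predicate.
final class GatedRulesSketch {
    private boolean inTag = false;           // the "lexical state"

    private static final class Rule {
        final String name;
        final Pattern pattern;
        final BooleanSupplier enabled;       // guard evaluated before matching

        Rule(String name, String regex, BooleanSupplier enabled) {
            this.name = name;
            this.pattern = Pattern.compile(regex);
            this.enabled = enabled;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    GatedRulesSketch() {
        rules.add(new Rule("LT", "<", () -> !inTag));
        rules.add(new Rule("GT", ">", () -> inTag));
        rules.add(new Rule("NAME", "[A-Za-z]+", () -> inTag));
        rules.add(new Rule("WS", "\\s+", () -> inTag));
        rules.add(new Rule("TEXT", "[^<]+", () -> !inTag));
    }

    List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Rule matched = null;
            int end = -1;
            for (Rule r : rules) {
                if (!r.enabled.getAsBoolean()) {
                    continue;                // gated off: rule is not considered
                }
                Matcher m = r.pattern.matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt()) {         // first enabled rule that matches wins
                    matched = r;
                    end = m.end();
                    break;
                }
            }
            if (matched == null) {
                throw new IllegalStateException(
                        "no enabled rule matches at " + pos);
            }
            if (matched.name.equals("LT")) inTag = true;
            if (matched.name.equals("GT")) inTag = false;
            if (!matched.name.equals("WS")) { // WS is matched but not emitted
                out.add(matched.name + "(" + input.substring(pos, end) + ")");
            }
            pos = end;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new GatedRulesSketch().tokenize("hi there<b id>bye"));
    }
}
```

Here "hi there" is one TEXT token because NAME and WS are switched off 
outside a tag, while "b id" inside the tag becomes two NAME tokens.  What I 
want to know is whether ANTLR3 lexer rules can be gated in a similar way.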

-scott