[antlr-interest] Understanding priorities in lexing (newbie)

Thu Jul 12 06:12:17 PDT 2007

On 12/7/2007, at 13:31, mail.acc at freenet.de wrote:

>> hi. See filter=true mode for lexers.  see fuzzy example in examples-
>> v3 :)
>> Ter
>
>> On 12/7/2007, at 7:59, mail.acc at freenet.de wrote:
>> As Ter has already stated, you need a filtering lexer for this.
>
> I didn't want to use filter=true because I need the arbitrary input
> in between the matching tokens to process. And as far as I understand,
> in filter=true mode unmatched tokens are going to be discarded (and I
> need them in the TokenStream).

Normally unmatched input characters are discarded in filtering mode,  
but not so in your case because your ELSE rule is guaranteed to match  
all otherwise-unmatched chars.
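
To make that concrete, here is a rough Python simulation of the idea. It is
not ANTLR output. The rule names ID and INT, the regular expressions and the
function tokenize are invented for this sketch, and it assumes that ELSE is a
catch-all rule, listed last, that matches any single character.

```python
import re

# Hypothetical rules, tried in the order listed.
# ELSE is assumed to be a catch-all that matches any single character.
RULES = [
    ("ID", re.compile(r"[A-Za-z_]\w*")),
    ("INT", re.compile(r"\d+")),
    ("ELSE", re.compile(r".", re.DOTALL)),
]


def tokenize(text, rules=RULES):
    """Return (name, lexeme) pairs; chars that no rule matches are dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in rules:
            m = pattern.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched here: skip the char, as a filtering lexer would.
            pos += 1
    return tokens


with_else = tokenize("x+42")
without_else = tokenize("x+42", RULES[:-1])
print(with_else)     # the '+' comes out as an ELSE token
print(without_else)  # the '+' is silently dropped
```

With the catch-all in place, joining the lexemes gives back the whole input,
so nothing is lost from the token stream.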

>> prediction. This may seem counter-intuitive at first but you just
>> have to accept that as a basic premise ANTLR is all about speed and
>> that means no backtracking in the event of an error (unless you
>> explicitly turn it on);
>
> Does backtracking work in the lexer also?

I believe so, seeing as all ANTLR recognizers (lexers, parsers and
tree parsers) use the same or similar underlying mechanisms. Lexing is
probably the most computationally expensive phase and so you should  
try to keep backtracking to a minimum. Filtering lexers are a special  
case which wouldn't work at all if it weren't for backtracking.

But note that turning on filtering mode is not exactly the same as  
just turning on backtracking. Try generating two lexers, one with
and one without filtering turned on. If you look at the generated  
code for the filtering lexer you will see some differences, the most  
important of which is that the standard "next token" method/function  
is overridden with one which explicitly handles the backtracking  
behaviour (all rules are tried with backtracking turned on and if  
none succeeds then the current char is discarded and the input stream  
moves on). As always you can learn a lot about ANTLR from studying  
the generated code (and reading the book, and experimenting...)
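
The behaviour described above (try every rule with backtracking, and if none
succeeds discard one char and move on) can be sketched roughly like this in
Python. This is an illustration of the idea only, not the code ANTLR
generates. CharStream, RuleFailed, keyword_rule, next_token and the BEGIN and
BED rules are all made-up names for the example.

```python
class RuleFailed(Exception):
    """Raised when a rule cannot match at the current position."""


class CharStream:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def at_end(self):
        return self.pos >= len(self.text)

    def consume_literal(self, literal):
        # Consume char by char, so a failure can leave the stream part-way
        # through the literal. That is why a rewind is needed afterwards.
        for ch in literal:
            if self.at_end() or self.text[self.pos] != ch:
                raise RuleFailed(literal)
            self.pos += 1


def keyword_rule(word):
    """Build a hypothetical lexer rule that matches one fixed word."""
    def rule(stream):
        stream.consume_literal(word)
    return rule


def next_token(stream, rules):
    """Try every rule at each position; skip one char when all of them fail."""
    while not stream.at_end():
        start = stream.pos              # mark the current position
        for name, rule in rules:
            try:
                rule(stream)
                return name, stream.text[start:stream.pos]
            except RuleFailed:
                stream.pos = start      # rewind before trying the next rule
        stream.pos = start + 1          # no rule matched: discard one char
    return None                         # end of input


rules = [("BEGIN", keyword_rule("begin")), ("BED", keyword_rule("bed"))]
stream = CharStream("be bed begin")
found = []
tok = next_token(stream, rules)
while tok is not None:
    found.append(tok)
    tok = next_token(stream, rules)
print(found)
```

On this input the leading "be " is thrown away one char at a time, because
both rules start to match and then fail, and only "bed" and "begin" come out
as tokens.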

Cheers,
Wincent