[antlr-interest] Lexer matching non-matching rule

Micha micha-1 at fantasymail.de
Sat May 16 11:29:31 PDT 2009


On Saturday 16 May 2009 12:52:13 Jesper Larsson wrote:
> On Sat, 2009-05-16 at 08:27 +0530, Indhu Bharathi wrote:
> > This is because on seeing the 'f' of foo the lexer has two options: 1. IDENT
> > 2. URL. It takes the second option since that seems to be longer
> > than the first alternative. Note that the lexer always tries to match
> > the longest token possible.

But it fails to do so here.

>
> I can understand the motivation of this restriction in the interest of
> keeping the lexer target code at a certain complexity level, but I have
> not seen it stated in the documentation. It would have been nice if the
> lexer generator had issued a warning.

Right. I don't like that behavior either. Maybe it's because the ANTLR lexer
works much like a recursive descent parser: it picks a rule and commits to it.
I think it should just return the longest match.
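
To make the difference concrete, here is a small Python sketch (not ANTLR code, and not how ANTLR's prediction works internally). The thread names the rules IDENT and URL but does not show their definitions, so the two patterns below are assumptions, and the function names are made up for illustration. longest_match tries every rule and keeps the longest lexeme; committed_match caricatures a lexer that decides on URL up front and then fails instead of falling back to IDENT.

```python
import re

# Hypothetical token rules: the thread only names IDENT and URL,
# so these patterns are assumptions made for illustration.
RULES = [
    ("URL", re.compile(r"[a-z]+://[a-z./]+")),
    ("IDENT", re.compile(r"[a-z]+")),
]


def longest_match(text, pos=0):
    """Try every rule at pos and return (name, lexeme) for the longest match."""
    best = None
    for name, pattern in RULES:
        m = pattern.match(text, pos)
        # Strict '>' means the earlier rule wins when two lexemes tie in length.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best


def committed_match(text, pos=0):
    """Commit to URL up front and return None if it fails (no fallback)."""
    # Assumption: the lexer has already predicted URL from the leading letter.
    name, pattern = RULES[0]
    m = pattern.match(text, pos)
    return (name, m.group()) if m else None
```

With these assumed rules, longest_match("foo bar") returns ("IDENT", "foo"), while committed_match("foo bar") returns None, which is roughly the failure being discussed. Both return ("URL", "foo://a.b/c") for the input "foo://a.b/c".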

 Michael



