[antlr-interest] Re: dfa-based lexers versus top-down antlr lexers

Tue Apr 29 09:44:07 PDT 2003

I'm thinking that a future version of ANTLR would pretty much support 
whatever "module" you wanted.  Sometimes a DFA-based lexer is easiest 
and sometimes a top-down lexer is better (such as parsing nested syntax 
within HTML tags or whatever).  Naturally you can always hook in your 
own hand-built lexer.

Thanks for the feedback.

On Monday, April 28, 2003, at 03:39  PM, Oliver Zeigermann wrote:

> I see what you mean. Lately a friend of mine asked me how ANTLR
> decides which token rule to use for a production. This made me talk
> for half an hour but I could not really get it explained. He was
> thinking of something like "the most specific rule will be the one
> to match" and I was trying to say "no, it is all very
> deterministic". Anyway, "most specific" is similar to what you came
> up with using left factoring or, as you called it, "combine left
> edges". "Most specific" seems to be the most natural way to
> interpret token definitions. One thing to notice there is
>
> DIGIT1 : '1';
>
> is also more specific than
>
> INT : ('0'..'9')+ ;
>
> which cannot be handled with left factoring, it seems to me. Certain
> rule-based systems, however, are able to make this distinction. How
> do they do this?!

I believe they use the first rule that matches, in the order the rules 
are listed.  The most specific rule has to come first if you want 
things to work right.
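
To make the ordering point concrete, here is a minimal sketch in Python 
(not ANTLR, and not how any particular generator is implemented).  It 
assumes the lex-style convention: the longest match wins, and when two 
rules match the same text the rule listed first wins.  The RULES table 
and the tokenize function are hypothetical names made up for this 
illustration; only DIGIT1 and INT come from the example above.

```python
import re

# Hypothetical rule table for illustration. Order matters: the most
# specific rule (DIGIT1) is listed before the more general one (INT).
RULES = [
    ("DIGIT1", re.compile(r"1")),
    ("INT", re.compile(r"[0-9]+")),
]


def tokenize(text, rules=RULES):
    """Assumed lex-style policy: longest match wins, ties go to the
    rule that appears first in the list."""
    pos = 0
    tokens = []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip whitespace between tokens
            continue
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' means an earlier rule keeps the win on a tie.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    print(tokenize("1 12 7"))
    # Same rules in the opposite order: DIGIT1 can never win the tie.
    print(tokenize("1", rules=list(reversed(RULES))))
```

With DIGIT1 listed first, a lone "1" comes out as DIGIT1 while "12" and 
"7" come out as INT.  With the order reversed, INT ties with DIGIT1 on 
"1" and wins because it is listed first, so DIGIT1 is never produced.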

Ter
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org
Co-founder, http://www.peerscope.com link sharing, pure-n-simple
Lecturer in Comp. Sci., University of San Francisco
