[antlr-interest] Re: dfa-based lexers versus top-down antlr lexers

Wed Apr 30 02:00:26 PDT 2003

--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> > 
On Monday, April 28, 2003, at 03:39  PM, Oliver Zeigermann wrote:
> 
> > I see what you mean. Lately a friend of mine asked me how ANTLR
> > decides which token rule to use for a production. This made me 
talk
> > for half an hour but I could not really get it explained. He was
> > thinking of something like "the most specific rule will be the 
one
> > to match" and I was trying to say "no, it is all very
> > deterministic". Anyway, "most specific" is similar to what you 
made
> > up using left factoring or how you called it "combine left
> > edges". "Most specific" seems to be the most natural way to
> > interpret token definitions. One thing to notice there is
> >
> > DIGIT1 : '1';
> >
> > is also more specific than
> >
> > INT : ('0'..'9')+ ;
> >
> > which can not be handeled with left factoring as it seems to me. 
For
> > certain rule based systems it is instead very possible to make 
this
> > distinction. How do they do this?!
> 
> I believe they use the first found rule.  Most specific has to be 
first 
> if you want things to work right.
> 

This holds true for flex and the like, yes. I was thinking of 
something like XSLT. You have a bunch of matching rules there having 
an XPATH expression that specifies the nodes they match. If you have 
two or more rules that match the same node the "most specific" is 
chosen. The ploblem with this approach is XPATH is a bit complicated 
and thus it is not always clear which expression is more specific...

Oliver

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/