[antlr-interest] Re: dfa-based lexers versus top-down antlr lexers
Oliver Zeigermann
oliver at zeigermann.de
Wed Apr 30 02:00:26 PDT 2003
--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> >
On Monday, April 28, 2003, at 03:39 PM, Oliver Zeigermann wrote:
>
> > I see what you mean. Lately a friend of mine asked me how ANTLR
> > decides which token rule to use for a production. This made me
talk
> > for half an hour but I could not really get it explained. He was
> > thinking of something like "the most specific rule will be the
one
> > to match" and I was trying to say "no, it is all very
> > deterministic". Anyway, "most specific" is similar to what you
made
> > up using left factoring or how you called it "combine left
> > edges". "Most specific" seems to be the most natural way to
> > interpret token definitions. One thing to notice there is
> >
> > DIGIT1 : '1';
> >
> > is also more specific than
> >
> > INT : ('0'..'9')+ ;
> >
> > which can not be handeled with left factoring as it seems to me.
For
> > certain rule based systems it is instead very possible to make
this
> > distinction. How do they do this?!
>
> I believe they use the first found rule. Most specific has to be
first
> if you want things to work right.
>
This holds true for flex and the like, yes. I was thinking of
something like XSLT. You have a bunch of matching rules there having
an XPATH expression that specifies the nodes they match. If you have
two or more rules that match the same node the "most specific" is
chosen. The ploblem with this approach is XPATH is a bit complicated
and thus it is not always clear which expression is more specific...
Oliver
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
More information about the antlr-interest
mailing list