[antlr-interest] dfa-based lexers versus top-down antlr lexers
Greg Lindholm
glindholm at yahoo.com
Thu May 1 08:06:25 PDT 2003
Sorry, these lexers are commercial (or plan to be) so I
can't distribute them. I took a look but couldn't
find a simple example I could send you.
But they are easy to create, all you have to do is implement
the TokenStream interface which is one method nextToken().
For details of how you write a lexer you should read
http://antlr.org/book/byhand.pdf
Then I would suggest take a look at the java code that gets
generated by some of the lexers in the antlr examples files.
Greg
--- Andrew Deren <andrew at adersoftware.com> wrote:
> I was going to build a hand coded lexer for my project, but couldn't
> fine
> much info on it. Can you please send me one of your lexers to look at
> so
> that I can figure how to do. If they are available of course and not
> commercial.
> Thanks a lot.
> Andrew
>
> -----Original Message-----
> From: Greg Lindholm [mailto:glindholm at yahoo.com]
> Sent: Wednesday, April 30, 2003 8:05 AM
> To: antlr-interest at yahoogroups.com
>
> I've done a couple projects using ANTLR and have not yet
> been able to use the lexer. I've had to hand-craft my own
> lexer's because of column positional token's issues.
>
> I think it's a strength of ANTLR that you have multiple
> choices on how to build a lexer.
>
> Adding a dfa-based lexer option is a great idea.
> (I wouldn't change the current lexer, I would add second
> style.)
>
>
> --- Terence Parr <parrt at jguru.com> wrote:
> > Does anyone have an opinion concerning ANTLR's construction of
> > top-down
> > lexers versus more traditional dfa-based state machine solutions?
> >
> > I just got really irritated this week building the PostScript
> > interpreter due to lexer lookahead issues.
> >
> > PROS
> >
> > Very readable lexers. ('0'..'9')+ turns into a while look you can
> > debug/read.
> >
> > Very powerful...can literally parse (because you can call other
> > rules)
> > in the lexer.
> >
> > Consistent with all other antlr-generated recognizers. Lexers,
> > parsers, tree parsers all use the same analyzer and code generator.
> >
> > Can execute an action anywhere during the recognition of a
> > token...with
> > DFA based systems you have to wait until you know the complete
> token
> > has been match.
> >
> > CONS
> >
> > Some lexer rules are a huge pain to specify in ANTLR because of the
>
> > limitations of lookahead.
> >
> > Lexer rules really confuse people since there are lots of special
> > cases
> > in the lookahead and stuff.
> >
> > I wonder if a compromise is possible. Use a DFA-based decision to
> > determine which rule will match; can do arbitrary lookahead and all
> > of
> > that automatically to make a correct decision. The problem is
> speed:
> >
> > you'd parse twice each token. Perhaps I can simply combine all
> rules
> >
> > with common left edges automatically to avoid weirdness. For
> > example:
> >
> > INT : ('0'..'9')+ ;
> >
> > FLOAT : ('0'..'9')+ ('.' ('0'..'9')*)? | '.' ('0'..'9')+ ;
> >
> > could be turned into a combine rule by the system:
> >
> > INT_FLOAT
> > : ('0'..'9')+ ('.' ('0'..'9')*)?
> > | '.' ('0'..'9')+
> > ;
> >
> > or whatever.
> >
> > Still, my lookahead computations are really twisted for lexical
> rules
> >
> > when you can see past end of a rule--literally any character can
> > follow
> > a token (if you consider erroneous input).
> >
> > ANYWAY, who has thoughts on this? I'd like thoughts also from
> people
> >
> > with *no* experience using DFA-based tools like lex/flex. Do ANTLR
>
> > lexers seem "natural"?
> >
> > Thanks!
> > Ter
> > --
> > Co-founder, http://www.jguru.com
> > Creator, ANTLR Parser Generator: http://www.antlr.org
> > Co-founder, http://www.peerscope.com link sharing, pure-n-simple
> > Lecturer in Comp. Sci., University of San Francisco
> >
> >
> >
> >
> > Your use of Yahoo! Groups is subject to
> > http://docs.yahoo.com/info/terms/
> >
> >
>
>
> __________________________________
> Do you Yahoo!?
> The New Yahoo! Search - Faster. Easier. Bingo.
> http://search.yahoo.com
>
>
>
> Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/
>
>
>
>
>
>
> Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/
>
>
__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
More information about the antlr-interest
mailing list