[antlr-interest] dfa-based lexers versus top-down antlr lexers

Greg Lindholm glindholm at yahoo.com
Thu May 1 08:06:25 PDT 2003


Sorry, these lexers are commercial (or plan to be) so I 
can't distribute them.  I took a look but couldn't 
find a simple example I could send you.  

But they are easy to create, all you have to do is implement 
the TokenStream interface which is one method nextToken().

For details of how you write a lexer you should read 
http://antlr.org/book/byhand.pdf

Then I would suggest take a look at the java code that gets
generated by some of the lexers in the antlr examples files.

Greg

--- Andrew Deren <andrew at adersoftware.com> wrote:
> I was going to build a hand coded lexer for my project, but couldn't
> fine
> much info on it. Can you please send me one of your lexers to look at
> so
> that I can figure how to do. If they are available of course and not
> commercial.
> Thanks a lot.
> Andrew
> 
> -----Original Message-----
> From: Greg Lindholm [mailto:glindholm at yahoo.com] 
> Sent: Wednesday, April 30, 2003 8:05 AM
> To: antlr-interest at yahoogroups.com
> 
> I've done a couple projects using ANTLR and have not yet 
> been able to use the lexer.  I've had to hand-craft my own 
> lexer's because of column positional token's issues.
> 
> I think it's a strength of ANTLR that you have multiple 
> choices on how to build a lexer. 
> 
> Adding a dfa-based lexer option is a great idea.
> (I wouldn't change the current lexer, I would add second 
> style.)
> 
> 
> --- Terence Parr <parrt at jguru.com> wrote:
> > Does anyone have an opinion concerning ANTLR's construction of
> > top-down 
> > lexers versus more traditional dfa-based state machine solutions?
> > 
> > I just got really irritated this week building the PostScript 
> > interpreter due to lexer lookahead issues.
> > 
> > PROS
> > 
> > Very readable lexers.  ('0'..'9')+ turns into a while look you can 
> > debug/read.
> > 
> > Very powerful...can literally parse (because you can call other
> > rules) 
> > in the lexer.
> > 
> > Consistent with all other antlr-generated recognizers.  Lexers, 
> > parsers, tree parsers all use the same analyzer and code generator.
> > 
> > Can execute an action anywhere during the recognition of a
> > token...with 
> > DFA based systems you have to wait until you know the complete
> token 
> > has been match.
> > 
> > CONS
> > 
> > Some lexer rules are a huge pain to specify in ANTLR because of the
> 
> > limitations of lookahead.
> > 
> > Lexer rules really confuse people since there are lots of special
> > cases 
> > in the lookahead and stuff.
> > 
> > I wonder if a compromise is possible.  Use a DFA-based decision to 
> > determine which rule will match; can do arbitrary lookahead and all
> > of 
> > that automatically to make a correct decision.  The problem is
> speed:
> > 
> > you'd parse twice each token.  Perhaps I can simply combine all
> rules
> > 
> > with common left edges automatically to avoid weirdness.  For
> > example:
> > 
> > INT : ('0'..'9')+ ;
> > 
> > FLOAT : ('0'..'9')+ ('.' ('0'..'9')*)? | '.' ('0'..'9')+ ;
> > 
> > could be turned into a combine rule by the system:
> > 
> > INT_FLOAT
> > 	:	('0'..'9')+ ('.' ('0'..'9')*)?
> > 	|	'.' ('0'..'9')+
> > 	;
> > 
> > or whatever.
> > 
> > Still, my lookahead computations are really twisted for lexical
> rules
> > 
> > when you can see past end of a rule--literally any character can
> > follow 
> > a token (if you consider erroneous input).
> > 
> > ANYWAY, who has thoughts on this?  I'd like thoughts also from
> people
> > 
> > with *no* experience using DFA-based tools like lex/flex.  Do ANTLR
> 
> > lexers seem "natural"?
> > 
> > Thanks!
> > Ter
> > --
> > Co-founder, http://www.jguru.com
> > Creator, ANTLR Parser Generator: http://www.antlr.org
> > Co-founder, http://www.peerscope.com link sharing, pure-n-simple
> > Lecturer in Comp. Sci., University of San Francisco
> > 
> > 
> >  
> > 
> > Your use of Yahoo! Groups is subject to
> > http://docs.yahoo.com/info/terms/ 
> > 
> > 
> 
> 
> __________________________________
> Do you Yahoo!?
> The New Yahoo! Search - Faster. Easier. Bingo.
> http://search.yahoo.com
> 
>  
> 
> Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/ 
> 
> 
> 
> 
>  
> 
> Your use of Yahoo! Groups is subject to
> http://docs.yahoo.com/info/terms/ 
> 
> 


__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com

 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 




More information about the antlr-interest mailing list