[antlr-interest] dfa-based lexers versus top-down antlr lexers

Terence Parr parrt at jguru.com
Fri Apr 25 13:20:47 PDT 2003


Does anyone have an opinion concerning ANTLR's construction of top-down 
lexers versus more traditional dfa-based state machine solutions?

I just got really irritated this week building the PostScript 
interpreter due to lexer lookahead issues.

PROS

Very readable lexers.  ('0'..'9')+ turns into a while look you can 
debug/read.

Very powerful...can literally parse (because you can call other rules) 
in the lexer.

Consistent with all other antlr-generated recognizers.  Lexers, 
parsers, tree parsers all use the same analyzer and code generator.

Can execute an action anywhere during the recognition of a token...with 
DFA based systems you have to wait until you know the complete token 
has been match.

CONS

Some lexer rules are a huge pain to specify in ANTLR because of the 
limitations of lookahead.

Lexer rules really confuse people since there are lots of special cases 
in the lookahead and stuff.

I wonder if a compromise is possible.  Use a DFA-based decision to 
determine which rule will match; can do arbitrary lookahead and all of 
that automatically to make a correct decision.  The problem is speed: 
you'd parse twice each token.  Perhaps I can simply combine all rules 
with common left edges automatically to avoid weirdness.  For example:

INT : ('0'..'9')+ ;

FLOAT : ('0'..'9')+ ('.' ('0'..'9')*)? | '.' ('0'..'9')+ ;

could be turned into a combine rule by the system:

INT_FLOAT
	:	('0'..'9')+ ('.' ('0'..'9')*)?
	|	'.' ('0'..'9')+
	;

or whatever.

Still, my lookahead computations are really twisted for lexical rules 
when you can see past end of a rule--literally any character can follow 
a token (if you consider erroneous input).

ANYWAY, who has thoughts on this?  I'd like thoughts also from people 
with *no* experience using DFA-based tools like lex/flex.  Do ANTLR 
lexers seem "natural"?

Thanks!
Ter
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org
Co-founder, http://www.peerscope.com link sharing, pure-n-simple
Lecturer in Comp. Sci., University of San Francisco


 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 




More information about the antlr-interest mailing list