[antlr-interest] Bug in DFA matching?

Gavin Lambert antlr at mirality.co.nz
Mon Feb 9 12:12:23 PST 2009


At 08:55 10/02/2009, C. Scott Ananian wrote:
 >// whitespace at start of line used for INDENT processing
 >INITIAL_WS
 >	: {getCharPositionInLine()==1 && !afterIndent}? // at start of line.
 >	( ' ' | TAB )*
 >    { this.afterIndent=true; }
 >    ;
 >
 >Note the star in the INITIAL_WS rule, which means that *every*
 >line should emit an INITIAL_WS token, possibly matching nothing,
 >before matching anything else.

You must never do that.  If a lexer rule can ever match nothing, 
then it can always match nothing, and will therefore produce an 
infinite number of matching-nothing tokens, causing an infinite 
loop (until you run out of memory).  Top-level lexer rules must 
always match at least one character.
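
To illustrate the point, here is a minimal toy sketch in Python. It is not ANTLR's actual lexer machinery and it ignores semantic predicates; the tokenize function, the rule lists, and the max_tokens safety cap are all made up for this demonstration. It only shows that a rule which can match the empty string never advances the input position, so the same empty token is emitted over and over.

```python
import re

def tokenize(text, rules, max_tokens=10):
    """Toy lexer loop: at each position, emit the first rule that matches.

    max_tokens is a hypothetical safety cap so the demonstration
    terminates; a real lexer has no such cap and would loop forever.
    """
    tokens = []
    pos = 0
    while pos < len(text) and len(tokens) < max_tokens:
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()  # an empty match leaves pos unchanged
                break
        else:
            raise ValueError(f"no rule matches at position {pos}")
    return tokens

# Star: INITIAL_WS may match nothing, so the loop never advances.
bad_rules = [("INITIAL_WS", r"[ \t]*"), ("WORD", r"\w+")]
# Plus: every token consumes at least one character.
good_rules = [("INITIAL_WS", r"[ \t]+"), ("WORD", r"\w+")]

stuck = tokenize("abc", bad_rules)     # hits the cap with empty tokens only
ok = tokenize("  abc", good_rules)     # consumes the whole input

print(len(stuck), stuck[0])
print(ok)
```

With the star version the loop stops only because of the artificial cap. With the plus version each token consumes input and the loop ends on its own. In the grammar the analogous change would be ( ' ' | TAB )+, with the zero-indent case handled some other way.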


