[antlr-interest] Bug in DFA matching?
C. Scott Ananian
cscott at cscott.net
Mon Feb 9 15:14:21 PST 2009
On Mon, Feb 9, 2009 at 3:12 PM, Gavin Lambert <antlr at mirality.co.nz> wrote:
> At 08:55 10/02/2009, C. Scott Ananian wrote:
>>// whitespace at start of line used for INDENT processing
>>INITIAL_WS
>> : {getCharPositionInLine()==1 && !afterIndent}? // at start of line.
>> ( ' ' | TAB )*
>> { this.afterIndent=true; }
>> ;
>>
>>Note the star in the INITIAL_WS rule, which means that *every*
>>line should emit an INITIAL_WS token, possibly matching nothing,
>>before matching anything else.
>
> You must never do that. If a lexer rule can ever match nothing, then it can
> always match nothing, and will therefore produce an infinite number of
> matching-nothing tokens, causing an infinite loop (until you run out of
> memory). Top-level lexer rules must always match at least one character.
I think you misunderstood me, or misread the grammar. It matches
nothing *at the beginning of the line* and then afterIndent is set to
true, so the !afterIndent predicate fails and it doesn't match any more.
That's the intended behavior. It worked in ANTLRv2; it seems to be a
regression in ANTLRv3.
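
To make the intended behavior concrete, here is a minimal hand-written
sketch in plain Java. It is not ANTLR-generated code, and it says nothing
about how the ANTLRv3 DFA actually evaluates the predicate. The class name,
the token spellings, the 0-based line-start test, and the reset of
afterIndent on newline are all assumptions made for illustration; only the
afterIndent flag and the ( ' ' | TAB )* body come from the grammar above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lexer; not produced by ANTLR.
class IndentLexerSketch {
    // Mirrors the grammar's flag: true once INITIAL_WS has fired on this line.
    private boolean afterIndent = false;

    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            // Assumption: "start of line" means position 0 or just after '\n'.
            boolean atLineStart = pos == 0 || input.charAt(pos - 1) == '\n';

            // Gated rule: may match zero characters, but only once per line.
            if (atLineStart && !afterIndent) {
                int start = pos;
                while (pos < input.length()
                        && (input.charAt(pos) == ' ' || input.charAt(pos) == '\t')) {
                    pos++;
                }
                tokens.add("INITIAL_WS(" + (pos - start) + ")");
                afterIndent = true; // gate closes, so an empty match cannot repeat
                continue;
            }

            char c = input.charAt(pos);
            if (c == '\n') {
                tokens.add("NEWLINE");
                pos++;
                afterIndent = false; // assumed reset: reopen the gate for the next line
            } else if (c == ' ' || c == '\t') {
                pos++; // whitespace inside a line is skipped
            } else {
                int start = pos;
                while (pos < input.length()
                        && !Character.isWhitespace(input.charAt(pos))) {
                    pos++;
                }
                tokens.add("WORD(" + input.substring(start, pos) + ")");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(new IndentLexerSketch().tokenize("a\n  b c\n"));
    }
}
```

For the input "a\n  b c\n" this prints
[INITIAL_WS(0), WORD(a), NEWLINE, INITIAL_WS(2), WORD(b), WORD(c), NEWLINE].
The first line yields exactly one empty INITIAL_WS token and the loop
terminates, because the flag blocks a second empty match at the same
position.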
--scott
--
( http://cscott.net/ )