[antlr-interest] Antlr Bug: Failed semantic predicate in lexer triggers endless loop

Ron Hunter-Duvar ron.hunter-duvar at oracle.com
Wed Feb 10 11:52:58 PST 2010


Yes, you're right that I didn't have the predicate coded properly in the 
rule. I haven't seen this limitation mentioned anywhere, but it seems 
that gated semantic predicates within subrules don't work in lexers, 
only in parsers. That is, a failing predicate doesn't simply turn off 
that subrule and let everything else still match. When I moved the 
predicate to the start of the lexer rule and made it a gated semantic 
predicate, it did what I wanted.

But that's not an excuse for Antlr going into an endless loop (it's not 
my code that's looping, it's the Antlr run-time code itself that does 
the loop). Surely you're not going to tell me that this is correct 
run-time behaviour, that Antlr is supposed to go into an endless loop if 
I code a semantic predicate wrong?
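
To make the failure mode concrete, here is a toy model of the loop
described in the original report quoted below. It is only a sketch and
does not use the Antlr run-time: ToyLexerLoop, ToyPredicateFailure,
matchToken and the maxErrors cap are all hypothetical names invented for
illustration. The assumption, taken from the report, is that a failed
predicate leaves the input position unchanged, so without a recovery
step the same error repeats forever.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyLexerLoop {
    /** Hypothetical stand-in for a failed predicate; NOT Antlr's FailedPredicateException. */
    public static class ToyPredicateFailure extends Exception {
        public ToyPredicateFailure(String msg) { super(msg); }
    }

    /** Outcome of one run of the toy loop. */
    public static final class Result {
        public final int errors;          // number of failures reported
        public final boolean reachedEof;  // true if the whole input was consumed
        public final List<String> tokens; // tokens successfully matched
        Result(int errors, boolean reachedEof, List<String> tokens) {
            this.errors = errors;
            this.reachedEof = reachedEof;
            this.tokens = tokens;
        }
    }

    // Hypothetical rule: every character is a token, except that the
    // "predicate" rejects '.'. On failure the position is NOT advanced,
    // which is the behaviour assumed from the report.
    static int matchToken(String input, int pos) throws ToyPredicateFailure {
        if (input.charAt(pos) == '.') {
            throw new ToyPredicateFailure("failed predicate at index " + pos);
        }
        return pos + 1;
    }

    // maxErrors exists only so the demonstration terminates; a real
    // lexer loop without recovery would spin indefinitely.
    public static Result run(String input, boolean recoverOnFailure, int maxErrors) {
        int pos = 0;
        int errors = 0;
        List<String> tokens = new ArrayList<>();
        while (pos < input.length() && errors < maxErrors) {
            try {
                int next = matchToken(input, pos);
                tokens.add(input.substring(pos, next));
                pos = next;
            } catch (ToyPredicateFailure e) {
                errors++;
                if (recoverOnFailure) {
                    pos++; // analogue of calling recover(re): skip one character
                }
            }
        }
        return new Result(errors, pos >= input.length(), tokens);
    }

    public static void main(String[] args) {
        Result stuck = run("1..5", false, 5);
        Result fixed = run("1..5", true, 5);
        System.out.println("no recovery:   errors=" + stuck.errors + " eof=" + stuck.reachedEof);
        System.out.println("with recovery: errors=" + fixed.errors + " eof=" + fixed.reachedEof);
    }
}
```

With the input "1..5", the run without recovery hits the error cap and
never reaches the end of the input, while the run that skips one
character per failure reports two errors and finishes.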

Ron


Jim Idle wrote:
> This just means that you haven't covered the predicated cases correctly. In general it means that you needed a gated predicate, not a simple ? predicate. Post the code for further help.
>
> Basically your lexer should not throw any exceptions or reach unrecognizable input.
>
> Jim
>
>
>   
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org
>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
>> Sent: Wednesday, February 10, 2010 11:25 AM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Antlr Bug: Failed semantic predicate in lexer
>> triggers endless loop
>>
>> Hi,
>>
>> I've run into something that is definitely a bug in Antlr's lexer code:
>> if a semantic predicate fails within a lexer rule, it triggers an
>> endless loop. The problem is in the Lexer.nextToken() method:
>>
>>     public Token nextToken() {
>>         while (true) {
>>             state.token = null;
>>             state.channel = Token.DEFAULT_CHANNEL;
>>             state.tokenStartCharIndex = input.index();
>>             state.tokenStartCharPositionInLine = input.getCharPositionInLine();
>>             state.tokenStartLine = input.getLine();
>>             state.text = null;
>>             if ( input.LA(1)==CharStream.EOF ) {
>>                 return Token.EOF_TOKEN;
>>             }
>>             try {
>>                 mTokens();
>>                 if ( state.token==null ) {
>>                     emit();
>>                 }
>>                 else if ( state.token==Token.SKIP_TOKEN ) {
>>                     continue;
>>                 }
>>                 return state.token;
>>             }
>>             catch (NoViableAltException nva) {
>>                 reportError(nva);
>>                 recover(nva); // throw out current char and try again
>>             }
>>             catch (RecognitionException re) {
>>                 reportError(re);
>>                 // match() routine has already called recover()
>>             }
>>         }
>>     }
>>
>> If a NoViableAltException is thrown, the recover method is called, which
>> consumes one character and continues. But when a semantic predicate
>> fails, it throws a FailedPredicateException, which is a subclass of
>> RecognitionException. As you can see in the code above, the exception is
>> caught and reported, but it then loops around and tries matching again
>> at the same point in the input, resulting in the same exception. Here's
>> a sample of Antlr's output messages:
>>
>> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
>> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
>> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
>> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
>> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
>> ...
>>
>> I was able to work around this easily because I already had a custom
>> lexer superclass, so I just pasted in that nextToken() code and added a
>> "recover(re);" call to the second catch.
>>
>> Ron
>>
>> --
>> Ron Hunter-Duvar | Software Developer V | 403-272-6580
>> Oracle Service Engineering
>> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
>>
>> All opinions expressed here are mine, and do not necessarily represent
>> those of my employer.



