[antlr-interest] Antlr Bug: Failed semantic predicate in lexer triggers endless loop

Jim Idle jimi at temporal-wave.com
Wed Feb 10 12:26:10 PST 2010


No it should not generate code like that if can avoid it, but it is difficult to detect that that is what will happen. How would you go about proving that the code will be an endless loop? http://en.wikipedia.org/wiki/Halting_problem 
 
Instead, the failed predicate exception at runtime shows that it must be your predicate that is going awry.
 
Jim
 
From: Ron Hunter-Duvar [mailto:ron.hunter-duvar at oracle.com] 
Sent: Wednesday, February 10, 2010 11:53 AM
To: Jim Idle
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Antlr Bug: Failed semantic predicate in lexer triggers endless loop
 
Yes, you're right that I didn't have the predicate coded properly in the rule. I haven't seen this limitation mentioned, but it seems that gated semantic predicates within subrules don't work in lexers, only in parsers. That is, they don't just turn off that subrule and make everything else still match. When I moved the predicate to the start of the lexer rule and made it a gated semantic predicate, it did what I wanted.

But that's not an excuse for Antlr going into an endless loop (it's not my code that's looping, it's the Antlr run-time code itself that does the loop). Surely you're not going to tell me that this is correct run-time behaviour, that Antlr is supposed to go into an endless loop if I code a semantic predicate wrong?

Ron


Jim Idle wrote: 
This just means that you haven't covered the predicated cases correctly. In general it means that you needed a gated predicate , not a simple ? predicate. Post the code for further help.
 
Basically your lexer should not throw any exceptions or reach unrecognizable input.
 
Jim
 
 
  
-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
Sent: Wednesday, February 10, 2010 11:25 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Antlr Bug: Failed semantic predicate in lexer
triggers endless loop
 
Hi,
 
I've run into something that is definitely a bug in Antlr's lexer code:
if a semantic predicate fails within a lexer rule, it triggers an
endless loop. The problem is in the Lexer.nextToken() method:
 
    public Token nextToken() {
        while (true) {
            state.token = null;
            state.channel = Token.DEFAULT_CHANNEL;
            state.tokenStartCharIndex = input.index();
            state.tokenStartCharPositionInLine =
input.getCharPositionInLine();
            state.tokenStartLine = input.getLine();
            state.text = null;
            if ( input.LA(1)==CharStream.EOF ) {
                return Token.EOF_TOKEN;
            }
            try {
                mTokens();
                if ( state.token==null ) {
                    emit();
                }
                else if ( state.token==Token.SKIP_TOKEN ) {
                    continue;
                }
                return state.token;
            }
            catch (NoViableAltException nva) {
                reportError(nva);
                recover(nva); // throw out current char and try again
            }
            catch (RecognitionException re) {
                reportError(re);
                // match() routine has already called recover()
            }
        }
    }
 
If a NoViableAltException is thrown, the recover method is called,
which
consumes one character and continues. But when a semantic predicate
fails, it throws a FailedPredicateException, which is a subclass  of
RecognitionException. As you can see in the code above, the exception
is
caught and reported, but it then loops around and tries matching again
at the same point in the input, resulting in the same exception. Here's
a sample of Antlr's output messages:
 
line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
...
 
I was able to work around this easily because I already had a custom
lexer superclass, so I just pasted in that nextToken() code and added a
"recover(re);" call to the second catch.
 
Ron
 
--
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
 
All opinions expressed here are mine, and do not necessarily represent
those of my employer.
 
 
List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
email-address
    
 
 
 
 
List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
 
  



-- 
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
 
All opinions expressed here are mine, and do not necessarily represent
those of my employer.




More information about the antlr-interest mailing list