[antlr-interest] Antlr Bug: Failed semantic predicate in lexer triggers endless loop

Jim Idle jimi at temporal-wave.com
Wed Feb 10 11:35:01 PST 2010


By the way - there is an explanation of how to lex that situation correctly here:

http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs


Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
> Sent: Wednesday, February 10, 2010 11:25 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Antlr Bug: Failed semantic predicate in lexer
> triggers endless loop
> 
> Hi,
> 
> I've run into something that is definitely a bug in Antlr's lexer code:
> if a semantic predicate fails within a lexer rule, it triggers an
> endless loop. The problem is in the Lexer.nextToken() method:
> 
>     public Token nextToken() {
>         while (true) {
>             state.token = null;
>             state.channel = Token.DEFAULT_CHANNEL;
>             state.tokenStartCharIndex = input.index();
>             state.tokenStartCharPositionInLine = input.getCharPositionInLine();
>             state.tokenStartLine = input.getLine();
>             state.text = null;
>             if ( input.LA(1)==CharStream.EOF ) {
>                 return Token.EOF_TOKEN;
>             }
>             try {
>                 mTokens();
>                 if ( state.token==null ) {
>                     emit();
>                 }
>                 else if ( state.token==Token.SKIP_TOKEN ) {
>                     continue;
>                 }
>                 return state.token;
>             }
>             catch (NoViableAltException nva) {
>                 reportError(nva);
>                 recover(nva); // throw out current char and try again
>             }
>             catch (RecognitionException re) {
>                 reportError(re);
>                 // match() routine has already called recover()
>             }
>         }
>     }
> 
> If a NoViableAltException is thrown, the recover method is called, which
> consumes one character and continues. But when a semantic predicate
> fails, it throws a FailedPredicateException, which is a subclass of
> RecognitionException. As you can see in the code above, the exception is
> caught and reported, but it then loops around and tries matching again
> at the same point in the input, resulting in the same exception. Here's
> a sample of Antlr's output messages:
> 
> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
> line 1:21 rule FLOAT failed predicate: { notIntFollowedByRangeOp() }?
> ...
> 
> I was able to work around this easily because I already had a custom
> lexer superclass, so I just pasted in that nextToken() code and added a
> "recover(re);" call to the second catch.
> 
> Ron
> 
> --
> Ron Hunter-Duvar | Software Developer V | 403-272-6580
> Oracle Service Engineering
> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
> 
> All opinions expressed here are mine, and do not necessarily represent
> those of my employer.
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

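For anyone who wants to see the mechanism in isolation, here is a minimal sketch in plain Java. It is a toy model, not ANTLR code: the class PredicateRecoveryDemo, the FailedPredicate exception, the digits-only rule, and the MAX_ERRORS cap are all made up for illustration. It assumes a rule that fails before consuming any input, and shows that skipping one character in the catch block (the counterpart of the recover(re) call described in the quoted message) lets the loop make progress.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these names come from the ANTLR runtime.
final class PredicateRecoveryDemo {

    // Hypothetical stand-in for a failed semantic predicate.
    static final class FailedPredicate extends Exception {
        FailedPredicate(String message) { super(message); }
    }

    // Cap on reported errors so the "broken" variant terminates in the demo.
    private static final int MAX_ERRORS = 5;

    private final String input;
    private final boolean recoverOnFailure;
    private int pos = 0;
    final List<String> errors = new ArrayList<>();

    PredicateRecoveryDemo(String input, boolean recoverOnFailure) {
        this.input = input;
        this.recoverOnFailure = recoverOnFailure;
    }

    // Matches a run of digits; anything else fails before consuming input.
    private String matchToken() throws FailedPredicate {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (pos == start) {
            throw new FailedPredicate("failed predicate at index " + pos);
        }
        return input.substring(start, pos);
    }

    // Returns the next token, or null at end of input.
    String nextToken() {
        while (true) {
            if (pos >= input.length()) {
                return null;
            }
            try {
                return matchToken();
            } catch (FailedPredicate e) {
                errors.add(e.getMessage());
                if (recoverOnFailure) {
                    pos++; // the fix: skip the offending character and retry
                } else if (errors.size() >= MAX_ERRORS) {
                    // Without recovery, pos never moves; bail out of the demo.
                    throw new IllegalStateException("stuck at index " + pos);
                }
            }
        }
    }

    // Collects tokens until end of input.
    List<String> tokenize() {
        List<String> tokens = new ArrayList<>();
        for (String t = nextToken(); t != null; t = nextToken()) {
            tokens.add(t);
        }
        return tokens;
    }
}
```

With recoverOnFailure set, tokenizing "12#34" yields the tokens 12 and 34 and records one error. With it cleared, the same error repeats at index 2 until the cap trips, which mirrors the repeated "failed predicate" lines in the output above.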



