[antlr-interest] Bug in ANTLR with +=, the second

Johannes Luber jaluber at gmx.de
Fri Apr 20 14:01:17 PDT 2007


Hello,

I think I've discovered another bug. The lexer rule
HEXADECIMAL_ESCAPE_SEQUENCE in the following grammar:

grammar Test2;

start
   :	HEXADECIMAL_ESCAPE_SEQUENCE
   ;

HEXADECIMAL_ESCAPE_SEQUENCE
	:	'\\x' (h+=HEX_DIGIT)+ {$h.size()<=4}? // Restrict the number of HEX_DIGITs to a maximum of 4
	;
fragment HEX_DIGIT
	:	'0'..'9'
	|	'A'..'F'
	|	'a'..'f'
	;

generates the following code:

    public final void mHEXADECIMAL_ESCAPE_SEQUENCE() throws RecognitionException {
        try {
            int _type = HEXADECIMAL_ESCAPE_SEQUENCE;
            // D:\\tmp\\antlrworks\\Test2.g:12:4: ( '\\\\x' (h+= HEX_DIGIT )+ {...}?)
            // D:\\tmp\\antlrworks\\Test2.g:12:4: '\\\\x' (h+= HEX_DIGIT )+ {...}?
            {
            match("\\x");

            // D:\\tmp\\antlrworks\\Test2.g:12:10: (h+= HEX_DIGIT )+
            int cnt1=0;
            loop1:
            do {
                int alt1=2;
                int LA1_0 = input.LA(1);

                if ( ((LA1_0>='0' && LA1_0<='9')||(LA1_0>='A' && LA1_0<='F')||(LA1_0>='a' && LA1_0<='f')) ) {
                    alt1=1;
                }


                switch (alt1) {
            	case 1 :
            	    // D:\\tmp\\antlrworks\\Test2.g:12:11: h+= HEX_DIGIT
            	    {
            	    int hStart = getCharIndex();
            	    mHEX_DIGIT();
            	    Token h = new CommonToken(input, Token.INVALID_TOKEN_TYPE, Token.DEFAULT_CHANNEL, hStart, getCharIndex()-1);

            	    }
            	    break;

            	default :
            	    if ( cnt1 >= 1 ) break loop1;
                        EarlyExitException eee =
                            new EarlyExitException(1, input);
                        throw eee;
                }
                cnt1++;
            } while (true);

            if ( !(list_h.size()<=4) ) {
                throw new FailedPredicateException(input, "HEXADECIMAL_ESCAPE_SEQUENCE", "$h.size()<=4");
            }

            }

            this.type = _type;
        }
        finally {
        }
    }
    // $ANTLR end HEXADECIMAL_ESCAPE_SEQUENCE

As you can see, the validating predicate references the variable
list_h, which is never declared and never receives the h tokens: each
loop iteration only creates a local Token h and then discards it.
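
For comparison, here is a minimal sketch of the behaviour one would expect
from the rule: a list is declared before the loop, every matched HEX_DIGIT
is appended to it, and the predicate is checked against its size. This is
plain Java without the ANTLR runtime, not actual generated code. The class
HexEscapeSketch, its methods, and the use of IllegalStateException in place
of EarlyExitException and FailedPredicateException are all assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not ANTLR output.
class HexEscapeSketch {

    // Mirrors the HEX_DIGIT fragment rule.
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'A' && c <= 'F')
            || (c >= 'a' && c <= 'f');
    }

    // Matches '\\x' (h+=HEX_DIGIT)+ {$h.size()<=4}? against the whole input
    // prefix and returns the number of hex digits collected.
    static int matchHexEscape(String input) {
        if (!input.startsWith("\\x")) {
            throw new IllegalStateException("expected \\x");
        }
        int pos = 2;

        // The list the predicate refers to, declared before the loop.
        List<String> list_h = new ArrayList<String>();

        // (h+=HEX_DIGIT)+ : greedy loop, each match is added to the list.
        while (pos < input.length() && isHexDigit(input.charAt(pos))) {
            String h = input.substring(pos, pos + 1);
            list_h.add(h);
            pos++;
        }

        // Stand-in for EarlyExitException: the loop needs at least one digit.
        if (list_h.isEmpty()) {
            throw new IllegalStateException("at least one HEX_DIGIT required");
        }

        // Stand-in for FailedPredicateException: {$h.size()<=4}?
        if (!(list_h.size() <= 4)) {
            throw new IllegalStateException("failed predicate: $h.size()<=4");
        }
        return list_h.size();
    }
}
```

With this logic, an input such as \x1F passes the predicate, while \x12345
is consumed completely by the loop and then rejected by the size check.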

Best regards,
Johannes Luber

