[antlr-interest] How to use LT and LA in predicates?

Johannes Luber jaluber at gmx.de
Sun Apr 22 14:25:53 PDT 2007


David Holroyd wrote:
> On Sat, Apr 21, 2007 at 12:53:43AM +0200, Johannes Luber wrote:
>> In the beta book Terence wrote that one has to define the isTypeName()
>> method used in this rule somewhere else:
>>
>> type_id
>>    :  {isTypeName(input.LT(1).getText())}? ID
>>    ;
>>
>> The problem is that LT seems to return only an integer, according to
>> the Eclipse syntax analyzer. So how do I get the text?
> 
> In the parser, 'input' is a TokenStream instance, which defines,
> 
>   LT(k) => Token
> 
> and (via IntStream, its superclass),
> 
>   LA(k) => int
> 
> 
> Something must be mixed up?

I've been trying to use a few predicates in my grammar to check the
correctness of some input. The first problem was that I had defined all
the functions in the parser, but for the following lexer rule I got the
error that the function isn't defined in the lexer class:

fragment UNICODE_ESCAPE_SEQUENCE[String unicodeClass]
	:	'\\u' {isInCharacterClass($unicodeClass,
			input.LT(1).getText() + input.LT(2).getText()
			+ input.LT(3).getText() + input.LT(4).getText())}?
		HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
	;

isInCharacterClass() takes two parameters: the first is the character
class to which the second may belong. Both are strings, and the second
has the format XXXX (or XXXXXXXX for the second alternative, which I
deleted for space reasons).
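
For illustration, here is a minimal sketch of what such a function might
look like. The real isInCharacterClass() is not shown in this post, so the
class name, the two-letter category names ("Lu", "Nd", ...) and their
mapping to java.lang.Character constants are all assumptions:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the original isInCharacterClass() is not shown in the post.
final class CharacterClassSketch {
    // Assumed mapping from Unicode general category names to Character type constants.
    private static final Map<String, Integer> CATEGORIES = new HashMap<String, Integer>();
    static {
        CATEGORIES.put("Lu", (int) Character.UPPERCASE_LETTER);
        CATEGORIES.put("Ll", (int) Character.LOWERCASE_LETTER);
        CATEGORIES.put("Lt", (int) Character.TITLECASE_LETTER);
        CATEGORIES.put("Lm", (int) Character.MODIFIER_LETTER);
        CATEGORIES.put("Lo", (int) Character.OTHER_LETTER);
        CATEGORIES.put("Nl", (int) Character.LETTER_NUMBER);
        CATEGORIES.put("Nd", (int) Character.DECIMAL_DIGIT_NUMBER);
        CATEGORIES.put("Pc", (int) Character.CONNECTOR_PUNCTUATION);
    }

    // unicodeClass: category name; hexDigits: the XXXX (or XXXXXXXX) part of the escape.
    static boolean isInCharacterClass(String unicodeClass, String hexDigits) {
        Integer expected = CATEGORIES.get(unicodeClass);
        if (expected == null) {
            return false; // unknown category name
        }
        int codePoint;
        try {
            codePoint = Integer.parseInt(hexDigits, 16);
        } catch (NumberFormatException e) {
            return false; // not a hex number, or too large for an int
        }
        if (!Character.isValidCodePoint(codePoint)) {
            return false;
        }
        return Character.getType(codePoint) == expected.intValue();
    }
}
```

With this sketch, isInCharacterClass("Lu", "0041") accepts the escape for
'A', while isInCharacterClass("Lu", "0061") rejects the one for 'a'.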

Moving this function and all accompanying functionality to the lexer
solved the first problem, but now LT() seems to return only an integer.

Here is the generated code for the above rule:

/* D:\\Studium\\Diplomarbeit\\CSharpML\\CSharp3.g:539:4: '\\\\u' {...}?
HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT */
                    {
                    match("\\u"); if (failed) return ;

                    if ( !(isInCharacterClass(unicodeClass,
                            input.LT(1).getText() + input.LT(2).getText() +
                            input.LT(3).getText() + input.LT(4).getText())) ) {
                        if (backtracking>0) {failed=true; return ;}
                        throw new FailedPredicateException(input,
                            "UNICODE_ESCAPE_SEQUENCE",
                            "isInCharacterClass($unicodeClass, input.LT(1).getText() + input.LT(2).getText()\r\n\t\t\t   + input.LT(3).getText() + input.LT(4).getText())");
                    }
                    mHEX_DIGIT(); if (failed) return ;
                    mHEX_DIGIT(); if (failed) return ;
                    mHEX_DIGIT(); if (failed) return ;
                    mHEX_DIGIT(); if (failed) return ;

                    }
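
One possible way to get the next four characters as text inside a lexer
predicate is sketched below. It rests on an assumption that is not
confirmed in this thread: that in the lexer, input is a character stream
whose LA(k) returns the code of the k-th upcoming character as an int, and
-1 at the end of the input. The class name and the lookaheadText() helper
are hypothetical, and the IntUnaryOperator parameter only stands in for
input.LA() so that the sketch compiles without the ANTLR runtime:

```java
import java.util.function.IntUnaryOperator;

// Hypothetical helper; "la" stands in for the lexer's input.LA(k).
final class LookaheadSketch {
    // Collects up to "count" lookahead characters (k = 1..count) into a String.
    // Assumes la returns a character code, or a negative value at end of input.
    static String lookaheadText(IntUnaryOperator la, int count) {
        StringBuilder text = new StringBuilder(count);
        for (int k = 1; k <= count; k++) {
            int c = la.applyAsInt(k);
            if (c < 0) {
                break; // end of input reached before "count" characters
            }
            text.append((char) c);
        }
        return text.toString();
    }
}
```

Under that assumption, the predicate could pass lookaheadText(input::LA, 4)
as the second argument instead of concatenating LT(k).getText() calls.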

The third problem I have is with these rules:

identifier
	:	{!isKeyword(input.LT(1).getText())}?=> AVAILABLE_IDENTIFIER
	|	'@' IDENTIFIER_OR_KEYWORD
	;
	
fragment AVAILABLE_IDENTIFIER
	:	IDENTIFIER_OR_KEYWORD /* An identifier_or_keyword that is not a keyword */
	;
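
For context, a minimal sketch of the kind of isKeyword() helper the
predicate relies on. The actual implementation is not shown in this post;
the class name and the keyword list (a small, illustrative subset of the
C# keywords) are assumptions:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the real isKeyword() is not part of this post.
final class KeywordSketch {
    // Illustrative subset only; a real grammar would list every C# keyword.
    private static final Set<String> KEYWORDS = new HashSet<String>(Arrays.asList(
            "class", "int", "namespace", "return", "using", "void"));

    // Returns true if the token text is exactly one of the listed keywords.
    static boolean isKeyword(String text) {
        return text != null && KEYWORDS.contains(text);
    }
}
```

The comparison is case-sensitive, so "class" is treated as a keyword while
"Class" is not.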

I used a gated predicate ({...}?=>) to force the predicate to be
evaluated, because otherwise the presence or absence of an @ alone
distinguishes the two alternatives. Now I get the error that the class
IntStream doesn't have the method getText(). The following snippet is
generated:

else if ( (LA58_0==AVAILABLE_IDENTIFIER) &&
(!isKeyword(input.LT(1).getText()))) {s = 3;}

So what am I doing wrong?

Best regards,
Johannes Luber

