[antlr-interest] How to use LT and LA in predicates?

Sun Apr 22 15:07:17 PDT 2007

In the parser, input is a TokenStream, so LT(k) returns a Token.  In a lexer, it
is an IntStream and returns ints (characters), so there is no getText() to call
on the result.
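
A minimal sketch of what that means for a lexer predicate, written in plain Java rather than against the real ANTLR classes. CharSource and StringSource are hypothetical stand-ins for the lexer's input, and using -1 as the end-of-input marker is an assumption. Since LA(k) hands back one character at a time as an int, the lookahead text has to be assembled by hand:

```java
public class LexerLookahead {
    /** Hypothetical stand-in for the lexer's input: LA(k) is the k-th upcoming character, or -1 at end of input. */
    public interface CharSource {
        int LA(int k);
    }

    /** String-backed source positioned at index pos; for illustration only. */
    public static final class StringSource implements CharSource {
        private final String text;
        private final int pos;

        public StringSource(String text, int pos) {
            this.text = text;
            this.pos = pos;
        }

        @Override
        public int LA(int k) {
            int i = pos + k - 1;
            return (i >= 0 && i < text.length()) ? text.charAt(i) : -1;
        }
    }

    /** Collects up to n lookahead characters into a String, stopping early at end of input. */
    public static String lookaheadText(CharSource input, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int k = 1; k <= n; k++) {
            int c = input.LA(k);
            if (c == -1) {
                break; // assumed end-of-input marker
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Positioned just after the backslash and the letter u, as the predicate would be.
        CharSource input = new StringSource("\\u00E9 rest", 2);
        System.out.println(lookaheadText(input, 4));
    }
}
```

A helper along these lines, defined in the lexer's members, could stand in for the four input.LT(k).getText() calls in the UNICODE_ESCAPE_SEQUENCE predicate.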

--Loring

--- Johannes Luber <jaluber at gmx.de> wrote:

> David Holroyd wrote:
> > On Sat, Apr 21, 2007 at 12:53:43AM +0200, Johannes Luber wrote:
> >> In the beta book, Terence wrote that one has to define the isTypeName()
> >> method used in this rule somewhere else:
> >>
> >> type_id
> >>    :  {isTypeName(input.LT(1).getText())}? ID
> >>    ;
> >>
> >> The problem is that LT seems to return only an integer, according to the
> >> Eclipse syntax analyzer. So how do I get the text?
> > 
> > In the parser, 'input' is a TokenStream instance, which defines,
> > 
> >   LT(k) => Token
> > 
> > and (via IntStream, its superinterface),
> > 
> >   LA(k) => int
> > 
> > 
> > Something must be mixed up?
> 
> I've been trying to use a few predicates in my grammar to determine the
> correctness of some input. The first problem was that I defined all the
> functions in the parser, but for the following rule I got the error
> that the function isn't defined in the lexer class:
> 
> fragment UNICODE_ESCAPE_SEQUENCE[String unicodeClass]
> 	:	'\\u' {isInCharacterClass($unicodeClass,
> 			input.LT(1).getText() + input.LT(2).getText()
> 			+ input.LT(3).getText() + input.LT(4).getText())}?
> 		HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
> 	;
> 
> isInCharacterClass() takes two parameters: the first one is the character
> class to which the second one may belong. Both are strings, and for the
> second one the format is simply XXXX (or XXXXXXXX for the second
> alternative, which I deleted for space reasons).
> 
> Moving this function and all accompanying functionality to the lexer
> solved the first problem, but then LT() seems to return only an integer.
> 
> Here is the generated code for the above rule:
> 
> /* D:\\Studium\\Diplomarbeit\\CSharpML\\CSharp3.g:539:4: '\\\\u' {...}?
>    HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT */
> {
>     match("\\u"); if (failed) return ;
>
>     if ( !(isInCharacterClass(unicodeClass,
>             input.LT(1).getText() + input.LT(2).getText() +
>             input.LT(3).getText() + input.LT(4).getText())) ) {
>         if (backtracking>0) {failed=true; return ;}
>         throw new FailedPredicateException(input,
>             "UNICODE_ESCAPE_SEQUENCE",
>             "isInCharacterClass($unicodeClass, input.LT(1).getText() + input.LT(2).getText()\r\n\t\t\t   + input.LT(3).getText() + input.LT(4).getText())");
>     }
>     mHEX_DIGIT(); if (failed) return ;
>     mHEX_DIGIT(); if (failed) return ;
>     mHEX_DIGIT(); if (failed) return ;
>     mHEX_DIGIT(); if (failed) return ;
>
> }
> 
> The third problem I have is with these rules:
> 
> identifier
> 	:	{!isKeyword(input.LT(1).getText())}?=> AVAILABLE_IDENTIFIER
> 	|	'@' IDENTIFIER_OR_KEYWORD
> 	;
> 	
> fragment AVAILABLE_IDENTIFIER
> 	:	IDENTIFIER_OR_KEYWORD /* An identifier_or_keyword that is not a keyword */
> 	;
> 
> I have used {}?=> to enforce the use of the predicate, as otherwise the
> presence or absence of an @ distinguishes between the cases. Now I
> receive the error that the class IntStream doesn't have the method
> getText(). The following rule snippet is generated:
> 
> else if ( (LA58_0==AVAILABLE_IDENTIFIER) &&
> (!isKeyword(input.LT(1).getText()))) {s = 3;}
> 
> So what am I doing wrong?
> 
> Best regards,
> Johannes Luber
> 
