[antlr-interest] Problem using predicates in V3

Dr. Hartmut Kocher hwk.cortex-brainware at t-online.de
Sun Feb 11 13:38:39 PST 2007


No, it doesn't work. I have appended the C# code (which is the same for Java).

Here's the rule:

ID  :   ('abc' DIGIT) => 'abc'
    |   ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+
    ;

If you look at the code, you see that it correctly checks for the predicate.
It then matches the string, but this does not emit a token!
Instead it emits a token at the end of the ID rule, with type ID.

This is the part I don't understand, or where a bug might hide...

Hartmut


Here comes the code:

if ( (LA3_0 == 'a') )
{
    int LA3_1 = input.LA(2);
    if ( (LA3_1 == 'b') )
    {
        int LA3_3 = input.LA(3);
        if ( (LA3_3 == 'c') )
        {
            if ( (synpred1()) )
            {
                alt3 = 1;
            }

The predicate matches, so alt3 = 1.

Now the rule is processed:

switch (alt3)
{
    case 1 :
        // Test.g:30:7: ( 'abc' DIGIT )=> 'abc'
        {
            Match("abc"); if (failed) return ;
        }
        break;

This matches the string, but does not set the token type.

Then the token is created:

if ( backtracking == 0 )
{
    if ( (token == null) && (ruleNestingLevel == 1) )
    {
        Emit(_type, _line, _charPosition, _channel, _start, CharIndex-1);
    }
}
}

The emitted token has the text "abc", but its type is ID!
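
One possible workaround, sketched here under assumptions and not taken from this thread: declare an extra token type and set it in an action on the first alternative. The token name ABC is made up for illustration, DIGIT is assumed to be a fragment rule, and $type assumes an ANTLR 3 release that supports it in lexer actions (it should correspond to the _type variable passed to Emit in the generated code).

```
tokens { ABC; }    // hypothetical token type for the keyword

ID  :   ('abc' DIGIT) => 'abc' { $type = ABC; }   // override the default type ID
    |   ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+
    ;

fragment
DIGIT : '0'..'9' ;  // assumed definition
```

Because the rule emits a single token at its end, changing the type inside the alternative should be enough; the matched text stays "abc".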

Hope this helps.


-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Terence Parr
Sent: Sunday, February 11, 2007 22:24
To: ANTLR Interest
Subject: Re: [antlr-interest] Problem using predicates in V3


On Feb 11, 2007, at 1:15 PM, Dr. Hartmut Kocher wrote:

> The language is fixed :-( And no, I didn't invent it.

;)

> Your second solution is also not possible because then "t123a" would
> parse OK, but "t 123 a" too, which is not allowed. (Of course there's a
> whitespace rule)...

actually, it would only match "t 123" as part of that rule, but no
biggie.

> In ANTLR2 I did the following:
>
> tokens {
> "abc";
> }
>
> IDENT
>    options {
>      testLiterals=true;
>    }
>   :
>     ("abc" DIGIT) => "abc"
>   | ('a'..'z') (LD | '_')*;  // LD is letter or digit
>
> This worked quite well. Now I'm trying to accomplish the same with
> ANTLR 3. No such luck so far.

OHhhhhh....ok, abc is a keyword and for some reason it's not taken as
a special case; oh, because ANTLR matches the longest input it can.

IDENT : ("abc" DIGIT)=> "abc" | ('a'..'z') (LD | '_')* ;

should work.  It doesn't?

Ter



