[antlr-interest] [antlr-dev] Why doesn't this work?

Indhu Bharathi indhu.b at s7software.com
Wed Apr 8 08:12:10 PDT 2009


I guess you didn't read my mail completely :-) Let me repeat:

The following doesn't tokenize "1." as INT_LIT, DOT 
(and I understand perfectly well why):

    INT_OR_FLOAT
        :    NUMBER DOT NUMBER  {$type=FLOAT_LIT;}
        |    NUMBER {$type=INT_LIT;}
        ;


But this works fine:

    INT_OR_FLOAT
        :    (NUMBER DOT NUMBER) => NUMBER DOT NUMBER  {$type=FLOAT_LIT;}
        |    (NUMBER) => NUMBER {$type=INT_LIT;}
        ;


The reason is that we have added syntactic predicates (which are in turn 
gated semantic predicates) to make it work.

Given that the second example shown above (with syntactic predicates) 
works fine, why doesn't the following work?

    INT_FLOAT_PATTERN
        :    (NUMBER DOT NUMBER LETTER ) => NUMBER DOT NUMBER LETTER
            { $type=PATTERN; }
           
        |    ( NUMBER DOT NUMBER ) =>  NUMBER DOT NUMBER
            { $type=FLOAT_LIT; }

        |    (NUMBER) => NUMBER
            { $type=INT_LIT; }

        ;


What is the difference between case 2 and case 3 shown above? For the sake 
of clarity, I've shown only the relevant rules above. The complete grammar 
follows:

grammar Test;

r    :    INT_LIT DOT+
    ;

INT_FLOAT_PATTERN
    :    (NUMBER DOT NUMBER LETTER ) => NUMBER DOT NUMBER LETTER
        { $type=PATTERN; }
       
    |    ( NUMBER DOT NUMBER ) =>  NUMBER DOT NUMBER
        { $type=FLOAT_LIT; }

    |    (NUMBER) => NUMBER
        { $type=INT_LIT; }

    ;

DOT    :    '.'
    ;

fragment PATTERN
    :    ;
   
fragment FLOAT_LIT
    :    ;
   
fragment INT_LIT
    :    ;   

   
fragment
NUMBER    :    ('0'..'9')+
    ;

fragment
LETTER    :    'a'..'z'
    ;
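
For comparison, here is roughly what the left-factored form recommended 
on the wiki page (and by Jim in the quoted reply) might look like for this 
rule. It is only a sketch and is not part of the grammar above: it assumes 
the Java target, and it assumes that once NUMBER has been matched, 
input.LA(1) is the '.' and input.LA(2) is the character after it. The 
gated semantic predicate lets the lexer enter the DOT NUMBER branch only 
when a digit follows the dot.

```
INT_FLOAT_PATTERN
    :    NUMBER
         (    // Gate: take this branch only if a digit follows the '.'
              // (assumes input.LA(1) is the '.' at this point).
              { input.LA(2) >= '0' && input.LA(2) <= '9' }?=>
              DOT NUMBER
              (    LETTER    { $type=PATTERN; }
              |              { $type=FLOAT_LIT; }
              )
         |    // No '.' followed by a digit: plain integer.
              { $type=INT_LIT; }
         )
    ;
```

With this form the input "1." is intended to come out as INT_LIT followed 
by DOT, because the predicate fails at EOF and the '.' is left for the DOT 
rule. It works around the problem but does not explain why the 
syntactic-predicate version above behaves differently.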


Thanks, Indhu


Jim Idle wrote:
> Indhu Bharathi wrote:
>> Yes, I've read that page earlier and I understand it (and that is how 
>> I've solved the problem for now). Without syntactic predicates I 
>> understand ANTLR Lexer will try matching the longer string and might 
>> fail in the middle. But when a syntactic predicate (which is a gated 
>> semantic predicate) is placed before the production, shouldn't ANTLR 
>> first try the predicate and go on and match the production only if 
>> the syntactic predicate passes like what Terence says here: 
>> http://www.antlr.org/pipermail/antlr-interest/2009-March/033526.html
>>
>> For example, the following grammar won't work for the input "1.". It 
>> won't give me INT_LIT, DOT. Instead it will try to match FLOAT_LIT 
>> and fail:
>>
>
> This needs to move to antlr-interest; it isn't a bug.
>> grammar Test;
>>
>> r    :    INT_LIT DOT
>>     ;
>>
>> INT_OR_FLOAT
>>     :    NUMBER DOT NUMBER  {$type=FLOAT_LIT;}
>>     |    NUMBER {$type=INT_LIT;}
>>     ;
>>
> "1." does not work because ANTLR only looks ahead far enough to predict 
> the alt, not to match it. So it sees NUMBER, and then it only needs to 
> know whether there is a DOT to select alt 1 or alt 2. It sees a '.', 
> then tells you your floating-point number is in error. Hence your 
> grammar is incorrect. You need to left-factor for a start, then use a 
> gated semantic predicate to predict the float part (you need to look 
> past the '.' and make sure there is a digit).
>
> Jim
>
>
> ------------------------------------------------------------------------
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>   
Indhu Bharathi wrote:
> Yes, I've read that page earlier and I understand it (and that is how 
> I've solved the problem for now). Without syntactic predicates I 
> understand ANTLR Lexer will try matching the longer string and might 
> fail in the middle. But when a syntactic predicate (which is a gated 
> semantic predicate) is placed before the production, shouldn't ANTLR 
> first try the predicate and go on and match the production only if the 
> syntactic predicate passes like what Terence says here: 
> http://www.antlr.org/pipermail/antlr-interest/2009-March/033526.html
>
> For example, the following grammar won't work for the input "1.". It 
> won't give me INT_LIT, DOT. Instead it will try to match FLOAT_LIT 
> and fail:
>
> grammar Test;
>
> r    :    INT_LIT DOT
>     ;
>
> INT_OR_FLOAT
>     :    NUMBER DOT NUMBER  {$type=FLOAT_LIT;}
>     |    NUMBER {$type=INT_LIT;}
>     ;
>
> fragment INT_LIT
>     :    ;
>    
> fragment FLOAT_LIT
>     :    ;
>
> DOT    :    '.'
>     ;
>
> fragment NUMBER
>     :    '0'..'9'+
>     ;
>    
>
> But if you add syntactic predicates to INT_OR_FLOAT as shown below, it 
> will work:
>
> INT_OR_FLOAT
>     :    (NUMBER DOT NUMBER) => NUMBER DOT NUMBER  {$type=FLOAT_LIT;}
>     |    (NUMBER) => NUMBER {$type=INT_LIT;}
>     ;
>
>
> I was expecting the same thing in my example. But for some reason it 
> doesn't work there. What is the difference between the above 
> example and my example? Shouldn't both work, since the syntactic 
> predicates are present?
>
> Thanks, Indhu
>
> Johannes Luber wrote:
>> Indhu Bharathi schrieb:
>>   
>>> Moving this to antlr-dev as I'm starting to feel maybe this is a bug...
>>> No reply on antlr-interest for a long time kind of confirms that feeling.
>>>
>>> I can certainly work around it for what I'm doing now. But this
>>> is something I've tried many times and it has always failed. I would like
>>> to know whether I'm making a mistake or this is a bug in ANTLR.
>>>
>>> Thanks, Indhu
>>>     
>>
>> I think that your problem is described here:
>> <http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs>
>>
>> Johannes
>>   
>>> Indhu Bharathi wrote:
>>>     
>>>> Hi,
>>>>
>>>> Any clue why this doesn't work? I'm still clueless.
>>>>
>>>> - Indhu
>>>>
>>>> Indhu Bharathi wrote:
>>>>   
>>>>       
>>>>> I was working on a big grammar and stumbled on a problem with 
>>>>> predicates. I've simplified the problem as much as possible and here it is:
>>>>>
>>>>> When I give the input "1.", I expect the tokens <INT_LIT, DOT>. But what 
>>>>> I get is "No viable alternative at character 'EOF'". I'm not able to 
>>>>> understand why this happens. Any pointers?
>>>>>
>>>>> grammar Test;
>>>>>
>>>>> r    :    INT_LIT DOT+
>>>>>     ;
>>>>>
>>>>> INT_FLOAT_PATTERN
>>>>>     :    (NUMBER DOT NUMBER LETTER ) => NUMBER DOT NUMBER LETTER
>>>>>         { $type=PATTERN; }
>>>>>        
>>>>>     |    ( NUMBER DOT NUMBER ) =>  NUMBER DOT NUMBER
>>>>>         { $type=FLOAT_LIT; }
>>>>>
>>>>>     |    (NUMBER) => NUMBER
>>>>>         { $type=INT_LIT; }
>>>>>
>>>>>     ;
>>>>>
>>>>> DOT    :    '.'
>>>>>     ;
>>>>>
>>>>> fragment PATTERN
>>>>>     :    ;
>>>>>    
>>>>> fragment FLOAT_LIT
>>>>>     :    ;
>>>>>    
>>>>> fragment INT_LIT
>>>>>     :    ;   
>>>>>
>>>>>    
>>>>> fragment
>>>>> NUMBER    :    ('0'..'9')+
>>>>>     ;
>>>>>
>>>>> fragment
>>>>> LETTER    :    'a'..'z'
>>>>>     ;
>>>>>    
>>>>>
>>>>> Thanks, Indhu
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> antlr-dev mailing list
>>> antlr-dev at antlr.org
>>> http://www.antlr.org/mailman/listinfo/antlr-dev
>>>     