[antlr-interest] Why and how exactly does ANTLR manage to fail on non recursive grammar for finite language?
Nikolay Ognyanov
nikolay.ognyanov at travelstoremaker.com
Wed Aug 12 16:08:48 PDT 2009
Kevin J. Cummings wrote:
> On 08/12/2009 06:28 PM, Nikolay Ognyanov wrote:
>
>> Jim Idle wrote:
>>
>>> Nikolay Ognyanov wrote:
>>>
>>>> Hi everybody,
>>>>
>>>> Please have a look at a toy grammar for language consisting of 2
>>>> statements :
>>>>
>>>> grammar Ambiguous;
>>>>
>>>> expr
>>>> : expr1
>>>> | expr2
>>>> ;
>>>> expr1
>>>> : PREFIX_1 expr2 SUFFIX
>>>> ;
>>>> expr2
>>>> : PREFIX_2
>>>> | PREFIX_2 SUFFIX
>>>> ;
>>>>
>>>> PREFIX_1 : 'prefix_1';
>>>> PREFIX_2 : 'prefix_2';
>>>> SUFFIX : 'suffix';
>>>> WS : (' ' | '\r' | '\n' | '\t')+ {$channel=HIDDEN;};
>>>>
>>>>
>>>> Please do not advise how to fix it :) I know that but the question is why
>>>> ANTLR considers rule for expr2 ambiguous? Here is a tool run:
>>>>
>>>> java org.antlr.Tool Ambiguous.g
>>>> warning(200): Ambiguous.g:11:5: Decision can match input such as
>>>> "PREFIX_2 {EOF, SUFFIX}" using multiple alternatives: 1, 2
>>>> As a result, alternative(s) 2 were disabled for that input
>>>> error(201): Ambiguous.g:11:5: The following alternatives can never be
>>>> matched: 2
>>>>
>>>>
>>> Clearly, this is ambiguous because you follow expr2 with SUFFIX in the
>>> expr1 rule. When parsing expr2 as part of expr1, the paths of taking
>>> /PREFIX_2 the using SUFFIX in expr1/ and taking PREFIX_2 then SUFFIX
>>> in expr2 followed by SUFFIX in expr1 are ambiguous as written.
>>>
>>> You say you know how to fix it, but perhaps you do not if you cannot
>>> see the ambiguity here. Because expr2 is a stand alone path from expr,
>>> there is no way to know how to disambiguate when called from expr1.
>>> Basically, you need to left factor these rules properly:
>>>
>>> expr
>>> : PREFIX_1 PREFIX_2 SUFFIX SUFFIX?
>>> | PREFIX_2 SUFFIX?
>>> ;
>>>
>>>
>>>
>>> Jim
>>>
>
>
>> By definition a grammar is ambiguous if there is more than 1 way to
>> derive at least one statement.
>> This is not the case here. There is only 1 way to derive the offending
>> statement
>>
>> PREFIX_1 PREFIX_2 SUFFIX SUFFIX
>>
>> And it would take the procedure for expr2 just about 3 tokens lookahead
>> to figure out what is the
>> right thing to do. The question is why ANTLR does not do this?
>>
>
> Because you did not set "k=3" in your parser options?
> (At least that's what I would do in ANTLR 2.7.7....)
>
>
By default ANTLR 3 looks roughly speaking as far ahead as it takes.
Setting explicitly
high k does not help either.
Regards
Nikolay
>> Regards
>> Nikolay
>>
>
>
--
*Nikolay Ognyanov, PhD*
Chief Technology Officer
*TravelStoreMaker.com Inc.* <http://www.travelstoremaker.com/>
Phone: +359 2 933 3832
Fax: +359 2 983 6475
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20090813/0c5e5724/attachment.html
More information about the antlr-interest
mailing list