[antlr-interest] Why and how exactly does ANTLR manage to fail on non recursive grammar for finite language?
Kevin J. Cummings
cummings at kjchome.homeip.net
Wed Aug 12 15:57:20 PDT 2009
On 08/12/2009 06:28 PM, Nikolay Ognyanov wrote:
>
>
> Jim Idle wrote:
>> Nikolay Ognyanov wrote:
>>> Hi everybody,
>>>
>>> Please have a look at a toy grammar for language consisting of 2
>>> statements :
>>>
>>> grammar Ambiguous;
>>>
>>> expr
>>> : expr1
>>> | expr2
>>> ;
>>> expr1
>>> : PREFIX_1 expr2 SUFFIX
>>> ;
>>> expr2
>>> : PREFIX_2
>>> | PREFIX_2 SUFFIX
>>> ;
>>>
>>> PREFIX_1 : 'prefix_1';
>>> PREFIX_2 : 'prefix_2';
>>> SUFFIX : 'suffix';
>>> WS : (' ' | '\r' | '\n' | '\t')+ {$channel=HIDDEN;};
>>>
>>>
>>> Please do not advise how to fix it :) I know that but the question is why
>>> ANTLR considers rule for expr2 ambiguous? Here is a tool run:
>>>
>>> java org.antlr.Tool Ambiguous.g
>>> warning(200): Ambiguous.g:11:5: Decision can match input such as
>>> "PREFIX_2 {EOF, SUFFIX}" using multiple alternatives: 1, 2
>>> As a result, alternative(s) 2 were disabled for that input
>>> error(201): Ambiguous.g:11:5: The following alternatives can never be
>>> matched: 2
>>>
>> Clearly, this is ambiguous because you follow expr2 with SUFFIX in the
>> expr1 rule. When parsing expr2 as part of expr1, the paths of taking
>> /PREFIX_2 the using SUFFIX in expr1/ and taking PREFIX_2 then SUFFIX
>> in expr2 followed by SUFFIX in expr1 are ambiguous as written.
>>
>> You say you know how to fix it, but perhaps you do not if you cannot
>> see the ambiguity here. Because expr2 is a stand alone path from expr,
>> there is no way to know how to disambiguate when called from expr1.
>> Basically, you need to left factor these rules properly:
>>
>> expr
>> : PREFIX_1 PREFIX_2 SUFFIX SUFFIX?
>> | PREFIX_2 SUFFIX?
>> ;
>>
>>
>>
>> Jim
> By definition a grammar is ambiguous if there is more than 1 way to
> derive at least one statement.
> This is not the case here. There is only 1 way to derive the offending
> statement
>
> PREFIX_1 PREFIX_2 SUFFIX SUFFIX
>
> And it would take the procedure for expr2 just about 3 tokens lookahead
> to figure out what is the
> right thing to do. The question is why ANTLR does not do this?
Because you did not set "k=3" in your parser options?
(At least that's what I would do in ANTLR 2.7.7....)
> Regards
> Nikolay
--
Kevin J. Cummings
kjchome at rcn.com
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)
More information about the antlr-interest
mailing list