[antlr-interest] Why and how exactly does ANTLR manage to fail on non recursive grammar for finite language?
Nikolay Ognyanov
nikolay.ognyanov at travelstoremaker.com
Wed Aug 12 14:45:02 PDT 2009
Hi everybody,
Please have a look at a toy grammar for language consisting of 2
statements :
grammar Ambiguous;
expr
: expr1
| expr2
;
expr1
: PREFIX_1 expr2 SUFFIX
;
expr2
: PREFIX_2
| PREFIX_2 SUFFIX
;
PREFIX_1 : 'prefix_1';
PREFIX_2 : 'prefix_2';
SUFFIX : 'suffix';
WS : (' ' | '\r' | '\n' | '\t')+ {$channel=HIDDEN;};
Please do not advise how to fix it :) I know that but the question is why
ANTLR considers rule for expr2 ambiguous? Here is a tool run:
java org.antlr.Tool Ambiguous.g
warning(200): Ambiguous.g:11:5: Decision can match input such as
"PREFIX_2 {EOF, SUFFIX}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
error(201): Ambiguous.g:11:5: The following alternatives can never be
matched: 2
Or with reversed order of alternatives;
java org.antlr.Tool Ambiguous.g
warning(200): Ambiguous.g:11:5: Decision can match input such as
"PREFIX_2 SUFFIX" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
The language is :
PREFIX_1 PREFIX_2 SUFFIX
PREFIX_1 PREFIX_2 SUFFIX SUFFIX
and none of the statements is really ambiguous per the grammar above.
The only kind
of "ambiguity" I see is that second statement can be parsed as the first
on plus some
extra input after end of program. And the only answer to the question I
know is that
this is just how it goes with recursive descent parsers - they are OK
with leaving extra
input after end of valid program in the input channel. I would not be
surprised if this
is the real answer but it is still interesting to know when and how
exactly ANTLR gets
to this. There are real world problems of the kind which are much more
difficult to
identify and fix than in the toy grammar above, so if anybody can advise
of the more
detailed mechanics of how this happens with ANTLR, I would be grateful.
And also -
of a way (other than the hard way:) to verify that this is or is the
problem in real world
grammars when suspicious ambiguity is reported.
Regards
Nikolay
More information about the antlr-interest
mailing list