[antlr-interest] Listing order of NT alternatives on rhs of production appears to affect "accept/reject" of parser for fixed input.
Dejas Ninethousand
dejas9000 at gmail.com
Tue Nov 11 12:48:11 PST 2008
Hello,
I believe I have found a peculiar issue in ANTLR. If memory serves, the
order of alternatives in a production should have no effect on the set of
inputs a grammar accepts. For example, I believe:
program : statement_list | expression
is equivalent to:
program : expression | statement_list
I'm having an issue where, for the fixed input:
"DOG CAT x ZEBRA x > x" (quotes not actually a part of the input)
defining one of my nonterminals as:
a : b | c ;
causes the following error when parsing the program:
"line 1:18 mismatched input '>' expecting EOF"
while redefining a to be:
a : c | b ;
results in the program being accepted after regeneration, recompilation, and re-execution.
This completely blows my mind. I believe these grammars (see full grammar
below) to be equivalent. What am I missing?
notes:
I claim the input is valid by the following derivation:
(program
  (a
    (b
      (d
        'DOG'
        'CAT'
        (a (b (f 'x')))
        ZEBRA
        (a (b (f 'x')))
        '>'
        (a (b (f 'x')))))))
------------------------------------
grammar bug;
options {
language='CSharp';
output=AST;
backtrack = true;
}
program : a EOF ;
a : b | c ;
b : f | d ;
c : e ;
d : D_NAME idl=f? D_TARGET a (resOp=RES res=a)?
    -> D_NAME $idl? D_TARGET a $resOp? $res? ;
e : b RCHEVRON b;
f : VERBATUM_IDENTIFIER;
D_NAME
: 'DOG';
D_TARGET
: 'CAT';
RES
: 'ZEBRA';
RCHEVRON
: '>';
VERBATUM_IDENTIFIER
: ('a'..'z'|'A'..'Z')(('a'..'z'|'A'..'Z')|('_')|('0'..'9'))*
;
WHITESPACE
: ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; } ;
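
One way to probe the difference outside ANTLR is a small Python model of the
grammar above. This is only a sketch, and it rests on an assumption: that with
backtrack=true each decision behaves like an ordered choice, where alternatives
are tried in the order listed and the first one that matches is committed to,
with no later retry if an enclosing rule then fails. The tokenize and accepts
helpers are hypothetical names for illustration; they are not part of ANTLR or
of the generated C# parser, and tokenize simply splits on whitespace rather
than reproducing the lexer rules.

```python
from typing import Callable, List, Optional, Sequence

Rule = Callable[[int], Optional[int]]


def tokenize(text: str) -> List[str]:
    """Map whitespace-separated words to token names (simplified lexer)."""
    keywords = {"DOG": "D_NAME", "CAT": "D_TARGET", "ZEBRA": "RES", ">": "RCHEVRON"}
    return [keywords.get(word, "ID") for word in text.split()] + ["EOF"]


def accepts(text: str, a_order: str) -> bool:
    """Recognize text with rule a written as 'b|c' or 'c|b' (ordered choice)."""
    toks = tokenize(text)

    def match(kind: str, pos: int) -> Optional[int]:
        # Consume one token of the given kind, or fail with None.
        return pos + 1 if pos < len(toks) and toks[pos] == kind else None

    def first_of(rules: Sequence[Rule], pos: int) -> Optional[int]:
        # Assumed semantics: commit to the first alternative that matches.
        for rule in rules:
            nxt = rule(pos)
            if nxt is not None:
                return nxt
        return None

    def f(pos: int) -> Optional[int]:  # f : VERBATUM_IDENTIFIER
        return match("ID", pos)

    def d(pos: int) -> Optional[int]:  # d : D_NAME f? D_TARGET a (RES a)?
        cur = match("D_NAME", pos)
        if cur is None:
            return None
        opt = f(cur)  # optional f
        if opt is not None:
            cur = opt
        cur = match("D_TARGET", cur)
        if cur is None:
            return None
        cur = a(cur)
        if cur is None:
            return None
        res = match("RES", cur)  # optional (RES a)
        if res is not None:
            res = a(res)
            if res is not None:
                cur = res
        return cur

    def b(pos: int) -> Optional[int]:  # b : f | d
        return first_of((f, d), pos)

    def e(pos: int) -> Optional[int]:  # e : b RCHEVRON b
        cur = b(pos)
        if cur is None:
            return None
        cur = match("RCHEVRON", cur)
        if cur is None:
            return None
        return b(cur)

    def c(pos: int) -> Optional[int]:  # c : e
        return e(pos)

    def a(pos: int) -> Optional[int]:  # a : b | c   or   a : c | b
        return first_of((b, c) if a_order == "b|c" else (c, b), pos)

    end = a(0)  # program : a EOF
    return end is not None and match("EOF", end) is not None


if __name__ == "__main__":
    for order in ("b|c", "c|b"):
        print(order, accepts("DOG CAT x ZEBRA x > x", order))
```

Under that assumption the model rejects the input when a is written as b | c
(the a after ZEBRA commits to b, leaving '>' unconsumed before EOF) and accepts
it when a is written as c | b, which matches the behaviour described above.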