[antlr-interest] Listing order of NT alternatives on rhs of production appears to affect "accept/reject" of parser for fixed input.

Dejas Ninethousand dejas9000 at gmail.com
Tue Nov 11 12:48:11 PST 2008


Hello,

I believe I have found a peculiar issue in ANTLR.  If memory serves, the
order of alternatives in a context-free grammar production should have no
effect on the set of inputs the grammar accepts.  For example, I believe:

program : statement_list | expression

is equivalent to:

program : expression | statement_list

I'm having an issue where for the fixed program:

"DOG CAT x ZEBRA x > x"  (quotes not actually a part of the input)

defining one of my nonterminals as:

a : b | c ;

causes the following error when parsing the program:

"line 1:18 mismatched input '>' expecting EOF"

while redefining a to be:

a : c | b ;

accepts the program after regeneration, recompilation, and re-execution.
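
To make the error position concrete, here is a minimal tokenizer sketch that reports the zero-based column of each token, so column 18 can be matched to a token. It is written in Python purely for illustration (the grammar itself targets C#). `TOKEN_SPEC`, `TOKEN_RE` and `tokenize` are hypothetical names, and the regular expressions only approximate the lexer rules of the grammar given later in this message.

```python
import re

# Approximations of the grammar's lexer rules. Keywords come before the
# identifier rule so that they win when both could match.
TOKEN_SPEC = [
    ("D_NAME", r"DOG\b"),
    ("D_TARGET", r"CAT\b"),
    ("RES", r"ZEBRA\b"),
    ("RCHEVRON", r">"),
    ("VERBATUM_IDENTIFIER", r"[A-Za-z][A-Za-z_0-9]*"),
    ("WHITESPACE", r"[ \t\r\n\f]+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return (token type, token text, zero-based column) for visible tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"no token matches at column {pos}")
        if match.lastgroup != "WHITESPACE":  # whitespace is hidden, as in the grammar
            tokens.append((match.lastgroup, match.group(), pos))
        pos = match.end()
    return tokens


for token in tokenize("DOG CAT x ZEBRA x > x"):
    print(token)
```

Under these assumptions the token at column 18 is the '>' itself, so the parser has already consumed "DOG CAT x ZEBRA x" and expects the input to end there.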

This completely blows my mind.  I believe these grammars (see full grammar
below) to be equivalent.  What am I missing?

notes:

I claim the input is valid by the following derivation:

(program
  (a
    (b
      (d
        'DOG'
        'CAT'
        (a
          (b
            (f
              'x')))
        ZEBRA
        (a
          (b
            (f
              'x')))
        '>'
        (a
          (b
            (f
              'x')))))))

------------------------------------
grammar bug;

options {
    language='CSharp';
    output=AST;
    backtrack = true;

}


program    : a EOF ;

a : b | c ;

b : f | d ;

c : e ;

d : D_NAME idl=f? D_TARGET a (resOp=RES res=a)?
    -> D_NAME $idl? D_TARGET a $resOp? $res? ;

e : b RCHEVRON b;

f : VERBATUM_IDENTIFIER;

D_NAME
    :    'DOG';

D_TARGET
    :    'CAT';

RES
    :    'ZEBRA';

RCHEVRON
    :    '>';

VERBATUM_IDENTIFIER
    : ('a'..'z'|'A'..'Z')(('a'..'z'|'A'..'Z')|('_')|('0'..'9'))*
    ;

WHITESPACE
    : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+     { $channel = HIDDEN; } ;
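
One way to probe the behaviour is a small model of the parser rules above. The sketch assumes that, with backtrack = true, the parser tries the alternatives of a rule in the order listed and commits to the first one that matches, without reconsidering that choice if a later token fails. That assumption is a guess at what ANTLR does here, not something taken from its generated code. The Python below is illustrative only: `lex`, `Recognizer` and `accepts` are hypothetical names, and the AST rewrite in rule d is ignored.

```python
KEYWORDS = {"DOG": "D_NAME", "CAT": "D_TARGET", "ZEBRA": "RES", ">": "RCHEVRON"}


def lex(text):
    """Map whitespace-separated words to token types and append EOF."""
    return [KEYWORDS.get(word, "ID") for word in text.split()] + ["EOF"]


class Recognizer:
    """Each rule takes a token index and returns the index after the
    match, or None if the rule does not match at that position."""

    def __init__(self, tokens, a_order):
        self.tokens = tokens
        self.a_order = a_order  # ("b", "c") or ("c", "b")

    def tok(self, kind, i):
        ok = i < len(self.tokens) and self.tokens[i] == kind
        return i + 1 if ok else None

    def first_match(self, alternatives, i):
        # Assumed ordered choice: commit to the first alternative that matches.
        for alternative in alternatives:
            j = alternative(i)
            if j is not None:
                return j
        return None

    def a(self, i):  # a : b | c ;  or  a : c | b ;
        return self.first_match([getattr(self, n) for n in self.a_order], i)

    def b(self, i):  # b : f | d ;
        return self.first_match([self.f, self.d], i)

    def c(self, i):  # c : e ;
        return self.e(i)

    def d(self, i):  # d : D_NAME f? D_TARGET a (RES a)? ;
        i = self.tok("D_NAME", i)
        if i is None:
            return None
        j = self.f(i)  # optional f
        if j is not None:
            i = j
        i = self.tok("D_TARGET", i)
        if i is None:
            return None
        i = self.a(i)
        if i is None:
            return None
        j = self.tok("RES", i)  # optional (RES a)
        if j is not None:
            j = self.a(j)
            if j is not None:
                i = j
        return i

    def e(self, i):  # e : b RCHEVRON b ;
        i = self.b(i)
        if i is not None:
            i = self.tok("RCHEVRON", i)
        if i is not None:
            i = self.b(i)
        return i

    def f(self, i):  # f : VERBATUM_IDENTIFIER ;
        return self.tok("ID", i)


def accepts(text, a_order):
    """program : a EOF ;"""
    recognizer = Recognizer(lex(text), a_order)
    end = recognizer.a(0)
    return end is not None and recognizer.tok("EOF", end) is not None


program = "DOG CAT x ZEBRA x > x"
print(accepts(program, ("b", "c")))  # a : b | c ;
print(accepts(program, ("c", "b")))  # a : c | b ;
```

In this model, a : b | c rejects the program: the a after ZEBRA commits to b, which matches only the single x, so the '>' is left over where EOF is expected. With a : c | b, the same a tries c first and consumes x > x, and the program is accepted. If ANTLR's backtracking really works this way, the two orderings recognize different languages even though the underlying context-free grammars are equivalent.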