[antlr-interest] Bug report: Unexplainable "no viable alternative"

Alex Marin alex.marin at amiq.ro
Wed Nov 4 05:06:42 PST 2009


Hello Jim, Gerald

@ Jim: I suppose you are referring to the much larger grammar I've
mentioned in the initial post by "myriad syntactic predicates". However,
this is why I extracted this (quite simple) example, which has a single
syntactic predicate and yet fails to perform the recognition. On the other
hand, the  rule associative_dimension_1 *is* left-factored, and re-writing
it using a syntactic predicate (as associative_dimension_2) made the
grammar work in the first place.
Indeed, choosing k=1 fixes the problem, but I still don't understand if
this change is reliable as a global change in the "much larger grammar".

@ Gerald: At the moment I'm trying to install AntlrDT in order to reproduce
the diagrams and better understand what exactly is the difference between
rules associative_dimension_1 and _2. It seems that eclipse has loaded the
plugin, but I don't find any new perspective/view/editor. 
However, I don't see why the recognition would treat differently the two
subrules which I find equivalent (i.e. why doesn't the parser follow
sized_or_unsized_dimension when using _1).

A note for clarification: I reported this issue as a "bug" mainly because
of the very strange behavior obtained by commenting/uncommenting an
alternative in the data_type rule. There is no warning issued by the
generator, and the behavior is unpredictable. This makes me feel quite
insecure, since such a "harmless" change can cause a recognition failure. I
understand that there might be better practices for writing a grammar (and
I acknowledge the "lack of experience on my part"), but I would still
classify such behavior as a bug.
In a bigger picture, I want to port a v2 *working* grammar to v3, and I
would like all the code relying on the v2 parser to be affected as little
as possible. In other words, I'd like to modify the rules as little as
possible in order to avoid introducing bugs in a working and verified
grammar.

Thank you,
Alex Marin


On Tue, 03 Nov 2009 12:26:12 -0800, "Jim Idle" <jimi at temporal-wave.com>
wrote:
> It isn’t a bug; this happens because your grammar is ambiguous without
all
> those myriad syntactic predicates. Take those all out and left factor
your
> rules so that there are no ambiguities. Also, remove that k=1 option and
> let ANTLR work it out, then use ANTLRWorks syntax diagrams to work out
why
> your grammar is ambiguous.
> 
> Left factor the rules so that everything that starts with a '[' starts in
> the same rule, then branches at places (such as '*') where you want to
> distinguish. Then build the appropriate AST.
> 
> I think perhaps this is lack of experience on your part so until you have
a
> better idea of what is going on I also recommend that you do the
following:
> 
> 1) Take all the 'xxxxx' out of your parser rules and create clear LEXER
> rules for them;
> 2) Then check your lexer rules for overlaps in their specification;
> 
> As well as taking out those predicates and fixing the rules properly of
> course.
> 
> Finally, eliminate just generation ordering errors by cleaning out the
> generated .java and .tokens classes and completely regenerating
everything
> to make sure that the parser's idea of what tokens numbers go with what
is
> the same as the lexer's thoughts on the matter.
> 
> Jim
> 
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of alex.marin at amiq.ro
>> Sent: Tuesday, November 03, 2009 10:22 AM
>> To: Antlr interest
>> Cc: Etools; Adrian Simionescu; parrt at cs.usfca.edu
>> Subject: [antlr-interest] Bug report: Unexplainable "no viable
>> alternative"
>> 
>> Hello,
>> 
>> We have trouble understanding why we get a "no viable alternative" when
>> running the attached parser grammar on the following input:
>> 
>> bit bitstream [];
>> 
>> The output is:
>> 
>> line 1:15 no viable alternative at input ']'
>> 
>> However, we have found two (very strange) workarounds for the issue:
>> 1. Commenting out the 'real' option in the data_type rule 2. Using
>> associative_dimension_2 rule instead of associative_dimension_1
>> (although the two are equivalent)
>> 
>> What is the explanation for this behavior?
>> Is there a rigurous solution to avoid such behavior?
>> 
>> Thanks,
>> Alex Marin
>> 
>> Notes:
>> - the example is not intended to be useful by itself (it is an excerpt
>> from a much larger grammar)
>> - the latest antlr version has been used for code generation
>> (antlr-3.2.jar)
>> - you can find the referred grammar inline at the end of this e-mail
>> and also in the attached file
>> - by comparing the generated parsers, we noticed that the workarounds
>> cause the prediction to be done by some complicated if-conditions
>> rather than the dfa which throws the NoViableAlt
>> 
>> ////////////////// Example.g ////////////////////////
>> 
>> grammar Example;
>> 
>> options {
>> 	k=1;
>> 	output=AST;
>> 	}
>> 
>> entry
>> :
>> (my_rule)+
>> ;
>> 
>> my_rule
>> 	:
>> 	  tf_port_item SEMI
>> 	;
>> 
>> tf_port_item
>> :
>> data_type ID variable_dimension
>> ;
>> 
>> 
>> data_type
>> :
>> 'bit'
>> | 'byte'
>> | 'real' // Comment this to suppress NoViableAlt 'struct'
>> | 'union' ( 'tagged' )?
>> | 'enum'
>> | 'virtual'
>> | ps_identifier
>> ;
>> 
>> ps_identifier
>> :
>> ( ID COLON_COLON ) => ID COLON_COLON ID
>> | ID
>> ;
>> 
>> variable_dimension
>> :
>> ( associative_dimension_1 ) => associative_dimension_1
>> variable_dimension // comment this line
>> //        ( associative_dimension_2 ) => associative_dimension_2
>> variable_dimension  // and uncomment this one to suppress NoViableAlt (
>> with 'real' alt in data_type)
>> | ( sized_or_unsized_dimension )*
>> ;
>> 
>> associative_dimension_1
>> :
>> LBRACK ( STAR | data_type ) RBRACK
>> ;
>> 
>> associative_dimension_2
>> 	    :
>> ( LBRACK STAR ) => LBRACK STAR RBRACK
>> | LBRACK data_type RBRACK
>> 	    ;
>> 
>> 
>> sized_or_unsized_dimension
>> :
>> LBRACK ( NUMBER )? RBRACK
>> ;
>> 
>> /********** Lexer *************/
>> 
>> SEMI: ';';
>> STAR: '*';
>> LBRACK: '[';
>> RBRACK: ']';
>> COLON_COLON: '::';
>> 
>> WS
>> :
>> (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;}
>> ;
>> 
>> ID
>> :
>> ('a'..'z'|'A'..'Z'|'_') ('0'..'9'|'a'..'z'|'A'..'Z'|'_')*
>> ;
>> 
>> NUMBER
>> :
>> ('0'..'9')+
>> ;


More information about the antlr-interest mailing list