[antlr-interest] help me to understand nondeterminism warnings please

mzukowski at yci.com
Wed Dec 18 08:53:38 PST 2002


Start reading your generated code.  Take a look at some rules and you will
see tests like:

		if ((((LA(1) >= '0' && LA(1) <= '9')) && (_tokenSet_2.member(LA(2))) && (true))) {
			int _m340 = mark();
			synPredMatched340 = true;
			inputState.guessing++;
			try {
				....

The if statement is the lookahead test.  This rule is looking for a digit
followed by a digit or '.' or 'e' or 'E'.  Those second characters are the
members of _tokenSet_2.  You can prohibit creating the token set by using
the option codeGenBitsetTestThreshold = 999999; which says to make token
sets only if they have more than 999999 members.  However, you can't avoid
bitsets for negated sets like ~('1'..'9').  The "&& (true)" is an artifact
of the code generation and will be optimized away.
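
To make that lookahead test easier to read on its own, here is a minimal
standalone sketch of the same idea.  It is not ANTLR's generated code: the
class name, the String-backed input and the EOF_CHAR sentinel are
assumptions made for illustration, and java.util.BitSet stands in for
ANTLR's own bitset class.  Only LA(k) and the contents of the token set are
taken from the snippet above.

```java
import java.util.BitSet;

class LookaheadSketch {
    // Hypothetical sentinel returned when lookahead runs past the input.
    static final char EOF_CHAR = '\uffff';

    // Plays the role of _tokenSet_2: a digit, '.', 'e' or 'E'.
    static final BitSet TOKEN_SET_2 = new BitSet();
    static {
        TOKEN_SET_2.set('0', '9' + 1); // range end is exclusive
        TOKEN_SET_2.set('.');
        TOKEN_SET_2.set('e');
        TOKEN_SET_2.set('E');
    }

    private final String input; // stand-in for the scanner's input buffer
    private final int pos = 0;  // current position; never advanced in this sketch

    LookaheadSketch(String input) {
        this.input = input;
    }

    // LA(k) peeks at the k-th character ahead without consuming anything.
    char LA(int k) {
        int i = pos + k - 1;
        return i < input.length() ? input.charAt(i) : EOF_CHAR;
    }

    // Same shape as the generated test: LA(1) is checked with a range
    // comparison, LA(2) with a set-membership lookup.
    boolean looksLikeNumberStart() {
        return (LA(1) >= '0' && LA(1) <= '9') && TOKEN_SET_2.get(LA(2));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"12", "3.14", "7e5", "9x", "x1"}) {
            System.out.println(s + " -> " + new LookaheadSketch(s).looksLikeNumberStart());
        }
    }
}
```

The first three inputs pass the test and the last two do not, because "9x"
fails the set lookup at LA(2) and "x1" fails the range check at LA(1).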

Also read the first chapter of Ter's draft book; look for "building parsers
by hand" on the www.antlr.org site.

Start asking questions like "why does this rule create this lookahead test?"
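
As one way of asking that question about the two rules in the original
message, here is a rough sketch of how the warning can be read.  In
((S)? VERTICALBAR (S)? name)* the loop body is "alt 1" and leaving the loop
is the "exit branch".  As I understand it, ANTLR 2 compares lookahead one
depth at a time (linear approximate lookahead) rather than as whole token
sequences, so it reports a conflict when the two paths share a token at
every depth.  Everything named below is an assumption for illustration:
NAME stands in for whatever token starts the rule "name", and the guess
that S can follow the rule in the full grammar, but not in the
stripped-down one, is unconfirmed since the rest of the grammar was not
posted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ApproxLookaheadSketch {
    // Two-token sequences that can begin alt 1: (S)? VERTICALBAR (S)? name
    static final List<List<String>> ALT1 = Arrays.asList(
        Arrays.asList("S", "VERTICALBAR"),
        Arrays.asList("VERTICALBAR", "S"),
        Arrays.asList("VERTICALBAR", "NAME"));

    // Exit branch: (S)? RIGHTPAREN, then whatever may follow the rule.
    // ASSUMPTION: in the full grammar, S can follow the rule.
    static final List<List<String>> EXIT_FULL = Arrays.asList(
        Arrays.asList("S", "RIGHTPAREN"),
        Arrays.asList("RIGHTPAREN", "S"));

    // ASSUMPTION: in the stripped-down grammar, only end of input follows.
    static final List<List<String>> EXIT_STRIPPED = Arrays.asList(
        Arrays.asList("S", "RIGHTPAREN"),
        Arrays.asList("RIGHTPAREN", "EOF"));

    // Collapse the sequences into one token set per lookahead depth.
    static List<Set<String>> perDepth(List<List<String>> sequences, int k) {
        List<Set<String>> sets = new ArrayList<>();
        for (int d = 0; d < k; d++) {
            Set<String> s = new TreeSet<>();
            for (List<String> seq : sequences) {
                if (d < seq.size()) s.add(seq.get(d));
            }
            sets.add(s);
        }
        return sets;
    }

    // Tokens shared by the two paths at each depth 1..k.
    static List<Set<String>> overlap(List<List<String>> a, List<List<String>> b, int k) {
        List<Set<String>> da = perDepth(a, k);
        List<Set<String>> db = perDepth(b, k);
        List<Set<String>> common = new ArrayList<>();
        for (int d = 0; d < k; d++) {
            Set<String> c = new TreeSet<>(da.get(d));
            c.retainAll(db.get(d));
            common.add(c);
        }
        return common;
    }

    // Approximate check: a conflict exists if every depth has a shared token.
    static boolean conflicts(List<List<String>> a, List<List<String>> b, int k) {
        return overlap(a, b, k).stream().noneMatch(Set::isEmpty);
    }

    // Exact check: do the two paths share a complete token sequence?
    static boolean shareSequence(List<List<String>> a, List<List<String>> b) {
        for (List<String> seq : a) {
            if (b.contains(seq)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(overlap(ALT1, EXIT_FULL, 2));       // [[S], [S]]
        System.out.println(conflicts(ALT1, EXIT_FULL, 2));     // true
        System.out.println(shareSequence(ALT1, EXIT_FULL));    // false
        System.out.println(conflicts(ALT1, EXIT_STRIPPED, 2)); // false
    }
}
```

Under those assumptions the per-depth overlap is S at k==1 and S at k==2,
which is the shape of the warning, even though no complete two-token
sequence is shared.  If that is what is happening, it would also fit the
observation that the warning disappears once the surrounding rules are
removed.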

Monty


-----Original Message-----
From: davidjpenton2002 <nwestreb at arrowsash.com>
[mailto:nwestreb at arrowsash.com]
Sent: Tuesday, December 17, 2002 12:23 PM
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] help me to understand nondeterminism warnings
please


I would very much appreciate tips on understanding nondeterminism 
warnings from antlr.  I suppose it is probably not appropriate to 
just dump a grammar into a post and ask y'all to debug it, so, I'll 
only include the relevant snippets (which may be insufficient to 
identify the problem, I guess).

As I would like to gain a fairly complete grasp of antlr, I expect 
that replies to this post will point me in the right direction in 
the FAQ, the reference manual, or other sources. Such advice would be 
much appreciated.

Anyway, here is what I get:

*** antlr output: ***

ANTLR Parser Generator   Version 2.7.1   1989-2000 jGuru.com
grammar.g:137: warning: nondeterminism upon
grammar.g:137: 	k==1:S
grammar.g:137: 	k==2:S
grammar.g:137: 	between alt 1 and exit branch of block
grammar.g:92: warning: nondeterminism upon
grammar.g:92: 	k==1:S
grammar.g:92: 	k==2:S
grammar.g:92: 	between alt 1 and exit branch of block

  ***

Here is the rule at line 137:

notationType
    :  "NOTATION" S LEFTPAREN (S)? name 
       ((S)? VERTICALBAR (S)? name)*
       (S)? RIGHTPAREN 
    ;

And here is the production at line 92:

enumeration
    : LEFTPAREN (S)? nmtoken
      ((S)? VERTICALBAR (S)? nmtoken)* (S)? RIGHTPAREN
    ;

and, if it helps, the lexer rule for S is:

S : (' ' | '\t' | '\n' | '\r')+;

I don't really understand what the ambiguity is, which is probably 
just another way of saying I do not yet understand antlr and LL(k) 
parsing. More specifically, I don't know how to read the warning 
message.  What are 'alt 1' and the 'exit branch'?

The nondeterminism seems to exist regardless of k.  I guess I don't 
really know if I should expect to spot the problem by looking only at 
the rules at the line numbers reported by antlr, or if I must think 
more globally, i.e. look at the rules that include or are included by 
the offending rules as reported by antlr?

My attempt to sort this out entailed stripping my grammar down to the 
offending rules (the ones above), the rules that refer to them, and 
the related lexer rules.  This produced the odd (to me) effect of 
causing the problem to go away, i.e. no more warnings.

How should I approach getting an understanding of this?



 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 


 



