[antlr-interest] help me to understand nondeterminism warnings please
mzukowski at yci.com
Wed Dec 18 08:53:38 PST 2002
Start reading your generated code. Take a look at some rules and you will
see tests like:
    if ((((LA(1) >= '0' && LA(1) <= '9')) &&
            (_tokenSet_2.member(LA(2))) && (true))) {
        int _m340 = mark();
        synPredMatched340 = true;
        inputState.guessing++;
        try {
        ....
The if statement is the lookahead test. This rule is looking for a digit
followed by another digit or '.' or 'e' or 'E'. Those second characters are
the members of _tokenSet_2. You can keep ANTLR from creating the token set
with the option codeGenBitsetTestThreshold = 999999; which says to make token
sets only if they have more than 999999 members. However, you can't avoid
bitsets for negation rules like ~('1'..'9'). The "&& (true)" is an artifact
of the code generation and will be optimized away by the compiler.
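To make that two-character test concrete, here is a minimal hand-written sketch of the same check. It is not ANTLR's generated code: the class LookaheadSketch, the method looksLikeNumberStart, and the NUMBER_FOLLOW set are hypothetical names, and java.util.BitSet stands in for ANTLR's own bitset class. The set contents (digits, '.', 'e', 'E') are taken from the description above.

```java
import java.util.BitSet;

// Hypothetical stand-in for a generated lexer; not ANTLR's actual classes.
class LookaheadSketch {
    private final String input;
    private final int pos = 0; // this sketch only ever looks from the start

    // Plays the role of _tokenSet_2: digits, '.', 'e', 'E'.
    private static final BitSet NUMBER_FOLLOW = new BitSet(128);
    static {
        NUMBER_FOLLOW.set('0', '9' + 1); // toIndex is exclusive
        NUMBER_FOLLOW.set('.');
        NUMBER_FOLLOW.set('e');
        NUMBER_FOLLOW.set('E');
    }

    LookaheadSketch(String input) {
        this.input = input;
    }

    // LA(i): the i-th character of lookahead (1-based), or -1 at end of input.
    int LA(int i) {
        int idx = pos + i - 1;
        return idx < input.length() ? input.charAt(idx) : -1;
    }

    // Same shape as the generated test: a range check on LA(1),
    // then a set-membership check on LA(2).
    boolean looksLikeNumberStart() {
        int la2 = LA(2);
        return LA(1) >= '0' && LA(1) <= '9'
                && la2 >= 0 && NUMBER_FOLLOW.get(la2);
    }

    public static void main(String[] args) {
        System.out.println(new LookaheadSketch("3.14").looksLikeNumberStart()); // true
        System.out.println(new LookaheadSketch("3x").looksLikeNumberStart());   // false
        System.out.println(new LookaheadSketch("7").looksLikeNumberStart());    // false
    }
}
```

Only when this cheap test passes does the generated code go on to mark the input and try the syntactic predicate.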
Also read the first chapter of Ter's draft book, look for "building parsers
by hand" on the www.antlr.org site.
Start asking questions like "why does this rule create this lookahead test?"
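One way to practice that question on the enumeration rule quoted below is to write out, by hand, which tokens each branch of the (...)* loop can see at each lookahead depth, and intersect them depth by depth. ANTLR 2's approximate lookahead compares the sets at each depth separately, not whole token sequences. The sketch below does only that intersection. The class and method names are hypothetical, NMTOKEN_START is a placeholder for whatever token begins nmtoken, and the assumption that S can follow RIGHTPAREN in the full grammar is a guess, not something the post confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the lookahead sets are written by hand, not computed from a grammar.
class LinearLookaheadSketch {
    static Set<String> set(String... tokens) {
        return new TreeSet<>(Arrays.asList(tokens));
    }

    // For each depth 1..k, the tokens that both branches can see at that depth.
    static List<Set<String>> overlap(List<Set<String>> alt, List<Set<String>> exit) {
        List<Set<String>> common = new ArrayList<>();
        for (int depth = 0; depth < alt.size(); depth++) {
            Set<String> both = new TreeSet<>(alt.get(depth));
            both.retainAll(exit.get(depth));
            common.add(both);
        }
        return common;
    }

    // A depth-by-depth comparison reports a conflict only if every depth overlaps.
    static boolean warns(List<Set<String>> alt, List<Set<String>> exit) {
        for (Set<String> both : overlap(alt, exit)) {
            if (both.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // alt 1 = entering ((S)? VERTICALBAR (S)? nmtoken) one more time
        List<Set<String>> alt1 = Arrays.asList(
                set("S", "VERTICALBAR"),                   // k==1
                set("VERTICALBAR", "S", "NMTOKEN_START")); // k==2
        // exit branch = leaving the loop to match (S)? RIGHTPAREN
        List<Set<String>> exitInFullGrammar = Arrays.asList(
                set("S", "RIGHTPAREN"),                    // k==1
                set("RIGHTPAREN", "S"));                   // k==2, ASSUMING S may follow the rule
        List<Set<String>> exitInStrippedGrammar = Arrays.asList(
                set("S", "RIGHTPAREN"),
                set("RIGHTPAREN", "EOF"));                 // ASSUMING nothing but EOF follows

        System.out.println(overlap(alt1, exitInFullGrammar));   // [[S], [S]]
        System.out.println(warns(alt1, exitInFullGrammar));     // true
        System.out.println(warns(alt1, exitInStrippedGrammar)); // false
    }
}
```

If those assumptions hold, the overlap matches the reported "k==1:S, k==2:S", and it would also explain why the warning can vanish once the surrounding rules are stripped away: what follows the rule changes the depth-2 set of the exit branch.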
Monty
-----Original Message-----
From: davidjpenton2002 <nwestreb at arrowsash.com>
[mailto:nwestreb at arrowsash.com]
Sent: Tuesday, December 17, 2002 12:23 PM
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] help me to understand nondeterminism warnings
please
I would very much appreciate tips on understanding nondeterminism
warnings from antlr. I suppose it is probably not appropriate to
just dump a grammar into a post and ask y'all to debug it, so, I'll
only include the relevant snippets (which may be insufficient to
identify the problem, I guess).
As I would like to gain a fairly complete grasp of antlr, I expect
that replies to this post will mostly point me in the right direction in
the FAQ, the reference manual, or other sources. Such advice would be
much appreciated.
Anyway, here is what I get:
*** antlr output: ***
ANTLR Parser Generator Version 2.7.1 1989-2000 jGuru.com
grammar.g:137: warning: nondeterminism upon
grammar.g:137: k==1:S
grammar.g:137: k==2:S
grammar.g:137: between alt 1 and exit branch of block
grammar.g:92: warning: nondeterminism upon
grammar.g:92: k==1:S
grammar.g:92: k==2:S
grammar.g:92: between alt 1 and exit branch of block
***
Here is the rule at line 137:
notationType
    : "NOTATION" S LEFTPAREN (S)? name
      ((S)? VERTICALBAR (S)? name)*
      (S)? RIGHTPAREN
    ;
And here is the production at line 92:
enumeration
    : LEFTPAREN (S)? nmtoken
      ((S)? VERTICALBAR (S)? nmtoken)* (S)? RIGHTPAREN
    ;
and, if it helps, the lexer rule for S is:
S : (' ' | '\t' | '\n' | '\r')+;
I don't really understand what the ambiguity is, which is probably
just another way of saying I do not yet understand antlr and LL(k)
parsing. More specifically, I don't know how to read the warning
message. What are 'alt 1' and the 'exit branch'?
The nondeterminism seems to exist regardless of k. I guess I don't
really know if I should expect to spot the problem by looking only at
the rules at the line numbers reported by antlr, or if I must think
more globally, i.e. look at the rules that include or are included by
the offending rules as reported by antlr?
My attempt to sort this out entailed stripping my grammar down to the
offending rules (the ones above), the rules that refer to them, and
the related lexer rules. This produced the odd (to me) effect of
causing the problem to go away, i.e. no more warnings.
How should I approach getting an understanding of this?
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/