[antlr-interest] Trouble with nondeterminism
John Gruenenfelder
johng at as.arizona.edu
Mon Aug 28 16:37:38 PDT 2006
In keeping with my "starting simple" plan, I've been adding bits to my simple
parser as I get the hang of ANTLR. The many examples on the Net are also
helping a great deal.
But... I'm having trouble trying to have my lexer recognize both int and
double number types. I've tried a few different examples all ending in
similar ANTLR messages. The relevant part of my grammar looks like this:
Constant
    : INT
    | DOUBLE
    ;
INT
    : ('0' | '1'..'9' ('0'..'9')*) ;
DOUBLE
    : ('0'..'9')+ '.' ('0'..'9')* (Exponent)?
    | '.' ('0'..'9')+ (Exponent)?
    | ('0'..'9')+ Exponent
    | ('0'..'9')+ (Exponent)?
    ;
protected
Exponent
    : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
Now, I understand that very clearly and it *seems* to make sense. But when I
run it through ANTLR, I get:
opc.g: warning:lexical nondeterminism between rules Constant and INT upon
opc.g: k==1:'0'..'9'
opc.g: k==2:<end-of-token>,'0'..'9'
opc.g: k==3:<end-of-token>,'0'..'9'
opc.g: warning:lexical nondeterminism between rules Constant and DOUBLE upon
opc.g: k==1:'.','0'..'9'
opc.g: k==2:<end-of-token>,'.','0'..'9','E','e'
opc.g: k==3:<end-of-token>,'+','-','.','0'..'9','E','e'
opc.g: warning:lexical nondeterminism between rules INT and DOUBLE upon
opc.g: k==1:'0'..'9'
opc.g: k==2:<end-of-token>,'0'..'9'
opc.g: k==3:<end-of-token>,'0'..'9'
opc.g:181: warning:lexical nondeterminism between alts 1 and 2 of block upon
opc.g:181: k==1:'0'..'9'
opc.g:181: k==2:<end-of-token>,'0'..'9'
opc.g:181: k==3:<end-of-token>,'0'..'9'
opc.g:214: warning:lexical nondeterminism between alts 1 and 3 of block upon
opc.g:214: k==1:'0'..'9'
opc.g:214: k==2:'0'..'9'
opc.g:214: k==3:'0'..'9','E','e'
opc.g:214: warning:lexical nondeterminism between alts 1 and 4 of block upon
opc.g:214: k==1:'0'..'9'
opc.g:214: k==2:'0'..'9'
opc.g:214: k==3:<end-of-token>,'0'..'9','E','e'
opc.g:214: warning:lexical nondeterminism between alts 3 and 4 of block upon
opc.g:214: k==1:'0'..'9'
opc.g:214: k==2:'0'..'9','E','e'
opc.g:214: k==3:'+','-','0'..'9','E','e'
I see that and it also makes sense, since ANTLR is kind enough to tell me just
where the lexer will get confused.
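The overlap reported in the warnings above can be reproduced outside ANTLR. What follows is a minimal Python sketch, not ANTLR's own analysis: the regular expressions are a hand transcription of the INT and DOUBLE rules, and the helper names `matching_rules` and `same_lookahead` are made up for illustration. It suggests two things: a plain digit string is accepted by both rules (alternative 4 of DOUBLE requires neither a '.' nor an exponent), and an integer and a double can share their first three characters, which appears to be what the k==3 lines describe.

```python
import re

# Hand transcription of the lexer rules into regexes (an assumption,
# not something generated by ANTLR).
EXP = r"(?:[eE][+-]?[0-9]+)"
INT = re.compile(r"(?:0|[1-9][0-9]*)")
DOUBLE = re.compile(
    rf"(?:[0-9]+\.[0-9]*{EXP}?"   # alt 1: digits '.' digits* exponent?
    rf"|\.[0-9]+{EXP}?"           # alt 2: '.' digits exponent?
    rf"|[0-9]+{EXP}"              # alt 3: digits exponent
    rf"|[0-9]+{EXP}?)"            # alt 4: digits exponent?
)


def matching_rules(text):
    """Return the names of the rules that match the whole of text."""
    rules = {"INT": INT, "DOUBLE": DOUBLE}
    return [name for name, rx in rules.items() if rx.fullmatch(text)]


def same_lookahead(a, b, k=3):
    """True when a and b cannot be told apart from their first k characters."""
    return a[:k] == b[:k]


if __name__ == "__main__":
    for sample in ["42", "0", "3.14", "1e10", ".5"]:
        print(sample, matching_rules(sample))
    # An integer and a double that agree on their first three characters.
    print(same_lookahead("12345", "12345.6"))
```

Any input for which `matching_rules` returns both names is one where the two rules compete, and any pair for which `same_lookahead` is true is one that three characters of lookahead cannot separate.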
I think my troubles stem from prior experience with tools like bison/flex and
now working with a recursive descent parser like ANTLR. There's something I'm
just not "getting"... For example, maybe these are warnings I don't need to
worry about, not unlike the occasional shift/reduce conflict one might
encounter with bison? Or maybe I *do* need to worry about them...
I would greatly appreciate any help in clearing up my confusion. I'm sure
it's not that difficult a task. Thanks again.
--
--John Gruenenfelder Research Assistant, UMass Amherst student
Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for PalmOS -- http://gutenpalm.sf.net
"This is the most fun I've had without being drenched in the blood
of my enemies!"
--Sam of Sam & Max