[antlr-interest] Trouble with nondeterminism
John Gruenenfelder
johng at as.arizona.edu
Mon Aug 28 18:46:28 PDT 2006
On Tue, Aug 29, 2006 at 03:17:46AM +0200, Spálený Ivo wrote:
>Hi,
>
>Sub-rule
>
> ('0'..'9')+ (Exponent)?
>
>can be split into:
>
> ('0' | '1'..'9' ('0'..'9')*) // duplicates the INT token
>| ('0'..'9')+ Exponent // duplicates an alternative of the DOUBLE branch
>| '0' ('0'..'9')+ // the only unique piece of information in this sub-rule
>
>From ANTLR's point of view, nondeterministic input probably still results in deterministic output, because ANTLR disables the conflicting alternatives. But isn't it better to be sure whether "1" ends up as a DOUBLE or an INT token?
>
>Best regards,
>
>Ivo Spaleny
Hi,
Okay, that certainly makes sense. I guess my question is what is the best way
to resolve these problems with common left prefixes, as one gets with
different numerical types.
After re-reading the section in the reference manual about predicates, I have
the following, which generates no warnings from ANTLR:
Constant
    : ( ('0'..'9')+ '.') => ('0'..'9')+ '.' ('0'..'9')* (Exponent)?
      { $setType(DOUBLE); }
    | '.' ('0'..'9')+ (Exponent)?
      { $setType(DOUBLE); }
    | ( ('0'..'9')+ ('e' | 'E')) => ('0'..'9')+ Exponent
      { $setType(DOUBLE); }
    | ('0' | ( '1'..'9' ('0'..'9')* ))
      { $setType(INT); }
    ;
Is that a "good" method for dealing with this problem? I must also say that
even after reading that section and hacking together the above rules, I still
don't really understand how these predicate rules help ANTLR do its job. It
still must grab the characters one at a time. How do the predicates make the
task easier or, at the very least, unambiguous?
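As far as I can tell, a syntactic predicate amounts to a trial match: the
lexer remembers its position, tries the guarded pattern without committing,
rewinds, and only takes that alternative if the trial succeeded. Here is a
rough Python model of how I read the rule above. It is not what ANTLR
generates; the function name classify_constant and the regular expressions
are my own, and I am assuming Exponent is ('e'|'E') ('+'|'-')? ('0'..'9')+,
which is not shown in this thread.

```python
import re

DIGITS = r"[0-9]+"
# Assumed definition of Exponent; the real rule is not shown in this thread.
EXPONENT = r"[eE][+-]?[0-9]+"


def classify_constant(text, pos=0):
    """Return (token_type, lexeme) for the constant at pos, or None."""
    rest = text[pos:]

    # Alternative 1, guarded by ( ('0'..'9')+ '.' ) =>
    # Trial match only: nothing is consumed until the guard has succeeded.
    if re.match(DIGITS + r"\.", rest):
        m = re.match(DIGITS + r"\.[0-9]*(?:" + EXPONENT + r")?", rest)
        return ("DOUBLE", m.group(0))

    # Alternative 2 starts with '.', so one character of lookahead decides it.
    m = re.match(r"\.[0-9]+(?:" + EXPONENT + r")?", rest)
    if m:
        return ("DOUBLE", m.group(0))

    # Alternative 3, guarded by ( ('0'..'9')+ ('e'|'E') ) =>
    if re.match(DIGITS + r"[eE]", rest):
        m = re.match(DIGITS + EXPONENT, rest)
        if m is None:
            return None  # guard passed but the exponent is malformed, e.g. "1e"
        return ("DOUBLE", m.group(0))

    # Alternative 4: plain integer, '0' or a nonzero digit followed by digits.
    m = re.match(r"0|[1-9][0-9]*", rest)
    if m:
        return ("INT", m.group(0))
    return None


if __name__ == "__main__":
    for sample in ["1", "42", "1.5e3", "1e10", ".5"]:
        print(sample, classify_constant(sample))
```

If this reading is right, the predicates do not avoid reading characters one
at a time. They let the lexer scan past the arbitrarily long run of digits
that all three numeric alternatives share before it commits to one, which
fixed lookahead cannot do. The order of the alternatives then settles the
question of "1": it fails both guards and falls through to INT.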
--
--John Gruenenfelder Research Assistant, UMass Amherst student
Systems Manager, MKS Imaging Technology, LLC.
Try Weasel Reader for PalmOS -- http://gutenpalm.sf.net
"This is the most fun I've had without being drenched in the blood
of my enemies!"
--Sam of Sam & Max