[antlr-interest] understanding lexical nondeterminism warnings

Alan <alan at oursland.net>
Sun Feb 2 20:59:29 PST 2003


I am having some trouble understanding how ANTLR's lexical analysis 
works. I hope someone can set me straight.

At the bottom of this message is a simplified grammar I am writing.
I am getting the following "lexical nondeterminism" warnings:
	ANTLR Parser Generator   Version 2.7.2   1989-2003 jGuru.com
	simplescheme.g: warning:lexical nondeterminism between rules IDENTIFIER and NUMBER upon
	simplescheme.g:     k==1:'+','-','.'
	simplescheme.g:     k==2:'.','0'..'9'
	simplescheme.g:     k==3:<end-of-token>,'.','0'..'9','d'..'f','l','s'
	simplescheme.g:     k==4:<end-of-token>,'+','-','.','0'..'9','d'..'f','l','s'
	simplescheme.g:     k==5:<end-of-token>,'+','-','.','0'..'9','d'..'f','l','s'

Consider the string "+.9".
I can see how this could be interpreted two ways:
	IDENTIFIER NUMBER => (PECULIAR_IDENTIFIER) ('.' DIGIT)
or
	NUMBER => (SIGN '.' DIGIT)
My understanding is that the lexer looks for the longest matching token, 
so I would expect it to select the second option (which it in fact does).
If that is the case, I don't understand why the warning is displayed.

The SILLY tokens (adapted from Ashley Mills's tutorial) should have 
the same problem.
"AB" could be tokenized as:
	SILLY3 SILLY4
or
	SILLY1
Again, the second option is the one returned. However, no warning is 
displayed for these tokens.
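
To make the two readings of each string concrete, here is a minimal 
Python sketch (not ANTLR, and not how ANTLR decides anything). It 
assumes the regular expressions below are a faithful hand translation 
of the rules in the grammar at the bottom of this message; the names 
RULES and tokenizations are made up for illustration. It simply lists 
every way a string can be split into tokens under those rules.

```python
import re

# Hand-translated approximations of the grammar rules (an assumption,
# not generated by ANTLR).
# UREAL10: (DIGIT)+ '.' (DIGIT)* (SUFFIX)?  |  '.' (DIGIT)+ (SUFFIX)?
# SUFFIX:  EXPONENT SIGN (DIGIT)+, where SIGN is an optional '+' or '-'
UREAL10 = r"(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[efsdl][+-]?[0-9]+)?"

RULES = [
    ("SILLY1", re.compile(r"AB")),
    ("SILLY2", re.compile(r"AC")),
    ("SILLY3", re.compile(r"A")),
    ("SILLY4", re.compile(r"B")),
    # IDENTIFIER: LETTER (LETTER | DIGIT | SPECIAL)* | PECULIAR
    ("IDENTIFIER", re.compile(r"[a-z][a-z0-9+\-.@]*|\+|-|\.\.\.")),
    # NUMBER: SIGN UREAL10
    ("NUMBER", re.compile(r"[+-]?" + UREAL10)),
]


def tokenizations(text):
    """Return every way to split text into a sequence of (rule, lexeme) pairs."""
    if not text:
        return [[]]  # one way to tokenize the empty string: no tokens
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        for name, pattern in RULES:
            if pattern.fullmatch(prefix):
                # Combine this token with every tokenization of the remainder.
                for rest in tokenizations(text[end:]):
                    results.append([(name, prefix)] + rest)
    return results


if __name__ == "__main__":
    for sample in ("+.9", "AB"):
        print(sample)
        for option in tokenizations(sample):
            print("   ", option)
```

Under these assumed translations, both "+.9" and "AB" come out with 
exactly two readings each, so by this measure the two cases look 
equally ambiguous, which is why the difference in warnings puzzles me.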

What is going on here? Is there any way I can clear the warnings 
(without just hiding them)?

Alan

=====================================================================
class SimpleLexer extends Lexer;
options { k=5; }

SILLY1:             "AB";
SILLY2:             "AC";
SILLY3:             "A";
SILLY4:             "B";

IDENTIFIER:         LETTER (LETTER | DIGIT | SPECIAL)*
    |               PECULIAR
    ;
NUMBER:             SIGN UREAL10;

protected UREAL10:  (DIGIT)+ '.' (DIGIT)* (SUFFIX)?
    |               '.' (DIGIT)+ (SUFFIX)?
    ;
protected SUFFIX:   EXPONENT SIGN (DIGIT)+;
protected EXPONENT: 'e' | 'f' | 's' | 'd' | 'l';
protected LETTER:   'a'..'z';
protected DIGIT:    '0'..'9';
protected SIGN:     ('+' | '-')?;
protected SPECIAL:  '+' | '-' | '.' | '@';
protected PECULIAR: '+' | '-' | "...";



 
