[antlr-interest] Lexical nondeterminism

Gabriel Radu gabriel.adrian.radu at googlemail.com
Wed Jan 11 06:17:26 PST 2006


Hi all,


I am trying to write an ANTLR grammar and I am getting the following result:

ANTLR Parser Generator   Version 2.7.5 (20050128)   1989-2005 jGuru.com
ServiceCompiler.g: warning:lexical nondeterminism between rules
INT_or_FLOAT_or_MACADR_or_VERSIONSTRING and DEFAULT upon
AuvitranServiceCompiler.g:     k==1:'D','d'
AuvitranServiceCompiler.g:     k==2:'E','e'
AuvitranServiceCompiler.g:     k==3:'F','f'
AuvitranServiceCompiler.g:     k==4:'A','a'
AuvitranServiceCompiler.g:     k==5:'U','u'
AuvitranServiceCompiler.g:     k==6:'L','l'
AuvitranServiceCompiler.g:     k==7:'T','t'
AuvitranServiceCompiler.g:     k==8:<end-of-token>
AuvitranServiceCompiler.g:     k==9:<end-of-token>
AuvitranServiceCompiler.g:     k==10:<end-of-token>


The interesting parts of the lexer are:

//----------------------------------------------------------------------
// Lexer
//----------------------------------------------------------------------

class ServiceLexer extends Lexer;
options {
  k = 10;
}


//----------------------------------------------------------------------
// White space:

WS_
  : (' ' | '\t')
  { $setType(ANTLR_USE_NAMESPACE(antlr)Token::SKIP); }
;

NEWLINE
    : '\n'
    |	'\r'
    | "\r\n"
    | "\n\r"
;


//----------------------------------------------------------------------
// Chars:

NONTOCLIT
    :   'g'..'u' | 'x'..'z'
    |   'G'..'U' | 'X'..'Z'
;



//----------------------------------------------------------------------
// Numbers:

protected DIGIT
	:	'0'..'9'
;

protected HEXLIT
  : 'a'..'f' | 'A'..'F'
;

protected HEXDIG
  : (DIGIT | HEXLIT)
;

protected INT
  :	(HEXDIG)+
;

protected FLOAT
  : (DIGIT)+ DOT (DIGIT)+
;

protected MACADRSEPARATOR
  : DOT
;

protected MACADR
  :
    HEXDIG HEXDIG MACADRSEPARATOR
    HEXDIG HEXDIG MACADRSEPARATOR
    HEXDIG HEXDIG MACADRSEPARATOR
    HEXDIG HEXDIG MACADRSEPARATOR
    HEXDIG HEXDIG MACADRSEPARATOR
    HEXDIG HEXDIG
;



protected VERSIONSTRING_L
  : ( DIGIT )+ DOT ( DIGIT )+ DOT ( DIGIT )+ ('A'..'Z'|'a'..'z')?
;

protected VERSIONSTRING_S
  : ( DIGIT )+ DOT ( DIGIT )+ ('A'..'Z'|'a'..'z')
;

protected VERSIONSTRING : ;

INT_or_FLOAT_or_MACADR_or_VERSIONSTRING

   : ( DIGIT (DIGIT)? DOT DIGIT ( DIGIT (DIGIT)? )? DOT )
          => VERSIONSTRING_L { $setType( VERSIONSTRING ); }

   | ( DIGIT (DIGIT)? DOT DIGIT ( DIGIT (DIGIT)? )? ('A'..'Z'|'a'..'z') )
          => VERSIONSTRING_S { $setType( VERSIONSTRING ); }

   | ( ( DIGIT )+ DOT ) => FLOAT { $setType( FLOAT ); }

   | ( HEXDIG HEXDIG MACADRSEPARATOR ) => MACADR { $setType( MACADR ); }

   | ( ( DIGIT )+ ) => INT { $setType( INT ); }

;



//----------------------------------------------------------------------
// Punctuation:

DOT:    '.' ;

COMMA:	',' ;

COLON:	':' ;

SCOLON:	';' ;



//[ some more text]



//----------------------------------------------------------------------
DEFAULT:
    ('D' | 'd')
    ('E' | 'e')
    ('F' | 'f')
    ('A' | 'a')
    ('U' | 'u')
    ('L' | 'l')
    ('T' | 't')
;


The grammar compiles without the warning if I take out the

DEFAULT:
    ('D' | 'd')
    ('E' | 'e')
    ('F' | 'f')
    ('A' | 'a')
    ('U' | 'u')
    ('L' | 'l')
    ('T' | 't')
;

rule, or the

   | ( DIGIT (DIGIT)? DOT DIGIT ( DIGIT (DIGIT)? )? ('A'..'Z'|'a'..'z') )
          => VERSIONSTRING_S { $setType( VERSIONSTRING ); }

production from the INT_or_FLOAT_or_MACADR_or_VERSIONSTRING rule.


Does anyone know why I am getting the nondeterminism warning, and how
to solve the problem?
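
One possible explanation, offered as a sketch rather than a confirmed diagnosis: ANTLR 2 is documented as using linear approximate lookahead, which compares the set of characters each rule can see at each depth k separately instead of tracking whole character sequences. The Python model below is hypothetical (the names ALTS, depth_sets, linear_approx_accepts and some_path_spells are made up for illustration and are not part of ANTLR, and the '+' loops are capped at 7 repetitions). It hand-transcribes the alternatives of INT_or_FLOAT_or_MACADR_or_VERSIONSTRING and checks whether "DEFAULT" fits the per-depth sets even though no single alternative can spell it.

```python
from itertools import product
import string

K = 7  # depth needed to spell "DEFAULT"; the grammar itself uses k = 10

DIGIT = set(string.digits)
HEX = DIGIT | set("abcdefABCDEF")
LETTER = set(string.ascii_letters)
DOT = {"."}

# Hand transcription (an assumption, not ANTLR output) of the protected rules.
# Each item is (charset, min_repeats, max_repeats); '+' loops are capped at K.
ALTS = {
    "VERSIONSTRING_L": [(DIGIT, 1, K), (DOT, 1, 1), (DIGIT, 1, K),
                        (DOT, 1, 1), (DIGIT, 1, K), (LETTER, 0, 1)],
    "VERSIONSTRING_S": [(DIGIT, 1, K), (DOT, 1, 1), (DIGIT, 1, K), (LETTER, 1, 1)],
    "FLOAT": [(DIGIT, 1, K), (DOT, 1, 1), (DIGIT, 1, K)],
    "MACADR": [(HEX, 2, 2), (DOT, 1, 1)] * 5 + [(HEX, 2, 2)],
    "INT": [(HEX, 1, K)],
}


def expansions(items):
    """Yield every concrete sequence of charsets an alternative can produce."""
    ranges = [range(lo, hi + 1) for _, lo, hi in items]
    for reps in product(*ranges):
        yield [chars for (chars, _, _), n in zip(items, reps) for _ in range(n)]


def depth_sets(alts, k):
    """Union, per depth 1..k, of the characters any alternative can show there."""
    sets = [set() for _ in range(k)]
    for items in alts.values():
        for seq in expansions(items):
            for pos, chars in enumerate(seq[:k]):
                sets[pos] |= chars
    return sets


def linear_approx_accepts(word, sets):
    """True if each character fits its depth set, ignoring how depths combine."""
    return all(c in s for c, s in zip(word, sets))


def some_path_spells(word, alts):
    """True only if one real expansion can begin with the whole word."""
    return any(
        len(seq) >= len(word) and all(c in s for c, s in zip(word, seq))
        for items in alts.values()
        for seq in expansions(items)
    )


without_s = {n: a for n, a in ALTS.items() if n != "VERSIONSTRING_S"}

print(linear_approx_accepts("DEFAULT", depth_sets(ALTS, K)))       # True
print(some_path_spells("DEFAULT", ALTS))                           # False
print(linear_approx_accepts("DEFAULT", depth_sets(without_s, K)))  # False
```

If this model is right, 'D', 'E', 'F', 'A' come from the hex digits in INT and 'U', 'L', 'T' from the trailing letter of VERSIONSTRING_S, which would fit the observation that removing either DEFAULT or that production silences the warning.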


Kind regards,
Gabriel

