[antlr-interest] lexer nondeterministic warning

Ruben Trancoso rubentrancoso at gmail.com
Thu Mar 8 00:31:37 PST 2007


Hello,
I'm quite new to ANTLR and to parser theory in depth. I made a few
attempts with flex and bison years ago, but it was too hard for me at
the time, so I gave up.
Now I'm finally trying to learn with the help of ANTLR, which makes it
much easier to start playing with this stuff, and with the Dragon Book.
So please feel free to correct my misconceptions.

This simple lexer has me stuck. I'm playing with the recognition of
written text. In general, in a given language there is some uniformity
in the way a writer tells a story, especially if you have classics to
use as a reference (but that is another discussion). The issue here is
that I get a 'lexical nondeterminism' warning on the WORD rule when
dealing with SINGLE and COMPOUND words, as in 'context' and
'context-sensitivity'.

Since it is impossible to know which word will come first, there is no
reason to use lookahead. For the lexer, though, the ambiguity is
resolved as soon as it reaches a HYPHEN, so I don't understand why
ANTLR gives me that warning, given that the rule leads down two
different paths.
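
To make clear what I expect the lexer to do, here is a rough sketch in
Python (not ANTLR, and not what ANTLR generates). The function name
scan_word and the LOWER set are made up for illustration. It reads the
run of lowercase letters that both alternatives of WORD share, and only
then lets the hyphen decide between SINGLEWORD and COMPOUNDWORD:

```python
# Hypothetical helper for illustration only; this is not ANTLR output.
# LOWER mirrors the LOWERCHAR rule of the grammar.
LOWER = set("abcdefghijklmnopqrstuvwxyzçáàâãéêíóôõúü")


def scan_word(text, pos=0):
    """Return (kind, lexeme) for the word starting at text[pos], or None."""

    def lower_run(i):
        # Consume (LOWERCHAR)* from index i and return the index just past it.
        j = i
        while j < len(text) and text[j] in LOWER:
            j += 1
        return j

    end = lower_run(pos)
    if end == pos:
        return None  # no lowercase letter here, so no SINGLEWORD starts

    kind = "SINGLEWORD"
    # Both alternatives share the prefix consumed above; only a hyphen
    # followed by another lowercase run turns the word into a compound.
    while end < len(text) and text[end] == "-":
        nxt = lower_run(end + 1)
        if nxt == end + 1:
            break  # a hyphen with no word after it is left for HYPHEN
        kind = "COMPOUNDWORD"
        end = nxt
    return kind, text[pos:end]
```

With this sketch, scan_word("context") yields ("SINGLEWORD", "context")
and scan_word("context-sensitivity") yields
("COMPOUNDWORD", "context-sensitivity"). Note that the decision is made
after the common prefix has been read, not on the first character.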

test.g
header {
}

class Scanner extends Lexer;

WS
	: (	' '
	|	'\t'
	|	'\n'	{ newline(); }
	|	'\r' ('\n')?	{ newline(); }
	)
	{$setType(Token.SKIP);}
	;

CAPWORD
	: UPPERCHAR WORD
	;

WORD
	: SINGLEWORD
	| COMPOUNDWORD
	;

COMPOUNDWORD
	: SINGLEWORD (WORDPLUS)+
	;

SINGLEWORD
	: (LOWERCHAR)+
	;

WORDPLUS
	: HYPHEN SINGLEWORD
	;

UPPERCHAR
	: ( 'A'..'Z' | ('Á'|'À'|'Â'|'Ã'|'É'|'Ê'|'Í'|'Ó'|'Ô'|'Õ'|'Ú'|'Ü') )
	;

LOWERCHAR
	: ( 'a'..'z' | 'ç' | ('á'|'à'|'â'|'ã'|'é'|'ê'|'í'|'ó'|'ô'|'õ'|'ú'|'ü') )
	;

ENDSENTENCE	: DOT|DOTCOMMA|DOUBLEDOT|EXCLAMATION|INTERROGATION;

DOT   : '.';
DOTCOMMA : ';'	;
DOUBLEDOT : ':' ;
EXCLAMATION : '!' ;
INTERROGATION : '?' ;

COMMA	: ',';
QUOTATION : '"' ;
DIALOGMARK  : '\u0097' ;
HYPHEN	: '-';

-- 
Ruben

When a man does not believe in God,
it is not that he no longer believes in anything -
it is that he believes in anything.
(G. K. Chesterton)

