[antlr-interest] nondeterminism warning

thereisnofreeid <chantal.ackermann at web.de> chantal.ackermann at web.de
Thu Jan 16 02:21:50 PST 2003


Hello all,

I am very new to parser generators and to ANTLR. I am trying to get my
first lexer to compile.

The lexer should recognize expressions for a search query:

- Phrases: enclosed in double quotes (like "one phrase"); any white space
inside a phrase shall be escaped (like "one\\ phrase").
- Boolean operators: "AND", "OR", "NOT"
- Single words (anything except "AND", "OR", "NOT" *sigh*)

Outside a phrase, white space shall be ignored.
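
To make the requirements above concrete, here is a minimal hand-written
sketch in plain Java (not ANTLR output). The class name QueryTokensSketch
and the method tokenize are hypothetical, and the sketch assumes the
escaped form is a single backslash followed by one space, as produced by
the $append("\\ ") action in the grammar below. Unlike the LETTER rule, it
treats any character other than white space or a double quote as part of
a word, and it does not yet distinguish the boolean operators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reference tokenizer; it only illustrates the intended
// output and is not what ANTLR would generate.
public class QueryTokensSketch {

    // Returns phrases (white space escaped as "\ ") and bare words;
    // white space outside phrases is dropped.
    public static List<String> tokenize(String query) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            char c = query.charAt(i);
            if (c == ' ' || c == '\t') {
                i++; // white space outside a phrase is ignored
            } else if (c == '"') {
                int end = query.indexOf('"', i + 1);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated phrase at " + i);
                }
                // Assumption: leading/trailing white space in a phrase is dropped.
                String body = query.substring(i + 1, end).trim();
                // Each run of blanks/tabs becomes one escaped space.
                tokens.add(body.replaceAll("[ \\t]+", "\\\\ "));
                i = end + 1;
            } else {
                int start = i;
                while (i < query.length() && " \t\"".indexOf(query.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(query.substring(start, i));
            }
        }
        return tokens;
    }
}
```

For the input foo "one  phrase " bar this yields three tokens: foo, then
one\ phrase, then bar.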

This is my current lexer. I have not been able to improve it any further
to get rid of the warnings:

/******************** LEXER **************************/

class QueryLexer extends Lexer;

options
{
	charVocabulary = '\3'..'\377';
	k=3;
}

{
	private boolean isPhrase = false;
}

TERM
	:	PHRASE
	|	( AND ) => { $setType(Token.AND); }
	|	WORD
	|	WS { $setType(Token.SKIP); }
	|	{ System.out.println("error: " + $getText()); }
	;

AND
	:	{ this.isPhrase == false }? "AND"
	;

OR
	:	{ this.isPhrase == false }? "OR"
	;

NOT
	:	{ this.isPhrase == false }? "NOT"
	;

protected PHRASE
	:	'"'! { this.isPhrase = true; }
		WORD ( WS! { $append("\\ "); } WORD )* (WS!)?
		'"'! { this.isPhrase = false; }
	;

protected WORD
	:	(LETTER)+
	;

protected WS
	:	(' ' | '\t')+
	;

protected LETTER
    :   '\u0024' |
        '\u0041'..'\u005a' |
        '\u005f' |
        '\u0061'..'\u007a' |
        '\u00c0'..'\u00d6' |
        '\u00d8'..'\u00f6' |
        '\u00f8'..'\u00ff' |
        '\u0100'..'\u1fff' |
        '\u3040'..'\u318f' |
        '\u3300'..'\u337f' |
        '\u3400'..'\u3d2d' |
        '\u4e00'..'\u9fff' |
        '\uf900'..'\ufaff'
    ;

/***************** LEXER END **********************/

I get these warnings:

antlr:
    [antlr] ANTLR Parser Generator   Version 2.7.2rc2 (20030105)  1989-2003 jGuru.com
    [antlr] QueryParser.g: warning:lexical nondeterminism between rules TERM and AND upon
    [antlr] QueryParser.g:     k==1:'A'
    [antlr] QueryParser.g:     k==2:'N'
    [antlr] QueryParser.g:     k==3:'D'
    [antlr] QueryParser.g: warning:lexical nondeterminism between rules TERM and OR upon
    [antlr] QueryParser.g:     k==1:'O'
    [antlr] QueryParser.g:     k==2:'R'
    [antlr] QueryParser.g:     k==3:<end-of-token>
    [antlr] QueryParser.g: warning:lexical nondeterminism between rules TERM and NOT upon
    [antlr] QueryParser.g:     k==1:'N'
    [antlr] QueryParser.g:     k==2:'O'
    [antlr] QueryParser.g:     k==3:'T'
    [antlr] warning: public lexical rule TERM is optional (can match "nothing")
    [antlr] QueryParser.g:75: warning:lexical nondeterminism upon
    [antlr] QueryParser.g:75:     k==1:'\t',' '
    [antlr] QueryParser.g:75:     k==2:'\t',' '
    [antlr] QueryParser.g:75:     k==3:'\t',' ','"'
    [antlr] QueryParser.g:75:     between alt 1 and exit branch of block

+++++++++++++++++++++++

I changed k to 3 in the hope that it would resolve the nondeterminism,
but that changes basically nothing: it only adds the k==2 and k==3 lines
to the warning output.

I do understand that "AND", "OR", and "NOT" can also match as WORD, but
I'm not able to tell ANTLR to first try to match AND, OR, NOT and only
then WORD. I tried syntactic predicates in different places, but that
didn't change anything.
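
One way to picture the "operators first, then WORD" behaviour is to match
every candidate as a word and only afterwards look it up in a keyword
table. The sketch below is plain Java, not ANTLR; the names
KeywordLookupSketch, Type, and classify are hypothetical, and the
insidePhrase flag stands in for the isPhrase field of the lexer above.
ANTLR 2 documents a testLiterals lexer option that appears to follow the
same idea; whether it suits this grammar is untested here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: decides the token type after a word was matched.
public class KeywordLookupSketch {

    // Hypothetical token types for this illustration only.
    public enum Type { AND, OR, NOT, WORD }

    private static final Map<String, Type> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("AND", Type.AND);
        KEYWORDS.put("OR", Type.OR);
        KEYWORDS.put("NOT", Type.NOT);
    }

    // Inside a phrase everything is a plain word; outside, an exact
    // (case-sensitive) keyword match wins over WORD.
    public static Type classify(String word, boolean insidePhrase) {
        if (insidePhrase) {
            return Type.WORD;
        }
        return KEYWORDS.getOrDefault(word, Type.WORD);
    }
}
```

With this ordering "AND" outside a phrase is classified as AND, while
"ANDROID" or an "AND" inside a phrase stays a WORD, because only a whole
matched word is ever compared against the table.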

The last warning is annoying: where is the error in the WS rule? I
can't find anything wrong with it.

I would greatly appreciate any hint, tip, suggestion, solution...!

regards,
Chantal


 
