[antlr-interest] lexical nondeterminism
John B. Brodie
jbb at acm.org
Tue Aug 22 13:02:18 PDT 2006
>follow up the previous email, I changed the rules abit as shown:
>=========================================================================
>protected ANYSTRING : (~('\n'|'\r'))* ('\n'|'\r');
>protected WS : ( ' ' | '\t' );
>
>PROPERTYNAME : '%' ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|
>SPECIALCHAR)* ;
>COMMENT : "//" ANYSTRING;
>ABSTRACT : ("ABSTRACT" (WS)+) => ("ABSTRACT" (WS)+) ANYSTRING
> | ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|SPECIALCHAR)*
>{ $setType(VARIABLE_NAME); } ;
>=========================================================================
>
>then I got the following warning message:
>
>1 lexical nondeterminism upon k==1:'\t',' ' k==2:'\u0003'..'\u00ff'
>k==3:<end-of-token>,'\u0003'..'\u00ff' between alt 1 and exit branch of
>block
>
>
>anyone can help?
The problem is with the (WS)+ phrase before the ANYSTRING.
Consider input in which "ABSTRACT" is followed by two blanks: the
second blank could either be part of the (WS)+ or be the first
character of the ANYSTRING, hence the non-determinism.
I assume you want ANYSTRING to start with the first non-blank
character, so just add ~(' '|'\t') to the front of the ANYSTRING rule
(and, of course, adjust any other rule that uses ANYSTRING).
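To make the ambiguity concrete, here is a small sketch in plain Python
(not ANTLR, and not how ANTLR analyses a grammar) that counts how many
ways the text following "ABSTRACT" can be split into (WS)+ and
ANYSTRING. The helper names anystring_matches and count_splits are made
up for this illustration; the two rule shapes are modelled by hand from
the grammars in this thread.

```python
WS = " \t"    # characters matched by the WS rule
EOL = "\n\r"  # characters that terminate ANYSTRING


def anystring_matches(s, require_nonblank_start):
    """Hand-written model of ANYSTRING matching the whole of s."""
    if require_nonblank_start:
        # adjusted rule: ~(' '|'\t') (~('\n'|'\r'))* ('\n'|'\r')
        if not s or s[0] in WS:
            return False
        s = s[1:]
    # original rule: (~('\n'|'\r'))* ('\n'|'\r')
    return len(s) >= 1 and s[-1] in EOL and all(c not in EOL for c in s[:-1])


def count_splits(rest, require_nonblank_start):
    """Count the ways to split rest into (WS)+ followed by ANYSTRING."""
    count = 0
    for i in range(1, len(rest)):
        if rest[i - 1] not in WS:
            break  # rest[:i] is no longer made of WS characters only
        if anystring_matches(rest[i:], require_nonblank_start):
            count += 1
    return count


# "rest" is what follows the keyword: two blanks, an 'x', a newline.
print(count_splits("  x\n", require_nonblank_start=False))
print(count_splits("  x\n", require_nonblank_start=True))
```

With the original rule the second blank can go to either side, so there
is more than one split; with the adjusted rule only one split remains.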
Also, you do not really need the predicate, since a fixed-size
lookahead is enough here: k=9 will distinguish "ABSTRACT " from
"ABSTRACTION". I always work really hard to avoid predicates because
they involve backtracking, with the possibility of scanning the input
text multiple times.
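As a rough illustration of why nine characters of lookahead are enough,
the following plain-Python sketch makes the same decision by inspecting
a fixed window, without ever backing up. It is only a model of the idea,
not ANTLR's generated code; K and predicts_abstract are names invented
for this example.

```python
K = 9  # lookahead depth, matching the k = 9 lexer option


def predicts_abstract(text):
    """Decide from the first K characters alone whether ABSTRACT applies."""
    window = text[:K]
    return (
        len(window) == K
        and window[:8] == "ABSTRACT"
        and window[8] in " \t"  # the 9th character must be a WS character
    )


for sample in ("ABSTRACT some text\n", "ABSTRACTION", "ABSTRACT"):
    print(repr(sample), predicts_abstract(sample))
```

The 9th character is the first place where "ABSTRACT " and "ABSTRACTION"
differ, which is why a shallower window could not tell them apart.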
Anyway, here is a lexer that gets no complaints from the antlr.Tool
(did not actually try to test it any further):
//=========================================================================
class L extends Lexer;
options {
    k = 9;
    charVocabulary = '\3'..'\377';
}
protected SPECIALCHAR : '_';
protected ANYSTRING : ~(' '|'\t') (~('\n'|'\r'))* ('\n'|'\r');
protected WS : ( ' ' | '\t' );
protected NAME
    : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|SPECIALCHAR)* ;
PROPERTYNAME : '%' NAME ;
COMMENT : "//" (WS)+ ANYSTRING;
ABSTRACT : "ABSTRACT" (WS)+ ANYSTRING ;
VARIABLE_NAME : NAME;
//=========================================================================
Hope this helps...
-jbb