[antlr-interest] newbie question, nondeterminism, syntactic predicate

Robin Davies rerdavies at rogers.com
Wed Jun 13 06:14:59 PDT 2007


I believe that lexer rules don't use LL(*) rules: they use matching rules 
similar to regular expressions. And I believe that the k=2 option affects 
parser rules, but not lexer rules.
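
Purely as an illustration of the regular-expression view (it says nothing 
about how ANTLR itself predicts alternatives), the two token shapes from the 
question can be written as java.util.regex patterns. The class name 
TokenShapes and the method classify are made up for this sketch.

```java
import java.util.regex.Pattern;

final class TokenShapes {
    // Hypothetical regex equivalents of the two lexer rules in the question.
    static final Pattern IDENT = Pattern.compile("-?[a-z]+");
    static final Pattern ANY_NUMBER = Pattern.compile("-?[0-9]+");

    // Returns the name of the token shape that matches the whole text.
    static String classify(String text) {
        if (ANY_NUMBER.matcher(text).matches()) {
            return "ANY_NUMBER";
        }
        if (IDENT.matcher(text).matches()) {
            return "IDENT";
        }
        return "NO_MATCH";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc", "-abc", "42", "-42", "-"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

Both patterns may start with '-', so the first character alone cannot tell 
them apart; the character after the optional '-' can.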

The following should do what you want:

IDENT: (MINUS)? ('a'..'z')+;

ANY_NUMBER: (MINUS)? ('0'..'9')+;

or (if you have ulterior motives):
----------
// near the top...
tokens
{  ANY_NUMBER; }

...
IDENT:
   (MINUS)? ('0'..'9')+ ( {setType(ANY_NUMBER);})
   |(MINUS)? ('a'..'z')+
  ;

-----
or
----

IDENT
   :    ANY_NUMBER ( {setType(ANY_NUMBER);})
   |   (MINUS)? ('a'..'z')+
  ;

protected   // a match of this rule does NOT generate a token
            // (ANTLR 2.x keyword; ANTLR 3 calls this "fragment").
ANY_NUMBER:    // not sure if you can re-use this token name. If not, use
               // tokens { } with a different name.
    (MINUS)? ('0'..'9')+
;
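
The "match in one rule, then change the token type" idea in the last two 
variants can be sketched as a small hand-written scanner. This is only an 
assumption-laden illustration, not the code ANTLR generates; the class 
RetypeSketch, the method tokenType and the numeric type constants are all 
hypothetical.

```java
final class RetypeSketch {
    // Hypothetical token type constants for the sketch.
    static final int IDENT = 1;
    static final int ANY_NUMBER = 2;
    static final int NO_TOKEN = -1;

    // Scans the whole input as one token and returns its type.
    static int tokenType(String input) {
        int pos = 0;
        int type = IDENT; // the single rule starts out as IDENT

        // (MINUS)? : consume an optional leading '-'
        if (pos < input.length() && input.charAt(pos) == '-') {
            pos++;
        }
        if (pos >= input.length()) {
            return NO_TOKEN; // empty input or a lone '-'
        }

        char c = input.charAt(pos);
        if (c >= '0' && c <= '9') {
            // Digits follow: retype the token, analogous to setType(ANY_NUMBER).
            type = ANY_NUMBER;
            while (pos < input.length()
                    && input.charAt(pos) >= '0' && input.charAt(pos) <= '9') {
                pos++;
            }
        } else if (c >= 'a' && c <= 'z') {
            // Letters follow: the token stays an IDENT.
            while (pos < input.length()
                    && input.charAt(pos) >= 'a' && input.charAt(pos) <= 'z') {
                pos++;
            }
        } else {
            return NO_TOKEN;
        }

        // Accept only if the whole input was consumed.
        return pos == input.length() ? type : NO_TOKEN;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc", "-abc", "42", "-42", "-"}) {
            System.out.println(s + " -> " + tokenType(s));
        }
    }
}
```

The decision is made on the character after the optional '-', which is why a 
single rule with two alternatives avoids the conflict between two separate 
top-level rules.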


> This is my first post to this mailing list. So please do not be too harsh
> if my question might seem real dumb for somebody. ;)
>
> I have a simplified grammar (see below). Antlr (v 2.7) generates:
> ***********************************************************
> D:\antlr\277rc1\bin\learn.g: warning:lexical nondeterminism between rules IDENT and ANY_NUMBER upon
> D:\antlr\277rc1\bin\learn.g:     k==1:'-','0'..'9'
> D:\antlr\277rc1\bin\learn.g:     k==2:<end-of-token>,'0'..'9'
> Press any key to continue . . .
> ***********************************************************
>
> Questions:
> 1) Why is that...? Should not the lookahead of k=2 solve this issue?
>
> 2) If not the lookahead, then at least the syntactic predicate ( (MINUS
> ('0'..'9')) => ANY_NUMBER ( {setType(ANY_NUMBER);}) ) ? Right now it seems
> there is no difference if I use this syntactic predicate or not.
>
> 3) How can I fix this so that lexer returns token of type IDENT (starting
> optionally with '-') or token ANY_NUMBER (starting optionally with '-')
>
> Grammar:
> ***********************************************************
> class MyParser extends Parser;
> page: ANY_NUMBER | IDENT;
>
> class MyLexer extends Lexer;
> options {k=2;}
>
> IDENT:
>   (MINUS ('0'..'9')) => ANY_NUMBER ( {setType(ANY_NUMBER);})
>   |(MINUS)? ('a'..'z')+
>  ;
>
> ANY_NUMBER:
>    (MINUS)? ('0'..'9')+
> ;
>
> protected
> MINUS       :  '-';
> ***********************************************************
>
> Any help will be much appreciated, thank you!
>
> Kind regards,
> Gatis
> 


