[antlr-interest] newbie question, nondeterminism, syntactic predicate

Thomas Brandon tbrandonau at gmail.com
Wed Jun 13 06:27:08 PDT 2007


On 6/13/07, Gatis Avots <gatis.avots at inbox.lv> wrote:
>
> Hello!
>
>
> This is my first post to this mailing list, so please do not be too harsh
> if my question seems really dumb to somebody. ;)
>
> I have a simplified grammar (see below). ANTLR (v2.7) generates:
> ***********************************************************
> D:\antlr\277rc1\bin\learn.g: warning:lexical nondeterminism between rules IDENT and ANY_NUMBER upon
> D:\antlr\277rc1\bin\learn.g:     k==1:'-','0'..'9'
> D:\antlr\277rc1\bin\learn.g:     k==2:<end-of-token>,'0'..'9'
> Press any key to continue . . .
> ***********************************************************
>
> Questions:
> 1) Why is that? Shouldn't a lookahead of k=2 resolve this?


I think this is due to the linear approximate lookahead used in ANTLR 2,
which keeps one set of characters per lookahead depth rather than full
k-character sequences. See
http://www.antlr.org/doc/glossary.html#Linear_approximate_lookahead for an
explanation.
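
As a small illustration of what linear approximate lookahead means, here is
a hypothetical parser rule. It is unrelated to the grammar in question, and
the rule name and the tokens A, B, C, D are made up for the example.

```antlr
class ApproxDemoParser extends Parser;
options { k=2; }

// Full LL(2) compares whole sequences:
//   alternative 1 accepts {A B, C D}, alternative 2 accepts {A D},
//   and the two sets are disjoint.
// Linear approximate lookahead keeps one set per depth instead:
//   alternative 1: depth 1 = {A, C}, depth 2 = {B, D}
//   alternative 2: depth 1 = {A},    depth 2 = {D}
// The pair (A, D) fits both, so ANTLR 2 reports a nondeterminism.
a   :   ( A B | C D )
    |   A D
    ;
```

The separate k==1 and k==2 lines in the warning quoted above are per-depth
sets of this kind.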

> 2) If not the lookahead, then at least the syntactic predicate ( (MINUS
> ('0'..'9')) => ANY_NUMBER ( {setType(ANY_NUMBER);}) ) ? Right now it seems
> there is no difference if I use this syntactic predicate or not.


I think this is because ANTLR 2 doesn't do predicate hoisting. ANTLR
generates a nextToken method that combines all non-protected lexer rules
as alternatives, so here you get (IDENT|ANY_NUMBER). Because the predicate
inside IDENT is not hoisted up into that decision, the choice between the
two rules is still ambiguous.
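
Roughly, the decision ANTLR 2 has to make for this lexer looks like the
following. This is a conceptual sketch only, written in grammar notation;
it is not something you put in the grammar file.

```antlr
// Conceptual view only: ANTLR 2 builds this decision itself inside
// nextToken(); it is not a rule you write in the grammar.
nextToken
    :   IDENT        // its first alternative calls ANY_NUMBER, so it can match "-5"
    |   ANY_NUMBER   // matches "-5" directly
    ;
```

Since both alternatives can match the same text (for example "-5"), no
lookahead depth can tell them apart at this level. The predicate sits
inside IDENT and is only consulted after IDENT has already been chosen.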

> 3) How can I fix this so that the lexer returns a token of type IDENT
> (starting optionally with '-') or a token of type ANY_NUMBER (starting
> optionally with '-')?


You need to make the ANY_NUMBER rule protected so that it is not added
directly to the nextToken method; then it should work.
Alternatively, ANTLR 3 does not use linear approximate lookahead, so this
should work fine there (you don't even need the part of IDENT that deals
with ANY_NUMBER). So unless you have a reason to use 2.7 instead of 3, you
may be better off upgrading.
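
A minimal, untested sketch of the first option for ANTLR 2.7 follows. Two
things in it are my assumptions rather than part of your grammar. First,
the predicate is widened to ((MINUS)? ('0'..'9')): once ANY_NUMBER is
protected, IDENT is the only rule reachable from nextToken, so an unsigned
number such as 5 has to get through IDENT as well. Second, the action uses
$setType, which is the form the ANTLR 2 documentation gives for changing
the token type in a lexer action.

```antlr
class MyParser extends Parser;
page: ANY_NUMBER | IDENT;

class MyLexer extends Lexer;
options { k=2; }

// The only non-protected rule, so nextToken has a single alternative.
IDENT
    :   ((MINUS)? ('0'..'9')) => ANY_NUMBER { $setType(ANY_NUMBER); }
    |   (MINUS)? ('a'..'z')+
    ;

// Protected: callable from IDENT, but not offered directly by nextToken.
protected
ANY_NUMBER
    :   (MINUS)? ('0'..'9')+
    ;

protected
MINUS
    :   '-'
    ;
```

The parser rule stays as it was: a protected lexer rule still has a token
type, so page can keep referring to ANY_NUMBER.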

> Grammar:
> ***********************************************************
> class MyParser extends Parser;
> page: ANY_NUMBER | IDENT;
>
> class MyLexer extends Lexer;
> options {k=2;}
>
> IDENT:
>    (MINUS ('0'..'9')) => ANY_NUMBER ( {setType(ANY_NUMBER);})
>    |(MINUS)? ('a'..'z')+
>   ;
>
> ANY_NUMBER:
>     (MINUS)? ('0'..'9')+
> ;
>
> protected
> MINUS       :  '-';
> ***********************************************************
>
> Any help will be much appreciated, thank you!
>
> Kind regards,
> Gatis