[antlr-interest] Re: Antlr noobie, nondeterminism abounds

lgcraymer lgc at mail1.jpl.nasa.gov
Sun May 9 12:06:53 PDT 2004


Ok, you've got the factoring--now synthesize some of the common
things, like NEWLINE instead of crlf.

The nondeterminism is on the (char8)*.  If, say, you have a rule

literals
    :
    ( literal )+
    ;

then the (char8)*, from the first match of literal, also matches the
LBRACE that starts the next literal.
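
To make the overlap concrete, here is a rough Python sketch of the k=1
check behind the "between alt 1 and exit branch of block" warning.  It is
only an illustration under assumptions: the token sets are a hand-picked
subset of char8, the EOF entry and the loop_conflicts helper are
hypothetical, and none of this is ANTLR's actual analysis code.

```python
# Token names come from the grammar in this thread; the sets are a
# hand-built, abbreviated illustration (assumption), not ANTLR output.

# Tokens that can start one more iteration of (char8)*.
# char8 covers every 8-bit character, so LBRACE is among them.
FIRST_CHAR8 = {"SP", "DQUOTE", "LPAREN", "RPAREN", "STAR", "DIGIT",
               "LBRACE", "RBRACE", "CR", "LF"}

# Tokens that can follow (char8)* when literal sits inside ( literal )+:
# the next literal begins with LBRACE.  EOF is assumed for the last literal.
FOLLOW_LOOP = {"LBRACE", "EOF"}


def loop_conflicts(first_of_body, follow_of_loop):
    """Return the tokens on which a k=1 parser cannot choose between
    staying in the loop (alt 1) and leaving it (the exit branch)."""
    return sorted(first_of_body & follow_of_loop)


conflicts = loop_conflicts(FIRST_CHAR8, FOLLOW_LOOP)
print(conflicts)
```

Any token in that intersection is one the parser may either consume as
another char8 or treat as the start of whatever follows the loop, which is
why a larger k does not help while char8 can match everything.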

Work your way through some of the examples in the ANTLR distribution.
That should help.

--Loring


--- In antlr-interest at yahoogroups.com, "WesSantee" <jws01 at t...> wrote:
> --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > You are trying to do too much in the lexer.  Consider factoring
> > things a bit:
> 
> Ah, OK.  That helped quite a bit.  I'm no longer getting the lexer
> nondeterminisms.  Thanks!
> 
> However (and I think this is because I just don't know enough about
> how antlr is operating), I'm now getting parser nondeterminisms, but
> this time I don't even understand what's wrong.  
> 
> It may have something to do with this rule I mentioned before:
> 
> > > 3) Create a lexer rule representing everything in CHAR *except*
> > > '\r' and '\n'.
> 
> That example was, unfortunately, just an example.  In the case of the
> grammar I'm trying to create there are a lot of these <any SET except
> some SUBSET> rules.  By the time I factored everything out, to
> represent an 8-bit char from \u0001..\u00FF, I have the following:
> 
> char8
> : ASCII_x01_TO_x09 | CR | ASCII_x0B_TO_x0C | LF 
> | ASCII_x0E_TO_x1F | SP | ASCII_x21 | DQUOTE 
> | ASCII_x23_TO_x24 | PERCENT | ASCII_x26_TO_x27 | LPAREN 
> | RPAREN | STAR | ASCII_x2B_TO_x2F | DIGIT | ASCII_x3A_TO_x5B 
> | BACKSLASH | RBRACKET | ASCII_x5E_TO_x7A | LBRACE | ASCII_x7C 
> | RBRACE | ASCII_x7E_TO_x7F | ASCII_x80_TO_xFF
> ;
> 
> *Whew*. Then to implement the subsets, I set the charVocabulary to
> \1..\377 and use the ~ like you suggested:
> 
> // Any 7-bit char except CR and LF
> text_char: ~(CR | LF | ASCII_x80_TO_xFF);
> 
> So far so good, but here's the problem.  I've got a parser rule that
> looks like this:
> 
> crlf:  CR LF;
> number:  (DIGIT)+;
> literal:  LBRACE number RBRACE crlf (char8)*;
> 
> When I run cantlr on it, I get the following:
> 
> test.g:77: warning:nondeterminism upon
> test.g:77:     k==1:SP,RPAREN
> test.g:77:     between alt 1 and exit branch of block
> 
> Increasing k just gives more warnings for each level of k. 
> Unfortunately I have no idea what that means, or how to go about
> getting rid of it.  Obviously it can't determine something at k==1,
> but like I said, increasing k doesn't solve the problem, so I'm
> stumped.  Any ideas?
> 
> Cheers,
> -Wes



 