[antlr-interest] Re: Antlr noobie, nondeterminism abounds
lgcraymer
lgc at mail1.jpl.nasa.gov
Sun May 9 12:06:53 PDT 2004
OK, you've got the factoring. Now synthesize some of the common
things in the lexer, like a NEWLINE token instead of the crlf rule.
The nondeterminism is on the (char8)*. If, say, you have a rule

    literals
        : ( literal )+
        ;

then the (char8)* at the end of the first literal also matches the
LBRACE that starts the next literal, so the parser cannot tell whether
to stay in the (char8)* loop or exit it.
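
To make the conflict concrete, here is a minimal Python sketch (not ANTLR
itself) that intersects the set of tokens able to continue the (char8)*
loop with the set able to follow it. The token names come from the quoted
grammar, but CHAR8 is deliberately abbreviated and loop_conflicts is a
hypothetical helper, so treat this as an illustration rather than a
description of how ANTLR computes lookahead.

```python
# Abbreviated, illustrative subset of the tokens accepted by char8.
CHAR8 = {"SP", "DIGIT", "LPAREN", "RPAREN", "LBRACE", "RBRACE", "CR", "LF"}

# literal: LBRACE number RBRACE crlf (char8)*
# so a literal can only start with LBRACE.
FIRST_LITERAL = {"LBRACE"}


def loop_conflicts(loop_first, follow):
    """Return the tokens on which a one-token-lookahead parser cannot
    choose between another loop iteration and leaving the loop."""
    return loop_first & follow


# Inside literals: ( literal )+ the thing that can follow one literal
# is the start of the next literal.
conflicts = loop_conflicts(CHAR8, FIRST_LITERAL)
print(sorted(conflicts))  # prints ['LBRACE']
```

Because (char8)* can match a token sequence of any length, the same overlap
would presumably show up at every lookahead depth, which is consistent with
raising k only producing more warnings.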
Work your way through some of the examples in the ANTLR distribution.
That should help.
--Loring
--- In antlr-interest at yahoogroups.com, "WesSantee" <jws01 at t...> wrote:
> --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > You are trying to do too much in the lexer. Consider factoring things
> > a bit:
>
> Ah, OK. That helped quite a bit. I'm no longer getting the lexer
> nondeterminisms. Thanks!
>
> However (and I think this is because I just don't know enough about
> how antlr is operating), I'm now getting parser nondeterminisms, but
> this time I don't even understand what's wrong.
>
> It may have something to do with this rule I mentioned before:
>
> > > 3) Create a lexer rule representing everything in CHAR *except*
> > > '\r' and '\n'.
>
> That example was, unfortunately, just an example. In the case of the
> grammar I'm trying to create there are a lot of these <any SET except
> some SUBSET> rules. By the time I factored everything out, to
> represent an 8-bit char from \u0001..\u00FF, I have the following:
>
> char8
> : ASCII_x01_TO_x09 | CR | ASCII_x0B_TO_x0C | LF
> | ASCII_x0E_TO_x1F | SP | ASCII_x21 | DQUOTE
> | ASCII_x23_TO_x24 | PERCENT | ASCII_x26_TO_x27 | LPAREN
> | RPAREN | STAR | ASCII_x2B_TO_x2F | DIGIT | ASCII_x3A_TO_x5B
> | BACKSLASH | RBRACKET | ASCII_x5E_TO_x7A | LBRACE | ASCII_x7C
> | RBRACE | ASCII_x7E_TO_x7F | ASCII_x80_TO_xFF
> ;
>
> *Whew*. Then to implement the subsets, I set the charVocabulary to
> \1..\377 and use the ~ like you suggested:
>
> // Any 7-bit char except CR and LF
> text_char: ~(CR | LF | ASCII_x80_TO_xFF);
>
> So far so good, but here's the problem. I've got a parser rule that
> looks like this:
>
> crlf: CR LF;
> number: (DIGIT)+;
> literal: LBRACE number RBRACE crlf (char8)*;
>
> When I run cantlr on it, I get the following:
>
> test.g:77: warning:nondeterminism upon
> test.g:77: k==1:SP,RPAREN
> test.g:77: between alt 1 and exit branch of block
>
> Increasing k just gives more warnings for each level of k.
> Unfortunately I have no idea what that means, or how to go about
> getting rid of it. Obviously it can't determine something at k==1,
> but like I said, increasing k doesn't solve the problem, so I'm
> stumped. Any ideas?
>
> Cheers,
> -Wes