[antlr-interest] Re: The eternal question: Why is this ANTLR grammar ambiguous??
lgcraymer
lgc at mail1.jpl.nasa.gov
Fri Dec 3 12:54:27 PST 2004
--- In antlr-interest at yahoogroups.com, Harald M. Müller
<harald.m.mueller at b...> wrote:
>
>
> [I have posted this to the pccts newsgroup; and on the antlr forum
> on jguru - I'm sorry if this is "against the rules", but those two
> forums seem to have quite low answer rates, and I have found this
> mailing list only now]
>
> I have already written a few ANTLR lexers and parsers (for
> production systems, with sometimes weird grammars) - but the
> following puzzles me:
> Why does the following simple grammar produce an ambiguity warning?
>
>
> class MyLexer extends Lexer;
>
> options {
> charVocabulary = '\3'..'\377';
> k=3;
> caseSensitive=false;
> }
>
> COMMENT_COMMAND
> : '-' '-' '$'
> ;
>
> SQL_STATEMENT
> : ( SYMBOL )+ ';'
> ;
>
> protected SYMBOL
> : ('-' (~ '-')) => '-' //1
> | 'a'
> | '"' (~ '"')* '"' //2
> ;
>
> When I remove either of //1 or //2, the ambiguity goes away.
> When I replace //2 with
>
> | '"' (~ ('"'|'$'))* '"'
>
> - i.e., I also exclude $ inside the "string" -, the ambiguity also
> goes away.
> But why would a $ after(!!) a " lead to an ambiguity?????
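This is an approximate LL(k) issue. There isn't really an ambiguity,
but ANTLR sees --$ as matchable by SYMBOL, because for lookahead
purposes SYMBOL is matched as ( 'a' | ( ('-' | '"') (~'-' | ~'"') ...
That is, the approximation keeps one character set per lookahead depth
instead of the exact character sequences. '-' is allowed at depth 1,
a second '-' at depth 2 (it is in ~'"'), and '$' at depth 3 (also in
~'"', from inside the string), so --$ appears to collide with
COMMENT_COMMAND. Excluding '$' from the string takes it out of the
depth-3 set, which is why the warning goes away.
If you take a look at the generated code, it should do the right
thing. Fortunately, Ter is doing full LL(k) for ANTLR 3, so we won't
have to be confused by this sort of thing.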
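
To make the difference between full and approximate lookahead concrete,
here is a minimal Python sketch. It is not ANTLR's analysis code, only
an illustration under some assumptions: the vocabulary is cut down to
five characters, a regular expression stands in for SQL_STATEMENT, the
syntactic predicate on //1 is ignored, and only statements of at least
three characters are considered. All function and variable names are
made up for the example.

```python
import re
from itertools import product

ALPHABET = '-a"$;'   # assumption: tiny stand-in for the real charVocabulary
K = 3                # lookahead depth, as in the grammar's k=3

def exact_prefixes(statement_regex, max_len=6):
    """Full LL(k) view: every K-character sequence that really starts a statement."""
    pattern = re.compile(statement_regex)
    prefixes = set()
    for n in range(K, max_len + 1):
        for chars in product(ALPHABET, repeat=n):
            text = "".join(chars)
            if pattern.fullmatch(text):
                prefixes.add(text[:K])
    return prefixes

def approx_sets(prefixes):
    """Approximate view: one character set per depth, sequences forgotten."""
    return [{p[i] for p in prefixes} for i in range(K)]

def approx_accepts(sets, text):
    """A sequence passes if each character is in the set for its depth."""
    return all(text[i] in sets[i] for i in range(K))

# Regex stand-ins for SQL_STATEMENT in the four variants from the question.
GRAMMARS = {
    "original":          r'(?:-|a|"[^"]*")+;',
    "without //1":       r'(?:a|"[^"]*")+;',
    "without //2":       r'(?:-|a)+;',
    "string excludes $": r'(?:-|a|"[^"$]*")+;',
}

def report():
    """Map each variant to (full LL(3) sees --$, approximation sees --$)."""
    results = {}
    for name, regex in GRAMMARS.items():
        exact = exact_prefixes(regex)
        sets = approx_sets(exact)
        results[name] = ("--$" in exact, approx_accepts(sets, "--$"))
    return results

if __name__ == "__main__":
    for name, (full, approx) in report().items():
        print(f"{name}: full LL(3) sees --$: {full}; approximate sees --$: {approx}")
```

In this sketch no variant has --$ as a real three-character prefix of
SQL_STATEMENT, but for the original grammar the per-depth sets accept
it anyway, while each of the three changes from the question removes
one of the characters from its depth set.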
--Loring
>
> Thanks for any help!
>
> Harald M.