[antlr-interest] Ambiguity error in lexer generation

Wed Sep 19 08:57:02 PDT 2007

I'm hoping somebody can offer some insight into why ANTLR would *nondeterministically* report lexer ambiguity warnings. That is, when I run the following commands:

rm TestLang__.g
rm *.class
java org.antlr.Tool TestLang.g

It sometimes, but not always, generates warnings of this sort:

warning(205): TestLang.g:1:8: ANTLR could not analyze this decision in rule Tokens; often this is because of recursive rule references visible from the left edge of alternatives. ANTLR will re-analyze the decision with a fixed lookahead of k=1. Consider using "options {k=1;}" for that decision and possibly adding a syntactic predicate.
warning(209): TestLang.g:20:1: Multiple token rules can match input such as "'v'": T22, T24, T25, UNQUOTED_STRING, JAVA_ID
As a result, tokens(s) JAVA_ID,UNQUOTED_STRING,T24,T25 were disabled for that input
warning(209): TestLang.g:13:1: Multiple token rules can match input such as "'g'": T16, T18, UNQUOTED_STRING, JAVA_ID
...
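
To put a number on "sometimes, but not always", the commands above could be wrapped in a loop that counts how many distinct outputs the tool produces. This is only a rough sketch: the helper name count_distinct_outputs is made up for illustration, and it assumes a POSIX shell with mktemp and cksum available and the ANTLR jars already on the CLASSPATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of ANTLR): run a command N times and print
# how many distinct outputs (stdout and stderr combined) were observed.
count_distinct_outputs() {
    runs=$1
    shift
    tmpdir=$(mktemp -d) || return 1
    : > "$tmpdir/sums"
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture everything the command prints, then record a checksum of it.
        "$@" > "$tmpdir/out" 2>&1
        cksum < "$tmpdir/out" >> "$tmpdir/sums"
        i=$((i + 1))
    done
    # Count the unique checksums; strip padding that some wc versions add.
    sort -u "$tmpdir/sums" | wc -l | tr -d ' '
    rm -rf "$tmpdir"
}

# Example use with the commands from this post (assumes TestLang.g is in the
# current directory):
# count_distinct_outputs 20 sh -c 'rm -f TestLang__.g *.class; java org.antlr.Tool TestLang.g'
```

A result greater than 1 would mean the tool's output differed between runs on identical input.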

What perplexes me most is that these warnings are reported inconsistently. I also see nothing obvious in the grammar that should be causing them. There is a generic Java identifier lexer rule (JAVA_ID) that could be ambiguous with a set of keywords, but my understanding is that ANTLR should be able to resolve this ambiguity by giving preference to the keywords. The unquoted string rule (UNQUOTED_STRING) has a semantic predicate that should prevent ambiguity.

Thanks,
Alex
