[antlr-interest] Please make ANTLR check for missing/redefined tokens

Brian Smith brian-l-smith at uiowa.edu
Fri Jun 21 23:09:59 PDT 2002


I have a parser and a lexer that are in the same file. As long as I 
don't have any errors, everything works great.

But lately I've been changing my grammar around a lot and have had the 
following problems:

(1) Sometimes my parser will depend on a lexer rule that doesn't exist, 
but ANTLR won't complain. For example, typo.g:

     class TypoParser extends Parser;
         classRule: "class" IDENTIFEIR ;

     class TypoLexer extends Lexer;
         IDENTIFIER : ('a'..'z') ;

Notice that in "classRule" I am trying to use the token IDENTIFIER, but I 
misspelled it. When processing the file, ANTLR doesn't catch the 
mistake. I think this is because there seems to be no way to tell ANTLR 
that TypoParser's lexer is TypoLexer. I would welcome some mechanism to 
statically associate a parser with its lexer, especially if it would 
help static checking and/or optimization.
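
To make the request in (1) concrete, here is a rough sketch of such a check, written as a standalone Python script rather than as anything inside ANTLR. The function name `undefined_tokens` and the regex-based scanning are my own illustrative assumptions; it only understands simple rules like the ones in typo.g and is not how ANTLR analyzes grammars.

```python
import re

def undefined_tokens(parser_rules, lexer_rules):
    """Return token names referenced by the parser but not defined by the lexer.

    Assumption: token names are all-uppercase identifiers, and lexer rules
    look like `NAME : ... ;` at the start of a line.
    """
    # Drop quoted literals so their contents are not mistaken for token names.
    stripped = re.sub(r'"[^"]*"', ' ', parser_rules)
    # Uppercase identifiers used in the parser rules.
    used = set(re.findall(r'\b[A-Z][A-Z0-9_]*\b', stripped))
    # Rule names defined in the lexer section.
    defined = set(re.findall(r'^\s*([A-Z][A-Z0-9_]*)\s*:', lexer_rules,
                             flags=re.MULTILINE))
    return used - defined

parser_rules = 'classRule: "class" IDENTIFEIR ;'
lexer_rules = "IDENTIFIER : ('a'..'z') ;"
print(undefined_tokens(parser_rules, lexer_rules))
```

For the typo.g rules this reports IDENTIFEIR as a token with no lexer rule, which is the warning I would like ANTLR to give once it knows which lexer belongs to which parser.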

(2) inconsistent.g:

     class InconsistentParser extends Parser;
         extendsClause: "extends" IDENTIFIER (COMMA IDENTIFIER)* ;
         parameters: "(" IDENTIFIER ("," IDENTIFIER)* ")" ;

     class InconsistentLexer extends Lexer;
         COMMA : "," ;
         IDENTIFIER : ('a'..'z') ;

The problem is that the use of "," in the parameters rule conflicts with 
the COMMA lexer rule. This error took me a long time to find because 
ANTLR doesn't warn about the conflict when processing the grammar file. 
I got this error when attempting to combine two grammars together (one 
was a grammar for a textual notation for UML models, the other was an 
OCL grammar).
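
The conflict in (2) could be detected with a check along these lines. This is only a sketch in Python under simplifying assumptions: the function name `conflicting_literals` is hypothetical, and it only recognizes lexer rules whose whole body is a single quoted string, such as COMMA above. It is not ANTLR's own analysis.

```python
import re

def conflicting_literals(parser_rules, lexer_rules):
    """Map each quoted literal used in the parser to the lexer rule that
    already matches exactly that string, if there is one."""
    # Lexer rules whose entire body is one quoted string, e.g. COMMA : "," ;
    literal_rules = {
        text: name
        for name, text in re.findall(
            r'^\s*([A-Z][A-Z0-9_]*)\s*:\s*"([^"]*)"\s*;',
            lexer_rules, flags=re.MULTILINE)
    }
    # Every quoted literal that appears in the parser rules.
    used = set(re.findall(r'"([^"]*)"', parser_rules))
    return {lit: literal_rules[lit] for lit in used if lit in literal_rules}

parser_rules = '''
    extendsClause: "extends" IDENTIFIER (COMMA IDENTIFIER)* ;
    parameters: "(" IDENTIFIER ("," IDENTIFIER)* ")" ;
'''
lexer_rules = '''
    COMMA : "," ;
    IDENTIFIER : ('a'..'z') ;
'''
print(conflicting_literals(parser_rules, lexer_rules))
```

For inconsistent.g this pairs the literal "," with the COMMA rule, which is the spot where a warning would have saved me the debugging time.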

Oddly, if I change the COMMA rule to:

         tokens { COMMA="," }

ANTLR _does_ complain:

      Inconsistent.g:10:25: warning:
          Redefinition of token in tokens {...}: COMMA

It would be nice if ANTLR could extend this analysis to detect rule 
conflicts outside of the "tokens" section in such a case. In fact, I'm 
wondering why there is even a separate "tokens" section in the first 
place; couldn't ANTLR just automatically infer which lexer rules (quoted 
strings with no actions) belong in "tokens"?

What do you think?

Thanks,
Brian


 



