[antlr-interest] Literal testing in ANTLR

Jamieson M. Cobleigh jcobleig at cs.umass.edu
Wed Oct 2 08:53:07 PDT 2002


I was working on an ANTLR lexer with the following rules:

options { 
  testLiterals = false;
  charVocabulary = '\3'..'\377';
}

tokens {
  DIGRAPH = "digraph";
}

STRING 
  : QUOTED_STRING | UNQUOTED_STRING;

protected
QUOTED_STRING 
  : '"'! (ESC | ~('"' | '\\'))* '"'!;

protected
UNQUOTED_STRING options { testLiterals=true; }
  : LETTER ( NUMBER | LETTER | '_')*;
   
ESC matches the usual escape sequences (\n, \r, etc.).


When 'digraph' was encountered during lexing, it was matched by the 
UNQUOTED_STRING rule and then tested against the literal table.  However, 
the enclosing STRING rule then set the token type to STRING, overwriting 
the result of the literal test.

I moved the testLiterals=true option to the STRING rule, but then both
'digraph' and '"digraph"' were getting matched as the literal DIGRAPH
because the QUOTED_STRING rule removed the double quotes.
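
The two behaviors described above can be modeled with a small Java sketch.
This is not ANTLR-generated code, only a simplified model under the
assumption that the literal test is a table lookup on the token text as it
stands when the test runs. The class name, method names, and numeric token
types are all hypothetical and chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LiteralTestSketch {
    // Hypothetical token type numbers; a real lexer assigns its own.
    static final int STRING = 1;
    static final int DIGRAPH = 2;

    // Stand-in for the literal table built from the tokens { } section.
    static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("digraph", DIGRAPH);
    }

    // Model of the literal test: if the text is in the table, use its type.
    static int testLiterals(String text, int type) {
        Integer literalType = LITERALS.get(text);
        return literalType != null ? literalType : type;
    }

    // Token text as the rules leave it: the '!' suffix drops the quotes.
    static String tokenText(String input) {
        boolean quoted = input.length() >= 2
                && input.startsWith("\"") && input.endsWith("\"");
        return quoted ? input.substring(1, input.length() - 1) : input;
    }

    // Case 1: testLiterals on UNQUOTED_STRING only. The sub-rule's result
    // is overwritten when the enclosing STRING rule assigns its own type.
    static int testInSubrule(String input) {
        int type = STRING;
        if (!input.startsWith("\"")) {
            type = testLiterals(tokenText(input), type); // sub-rule's test
        }
        type = STRING; // enclosing rule sets the type last
        return type;
    }

    // Case 2: testLiterals on STRING. The test runs after the quotes are
    // gone, so quoted and unquoted text look identical to the table.
    static int testInStringRule(String input) {
        return testLiterals(tokenText(input), STRING);
    }

    public static void main(String[] args) {
        System.out.println(testInSubrule("digraph") == STRING);
        System.out.println(testInStringRule("digraph") == DIGRAPH);
        System.out.println(testInStringRule("\"digraph\"") == DIGRAPH);
    }
}
```

In this model, the first case never yields DIGRAPH, and the second case
yields DIGRAPH for both the bare and the quoted input, which matches the
two symptoms described above.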

My solution was to remove the STRING rule and make a rule in my parser:
  aString : QUOTED_STRING | UNQUOTED_STRING;

Is this the best way to do this, or is there a better solution that I'm 
not seeing?

Jamie


 
