[antlr-interest] Legibility Bug in C++ Lexer generation

Mark Lentczner markl at glyphic.com
Sun Mar 28 08:22:47 PST 2004


When the lexer generator emits a bit-set test for a set of characters, 
the generated code works, but the bit set is misnamed and its comment 
is incorrect.  For example, this rule:

NUMBER: ( '-' )? ( '0'..'9' )+ ;

Results in this code:
--------------------
if ((_tokenSet_0.member(LA(1))) && (true)) {
     mNUMBER(true);
     theRetToken=_returnToken;
}
...
const unsigned long ScriptLexer::_tokenSet_0_data_[] = { 0UL, 
67051520UL, 0UL, 0UL, 0UL, 0UL };
// "use" "public" "protected" "private" "instance" "const" ID "include"
// "if" "else" "loop"
const antlr::BitSet ScriptLexer::_tokenSet_0(_tokenSet_0_data_,6);
--------------------

Of course, these are really character sets, not token sets, and the 
generated comment is completely wrong: it lists token names rather 
than the characters in the set.  The generated code would be much more 
readable if it were:
--------------------
if ((_characterSet_0.member(LA(1))) && (true)) {
     mNUMBER(true);
     theRetToken=_returnToken;
}
...
const unsigned long ScriptLexer::_characterSet_0_data_[] = { 0UL, 
67051520UL, 0UL, 0UL, 0UL, 0UL };
// '-' '0' '1' '2' '3' '4' '5' '6'
// '7' '8' '9'
const antlr::BitSet ScriptLexer::_characterSet_0(_characterSet_0_data_,6);
--------------------

- Mark

Mark Lentczner
markl at wheatfarm.org
http://www.wheatfarm.org/



 