[antlr-interest] Inefficiency in lexer
ttest
ttest at gmx.de
Fri May 27 10:40:46 PDT 2005
Hi,
While looking through my generated lexer code, I came across the following
switch statement, which seems unnecessarily inefficient:
switch (LA(1)) {
case '\n': case '\r': case ' ': case '0':
case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8':
case '9': case 'A': case 'B': case 'C':
case 'D': case 'E': case 'F': case 'G':
case 'H': case 'I': case 'J': case 'K':
case 'L': case 'M': case 'N': case 'O':
case 'P': case 'Q': case 'R': case 'S':
case 'T': case 'U': case 'V': case 'W':
case 'X': case 'Y': case 'Z': case 'a':
case 'b': case 'c': case 'd': case 'e':
case 'f': case 'g': case 'h': case 'i':
case 'j': case 'k': case 'l': case 'm':
case 'n': case 'o': case 'p': case 'q':
case 'r': case 's': case 't': case 'u':
case 'v': case 'w': case 'x': case 'y':
case 'z':
{
    mText(true);
    theRetToken = _returnToken;
    break;
}
A better alternative, which could also be generated easily from character
classes written with the range operator (e.g. 'a'..'z'), would be the
following:
char c = LA(1);
if (c == '\n' || c == '\r' || c == ' '
    || (c >= '0' && c <= '9')
    || (c >= 'A' && c <= 'Z')
    || (c >= 'a' && c <= 'z'))
{
    mText(true);
    theRetToken = _returnToken;
    // the 'break' of the switch case is not needed in an if statement
}
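
To illustrate how a generator might derive such a condition, here is a rough Java sketch. It is not ANTLR code: the class CharRanges and its methods are made up for this example, and it assumes the character set is given as a string sorted in ascending order. It merges consecutive characters into ranges and lets you check that the range test accepts exactly the same characters as the enumerated case labels.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of ANTLR.
class CharRanges {
    // The characters listed in the case labels above, in ascending order.
    static final String ENUMERATED =
        "\n\r 0123456789"
        + "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        + "abcdefghijklmnopqrstuvwxyz";

    // The hand-written range test proposed above.
    static boolean matchesRanges(char c) {
        return c == '\n' || c == '\r' || c == ' '
            || (c >= '0' && c <= '9')
            || (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z');
    }

    // Merge runs of consecutive characters into {low, high} pairs.
    // Assumes sortedChars is sorted ascending without duplicates.
    static List<char[]> coalesce(String sortedChars) {
        List<char[]> ranges = new ArrayList<>();
        for (char c : sortedChars.toCharArray()) {
            char[] last = ranges.isEmpty() ? null : ranges.get(ranges.size() - 1);
            if (last != null && c == last[1] + 1) {
                last[1] = c;                      // extend the current run
            } else {
                ranges.add(new char[] { c, c });  // start a new run
            }
        }
        return ranges;
    }

    // True if c falls inside any of the given ranges.
    static boolean inAnyRange(List<char[]> ranges, char c) {
        for (char[] r : ranges) {
            if (c >= r[0] && c <= r[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<char[]> ranges = coalesce(ENUMERATED);
        // Print each range as numeric code points, e.g. "48..57" for '0'..'9'.
        for (char[] r : ranges) {
            System.out.println((int) r[0] + ".." + (int) r[1]);
        }
    }
}
```

This only checks that the two forms are equivalent. Whether the range test is actually faster than the switch depends on how the compiler translates the switch, so that would have to be measured.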
Greets,
Christian