[antlr-interest] Inefficiency in lexer
Bryan Ewbank
ewbank at gmail.com
Fri May 27 10:56:02 PDT 2005
Why is this inefficient? LA(1) is called once in both cases, and a
compiler can turn the switch into a jump or lookup table, which is
typically faster than the chain of comparisons in the alternative code.
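
As a rough sketch of the lookup-table idea (not what ANTLR or any
particular compiler actually emits), the membership test could be written
like this in Java. The class name CharClassDemo, the table TEXT_START, and
the two helper methods are hypothetical names made up for illustration;
only the character set is taken from the quoted switch.

```java
class CharClassDemo {
    // Hypothetical 128-entry table: true for every character that
    // appears as a case label in the quoted switch statement.
    static final boolean[] TEXT_START = new boolean[128];
    static {
        for (char ch : new char[] {'\n', '\r', ' '}) TEXT_START[ch] = true;
        for (char ch = '0'; ch <= '9'; ch++) TEXT_START[ch] = true;
        for (char ch = 'A'; ch <= 'Z'; ch++) TEXT_START[ch] = true;
        for (char ch = 'a'; ch <= 'z'; ch++) TEXT_START[ch] = true;
    }

    // Table version: one bounds check plus one array load per character.
    static boolean viaTable(char c) {
        return c < TEXT_START.length && TEXT_START[c];
    }

    // Range version, mirroring the if-chain proposed in the quoted mail.
    static boolean viaRanges(char c) {
        return c == '\n' || c == '\r' || c == ' '
            || (c >= '0' && c <= '9')
            || (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        // Check that both tests accept exactly the same characters.
        for (char c = 0; c < 256; c++) {
            if (viaTable(c) != viaRanges(c)) {
                throw new AssertionError("mismatch at code " + (int) c);
            }
        }
        System.out.println("table and range tests agree");
    }
}
```

Both versions accept the same characters, so which one is faster depends
on the compiler and the target; measuring is the only way to be sure.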
- Bryan
On 5/27/05, ttest <ttest at gmx.de> wrote:
> Hi,
>
> While looking through my generated lexer code I came across the following
> switch statement, which is unnecessarily inefficient.
>
> switch ( LA(1)) {
> case '\n': case '\r': case ' ': case '0':
> case '1': case '2': case '3': case '4':
> case '5': case '6': case '7': case '8':
> case '9': case 'A': case 'B': case 'C':
> case 'D': case 'E': case 'F': case 'G':
> case 'H': case 'I': case 'J': case 'K':
> case 'L': case 'M': case 'N': case 'O':
> case 'P': case 'Q': case 'R': case 'S':
> case 'T': case 'U': case 'V': case 'W':
> case 'X': case 'Y': case 'Z': case 'a':
> case 'b': case 'c': case 'd': case 'e':
> case 'f': case 'g': case 'h': case 'i':
> case 'j': case 'k': case 'l': case 'm':
> case 'n': case 'o': case 'p': case 'q':
> case 'r': case 's': case 't': case 'u':
> case 'v': case 'w': case 'x': case 'y':
> case 'z':
> {
> mText(true);
> theRetToken=_returnToken;
> break;
> }
>
> A better alternative, which could also be generated easily from character
> classes written with the range operator (e.g. 'a'..'z'), would be the following.
>
> char c = LA(1);
> if( c=='\n' || c=='\r' || c==' '
> || (c>='0' && c<='9')
> || (c>='A' && c<='Z')
> || (c>='a' && c<='z')
> )
> {
> mText(true);
> theRetToken=_returnToken;
> break;
> }
>
> Greets,
>
> Christian
>
>