[antlr-interest] lexer suggestions

Andy Tripp info at jazillian.com
Sun Mar 6 14:55:58 PST 2005


Hi,

Forgive me if this has been discussed before or if it's not even
an issue with more recent versions of ANTLR. I'm using ANTLR 2.7.2.

I have a couple of minor suggestions for ANTLR-generated lexers:

1) Rather than (or in addition to):

lexer.setTokenObjectClass("some.package.MyToken");

I'd like to be able to specify my own Token factory:

lexer.setTokenFactory(myFactory);

...and then in the antlr package, we'd need:

package antlr;
public interface TokenFactory {
        Token makeToken(int type);
}

One advantage of specifying a factory rather than a class name
is that a mistake is caught at compile time, rather than
surfacing at run time if I have a typo in my class name.
Another advantage is speed. The reflection newInstance() method
is a little slower than a plain constructor call, though not by
much: I measured about 1.5 seconds for 10 million calls to a simple
constructor, and twice that using newInstance().
A third (potential) advantage is that my factory could then
be made more efficient, perhaps by recycling old Tokens through an
object pool.
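
Here is a rough sketch of what I have in mind, not a finished design.
The Token class below is a minimal stand-in so the example compiles
without ANTLR; MyToken, DirectFactory, PooledFactory and release() are
made-up names, and only TokenFactory.makeToken(int) comes from the
proposal above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal stand-in for antlr.Token so the sketch compiles on its own.
class Token {
    private int type;
    Token(int type) { this.type = type; }
    int getType() { return type; }
    void setType(int type) { this.type = type; }
}

// Hypothetical user-defined token class.
class MyToken extends Token {
    MyToken(int type) { super(type); }
}

// The interface proposed in this message.
interface TokenFactory {
    Token makeToken(int type);
}

// Plain constructor call: misspelling MyToken here is a compile error.
class DirectFactory implements TokenFactory {
    public Token makeToken(int type) { return new MyToken(type); }
}

// Hypothetical pooling factory; release() is not part of the proposal.
class PooledFactory implements TokenFactory {
    private final Deque<Token> pool = new ArrayDeque<>();

    public Token makeToken(int type) {
        Token t = pool.poll();            // null when the pool is empty
        if (t == null) return new MyToken(type);
        t.setType(type);                  // reuse a released token
        return t;
    }

    // Hand a token back once nothing refers to it any more.
    void release(Token t) { pool.push(t); }
}
```

Pooling only pays off if the caller can tell when a token is no longer
referenced, which is why it is just a potential advantage.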

Of course, I can add a setTokenFactory() method myself with just
a few lines of code (and I do), but it might be a nice default
behaviour.
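
For illustration, the "few lines" could look something like the sketch
below. It assumes the scanner creates every token through one
overridable makeToken(int) method; ReflectiveScanner is a made-up
stand-in for the real base class, and Token and TokenFactory are
redefined here only so the example is self-contained.

```java
// Stand-ins so the sketch compiles without ANTLR on the classpath.
class Token {
    private int type;
    int getType() { return type; }
    void setType(int type) { this.type = type; }
}

interface TokenFactory {
    Token makeToken(int type);
}

// Hypothetical stand-in for the scanner base class: tokens via reflection.
class ReflectiveScanner {
    private Class<? extends Token> tokenClass = Token.class;

    void setTokenClass(Class<? extends Token> c) { tokenClass = c; }

    protected Token makeToken(int type) {
        try {
            Token t = tokenClass.getDeclaredConstructor().newInstance();
            t.setType(type);
            return t;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create token", e);
        }
    }
}

// The added lines: a setter, plus an override that prefers the factory
// and falls back to the reflective path when no factory is set.
class FactoryScanner extends ReflectiveScanner {
    private TokenFactory tokenFactory;

    void setTokenFactory(TokenFactory f) { tokenFactory = f; }

    @Override
    protected Token makeToken(int type) {
        return tokenFactory != null
                ? tokenFactory.makeToken(type)
                : super.makeToken(type);
    }
}
```

Falling back to the reflective path keeps existing callers working if
they never set a factory.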

2) It would also be nice to have Token be an interface, and maybe
rename the current Token class to "BaseToken" or something.
It's not very likely that anyone would really need their
Token class to inherit from something else, but it's possible.
It just seems ugly to have methods in the Token class that don't
do anything: the set/get methods for column, line, and filename
are all stubs there.
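
As a sketch of the shape I mean (these are illustrative types, not the
real antlr classes, and the method list is trimmed to type and line to
keep it short; Node and NodeToken are made-up names):

```java
// Hypothetical shape of suggestion 2; these are not the real antlr types.
interface Token {
    int getType();
    void setType(int type);
    int getLine();
    void setLine(int line);
}

// Roughly what today's Token class would become under the rename.
class BaseToken implements Token {
    private int type;
    public int getType() { return type; }
    public void setType(int type) { this.type = type; }
    public int getLine() { return 0; }      // stub, like the methods described above
    public void setLine(int line) { }       // value is ignored
}

// Hypothetical unrelated superclass that a user might already have.
class Node {
    String label = "node";
}

// With Token as an interface, a class can extend Node and still be a Token.
class NodeToken extends Node implements Token {
    private int type;
    private int line;
    public int getType() { return type; }
    public void setType(int type) { this.type = type; }
    public int getLine() { return line; }
    public void setLine(int line) { this.line = line; }
}
```

The stub methods still exist in BaseToken, but a class that implements
the interface directly is no longer forced to inherit them.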

Andy


