[antlr-interest] C++ code target

Don Caton dcaton at shorelinesoftware.com
Sat Mar 31 15:37:59 PDT 2007

> fashion. Currently I intend to drop unicode support for now and first
> get a 8 bit version out.


Please don't do that.  One of the biggest limitations in ANTLR 2 is the lack
of proper Unicode support.

Why should the code have any dependence on the size of a character?  Please
don't make the same mistake in 3.0.  The lexer class should be a template
class that takes the character type (and therefore its size) as a template
parameter.  Then there will be no need to go back and make another version
for Unicode.  It should not make any difference whether you are parsing
8-bit characters, 16-bit characters, or characters of any other width.
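
To make the idea concrete, here is a rough sketch of what I have in mind.
It is not ANTLR code: BasicLexer, match, atEnd and position are names made
up for illustration, and a real lexer would read from a stream rather than
hold a whole string.  The only point is that nothing in the class body
mentions the width of a character.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical sketch, not an ANTLR API: CharT is the character type, so
// its width is whatever the instantiation picks (char, char16_t, ...).
template <typename CharT>
class BasicLexer {
public:
    using string_type = std::basic_string<CharT>;

    explicit BasicLexer(string_type input)
        : input_(std::move(input)), pos_(0) {}

    // True once every character of the input has been consumed.
    bool atEnd() const { return pos_ >= input_.size(); }

    // Consume `literal` if the input at the current position starts with
    // it; otherwise leave the position unchanged.
    bool match(const string_type& literal) {
        if (input_.compare(pos_, literal.size(), literal) != 0) {
            return false;
        }
        pos_ += literal.size();
        return true;
    }

    // Index of the next unread character.
    std::size_t position() const { return pos_; }

private:
    string_type input_;
    std::size_t pos_;
};
```

With something like this, BasicLexer<char> and BasicLexer<char16_t> are the
same source instantiated twice, so there is no separate Unicode version to
write later.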

The member function that compares characters should just compare the two
characters for equality; don't worry about Unicode code points or any of
that stuff.  Just make the function virtual so it can be overridden in a

Of course, this is easy for me to say since I'm not doing the work, but I
really don't see why anything in the code should be dependent upon the size
of a character.  This seems like a perfect place to use a template class.
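
For the comparison function mentioned above, something along these lines
might do.  Again this is only a sketch under my own assumptions: Matcher,
sameChar, startsWith and AsciiCaseInsensitiveMatcher are invented names,
and the case-folding subclass is just one example of the kind of override
a user might want.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: the default comparison is plain equality on the
// stored values, with no knowledge of Unicode code points.
template <typename CharT>
class Matcher {
public:
    virtual ~Matcher() = default;

    // Virtual so that a subclass can substitute its own notion of "same".
    virtual bool sameChar(CharT a, CharT b) const { return a == b; }

    // True if `text` begins with `literal` according to sameChar().
    bool startsWith(const std::basic_string<CharT>& text,
                    const std::basic_string<CharT>& literal) const {
        if (literal.size() > text.size()) {
            return false;
        }
        for (std::size_t i = 0; i < literal.size(); ++i) {
            if (!sameChar(text[i], literal[i])) {
                return false;
            }
        }
        return true;
    }
};

// Illustrative override: fold ASCII 'A'..'Z' to lower case before
// comparing.  Characters outside that range are compared unchanged.
template <typename CharT>
class AsciiCaseInsensitiveMatcher : public Matcher<CharT> {
public:
    bool sameChar(CharT a, CharT b) const override {
        return fold(a) == fold(b);
    }

private:
    static CharT fold(CharT c) {
        if (c >= CharT('A') && c <= CharT('Z')) {
            return static_cast<CharT>(c - CharT('A') + CharT('a'));
        }
        return c;
    }
};
```

The base class never needs to know how wide CharT is, and anyone who wants
smarter matching can override sameChar without touching the rest.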

Wish I had time to contribute some code but I won't for the next 4-5 months.
I could probably help with code design and review though, as time permits.

