[stringtemplate-interest] [ST4] Using a platform independent default charset/encoding (e.g. UTF-8)

Sat Jan 29 11:27:14 PST 2011

Currently STGroup, STGroupFile and STGroupDir allow defining an encoding/charset to be used for reading the files. This encoding is propagated to an ANTLRInputStream.

When no encoding is defined, the ANTLRInputStream uses the "default charset". For Java the default charset is platform dependent. In particular, the default charset differs between Windows and Unix / Mac OS X systems.
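
To illustrate why this matters, here is a small sketch in plain Java (no StringTemplate or ANTLR classes involved; the class name CharsetDemo and the method decode are made up for this example). It decodes the same bytes once as UTF-8 and once as ISO-8859-1, which stands in here for a typical single-byte platform default:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Decode the given bytes using an explicitly chosen charset.
    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // U+00FC (u with diaeresis) is the two bytes 0xC3 0xBC in UTF-8.
        byte[] utf8Bytes = "\u00fc".getBytes(StandardCharsets.UTF_8);

        // What this JVM would use when no encoding is given.
        System.out.println("platform default: " + Charset.defaultCharset());

        // Same bytes, two interpretations: one character vs. two wrong ones.
        System.out.println("as UTF-8:      " + decode(utf8Bytes, StandardCharsets.UTF_8));
        System.out.println("as ISO-8859-1: " + decode(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

A template file saved as UTF-8 and read with a single-byte default charset would show the same kind of corruption for every non-ASCII character.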

Because of this, you should explicitly define an encoding in your code when working with template files across different platforms (assuming the files contain non-ASCII characters).

To simplify working across platforms I suggest we define a fixed, platform independent default charset, to be used when no encoding is defined explicitly. My favorite encoding is UTF-8.
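
As a rough sketch of what I mean (plain Java only; the class EncodingFallback and the methods resolve and readAll are hypothetical names, not existing StringTemplate API), the fallback could look like this: use the caller's encoding if one is given, otherwise UTF-8 instead of the JVM default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingFallback {
    // Hypothetical helper: the explicit encoding wins; otherwise fall back
    // to UTF-8 rather than to the platform dependent default charset.
    static Charset resolve(String encoding) {
        return encoding == null ? StandardCharsets.UTF_8 : Charset.forName(encoding);
    }

    // Read the whole stream as text using the resolved charset.
    static String readAll(InputStream in, String encoding) {
        StringBuilder sb = new StringBuilder();
        try (Reader r = new InputStreamReader(in, resolve(encoding))) {
            char[] buf = new char[1024];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "gr\u00fc\u00df".getBytes(StandardCharsets.UTF_8);
        // No encoding given: decoded as UTF-8 on every platform.
        System.out.println(readAll(new ByteArrayInputStream(data), null));
        // Explicit encoding still takes precedence over the fallback.
        System.out.println(readAll(new ByteArrayInputStream(data), "ISO-8859-1"));
    }
}
```

With something like this, code that passes no encoding behaves the same on Windows, Unix and Mac, while code that passes an encoding explicitly is not affected.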

What do others think? 

Udo

P.S.: This decision is independent of the discussion to define an encoding inside individual template (group) files, currently going on in a different thread.

P.P.S.: I think we can change the "default encoding behavior" (from the current platform dependent "Java default charset" to a platform independent charset, e.g. UTF-8) as long as StringTemplate 4.0 is still in beta. Once it is released, changing this would be unfriendly to people relying on the old behavior, e.g. those working on only one platform and writing their files in "their" charset.
