[stringtemplate-interest] UTF-8 not displaying correctly
Leo R. Lundgren
leo at finalresort.org
Mon Mar 15 15:58:47 PDT 2010
To clarify, here are the combinations of
StringTemplateGroup.setFileCharEncoding() and HTML meta charset I have
tried, and their results:
setFileCharEncoding("UTF-8")
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Result: Characters NOT displayed correctly in browser (a "<?>" sign
is displayed where there should be one "ä").

setFileCharEncoding("ISO-8859-1")
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Result: Characters NOT displayed correctly in browser (two junk
chars where there should be one "ä").

setFileCharEncoding("UTF-8")
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Result: Characters displayed correctly in browser.

setFileCharEncoding("ISO-8859-1")
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Result: Characters displayed correctly in browser.
Pretty confusing.
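
One reading that would be consistent with all four results above can be sketched with plain JDK classes. It rests on two assumptions that nothing in this thread confirms: that the template file really is saved as UTF-8, and that the servlet response writer encodes its output as ISO-8859-1 (the servlet default when no response character encoding has been set). The class name CharsetRoundTrip and its helpers are made up for illustration; no StringTemplate code is involved.

```java
import java.nio.charset.Charset;

public class CharsetRoundTrip {
    static final Charset UTF8 = Charset.forName("UTF-8");
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /**
     * Hypothetical model of the pipeline: the bytes on disk are decoded
     * with fileEncoding (the value given to setFileCharEncoding), encoded
     * again by the response writer, and finally decoded by the browser
     * using the charset from the HTML meta tag.
     */
    static String roundTrip(byte[] fileBytes, Charset fileEncoding,
                            Charset writerEncoding, Charset browserCharset) {
        String inMemory = new String(fileBytes, fileEncoding);
        byte[] onTheWire = inMemory.getBytes(writerEncoding);
        return new String(onTheWire, browserCharset);
    }

    /** Renders a string as a list of code units, e.g. "U+00E4". */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the template file contains "ä" saved as UTF-8.
        byte[] fileBytes = "\u00e4".getBytes(UTF8);
        // Assumption: the response writer encodes as ISO-8859-1.
        Charset writer = LATIN1;

        // {file encoding, meta charset}, in the order of the tests above.
        Charset[][] combos = {
            {UTF8, UTF8}, {LATIN1, LATIN1}, {UTF8, LATIN1}, {LATIN1, UTF8}
        };
        for (Charset[] c : combos) {
            String shown = roundTrip(fileBytes, c[0], writer, c[1]);
            System.out.println("file=" + c[0] + " meta=" + c[1]
                    + " -> " + describe(shown));
        }
    }
}
```

Under those assumptions the first combination yields U+FFFD (the replacement character), the second yields U+00C3 U+00A4 (two junk characters), and the last two yield U+00E4 ("ä"), which matches the four results listed. Passing UTF8 as the writer encoding makes the UTF-8/UTF-8 combination come out as "ä". If the assumption holds, the servlet-side counterpart would be calling response.setCharacterEncoding("UTF-8") before response.getWriter(), which may be worth testing.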
// Leo
15 mar 2010 kl. 23.48 skrev Leo R. Lundgren:
> Hi,
>
> I found StringTemplateGroup.setFileCharEncoding(), which takes a
> parameter that seems to be the same as the charsetName that
> java.io.InputStreamReader(InputStream in, String charsetName) accepts.
> I added it to my ViewHandler:
>
> public class ViewHandler {
>     private StringTemplateGroup templateGroup;
>     private Map<String, String> attributes = new HashMap<String, String>();
>
>     public ViewHandler(String viewBasePath) {
>         templateGroup = new StringTemplateGroup("default", viewBasePath);
>         System.out.println(templateGroup.getFileCharEncoding());
>         templateGroup.setFileCharEncoding("UTF-8");
>         System.out.println(templateGroup.getFileCharEncoding());
>     }
>
>     public void setAttribute(String name, String value) {
>         attributes.put(name, value);
>     }
>
>     public String getOutput(String viewName) {
>         StringTemplate view = templateGroup.getInstanceOf(viewName, attributes);
>         return view.toString();
>     }
>
>     public void render(Writer out, String viewName) throws IOException {
>         out.write(getOutput(viewName));
>     }
> }
>
> Watching the console at the time of a request, it seems that UTF-8 is
> already the default in the system. In any case, that is what the
> option is set to. Still no luck with the output, however; the encoding
> issue remains.
>
> I have checked all encoding settings in the files' properties and they
> all say UTF-8 (inherited from container).
> I also tried templateGroup.setFileCharEncoding("ISO-8859-1") instead,
> and it did change the <?> to a couple of junk characters instead, so
> it's not right.
> I'd also like to clarify that my previous information regarding the
> HTTP response headers carrying a charset in them was wrong; there is
> no such header sent. However, the browser adheres to the HTML meta tag
> defining a charset, that I am sure of.
>
> After some testing, I've found that there is /one/ thing that makes
> the page display correctly: if I set the charset in the template's
> HTML to iso-8859-1 instead of utf-8, so that the browser parses the
> contents as latin1, it displays correctly. The only conclusion I can
> draw from this is that what the browser is sent is encoded as
> latin1?
>
> At http://www.stringtemplate.org/api/org/antlr/stringtemplate/PathGroupLoader.html
> I found the description "A brain dead loader that looks only in the
> directory(ies) you specify in the ctor. You may specify the char
> encoding. NOTE: this does not work when you jar things up! Use
> CommonGroupLoader instead in that case".
>
> Reading the note in the description, and also reading http://www.stringtemplate.org/api/org/antlr/stringtemplate/CommonGroupLoader.html
> , I get the feeling that it's not the actual char encoding that
> doesn't work when "jar'ed up", but rather the loader class itself. But
> is this something I should try anyway? If so, how do I use the group
> loader?
>
> I did check with some Eclipse guys and they didn't feel that it was
> Eclipse not saving files correctly. Personally, I don't know, since I
> haven't used Eclipse long enough to form an opinion based on experience
> with it.
>
> Silly question maybe, but could it be that ST just *reads* the
> template files using UTF-8 (or the set encoding), but then outputs it
> using Latin1?
>
> For reference, here's the beginning of the index HTML template:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
> <html lang="sv-SE">
> <head>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> <title>MyApp</title>
> <link rel="stylesheet" type="text/css" href="css/common.css">
> </head>
> <body>
> å ä ö <!-- test characters -->
> $(contentTemplate)()$
> </body>
> </html>
>
> Many thanks,
>
> Regards, Leo
>
>
> 15 mar 2010 kl. 19.50 skrev Terence Parr:
>
>> Hi. You have to tell ST to use a UTF-8 encoding. There should be an
>> option on StringTemplateGroup or something.
>> Ter
>> On Mar 15, 2010, at 10:11 AM, Leo R. Lundgren wrote:
>>
>>> Hi,
>>>
>>> I am building a small servlet application using Eclipse, Tomcat 6,
>>> JRE 1.6, ST 3.2. Here is a ViewHandler I'm using to wrap ST
>>> functionality:
>>>
>>> public class ViewHandler {
>>>     private StringTemplateGroup templateGroup;
>>>     private Map<String, String> attributes = new HashMap<String, String>();
>>>
>>>     public ViewHandler(String viewBasePath) {
>>>         templateGroup = new StringTemplateGroup("default", viewBasePath);
>>>     }
>>>
>>>     public void setAttribute(String name, String value) {
>>>         attributes.put(name, value);
>>>     }
>>>
>>>     public String getOutput(String viewName) {
>>>         StringTemplate view = templateGroup.getInstanceOf(viewName);
>>>         view.setAttributes(attributes);
>>>         return view.toString();
>>>     }
>>>
>>>     public void render(Writer out, String viewName) throws IOException {
>>>         out.write(getOutput(viewName));
>>>     }
>>> }
>>>
>>> The handler is used like this in a servlet:
>>>
>>> protected void doGet(HttpServletRequest request, HttpServletResponse response)
>>>         throws ServletException, IOException {
>>>     super.doGet(request, response);
>>>
>>>     String viewBasePath = getServletContext().getRealPath("/WEB-INF/view");
>>>     ViewHandler viewHandler = new ViewHandler(viewBasePath);
>>>     viewHandler.setAttribute("fileName", "test.png");
>>>     viewHandler.setAttribute("contentTemplate", "uploadFile");
>>>
>>>     viewHandler.render(response.getWriter(), "index");
>>> }
>>>
>>> It does what it is supposed to: the output I get is the contents of
>>> the index.st template, with attributes replaced like they should be,
>>> and the content template included as expected.
>>>
>>> However, Swedish characters such as åäö that are part of static
>>> strings in the template files are shown in the browser(s) as
>>> question marks. I know this indicates encoding/charset problems. An
>>> example string (from the template files) that is not displayed
>>> correctly is:
>>>
>>> <input type="button" class="cancelUploadButton" value="Avbryt
>>> insättning">
>>>
>>> The 'ä' in the last word becomes a question mark in the browser.
>>>
>>>
>>> So, I have:
>>> - Checked the encoding settings in Eclipse, in all places I can find
>>> that seem to relate to the source files and/or template files.
>>> - Checked the encoding of the related template files (both in their
>>> properties and using an external editor that loads them fine as
>>> UTF-8).
>>> - Verified that the HTTP response headers say UTF-8 as the charset.
>>> The same goes for the HTML code itself, it's UTF-8 all the way.
>>>
>>> The only thing I haven't found to be apparently fine is when I open
>>> the .java files from my project using another editor (TextMate,
>>> which has always handled encodings fine for me). Normally TextMate
>>> displays the encoding used/discovered from loading the file (for the
>>> template files it says UTF-8), but for the Java source files it
>>> doesn't display anything.
>>> However, there are no static strings in the source files other than
>>> template names and attributes, so I'm not sure that would matter.
>>> But maybe it does, assuming there's something wrong with how the
>>> source files are saved by Eclipse.
>>>
>>> Can someone shed some light on this issue? As I see it I've got
>>> UTF-8 everywhere (apart from possibly the Java source files, which I
>>> guess could be the issue), and it should work. But maybe I need to
>>> change something with regards to ST to have it work with UTF-8? If
>>> not, any other ideas?
>>>
>>> Thank you,
>>>
>>> // Leo
>>>
>>> _______________________________________________
>>> stringtemplate-interest mailing list
>>> stringtemplate-interest at antlr.org
>>> http://www.antlr.org/mailman/listinfo/stringtemplate-interest
>>