[stringtemplate-interest] UTF-8 not displaying correctly [SOLVED]
Leo R. Lundgren
leo at finalresort.org
Mon Mar 15 17:11:03 PDT 2010
Hey,
I found the cause of the problem. It's quite simple once found.
I needed to tell the servlet response to use UTF-8 for its PrintWriter.
As I understand it, the PrintWriter doesn't just forward a stream of
bytes; it encodes the characters it is given using the response's
character encoding.
I added response.setCharacterEncoding("UTF-8"); to my doGet() method
in the servlet, before the call to response.getWriter(), and everything
is fine now.
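
For anyone hitting the same thing, here is a minimal sketch of the
effect, using only java.io rather than the servlet API. The class and
method names (WriterEncodingDemo, bytesWritten) are made up for
illustration. It assumes the response writer behaves like a PrintWriter
wrapped around a charset-aware OutputStreamWriter, and that the charset
used when none is set on the response is ISO-8859-1, which is what the
Servlet API documentation describes.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

class WriterEncodingDemo {

    // Writes text through a PrintWriter bound to the given charset and
    // returns the bytes that would end up in the response body.
    static byte[] bytesWritten(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charset));
        out.write(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e4"; // the character "ä", escaped so the source encoding does not matter

        // Assumed default when no encoding is set on the response: one byte (0xE4).
        System.out.println(bytesWritten(text, Charset.forName("ISO-8859-1")).length);

        // After selecting UTF-8: two bytes (0xC3 0xA4).
        System.out.println(bytesWritten(text, Charset.forName("UTF-8")).length);
    }
}
```

The same String produces different bytes depending on the writer's
charset, which is why the encoding has to be chosen on the response
before the writer is obtained.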
Thank you for your time!
// Leo
On 15 Mar 2010, at 23:58, Leo R. Lundgren wrote:
> To clarify, here are the combinations of
> StringTemplateGroup.setFileCharEncoding() and HTML meta charset I have
> tried, and their results:
>
> setFileCharEncoding("UTF-8")
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> Result: Characters NOT displayed correctly in browser (a "<?>" sign is
> displayed where there should be one "ä").
>
> setFileCharEncoding("ISO-8859-1")
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> Result: Characters NOT displayed correctly in browser (two junk
> characters where there should be one "ä").
>
> setFileCharEncoding("UTF-8")
> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
> Result: Characters displayed correctly in browser.
>
> setFileCharEncoding("ISO-8859-1")
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
> Result: Characters displayed correctly in browser.
>
> Pretty confusing.
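
The four results above are consistent with one explanation, sketched
here under two assumptions: the template files are stored as UTF-8 on
disk, and the response writer encodes its output as ISO-8859-1 because
no encoding was set on the response. The class and method names
(CharsetRoundTrip, seenInBrowser) are hypothetical and only model the
pipeline; they are not part of ST or the servlet API.

```java
import java.nio.charset.Charset;

class CharsetRoundTrip {
    static final Charset UTF8 = Charset.forName("UTF-8");
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Follows one piece of template text from disk to the browser.
    static String seenInBrowser(String text, Charset fileCharEncoding, Charset metaCharset) {
        byte[] onDisk = text.getBytes(UTF8);                    // assumed: template saved as UTF-8
        String inMemory = new String(onDisk, fileCharEncoding); // decoded per setFileCharEncoding()
        byte[] onWire = inMemory.getBytes(LATIN1);              // assumed: writer encodes as ISO-8859-1
        return new String(onWire, metaCharset);                 // browser decodes per the meta charset
    }

    public static void main(String[] args) {
        String a = "\u00e4"; // the character "ä"

        // Same order as the four combinations listed above.
        System.out.println(a.equals(seenInBrowser(a, UTF8, UTF8)));     // false
        System.out.println(a.equals(seenInBrowser(a, LATIN1, LATIN1))); // false
        System.out.println(a.equals(seenInBrowser(a, UTF8, LATIN1)));   // true
        System.out.println(a.equals(seenInBrowser(a, LATIN1, UTF8)));   // true
    }
}
```

In this model the two "working" combinations work only by accident: in
one the browser is told the output is Latin-1, which it is, and in the
other the file is mis-decoded in a way that the Latin-1 writer happens
to undo.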
>
> // Leo
>
>
> On 15 Mar 2010, at 23:48, Leo R. Lundgren wrote:
>
>> Hi,
>>
>> I found StringTemplateGroup.setFileCharEncoding(), which takes a
>> parameter that seems to be the same charset name that
>> java.io.InputStreamReader(InputStream in, String charsetName)
>> accepts.
>> I added it to my ViewHandler:
>>
>> public class ViewHandler {
>>     private StringTemplateGroup templateGroup;
>>     private Map<String, String> attributes = new HashMap<String, String>();
>>
>>     public ViewHandler(String viewBasePath) {
>>         templateGroup = new StringTemplateGroup("default", viewBasePath);
>>         System.out.println(templateGroup.getFileCharEncoding());
>>         templateGroup.setFileCharEncoding("UTF-8");
>>         System.out.println(templateGroup.getFileCharEncoding());
>>     }
>>
>>     public void setAttribute(String name, String value) {
>>         attributes.put(name, value);
>>     }
>>
>>     public String getOutput(String viewName) {
>>         StringTemplate view = templateGroup.getInstanceOf(viewName, attributes);
>>         return view.toString();
>>     }
>>
>>     public void render(Writer out, String viewName) throws IOException {
>>         out.write(getOutput(viewName));
>>     }
>> }
>>
>> Watching the console at the time of a request, it seems that UTF-8 is
>> already the default in the system. In any case, that is what the
>> option is set to. Still no go in the output, however; the encoding
>> issue remains.
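
As a point of reference for what the file encoding setting controls, the
following is a minimal sketch of charset-aware reading with
java.io.InputStreamReader. It only assumes that the charset name is used
the way InputStreamReader uses it; it does not show how ST itself loads
templates. The class and method names (ReaderDecodingDemo, decode) are
invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;

class ReaderDecodingDemo {

    // Decodes raw bytes into a String using the named charset, reading
    // through an InputStreamReader one character at a time.
    static String decode(byte[] raw, String charsetName) {
        try {
            Reader in = new InputStreamReader(new ByteArrayInputStream(raw), charsetName);
            StringBuilder text = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
            in.close();
            return text.toString();
        } catch (IOException e) {
            // Covers an unknown charset name as well as read failures.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The two bytes a UTF-8 encoded file holds for "ä" (U+00E4).
        byte[] raw = { (byte) 0xC3, (byte) 0xA4 };

        System.out.println(decode(raw, "UTF-8").length());      // one character
        System.out.println(decode(raw, "ISO-8859-1").length()); // two characters
    }
}
```

Reading with the right charset gives the right characters in memory, but
says nothing about which bytes are produced when those characters are
written out again.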
>>
>> I have checked all encoding settings in the files' properties and they
>> all say UTF-8 (inherited from container).
>> I also tried templateGroup.setFileCharEncoding("ISO-8859-1") instead,
>> and it did change the <?> to a couple of junk characters instead, so
>> that's not right either.
>> I'd also like to clarify that my previous information regarding the
>> HTTP response headers carrying a charset in them was wrong; there is
>> no such header sent. However, the browser adheres to the HTML meta tag
>> defining a charset; of that I am sure.
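
Checking whether a response declares a charset can be done
mechanically rather than by eye. The helper below is a rough sketch, not
a full header parser: it takes a Content-Type value as a plain string
(for example, whatever java.net.URLConnection.getContentType() returns)
and pulls out the charset parameter if there is one. The class and
method names (ContentTypeCharset, charsetOf) are hypothetical.

```java
class ContentTypeCharset {

    // Returns the value of the charset parameter in a Content-Type header
    // value, or null when the header is missing or declares no charset.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String value = param.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=utf-8"));
        System.out.println(charsetOf("text/html"));
    }
}
```

A null result corresponds to the situation described above, where the
header carries no charset and the browser falls back to the meta tag.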
>>
>> After some testing, I've found that there is /one/ thing that makes
>> the page display correctly: if I set the charset in the HTML of the
>> template to iso-8859-1 instead of utf-8, so that the browser parses the
>> contents as latin1, it displays correctly. I can't really draw any
>> other conclusion from this than that what the browser is sent is
>> encoded as latin1?
>>
>> At http://www.stringtemplate.org/api/org/antlr/stringtemplate/PathGroupLoader.html
>> I found the description "A brain dead loader that looks only in the
>> directory(ies) you specify in the ctor. You may specify the char
>> encoding. NOTE: this does not work when you jar things up! Use
>> CommonGroupLoader instead in that case".
>>
>> Reading the note in the description, and also reading http://www.stringtemplate.org/api/org/antlr/stringtemplate/CommonGroupLoader.html,
>> I get the feeling that it's not the actual char encoding that
>> doesn't work when "jar'ed up", but rather the loader class itself. But
>> is this something I should try anyway? If so, how do I use the group
>> loader?
>>
>> I did check with some Eclipse guys and they didn't feel that it was
>> Eclipse not saving files correctly. Personally, I don't know, since I
>> haven't used Eclipse long enough to form an opinion based on
>> experience with it.
>>
>> Silly question maybe, but could it be that ST just *reads* the
>> template files using UTF-8 (or the set encoding), but then outputs it
>> using Latin1?
>>
>> For reference, here's the beginning of the index HTML template:
>>
>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
>> <html lang="sv-SE">
>> <head>
>> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
>> <title>MyApp</title>
>> <link rel="stylesheet" type="text/css" href="css/common.css">
>> </head>
>> <body>
>> å ä ö <!-- test characters -->
>> $(contentTemplate)()$
>> </body>
>> </html>
>>
>> Many thanks,
>>
>> Regards, Leo
>>
>>
>> On 15 Mar 2010, at 19:50, Terence Parr wrote:
>>
>>> Hi. You have to tell ST to use a UTF-8 encoding. There should be an
>>> option on StringTemplateGroup or something.
>>> Ter
>>> On Mar 15, 2010, at 10:11 AM, Leo R. Lundgren wrote:
>>>
>>>> Hi,
>>>>
>>>> I am building a small servlet application using Eclipse, Tomcat 6,
>>>> JRE 1.6, ST 3.2. Here is a ViewHandler I'm using to wrap ST
>>>> functionality:
>>>>
>>>> public class ViewHandler {
>>>>     private StringTemplateGroup templateGroup;
>>>>     private Map<String, String> attributes = new HashMap<String, String>();
>>>>
>>>>     public ViewHandler(String viewBasePath) {
>>>>         templateGroup = new StringTemplateGroup("default", viewBasePath);
>>>>     }
>>>>
>>>>     public void setAttribute(String name, String value) {
>>>>         attributes.put(name, value);
>>>>     }
>>>>
>>>>     public String getOutput(String viewName) {
>>>>         StringTemplate view = templateGroup.getInstanceOf(viewName);
>>>>         view.setAttributes(attributes);
>>>>         return view.toString();
>>>>     }
>>>>
>>>>     public void render(Writer out, String viewName) throws IOException {
>>>>         out.write(getOutput(viewName));
>>>>     }
>>>> }
>>>>
>>>> The handler is used like this in a servlet:
>>>>
>>>> protected void doGet(HttpServletRequest request, HttpServletResponse response)
>>>>         throws ServletException, IOException {
>>>>     super.doGet(request, response);
>>>>
>>>>     String viewBasePath = getServletContext().getRealPath("/WEB-INF/view");
>>>>     ViewHandler viewHandler = new ViewHandler(viewBasePath);
>>>>     viewHandler.setAttribute("fileName", "test.png");
>>>>     viewHandler.setAttribute("contentTemplate", "uploadFile");
>>>>
>>>>     viewHandler.render(response.getWriter(), "index");
>>>> }
>>>>
>>>> It does what it is supposed to: the output I get is the contents of
>>>> the index.st template, with attributes replaced like they should be,
>>>> and the content template included as expected.
>>>>
>>>> However, Swedish characters such as åäö that are part of static
>>>> strings in the template files are shown in the browser(s) as question
>>>> marks. I know this indicates encoding/charset problems. An example
>>>> string (from the template files) that is not displayed correctly is:
>>>>
>>>> <input type="button" class="cancelUploadButton" value="Avbryt insättning">
>>>>
>>>> The 'ä' in the last word becomes a question mark in the browser.
>>>>
>>>>
>>>> So, I have:
>>>> - Checked the encoding settings in Eclipse, in all places I can find
>>>> that seem to relate to the source files and/or template files.
>>>> - Checked the encoding of the related template files (both in their
>>>> properties and using an external editor that loads them fine as
>>>> UTF-8).
>>>> - Verified that the HTTP response headers say UTF-8 as the charset.
>>>> The same goes for the HTML code itself, it's UTF-8 all the way.
>>>>
>>>> The only thing I haven't found to be apparently fine is when I open
>>>> the .java files from my project using another editor (TextMate, which
>>>> has always handled encodings fine for me). Normally TextMate displays
>>>> the encoding used/discovered from loading the file (for the template
>>>> files it says UTF-8), but for the Java source files it doesn't display
>>>> anything.
>>>> However, there are no static strings in the source files other than
>>>> template names and attributes, so I'm not sure that would matter. But
>>>> maybe it does, assuming there's something wrong with how the source
>>>> files are saved by Eclipse.
>>>>
>>>> Can someone shed some light on this issue? As I see it I've got UTF-8
>>>> everywhere (apart from possibly the Java source files, which I guess
>>>> could be the issue), and it should work. But maybe I need to change
>>>> something with regards to ST to have it work with UTF-8? If not, any
>>>> other ideas?
>>>>
>>>> Thank you,
>>>>
>>>> // Leo
>>>>
>>>> _______________________________________________
>>>> stringtemplate-interest mailing list
>>>> stringtemplate-interest at antlr.org
>>>> http://www.antlr.org/mailman/listinfo/stringtemplate-interest