[antlr-interest] What's the best way to manage token i18n?

ugol ugo.landini at gmail.com
Wed Mar 4 00:59:50 PST 2009


Hi all,
suppose I need to parse a language that must be easy to
internationalize: a simple interpreter in which keywords such as
"print" can be translated into your local language, e.g. "repeat"
becomes "ripeti" in Italian.
So I'll have one localization file (the one holding associations such
as repeat=ripeti) and the actual localized source file (the script
written with the translated keywords).
Is there a known strategy to manage this?
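
To make the setup concrete, here is a minimal sketch of loading such a
localization file into a lookup table. The one-pair-per-line layout,
the "#" comment convention, the "print=stampa" entry and the name
loadKeywordMap are assumptions made for illustration; only
repeat=ripeti comes from the description above.

```javascript
// Hypothetical localization file content: one "canonical=localized"
// pair per line. Only repeat=ripeti is taken from the post.
const italianKeywords = [
  "# Italian keywords",
  "repeat=ripeti",
  "print=stampa",
].join("\n");

// Build a lookup from localized keyword to canonical keyword.
function loadKeywordMap(text) {
  const map = new Map();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines (assumed "#" convention).
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq < 0) throw new Error("Malformed localization line: " + line);
    const canonical = line.slice(0, eq).trim();
    const localized = line.slice(eq + 1).trim();
    map.set(localized, canonical);
  }
  return map;
}

const keywordMap = loadKeywordMap(italianKeywords);
```

The table is keyed by the localized word because that is what appears
in the script being parsed.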
The simplest idea is to keep the main grammar in a canonical form
(let's say in English) and use a second grammar with StringTemplate to
translate from the local language into the canonical form. This seems
to work, but there are some problems to solve: you lose the mapping
between source file positions and tokens whenever a localized token is
not the same length as the canonical one (which is usually the case),
and that mapping is important for error reporting and syntax
highlighting.
Ah, and to make it worse, unfortunately I can't use StringTemplate
because the target language is JavaScript and it hasn't been ported
yet :(
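
One possible direction that avoids rewriting the source text at all is
to look keywords up at tokenization time: the token type becomes
canonical while the text and offsets stay those of the localized
source. The sketch below is a hand-written tokenizer, not ANTLR's
JavaScript runtime; the KEYWORDS table, the token type names and the
tokenize function are hypothetical, and identifiers are assumed to be
ASCII-only.

```javascript
// Hypothetical localized-to-canonical keyword table (Italian).
const KEYWORDS = new Map([
  ["ripeti", "REPEAT"],
  ["stampa", "PRINT"],
]);

// Scan the localized source and emit tokens whose `type` is canonical,
// while `text`, `start` and `stop` still refer to the original source.
function tokenize(source) {
  const tokens = [];
  // Identifiers, integer literals, or any single non-space character.
  const pattern = /[A-Za-z_][A-Za-z_0-9]*|\d+|\S/g;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    const text = match[0];
    let type;
    if (/^[A-Za-z_]/.test(text)) {
      // Localized keywords get the canonical type; anything else is an ID.
      type = KEYWORDS.get(text) || "ID";
    } else if (/^\d/.test(text)) {
      type = "NUMBER";
    } else {
      type = "SYMBOL";
    }
    tokens.push({
      type,
      text,
      start: match.index,                  // offset of first character
      stop: match.index + text.length - 1, // offset of last character
    });
  }
  return tokens;
}

const tokens = tokenize("ripeti 3 stampa x");
```

Because nothing is rewritten, an error or a highlight range for the
REPEAT token still points at offsets 0 to 5 of the script as typed,
regardless of how long the canonical keyword is.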

Thanks in advance for any hints.

-- 
uL

Pragmatist
http://blog.ugolandini.com
http://www.flickr.com/photos/ugol/

