[antlr-interest] What's the best way to manage token i18n?

Jim Idle jimi at temporal-wave.com
Wed Mar 4 07:29:21 PST 2009


ugol wrote:
> Hi all,
> suppose I need to parse a language that must be easy to
> internationalize: a simple interpreter in which keywords such as
> "print" can be translated into your local language, e.g. "repeat" is
> "ripeti" in Italian.
> So I'll have one localization file (the one with the association
> repeat=ripeti) and the real localized source file (the script with the
> translated keywords).
> Is there a known strategy to manage this?
> The simplest idea is to have the main grammar in a canonical form
> (let's say in English) and another StringTemplate grammar to translate
> from the local language into the canonical form. This seems to work,
> but there are some problems to solve, because you lose the mapping
> between the source file and the tokens if the localized tokens aren't
> the same length as the original ones (which is usually the case), and
> that mapping is important for error reporting and syntax highlighting.
> Ah, and to make it worse, unfortunately I can't use StringTemplate
> because the target language is JavaScript and it hasn't been ported
> yet :(
>
> tia for any hint
>
>   
If it is that simple and needs to run in JavaScript on a web page, then I
don't think you want to use ANTLR for this. In any case, the best way is
to write your own hand-coded lexer that always returns the token type the
parser expects.
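
A minimal sketch of such a lexer in plain JavaScript follows. It is an
illustration only: the token type names, the tokenize function, and the
"stampa" entry are assumptions, not anything from ANTLR or from the
original question (only repeat=ripeti comes from the thread). The lexer
looks each word up in a localization table and emits the canonical token
type, while start/end offsets still refer to the localized source, so
error reporting and syntax highlighting keep working.

```javascript
// Hypothetical canonical token types; a real parser would define its own.
const TokenType = Object.freeze({
  REPEAT: "REPEAT",
  PRINT: "PRINT",
  IDENT: "IDENT",
  NUMBER: "NUMBER",
  EOF: "EOF",
});

// Hypothetical localization table: localized keyword -> canonical type.
// "ripeti" is from the thread; "stampa" is an assumed translation.
const italianKeywords = new Map([
  ["ripeti", TokenType.REPEAT],
  ["stampa", TokenType.PRINT],
]);

// Splits source into tokens. Keywords are resolved through the table,
// so the parser only ever sees canonical token types.
function tokenize(source, keywords) {
  const tokens = [];
  // Sticky, Unicode-aware pattern: whitespace | word | integer.
  const pattern = /\s+|([\p{L}_][\p{L}\p{N}_]*)|(\d+)/yu;
  let pos = 0;
  while (pos < source.length) {
    pattern.lastIndex = pos;
    const m = pattern.exec(source);
    if (m === null) {
      throw new SyntaxError(`Unexpected character at offset ${pos}`);
    }
    const text = m[0];
    const end = pos + text.length;
    if (m[1] !== undefined) {
      // A word: a keyword if the table knows it, otherwise an identifier.
      const type = keywords.get(text) ?? TokenType.IDENT;
      tokens.push({ type, text, start: pos, end });
    } else if (m[2] !== undefined) {
      tokens.push({ type: TokenType.NUMBER, text, start: pos, end });
    }
    // Whitespace produces no token but still advances the offset.
    pos = end;
  }
  tokens.push({ type: TokenType.EOF, text: "", start: pos, end: pos });
  return tokens;
}
```

Switching language then means passing a different table, for example
tokenize(script, englishKeywords) with a table mapping "repeat" to
TokenType.REPEAT; the parser itself does not change.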

Jim

