[antlr-interest] Help - ANTLR and localized keywords

Edson Tirelli tirelli at post.com
Mon Sep 6 10:06:21 PDT 2010


    Milos,

    We had to do something similar (although with different goals) for
Drools. This is how I would do it (assuming your parser is generating
an AST):

    First, do not define keywords in the lexer. Just define an ID
lexer rule that will serve as your lexer token for both keywords and
regular IDs:

ID : ... ;

    Second, define a virtual token for each of your keywords:

tokens {
    VT_STEP;
}

    Third, define a "keyword rule" in your parser for each of your keywords:

step_key	:	{validateIdentifierKey(SoftKeywords.STEP)}?=>  id=ID
		->	VT_STEP[$id]
	;

    Where "SoftKeywords.STEP" returns the actual keyword you are
trying to recognize for whatever locale you are working with at the
moment.
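
    As a rough illustration of such a per-locale lookup, here is a
minimal sketch. The class name "LocalizedKeywords", its table layout,
and the fallback to English are assumptions made for this example;
this is not the Drools SoftKeywords class. The "step"/"krok" pair
comes from the question quoted below.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not the Drools SoftKeywords class.
final class LocalizedKeywords {
    // Language code -> (canonical keyword name -> localized spelling).
    private static final Map<String, Map<String, String>> TABLE = new HashMap<>();
    static {
        Map<String, String> en = new HashMap<>();
        en.put("STEP", "step");
        TABLE.put("en", en);

        Map<String, String> cs = new HashMap<>();
        cs.put("STEP", "krok");
        TABLE.put("cs", cs);
    }

    private final Map<String, String> active;

    LocalizedKeywords(Locale locale) {
        Map<String, String> found = TABLE.get(locale.getLanguage());
        // Assumption of this sketch: unknown languages fall back to English.
        this.active = (found != null) ? found : TABLE.get("en");
    }

    // Returns the localized spelling, or null if the keyword is not defined.
    String get(String canonicalName) {
        return active.get(canonicalName);
    }
}
```

    With a table like this, the predicate argument could be looked up
once per parse, for example with
new LocalizedKeywords(Locale.forLanguageTag("cs")).get("STEP").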

    Fourth, define the "validateIdentifierKey" function either as some
method on a helper class or even as a parser method directly.
Basically the function would be something like:

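// Sketch for the parser's @members section; "input" is the parser's token stream.
private boolean validateIdentifierKey(String text) {
    Token next = input.LT(1);
    // Guard against a missing keyword or token; equalsIgnoreCase(null) is false.
    return text != null && next != null && text.equalsIgnoreCase(next.getText());
}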

    Fifth, write your parser rules using your keyword rules in place
of your keywords:

step_statement: step_key ... ;
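
    To show the effect of the keyword rule without the ANTLR runtime,
here is a small stand-alone sketch. The class "SoftKeywordDemo" and
its list-of-strings token model are assumptions for illustration; a
generated parser works on Token objects and builds an AST instead of
returning a boolean.

```java
import java.util.List;

// Hypothetical stand-in for a parser; every word arrives as a plain ID token.
final class SoftKeywordDemo {
    private final List<String> tokens;
    private final String stepKeyword; // spelling of STEP for the active locale
    private int pos = 0;

    SoftKeywordDemo(List<String> tokens, String stepKeyword) {
        this.tokens = tokens;
        this.stepKeyword = stepKeyword;
    }

    // Mirrors the predicate: compare the keyword with the next token's text
    // (case-insensitively) without consuming the token.
    boolean validateIdentifierKey(String text) {
        return text != null && pos < tokens.size()
                && text.equalsIgnoreCase(tokens.get(pos));
    }

    // Mirrors step_key: match and consume the ID only if the predicate holds.
    boolean stepKey() {
        if (!validateIdentifierKey(stepKeyword)) {
            return false;
        }
        pos++;
        return true;
    }
}
```

    With the Czech spelling "krok" active, stepKey() accepts an input
token "KROK" and rejects "step", so "step" stays available as an
ordinary identifier in Czech sources.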

    That works for us. Because our grammar is quite complex, there are
a few keywords we had to keep as "hard keywords" defined in the lexer,
but for the most part it works pretty well.

    Our grammar can be found here if you want to take a look:

http://anonsvn.jboss.org/repos/labs/labs/jbossrules/branches/etirelli/drools-compiler/src/main/resources/org/drools/lang/DRL.g

    Hope it helps,
        Edson


2010/9/5 Milos Silhanek <silhanek.m at gmail.com>:
> Hi,
>
> I have implemented Karel language support in NetBeans Platform and I will
> use ANTLR to generate lexer, parser and AST.
> Karel language is described on
> http://karelnb.sourceforge.net/introduction.html
>
> But Karel's keywords are locale-specific (and case-insensitive) - the
> STEP command in English is KROK in Czech. I tried several ways. Using
> predicates generates code in the parser but not in the lexer. The advice
> on the mailing list didn't help me.
>
> How can I test keywords in different languages in lexer?
>
> Can I translate localized keywords into basic (English) in AntlrCharStream ?
>
>
> I intend to load the localized keywords in each lexer/parser. The program
> will be able to translate Karel sources from one language to another.
>
> Thanks for help.
>
> Milos Silhanek
> silhanek.m at seznam.cz



-- 
  Edson Tirelli
  JBoss Drools Core Development
  JBoss by Red Hat @ www.jboss.com

