[antlr-interest] About literal supports unicode

Ha Luong haluongvn at gmail.com
Sat Jun 13 10:17:20 PDT 2009


Hi Jim and Sam,

I copied the character 'à' from Accessories -> System Tools -> Character Map
into Notepad++ and saved the file with Format: Encode in UTF-8 without BOM.
I have attached the input file. Could you please help me?

Thank you very much,
Ha
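
A note on why the encoding question matters here: in a file saved as UTF-8 without BOM, 'à' (U+00E0) is stored as the two bytes 0xC3 0xA0. If those bytes are decoded with a single-byte charset, the lexer sees two different characters, and neither is in the range accepted by the ID rule. The sketch below is only an illustration using the standard Java library; the class name EncodingDemo and the hard-coded bytes are assumptions standing in for the attached input file, and the Test class from the thread is not shown, so how it actually opens its stream is unknown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the grammar or the ANTLR runtime.
public class EncodingDemo {

    // Read every character from the bytes through a Reader that uses
    // the given charset, the same way a file stream would be decoded.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Render each char as U+XXXX so the result does not depend on the
    // console's own encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed file content "à #" as UTF-8 without BOM:
        // 0xC3 0xA0 = 'à', 0x20 = space, 0x23 = '#'.
        byte[] fileBytes = { (byte) 0xC3, (byte) 0xA0, 0x20, 0x23 };

        // Decoded with the encoding the file was saved in.
        System.out.println(codePoints(decode(fileBytes, StandardCharsets.UTF_8)));
        // prints: U+00E0 U+0020 U+0023

        // Decoded with a single-byte charset instead.
        System.out.println(codePoints(decode(fileBytes, StandardCharsets.ISO_8859_1)));
        // prints: U+00C3 U+00A0 U+0020 U+0023
    }
}
```

With the ANTLR 3 Java runtime, the corresponding step would presumably be to pass the encoding name when constructing the character stream, for example through the two-argument constructors of ANTLRInputStream or ANTLRFileStream, rather than relying on the platform default.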

On Sat, Jun 13, 2009 at 9:45 AM, Jim Idle <jimi at temporal-wave.com> wrote:

> Ha Luong wrote:
> > Dear all,
> >
> > I tried to use the following grammar to accept a Unicode string:
> > //modify T.g in the example source of ANTLR book
> > grammar T;
> > options {
> >     language=Java;
> > }
> > @members {
> > String s;
> > }
> > r : ID '#' {s = $ID.text; System.out.println("found "+s);} ;
> > ID: ('a'..'z'|'\u00e0')+ ; //\u00e0
> > WS: (' '|'\n'|'\r')+ {skip();} ; // ignore whitespace
> >
> > and ran these commands in Cygwin:
> > java org.antlr.Tool T.g
> > javac *.java
> >
> > If I test with the literal 'a', it works:
> > java Test
> > a #
> > ^Z
> > found a
> >
> > but with the literal 'à', I get an error:
> How are you opening your input stream? Have you opened it using the
> encoding that the input file has been saved in, such as UTF-32, UTF-8,
> something else?
>
>
> Jim
>
-------------- next part --------------
à #

