[antlr-interest] c++ Unicode

pepone pepone pepone.onrez at gmail.com
Mon Apr 24 11:42:38 PDT 2006


Hi all

I have some problems getting Unicode working with C++.

I changed my lexer rules as Peggy suggested, and now my lexer and parser
compile OK, but they don't work as expected.

In cpp/examples/unicode, test.in is encoded as ISO-8859-1 and not as
UTF-8. Is it not possible to have the input encoded as UTF-8 or UTF-16?

If I encode my input as ISO-8859-1, I don't see the same characters
in the output as in the input.

For example, if I input 'ó' I get 'ã '. This happens with my lexer
as well as with the unicode example.
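
One possible explanation for this kind of symptom, sketched without any
ANTLR code: in UTF-8, 'ó' (U+00F3) is stored as the two bytes C3 B3, so a
reader that treats every byte as one character will produce two characters
instead of one. Whether that is what happens in the example is an
assumption, not something confirmed here. The helper names decode_utf8,
decode_latin1 and self_check below are hypothetical and are not part of
ANTLR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Simplified sketch: it rejects bad lead/continuation bytes and truncated
// input, but does not reject overlong encodings or surrogate code points.
std::vector<std::uint32_t> decode_utf8(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < bytes.size();) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the low 6 payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// Interpret the same bytes as ISO-8859-1: every byte is one code point.
std::vector<std::uint32_t> decode_latin1(const std::string& bytes) {
    std::vector<std::uint32_t> out;
    for (unsigned char b : bytes) out.push_back(b);
    return out;
}

// Usage: the same two bytes give one character or two, depending on
// which encoding the reader assumes.
void self_check() {
    const std::string o_acute = "\xC3\xB3";  // UTF-8 bytes for U+00F3
    assert(decode_utf8(o_acute).size() == 1);
    assert(decode_latin1(o_acute).size() == 2);
}
```

If the input file and the code that reads it disagree about the encoding,
a mismatch like this would show up before the lexer ever sees the
characters, so checking the actual bytes of the input file is a cheap
first step.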

How can I get this working? I want to input 'á' and see 'á' in
the output.

Any ideas on how to solve this?

On 4/23/06, Peggy Fieland <madcapmaggie at yahoo.com> wrote:
> Yes, if you have something like:
>
> GE: ">="
>
> you'll have to change it to:
>
> GE:  '>' '='
>
> There may be another way, but that one worked for me.
>
>
>
> --- pepone pepone <pepone.onrez at gmail.com> wrote:
>
> > I am trying to add Unicode support to my lexer, based on
> > example/cpp/unicode.
> >
> > When I add the Unicode char vocabulary
> > charVocabulary='\u0000'..'\uFFFE';
> >
> >
> > and try to compile the lexer, I get the following error:
> >
> >
> > WikiLexer.cpp: In member function `void WikiLexer::mDOCUMENT(bool)':
> > WikiLexer.cpp:192: error: invalid conversion from `const wchar_t*' to `unsigned int'
> > WikiLexer.cpp:192: error:   initializing argument 1 of `antlr::BitSet::BitSet(unsigned int)'
> > WikiLexer.cpp: In member function `void WikiLexer::mSECTION_1_TAG(bool)':
> > WikiLexer.cpp:206: error: invalid conversion from `const wchar_t*' to `unsigned
> >
> >
> > Any ideas
> > Thanks
> > --
> > play tetris http://pepone.on-rez.com/tetris
> > run gentoo http://gentoo-notes.blogspot.com/
> >
>
>



