[antlr-interest] c++ Unicode

pepone pepone pepone.onrez at gmail.com
Mon Apr 24 14:44:07 PDT 2006


Thanks for your help, Peggy. UTF-8 is working now; I had forgotten to create
UnicodeCharBuffer input(*in); in the main program.
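
For readers wondering why the extra buffer matters: a UTF-8 stream has to be decoded from bytes into code points before a lexer with a wide charVocabulary can match on it. The block below is only a minimal sketch of that decoding step. It does not reproduce the UnicodeCharBuffer class from the ANTLR unicode example, and the function name decodeUtf8 is hypothetical. It also does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of ANTLR): decode a UTF-8 byte string
// into a sequence of Unicode code points.
// Throws std::runtime_error on an invalid lead byte, a truncated
// sequence, or a bad continuation byte.
std::vector<std::uint32_t> decodeUtf8(const std::string& bytes)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80) {
            cp = lead;                      // 0xxxxxxx: plain ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            cp = lead & 0x1F; extra = 1;    // 110xxxxx: two-byte sequence
        } else if ((lead & 0xF0) == 0xE0) {
            cp = lead & 0x0F; extra = 2;    // 1110xxxx: three-byte sequence
        } else if ((lead & 0xF8) == 0xF0) {
            cp = lead & 0x07; extra = 3;    // 11110xxx: four-byte sequence
        } else {
            throw std::runtime_error("invalid UTF-8 lead byte");
        }

        if (i + extra >= bytes.size()) {
            throw std::runtime_error("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::runtime_error("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);    // append the low six bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

With this sketch, the two bytes 0xC3 0xA9 decode to the single code point U+00E9, which is the value a lexer rule written against '\u0000'..'\uFFFE' would need to see.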

Thanks again

On 4/24/06, Peggy Fieland <madcapmaggie at yahoo.com> wrote:
> I get my input in UTF-8. I am using ANTLR 2.7.5 and
> have modified it to handle utf-8 using the example in
> the examples directory as a model.
>
> I don't know what the state of C++ Unicode support is
> in antlr-2.7.6.
>
> Peggy
>
> --- pepone pepone <pepone.onrez at gmail.com> wrote:
>
> > Hi all
> >
> > I have some problems getting Unicode working with C++.
> >
> > I changed my lexer rules as Peggy said, and now my
> > lexer and parser compile OK, but they don't work as expected.
> >
> > In cpp/examples/unicode, test.in is encoded as
> > ISO-8859-1 and not as UTF-8. Is it not possible to have
> > the input encoded as UTF-8 or UTF-16?
> >
> > If I encode my input as ISO-8859-1, I don't see the
> > same characters in the output as in the input.
> >
> > For example, if I input '�' this gives me '� '. This
> > happens with my lexer as well as with the unicode example.
> >
> > What is the way to get this working? I want to input
> > '�' and view '�' in the output.
> >
> > Any ideas how to solve this?
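
One plausible cause of a single input character turning into two output characters is an encoding mismatch: the thread does not confirm this, so treat it as an assumption. A byte above 0x7F in ISO-8859-1 is one character, but the same code point written out as UTF-8 takes two bytes, and a terminal that still expects ISO-8859-1 shows those two bytes as two characters. The sketch below illustrates the re-encoding; the function name latin1ToUtf8 is hypothetical and is not part of ANTLR.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of ANTLR): re-encode ISO-8859-1 bytes
// as UTF-8. Every ISO-8859-1 byte value is equal to its Unicode code
// point (U+0000..U+00FF), so only two cases are needed.
std::string latin1ToUtf8(const std::string& latin1)
{
    std::string out;
    for (std::size_t i = 0; i < latin1.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(latin1[i]);
        if (c < 0x80) {
            // ASCII range: identical in both encodings.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two-byte form 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For instance, the ISO-8859-1 byte 0xE9 becomes the two bytes 0xC3 0xA9. If input and output are not read and written with the same encoding, the characters seen on each side will differ.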
> >
> > On 4/23/06, Peggy Fieland <madcapmaggie at yahoo.com>
> > wrote:
> > > Yes, if you have something like:
> > >
> > > GE: ">="
> > >
> > > you'll have to change it to:
> > >
> > > GE:  '>' '='
> > >
> > > There may be another way, but that one worked for me.
> > >
> > >
> > >
> > > --- pepone pepone <pepone.onrez at gmail.com> wrote:
> > >
> > > > I am trying to add Unicode support to my lexer, based on
> > > > example/cpp/unicode.
> > > >
> > > > I added the Unicode char vocabulary:
> > > > charVocabulary='\u0000'..'\uFFFE';
> > > >
> > > > When I try to compile the lexer, I get the following error:
> > > >
> > > >
> > > > WikiLexer.cpp: In member function `void WikiLexer::mDOCUMENT(bool)':
> > > > WikiLexer.cpp:192: error: invalid conversion from `const wchar_t*' to `unsigned int'
> > > > WikiLexer.cpp:192: error:   initializing argument 1 of `antlr::BitSet::BitSet(unsigned int)'
> > > > WikiLexer.cpp: In member function `void WikiLexer::mSECTION_1_TAG(bool)':
> > > > WikiLexer.cpp:206: error: invalid conversion from `const wchar_t*' to `unsigned
> > > >
> > > >
> > > > Any ideas
> > > > Thanks
> > > > --
> > > > play tetris http://pepone.on-rez.com/tetris
> > > > run gentoo http://gentoo-notes.blogspot.com/
> > > >



