[antlr-interest] Re: C++ Parsers - charVocabulary option

Ric Klaren klaren at cs.utwente.nl
Wed Jan 9 01:29:30 PST 2002


Hi,

On Tue, Jan 08, 2002 at 06:38:40PM -0000, therealtootalltimmy wrote:
>    Thanks a lot for replying to my question.  I failed to mention 
> that I 1) am parsing ASCII input only and

Hate to disagree there, but the copyright character is not ASCII. ASCII
only covers the byte range 0 - 127.

My guess is that it's in some kind of encoding. Which one I can't tell at
the moment, since the mailers have probably messed with the copyright
character on its way here: AFAIK it is at least a multibyte thing right now
(in the mail). If it is a multibyte thing, ANTLR can't handle it
automatically (in C++).

Try running your input file through a hexdump utility and see what exactly
is at the place of the copyright char.
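
If no hexdump utility is at hand, the same check can be sketched in a few
lines of C++. This is only a rough illustration: the helper names
nonAsciiOffsets and hexByte are made up for this example and are not part of
ANTLR. It lists the offsets of all bytes above 127 and formats a byte as hex.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of ANTLR): return the offsets of every
// byte that lies outside the 7-bit ASCII range 0..127.
std::vector<std::size_t> nonAsciiOffsets(const std::string& data) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < data.size(); ++i) {
        // Go through unsigned char: plain char may be signed, in which
        // case bytes above 127 would compare as negative values.
        if (static_cast<unsigned char>(data[i]) > 127) {
            offsets.push_back(i);
        }
    }
    return offsets;
}

// Hypothetical helper: format one byte as two uppercase hex digits.
std::string hexByte(unsigned char b) {
    char buf[3];
    std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
    return buf;
}
```

Read the input file into a std::string in binary mode and print hexByte()
for each reported offset. A lone A9 byte at the copyright sign would point to
ISO-8859-1; the pair C2 A9 would point to UTF-8, i.e. a multibyte encoding.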

> class MyLexer extends Lexer;
> options {
>    charVocabulary='\003'..'\377';

This should be ok.

> unexpected char: <a character that looks like an upper left corner of 
> an ASCII box>

Is this with 2.7.2a1 or my development version? Both print a hexdump of the
character if it does not pass isprint, which might tell a bit more.
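
To give an idea of what such a check looks like, here is a minimal sketch.
It is not the actual ANTLR code, and describeChar is a name invented for this
example: a character that passes isprint is shown as is, anything else is
shown as a hex value.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not ANTLR's implementation): return the character
// itself when it is printable, otherwise its value in hex, e.g. "0x03".
std::string describeChar(int c) {
    // std::isprint requires a value representable as unsigned char.
    unsigned char uc = static_cast<unsigned char>(c);
    if (std::isprint(uc)) {
        return std::string(1, static_cast<char>(uc));
    }
    char buf[8];
    std::snprintf(buf, sizeof buf, "0x%02X", static_cast<unsigned>(uc));
    return buf;
}
```

With output like this, a box-like glyph in the error message becomes an
unambiguous byte value that can be compared against the hexdump of the
input.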

Ric
-- 
-----+++++*****************************************************+++++++++-------
    ---- Ric Klaren ----- klaren at cs.utwente.nl ----- +31 53 4893722  ----
-----+++++*****************************************************+++++++++-------
Wit is cultured insolence. --- Aristotle

 
