[antlr-interest] MSVC 7.0

Arnar Birgisson arnarb at oddi.is
Thu Oct 2 10:18:04 PDT 2003


All right, I think I've found the source of that assert failure.

In my lexer, I have a rule for a keyword:

STADVAER : "staðvær";

The problem here seems to be the 'ð' character (eth). What you need to
know is that it is 0xf0 in ISO-8859-1. My lexer.g file is saved in
ISO-8859-1, so this character is stored as that single byte in the file.

I recall that the ANTLR documentation states that the input charset for
its metalanguage is 7-bit ASCII, so according to that, ANTLR should
have reported an error for this rule. However, the literal was translated
directly to a string constant in the C++ file. (Note: this works fine in
Java.)

Then, somewhere along the way, the expected character becomes an int.
It should be 0x000000f0 but is generated as 0xfffffff0, presumably
because the char is sign-extended. When 0xf0 is seen on the input, this
causes a MismatchedCharException. When the exception tries to generate
its message, it calls charName for 0xfffffff0 (the expected char), which
in turn calls isprint, and since 0xfffffff0 is negative, it blows up.

I guess this is partly my fault since I didn't follow ANTLR's
documentation carefully enough. Changing the rule to

STADVAER	:	"sta\360v\346r";

seems to fix this (it's butt-ugly though :o). However, I would like to
point out that this worked in Java, with its multibyte string
constants, and along the way, antlr.Tool never complained about its
input.

I'm sending this in mostly as a reference for others lexing
non-7-bit ASCII data with a C++ lexer, in case they hit the same walls I
did.

cheers!

Arnar

> -----Original Message-----
> From: Arnar Birgisson [mailto:arnarb at oddi.is] 
> Sent: 2. október 2003 16:01
> To: antlr-interest at yahoogroups.com
> Subject: RE: [antlr-interest] MSVC 7.0
> 
> 
> I'm sorry, this was resolved as soon as I changed my project's
> compilation to use a Multithreaded DLL runtime.
> 
> That left me with other problems. Now the "if (isprint(ch))" statement
> in charName(int) in string.cpp fails an assertion in the runtime's
> isctype.c, line 68:
> 
> Expression: (unsigned)(c+1) <= 256.
> 
> If this doesn't ring any bells, please ignore this post, I need to
> investigate this further. I just wanted to let you know that the other
> issue is resolved.
> 
> Arnar
> 
> > -----Original Message-----
> > From: Arnar Birgisson [mailto:arnarb at oddi.is] 
> > Sent: 2. október 2003 15:50
> > To: antlr-interest at yahoogroups.com
> > Subject: [antlr-interest] MSVC 7.0
> > 
> > 
> > Hello there.
> > 
> > I downloaded Ric's modified ANTLR 2.7.2, dated 2003-09-11, and am
> > trying to compile the cpp runtime to a dll with MSVC 7.0. I followed
> > the instructions in lib/cpp/README on how to create the project,
> > except that I skipped using a precompiled header (I checked "Empty
> > project" in the project wizard).
> > 
> > Everything compiled without a problem (apart from warnings). However,
> > when running a lexer very similar to the MultiLexer example, I get
> > access violations right at the beginning, related to IO.
> > 
> > More specifically, I get an access violation error at the
> > "input.get()" statement in antlr::CharBuffer::getChar(), trying to
> > read memory 0x00000014.
> > 
> > Any ideas?
> > 
> > Arnar
> > 
> > 
> >  
> > 
> > Your use of Yahoo! Groups is subject to 
> > http://docs.yahoo.com/info/terms/ 
> > 
> > 


 




