[antlr-interest] Infinite lexer exception allocation loop in Ctarget C parser, antlr3.0

Tue Jul 3 09:29:04 PDT 2007

Jim,

thanks for that.  Yeah, I already added the \x support, but it sowed the
seeds of doubt that some other user syntax or lexical error may cause
the same problem.  I use Fedora Core 3, and it seemed to recover by
killing the task once it asked for all 2 Gbytes, but I don't know if XP
is equally robust.

BTW, really nice job on the C target.  Much better documentation than
most open source projects.  I'm really looking forward to the C++
target, if only to eliminate the explicit passing of the 'this' pointer.

Just a few points: 

1) I basically had little trouble getting it to work.  The only real
problem was that the sources don't come with a configure script already
built, so one has to use autoconf.  To cut a long story short, I
eventually gave up on the GNU tools because Makefile.in wasn't available
so I couldn't successfully run automake.  It turned out to be easy to
create a config.h file from scratch, so that's what I ended up doing.
Nevertheless, you might want to add a working configure script and
Makefile.am files, as is done with most source distros, to avoid
inconvenience to those with back-level autoconf tools.  (Admittedly I'm
all at sea with autoconf tools, so it's probably my stupidity).

2) Before I realized you had already ported Terence's C grammar/Java
target to a C target, I was attempting to do so myself.  While doing
this, I found out that the @header stuff is put into both the .c and the
.h files.  I really wanted to put the @header stuff in just the .h file
(is that unreasonable?) so I decided to tweak the string template file.
Lo and behold, the changes to the string templates were not noticed by
the antlr tool.  What am I missing?  I thought it was just a matter of
changing 

$(ANTLR_HOME)/src/org/antlr/codegen/templates/C/C.stg

but it obviously isn't that simple...  Surely I don't have to rebuild
antlr - that would mean I'd have to get a java compiler - argh!

Regards,
SJH

> -----Original Message-----
> From: Jim Idle [mailto:jimi at temporal-wave.com] 
> Sent: Monday, July 02, 2007 7:25 PM
> To: Hardy, Stephen; antlr-interest at antlr.org
> Subject: RE: [antlr-interest] Infinite lexer exception allocation loop in Ctarget C parser, antlr3.0
> 
> It sounds like an issue with the grammar that exposes an issue in the
> runtime, though something tells me that this may be related to an
> extremely recent change to deal with something related. 
> 
> Thanks for pointing this out - I will endeavor to fix this tomorrow or
> at least provide an answer/work around (which might be to allow \x
> sequences ;-).
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Hardy, Stephen
> > Sent: Monday, July 02, 2007 1:31 PM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] Infinite lexer exception allocation loop in Ctarget C parser, antlr3.0
> > 
> > Hi all,
> > I'm tasked with some C-to-C translation, and have been using the ANSI C
> > grammar with the C target as a starting point, with antlr3.0.
> > 
> > The grammar inadvertently omits the possibility of using \x (hex)
> > escapes in a literal string, and this causes an infinite memory
> > allocation loop when the parser is run against tokens such as
> > "\x00\x00".  The offending sequence of code is in mSTRING_LITERAL(),
> > which calls mEscapeSequence(), which in turn allocates an exception
> > struct (CONSTRUCTEX()) when it fails to understand the \x.
> > Unfortunately, the calling code is a for(;;) loop which does not advance
> > the token stream, hence the lexer will allocate forever.
> > 
> > Sorry I'm really new to this, so it may be my fault, but it looks like
> > it may be a C target problem.  Didn't see any similar problem mentioned
> > in the most recent archive.
> > 
> > Regards,
> > SJH
>
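
The failure mode described in the original report (a for(;;) loop that
records an error but never consumes the offending input) can be
sketched in plain C.  This is only an illustration under stated
assumptions, not the code the C target generates: match_escape() and
scan_string_body() are hypothetical stand-ins for mEscapeSequence() and
mSTRING_LITERAL(), the error counter stands in for the exception
allocation, and max_errors stands in for running out of memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mEscapeSequence(): on a recognised escape it
 * advances *pos past it and returns 1; on anything else (such as \x) it
 * returns 0 and leaves *pos untouched. */
static int match_escape(const char *in, size_t *pos)
{
    char next;

    if (in[*pos] != '\\')
        return 0;
    next = in[*pos + 1];
    if (next != '\0' && strchr("btnfr\"'\\", next) != NULL) {
        *pos += 2;
        return 1;
    }
    return 0; /* unknown escape: nothing consumed */
}

/* Hypothetical stand-in for the loop in mSTRING_LITERAL().  Scans up to
 * the closing quote or end of input and returns the number of errors
 * recorded.  With recover == 0 a failed escape leaves pos unchanged, so
 * the same failure repeats until max_errors is reached.  With
 * recover != 0 one character is skipped after each failure, so the loop
 * always makes progress. */
static size_t scan_string_body(const char *in, int recover, size_t max_errors)
{
    size_t pos = 0;
    size_t errors = 0;

    for (;;) {
        if (in[pos] == '\0' || in[pos] == '"')
            break;
        if (in[pos] == '\\') {
            if (!match_escape(in, &pos)) {
                errors++;            /* the report says an exception struct is allocated here */
                if (errors >= max_errors)
                    break;           /* stand-in for exhausting memory */
                if (recover)
                    pos++;           /* consume the backslash and move on */
            }
        } else {
            pos++;                   /* ordinary character */
        }
    }
    return errors;
}
```

Calling scan_string_body("\\x00\\x00", 0, 1000) runs into the cap,
whereas the same call with recover set to 1 records one error per
unknown escape and then terminates.  Whether skipping a single
character is the right recovery for the real runtime is a separate
question; the point is only that the loop needs some guarantee of
progress after a failed match.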