[antlr-interest] [C target] @after and/or finally do not cover all cleanup cases when backtracking is on

Wed Dec 29 11:21:30 PST 2010

I cannot guarantee that @init and @after work perfectly in backtracking
mode, as it gets too complicated to handle with the template
functionality alone. They work correctly without backtracking.

I have covered this in prior posts if you look at antlr.markmail.org. But,
the other thing you should consider is that you should not allocate memory
until the parse is found to be good. The best way to do that is to not do
anything until the tree walking phase. If you are trying to do things in
the parsing phase then you have to be very careful - probably avoid @init
and @after etc and perhaps better yet, use factories to create your
allocations that can then track and release the memory without you dealing
with it explicitly.
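
As a rough illustration of the factory idea, here is a minimal sketch in
plain C. It is not the ANTLR C runtime's own factory API: Factory,
factory_alloc, factory_release_all, fake_line_rule and demo_run are
hypothetical names made up for this example. The point is only that every
allocation is recorded, so a single release call cleans up no matter which
"return;" a rule happened to leave by.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tracking factory (not part of the ANTLR runtime):
 * every object it hands out is recorded in a linked list. */
typedef struct TrackedNode {
    struct TrackedNode *next;
    void *payload;
} TrackedNode;

typedef struct {
    TrackedNode *head;
    size_t live;            /* number of allocations not yet released */
} Factory;

static void factory_init(Factory *f) {
    assert(f != NULL);
    f->head = NULL;
    f->live = 0;
}

/* Allocate zeroed memory and remember it; returns NULL on failure. */
static void *factory_alloc(Factory *f, size_t size) {
    assert(f != NULL);
    TrackedNode *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->payload = calloc(1, size);
    if (n->payload == NULL) { free(n); return NULL; }
    n->next = f->head;
    f->head = n;
    f->live++;
    return n->payload;
}

/* Free everything the factory handed out; returns how many were freed. */
static size_t factory_release_all(Factory *f) {
    assert(f != NULL);
    size_t released = 0;
    while (f->head != NULL) {
        TrackedNode *n = f->head;
        f->head = n->next;
        free(n->payload);
        free(n);
        released++;
    }
    f->live = 0;
    return released;
}

/* Stand-in for a rule function that may bail out early, the way a
 * generated rule can hit a bare "return;" before its cleanup code. */
static int fake_line_rule(Factory *f, int fail_early) {
    int *state = factory_alloc(f, sizeof *state);
    if (state == NULL) return -1;
    *state = 1;
    if (fail_early) return 0;   /* early exit: nothing freed here */
    return 1;
}

/* Run two "rules", one of which exits early, then release everything. */
static size_t demo_run(void) {
    Factory f;
    factory_init(&f);
    fake_line_rule(&f, 1);
    fake_line_rule(&f, 0);
    return factory_release_all(&f);
}
```

Calling demo_run() from a main() of your own releases both allocations,
including the one made by the call that returned early. The rule code
itself never frees anything; whoever owns the factory does it once, after
the parse.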

Basically, Justin, you are going about constructing this in a way that is
going to get you into trouble with the C target, I think. I strongly
recommend left factoring your grammar, as the time saved later because of
all the hassle you will avoid will outweigh the time spent doing it now.
Your main issue is that if you are using backtracking AND you are
allocating memory, then you don't always know if you allocated anything
(without using factories or auto allocation etc.) and error messages will
turn up in very unlikely places. If you don't wish to get out of
backtracking, then you should definitely move to parse -> tree, verify
the tree, then tree walk and do stuff there.
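
To make the staging concrete, here is a small sketch in plain C of the
verify-then-walk part, assuming the parse has already produced some tree.
LineNode, LineState, verify_tree, walk_tree and run_pipeline are
hypothetical names for this example only, and a flat array stands in for
the real tree. Nothing is allocated until the input is known to be good,
so there is no backtracking path that can skip the cleanup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one parsed line in the tree. */
typedef struct {
    const char *text;
} LineNode;

/* Hypothetical per-line object, created only during the walk. */
typedef struct {
    size_t length;
} LineState;

/* Verify stage: no allocation, just a yes/no answer. */
static int verify_tree(const LineNode *lines, size_t count) {
    assert(lines != NULL);
    for (size_t i = 0; i < count; i++) {
        if (lines[i].text == NULL || lines[i].text[0] == '\0') return 0;
    }
    return 1;
}

/* Walk stage: allocate per-line state, use it, free it.
 * Returns the number of lines processed. */
static size_t walk_tree(const LineNode *lines, size_t count) {
    assert(lines != NULL);
    size_t done = 0;
    for (size_t i = 0; i < count; i++) {
        LineState *s = malloc(sizeof *s);
        if (s == NULL) break;
        s->length = strlen(lines[i].text);
        /* ... act on the line here ... */
        free(s);            /* one allocation, one matching free */
        done++;
    }
    return done;
}

/* Only walk (and therefore only allocate) if verification passed. */
static size_t run_pipeline(const LineNode *lines, size_t count) {
    if (!verify_tree(lines, count)) return 0;
    return walk_tree(lines, count);
}
```

With this split, the allocation and its matching free sit next to each
other in ordinary code, instead of being spread across @init, @after and
whatever exit paths the generated rule function has.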

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Justin Murray
> Sent: Wednesday, December 29, 2010 7:10 AM
> To: antlr-interest at antlr.org
> Cc: Robert Jacobs
> Subject: [antlr-interest] [C target] @after and/or finally do not cover
> all cleanup cases when backtracking is on
>
> Hello,
>
>
>
> I have some concerns with the generated parser C code when using @after
> and/or finally blocks in conjunction with backtracking. In my parser,
> for every line matched in a program, I have to create a new instance of
> a C++ class. This instance is in the scope of the "line" rule, and is
> used by the actions of this rule and all of the nested rules that it
> calls. To accomplish this, I added an @init section to create the new
> instance. I was not sure if I should use @after or finally to do the
> cleanup, so I initially tried using both. See the attached Test.g for a
> simplified example. What I found was that with backtracking turned off,
> what I wanted was the finally block, and it would always free the
> allocated memory. Unfortunately, I am stuck with backtrack=true (at
> least for now), and it appears to me that the generated code will not
> clean up memory properly in all cases. As you can see in the attached
> TestParser.c, there are two cases in the line() function where there is
> a "return;" line that does not execute the cleanup code from the @after
> or finally blocks. I am not sure how likely it is for these cases to
> occur, but it is clear to me that there is a potential for memory leaks
> in the generated code.
>
>
>
> The next thing I tried was to move my allocation/deallocation code up
> one level, into the actions of the "prog" rule. This is probably a
> better choice because the memory will not be allocated when
> backtracking. See the attached Test2.g for my simplified
> implementation.
> This generated code (see attached Test2Parser.c) looks better, but
> there is still the potential for a memory leak. Looking at the prog()
> function, the memory is only allocated if BACKTRACKING == 0, and then
> the line() rule function is called. After it returns from line(), if
> HASEXCEPTION() is true, the cleanup is handled by the finally block
> case. However, if HASFAILED() returns true (and HASEXCEPTION() returns
> false), then there is again a direct "return;" that does not do memory
> cleanup. I am beginning to think that maybe this is an impossible
> condition (does HASFAILED() only get set when backtracking?), but it is
> unclear to me.
>
>
>
> I am hoping that you will tell me that the memory leak in the second
> case is an impossible condition, and then I can move on with that
> implementation. If that is not the case, is there some sort of
> workaround that I can do? Regardless of that, it seems like the first
> implementation should also be valid, and maybe the generated code could
> handle that better.
>
>
>
> I know that the real solution is to left-factor the grammar and not use
> backtracking, but the syntax is inherently flawed and ambiguous, and we
> are not allowed to break back compatibility. While it may be possible,
> I am probably not skilled enough to figure it out in the allotted
> timeframe. The good news is that even with backtracking and memoizing,
> the ANTLR parser performs 25% faster than what it replaced (Visual
> Parse++, an ancient and horrible parser generator).
>
>
>
> Thanks again,
>
> Justin Murray
> Software Engineer
> jmurray at aerotech.com
>
> Aerotech, Inc.
> 101 Zeta Drive
> Pittsburgh, PA 15238
> 412-963-7470
>
>