[antlr-interest] [C target] [3.1.1] Memory consumption goes through the roof with rewrite rules

Sam Harwell sharwell at pixelminegames.com
Tue Jan 20 08:03:18 PST 2009


I've seen some cases where the lexer will emit a 0-length token and not
proceed. This creates empty tokens until the system runs out of memory.
I'm not familiar with the C target's internals, so maybe someone else
can direct you to the correct function to watch for this behavior?
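
To make the failure mode concrete, here is a toy sketch in plain C. It
does not use the C target's runtime at all: Token, next_token() and
count_tokens_guarded() are made-up names, and the "rule" is just a
whitespace rule that is allowed to match nothing. The point is only the
check that a token actually consumed input.

```c
#include <stddef.h>

/* Hypothetical token: a half-open range [start, stop) into the input. */
typedef struct {
    size_t start;
    size_t stop;
} Token;

/* Toy lexer rule, roughly  WS : ' '* ;  which may match zero characters. */
static Token next_token(const char *input, size_t pos)
{
    Token t = { pos, pos };
    while (input[t.stop] == ' ') {
        t.stop++;
    }
    return t;
}

/*
 * Tokenizes input and returns the number of tokens produced, or -1 as
 * soon as an empty token shows up before the end of the input.
 * Without that check, pos would never advance and the loop would keep
 * emitting empty tokens forever.
 */
static long count_tokens_guarded(const char *input)
{
    size_t pos = 0;
    long count = 0;

    while (input[pos] != '\0') {
        Token t = next_token(input, pos);
        if (t.stop == t.start) {
            return -1;          /* zero-length token: the lexer is stuck */
        }
        pos = t.stop;
        count++;
    }
    return count;
}
```

In this sketch an input of only spaces yields one token, while any input
containing a non-space character trips the guard, because the only rule
can match the empty string at that position.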

Now I wish I had saved a copy of a grammar that showed this behavior. :\ I
remember its cause and solution were relatively easy to identify at the
time, so I fixed my grammar and moved on.

Sam

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Sven Van
Echelpoel
Sent: Tuesday, January 20, 2009 7:34 AM
To: antlr-interest Interest
Subject: [antlr-interest] [C target] [3.1.1] Memory consumption goes
through the roof with rewrite rules

Hi,

After passing a piece of non-trivial source code through the parser of
the language we created, memory consumption went through the roof: it
immediately consumed the full 3GB of memory on my 64-bit Ubuntu box, and
that for a source file of a measly 14K characters. Using a debugger and
a profiler I quickly came to the same conclusion as this poster:
http://preview.tinyurl.com/a94qgn. Recursive invocation of rewrite rules
results in addChild() and dupTree() being invoked millions of times.
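
As a rough model of why the call counts grow so fast (this is a toy, not
the ANTLR runtime; Node, dup_tree(), rewrite() and copies_for_depth() are
hypothetical names): assume a rule nested `depth` levels deep, where each
level deep-copies the subtree returned by the inner level before adding
it as a child. The real grammar may grow differently, but even this
simple chain copies far more nodes than the final tree contains.

```c
#include <stdlib.h>

/* Toy tree node: a single child is enough to model nesting depth. */
typedef struct Node {
    struct Node *child;
} Node;

static unsigned long nodes_copied = 0;

/* Deep-copies a chain of nodes, counting every node it allocates. */
static Node *dup_tree(const Node *tree)
{
    if (tree == NULL) {
        return NULL;
    }
    Node *copy = malloc(sizeof *copy);
    if (copy == NULL) {
        abort();
    }
    nodes_copied++;
    copy->child = dup_tree(tree->child);
    return copy;
}

static void free_tree(Node *tree)
{
    while (tree != NULL) {
        Node *next = tree->child;
        free(tree);
        tree = next;
    }
}

/*
 * Models a recursive rule nested `depth` levels deep. Each level builds
 * its own node and attaches a duplicate of the inner level's result.
 */
static Node *rewrite(unsigned depth)
{
    Node *node = malloc(sizeof *node);
    if (node == NULL) {
        abort();
    }
    node->child = NULL;
    if (depth > 1) {
        Node *inner = rewrite(depth - 1);
        node->child = dup_tree(inner);   /* the expensive step */
        free_tree(inner);
    }
    return node;
}

/* Returns how many nodes were copied to build a tree of `depth` nodes. */
static unsigned long copies_for_depth(unsigned depth)
{
    nodes_copied = 0;
    free_tree(rewrite(depth));
    return nodes_copied;
}
```

In this model a final tree of d nodes costs d*(d-1)/2 node copies, so a
nesting depth of 1000 already means 499500 copies. The toy frees the
intermediate trees; if copies were instead kept alive until the end of
the parse, memory would grow along with the count.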

His example was slightly contrived, but because of the problem domain
our language really is highly recursive in nature. And although the
source file was slightly bigger than a toy sample, it is nowhere near
the size of the input we intend to process.

So, desperately wanting to get to the bottom of this I used Jim Idle's
response to the aforementioned post (emphasis mine):

<quote>
That looks like a bug - in fact, I think I remember saying to myself 
"now I have put in factories for everything, I should *not call dup on 
the trees*! However, I also remember doing that, so perhaps it is 
something to do with your grammar) Sorry, but I still won't be able to 
look in to this until the weekend :-( 
</quote>

to determine that I could maybe short-circuit the code in dupTree() in
'antlr3basetree.c'.

Here's what I did:

static void *
dupTree(pANTLR3_BASE_TREE tree)
{
    /* Fixes memory explosion when using rewrite rules */
    return tree;
    /* Rest of code removed */
}

in other words, instead of duplicating the tree, I just return it. This
seems to solve the problem. Memory consumption is back to normal and the
code seems to run OK.
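
The one risk I can think of is aliasing: once dupTree() returns its
argument, the "copy" and the original are the same object, so a change
made through one is visible through the other. I do not know whether the
generated code ever modifies a tree after duplicating it, so the sketch
below is only a toy in plain C with made-up names (Tree, dup_deep(),
dup_shared(), original_type_after_edit()), not the runtime's code.

```c
#include <stdlib.h>

/* Toy tree node with a token type and a single child. */
typedef struct Tree {
    int          type;
    struct Tree *child;
} Tree;

static Tree *new_tree(int type, Tree *child)
{
    Tree *t = malloc(sizeof *t);
    if (t == NULL) {
        abort();
    }
    t->type = type;
    t->child = child;
    return t;
}

static void free_tree(Tree *t)
{
    while (t != NULL) {
        Tree *next = t->child;
        free(t);
        t = next;
    }
}

/* A real duplicate: a fresh node for every node in the subtree. */
static Tree *dup_deep(Tree *t)
{
    if (t == NULL) {
        return NULL;
    }
    return new_tree(t->type, dup_deep(t->child));
}

/* The short-circuit: hand back the very same tree. */
static Tree *dup_shared(Tree *t)
{
    return t;
}

/*
 * Duplicates a one-node tree with the given strategy, retags the
 * duplicate, and reports the type now seen through the original.
 */
static int original_type_after_edit(Tree *(*dup)(Tree *))
{
    Tree *original = new_tree(1, NULL);
    Tree *copy = dup(original);
    int seen;

    copy->type = 2;             /* e.g. a later rewrite retags the node */
    seen = original->type;

    if (copy != original) {
        free_tree(copy);
    }
    free_tree(original);
    return seen;
}
```

With dup_deep the original keeps type 1; with dup_shared it silently
becomes 2. If nothing in the generated parser ever touches a duplicated
tree afterwards, the shortcut would be harmless in this respect.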

The questions that I have are these:

1) Is this workaround valid, or are there still some skeletons left in
the closet?
2) If this is a bug, as was indicated in the response to the OP, will it
get fixed in one of the following releases of ANTLR?

Thanks for the input,

Sven



