[antlr-interest] ANTLR in C++

Eugen Cazacu the_e57 at yahoo.com
Thu Apr 5 05:24:33 PDT 2007


Hello,
I hope you don't mind cc the group, because I am also
giving this answer to the community, and for anyone
who needs explanation about how antlr C++ memory
management works.

Thank you for the appreciation. I haven't done any
antlr working recently so I am a bit rusty, so my
answers might not be very exact. Here are some point
you need to understand in order to understand my
problem and its solution, if you don't know any of
them, you should look it up:
1) Any tree can be represented as a binary tree with
first child and next sibling children, and this is how
antlr implements the AST.
2) C++ antlr has a reference counting mechanism which
is managing the memory. A newly created pointer in the
create (not sure about the function name) is wrapped
in the RefCount template class and it will
automatically delete the pointer when the counter
comes to 0. The RefCount<AST> can be used just like
any pointer, but because there are no actual pointers
in the code, there are also no deletes. The delete
operation is executed by the RefCount. You can look
through it if you are good with templates in C++,
otherwise it will be quite difficult to understand.
Basically it is a template class that takes the type
of the pointer it is managing as its parameter, just
like auto_ptr in C++, only smarter. Each time the
refcount is moved around, replicated, or deleted an
operator/constructor/function in refcount is invoked
and modifies the counter of the pointer. This is done
behind the scene and you just use it just as a regular
reference (just like in Java. Note: RefCount is not a
garbage collector, the gc in Java is much more complex
and handles some cases where the RefCount fails. the
occurrence of circular references).
3) The problem: because of the two above the following
happens: When a tree root exits a scope an has to be
destroyed, the destructor of the refcount is invoked,
this causes the destructor of the node to be invoked,
which in order to exit succesfully invokes all of its
children destructors, one of which is the nextSibling
member. This is also a refcount and does the same for
all of its members, and also has a next sibling, and
so on. When the number of siblings is close to 30.000
or more this generates a stack overflow because of the
function recursions (destructor of refcount and
destructor of the AST node). The same thing happens
also with a big depth, but usually this is not a
problem because the first child is much less used than
next sibling. A good example would be for a C++
parser:
{ int a1;
  {int a2;
  .....
    { int a30000;
    }
  }
}
The above is has a little probability to happen.
However the following structure is very much real:
{ int a1; int a2; .... int a30000; }
Some languages might have a very big linear trees like
this with a lot of siblings. An as you can see the
space complexity of the destruction of such a
structure is O(n).

4) The solution: use my own destruct function for the
tree node which will hold the temporary data in the
heap memory, not the stack memory. In my function I do
this by holding the temporary data in a queue
structure, but it might as well be a stack or
something else. By holding the nextsibling and the
firstchild of each node in such a structure, I force
the refcount not to invoke the destructor for them
because I am creating a +1 for each of them in the
counter, and when the root node is destroyed, it
doesn't recurse in destroying its children, the next
step is to destroy each child by hand using the same
method.

PS: There are no destructors because there are no
actual pointers in the code. RefCount does that job

I hope this answers your question, or at least help
you understand were to find the answer.

--- me 262 <me262c at gmail.com> wrote:

> Hi Eugen,
> 
>   I saw your posting last year on ANTLR's memory
> management. Did the
> problem get resolved in 2.7.7?
> 
>   We develop in MSVC7. I am new to parsing. In the
> past week I looked
> at ANTLR, Spirit, Lex/Yacc and others in the zoo.
> ANTLR is nice for
> Java but I can't understand how memories are
> managed. For example,
> none of the generated classes have destructors.
> 
> A appreciate your comments on the suitability of
> ANTLR for C++ projects.
> 
> Thanks in advance,
> 
> M. E.
> 
> Santa Clara, CA
> 



 
____________________________________________________________________________________
It's here! Your new message!  
Get new email alerts with the free Yahoo! Toolbar.
http://tools.search.yahoo.com/toolbar/features/mail/


More information about the antlr-interest mailing list