[antlr-interest] Memory usage of nilNodes in the C target

Wed Apr 1 03:27:25 PDT 2009

Preamble/warning: I'm, yet again, doing disturbing and possibly 
undefined things with the C target ;-)

The addition of the 'reuse' method on trees and the nilStack in the 
arboretum helped memory usage quite a bit in my parser, but we were 
still finding that an awful lot of memory was being taken up by nilNodes 
after the parser had run. In fact, the nilNodes were heavily 
outweighing 'real' nodes in memory usage.

We eventually tracked this down to backtracking. While nilNodes were 
being reused (via becomeRoot and rulePostProcessing) when a rule matched 
correctly, they were otherwise being orphaned. The problem was so large 
for us because some of our rules are called up to ten deep to match a 
single token, with a nilNode being orphaned at each level.

The eventual solution was to have the templates extend generated code 
such as:

     if ( BACKTRACKING==0 )
     {
         retval.stop = LT(-1);
         retval.tree = (pANTLR3_BASE_TREE)(ADAPTOR->rulePostProcessing(ADAPTOR, root_0));
         ADAPTOR->setTokenBoundaries(ADAPTOR, retval.tree, retval.start, retval.stop);
     }

by adding

     else { if(root_0) { root_0->reuse(root_0); root_0 = NULL; } }

This helped our memory usage _enormously_. While this worked in our 
parser (we've since passed a very thorough barrage of tests without a 
crash or memory leak or invalid result), is this a good idea in general?
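
For anyone who wants to see the effect in isolation, here is a minimal 
sketch of the pattern as a toy model. It is not the ANTLR3 C runtime: 
ToyNode, ToyPool, toy_nil, toy_reuse, toy_rule and toy_simulate are all 
hypothetical names made up for illustration, and the success path is 
simplified to "hand the node back to the pool". It only counts how many 
nodes have to be allocated when nested rules fail while backtracking, 
with and without the extra else branch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a tree node (hypothetical, not pANTLR3_BASE_TREE). */
typedef struct ToyNode {
    struct ToyNode *next_all;   /* every node ever allocated, for cleanup */
    struct ToyNode *next_free;  /* link used only while on the free list */
} ToyNode;

/* Toy stand-in for the arboretum with its stack of reusable nil nodes. */
typedef struct ToyPool {
    ToyNode *all;
    ToyNode *free_list;
    size_t   allocated;         /* number of nodes obtained from malloc */
} ToyPool;

/* Hand out a nil node: recycle one if available, otherwise allocate. */
static ToyNode *toy_nil(ToyPool *pool)
{
    ToyNode *node = pool->free_list;
    if (node != NULL) {
        pool->free_list = node->next_free;
        node->next_free = NULL;
        return node;
    }
    node = malloc(sizeof *node);
    if (node == NULL)
        abort();
    node->next_all = pool->all;
    node->next_free = NULL;
    pool->all = node;
    pool->allocated++;
    return node;
}

/* Return a node to the pool so a later toy_nil call can recycle it. */
static void toy_reuse(ToyPool *pool, ToyNode *node)
{
    node->next_free = pool->free_list;
    pool->free_list = node;
}

/* One rule invocation, nested `depth` deep, mimicking the generated code. */
static void toy_rule(ToyPool *pool, int depth, int backtracking, int recycle)
{
    ToyNode *root_0 = toy_nil(pool);
    if (depth > 1)
        toy_rule(pool, depth - 1, backtracking, recycle);
    if (backtracking == 0) {
        /* Simplification: stands in for the post-processing path. */
        toy_reuse(pool, root_0);
    } else if (recycle) {
        /* The added else branch: give the nil node back. */
        toy_reuse(pool, root_0);
        root_0 = NULL;
    }
    /* Otherwise root_0 is simply dropped here, i.e. orphaned. */
}

/* Run `attempts` failed speculative matches; return nodes allocated. */
size_t toy_simulate(int depth, int attempts, int recycle)
{
    ToyPool pool = { NULL, NULL, 0 };
    for (int i = 0; i < attempts; i++)
        toy_rule(&pool, depth, 1, recycle);
    size_t total = pool.allocated;
    while (pool.all != NULL) {      /* free everything the toy allocated */
        ToyNode *next = pool.all->next_all;
        free(pool.all);
        pool.all = next;
    }
    return total;
}
```

In this toy, toy_simulate(10, 1000, 0) allocates 10000 nodes, one per 
rule level per failed attempt, while toy_simulate(10, 1000, 1) allocates 
only 10, because every later attempt recycles the nodes released by the 
previous one.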

Richard