[antlr-interest] Memory usage of nilNodes in the C target

Jim Idle jimi at temporal-wave.com
Wed Apr 1 08:59:18 PDT 2009


Richard Thrippleton wrote:
> Preamble/warning; I'm, yet again, doing disturbing and possibly 
> undefined things with the C target ;-)
>
> The addition of the 'reuse' method on trees and the nilStack in the 
> arboretum helped memory usage quite a bit in my parser, but we were 
> still finding that an awful lot of memory was being taken up by nilNodes 
> after the parser had run. In fact, the nilNodes were enormously 
> dominating 'real' nodes in memory usage.
>
> We eventually tracked this down to backtracking, and observed that while 
> nilNodes were being reused in the event of a rule correctly matching via 
> becomeRoot and rulePostProcessing, they were being otherwise orphaned. 
> The reason we were observing problems of such magnitude was that some of 
> our rules will be called up to ten deep to match a single token, with a 
> nilNode being orphaned in each one.
>
> The eventual solution was to have the templates change the output of 
> code such as;
>
>      if ( BACKTRACKING==0 )
>      {
>          retval.stop = LT(-1);
>          retval.tree = (pANTLR3_BASE_TREE)(ADAPTOR->rulePostProcessing(ADAPTOR, root_0));
>          ADAPTOR->setTokenBoundaries(ADAPTOR, retval.tree, retval.start, retval.stop);
>      }
>
> by adding
>
>      else { if(root_0) { root_0->reuse(root_0); root_0 = NULL; } }
>
> This helped our memory usage _enormously_. While this worked in our 
> parser (we've since passed a very thorough barrage of tests without a 
> crash or memory leak or invalid result), is this a good idea in general?
>
> Ric

I have more to do on this front, but it is tricky to get right in a 
generic way. So the template change may work well for your grammar but 
may not in the generic case. At some point we will get ANTLR itself to 
track usage and reuse nodes, but for the latest release I have 
implemented this at all the safe points. This may well be a safe point 
that I missed and I will look at it, but I suspect it only works with 
your rule formulation.
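
To make the effect Richard describes concrete, here is a minimal, self-contained sketch in C. It does not use the ANTLR3 C runtime: the Factory, new_nil, reuse, rule and nodes_after names are hypothetical stand-ins for the arboretum, its nilStack and a generated rule. It assumes each nested rule invocation creates one nil root, and that a rule which ran while backtracking either leaves that root orphaned or hands it back for reuse.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tree factory; NOT the ANTLR3 C runtime API. */
#define POOL_MAX 64

typedef struct Node { int in_use; } Node;

typedef struct Factory {
    Node  pool[POOL_MAX];       /* backing storage for every node handed out */
    int   allocated;            /* number of fresh nodes taken from pool     */
    Node *nil_stack[POOL_MAX];  /* returned nil nodes waiting to be reused   */
    int   nil_top;              /* number of entries on nil_stack            */
} Factory;

/* Hand out a nil node, preferring one that was previously returned. */
static Node *new_nil(Factory *f)
{
    Node *n;
    if (f->nil_top > 0) {
        n = f->nil_stack[--f->nil_top];
    } else {
        if (f->allocated >= POOL_MAX) return NULL;  /* pool exhausted */
        n = &f->pool[f->allocated++];
    }
    n->in_use = 1;
    return n;
}

/* Return a nil node to the factory so a later new_nil() can reuse it. */
static void reuse(Factory *f, Node *n)
{
    assert(f->nil_top < POOL_MAX);
    n->in_use = 0;
    f->nil_stack[f->nil_top++] = n;
}

/* Simulated rule nested 'depth' levels deep; each level makes a nil root. */
static void rule(Factory *f, int depth, int backtracking, int recycle)
{
    Node *root_0 = new_nil(f);

    if (depth > 1) rule(f, depth - 1, backtracking, recycle);

    if (backtracking == 0) {
        /* Generated code would call rulePostProcessing here; the node is kept. */
    } else if (recycle && root_0 != NULL) {
        /* The proposed template change: give the unused nil root back. */
        reuse(f, root_0);
        root_0 = NULL;
    }
    /* Otherwise root_0 is simply dropped, i.e. orphaned in the factory. */
}

/* Count fresh nodes allocated after 'attempts' backtracking passes. */
static int nodes_after(int attempts, int depth, int recycle)
{
    Factory *f = calloc(1, sizeof *f);
    int i, n;
    if (f == NULL) return -1;
    for (i = 0; i < attempts; i++) rule(f, depth, 1, recycle);
    n = f->allocated;
    free(f);
    return n;
}
```

In this toy model, nodes_after(5, 10, 0) allocates one fresh node per rule level per attempt, while nodes_after(5, 10, 1) never needs more than the ten nodes live during a single attempt. Whether handing the node back at that point is safe for an arbitrary grammar is exactly the open question above; the sketch only illustrates the memory effect.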

It is not a good idea to use backtracking, though, unless your input is 
always guaranteed to be correct; otherwise the ability to give good 
error messages is pretty much destroyed.

Jim

