[antlr-interest] comments on suggestion to reduce tree construction code size

Terence Parr parrt at jguru.com
Mon Dec 24 16:54:45 PST 2001


Folks,

Somebody suggested removing the syn pred gate around tree construction 
actions.  The rationale is that an invoked rule can build whatever tree 
it wants during guess mode, because the invoker will ignore the result 
in the try block.  Later, if pred was successful, it will be re-executed 
with "actions on" and the return ast will be used.

For example, the call to explicitConstructorInvocation can build 
whatever it wants as the returnAST is only used in the "if 
( synPredMatched55 ) {...}" block.

if ( ...) {
         int _m55 = mark();
         synPredMatched55 = true;
         inputState.guessing++;
         try {
                 explicitConstructorInvocation();
         }
         catch (RecognitionException pe) {
                 synPredMatched55 = false;
         }
         rewind(_m55);
         inputState.guessing--;
}
if ( synPredMatched55 ) {
         explicitConstructorInvocation();
         if (inputState.guessing==0) {
                 astFactory.addASTChild(currentAST, returnAST);
         }
}

So, the code in explicitConstructorInvocation() such as:

if (inputState.guessing==0) {
    astFactory.addASTChild(currentAST, returnAST);
}

could become the unguarded

astFactory.addASTChild(currentAST, returnAST);

This would result in a big code size savings, but at the cost of 
run-time size and speed.  Clearly building trees is a waste and 
memory-allocation in java is a pig.  This waste would actually only 
occur when you were "guessing".  All other times might see a small speed 
increase since you don't have to test the "syn pred gates" everywhere.

I just removed the gates on a test version of ANTLR and ran it on the 
java grammar.  Parsing all of ANTLR code itself without gates is 17sec 
on my G4 450Mhz OS X box versus the regular 19sec.  Roughly 10% one 
could say.  Code size is reduced from 5551 lines to 4709 lines of java 
code for JavaRecognizer.  134388k vs 115163k characters.  In lines of 
code it's about 17% smaller.  The JavaRecognizer.class file falls from 
54092 to 47296 bytes (14% reduction).  I couldn't learn much from the 
various profiling options on the interpreter to see what the cost in 
memory allocation was.  Sorry.

So, how do you vote?  Can anybody think of a reason we cannot remove 
gates around my tree construction routines?  No doubt they are still 
needed on random user actions.

Ter
--
Chief Scientist & Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org


 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 



More information about the antlr-interest mailing list