[antlr-interest] Stack Overflow for Tree grammar CSharp target

Greg Smolyn greg at smolyn.org
Tue Sep 2 21:38:29 PDT 2008


Hrm, so I'm not sure if the precedence parser will help me that much  
since, indeed, the problem is actually in a tree grammar.

I managed to reduce the recursive load a bit by factoring it out into  
more simple productions, so I know have:

expr
   :  ^( singleArgTypes expr)
   |  ^( doubleArgTypes expr expr)
   | ^( tripeArgTypes expr expr expr)
  ;

singleArgTypes
   : TYPEOF | NOT | ... etc etc ...
   ;

doubleArgTypes
   : ASSIGN | MULT | AND | ... etc etc ...
   ;

this reduced the number of local pointers quite significantly (3 per  
case in expr, plus some extras, so like 12 or 15).  So now instead of  
blowing up at about 190 nestings, it blows up at about 700.  This is  
enough to get me past my current blockage, but I might plug away at it  
a bit more and see if I can improve that.

-greg


On 2-Sep-08, at 3:59 PM, Sam Harwell wrote:

> Yikes, you just found a way to crash our product. I must say though,  
> the
> input required to do so was absurd! It took some 20 full lines of
> 3*3*3*3*3*...! LOL
>
> My operator precedence parser is also vulnerable to this, but the tree
> grammar will fail much earlier. By replacing the tail recursion in the
> precedence tree builder with a hybrid stack/recurse routine, it could
> handle very large input expressions. The main problem remains in the
> tree grammar. I'll be working on this, but no real answer yet. For  
> now,
> I'm modified my expression rule to throw an OperationCanceledException
> if the expression stack gets too deep. I catch that exception at a top
> level, issue an error/warning, and toss the parse results. The
> definition of _expressionDepth is in my separate partial class file.
>
> expression
> @init
> {
> 	_expressionDepth++;
> 	if ( _expressionDepth > 50 )
> 		throw new OperationCanceledException( "Expression nested
> too deeply." );
> }
> 	:	assignExpr // <-- this is actually a giant switch now,
> like yours is
> 	;
> finally
> {
> 	_expressionDepth--;
> }
>
> Sam
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Johannes Luber
> Sent: Tuesday, September 02, 2008 4:58 PM
> To: Greg Smolyn
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Stack Overflow for Tree grammar CSharp
> target
>
> Greg Smolyn schrieb:
>> Hi,
>>
>> I've got a rather large rule in a tree grammar (using the CSharp2
>> target, ANTLR 3.1), that when faced with a large input tree tends to
>> blow up with a StackOverflowException.
>> This happens only with some pretty crazy (but unfortunately required)
>> input that forces a very deep recursion.
>>
>> The rule looks like
>>
>> expr
>>  : ^( ASSIGN expr expr )
>>  | ^(MULT expr expr)
>>  | ^(AND expr expr)
>>  | ^(TYPEOF expr)
>> ... <snipped about 50 others>...
>>  ;
>>
>>
>> This generates an expr() method with quite a few local variables, and
>> which can only handle about ~200 recursions  (ie. given a tree like
> "(*
>> (* (* (* (* (* .... (* 2 3) ... 4) 5) 6) 7) 8) 9)").
>> There seem to be about 3 local variable references for every case,
> which
>> means the thing is putting at least 150 pointers onto the stack at
> every
>> recursive step.
>> For example:
>>
> <http://www.julielist.com/x.xxx?p=54.000.I32p2.1018147.107-102&url=http 
> :
> //www.4realcash.com/affiliates/gallhit/1130677/163/7/2/0>CommonTree
>> ASSIGN76 = null;
>> CommonTree MULT79 = null;
>> ...
>> expr_return expr77 = default(expr_return);
>> expr_return expr78 = default(expr_return);
>> ...
>> CommonTree ASSIGN76_tree=null;
>> CommonTree MULT79_tree=null;
>>
>> Would this be a bug or future optimization?  Am I going to have to go
>> down the path of changing my parser to emit some imaginary AST nodes
>> that can group these into smaller categories (ie. additiveExprs,  
>> etc).
>> I'd rather not do the latter as it would mean changing a whole pile
> of
>> other tree grammars using the generated ASTs.
>>
>> Interestingly I see that really none of those variables are actually
>> used outside of their respective case statements.  Could not these
>> variables be moved there, reducing the load for recursive calls?
>>
>> Would mucking around in the StringTemplates for the csharp2 target be
>> something worth trying?  Or is this something even higher up in the
>> antlr AST generation?
>>
>> Thanks for any insights!
>>
>> -greg
>
> Maybe <http://antlr.org/wiki/display/ANTLR3/Operator+precedence 
> +parser>
> works here?
>
> Johannes
>>
>>
> ------------------------------------------------------------------------
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-addr
> ess
>>
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-addr
> ess
>



More information about the antlr-interest mailing list