[antlr-interest] Problems with AST construction

Andreas Bartho andreas.bartho at inf.tu-dresden.de
Wed Aug 22 08:47:36 PDT 2007


Hello,

I'm currently trying to remove left recursion from a large Grammar (C#). 
I've run into some problems, however, and would be grateful for any help.

This is an edited excerpt of the grammar.

primary : 	Identifier | invocation | increment;	
	
invocation :	primary invocation_rest;
	
increment  :	primary increment_rest;	

invocation_rest : '(' '.'? ')';
	
increment_rest : '++';	

Identifier :   'a'..'z'*;


An AST for i()() should look like (Invocation (Invocation Identifier '(' 
')' ) '(' ')' ).
There are rules not displayed here that can refer to either primary or 
invocation/increment.


To remove left recursion I have experimented with repetition operators, 
but to no avail.

A grammar like the following works, but only for this toy example.

primary : 	primary_rec | Identifier;	
	
primary_rec :	Identifier (invocationrest | incrementrest)*;	

invocation :	primary invocationrest;
	
increment  : 	primary	'++';	
	
invocationrest : '(' '.'? ')' ;

incrementrest :	'++';	

Identifier: 'a'..'z'*;

Once the grammar gets bigger, ANTLR runs into problems, because it can't 
decide, for example, whether it should match an invocationrest in rule 
primary_rec or in rule invocation.

Using right-recursion seems to work:

tokens {
	Invocation;
	Increment;
}

primary : primary_start primary_rec*;	
	
primary_start :	Identifier;
	
primary_rec :	invocation_rest	| increment_rest;	


invocation
	:	primary_start invocation_rec
	->	^(Invocation primary_start invocation_rec)
	;
	
invocation_rec
	:	primary_rec invocation_rec
	//makes AST right associative
	//->	^(Invocation primary_rec invocation_rec)
	|	invocation_rest
	;	
			
invocation_rest : '(' '.'? ')';

increment
	: 	primary_start increment_rec
	->	^(Increment primary_start increment_rec)	
	;	

increment_rec
	:	primary_rec increment_rec
	//makes AST right associative	
	//->  ^(Increment primary_rec increment_rec)
	|	increment_rest
	;	
		
increment_rest : '++';	

Identifier :   'a'..'z'*;

The problem is that due to the right-recursion the resulting trees are 
right-associative, yielding a tree like (Invocation Identifier 
(Invocation '(' ')' '(' ')' ) )
Furthermore, the trees must also be created when rule 'primary' is 
called, leaving only 'primary_rec' for tree construction. But I just 
don't get iterative AST construction to work here. Does anyone have an idea?

Andreas


More information about the antlr-interest mailing list