[antlr-interest] Grammar nondeterminism on recursion beginner's question

Stritzel.Nils at infineon.com Stritzel.Nils at infineon.com
Fri Apr 28 00:37:42 PDT 2006


Hi all,

I am still working on my formula parser. An earlier issue is solved
(thanks to Martin Probst), but now I have a new problem.

My grammar includes a list that contains two different types of elements,
runExpr and batchExpr. The list is supposed to contain at least one runExpr
and at least one batchExpr.

runBatchExprList
	:
	(runExpr SEP!) =>
			runExpr SEP! runBatchExprList 
			| 
			runExpr SEP! batchExpr
	|
	(batchExpr SEP!) =>
		batchExpr SEP! runBatchExprList
		|
		batchExpr SEP! runExpr
	;

The trouble is that runExpr and batchExpr can start with the same
tokens, and this seems to be what causes the nondeterminism.
Is there a way to make the parser know which is the right alternative?
Can I get this to work without rewriting everything?
Below is my grammar (somewhat simplified for this posting).
Maybe it all boils down to the question of how to choose correctly
in a rule like this:
Rule1 : batchExpr | runExpr ;


Thanks,

Nils




class ExpressionParser extends Parser;
options {
	buildAST = true;	// uses CommonAST by default
	k = 1;
}

imaginaryTokenDefinitions
	:
	SIGN_MINUS
	SIGN_PLUS
	;

expr	:
	(formula)* EOF!
	;
	
formula
	:
	BATCHCSID ASSIGN^ batchExpr
	| 
	RUNCSID ASSIGN^ runExpr
	;

batchExpr 
	:
	sumExpr 
	;

sumExpr : 
	baseExpr ((PLUS^|MINUS^) baseExpr)* 
	; 

baseExpr 
	: 
	primaryExpr 
	| 
	signedExpr
	;
	
signedExpr
	:
	(m: MINUS^ {#m.setType(SIGN_MINUS);} | p: PLUS^ {#p.setType(SIGN_PLUS);})
	baseExpr
	;
	
primaryExpr
  	: 
  	DOUBLE
  	| 
  	BATCHVARIABLE
  	| 
  	(LPAREN^ batchExpr RPAREN! )
  	| 
  	functionCall
  	;
  
  
functionCall 
	: 
	(CABS^ | CSIGN^) LPAREN! batchExpr RPAREN!      
  	| 
  	((CAVG LPAREN^) => CAVG LPAREN! batchExprList RPAREN!
  			| CAVG LPAREN! batchExpr RPAREN!) 
	| 
	(CMAX^ | CMIN^) LPAREN! batchExprList RPAREN!
  	;
  

batchExprList 
  	:
	batchExpr (SEP! batchExpr)+ {## = #(#[SEP], ##); }
	;

runExpr : 
	runSumExpr
	;

runSumExpr 
	: 
	runBaseExpr ((PLUS^|MINUS^) runBaseExpr)*
	;
	
	
runBaseExpr 
	: 
	runPrimaryExpr 
	| 
	runSignedExpr
	;


runSignedExpr
	: 
	(m: MINUS^ {#m.setType(SIGN_MINUS);} | p: PLUS^ {#p.setType(SIGN_PLUS);})
	runBaseExpr
	;
	
runPrimaryExpr	
	: 
  	RUNVARIABLE
  	| 
  	(LPAREN^ runExpr RPAREN! )
  	| 
  	runFunctionCall
  	;
  

runFunctionCall 
	:
       	(CABS^ | CSIGN^) LPAREN! runExprList RPAREN! 
       	|
       	(CMAX^) => (CMAX^  LPAREN! runBatchExprList RPAREN!)
       		| 
       		(CMAX^  LPAREN! runExprList RPAREN!) 	
       		|
       	(CMIN^) => (CMIN^  LPAREN! runBatchExprList RPAREN!)
       		| (CMIN^  LPAREN! runExprList RPAREN!) 
       	|
       	(CAVG^) => (CAVG^  LPAREN! runExprList RPAREN!)
       		| 
       		(CAVG^  LPAREN! runExpr RPAREN!) 
	;
	
runExprList 
  	:
	runExpr (SEP! runExpr)+ {## = #(#[SEP], ##); }
	;
	


runBatchExprList
	:
	(runExpr SEP!) =>
			runExpr SEP! runBatchExprList 
			| 
			runExpr SEP! batchExpr
	|
	(batchExpr SEP!) =>
		batchExpr SEP! runBatchExprList
		|
		batchExpr SEP! runExpr
	;
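
To make the overlap between runExpr and batchExpr concrete, here is a
small Python sketch (not part of the grammar) that computes which tokens
can begin each rule. The LEADING table is a hand transcription of the
parser rules above and the helper name first() is made up for this
illustration; it assumes that no rule can match the empty string and that
there is no left recursion, which appears to hold for this grammar.

```python
# Leading symbol of every alternative, transcribed by hand from the parser
# rules above (an assumption of this sketch, not ANTLR output). No rule here
# can match the empty string, so the FIRST set of an alternative is simply
# the FIRST set of its leading symbol.
LEADING = {
    "batchExpr": ["sumExpr"],
    "sumExpr": ["baseExpr"],
    "baseExpr": ["primaryExpr", "signedExpr"],
    "signedExpr": ["MINUS", "PLUS"],
    "primaryExpr": ["DOUBLE", "BATCHVARIABLE", "LPAREN", "functionCall"],
    "functionCall": ["CABS", "CSIGN", "CAVG", "CMAX", "CMIN"],
    "runExpr": ["runSumExpr"],
    "runSumExpr": ["runBaseExpr"],
    "runBaseExpr": ["runPrimaryExpr", "runSignedExpr"],
    "runSignedExpr": ["MINUS", "PLUS"],
    "runPrimaryExpr": ["RUNVARIABLE", "LPAREN", "runFunctionCall"],
    "runFunctionCall": ["CABS", "CSIGN", "CMAX", "CMIN", "CAVG"],
}


def first(symbol):
    """Return the set of tokens that can begin `symbol`."""
    if symbol not in LEADING:
        # Anything without an entry in the table is a token.
        return {symbol}
    tokens = set()
    for lead in LEADING[symbol]:
        # Safe without cycle detection: the rules are not left-recursive.
        tokens |= first(lead)
    return tokens


batch_first = first("batchExpr")
run_first = first("runExpr")
shared = sorted(batch_first & run_first)

print("shared:    ", shared)
print("batch only:", sorted(batch_first - run_first))
print("run only:  ", sorted(run_first - batch_first))
```

Under these assumptions only DOUBLE, BATCHVARIABLE and RUNVARIABLE tell
the two rules apart at k = 1; every other starting token is shared, so
one token of lookahead cannot pick the alternative.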




class ExpressionLexer extends Lexer;

options {
	caseSensitive = false;
	k = 8;
}

PLUS	: 
	'+' 
	;

MINUS	: 
	'-' 
	;

MULT	: 
	'*' 
	;

DIV	: 
	'/'  
	; 

LPAREN	: 
	'(' 
	;

RPAREN	: 
	')' 
	;

protected DIGIT 
	: 
	'0'..'9' 
	;

WS 	:
	(' '
	| 
	'\t'
	|
	'\r' '\n' {newline(); }
	| 
	'\n'	{newline(); }
	)
	{ $setType(Token.SKIP); }
	;	
	
DOUBLE 	:	
	(DIGIT)+ ('.' (DIGIT)+)? ('e' (MINUS|PLUS)? (DIGIT)+ )? 
	;	

ASSIGN	:       
	'=' 
	;

SEP	:
	'\\' 
	;


RUNCSID		
	:	
	"rsc."
	; 

RUNVARIABLE 	
	:
	"r."
	;
	
BATCHCSID	
	:	
	"bsc."
	;
	
BATCHVARIABLE      
	:	
	"b."
	;
	
CMAX	: 
	"c_max"
	;
	
CMIN	:
	"c_min"
	;
	
CABS	:
	"c_abs"
	;
	
CAVG	:			
	"c_avg"
	;

CSIGN	: 
	"c_sign" 
	;

