[antlr-interest] Creating a simple expression language

James Abley james.abley at gmail.com
Mon Nov 10 03:48:51 PST 2008


Hi,

I'm an ANTLR newbie. A code base that I work on has various expression
evaluation aspects. I have to add to this by defining various
functions that can be evaluated. ANTLR seemed like a good way of
separating out the parsing aspects and should let my colleagues
concentrate on just defining and plugging in new functions without
having to know much about parsing, etc. I've skimmed the ANTLR
Reference book, but don't quite have the time to go in depth at this
point.

I've written a grammar, which seems to do what I need. Doubtless it
could be trimmed a bit as I learn more. Where I'm stuck is the
connection between having a grammar which can parse the input and how
it gets evaluated.

The baggage that I'm struggling with is how to define my environment,
bind variables, create stack frames, etc.

I think this would be as part of a tree grammar the re-uses the tokens
from the AST grammar, but would like to confirm.

Cheers,

James



grammar Eval;

options {
	output = AST;
//	tokenVocab=Expr; // Read token types from Expr.tokens resource
//	ASTLabelType=CommonTree;    // The Java type of the nodes.
}

tokens {
	FUNC;	// function call
	STR;
}

@parser::header {
package com.example.expression;
}

@lexer::header {
package com.example.expression;
}

stat	:	expr+;

/*
For now, we define expr very basically. We don't need to support
addition, multiplication or other operators. But if we
do, the grammar is easy to alter.
*/
expr	:	atom
	;
//multExpr ( ( '+' | '-') multExpr)*;

//multExpr
//	:	unaryExpr (( '*' | '/') unaryExpr)*;

//unaryExpr
//	:	('+' | '-')?  atom
//	;

/* Basic constituent of an expression.*/
atom	:	var
	|	LPAREN expr RPAREN	// Rule to allow nested expressions.
	|	functionCall
	|	stringLiteral
	| 	number
	;

functionCall
	:	functionName LPAREN ( expr (COMMA expr)* )? RPAREN	-> ^(FUNC
functionName expr*)
	;

functionName
	:	ALPHA (ALPHA | '-' | '_' | DIGIT )* ;
/*
Added to indicate how we currently reference bound variables in
expressions.. This lets us parse them easily enough.
with a view to consolidating our expression evaluation code into this
ANTLR-based version.
*/
var	:	'$' ALPHA (ALPHA | '-' | '_' | DIGIT)*
	;

stringLiteral	:	'"'  ~'"'* '"'
	|	'\'' ~'\''* '\''
	;

number	:	DIGIT+ ('.' DIGIT+)?
	;
	
DIGIT
	:	'0' .. '9';
	
ALPHA
	:	 'a' .. 'z'
	|	 'A' .. 'Z';
	
COMMA
	:	(WS* ',' WS*);

LPAREN	
	: 	(WS*  '(' WS*);
RPAREN	
	:	(WS* ')' WS*);

WS
	:  	' '
	| 	'\t';


More information about the antlr-interest mailing list