[antlr-interest] Creating a simple expression language

James Abley james.abley at gmail.com
Tue Nov 18 04:13:49 PST 2008


Hi all,

Sorry, I'm still stuck with this. I have taken the sample grammar from
the book and reduced it as small as I know how currently.


grammar Expr;

options {
	output=AST;
	ASTLabelType=CommonTree;
	k = 1;
}

prog	:	(stat {System.out.println($stat.tree.toStringTree());} )+ ;

stat	:	expr	-> expr
	;
	
expr	:	multExpr (('+' ^  | '-'^) multExpr)*
	;
	
multExpr	:	atom ('*'^ atom)*
	;
	
atom	:	INT
	|	function
	/* We don't want nodes for the parentheses */
	|	'('! expr ')'!
	;

/* We don't want nodes for the paretheses or commas. */	
function	:	FUNCNAME^ (WS)* '('! (WS)* (expr ( (WS)* ','! (WS)* expr )* )? ')'! ;
	
FUNCNAME	:	( 'a' .. 'z' | 'A' .. 'Z')+  ;
INT	:	'0' .. '9'+ ;
WS	:	( ' ' | '\t' | '\n' | '\r' ) + {skip(); };


I then have a tree grammar to go with this AST grammar.



tree grammar Eval;

options {
	tokenVocab=Expr;	// read the token types from Expr.tokens
	ASTLabelType=CommonTree;
}

prog	:	stat+ ;

stat	:	expr
		{ System.out.println($expr.value); }
	;
	
expr returns [int value]
	:	^('+' a=expr b=expr) { $value = a+b; }
	|	^('-' a=expr b=expr) { $value = a-b; }
	|	^('*' a=expr b=expr) { $value = a*b; }
	|	^( FUNCNAME expr*) { /* What do I do here? */ }
	|	INT	{$value = Integer.parseInt($INT.text); }
	;


I'm not sure how to plugin in the function evaluation using just the
tree grammar.

    @Test
    public void simpleAdditionDefinedInGrammar() throws Exception {
        ExprLexer lexer = new ExprLexer(new ANTLRStringStream("1+15+23-3"));
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        ExprParser parser = new ExprParser(tokens);
        ExprParser.expr_return r = parser.expr();

        CommonTree tree = (CommonTree) r.getTree();

        CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);

        Eval eval = new Eval(nodes);

        assertEquals(36, eval.expr());
    }

Evaluation using the rules defined in the grammar works fine, as per
my test above. I need to have an environment to allow function binding
and dispatch so that I can have a simple Java API for people to
implement functions against, without having to know ANTLR and alter
the grammar.

Am I approaching this problem in the right way?

I have been able to walk the CommonTree that I get back from the
parser, but I'm not sure I'm doing that in the right way either. Is
there an existing ANTLR API to walk / visit trees?

[1] JUnit TestEval - http://pastie.org/317730 - the second test fails
since my tree grammar currently has no way to do any sort of lookup to
a function.
[2] Function - http://pastie.org/317734
[3] AddFunction - http://pastie.org/317733
[4] Environment - http://pastie.org/317736
[5] TreeEvaluator - http://pastie.org/317737

Cheers,

James

2008/11/11 James Abley <james.abley at gmail.com>:
> Hi Michael,
>
> That helps, thanks. I have something like that, but I'm a little stuck
> with getting the Eval.g grammar going. I think I need to start with a
> simpler grammar and get a better handle on how that will hang
> together, and then see if I still don't understand how to apply that
> to my problem, and then take the CMinus example from the book and play
> around with that a little.
>
> Cheers,
>
> James
>
>
> 2008/11/10 Michael Lee <antlr at quantdev.com>:
>> Hi, James
>>
>> I am a newbie as well. Past 4 weeks, I worked on creating an expression
>> engine that will evaluate an FIX message during runtime. What I wanted to is
>> 'compile' the expression into an expression during the program start time
>> and evaluate them runtime. It also require to have binding to 'Msg' at
>> evaluation and thread-safe.
>>
>> For example,
>>
>> Msg["PRICE"] < 5.00
>>
>> Msg["PRICE"] will be evaluated during runtime - value is determined by
>> FixMsg passed in for evaluation.
>>
>> For this, I created two separate files (Eval.g and Expr.g). One for parsing
>> expression(Expr.g) and one for assembling parsed expression into an
>> organized expression object(Eval.g).
>>
>> Eval.g is a tree grammar. Here is a simplified snippet...
>>
>> expression returns [ Expression exp ]
>> : ^(op='+' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "+" , a , b); }
>> | ^(op='-' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "-" , a , b); }
>> | ^(op='*' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "*" , a , b); }
>> | ^(op='/' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "/" , a , b); }
>> ;
>>
>> I create an expression object by calling...
>>
>> InputStream is = new ByteArrayInputStream( exprString.getBytes());
>>
>> // Create an input character stream from standard in
>> ANTLRInputStream input = new ANTLRInputStream(is);
>>
>> // Create an ExprLexer that feeds from that stream
>> ExprLexer lexer = new ExprLexer(input);
>>
>> // Create a stream of tokens fed by the lexer
>> CommonTokenStream tokens = new CommonTokenStream(lexer);
>>
>> // Create a parser that feeds off the token stream
>> ExprParser parser = new ExprParser(tokens);
>>
>> // Begin parsing at rule prog, get return value structure
>> ExprParser.expression_return r = parser.expression();
>>
>> // WALK RESULTING TREE
>> CommonTree t = (CommonTree)r.getTree(); // get tree from parser
>>
>> // Create a tree node stream from resulting tree
>> CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
>>
>> Eval walker = new Eval(nodes); // create a tree parser
>> Expression expression = walker.expression();
>>
>>
>> Expression.evaluate has one argument - Msg. You can expand this to include
>> context-binding instead of Msg.
>>
>> Basically, an expression is compiled during the program start time and
>> evaluate them during the runtime with some context.
>>
>> I hope this helps.
>>
>> Michael J. Lee
>>
>>
>>
>> James Abley wrote:
>>>
>>> Hi,
>>>
>>> I'm an ANTLR newbie. A code base that I work on has various expression
>>> evaluation aspects. I have to add to this by defining various
>>> functions that can be evaluated. ANTLR seemed like a good way of
>>> separating out the parsing aspects and should let my colleagues
>>> concentrate on just defining and plugging in new functions without
>>> having to know much about parsing, etc. I've skimmed the ANTLR
>>> Reference book, but don't quite have the time to go in depth at this
>>> point.
>>>
>>> I've written a grammar, which seems to do what I need. Doubtless it
>>> could be trimmed a bit as I learn more. Where I'm stuck is the
>>> connection between having a grammar which can parse the input and how
>>> it gets evaluated.
>>>
>>> The baggage that I'm struggling with is how to define my environment,
>>> bind variables, create stack frames, etc.
>>>
>>> I think this would be as part of a tree grammar the re-uses the tokens
>>> from the AST grammar, but would like to confirm.
>>>
>>> Cheers,
>>>
>>> James
>>>
>>>
>>>
>>> grammar Eval;
>>>
>>> options {
>>>        output = AST;
>>> //      tokenVocab=Expr; // Read token types from Expr.tokens resource
>>> //      ASTLabelType=CommonTree;    // The Java type of the nodes.
>>> }
>>>
>>> tokens {
>>>        FUNC;   // function call
>>>        STR;
>>> }
>>>
>>> @parser::header {
>>> package com.example.expression;
>>> }
>>>
>>> @lexer::header {
>>> package com.example.expression;
>>> }
>>>
>>> stat    :       expr+;
>>>
>>> /*
>>> For now, we define expr very basically. We don't need to support
>>> addition, multiplication or other operators. But if we
>>> do, the grammar is easy to alter.
>>> */
>>> expr    :       atom
>>>        ;
>>> //multExpr ( ( '+' | '-') multExpr)*;
>>>
>>> //multExpr
>>> //      :       unaryExpr (( '*' | '/') unaryExpr)*;
>>>
>>> //unaryExpr
>>> //      :       ('+' | '-')?  atom
>>> //      ;
>>>
>>> /* Basic constituent of an expression.*/
>>> atom    :       var
>>>        |       LPAREN expr RPAREN      // Rule to allow nested
>>> expressions.
>>>        |       functionCall
>>>        |       stringLiteral
>>>        |       number
>>>        ;
>>>
>>> functionCall
>>>        :       functionName LPAREN ( expr (COMMA expr)* )? RPAREN      ->
>>> ^(FUNC
>>> functionName expr*)
>>>        ;
>>>
>>> functionName
>>>        :       ALPHA (ALPHA | '-' | '_' | DIGIT )* ;
>>> /*
>>> Added to indicate how we currently reference bound variables in
>>> expressions.. This lets us parse them easily enough.
>>> with a view to consolidating our expression evaluation code into this
>>> ANTLR-based version.
>>> */
>>> var     :       '$' ALPHA (ALPHA | '-' | '_' | DIGIT)*
>>>        ;
>>>
>>> stringLiteral   :       '"'  ~'"'* '"'
>>>        |       '\'' ~'\''* '\''
>>>        ;
>>>
>>> number  :       DIGIT+ ('.' DIGIT+)?
>>>        ;
>>>
>>> DIGIT
>>>        :       '0' .. '9';
>>>
>>> ALPHA
>>>        :        'a' .. 'z'
>>>        |        'A' .. 'Z';
>>>
>>> COMMA
>>>        :       (WS* ',' WS*);
>>>
>>> LPAREN
>>>        :       (WS*  '(' WS*);
>>> RPAREN
>>>        :       (WS* ')' WS*);
>>>
>>> WS
>>>        :       ' '
>>>        |       '\t';
>>>
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe:
>>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>
>>>
>>
>>
>


More information about the antlr-interest mailing list