[antlr-interest] Creating a simple expression language
James Abley
james.abley at gmail.com
Tue Nov 18 04:13:49 PST 2008
Hi all,
Sorry, I'm still stuck with this. I have taken the sample grammar from
the book and reduced it as small as I know how currently.
grammar Expr;
options {
output=AST;
ASTLabelType=CommonTree;
k = 1;
}
prog : (stat {System.out.println($stat.tree.toStringTree());} )+ ;
stat : expr -> expr
;
expr : multExpr (('+' ^ | '-'^) multExpr)*
;
multExpr : atom ('*'^ atom)*
;
atom : INT
| function
/* We don't want nodes for the parentheses */
| '('! expr ')'!
;
/* We don't want nodes for the paretheses or commas. */
function : FUNCNAME^ (WS)* '('! (WS)* (expr ( (WS)* ','! (WS)* expr )* )? ')'! ;
FUNCNAME : ( 'a' .. 'z' | 'A' .. 'Z')+ ;
INT : '0' .. '9'+ ;
WS : ( ' ' | '\t' | '\n' | '\r' ) + {skip(); };
I then have a tree grammar to go with this AST grammar.
tree grammar Eval;
options {
tokenVocab=Expr; // read the token types from Expr.tokens
ASTLabelType=CommonTree;
}
prog : stat+ ;
stat : expr
{ System.out.println($expr.value); }
;
expr returns [int value]
: ^('+' a=expr b=expr) { $value = a+b; }
| ^('-' a=expr b=expr) { $value = a-b; }
| ^('*' a=expr b=expr) { $value = a*b; }
| ^( FUNCNAME expr*) { /* What do I do here? */ }
| INT {$value = Integer.parseInt($INT.text); }
;
I'm not sure how to plugin in the function evaluation using just the
tree grammar.
@Test
public void simpleAdditionDefinedInGrammar() throws Exception {
ExprLexer lexer = new ExprLexer(new ANTLRStringStream("1+15+23-3"));
CommonTokenStream tokens = new CommonTokenStream(lexer);
ExprParser parser = new ExprParser(tokens);
ExprParser.expr_return r = parser.expr();
CommonTree tree = (CommonTree) r.getTree();
CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
Eval eval = new Eval(nodes);
assertEquals(36, eval.expr());
}
Evaluation using the rules defined in the grammar works fine, as per
my test above. I need to have an environment to allow function binding
and dispatch so that I can have a simple Java API for people to
implement functions against, without having to know ANTLR and alter
the grammar.
Am I approaching this problem in the right way?
I have been able to walk the CommonTree that I get back from the
parser, but I'm not sure I'm doing that in the right way either. Is
there an existing ANTLR API to walk / visit trees?
[1] JUnit TestEval - http://pastie.org/317730 - the second test fails
since my tree grammar currently has no way to do any sort of lookup to
a function.
[2] Function - http://pastie.org/317734
[3] AddFunction - http://pastie.org/317733
[4] Environment - http://pastie.org/317736
[5] TreeEvaluator - http://pastie.org/317737
Cheers,
James
2008/11/11 James Abley <james.abley at gmail.com>:
> Hi Michael,
>
> That helps, thanks. I have something like that, but I'm a little stuck
> with getting the Eval.g grammar going. I think I need to start with a
> simpler grammar and get a better handle on how that will hang
> together, and then see if I still don't understand how to apply that
> to my problem, and then take the CMinus example from the book and play
> around with that a little.
>
> Cheers,
>
> James
>
>
> 2008/11/10 Michael Lee <antlr at quantdev.com>:
>> Hi, James
>>
>> I am a newbie as well. Past 4 weeks, I worked on creating an expression
>> engine that will evaluate an FIX message during runtime. What I wanted to is
>> 'compile' the expression into an expression during the program start time
>> and evaluate them runtime. It also require to have binding to 'Msg' at
>> evaluation and thread-safe.
>>
>> For example,
>>
>> Msg["PRICE"] < 5.00
>>
>> Msg["PRICE"] will be evaluated during runtime - value is determined by
>> FixMsg passed in for evaluation.
>>
>> For this, I created two separate files (Eval.g and Expr.g). One for parsing
>> expression(Expr.g) and one for assembling parsed expression into an
>> organized expression object(Eval.g).
>>
>> Eval.g is a tree grammar. Here is a simplified snippet...
>>
>> expression returns [ Expression exp ]
>> : ^(op='+' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "+" , a , b); }
>> | ^(op='-' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "-" , a , b); }
>> | ^(op='*' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "*" , a , b); }
>> | ^(op='/' a=expression b=expression ) { $exp =
>> NumericOperationExpression.createOperation( "/" , a , b); }
>> ;
>>
>> I create an expression object by calling...
>>
>> InputStream is = new ByteArrayInputStream( exprString.getBytes());
>>
>> // Create an input character stream from standard in
>> ANTLRInputStream input = new ANTLRInputStream(is);
>>
>> // Create an ExprLexer that feeds from that stream
>> ExprLexer lexer = new ExprLexer(input);
>>
>> // Create a stream of tokens fed by the lexer
>> CommonTokenStream tokens = new CommonTokenStream(lexer);
>>
>> // Create a parser that feeds off the token stream
>> ExprParser parser = new ExprParser(tokens);
>>
>> // Begin parsing at rule prog, get return value structure
>> ExprParser.expression_return r = parser.expression();
>>
>> // WALK RESULTING TREE
>> CommonTree t = (CommonTree)r.getTree(); // get tree from parser
>>
>> // Create a tree node stream from resulting tree
>> CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
>>
>> Eval walker = new Eval(nodes); // create a tree parser
>> Expression expression = walker.expression();
>>
>>
>> Expression.evaluate has one argument - Msg. You can expand this to include
>> context-binding instead of Msg.
>>
>> Basically, an expression is compiled during the program start time and
>> evaluate them during the runtime with some context.
>>
>> I hope this helps.
>>
>> Michael J. Lee
>>
>>
>>
>> James Abley wrote:
>>>
>>> Hi,
>>>
>>> I'm an ANTLR newbie. A code base that I work on has various expression
>>> evaluation aspects. I have to add to this by defining various
>>> functions that can be evaluated. ANTLR seemed like a good way of
>>> separating out the parsing aspects and should let my colleagues
>>> concentrate on just defining and plugging in new functions without
>>> having to know much about parsing, etc. I've skimmed the ANTLR
>>> Reference book, but don't quite have the time to go in depth at this
>>> point.
>>>
>>> I've written a grammar, which seems to do what I need. Doubtless it
>>> could be trimmed a bit as I learn more. Where I'm stuck is the
>>> connection between having a grammar which can parse the input and how
>>> it gets evaluated.
>>>
>>> The baggage that I'm struggling with is how to define my environment,
>>> bind variables, create stack frames, etc.
>>>
>>> I think this would be as part of a tree grammar the re-uses the tokens
>>> from the AST grammar, but would like to confirm.
>>>
>>> Cheers,
>>>
>>> James
>>>
>>>
>>>
>>> grammar Eval;
>>>
>>> options {
>>> output = AST;
>>> // tokenVocab=Expr; // Read token types from Expr.tokens resource
>>> // ASTLabelType=CommonTree; // The Java type of the nodes.
>>> }
>>>
>>> tokens {
>>> FUNC; // function call
>>> STR;
>>> }
>>>
>>> @parser::header {
>>> package com.example.expression;
>>> }
>>>
>>> @lexer::header {
>>> package com.example.expression;
>>> }
>>>
>>> stat : expr+;
>>>
>>> /*
>>> For now, we define expr very basically. We don't need to support
>>> addition, multiplication or other operators. But if we
>>> do, the grammar is easy to alter.
>>> */
>>> expr : atom
>>> ;
>>> //multExpr ( ( '+' | '-') multExpr)*;
>>>
>>> //multExpr
>>> // : unaryExpr (( '*' | '/') unaryExpr)*;
>>>
>>> //unaryExpr
>>> // : ('+' | '-')? atom
>>> // ;
>>>
>>> /* Basic constituent of an expression.*/
>>> atom : var
>>> | LPAREN expr RPAREN // Rule to allow nested
>>> expressions.
>>> | functionCall
>>> | stringLiteral
>>> | number
>>> ;
>>>
>>> functionCall
>>> : functionName LPAREN ( expr (COMMA expr)* )? RPAREN ->
>>> ^(FUNC
>>> functionName expr*)
>>> ;
>>>
>>> functionName
>>> : ALPHA (ALPHA | '-' | '_' | DIGIT )* ;
>>> /*
>>> Added to indicate how we currently reference bound variables in
>>> expressions.. This lets us parse them easily enough.
>>> with a view to consolidating our expression evaluation code into this
>>> ANTLR-based version.
>>> */
>>> var : '$' ALPHA (ALPHA | '-' | '_' | DIGIT)*
>>> ;
>>>
>>> stringLiteral : '"' ~'"'* '"'
>>> | '\'' ~'\''* '\''
>>> ;
>>>
>>> number : DIGIT+ ('.' DIGIT+)?
>>> ;
>>>
>>> DIGIT
>>> : '0' .. '9';
>>>
>>> ALPHA
>>> : 'a' .. 'z'
>>> | 'A' .. 'Z';
>>>
>>> COMMA
>>> : (WS* ',' WS*);
>>>
>>> LPAREN
>>> : (WS* '(' WS*);
>>> RPAREN
>>> : (WS* ')' WS*);
>>>
>>> WS
>>> : ' '
>>> | '\t';
>>>
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe:
>>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>
>>>
>>
>>
>
More information about the antlr-interest
mailing list