[antlr-interest] Newbie question reg. expression grammar

andreaszielke21 andreas.zielke at gmx.net
Wed Feb 25 08:32:38 PST 2004


Hi,
I'm trying to get the hang of grammars, parsers, etc.

I started by reading the excellent introduction to ANTLR by Ashley Mills
(http://supportweb.cs.bham.ac.uk/documentation/tutorials/docsystem/build/tutorials/antlr/antlrhome.html).
Starting with the simple expression evaluation example, I found
the "implementation" of nested expressions (chap. 9.1) a bit awkward,
as it requires brackets around all expressions.

So I tried to change the grammar to this: 

class ExpressionParser extends Parser;

options { buildAST=true; }

expr      : sumExpr SEMI;
sumExpr   : prodExpr ((PLUS^|MINUS^) prodExpr)*; 
prodExpr  : powExpr ((MUL^|DIV^|MOD^) powExpr)* ;
powExpr   : atom (POW^ atom)? ;
atom      : INT 
          | LPAREN^ expr RPAREN! ;

class ExpressionLexer extends Lexer;

PLUS  : '+' ;
MINUS : '-' ;
MUL   : '*' ;
DIV   : '/' ;
MOD   : '%' ;
POW   : '^' ;
SEMI  : ';' ;
LPAREN: '(' ;
RPAREN: ')' ;
protected DIGIT : '0'..'9' ;
INT   : (DIGIT)+ ;

{import java.lang.Math;}
class ExpressionTreeWalker extends TreeParser;

expr returns [double r]
  { double a,b; r=0; }

  : #(LPAREN a=expr)       { r=a; }
  | #(PLUS a=expr b=expr)  { r=a+b; }
  | #(MINUS a=expr b=expr) { r=a-b; }
  | #(MUL  a=expr b=expr)  { r=a*b; }
  | #(DIV  a=expr b=expr)  { r=a/b; }
  | #(MOD  a=expr b=expr)  { r=a%b; }
  | #(POW  a=expr b=expr)  { r=Math.pow(a,b); }
  | i:INT { r=(double)Integer.parseInt(i.getText()); }
  ;

Now simple expressions without brackets are parsed correctly, but even
(3+5);
yields an unexpected token error, which I don't understand.
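
To make the behaviour reproducible without ANTLR, here is a minimal hand-written sketch that mirrors the parser rules above one method per rule. It rests on the assumption that the generated parser behaves like a plain recursive-descent recognizer; the class name ExprSketch and the helper accepts() are hypothetical and not part of ANTLR or of the tutorial. It only recognizes input and builds no AST.

```java
// Hand-written recognizer mirroring the grammar rules of ExpressionParser.
// Assumption: one method per rule, one character per token, no AST building.
class ExprSketch {
    private final String src;
    private int pos = 0;

    ExprSketch(String input) {
        // Drop blanks so that every remaining character is one token (or a digit).
        this.src = input.replace(" ", "");
    }

    // Current character, or '\0' once the input is exhausted.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // Consume the expected character or report an unexpected token.
    private void match(char expected) {
        if (peek() != expected) {
            throw new IllegalStateException(
                "unexpected token '" + peek() + "' at " + pos
                + ", expected '" + expected + "'");
        }
        pos++;
    }

    // expr : sumExpr SEMI ;
    void expr() { sumExpr(); match(';'); }

    // sumExpr : prodExpr ((PLUS|MINUS) prodExpr)* ;
    void sumExpr() {
        prodExpr();
        while (peek() == '+' || peek() == '-') { pos++; prodExpr(); }
    }

    // prodExpr : powExpr ((MUL|DIV|MOD) powExpr)* ;
    void prodExpr() {
        powExpr();
        while (peek() == '*' || peek() == '/' || peek() == '%') { pos++; powExpr(); }
    }

    // powExpr : atom (POW atom)? ;
    void powExpr() {
        atom();
        if (peek() == '^') { pos++; atom(); }
    }

    // atom : INT | LPAREN expr RPAREN ;
    void atom() {
        if (peek() == '(') {
            pos++;
            expr();      // same rule as the top level, exactly as in the grammar
            match(')');
        } else if (Character.isDigit(peek())) {
            while (Character.isDigit(peek())) pos++;   // INT : (DIGIT)+
        } else {
            throw new IllegalStateException(
                "unexpected token '" + peek() + "' at " + pos);
        }
    }

    // True if the whole input is recognized starting from rule expr.
    static boolean accepts(String input) {
        try {
            ExprSketch p = new ExprSketch(input);
            p.expr();
            return p.pos == p.src.length();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("3+5;"));     // plain expression
        System.out.println(accepts("(3+5);"));   // the failing input from this post
        System.out.println(accepts("(3+5;);"));  // semicolon inside the brackets
    }
}
```

With this sketch, 3+5; is accepted and (3+5); is rejected, which matches what I see from the generated parser. The variant (3+5;); with a semicolon inside the brackets is accepted by the sketch, which may help narrow down where the rules go wrong.
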
I tried to understand the problem with the help of the article "Debugging
and Testing Grammars [...]"
(http://www.antlr.org/article/parse.trees/index.tml) by Mr. Parr and
got the following derivation:

derivation:
    <expr>
 => <sumExpr> EOF
 => <prodExpr> EOF
 => <powExpr> EOF
 => <atom> EOF
 => ( <expr> EOF EOF
 => ( <sumExpr> ; EOF EOF
 => ( <prodExpr> + <prodExpr> ; EOF EOF
 => ( <powExpr> + <prodExpr> ; EOF EOF
 => ( <atom> + <prodExpr> ; EOF EOF
 => ( 3 + <prodExpr> ; EOF EOF
 => ( 3 + <powExpr> ; EOF EOF
 => ( 3 + <atom> ; EOF EOF
 => ( 3 + 5 ; EOF EOF

I don't understand why there's a second(!) EOF and why the derivation 
from atom to (expr) doesn't work... :(
Could somebody give me a hint, please?

Thanks in advance,
Andreas
