[antlr-interest] AST trees

Herumor herumor at fastwebnet.it
Wed Jul 12 01:43:13 PDT 2006


Hello everyone,

I have attached to this e-mail the rules that define "expression" in the
Java grammar I found on ANTLR's website.
Since I'm a newbie at this kind of thing, I googled around and found a
cool and easy example of a grammar that builds an AST:

orexpression
        : andexpression ("or"^ andexpression)*
        ;

andexpression
        : notexpression ("and"^ notexpression)*
        ;

notexpression
        : ("not"^)? atom
        ;

atom
        : condition
        | LEFT_PAREN! orexpression RIGHT_PAREN!
        ;

condition
        : "true"
        | "false" ... ... etc. etc.
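
[Editor's sketch] To make the layering of these rules concrete, here is a minimal hand-written Java parser with one method per rule. It is only an illustration under stated assumptions, not what ANTLR generates: the class name BoolExprSketch is hypothetical, "condition" is reduced to the two literals shown above, and the tree is returned as a LISP-like string in which each operator marked with "^" becomes the root of its operands.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one method per grammar rule. The nesting of the
// calls (or -> and -> not -> atom) is what encodes operator precedence.
final class BoolExprSketch {
    private final List<String> tokens;
    private int pos = 0;

    private BoolExprSketch(List<String> tokens) { this.tokens = tokens; }

    /** Parses the input and returns the tree in LISP-like form, e.g. (or true false). */
    static String parse(String input) {
        List<String> toks = new ArrayList<>();
        // Pad parentheses with spaces so a whitespace split is enough to tokenize.
        for (String t : input.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+")) {
            toks.add(t);
        }
        BoolExprSketch p = new BoolExprSketch(toks);
        String tree = p.orExpression();
        if (p.pos != toks.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }
    private String next() { String t = peek(); pos++; return t; }

    // orexpression : andexpression ("or"^ andexpression)*
    private String orExpression() {
        String left = andExpression();
        while (peek().equals("or")) {
            next();
            left = "(or " + left + " " + andExpression() + ")";
        }
        return left;
    }

    // andexpression : notexpression ("and"^ notexpression)*
    private String andExpression() {
        String left = notExpression();
        while (peek().equals("and")) {
            next();
            left = "(and " + left + " " + notExpression() + ")";
        }
        return left;
    }

    // notexpression : ("not"^)? atom
    private String notExpression() {
        if (peek().equals("not")) {
            next();
            return "(not " + atom() + ")";
        }
        return atom();
    }

    // atom : condition | LEFT_PAREN! orexpression RIGHT_PAREN!
    private String atom() {
        String t = next();
        if (t.equals("(")) {
            String inner = orExpression();
            if (!next().equals(")")) throw new IllegalArgumentException("expected )");
            return inner; // like the "!" suffix, the parentheses are left out of the tree
        }
        if (t.equals("true") || t.equals("false")) return t; // assumed conditions
        throw new IllegalArgumentException("unexpected token: " + t);
    }

    public static void main(String[] args) {
        // prints (or true (and false (not true)))
        System.out.println(parse("true or false and not true"));
    }
}
```

Because orExpression calls andExpression, which calls notExpression, the operator handled deepest in the call chain binds tightest: "not" before "and" before "or". Parentheses restart the chain at the top, which is how they override the default precedence.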

What I understood is that this layering of rules defines the precedence
between the operators, which is certainly useful.
I also understood that an AST works like a binary tree:
                      A -> B | C

orexpression  = andexpression;
andexpression = notexpression;
notexpression = ["not"]  atom
atom = condition | LEFT_PAREN! orexpression RIGHT_PAREN!
condition  = "true" | "false" ...

So, in the end     notexpression = ["not"]  (("true" | "false")
| LEFT_PAREN! orexpression RIGHT_PAREN!)
Now, that was a simple example, but how does the attached grammar work?
Is there any way to make it more "scholastic"?
After all, I just need a parser whose rules accept a simple Java
sublanguage with the main constructs: binary selection, while loop,
sequence, class...
so operator precedence is a bonus in my case. While working on this
project, however, I saw that defining "expression" is quite complex and
that everything depends on how well you define it.
In my case, an expression is needed in these places: variable
declaration, assignment (which in Java can be done at casting time),
binary selection, and while loop.
I turned to the AST approach because I saw that it was impossible to
define a single rule called expression and write its alternatives
directly in terms of terminals: that produced errors like:
there are 1, 2, 3 choices which might produce this and that, so I
disabled choices 1 and 2...
That said, is an AST needed in my case, or are there some "scholastic"
and easy ways to do the job?

Thanks for your help,
Deviad

-------------- next part --------------
parExpression
        :       '(' expression ')'
        ;


expression
        :       conditionalExpression (assignmentOperator expression)?
        ;
        
assignmentOperator
        :       '='
        |       '+='
        |       '-='
        |       '*='
        |       '/='
        |       '&='
        |       '|='
        |       '^='
        |       '%='
        |       '<' '<' '='
        |       '>' '>' '='
        |       '>' '>' '>' '='
        ;

conditionalExpression
    :   conditionalOrExpression ( '?' expression ':' expression )?
        ;

conditionalOrExpression
    :   conditionalAndExpression ( '||' conditionalAndExpression )*
        ;

conditionalAndExpression
    :   inclusiveOrExpression ( '&&' inclusiveOrExpression )*
        ;

inclusiveOrExpression
    :   exclusiveOrExpression ( '|' exclusiveOrExpression )*
        ;

exclusiveOrExpression
    :   andExpression ( '^' andExpression )*
        ;

andExpression
    :   equalityExpression ( '&' equalityExpression )*
        ;

equalityExpression
    :   instanceOfExpression ( ('==' | '!=') instanceOfExpression )*
        ;

instanceOfExpression
    :   relationalExpression ('instanceof' type)?
        ;

relationalExpression
    :   shiftExpression ( relationalOp shiftExpression )*
        ;
        
relationalOp
        :       ('<' '=' | '>' '=' | '<' | '>')
        ;

shiftExpression
    :   additiveExpression ( shiftOp additiveExpression )*
        ;

        // TODO: need a sem pred to check column on these >>>
shiftOp
        :       ('<' '<' | '>' '>' '>' | '>' '>')
        ;


additiveExpression
    :   multiplicativeExpression ( ('+' | '-') multiplicativeExpression )*
        ;

multiplicativeExpression
    :   unaryExpression ( ( '*' | '/' | '%' ) unaryExpression )*
        ;
        
unaryExpression
    :   '+' unaryExpression
    |   '-' unaryExpression
    |   '++' primary
    |   '--' primary
    |   unaryExpressionNotPlusMinus
    ;

unaryExpressionNotPlusMinus
    :   '~' unaryExpression
    |   '!' unaryExpression
    |   castExpression
    |   primary selector* ('++'|'--')?
    ;

castExpression
    :  '(' primitiveType ')' unaryExpression
    |  '(' (expression | type) ')' unaryExpressionNotPlusMinus
    ;

primary
    :   parExpression
    |   nonWildcardTypeArguments
        (explicitGenericInvocationSuffix | 'this' arguments)
    |   'this' (arguments)?
    |   'super' superSuffix
    |   literal
    |   'new' creator
    |   Identifier ('.' Identifier)* (identifierSuffix)?
    |   primitiveType ('[' ']')* '.' 'class'
    |   'void' '.' 'class'
        ;

identifierSuffix
        :       ('[' ']')+ '.' 'class'
        |       ('[' expression ']')+ // can also be matched by selector, but do here
        |       arguments
        |       '.' 'class'
        |       '.' explicitGenericInvocation
        |       '.' 'this'
        |       '.' 'super' arguments
        |       '.' 'new' (nonWildcardTypeArguments)? innerCreator
        ;
        
creator
        :       nonWildcardTypeArguments? createdName
        (arrayCreatorRest | classCreatorRest)
        ;

createdName
        :       Identifier nonWildcardTypeArguments?
        ('.' Identifier nonWildcardTypeArguments?)*
    |   primitiveType
        ;
        
innerCreator
        :       Identifier classCreatorRest
        ;

arrayCreatorRest
        :       '['
        (   ']' ('[' ']')* arrayInitializer
        |   expression ']' ('[' expression ']')* ('[' ']')*
        )
        ;

classCreatorRest
        :       arguments classBody?
        ;
        
explicitGenericInvocation
        :       nonWildcardTypeArguments explicitGenericInvocationSuffix
        ;
        
nonWildcardTypeArguments
        :       '<' typeList '>'
        ;
        
explicitGenericInvocationSuffix
        :       'super' superSuffix
        |       Identifier arguments
        ;
        
selector
        :       '.' Identifier (arguments)?
        |       '.' 'this'
        |       '.' 'super' superSuffix
        |       '.' 'new' (nonWildcardTypeArguments)? innerCreator
        |       '[' expression ']'
        ;

superSuffix
        :       arguments
        |       '.' Identifier (arguments)?
        ;

arguments
        :       '(' expressionList? ')'
        ;
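
[Editor's sketch] The attached rules use two different shapes: "expression" calls itself on the right of the assignment operator, while rules such as additiveExpression repeat their operand in a ( ... )* loop. The minimal Java sketch below illustrates the effect of each shape on grouping. It is an assumption-laden simplification, not the attached grammar: the class name AssocSketch is hypothetical, only '=', '+', '-', '*', '/' and '%' are handled, tokens must be separated by spaces, operands are plain identifiers or integers, and the LISP-like tree output is added for illustration (the attached rules contain no tree-building operators).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a cut-down version of the attached rule chain
// expression -> additiveExpression -> multiplicativeExpression -> operand.
final class AssocSketch {
    private final List<String> tokens;
    private int pos = 0;

    private AssocSketch(List<String> tokens) { this.tokens = tokens; }

    /** Parses space-separated tokens and returns the grouping in prefix form. */
    static String parse(String input) {
        AssocSketch p = new AssocSketch(Arrays.asList(input.trim().split("\\s+")));
        String tree = p.expression();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }
    private String next() { String t = peek(); pos++; return t; }

    // expression : conditionalExpression (assignmentOperator expression)?
    // The rule calls itself on the right, so "=" groups to the right.
    private String expression() {
        String left = additive();
        if (peek().equals("=")) {
            next();
            return "(= " + left + " " + expression() + ")";
        }
        return left;
    }

    // additiveExpression : multiplicativeExpression (('+' | '-') multiplicativeExpression)*
    // The loop folds each new operand into the tree built so far, so "-" groups to the left.
    private String additive() {
        String left = multiplicative();
        while (peek().equals("+") || peek().equals("-")) {
            String op = next();
            left = "(" + op + " " + left + " " + multiplicative() + ")";
        }
        return left;
    }

    // multiplicativeExpression : unaryExpression (('*' | '/' | '%') unaryExpression)*
    private String multiplicative() {
        String left = operand();
        while (peek().equals("*") || peek().equals("/") || peek().equals("%")) {
            String op = next();
            left = "(" + op + " " + left + " " + operand() + ")";
        }
        return left;
    }

    // Stand-in for unaryExpression/primary: an identifier or an integer literal.
    private String operand() {
        String t = next();
        if (t.matches("[A-Za-z_][A-Za-z0-9_]*|[0-9]+")) return t;
        throw new IllegalArgumentException("unexpected token: " + t);
    }

    public static void main(String[] args) {
        // prints (= a (= b (- (- c d) (* e f))))
        System.out.println(parse("a = b = c - d - e * f"));
    }
}
```

Every level of the attached grammar from conditionalOrExpression down to multiplicativeExpression follows the loop shape, each level delegating to the next tighter-binding one, which is why the rule list is long even though each rule is simple.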

