[antlr-interest] Modifying tree based on semantic content

David Jameson dhjdhj at gmail.com
Sun May 17 08:56:03 PDT 2009


Thank you. I'll have to study your response carefully to make sure
I understand it, but I REALLY appreciate the time you took to write it.
That was very kind.

With respect to my original question about the +, I just did that to  
simplify the question, which was really about how to control the tree  
output as a function of the semantic behavior of the language, in this  
particular case, the types. I fully expected to be able to generalize
the original answer (semantic predicates, as you helpfully told me)
to my real problem.
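
To make the goal concrete, here is a minimal Java sketch of the decision the semantic predicate is meant to make: pick the root of the new subtree from the operand types. It is not ANTLR-generated code. The names TypedRewriteSketch, Type, Node, leaf and plus are hypothetical, and treating anything other than two string operands as numeric is an assumption made only for illustration.

```java
// Hypothetical sketch: choose the tree root from the operand types,
// in the spirit of  -> { areStrings(...) }? ^(CONCATENATE a b)  -> ^(PLUS a b)
class TypedRewriteSketch {
    enum Type { INTEGER, STRING }

    /** A leaf or binary operator node that carries the type computed while parsing. */
    static final class Node {
        final String label;
        final Type type;
        final Node left;
        final Node right;

        Node(String label, Type type, Node left, Node right) {
            this.label = label;
            this.type = type;
            this.left = left;
            this.right = right;
        }

        /** Renders the node in LISP-like form, e.g. (PLUS a b). */
        String toStringTree() {
            if (left == null && right == null) {
                return label;
            }
            return "(" + label + " " + left.toStringTree() + " " + right.toStringTree() + ")";
        }
    }

    static Node leaf(String text, Type type) {
        return new Node(text, type, null, null);
    }

    /** Builds the subtree for "a + b"; the root depends on the operand types. */
    static Node plus(Node a, Node b) {
        // Assumption: only two string operands count as concatenation.
        boolean strings = a.type == Type.STRING && b.type == Type.STRING;
        String root = strings ? "CONCATENATE" : "PLUS";
        return new Node(root, strings ? Type.STRING : Type.INTEGER, a, b);
    }

    public static void main(String[] args) {
        System.out.println(plus(leaf("a", Type.STRING), leaf("b", Type.STRING)).toStringTree());
        System.out.println(plus(leaf("i", Type.INTEGER), leaf("j", Type.INTEGER)).toStringTree());
    }
}
```

The same shape carries over to comparisons: the parser already knows the operand types, so it can tag the node once and the tree grammar never has to recompute them.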

Apart from simpleExpression, there are also rules for expression,
term, and factor, which give the appropriate control over precedence
rules; for example, you can write

if  a + b = c + d then  ....

without any parentheses and get the right answer.

Other than separation for precedence purposes, a binary comparison
operator is no different than a binary arithmetic operator and in fact  
the expression and term rules are written much the same way as the  
"simpleExpression" rule, replacing comparisonOperator with addOperator  
and mulOperator and so forth.
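
As an illustration of that layering, here is a small hand-written recursive-descent sketch in Java with one method per precedence level. It is not the ANTLR grammar itself. The class name PrecedenceSketch, the whitespace-separated tokens, and the exact operator sets at each level are assumptions for the example.

```java
// Hypothetical sketch: each precedence level is one method, and every level
// is written the same way, differing only in which operators it accepts.
class PrecedenceSketch {
    private final String[] tokens;
    private int pos = 0;

    private PrecedenceSketch(String input) {
        // Assumption: tokens are separated by whitespace to keep the sketch short.
        tokens = input.trim().split("\\s+");
    }

    /** Parses the input and returns the tree in LISP-like form. */
    static String parse(String input) {
        return new PrecedenceSketch(input).expression();
    }

    private String peek() {
        return pos < tokens.length ? tokens[pos] : "";
    }

    // expression : simpleExpression ( '=' simpleExpression )*   (lowest precedence)
    private String expression() {
        String left = simpleExpression();
        while (peek().equals("=")) {
            String op = tokens[pos++];
            String right = simpleExpression();
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    // simpleExpression : term ( ('+' | '-') term )*
    private String simpleExpression() {
        String left = term();
        while (peek().equals("+") || peek().equals("-")) {
            String op = tokens[pos++];
            String right = term();
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    // term : factor ( ('*' | '/') factor )*
    private String term() {
        String left = factor();
        while (peek().equals("*") || peek().equals("/")) {
            String op = tokens[pos++];
            String right = factor();
            left = "(" + op + " " + left + " " + right + ")";
        }
        return left;
    }

    // factor : a single identifier or literal (highest precedence)
    private String factor() {
        return tokens[pos++];
    }

    public static void main(String[] args) {
        System.out.println(parse("a + b = c + d"));
    }
}
```

Because the comparison level calls down into the additive level on both sides, both sums are grouped before the comparison is applied, with no parentheses in the input.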


The grammar does also produce the correct tree for a sequence of more  
than three items, so yes, I would in fact get

     ^( PLUS 'lhs' ^( PLUS 'rhs1' 'rhs2' ) )

The optionalSign supports the notion of unary minus so you can write  
things like
      -1 + 2




The grammar I've developed is actually intended to translate a special  
scripting language into Java source code. The scripting language  
allows integers and strings to be compared using the same operator,  
i.e., you can write
     if   a = "xyz" then ....
     if   i = 3 then ....

but while the latter can be trivially rendered as
     if (i == 3) { .... }
the former requires
     if (a.equals("xyz")) { .... }


and so I need to modify the tree output so that the tree grammar can  
distinguish arithmetic operations from string operations, regardless  
of what they are.
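
A rough Java sketch of what the tree-grammar side would then do with that distinction: given the operator, the operands, and a flag saying the node was marked as a string operation, emit the matching Java text. The class EmitSketch, the method names, and the "<>" and "<" operators are hypothetical. Only "=" on integers and strings comes from the examples above.

```java
// Hypothetical sketch: turn a comparison node into Java source text,
// switching strategy when the node was marked as a string operation.
class EmitSketch {
    /** Emits the Java condition for "lhs op rhs" in the scripting language. */
    static String emitComparison(String op, boolean isStringOp, String lhs, String rhs) {
        if (isStringOp) {
            // Strings need method calls rather than Java's built-in operators.
            switch (op) {
                case "=":  return lhs + ".equals(" + rhs + ")";
                case "<>": return "!" + lhs + ".equals(" + rhs + ")";
                case "<":  return lhs + ".compareTo(" + rhs + ") < 0";
                default:   throw new IllegalArgumentException("unsupported operator: " + op);
            }
        }
        // Integers map directly onto Java operators.
        switch (op) {
            case "=":  return lhs + " == " + rhs;
            case "<>": return lhs + " != " + rhs;
            case "<":  return lhs + " < " + rhs;
            default:   throw new IllegalArgumentException("unsupported operator: " + op);
        }
    }

    /** Wraps the condition in an if statement with a placeholder body. */
    static String emitIf(String op, boolean isStringOp, String lhs, String rhs) {
        return "if (" + emitComparison(op, isStringOp, lhs, rhs) + ") { .... }";
    }

    public static void main(String[] args) {
        System.out.println(emitIf("=", true, "a", "\"xyz\""));
        System.out.println(emitIf("=", false, "i", "3"));
    }
}
```

Keeping the operator and the string marker separate means one marker covers every comparison operator, which is the "regardless of what they are" part.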


I suspect I am just missing something really trivial with respect to
ANTLR syntax that is preventing me from expressing the rewrite rules  
properly.




On May 17, 2009, at 11:07 AM, Steve Ebersole wrote:

> I am not an Antlr3 expert, I really just started myself about a month
> ago (moving from v2).  It really is probably best to get an answer from
> the list.  You are correct though, you cannot mix the two approaches.
>
> That being said...
>
> The most straight-forward way from your existing grammar rule AFAIK is
> to rewrite the initial simpleExpression result (removing optionalSign
> since I did not understand its purpose here):
>
> expression returns [TAttributeType type]
> @init { boolean isStringComparison=false; }
>    :   ( lhs=simpleExpression -> simpleExpression ) {
>            $type = $lhs.type;
>            isStringComparison = isString( $lhs.type );
>        } (
>            comparisonOperator rhs = simpleExpression {
>                $type=...;
>            }
>            -> { isStringComparison }?
>                    ^( STRINGOP $expression $rhs )
>            ->
>                    ^( comparisonOperator $expression $rhs )
>        )*
>    ;
>
> Note too that because of the '*' surrounding your "(comparisonOperator
> rhs = simpleExpression)" recognition and the fact that you ref "lhs" in
> the rewrite, I think you will actually end up with trees like:
> ^( PLUS 'lhs' ^( PLUS 'lhs' 'rhs2' ) )
> instead of what I think you probably wanted:
> ^( PLUS 'lhs' ^( PLUS 'rhs1' 'rhs2' ) )
>
> Also, if you wanted to keep this resulting tree structure, a better
> option is probably to recurse the rule:
>
> expression
>    : ( lhs=simpleExpression -> simpleExpression )
>      (
>          ( comparisonOperator rhs=expression )
>              -> { isString( $lhs.type ) }?
>                  ^( STRINGOP $lhs $rhs )
>              ->
>                  ^( comparisonOperator $lhs $rhs )
>      )?
>    ;
>
> See that 'rhs' recurses back into this rule..
>
> (I *think* that the '$lhs' references will work here but am not 100%
> sure.  try it out.  we use $expression rather than a label)
>
>
> Personally I'd be very uncomfortable with your notion of "comparison
> operator".  I believe in your original email you were asking about '+'
> here.  Neither addition nor concatenation is a "comparison".  I think
> you may be trying to handle multiple, unequal concepts here.  In my
> experience that causes problems.  I would look to split the notions of
> "comparison" and PLUS/MINUS/MULTIPLY/DIVIDE/CONCATENATE.  Why?  Well, as
> you see, they are very different both structurally and semantically.  For
> example, most comparison operators (aside from something like SQL's
> BETWEEN operator) are "binary operand" (two sides) operators;
> PLUS/MINUS/MULTIPLY/DIVIDE/CONCATENATE are all "chained", meaning that
> ^( PLUS op1 ^( PLUS op2 op3 ) ) is semantically the same as
> ^( PLUS op1 op2 op3 ).  Same for concatenation:
> ^( CONCAT op1 ^( CONCAT op2 op3 ) ) is the same as
> ^( CONCAT op1 op2 op3 ).  Antlr rewrite rules have a nice way
> to treat this "rolling up":
>
> addition
>    : additionOperand ( PLUS additionOperand )+
>          -> ^( PLUS additionOperand+ )
>    ;
>
> Or in your case, something like:
> addition
> @init { boolean isString=false; }
>    : lhs=additionOperand {isString=...;} ( PLUS additionOperand )+
>        -> {isString}? ^( CONCAT additionOperand+ )
>        -> ^( PLUS additionOperand+ )
>    ;
>
> Anyway, hope that helps...
>
> On Sun, 2009-05-17 at 09:36 -0400, David Jameson wrote:
>> Steve, I hate to bother you personally but you were the only one who
>> responded and pointed me in the right direction. Nobody but you had an
>> answer to my initial question and in the context of "no good deed goes
>> unpunished", I'm really hoping you can spot what I am doing wrong, as
>> I've been pulling my hair out most of the weekend trying to get ANTLR
>> to accept the "fixes".
>>
>> Many thanks,
>> D
>>
>> ---------------------------
>>
>>
>>
>>
>>
>>
>> I have been trying all day to get the rule below accepted.
>>
>>
>> expression returns [TAttributeType type]
>>               :
>>               optionalSign
>>               lhs = simpleExpression^
>>                  {
>>                     $type = $lhs.type;
>>                  }
>>                  (
>>                   (comparisonOperator rhs = simpleExpression)
>>                   {
>>                      $type = TErrorHandling.Compatible(
>>                                 $comparisonOperator.tree.token,
>>                                 $comparisonOperator.token,
>>                                 $lhs.type, $rhs.type);
>>                   }
>>
>>                     -> { isString($lhs.type) }?
>>                            ^(comparisonOperator STRINGOP $lhs $rhs)
>>                     ->  ^(comparisonOperator $lhs $rhs)
>>                  )*
>>               ;
>>
>>
>>
>>
>>
>> I get a "cannot generate the grammar because" error  which is
>>
>>    rule expression alt 1 uses rewrite syntax and also an AST operator
>>
>> Now, I'm pretty certain this is happening because of the "^" that
>> follows lhs = simpleExpression^ in the first section of the rule.
>>
>> However, removing that operator causes no tree node to be generated in
>> the case where there is just a simple expression, and that breaks the
>> results.  I also tried removing that operator and adding a rewrite
>> rule after the FIRST closing right brace, e.g.
>>
>>               lhs = simpleExpression
>>                  {
>>                     $type = $lhs.type;
>>                  } -> ^($lhs)
>>
>> and many variants but this just caused ANTLR to complain that
>> comparisonOperator was an unexpected token.  I also tried inserting a
>> third predicate in the bottom group that would test whether $rhs was
>> null and just put out the $lhs in that case, but that didn't work either.
>>
>> Can somebody please put me out of my misery (in a kind manner (grin))
>> and show me what I'm doing wrong? I'd love to have a quiet weekend
>> with no problems to worry about!!!
>>
>> Thanks,
>> D
>>
>> ------------------------------------------------------------------
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> On May 15, 2009, at 7:26 AM, Steve Ebersole wrote:
>>
>>> someRule
>>>   : a PLUS b
>>>       -> { areStrings($a.tree,$b.tree) }? ^(CONCATENATE a b)
>>>       -> ^(PLUS a b)
>>>   ;
>>>
>>> On Thu, 2009-05-14 at 23:38 -0400, David Jameson wrote:
>>>> Is there any way to control the built-in tree generation (from an
>>>> initial parse phase) based on semantics of what is being parsed?
>>>>
>>>> As a simple example,    if   I see the expression
>>>>       a + b
>>>>
>>>> then I want to produce
>>>>    (PLUS a b)
>>>> or
>>>>   (CONCATENATE a b)
>>>>
>>>> depending on whether a and b are numeric or string.
>>>>
>>>>
>>>> How can I do this with rewrite rules (for example)?  Or do I have to
>>>> construct my own trees?
>>>>
>>>>
>>>> Thanks,
>>>> D
>>>>
>>> -- 
>>> Steve Ebersole <steve at hibernate.org>
>>> Hibernate.org
>>>
>>
> -- 
> Steve Ebersole <steve at hibernate.org>
> Hibernate.org
>


