[antlr-interest] brief analysis of java.g's tree building in 2.x vs proposed 3.0

Loring Craymer Loring.G.Craymer at jpl.nasa.gov
Mon Jan 31 19:46:03 PST 2005


I'm going to try to stop after this response.  There are a couple of inline
2.8 syntax variants below, but Ter has almost converged on 2.8 functionality
with his introduction of subrule rewrites.  Consider interchanging nodes:

foo :
    A B { A } C D ;

or
foo_a :
    A B { B A } C D ;

This would be handled in Ter's syntax via

foo_ter :
    ( A B -> B A ) C D ;

and maybe even, using an empty subrule,
foo_t2 :
    A B ( -> A ) C D ;
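
To make the intended effect concrete, here is a minimal Java sketch of what the subrule rewrite ( A B -> B A ) C D does to the sequence of nodes, assuming the reading given above (the first two nodes are interchanged and the rest pass through). The class and method names are hypothetical and are not part of ANTLR 2.x, 2.8, or the 3.0 proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not ANTLR-generated code.
class SubruleRewriteSketch {
    /**
     * Models ( A B -> B A ) C D on a flat list of node names:
     * the two nodes matched inside the subrule are swapped, and
     * everything matched after the subrule is appended unchanged.
     */
    static List<String> rewriteLeadingPair(List<String> matched) {
        if (matched.size() < 2) {
            throw new IllegalArgumentException("the subrule needs two nodes");
        }
        List<String> out = new ArrayList<>();
        out.add(matched.get(1));                         // B moves first
        out.add(matched.get(0));                         // A follows it
        out.addAll(matched.subList(2, matched.size()));  // C D untouched
        return out;
    }

    public static void main(String[] args) {
        // prints [B, A, C, D]
        System.out.println(rewriteLeadingPair(Arrays.asList("A", "B", "C", "D")));
    }
}
```

The point of the sketch is that the rewrite is local to the subrule: nothing outside the parentheses needs to know that a reordering happened.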

All that is possibly missing is the capability of longer distance moves and
interoperability with the existing annotation mechanism.  However, I don't
see much difference here, and I have to ask:

Ter--

Unless I am mistaken, you have introduced the "complexity" that you
originally decried.  Am I correct in this?

--Loring

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Terence Parr
> Sent: Monday, January 31, 2005 6:24 PM
> To: ANTLR Interest
> Subject: [antlr-interest] brief analysis of java.g's tree building in 2.x
> vs proposed 3.0
> 
> Howdy,
> 
> The real test of any proposal is to see what it looks like in practice.
>   I have looked again at the java grammar.  Here is some useful info.
> 
> There are about 75 parser grammar rules.
> 
> There are 27 #(...) tree construction actions.  BUT, 21/27 are purely
> to add an imaginary node as the root of the rule's subtree.  I'm
> guessing the rewrite rules will work well for this.  For example, in
> 2.x:
> 
> modifiers
>      :   ( modifier )*
>          {#modifiers = #([MODIFIERS, "MODIFIERS"], #modifiers);}
>      ;
>
> it becomes the following in 3.0:
> 
> modifiers
>      :   ( modifier )* -> ^(MODIFIERS (modifier)*)
>      ;
> 
> or more precisely
> 
> modifiers
>      :   ( modifier )* -> ^(MODIFIERS["MODIFIERS"] (modifier)*)
>      ;
> 
> though I hope the factory.create(int tokenType) method could ask for the
> token name and figure out "MODIFIERS" automatically; I'll assume for
> now it can.
> 
> Here's another 2.x java.g example:
> 
> implementsClause
>      :   (  i:"implements"! identifier ( COMMA! identifier )* )?
>          {#implementsClause = #(#[IMPLEMENTS_CLAUSE,"IMPLEMENTS_CLAUSE"],
>                                   #implementsClause);}
>      ;
> 
> In 3.0 syntax it would be perhaps:
> 
> implementsClause
>      :   ( "implements" identifier ( COMMA identifier )* )?
>          -> ^(IMPLEMENTS_CLAUSE (identifier)+)
>      ;

In 2.8, it's
implementsClause
     :   ( "implements" identifier ( COMMA identifier )* )?
         ^[ IMPLEMENTS_CLAUSE,"IMPLEMENTS_CLAUSE"]^
     ;
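
All three notations above describe the same operation: hang whatever the rule matched under a new imaginary root. A minimal Java sketch of that operation follows, assuming a toy node type; Node, addImaginaryRoot, and toStringTree here are hypothetical stand-ins and not ANTLR's AST classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not ANTLR's AST interface.
class ImaginaryRootSketch {
    /** Toy tree node holding a label and an ordered list of children. */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) { this.text = text; }

        /** LISP-style rendering: a leaf is its text, otherwise (root child ...). */
        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c.toStringTree());
            return sb.append(')').toString();
        }
    }

    /** Creates a root that corresponds to no input token and adopts the matched nodes. */
    static Node addImaginaryRoot(String name, List<Node> matched) {
        Node root = new Node(name);
        root.children.addAll(matched);
        return root;
    }

    public static void main(String[] args) {
        Node tree = addImaginaryRoot("IMPLEMENTS_CLAUSE",
                Arrays.asList(new Node("Runnable"), new Node("Cloneable")));
        // prints (IMPLEMENTS_CLAUSE Runnable Cloneable)
        System.out.println(tree.toStringTree());
    }
}
```

If the optional clause matches nothing, this sketch returns a bare IMPLEMENTS_CLAUSE node with no children.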

> 
> Oh, I've updated the proposal page to use -> instead of => and to
> address some of the concerns mentioned on the list.
> 
> Ter
> PS: here's a nasty rule that shows a weakness in my current scheme for
> dealing with alternatives, as I believe Loring predicted:
> 
> field!
>      :   mods:modifiers
>          (   h:ctorHead s:constructorBody // constructor
>              {#field = #(#[CTOR_DEF,"CTOR_DEF"], mods, h, s);}
> 
>          |   cd:classDefinition[#mods]       // inner class
>              {#field = #cd;}
> 
>          |   id:interfaceDefinition[#mods]   // inner interface
>              {#field = #id;}
> 
>          |   t:typeSpec[false]  // method or variable declaration(s)
>              (   IDENT  // the name of the method
> 
>                  LPAREN! param:parameterDeclarationList RPAREN!
> 
>                  rt:declaratorBrackets[#t]
> 
>                  (tc:throwsClause)?
> 
>                  ( s2:compoundStatement | SEMI )
>                  {#field = #(#[METHOD_DEF,"METHOD_DEF"],
>                               mods,
>                               #(#[TYPE,"TYPE"],rt),
>                               IDENT,
>                               param,
>                               tc,
>                               s2);}
>              |   v:variableDefinitions[#mods,#t] SEMI
>                  {#field = #v;}
>              )
>          )
> 
>      |   "static" s3:compoundStatement
>          {#field = #(#[STATIC_INIT,"STATIC_INIT"], s3);}
> 
>      |   s4:compoundStatement
>          {#field = #(#[INSTANCE_INIT,"INSTANCE_INIT"], s4);}
>      ;
> 
> Let me see what I'd like to do.  Ok, with the modifiers left-factored
> in front of that subrule, we need -> in subrules (which I have in
> proposal but said we might not need...seems we do).  Let's see:
> 
> field
>      :   mods=modifiers
>          (   ctorHead constructorBody // constructor
>              -> ^(CTOR_DEF modifiers ctorHead constructorBody)
> 
>          |   classDefinition[@mods.ast]       // inner class
> 
>          |   interfaceDefinition[@mods.ast]   // inner interface
> 
>          |   t=typeSpec[false]  // method or variable declaration(s)
>              (   IDENT  // the name of the method
>                  LPAREN parameterDeclarationList RPAREN
>                  declaratorBrackets[@t.ast]
>                  (throwsClause)?
>                  ( compoundStatement | SEMI )
>                  -> ^(METHOD_DEF
>                             modifiers
>                             ^(TYPE declaratorBrackets)
>                             IDENT parameterDeclarationList throwsClause
>                             compoundStatement
>                         )
> 
>              |   variableDefinitions[@mods.ast,@t.ast] SEMI
>              )
>          )
> 
>      |   "static" compoundStatement -> ^(STATIC_INIT compoundStatement)
> 
>      |   compoundStatement -> ^(INSTANCE_INIT compoundStatement)
>      ;

The 2.8 field def (assuming 3.0 payloads) is
field!
     :   mods:modifiers
         (   h:ctorHead s:constructorBody // constructor
             ^[CTOR_DEF,"CTOR_DEF"]

         |   cd:classDefinition[#mods]       // inner class
             ^{ => cd }

         |   id:interfaceDefinition[#mods]   // inner interface
             ^{ => id }

         |   t:typeSpec[false] // method or variable declaration(s)
             ^{ ^( ^[TYPE,"TYPE"] t ) }

             (   IDENT  // the name of the method

                 LPAREN! param:parameterDeclarationList RPAREN!

                 rt:declaratorBrackets[#t]
                 (tc:throwsClause)?

                 ( s2:compoundStatement | SEMI )
                 ^[METHOD_DEF,"METHOD_DEF"]^
             |   v:variableDefinitions[#mods,#t] SEMI
                 ^{ => v }
             )
         )




