[antlr-interest] Comparing ASTs of the two Java1.5 grammars

Michael Studman mstudman at gmx.net
Mon Nov 1 16:41:07 PST 2004


Hi Andy.

Thanks for giving my grammar a test drive!

It seems strange that annotations aren't stored in the AST - that was
not my intention at all (and I think it's very important that they are
there). I will check out why that is happening and get back to the
group.

2a) also seems to be a bug. Again, I'll investigate and get back to you.

Regards,
Michael

> -----Original Message-----
> From: atripp54321 [mailto:atripp at comcast.net]
> Sent: 25 October 2004 20:26
> To: antlr-interest at yahoogroups.com
> Subject: [antlr-interest] Comparing ASTs of the two Java1.5 grammars
> 
> 
> 
> I went to update my JavaEmitter code for the new JDK1.5 grammar,
> and I see we actually have two JDK1.5 grammars listed at antlr.org:
> one by Michael Studman and another by Michael Stahl.
> My code depends on the "shape" of the Java AST produced
> by the grammar, and I'm sure eventually one of these two will
> need to be chosen to be included with ANTLR as the "official" java.g.
> 
> So I tried out these two grammars on the
> various new 1.5 features, and here are my notes on
> the AST that each of these grammars produces.
> For reference, here's the Sun proposed Java 1.5 grammar:
> http://java.sun.com/docs/books/jls/jls-proposed-changes.html
> 
> 1) Annotations
> Neither grammar stores annotations in the AST.
> This seems right to me, as we don't store comments in the AST either.
> Anyone who's annoyed that comments are not stored in the AST
> will now be even more annoyed :)
> 
> 2) Generics:
> Given this code:
>     public Vector(Collection<? extends E> c) {
> 
> Studman's produces this:
>             TYPE
>               IDENT Collection
>                 TYPE_ARGUMENTS
>                   TYPE_ARGUMENT
>                     WILDCARD_TYPE
>                       TYPE_UPPER_BOUNDS
>                         IDENT E
> 
> And Stahl's produces this:
>            TYPE
>              IDENT Collection
>              TYPE_ARGS
>                WILDCARD
>                  LITERAL_extends
>                  TYPE
>                    IDENT E
>                    TYPE_ARGS
> 
> a) One places the type-arguments subtree as a child of IDENT, the other
> as a sibling.
> I prefer Stahl's...seems strange for IDENT to have a child.
> b) Studman's has the extra TYPE_ARGUMENT node, which I prefer.
> c) The two trees are different under WILDCARD_TYPE. I prefer Studman's,
> but I'd rename "TYPE_UPPER_BOUNDS" to "TYPE_EXTENDS" (and
> "TYPE_LOWER_BOUNDS" to "TYPE_SUPER").
> d) That extra TYPE_ARGS at the end of Stahl's shouldn't be there (I think).
> 
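To illustrate why the child-versus-sibling difference in a) matters to code that depends on the AST shape, here is a minimal sketch. It assumes a hypothetical `TypeNode` class rather than ANTLR's own AST type, and the helper names (`GenericsShapes`, `typeArguments`) are invented for illustration. The two trees are copied from the dumps above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; this is not ANTLR's AST class.
final class TypeNode {
    final String type;   // token type name, e.g. "IDENT"
    final String text;   // token text, or null when the node carries none
    final List<TypeNode> children = new ArrayList<>();

    TypeNode(String type, String text, TypeNode... kids) {
        this.type = type;
        this.text = text;
        for (TypeNode k : kids) children.add(k);
    }

    // Returns the first direct child with the given type, or null.
    TypeNode firstChild(String wanted) {
        for (TypeNode c : children) if (c.type.equals(wanted)) return c;
        return null;
    }
}

final class GenericsShapes {
    // Studman's shape: the argument list hangs off the IDENT.
    static TypeNode studman() {
        return new TypeNode("TYPE", null,
            new TypeNode("IDENT", "Collection",
                new TypeNode("TYPE_ARGUMENTS", null,
                    new TypeNode("TYPE_ARGUMENT", null,
                        new TypeNode("WILDCARD_TYPE", null,
                            new TypeNode("TYPE_UPPER_BOUNDS", null,
                                new TypeNode("IDENT", "E")))))));
    }

    // Stahl's shape: the argument list is a sibling of the IDENT.
    static TypeNode stahl() {
        return new TypeNode("TYPE", null,
            new TypeNode("IDENT", "Collection"),
            new TypeNode("TYPE_ARGS", null,
                new TypeNode("WILDCARD", null,
                    new TypeNode("LITERAL_extends", null),
                    new TypeNode("TYPE", null,
                        new TypeNode("IDENT", "E"),
                        new TypeNode("TYPE_ARGS", null)))));
    }

    // Locates the type-argument list of a TYPE node in either shape.
    static TypeNode typeArguments(TypeNode type) {
        TypeNode sibling = type.firstChild("TYPE_ARGS");   // Stahl
        if (sibling != null) return sibling;
        TypeNode ident = type.firstChild("IDENT");         // Studman
        return ident == null ? null : ident.firstChild("TYPE_ARGUMENTS");
    }
}
```

A tree walker written against one shape has to be rewritten for the other, which is why the choice of the "official" grammar affects downstream code.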
> 2) For-each loop:
> Given this code:
>                 for (Integer i : integers) {
>                 }
> 
> Studman's produces this:
>          LITERAL_for
>            FOR_EACH_CLAUSE
>              PARAMETER_DEF
>                MODIFIERS
>                TYPE
>                  IDENT Integer
>                IDENT i
>              EXPR
>                IDENT integers
>            SLIST
> 
> And Stahl's produces this:
>           LITERAL_for
>             PARAMETER_DEF
>               MODIFIERS
>               TYPE
>                 IDENT Integer
>                 TYPE_ARGS
>               IDENT i
>             EXPR
>               IDENT integers
>             SLIST
> 
> I prefer Studman's, with the "FOR_EACH_CLAUSE" node that parallels the
> "FOR_INIT", "FOR_CONDITION", and "FOR_ITERATOR" nodes in the old "for"
> syntax.
> 
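As a rough sketch of what the extra FOR_EACH_CLAUSE node means for a tree consumer, the code below finds the loop variable in both shapes. The `LoopNode` class and the `ForEachShapes` helper are hypothetical names used only for illustration. They are not part of either grammar or of ANTLR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; this is not ANTLR's AST class.
final class LoopNode {
    final String type;
    final String text;
    final List<LoopNode> children = new ArrayList<>();

    LoopNode(String type, String text, LoopNode... kids) {
        this.type = type;
        this.text = text;
        for (LoopNode k : kids) children.add(k);
    }

    // Returns the first direct child with the given type, or null.
    LoopNode firstChild(String wanted) {
        for (LoopNode c : children) if (c.type.equals(wanted)) return c;
        return null;
    }
}

final class ForEachShapes {
    // Studman's shape: PARAMETER_DEF and EXPR are wrapped in FOR_EACH_CLAUSE.
    static LoopNode studman() {
        return new LoopNode("LITERAL_for", null,
            new LoopNode("FOR_EACH_CLAUSE", null,
                new LoopNode("PARAMETER_DEF", null,
                    new LoopNode("MODIFIERS", null),
                    new LoopNode("TYPE", null, new LoopNode("IDENT", "Integer")),
                    new LoopNode("IDENT", "i")),
                new LoopNode("EXPR", null, new LoopNode("IDENT", "integers"))),
            new LoopNode("SLIST", null));
    }

    // Stahl's shape: PARAMETER_DEF and EXPR sit directly under LITERAL_for.
    static LoopNode stahl() {
        return new LoopNode("LITERAL_for", null,
            new LoopNode("PARAMETER_DEF", null,
                new LoopNode("MODIFIERS", null),
                new LoopNode("TYPE", null,
                    new LoopNode("IDENT", "Integer"),
                    new LoopNode("TYPE_ARGS", null)),
                new LoopNode("IDENT", "i")),
            new LoopNode("EXPR", null, new LoopNode("IDENT", "integers")),
            new LoopNode("SLIST", null));
    }

    // Returns the for-each variable name, or null for a classic "for" loop.
    static String loopVariable(LoopNode forNode) {
        LoopNode holder = forNode.firstChild("FOR_EACH_CLAUSE");
        if (holder == null) holder = forNode;              // Stahl's shape
        LoopNode param = holder.firstChild("PARAMETER_DEF");
        if (param == null) return null;                    // classic for loop
        LoopNode ident = param.firstChild("IDENT");
        return ident == null ? null : ident.text;
    }
}
```

With Studman's shape, the first child of LITERAL_for alone identifies the loop kind. With Stahl's, the walker has to tell PARAMETER_DEF apart from FOR_INIT.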
> 3) Enums:
> Given this code:
>    enum Rank2 implements whatever {ONE, TWO, THREE}
> Studman's produces this:
>       ENUM_DEF
>         MODIFIERS
>         IDENT Rank2
>         IMPLEMENTS_CLAUSE
>           IDENT whatever
>         OBJBLOCK
>           ENUM_CONSTANT_DEF
>             ANNOTATIONS
>             IDENT ONE
>           ENUM_CONSTANT_DEF
>             ANNOTATIONS
>             IDENT TWO
>           ENUM_CONSTANT_DEF
>             ANNOTATIONS
>             IDENT THREE
> 
> Stahl's failed with "unexpected token" exception.
> 
> Given a full enum definition, Studman's produced an AST that's identical
> to a class definition, but with ENUM_DEF in place of CLASS_DEF.
> Stahl's failed on this one too.
> 
> 4) Varargs:
> Given this code:
>     void test(int i, String... strings)
> 
> Studman's produces this:
>         PARAMETERS
>           PARAMETER_DEF
>             MODIFIERS
>             TYPE
>               LITERAL_int
>             IDENT i
>           VARIABLE_PARAMETER_DEF
>             MODIFIERS
>             TYPE
>               IDENT String
>             IDENT strings
> 
> And Stahl's produces this:
>         PARAMETERS
>           PARAMETER_DEF
>             MODIFIERS
>             TYPE
>               LITERAL_int
>             IDENT i
>           PARAMETER_DEF
>             MODIFIERS
>             TYPE
>               IDENT String
>               TYPE_ARGS
>               ELLIPSIS
>             IDENT strings
> 
> I prefer Studman's AST with the explicit VARIABLE_PARAMETER_DEF node.
> 
> 5) Static imports:
> Given this code:
> import static java.lang.Math.PI;
> 
> Studman's produces this:
>   STATIC_IMPORT
>     DOT
>       DOT
>         DOT
>           IDENT java
>           IDENT lang
>         IDENT Math
>       IDENT PI
> 
> And Stahl's produces this:
>   IMPORT
>     LITERAL_static
>     DOT
>       DOT
>         DOT
>           IDENT java
>           IDENT lang
>         IDENT Math
>       IDENT PI
> 
> I prefer Studman's STATIC_IMPORT. The issue here is whether a "static
> import" is just an "import" that happens to have a "static" modifier
> (as when a variable is static), or whether it's a new type of thing
> (in the way that a "static block" differs from a regular block).
> 
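The two import shapes can be handled side by side, as in the following sketch. It assumes a hypothetical `ImportNode` class in place of ANTLR's AST type. The `StaticImports` helper and its method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; this is not ANTLR's AST class.
final class ImportNode {
    final String type;
    final String text;
    final List<ImportNode> children = new ArrayList<>();

    ImportNode(String type, String text, ImportNode... kids) {
        this.type = type;
        this.text = text;
        for (ImportNode k : kids) children.add(k);
    }
}

final class StaticImports {
    // The nested DOT tree for java.lang.Math.PI, shared by both shapes.
    static ImportNode dottedName() {
        return new ImportNode("DOT", null,
            new ImportNode("DOT", null,
                new ImportNode("DOT", null,
                    new ImportNode("IDENT", "java"),
                    new ImportNode("IDENT", "lang")),
                new ImportNode("IDENT", "Math")),
            new ImportNode("IDENT", "PI"));
    }

    // Studman's shape: a dedicated STATIC_IMPORT root.
    static ImportNode studman() {
        return new ImportNode("STATIC_IMPORT", null, dottedName());
    }

    // Stahl's shape: a plain IMPORT root with a LITERAL_static first child.
    static ImportNode stahl() {
        return new ImportNode("IMPORT", null,
            new ImportNode("LITERAL_static", null), dottedName());
    }

    // Flattens a left-nested DOT tree into a dotted name.
    static String flatten(ImportNode n) {
        if (n.type.equals("IDENT")) return n.text;
        return flatten(n.children.get(0)) + "." + flatten(n.children.get(1));
    }

    // True when the import root denotes a static import in either shape.
    static boolean isStatic(ImportNode root) {
        if (root.type.equals("STATIC_IMPORT")) return true;
        return root.type.equals("IMPORT")
            && !root.children.isEmpty()
            && root.children.get(0).type.equals("LITERAL_static");
    }

    // The name subtree is the last child of the root in both shapes.
    static String importedName(ImportNode root) {
        return flatten(root.children.get(root.children.size() - 1));
    }
}
```

With STATIC_IMPORT the node type alone answers the question. With IMPORT plus LITERAL_static the walker has to inspect the children, much as it would for a modifier.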
> Summary:
> Given that these two both correctly parse Java 1.5 code (which they seem
> to, except for the enum problem noted above), choosing one of these to
> be the "official" java.g comes down to which produces a "better" AST.
> I've listed the differences, and it looks to me like Studman's ASTs
> are more consistent with the ASTs we get today.
> 
> And of course, some guru should look closely at the grammar to make
> sure that it matches the "official" grammar in the JLS, add comments as
> needed, make sure token names are consistent, etc.
> 
> Andy
