[antlr-interest] Comparing ASTs of the two Java1.5 grammars
Michael Studman
mstudman at gmx.net
Mon Nov 1 16:41:07 PST 2004
Hi Andy.
Thanks for giving my grammar a test drive!
It seems strange that annotations aren't stored in the AST - that was
not my intention at all (and I think it's very important that they are
there). I will check out why that is happening and get back to the
group.
2a) also seems to be a bug. Again, I'll investigate and get back to you.
Regards,
Michael
> -----Original Message-----
> From: atripp54321 [mailto:atripp at comcast.net]
> Sent: 25 October 2004 20:26
> To: antlr-interest at yahoogroups.com
> Subject: [antlr-interest] Comparing ASTs of the two Java1.5 grammars
>
>
>
> I went to update my JavaEmitter code for the new JDK1.5 grammar,
> and I see we actually have two JDK1.5 grammars listed at antlr.org:
> one by Michael Studman and another by Michael Stahl.
> My code depends on the "shape" of the Java AST produced
> by the grammar, and I'm sure eventually one of these two will
> need to be chosen to be included with ANTLR as the "official" java.g.
>
> So I tried out these two grammars on the
> various new 1.5 features, and here are my notes on
> the ASTs that each of these grammars produces.
> For reference, here's the Sun proposed Java 1.5 grammar:
> http://java.sun.com/docs/books/jls/jls-proposed-changes.html
>
> 1) Annotations
> Neither grammar stores annotations in the AST.
> This seems right to me, as we don't store comments in the AST either.
> Anyone who's annoyed that comments are not stored in the AST
> will now be even more annoyed :)
>
> 2) Generics:
> Given this code:
> public Vector(Collection<? extends E> c) {
>
> Studman's produces this:
> TYPE
> IDENT Collection
> TYPE_ARGUMENTS
> TYPE_ARGUMENT
> WILDCARD_TYPE
> TYPE_UPPER_BOUNDS
> IDENT E
>
> And Stahl's produces this:
> TYPE
> IDENT Collection
> TYPE_ARGS
> WILDCARD
> LITERAL_extends
> TYPE
> IDENT E
> TYPE_ARGS
>
> a) One places the TYPE subtree as a child of IDENT, the other as a
> sibling.
> I prefer Stahl's...seems strange for IDENT to have a child.
> b) Studman's has the extra TYPE_ARGUMENT node, which I prefer.
> c) The two trees are different under WILDCARD_TYPE. I prefer Studman's
> but I'd rename "TYPE_UPPER_BOUNDS" to "TYPE_EXTENDS" (and
> "TYPE_LOWER_BOUNDS"
> to "TYPE_SUPER").
> d) That extra TYPE_ARGS at the end of Stahl's shouldn't be there (I
> think).
>
> 2) For-each loop:
> Given this code:
> for (Integer i : integers) {
> }
>
> Studman's produces this:
> LITERAL_for
> FOR_EACH_CLAUSE
> PARAMETER_DEF
> MODIFIERS
> TYPE
> IDENT Integer
> IDENT i
> EXPR
> IDENT integers
> SLIST
>
> And Stahl's produces this:
> LITERAL_for
> PARAMETER_DEF
> MODIFIERS
> TYPE
> IDENT Integer
> TYPE_ARGS
> IDENT i
> EXPR
> IDENT integers
> SLIST
>
> I prefer Studman's with the "FOR_EACH_CLAUSE" node which parallels the
> "FOR_INIT",
> "FOR_CONDITION", and "FOR_ITERATOR" nodes in the old "for" syntax.
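
For readers comparing the two loop forms, here is a minimal sketch of the source constructs being discussed. The class and method names are hypothetical, and the node names in the comments are taken from the trees above rather than from any particular grammar file.

```java
import java.util.List;

// Hypothetical example class; not part of either grammar's test suite.
class ForLoopForms {
    // Classic loop: the header has three parts, which the existing java.g
    // wraps in FOR_INIT, FOR_CONDITION and FOR_ITERATOR nodes.
    static int sumClassic(List<Integer> integers) {
        int total = 0;
        for (int k = 0; k < integers.size(); k++) {
            total += integers.get(k);
        }
        return total;
    }

    // Java 1.5 for-each loop: the header is a single "parameter : expression"
    // clause, which Studman's grammar wraps in one FOR_EACH_CLAUSE node.
    static int sumForEach(List<Integer> integers) {
        int total = 0;
        for (Integer i : integers) {
            total += i; // auto-unboxing, also new in 1.5
        }
        return total;
    }
}
```

With a wrapper node in both cases, a tree walker could presumably tell the two loop kinds apart by the type of the first child of LITERAL_for instead of counting children.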
>
> 3) Enums:
> Given this code:
> enum Rank2 implements whatever {ONE, TWO, THREE}
> Studman's produces this:
> ENUM_DEF
> MODIFIERS
> IDENT Rank2
> IMPLEMENTS_CLAUSE
> IDENT whatever
> OBJBLOCK
> ENUM_CONSTANT_DEF
> ANNOTATIONS
> IDENT ONE
> ENUM_CONSTANT_DEF
> ANNOTATIONS
> IDENT TWO
> ENUM_CONSTANT_DEF
> ANNOTATIONS
> IDENT THREE
>
> Stahl's failed with an "unexpected token" exception.
>
> Given a full enum definition, Studman's produced an AST that's
> identical to a class definition, but with ENUM_DEF in place of
> CLASS_DEF. Stahl's failed on this one too.
>
> 4) Varargs:
> Given this code:
> void test(int i, String... strings)
>
> Studman's produces this:
> PARAMETERS
> PARAMETER_DEF
> MODIFIERS
> TYPE
> LITERAL_int
> IDENT i
> VARIABLE_PARAMETER_DEF
> MODIFIERS
> TYPE
> IDENT String
> IDENT strings
>
> And Stahl's produces this:
> PARAMETERS
> PARAMETER_DEF
> MODIFIERS
> TYPE
> LITERAL_int
> IDENT i
> PARAMETER_DEF
> MODIFIERS
> TYPE
> IDENT String
> TYPE_ARGS
> ELLIPSIS
> IDENT strings
>
> I prefer Studman's AST with the explicit VARIABLE_PARAMETER_DEF node.
>
> 5) Static imports:
> Given this code:
> import static java.lang.Math.PI;
>
> Studman's produces this:
> STATIC_IMPORT
> DOT
> DOT
> DOT
> IDENT java
> IDENT lang
> IDENT Math
> IDENT PI
>
> And Stahl's produces this:
> IMPORT
> LITERAL_static
> DOT
> DOT
> DOT
> IDENT java
> IDENT lang
> IDENT Math
> IDENT PI
>
> I prefer Studman's STATIC_IMPORT. The issue here is whether a "static
> import" is just an "import" that happens to have a "static" modifier
> (as when a variable is static), or whether it's a new type of thing
> (in the way that a "static block" differs from a regular block).
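
To make the trade-off concrete, here is a rough sketch of how code that consumes the AST might have to handle each shape. The Node class is a hypothetical stand-in, not the ANTLR AST interface, and the nesting is an assumption: the listings above have lost their indentation, so the sketch assumes the DOT nodes nest to the left and that LITERAL_static is the first child of IMPORT in Stahl's tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST node; not the real ANTLR API.
class Node {
    final String type;                              // token type name, e.g. "DOT"
    final String text;                              // token text, used for IDENT
    final List<Node> children = new ArrayList<>();

    Node(String type, String text, Node... kids) {
        this.type = type;
        this.text = text;
        for (Node k : kids) children.add(k);
    }
}

class ImportShapes {
    static Node ident(String name) { return new Node("IDENT", name); }
    static Node dot(Node left, Node right) { return new Node("DOT", ".", left, right); }

    // java.lang.Math.PI as a left-nested DOT tree (assumed nesting).
    static Node mathPi() {
        return dot(dot(dot(ident("java"), ident("lang")), ident("Math")), ident("PI"));
    }

    // Flattens an IDENT or DOT subtree back into a dotted name.
    static String qualifiedName(Node n) {
        if (n.type.equals("IDENT")) return n.text;
        return qualifiedName(n.children.get(0)) + "." + qualifiedName(n.children.get(1));
    }

    // Emits source text for either import shape.
    static String emitImport(Node n) {
        if (n.type.equals("STATIC_IMPORT")) {
            // Studman-style: the node type alone says the import is static.
            return "import static " + qualifiedName(n.children.get(0)) + ";";
        }
        // Stahl-style: an IMPORT node whose first child may be LITERAL_static.
        boolean isStatic = n.children.get(0).type.equals("LITERAL_static");
        Node name = n.children.get(isStatic ? 1 : 0);
        return "import " + (isStatic ? "static " : "") + qualifiedName(name) + ";";
    }
}
```

Under these assumptions both shapes emit the same text, "import static java.lang.Math.PI;". The difference is that the first branch dispatches on node type only, while the second has to inspect the children of every IMPORT node.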
>
> Summary:
> Given that these two both correctly parse Java 1.5 code (which they
> seem to, except for the enum problem noted above), choosing one of
> these to be the "official" java.g comes down to which produces a
> "better" AST. I've listed the differences, and it looks to me like
> Studman's ASTs are more consistent with the ASTs we get today.
>
> And of course, some guru should look closely at the grammar to make
> sure that it matches the "official" grammar in the JLS, add comments
> as needed, make sure token names are consistent, etc.
>
> Andy