[antlr-interest] On trees and JavaBeans, part 2: tree creation

Mon Apr 18 05:21:55 PDT 2005

(Please read the "part 1, interfaces, first)

[I forgot in part 1 to mention how we get "token" values: the ast model
would define
    int getTokenIdFor(Object node);
    String getTextFor(Object node);
 that would return the parts of a token for any given node. It could just
return a Token, but that would require the model to create Token objects for
each real object, which may not be necessary.]

Once we can use any data structure for tree parsing, for tree creation, it
would be useful to be able to create *any* data structure. Factories are the
right approach here, and are pretty close to being exactly what we need.

First, ASTFactory must be an interface. Then, abstract away "AST" from the
methods. For example:

  void addChild(Object parent, Object child)
  Object create(int type)
  Object create(int type, String text)
  Object create(int type, String text, String className)
  ...

Poof.

Using the existing ANTLR tree support and only a slight bit of abstraction,
we can now:
* create any data structure
* walk any data structure

ANTLR becomes the ultimate tool for everything ;)  [Not really of course,
but tree parsers become much easier to use and therefore more useful, and
parsing to build data structures becomes easy as well]

Next to JavaBeans, making it all even more powerful...

With a little syntactic sugar, we can have the grammar explicitly specify
which properties of a bean to set or read.

The trouble with what I've said so far is that things are positional. The
only way to determine what to parse or build is by the order of adds or the
index when asking for a child.

What if we used labels that were javabean property names and had a "bean
mode"? (Note: this would work for any language that we can create "get" and
"set" methods for, not just java!)

  options {
    beanMode = true;
  }

  person creates Person
    : name:IDENT phone:phoneNumber address:address
    ;

  address creates Address
    : street:IDENT city:IDENT COMMA! state:IDENT zip:INTEGER
    ;

While this needs a little syntactic help, the idea is that the "creates"
clause states what kind of object to create, and the labels specify which
properties to assign in the bean being created.
If the property is a token, we just set the token. If it's a string, we just
set the token text. We'd probably want some special cases for a few other
data types as well. Primitive conversions are simple string conversions.

Poof - no action code required to create a data structure based on set
methods and no-arg constructors!
(We don't really need the "!" in the above example; we could ignore any term
without a label)

We'd also need "add" support as well as get/set. (Note here we *need* the
parens for the * closure...)

  person creates Person
    : name:IDENT ... 
      ( child:child )*
    ;

Assuming there's an addChild method, it would be called each time through
the loop. If there were only a setChild, it would be overwritten each time
through the loop. We could add some sugar here to ensure an add:

  person creates Person
    : name:IDENT ... 
      ( add:child:child )*
    ;

(ewww)

But I don't think that's necessary.

This of course assumes that subrules all contribute to the rule's object
being built. To build other objects, a separate top-level rule must be
defined. (This is the way my XML parsing support works, and it seems peachy
so far).

[For "bean walking", this is an entirely new ballgame, and I haven't
finished thinking through it. Current tree parsers are positional. Instead,
the new type of tree parser could check properties instead of "next child".
I need to chew on this a bit more to get a good example before I propose
it...]

Thoughts?
-- Scott