[antlr-interest] On trees and JavaBeans, part 2: tree creation

Mon Apr 18 14:16:09 PDT 2005

Scott--

I believe that Ter has adopted the Payload/Carrier model for ANTLR 3:
Carriers should implement an interface that provides navigation support
(including token ids there makes sense, but text belongs in the Payload),
and all attribute data lives in the Payloads.  If you look in the archives,
you can find my posts and those of others involved in the discussion.  The
separation of navigation from data goes a long way towards removing the
awkwardness that you are trying to resolve.

The Carrier interface should be simple enough to wrap most compatible 
navigation classes, and the base Payload interface would just define 
getText() and maybe getLine() and getColumn().  Carriers get created with a 
Payload and a type and are homogeneous (but a tree parser pass can be used 
to convert from one Carrier class to another); Payloads are carried along 
through transformations and are rarely replaced in tree walks.
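
A minimal sketch of that split, in Java. Only getText(), getLine() and
getColumn() come from the description above; the Carrier method names, the
TextPayload and LinkedCarrier classes, and copyOf() are assumptions made up
for illustration, not the ANTLR 3 API.

```java
// Payload: attribute data only (text and position).
interface Payload {
    String getText();
    int getLine();
    int getColumn();
}

// Carrier: navigation plus a token type. Method names are hypothetical.
interface Carrier {
    int getType();
    Payload getPayload();
    Carrier getFirstChild();
    Carrier getNextSibling();
}

// A trivial immutable Payload (hypothetical class).
final class TextPayload implements Payload {
    private final String text;
    private final int line;
    private final int column;

    TextPayload(String text, int line, int column) {
        this.text = text;
        this.line = line;
        this.column = column;
    }

    public String getText() { return text; }
    public int getLine() { return line; }
    public int getColumn() { return column; }
}

// One homogeneous Carrier class using first-child/next-sibling links.
final class LinkedCarrier implements Carrier {
    private final int type;
    private final Payload payload;
    private LinkedCarrier firstChild;
    private LinkedCarrier lastChild;
    private LinkedCarrier nextSibling;

    LinkedCarrier(int type, Payload payload) {
        this.type = type;
        this.payload = payload;
    }

    void addChild(LinkedCarrier child) {
        if (firstChild == null) {
            firstChild = child;
        } else {
            lastChild.nextSibling = child;
        }
        lastChild = child;
    }

    public int getType() { return type; }
    public Payload getPayload() { return payload; }
    public Carrier getFirstChild() { return firstChild; }
    public Carrier getNextSibling() { return nextSibling; }

    // Converts any Carrier tree into LinkedCarrier nodes. The Payload
    // objects are reused, not copied: only the navigation layer changes.
    static LinkedCarrier copyOf(Carrier source) {
        LinkedCarrier copy = new LinkedCarrier(source.getType(), source.getPayload());
        for (Carrier c = source.getFirstChild(); c != null; c = c.getNextSibling()) {
            copy.addChild(copyOf(c));
        }
        return copy;
    }
}
```

copyOf() stands in for the "convert from one Carrier class to another" pass:
the new tree has fresh Carriers, but every node still points at the original
Payload.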

Tree grammars cannot walk "any" data structure: the structure needs to be
either a digraph with no loops (cycles), or a digraph in which the
navigation methods prevent looping.  Modal navigation (where the methods
consult a mode variable to determine which field to dereference) is
possible, but the core restriction is "no loops".
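
One way navigation methods might prevent looping is to remember the nodes on
the current path and refuse to follow an edge back to any of them. The sketch
below assumes a hypothetical GraphNode class and LoopFreeNavigator; it
illustrates the restriction and is not how ANTLR itself navigates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// A plain graph node; edges may form cycles (hypothetical class).
final class GraphNode {
    final String name;
    final List<GraphNode> edges = new ArrayList<GraphNode>();

    GraphNode(String name) {
        this.name = name;
    }
}

// Walks a possibly cyclic graph as if it were a tree by skipping any edge
// that leads back to a node on the path from the root to the current node.
final class LoopFreeNavigator {
    // Identity-based set: nodes are compared by reference, not equals().
    private final Set<GraphNode> onPath =
            Collections.newSetFromMap(new IdentityHashMap<GraphNode, Boolean>());

    // Returns node names in preorder.
    List<String> walk(GraphNode root) {
        List<String> out = new ArrayList<String>();
        visit(root, out);
        return out;
    }

    private void visit(GraphNode node, List<String> out) {
        out.add(node.name);
        onPath.add(node);
        for (GraphNode next : node.edges) {
            if (!onPath.contains(next)) {
                visit(next, out);
            }
        }
        onPath.remove(node);
    }
}
```

Because only the current path is tracked, a node shared by two parents in an
acyclic graph is still visited once per parent, which matches walking a
loop-free digraph as a tree.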

--Loring

At 05:21 AM 4/18/2005, Scott Stanchfield wrote:
>(Please read "part 1, interfaces" first)
>
>[I forgot in part 1 to mention how we get "token" values: the ast model
>would define
>     int getTokenIdFor(Object node);
>     String getTextFor(Object node);
>  that would return the parts of a token for any given node. It could just
>return a Token, but that would require the model to create Token objects for
>each real object, which may not be necessary.]
>
>
>Once we can use any data structure for tree parsing, it would be useful to
>be able to create *any* data structure during tree creation as well.
>Factories are the right approach here, and are pretty close to being
>exactly what we need.
>
>First, ASTFactory must be an interface. Then, abstract away "AST" from the
>methods. For example:
>
>   void addChild(Object parent, Object child)
>   Object create(int type)
>   Object create(int type, String text)
>   Object create(int type, String text, String className)
>   ...
>
>Poof.
>
>Using the existing ANTLR tree support and only a slight bit of abstraction,
>we can now:
>* create any data structure
>* walk any data structure
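
A minimal sketch of both halves, in Java, assuming nodes are plain
java.util.Map objects. The method shapes come from the message quoted here;
the names NodeFactory, NodeModel and MapNodes, the map keys, and childrenOf()
are hypothetical, and the className overload is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Creation side: "AST" abstracted away to Object.
// "NodeFactory" is a hypothetical name; the post calls it ASTFactory.
interface NodeFactory {
    void addChild(Object parent, Object child);
    Object create(int type);
    Object create(int type, String text);
}

// Reading side: the two token accessors from the bracketed note.
// "NodeModel" is a hypothetical name.
interface NodeModel {
    int getTokenIdFor(Object node);
    String getTextFor(Object node);
}

// One class playing both roles for Map-based nodes.
final class MapNodes implements NodeFactory, NodeModel {
    public Object create(int type) {
        return create(type, null);
    }

    public Object create(int type, String text) {
        Map<String, Object> node = new LinkedHashMap<String, Object>();
        node.put("type", Integer.valueOf(type));
        node.put("text", text);
        node.put("children", new ArrayList<Object>());
        return node;
    }

    public void addChild(Object parent, Object child) {
        childrenOf(parent).add(child);
    }

    public int getTokenIdFor(Object node) {
        return ((Integer) asMap(node).get("type")).intValue();
    }

    public String getTextFor(Object node) {
        return (String) asMap(node).get("text");
    }

    // Helper for inspecting children; not part of either interface.
    @SuppressWarnings("unchecked")
    List<Object> childrenOf(Object node) {
        return (List<Object>) asMap(node).get("children");
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object node) {
        return (Map<String, Object>) node;
    }
}
```

Nothing in the two interfaces mentions a node class, so a generated parser
could build these maps and a tree parser could read them back without either
one knowing the concrete type.
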
>
>ANTLR becomes the ultimate tool for everything ;)  [Not really of course,
>but tree parsers become much easier to use and therefore more useful, and
>parsing to build data structures becomes easy as well]
>
>
>Next, on to JavaBeans, which make it all even more powerful...
>
>
>With a little syntactic sugar, we can have the grammar explicitly specify
>which properties of a bean to set or read.
>
>
>The trouble with what I've said so far is that things are positional. The
>only way to determine what to parse or build is by the order of adds or the
>index when asking for a child.
>
>
>What if we used labels that were javabean property names and had a "bean
>mode"? (Note: this would work for any language that we can create "get" and
>"set" methods for, not just java!)
>
>
>   options {
>     beanMode = true;
>   }
>
>   person creates Person
>     : name:IDENT phone:phoneNumber address:address
>     ;
>
>   address creates Address
>     : street:IDENT city:IDENT COMMA! state:IDENT zip:INTEGER
>     ;
>
>
>While this needs a little syntactic help, the idea is that the "creates"
>clause states what kind of object to create, and the labels specify which
>properties to assign in the bean being created.
>If the property is a token, we just set the token. If it's a string, we just
>set the token text. We'd probably want some special cases for a few other
>data types as well. Primitive conversions are simple string conversions.
>
>Poof - no action code required to create a data structure based on set
>methods and no-arg constructors!
>(We don't really need the "!" in the above example; we could ignore any term
>without a label)
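
A rough sketch of what generated "bean mode" code might do for each labelled
term: look up the setter named after the label and convert the token text to
the parameter type. BeanSetter and the Address bean are hypothetical, only
String and int conversions are shown, and the bean class is assumed to be
accessible from BeanSetter (same package or public).

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class BeanSetter {
    // Calls bean.setXxx(value) for a grammar label "xxx".
    static void set(Object bean, String label, String tokenText) {
        String setter = "set" + Character.toUpperCase(label.charAt(0)) + label.substring(1);
        for (Method m : bean.getClass().getMethods()) {
            Class<?>[] params = m.getParameterTypes();
            if (m.getName().equals(setter) && params.length == 1) {
                try {
                    m.invoke(bean, convert(tokenText, params[0]));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                } catch (InvocationTargetException e) {
                    throw new IllegalStateException(e);
                }
                return;
            }
        }
        throw new IllegalArgumentException("no setter for label: " + label);
    }

    // Strings pass through; primitive conversions are simple string conversions.
    static Object convert(String text, Class<?> target) {
        if (target == String.class) {
            return text;
        }
        if (target == int.class || target == Integer.class) {
            return Integer.valueOf(text);
        }
        throw new IllegalArgumentException("no conversion to " + target.getName());
    }
}

// Bean matching the "address creates Address" rule: no-arg constructor
// plus one setter per label.
class Address {
    private String street;
    private String city;
    private String state;
    private int zip;

    public void setStreet(String street) { this.street = street; }
    public void setCity(String city) { this.city = city; }
    public void setState(String state) { this.state = state; }
    public void setZip(int zip) { this.zip = zip; }

    public String getStreet() { return street; }
    public String getCity() { return city; }
    public String getState() { return state; }
    public int getZip() { return zip; }
}
```

For the address rule, the parser would create an Address and then call
BeanSetter.set(address, "street", ...) and so on, once per labelled term;
COMMA has no label and so triggers no call.
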
>
>We'd need "add" support as well as get/set. (Note that here we *need* the
>parens for the * closure...)
>
>
>   person creates Person
>     : name:IDENT ...
>       ( child:child )*
>     ;
>
>Assuming there's an addChild method, it would be called each time through
>the loop. If there were only a setChild, it would be overwritten each time
>through the loop. We could add some sugar here to ensure an add:
>
>   person creates Person
>     : name:IDENT ...
>       ( add:child:child )*
>     ;
>
>(ewww)
>
>But I don't think that's necessary.
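
The rule described here (use addChild when it exists, otherwise fall back to
setChild) can be decided by reflection alone, without extra grammar syntax. A
minimal sketch, with Mutators, FamilyBean and OnlyChildBean all being
hypothetical names:

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class Mutators {
    // Returns the one-argument addXxx method for a label if the bean has
    // one, otherwise the one-argument setXxx method, otherwise null.
    static Method find(Class<?> beanClass, String label) {
        String suffix = Character.toUpperCase(label.charAt(0)) + label.substring(1);
        Method setter = null;
        for (Method m : beanClass.getMethods()) {
            if (m.getParameterTypes().length != 1) {
                continue;
            }
            if (m.getName().equals("add" + suffix)) {
                return m; // add always wins, whatever order methods arrive in
            }
            if (m.getName().equals("set" + suffix)) {
                setter = m;
            }
        }
        return setter;
    }
}

// Has both methods: each pass through ( child:child )* should append.
class FamilyBean {
    private final List<String> children = new ArrayList<String>();
    private String child;

    public void addChild(String name) { children.add(name); }
    public void setChild(String name) { child = name; }
    public List<String> getChildren() { return children; }
    public String getChild() { return child; }
}

// Has only a setter: each pass through the loop overwrites the last value.
class OnlyChildBean {
    private String child;

    public void setChild(String name) { child = name; }
    public String getChild() { return child; }
}
```

With this lookup the plain ( child:child )* form is enough for both beans,
which is consistent with the view that the explicit add: prefix is
unnecessary.
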
>
>
>This of course assumes that subrules all contribute to the rule's object
>being built. To build other objects, a separate top-level rule must be
>defined. (This is the way my XML parsing support works, and it seems peachy
>so far).
>
>
>[For "bean walking", this is an entirely new ballgame, and I haven't
>finished thinking through it. Current tree parsers are positional. Instead,
>the new type of tree parser could check properties instead of "next child".
>I need to chew on this a bit more to get a good example before I propose
>it...]
>
>
>
>Thoughts?
>-- Scott