[antlr-interest] AST rewrite "requirements"
Andy Tripp
antlr at jazillian.com
Thu Jul 24 13:18:51 PDT 2008
AST-rewrite fans,
As Terence and Leon Su are working on a new term-rewriting system, I'd
like to approach the same problem from the other end. I want to define
what features I think are needed for a term-rewrite system to be
useful for certain translation projects. By "certain", I mean
translators that do things like C-to-Java, C++-to-Java, and COBOL-to-Java.
So I've started by taking one of my current "translation rules", and
writing down pseudocode (actually, fairly reasonable Java code) that
does what needs to be done. If people find this useful, then I'll repeat this
for the many other "rules".
Each "rule" modifies an AST in some way.
The "rule" that I chose modifies a C++/Java AST to change the
access modifiers ("public", "private", "protected") from being "C++ - like"
to "Java like". For example, this C++ code...
class C {
int a;
public:
int b;
int c;
private:
int d;
}
...might produce a "C++-ish" AST that looks like this...
CLASS_SPECIFIER "C"
DECL
TYPE
LITERAL_int
NAME
ID "a"
DECL
MODIFIERS
LITERAL_public
TYPE
LITERAL_int
NAME
ID "b"
DECL
TYPE
LITERAL_int
NAME
ID "c"
DECL
MODIFIERS
LITERAL_private
TYPE
LITERAL_int
NAME
ID "d"
...and we want to transform that AST to this "Java-ish" AST...
CLASS_SPECIFIER "C"
DECL
MODIFIERS
LITERAL_private
TYPE
LITERAL_int
NAME
ID "a"
DECL
MODIFIERS
LITERAL_public
TYPE
LITERAL_int
NAME
ID "b"
DECL
MODIFIERS
LITERAL_public
TYPE
LITERAL_int
NAME
ID "c"
DECL
MODIFIERS
LITERAL_private
TYPE
LITERAL_int
NAME
ID "d"
...which will produce this Java code...
class C {
private int a;
public int b;
public int c;
private int d;
}
I won't cover the "big picture" here, all I want to do is to
outline what logic is needed to make the AST transformation that
we see above.
So here is some rough Java code that would do what needs to be done:
private final static List<Integer> ALL_C_MODIFIERS = ASTFactory.createList(
LITERAL_private, LITERAL_public, LITERAL_protected);
AST currentModifier = ASTFactory.create(LITERAL_private);
// Loop through each of the CLASS_SPECIFIER ASTs, checking
// each of its children to see if it has any access modifier
// ("public", "private", or "protected").
for (AST classSpecifier: root.getASTs(CLASS_SPECIFIER)) {
// If there is any "friend" keyword in this class, we'll need
// to just make everything "public".
boolean hasFriend = classSpecifier.contains(LITERAL_friend);
if (hasFriend) {
currentModifier = ASTFactory.create(LITERAL_public);
}
for (AST child: classSpecifier.getChildren()) {
AST foundModifier = child.findAny(ALL_C_MODIFIERS);
if (foundModifier != AST.NOT_FOUND) {
// Once we've found an access modifier, that's the one
// we'll use until we find another (unless we have a friend,
// in which case leave it "public".
if (!hasFriend) {
currentModifier = foundModifier;
}
} else {
// This child does not yet have an access modifier,
// as required by Java. Insert a COPY of the last modifier
// that we encountered, as the first child
child.insert(0, currentModifier);
}
}
}
>From that code, we see the following features being used:
ASTFactory class:
create() static method that creates a single AST
AST class:
getASTs() method which gets a List of ancestor ASTs of the given type
contains() method which returns true iff there is any ancestor AST of the given type
getChildren() method returns the direct children of the AST
findAny() method returns any one ancestor AST that has one of the given types
NOT_FOUND final static AST flag that indicates no AST was found
insert() method that inserts a copy of the given AST as a child.
Obviously, I haven't shown a "real term-rewrite system", I've just shown some
Java code with an AST class that has some reasonable tree-data-structure functionality.
For any term-rewrite-system to be useful to me, it's got to be easier to
develop and maintain that what I show here. I'm doubtful that that can be done,
but I'd love to be proven wrong.
My point here is not to start a long "Which is better for AST transformation:
a DSL or plain Java code?" debate (although I'd be happy to have that
debate, too). Instead, what I want to know is:
Is it worth my time to write up perhaps a couple hundred more examples of
this sort of Java code that does tree transformations? Would you guys
who are buildinga currently-unnamed-rewrite-system actually be willing
to go through such "requirements" and make sure your system provides
that functionality?
The reason I ask is that a long time ago when I did look into the various tools
(SF+SDF, Stratego, TXL, and later Tom), it seemed to me that they got to
be a real mess to do anything non-trivial. Maybe I'm wrong, but I think
if you implemented the same logic shown above in any of these systems,
it would be harder to write and maintain than the "vanilla Java" shown above.
Andy
More information about the antlr-interest
mailing list