[antlr-interest] request for comments on language implementation
Andy Tripp
antlr at jazillian.com
Thu Feb 26 07:13:18 PST 2009
Andreas Meyer wrote:
(snip)
> Well, written out as a "concrete syntax pattern", this looks somewhat
> like "pattern(x) = x + 0 ==> x". There exist lots of tools that do this,
> MatchO maybe can be extended to do it with similar syntax, although some
> feature may be missing from a fully fledged term rewriting engine (order
> of rule application, traversal etc). (I must admit I have not used any
> of these as of now, so I'm only repeating what I saw elsewhere. I still
> have to figure out what is feasible to implement).
I had tried out several term-rewrite engines tools (stratego, txl, etc) a few years ago,
and I didn't find any of them to be better than just writing stuff like
this "by hand". IIRC, they all seemed fine at simple stuff like "x+0 -> x", but
once you start putting in real-world requirements ("for all values that might
evaluate to zero", "as long as '+' hasn't been operator-overloaded") they got
pretty messy.
> Once you have such a generic function, indeed it looks interesting. If
> your application is built on top of a few of those, maybe this would
> also appeal to me. For our purposes, we need to recognize lots of
> patterns, that's why I prefer pattern matching :-)
Yes, of course you make the code modular and build up a library of useful
methods. I also have lots of patterns - hundreds - in my C-to-Java translator.
Some are simple enough that a simple pattern-match thing works:
printf(X) -> sprintf(stdout, X)
...but others require a lot of code.
>> I don't see how an ANTLR treewalker lets you make things modular (and
>> thus maintainable)
>> like this.
>
> Me too. As said above, for transformations I would prefer patterns and a
> pattern-matching-engine.
As Terrence mentioned, he's working on something new. I'm also trying to
put together something new.
>>
>>> Also, you make yourself highly dependent on the underlying
>>> tree-model, which changed a lot from antlr2 to antlr3.
>> Assuming by "tree-model", you mean the shape of the AST you're
>> creating, no,
>> that wouldn't (or shouldn't) change with the version of ANTLR that you're
>> using. You build the AST to be the shape that you want, regardless of
>> ANTLR version.
>
> Have you used antlr3? In our application, the migration to antlr3 is a
> major effort. For example, an antlr2 AST node can be a singular node,or
> a node with a sibling, which has another sibling. So, it can be either a
> node or a list of nodes. If your code depends on this, you end up
> replacing many type definitions with List<AST>. Or you add the nodes you
> actually want to return to a dummy parent, return the first child of
> that parent, and let the caller find out if the returned node has a
> parent and if so, regard all its childs as the rest of the list. If you
> did not write your program in such a fashion, maybe the translation to
> antlr3 gets easier then.
> But still, there is a function called
> "setNextSibling" in antlr2. How do you emulate that in antlr3? You have
> to distinguish lots of special cases to find out if it means: addChild,
> deleteChild, deleteRangeOfChildren .. whatever.
Yes, I've used antlr3, but I haven't actually converted an existing
ANTLR2-based app to ANTLR3. I tended to subclass most of the AST functionality
anyway (for example, hiding the nary-tree implemented as a binary tree
getFirstChild/getNextSibling thing). I see now how upgrading to
antlr3 could be a pain. I guess that's why I'm just sticking with antlr2
for now.
>> I think "hand-written" code to walk an AST is less
>> dependent on AST structure than an ANTLR treewalker is.
>> For example, suppose I want to find all the ancestor nodes of type
>> "INT". The "hand-written" code is something like:
>>
>> List<AST> nodes = someNode.findAncestorsOfType(MyLanguageTokenTypes.INT);
>>
>> ...while the ANTLR treewalker that does the same thing consists of the
>> *entire* tree shape, with extra code added. The ANTLR treewalker forces
>> you to specify much more info than you actually care about.
>
> You could have only a basic tree walker and inherit from that,
> overriding only the places you want. So, if you have a lot more easy
> tree walkers, the single tree-walker copy of the antlr grammar is still
> a burden, but it only exists once.
But each time you change the structure of the AST, whatever treewalker grammar
you have is no longer valid. Michael has a handful (5-10) of AST-changing
rules, so ends up with a handful of grammars. I have many hundreds of
AST-changing rules. It's completely impractical to keep track of what a
valid AST looks like at each step.
>> If by "tree-model" you mean the ANTLR syntax and semantics, then
>> obviously
>> the "hand-written" treewalker is not dependent on it at all, while the
>> ANLTR treewalker is completely dependent on it.
>
> Well, there might be lots of syntactic changes, but I consider these a
> lot easier to deal with than a structural change of the internal tree
> representation, as was the case between antlr2->antlr3. And if you dont
> like treewalker's at all, then maybe generic visitors/combinators could
> be of interest to you (see JJtraveller).
Do you have a link for JJtraveller? Google can't find it.
> Maybe it's possible to also
> write such a tool for antlr, so that, in effect, you get a
> base-tree-walker (without any actions) for free.
>
> So, although my mail contains lots of speculation, I hope it provides
> some ideas that seem useful to you :-)
Yes, thanks for the discussion!
Andy
> Andreas
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>
More information about the antlr-interest
mailing list