[antlr-interest] Thoughts on tree construction

Oliver Zeigermann oliver at zeigermann.de
Fri May 7 14:15:15 PDT 2004


Tiller, Michael (M.M.) wrote:

> With these discussions about ANTLR 3.0 popping up occasionally, I wanted 
> to throw out an idea that has been bouncing around in my head.  The 
> thing is, I’m not really a compiler developer so I don’t know whether 
> people would generally consider this a good thing to do, but it seems 
> like it would work out well for me.
> 
> To accommodate my mental model of how data should be handled during the 
> compilation process (and that is admittedly probably pretty warped), it 
> seems like there would be some benefit to using the DOM API from XML as 
> the basis for tree construction.  First off, you have a standard API 
> that people are familiar with and is, after all, used to build and 
> traverse trees already.  Another more subtle advantage of the DOM 

A lot of people think DOM really is a bad interface: impractical and 
much too verbose. On the Java side there is JDOM, whose interfaces 
really look the way one would expect them to - very practical.
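
To make the verbosity point concrete, here is a minimal sketch that builds
the same small tree twice. It uses Python's standard library rather than
Java: xml.dom.minidom stands in for the W3C DOM, and xml.etree.ElementTree
stands in for a more practical JDOM-style API (it is not JDOM itself). The
element names "plus" and "int" are made up for illustration.

```python
from xml.dom.minidom import Document
import xml.etree.ElementTree as ET


def build_with_dom():
    # W3C-style DOM: the document creates every node, and each node
    # (including text) has to be attached to its parent explicitly.
    doc = Document()
    plus = doc.createElement("plus")
    doc.appendChild(plus)
    for value in ("1", "2"):
        num = doc.createElement("int")
        num.appendChild(doc.createTextNode(value))
        plus.appendChild(num)
    return doc.documentElement.toxml()


def build_with_elementtree():
    # Lighter API: creating a child and attaching it is a single call,
    # and text is a plain attribute of the element.
    plus = ET.Element("plus")
    for value in ("1", "2"):
        ET.SubElement(plus, "int").text = value
    return ET.tostring(plus, encoding="unicode")


print(build_with_dom())
print(build_with_elementtree())
```

Both functions serialize to the same XML; the difference is only in how
much ceremony it takes to get there.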

Footnote: I think Terence has the same reservations about XML, but I 
really do not see XML as a human interface for *writing*. It is just an 
exchange format for computers that is also *readable*. SGML, the 
grandfather of XML, was also *writable* for humans.

> approach would be to take advantage of some of the “richness” in the 
> trees.  The fact that the DOM spec. includes the concepts of attributes 
> and elements (not to mention text and comments) seems like it could 
> enhance the way we think about tree construction.  For example, I find 
> the idea of distinguishing attributes and elements quite appealing 
> because in most cases it seems like the information in the AST really 
> falls into two categories: information about a node (attributes) and 
> information about the structure contained by the node (elements).  It 
> has always seemed confusing to me that the ANTLR tree construction 
> doesn’t account for these separately.

And then again, a lot of people think that whether something should be 
expressed in an attribute or a child element is merely a matter of 
taste. RelaxNG treats both as almost equivalent.
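
As a rough illustration of why the choice can be seen as a matter of taste,
the sketch below encodes the same information about a node once as
attributes and once as child elements, and reads it back through one
helper. The "decl" node, its fields, and the field() helper are all
hypothetical names chosen for this example; nothing here is ANTLR or
RelaxNG API.

```python
import xml.etree.ElementTree as ET


def as_attributes():
    # Information about the node stored as attributes.
    return ET.Element("decl", {"type": "int", "name": "x"})


def as_children():
    # The same information stored as child elements.
    decl = ET.Element("decl")
    ET.SubElement(decl, "type").text = "int"
    ET.SubElement(decl, "name").text = "x"
    return decl


def field(node, key):
    # Hypothetical accessor: look for an attribute first, then fall
    # back to the text of the first child element with that name.
    if key in node.attrib:
        return node.attrib[key]
    return node.findtext(key)


for decl in (as_attributes(), as_children()):
    print(ET.tostring(decl, encoding="unicode"),
          field(decl, "type"), field(decl, "name"))
```

A consumer that goes through such an accessor does not need to care which
of the two encodings the producer picked.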

> All that being said, there is nothing that prevents users of ANTLR from 
> instantiating a DOM and operating on it during tree construction.  I was 
> just thinking it might be nice if the tree construction rules could 
> automatically do this.  I can imagine a situation where the parser could 
> be written in C++ and instantiate a DOM structure that could then be 
> accessed after the parsing from Python (assuming there were Python 
> bindings to the DOM C++ API).  Perhaps some degree of target language 
> neutrality could be achieved through this approach as well?
> 
> Anyway, this was just a thought.  I’m probably going to try and play 
> around with it because I’m working on a project where the “compiler” 
> process is going to be distributed across several tools and we’re going 
> to try and use XML as an intermediate representation between passes.  

I have done this and I really recommend it - provided run-time 
performance is not an issue. Writing and parsing XML - naturally - takes 
quite some time. Other than that it really is a very convenient way of 
testing each stage of your transformation. You can even change the 
output of one step, or write it entirely by hand, for testing.
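
A minimal sketch of that workflow, assuming two made-up passes that
exchange nothing but XML text: a toy "parser" that only understands
"a + b" with integer operands, and a constant-folding pass. The pass
names and the "plus"/"int" element vocabulary are assumptions for
illustration, not part of ANTLR or any existing tool.

```python
import xml.etree.ElementTree as ET


def parse_pass(source):
    # Pass 1 (hypothetical): turn "a + b" into an XML tree.
    left, right = (part.strip() for part in source.split("+"))
    plus = ET.Element("plus")
    ET.SubElement(plus, "int").text = left
    ET.SubElement(plus, "int").text = right
    return ET.tostring(plus, encoding="unicode")


def fold_pass(xml_text):
    # Pass 2 (hypothetical): fold a <plus> whose children are all <int>
    # into a single <int>; anything else is passed through unchanged.
    root = ET.fromstring(xml_text)
    if root.tag == "plus" and all(child.tag == "int" for child in root):
        folded = ET.Element("int")
        folded.text = str(sum(int(child.text) for child in root))
        return ET.tostring(folded, encoding="unicode")
    return xml_text


# Normal pipeline: the output of pass 1 feeds pass 2.
intermediate = parse_pass("1 + 2")
result = fold_pass(intermediate)

# Testing pass 2 in isolation with hand-written input.
hand_written = fold_pass("<plus><int>40</int><int>2</int></plus>")

print(intermediate, result, hand_written)
```

Because each pass reads and writes plain text, the intermediate result can
be saved to a file, inspected, or edited by hand before the next pass runs.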

> For such an approach, using a DOM API to build the tree is ideal because 
> when we are all done we can dump XML to a file for the next tool to pick 
> it up and run with it.

Have you tried XPA? It reads in XML and either produces ANTLR tokens or 
ASTs from it for ANTLR transformations, and it can write ASTs back into 
XML. This way you could go

XML (->AST->ANTLR->AST->XML )*

Oliver


 