[antlr-interest] Retaining comments

Thomas Brandon tbrandonau at gmail.com
Wed Mar 12 11:02:03 PDT 2008


On Thu, Mar 13, 2008 at 3:53 AM, Stuart Watt <SWatt at infobal.com> wrote:
>
> The scenario I flagged is illustrative only of my particular task, where I
> want the best of an AST and of the text. This is not quite associating
> comments and structure, but of generating annotated/formatted text. I was
> just commenting that XML technologies can be very helpful for certain tasks
> (like these) and that combining ANTLR and XML for tasks like these ought to
> be easier than having to muck around at the text layer manually. I still
> hope this is possible, but if not, I'll maybe have to think how to manage
> the architecture better.

Why do you think managing annotated text in an AST is difficult? I
don't know that I disagree, but I don't know that I agree either.
Without any AST rewriting it's obviously a simple 1:1 mapping.
Nodes you want to ignore (comments, whitespace) could be put on
channels that are skipped during processing but used when outputting.
Or you could remove those nodes from the tree and output the text
between the stop index of one node and the start index of the next as
you go through the tree. That saves the memory of the unneeded nodes,
assuming the input is still around.
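
A minimal sketch of that second option, in plain Python rather than
against the ANTLR runtime. The Node class, its start/stop offsets and
the render() helper are all made up for illustration; they only
assume each kept node remembers its inclusive character range in the
original input.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node; start/stop are inclusive character offsets."""
    text: str
    start: int
    stop: int
    children: List["Node"] = field(default_factory=list)


def leaves(node):
    """Yield the leaf nodes of the tree in document order."""
    if not node.children:
        yield node
    else:
        for child in node.children:
            yield from leaves(child)


def render(root, source):
    """Rebuild the text, filling the gaps between nodes from the input."""
    out = []
    pos = 0
    for leaf_node in leaves(root):
        out.append(source[pos:leaf_node.start])  # dropped comments/whitespace
        out.append(leaf_node.text)               # possibly rewritten node text
        pos = leaf_node.stop + 1
    out.append(source[pos:])                     # trailing text after last node
    return "".join(out)


source = "x = 1 /* one */ + 2 // done"


def leaf(text):
    """Build a leaf for the first occurrence of `text` in `source`."""
    start = source.index(text)
    return Node(text, start, start + len(text) - 1)


# Comments and whitespace were never put in the tree.
root = Node("EXPR", 0, len(source) - 1,
            [leaf(t) for t in ["x", "=", "1", "+", "2"]])

unchanged = render(root, source)
root.children[2].text = "10"  # edit a node; its offsets still refer to the input
edited = render(root, source)
```

Here the comments survive the edit even though no node ever held
them, which is the point of keeping the input around.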
More complex restructuring of nodes would seem to be the main issue.
Something like:
    somerule: attributes SOMERULE^ contents;
is going to cause problems, as the root SOMERULE now comes before
attributes in the tree even though it follows it in the text.
But you could leave the concrete nodes untouched and just add extra
imaginary nodes to add your structure. Setting them to a different
channel would allow them to be easily skipped when outputting.
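
Roughly what I have in mind, as a toy Python sketch and not ANTLR
runtime code. The Node class, the channel constants and the two
helper functions are assumptions made up for this example: concrete
nodes stay in input order, imaginary nodes only add nesting, and each
pass skips the channel it doesn't care about.

```python
from dataclasses import dataclass, field
from typing import List

# Channel numbers are arbitrary labels for this sketch.
DEFAULT, IMAGINARY, HIDDEN = 0, 1, 99


@dataclass
class Node:
    """Hypothetical tree node carrying its text and a channel."""
    text: str
    channel: int = DEFAULT
    children: List["Node"] = field(default_factory=list)


def emit(node):
    """Output pass: concrete and hidden text in order, imaginary nodes skipped."""
    own = "" if node.channel == IMAGINARY else node.text
    return own + "".join(emit(child) for child in node.children)


def structure(node):
    """Processing pass: keep the imaginary structure, drop hidden nodes."""
    if node.channel == HIDDEN:
        return None
    kids = [s for s in map(structure, node.children) if s is not None]
    return (node.text, kids) if kids else node.text


tree = Node("SOMERULE", IMAGINARY, [
    Node("ATTRIBUTES", IMAGINARY, [Node("id"), Node("="), Node("1")]),
    Node(" /* note */ ", HIDDEN),
    Node("CONTENTS", IMAGINARY, [Node("body")]),
])

text = emit(tree)
shape = structure(tree)
```

The imaginary SOMERULE root never appears in the output, so where it
sits relative to the concrete nodes no longer matters.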

This seems like a pretty direct mapping of your proposed XML to an
AST. The concrete nodes are the text content of the XML, the imaginary
nodes are the XML tags. Thus it seems to me that the problem of
mapping your AST to XML is basically the same as that of creating an
AST structure.
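
To make the correspondence concrete, here is a small sketch using
Python's standard xml.etree.ElementTree. The tuple-based AST shape
and the to_xml() helper are invented for the example: an imaginary
node is a (tag, children) pair and a concrete node is a plain string.

```python
import xml.etree.ElementTree as ET


def to_xml(node):
    """Map (tag, children) imaginary nodes to elements, strings to text."""
    tag, children = node
    elem = ET.Element(tag)
    last = None  # most recently appended child element, if any
    for child in children:
        if isinstance(child, str):
            if last is None:
                # Text before the first child element goes in .text
                elem.text = (elem.text or "") + child
            else:
                # Text after a child element goes in that child's .tail
                last.tail = (last.tail or "") + child
        else:
            last = to_xml(child)
            elem.append(last)
    return elem


ast = ("somerule", [
    ("attributes", ["id=1"]),
    " /* note */ ",
    ("contents", ["body"]),
])

root = to_xml(ast)
markup = ET.tostring(root, encoding="unicode")
plain = "".join(root.itertext())          # strips tags, recovers the text
body = root.find("contents").text         # limited XPath-style lookup
```

Stripping the tags gives back the original text, and the tags
themselves are just the imaginary nodes.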

Depending on your XPath processor, you might even be able to easily
write a DOM/SAX wrapper around your AST and use that to run XPath
against your AST without having to do any XML<->AST conversion. Then
you could mix XPath and tree parser based passes.
Again depending on the processor, you could also use that to run XSLT
against your AST, though I see less use for this unless you're already
tied to XSLT. A tree parser with StringTemplate output does the job
of a text/HTML-outputting XSLT pretty well, and rewriting tree parsers
handle the XML-outputting XSLT situations. And AST->AST (i.e. XML->XML
XSLT) transformations could be tricky to implement without rewriting
large parts of the XSLT processor. XSLT is probably better for complex
restructuring, but it doesn't sound like you need that.
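
On the DOM/SAX wrapper idea, the SAX half could look something like
the sketch below, using Python's standard xml.sax classes. The
tuple-based AST and the fire_sax_events() and TagCounter names are
made up for illustration. Whether a given XPath or XSLT processor
will accept events from such a source is an assumption you would
have to check for your processor.

```python
import io
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def fire_sax_events(node, handler):
    """Walk the AST and report it to a SAX ContentHandler, no XML text built."""
    if isinstance(node, str):
        handler.characters(node)              # concrete node -> character data
        return
    tag, children = node
    handler.startElement(tag, AttributesImpl({}))  # imaginary node -> element
    for child in children:
        fire_sax_events(child, handler)
    handler.endElement(tag)


class TagCounter(ContentHandler):
    """Example consumer: counts elements without ever seeing markup."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


ast = ("somerule", [
    ("attributes", ["id=1"]),
    " /* note */ ",
    ("contents", ["body"]),
])

counter = TagCounter()
fire_sax_events(ast, counter)

# Any other ContentHandler works too, e.g. the stdlib serializer.
buffer = io.StringIO()
fire_sax_events(ast, XMLGenerator(buffer))
serialized = buffer.getvalue()
```

The same walk feeds both consumers, so the AST stays the only copy of
the data.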

Just thinking out loud here but seems like it might work.

Tom.
>
> All the best
> Stuart
>

