[antlr-interest] Preserving Whitespace

Mon Jun 17 06:50:21 PDT 2002

>- imaginary tokens added during parse (as you mentioned)

What I tend to do is write each code fragment as text and pass it to the
parser, which gives me a small AST to insert.  That way I can set the source
file to a made-up filename that I can trace back to the pass that generated
it.  It also means the commas and semicolons are present in the fragment's
source text, so they can be saved and are known when printing out.  It should
be easy to make the fragment inherit the current indentation as well.
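
A minimal sketch of the fragment idea, in plain Python rather than against the
ANTLR API.  Token, tokenize, fragment and the "<generated:...>" filename
convention are all made up for illustration; a real setup would hand the
fragment text to the actual lexer and parser.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    filename: str  # where the token came from (a real file or a synthetic name)
    line: int


def tokenize(text, filename):
    """Toy lexer: words and single punctuation characters, whitespace skipped."""
    tokens = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(r"\w+|[^\w\s]", line):
            tokens.append(Token(match.group(), filename, lineno))
    return tokens


def fragment(text, pass_name):
    """Lex generated code under a synthetic filename naming the pass (hypothetical convention)."""
    return tokenize(text, "<generated:%s>" % pass_name)


frag = fragment("log(x, y);", "add-logging")
frag_texts = [t.text for t in frag]
frag_files = {t.filename for t in frag}
```

Because the fragment goes through the same lexer as ordinary source, its
commas and semicolons show up as tokens like any others, and every token
records which pass produced it.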

> It would be nice if the root of the return tree would carry information
> about the range of the input covered by itself *and* its children. This way
> it will be easy to make the source reflect the deletion or movement of a
> subtree in AST.

This could easily be done by traversing the tree once parsing is finished.
That's a great idea.
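
A rough sketch of such a traversal, assuming each node records the character
offsets of its own token.  Node, own, span and annotate are hypothetical names
for illustration, not part of ANTLR's AST classes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    text: str
    # (start, end) offsets of this node's own token; None for an imaginary node
    own: Optional[Tuple[int, int]] = None
    children: List["Node"] = field(default_factory=list)
    # filled in by annotate(): range covered by the node and all its descendants
    span: Optional[Tuple[int, int]] = None


def annotate(node):
    """Post-order walk: a node's span is the union of its own range and its children's spans."""
    spans = [s for s in (annotate(c) for c in node.children) if s is not None]
    if node.own is not None:
        spans.append(node.own)
    if spans:
        node.span = (min(s[0] for s in spans), max(s[1] for s in spans))
    else:
        node.span = None
    return node.span


source = "x = a + b;"
plus = Node("+", (6, 7), [Node("a", (4, 5)), Node("b", (8, 9))])
assign = Node("=", (2, 3), [Node("x", (0, 1)), plus])
annotate(assign)
covered = source[plus.span[0]:plus.span[1]]
```

With the spans in place, deleting or moving the "+" subtree amounts to cutting
source[4:9] out of the original text.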

> When source text is not available, I think the solution is to keep the
> tokens processed by the parser in a linked list (which is what
> TokenStreamHiddenTokenFilter does now, but keeping the tokens ignored in
> the parser as well) and construct the AST over this list. Hidden tokens
> between two rules should be appended in some way to the root of the return
> tree of the previous rule, so that comments and spaces following a subtree
> can be recovered, even if the subtree is completely transformed.

I think it'd be easier to save the source text yourself, but this should
work too.
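
For what it's worth, the token-list scheme could look roughly like the sketch
below.  It does not use TokenStreamHiddenTokenFilter; Token, Node, link,
delete_subtree and render are hypothetical stand-ins, and the lexer output is
written out by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    text: str
    hidden: bool = False            # whitespace/comments the parser never sees
    next: Optional["Token"] = None  # link to the following token in the list
    deleted: bool = False


@dataclass
class Node:
    first: Token  # first token covered by this subtree
    last: Token   # last token covered by this subtree


def link(tokens):
    """Chain the tokens into a singly linked list and return the head."""
    for a, b in zip(tokens, tokens[1:]):
        a.next = b
    return tokens[0]


def delete_subtree(node):
    """Drop the subtree's own tokens; hidden tokens that follow it are kept."""
    t = node.first
    while t is not None:
        t.deleted = True
        if t is node.last:
            break
        t = t.next


def render(head):
    """Print the program back from the token list, skipping deleted tokens."""
    out = []
    t = head
    while t is not None:
        if not t.deleted:
            out.append(t.text)
        t = t.next
    return "".join(out)


def tok(text):
    return Token(text, hidden=text.isspace() or text.startswith("/*"))


texts = ["a", " ", "=", " ", "1", ";", " ", "/* keep */", "\n",
         "b", " ", "=", " ", "2", ";", "\n"]
tokens = [tok(s) for s in texts]
head = link(tokens)
first_stmt = Node(first=tokens[0], last=tokens[5])  # covers "a = 1;"

original = render(head)
delete_subtree(first_stmt)
after = render(head)
```

The comment after the first statement survives the deletion because it sits in
the list after the subtree's last token rather than inside the subtree.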

> This way no information is lost, and one can use the AST as a convenient
> structure for (possibly complex) transformations, and the token list for
> printing back the result.

Let us know how this works for you.

Monty
