[antlr-interest] Preserving Whitespace

Bogdan Mitu bogdan_mt at yahoo.com
Mon Jun 17 04:34:25 PDT 2002

--- mzukowski at yci.com wrote:
> I've detailed another approach before on this list which I think is
> very general and I'd love to get somebody to implement it ;) Basically
> you keep the original file around and every Token you create
> represents a region in that file (start and extent).  When it comes
> time to print out your Tokens you keep track of the previously printed
> token and if some whitespace existed between those two tokens
> previously then copy it to the output. There are some boundary cases
> to handle as well as what to do with synthesized tokens that weren't
> present in the original code.  
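
A minimal Java sketch of the region-based printing described in the quote above. RegionToken and RegionPrinter are hypothetical names for illustration, not ANTLR classes. The fallback of a single space around synthesized or reordered tokens is an assumed policy, one possible answer to the boundary cases mentioned.

```java
import java.util.List;

// Hypothetical token type for this sketch; it is not an ANTLR class.
final class RegionToken {
    final int start;   // offset in the original source, or -1 if synthesized
    final int length;  // extent in characters
    final String text;

    RegionToken(int start, int length, String text) {
        this.start = start;
        this.length = length;
        this.text = text;
    }

    // A token that was not present in the original source.
    static RegionToken synthesized(String text) {
        return new RegionToken(-1, text.length(), text);
    }

    boolean isSynthesized() { return start < 0; }

    int end() { return start + length; }
}

final class RegionPrinter {
    // Prints the tokens in order, copying the original whitespace between
    // two tokens when both come from the source and only whitespace
    // separated them. Assumed fallback: a single space.
    static String print(String source, List<RegionToken> tokens) {
        StringBuilder out = new StringBuilder();
        RegionToken prev = null;
        for (RegionToken t : tokens) {
            if (prev != null) {
                boolean bothFromSource = !prev.isSynthesized() && !t.isSynthesized();
                if (bothFromSource && prev.end() <= t.start
                        && isWhitespace(source, prev.end(), t.start)) {
                    out.append(source, prev.end(), t.start);
                } else {
                    out.append(' ');
                }
            }
            out.append(t.text);
            prev = t;
        }
        return out.toString();
    }

    private static boolean isWhitespace(String source, int from, int to) {
        for (int i = from; i < to; i++) {
            if (!Character.isWhitespace(source.charAt(i))) return false;
        }
        return true;
    }
}
```

Printing an unmodified token list reproduces the original spacing and line breaks between tokens; only the gaps next to a synthesized or moved token fall back to the assumed single space.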

That's exactly what I do. I work on Documents (files read in Swing
TextComponents), so I have all the source available. I want to provide
services like highlighting the declaration of a variable when you select it
in a tree outline, and various source transformations based on the AST. The
problems are:
- tokens ignored in the parser
- imaginary tokens added during parse (as you mentioned)
- tokens moved from their place when the return tree is constructed by hand
(e.g. ## = #(#b, #a))

It would be nice if the root of the return tree carried information
about the range of the input covered by itself *and* its children. This way
it would be easy to make the source reflect the deletion or movement of a
subtree in the AST.
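
A rough Java sketch of such a node, to make the idea concrete. RangedNode and its methods are hypothetical names, not part of ANTLR's AST interface; imaginary nodes are assumed to carry -1 offsets and take their range entirely from their children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node for this sketch; it is not an ANTLR class.
final class RangedNode {
    final String text;
    final int start;  // own token range [start, end) in the source,
    final int end;    // or -1/-1 for an imaginary token
    final List<RangedNode> children = new ArrayList<>();

    RangedNode(String text, int start, int end) {
        this.text = text;
        this.start = start;
        this.end = end;
    }

    RangedNode add(RangedNode child) {
        children.add(child);
        return this;
    }

    // Smallest source offset covered by this node or any descendant.
    // Returns Integer.MAX_VALUE if the whole subtree is imaginary.
    int coveredStart() {
        int min = start >= 0 ? start : Integer.MAX_VALUE;
        for (RangedNode c : children) min = Math.min(min, c.coveredStart());
        return min;
    }

    // Largest (exclusive) source offset covered by this node or any
    // descendant. Returns -1 if the whole subtree is imaginary.
    int coveredEnd() {
        int max = end >= 0 ? end : -1;
        for (RangedNode c : children) max = Math.max(max, c.coveredEnd());
        return max;
    }

    // Reflects the deletion of a subtree in the source text.
    static String deleteFromSource(String source, RangedNode subtree) {
        int from = subtree.coveredStart();
        int to = subtree.coveredEnd();
        if (to < 0) return source; // nothing in the source to delete
        return source.substring(0, from) + source.substring(to);
    }
}
```

For the source "a = b;" and a tree built as #(=, a, b), the root's own token sits at offset 2, but its covered range is 0 to 5, so the range survives the reordering done when the tree is constructed by hand. The sketch recomputes the range on every call; caching it in the node would be the obvious refinement.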

When the source text is not available, I think the solution is to keep the
tokens processed by the parser in a linked list (which is what
TokenStreamHiddenTokenFilter does now, but keeping the tokens ignored in the
parser as well) and construct the AST over this list. Hidden tokens between
two rules should be attached in some way to the root of the return tree of
the previous rule, so that comments and spaces following a subtree can be
recovered, even if the subtree is completely transformed.

This way no information is lost, and one can use the AST as a convenient
structure for (possibly complex) transformations, and the token list for
printing back the result.
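
A small Java sketch of the token-list idea, under stated assumptions. ListToken and TokenList are hypothetical names written in plain Java; they do not use or reproduce the TokenStreamHiddenTokenFilter API. The assumption is that every token, hidden or not, goes into one doubly linked list, that AST nodes keep references to their first and last tokens, and that deleting a subtree unlinks exactly that range.

```java
// Hypothetical token and list types for this sketch; not ANTLR classes.
final class ListToken {
    final String text;
    final boolean hidden; // whitespace or comment, never seen by the parser
    ListToken prev, next;

    ListToken(String text, boolean hidden) {
        this.text = text;
        this.hidden = hidden;
    }
}

final class TokenList {
    ListToken head, tail;

    // Appends a token in input order and returns it so that an AST node
    // can keep a reference to it.
    ListToken append(String text, boolean hidden) {
        ListToken t = new ListToken(text, hidden);
        if (tail == null) {
            head = t;
        } else {
            tail.next = t;
            t.prev = tail;
        }
        tail = t;
        return t;
    }

    // Unlinks the tokens from first to last, inclusive. Assumes first
    // comes before last in this list. Tokens after last, including
    // trailing comments and spaces, stay in place.
    void remove(ListToken first, ListToken last) {
        ListToken before = first.prev;
        ListToken after = last.next;
        if (before != null) before.next = after; else head = after;
        if (after != null) after.prev = before; else tail = before;
        first.prev = null;
        last.next = null;
    }

    // Prints the result back by walking the list, hidden tokens included.
    String print() {
        StringBuilder out = new StringBuilder();
        for (ListToken t = head; t != null; t = t.next) out.append(t.text);
        return out.toString();
    }
}
```

With the input "a + b /* sum */" followed by a newline, removing the tokens from "+" to "b" leaves the comment and the newline in the list, so they are still printed after the transformation.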

