[antlr-interest] Preserving ALL comments!
Andy Tripp
antlr at jazillian.com
Wed Feb 22 06:26:34 PST 2006
Damir Kirasić wrote:
>
> We agree that it is easy to remove all Newline, Whitespace and Comment
> from the token stream.
> Our problem is that we don't know how to "programmatically"
> determine which comment goes with which code.
> So far, our main objective was to have comments attached as hidden
> tokens to the corresponding nodes in the AST. And at the same time we
> would NOT like to change the grammar file.
> For example if we have:
> …
> main() /* comment2 */
> …
> comment2 has to be "reassigned" not to BLANK, not to RPAREN but to ID
> because, according to AST construction from grammar, neither BLANK nor
> RPAREN will be present in the AST. So, it seems that we have to know
> (from inspecting grammar and AST construction) that RPAREN will not be
> in the AST and skip it as we already skipped the BLANK token.
> As far as we can see, if a comment goes with a token that will not
> be present in the AST, we have to go back and reassign that comment
> to the next token (one that will be present in the AST). And yet, we
> don't know whether that new candidate token will be present in the AST.
>
> Is it possible? Are we asking too much?
> Should we reformulate our objective? (To preserve comments as HIDDEN
> tokens attached to "normal" AST nodes).
>
> Thank you for your answer(s).
>
> Damir
>
Yes, it is possible. This is exactly the problem that I had to solve.
See /"Preserving the Documentary Structure of Source Code in
Language-based Transformation Tools"/ by Michael L. Van De Vanter at Sun
Laboratories, which talks about the same issue.
What I do is this: just before stripping out the comment/newline/whitespace
tokens, I give each physical line of input a "loose description" (e.g.
"declaration of variable i", "a for statement", "a comment", etc.). Then,
later, after translation is done, I attempt to put each comment back with
the line that it seemed to "go with" at the start. I can send you more
details from my top-secret-highly-classified design document in email if
you'd like :)
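
To make the idea concrete, here is a rough sketch in Python. It is only an
illustration of the "loose description" approach, not the actual design:
the helper names (describe, record_comments, reattach), the regular
expressions, and the toy "translated" output are all assumptions made up
for this example. It handles only single-line /* ... */ comments, and a
comment on a line of its own is labeled "a comment" and left unplaced.

```python
import re

# Assumption: only single-line C block comments are handled in this sketch.
COMMENT = re.compile(r"/\*.*?\*/")


def describe(code):
    """Return a loose description of one comment-free line (hypothetical rules)."""
    code = code.strip()
    if not code:
        return "blank"
    if re.match(r"for\b", code):
        return "a for statement"
    m = re.match(r"(?:int|char|long|float|double)\s+(\w+)\s*(?:=|;)", code)
    if m:
        return "declaration of variable " + m.group(1)
    m = re.match(r"(?:\w+\s+)*(\w+)\s*\(", code)
    if m:
        return "call or definition of " + m.group(1)
    return "other"


def record_comments(source):
    """Strip comments, noting the description of the line each one went with."""
    stripped, notes = [], []
    for line in source.splitlines():
        comments = COMMENT.findall(line)
        code = COMMENT.sub("", line).rstrip()
        stripped.append(code)
        desc = describe(code) if code.strip() else "a comment"
        for comment in comments:
            notes.append((desc, comment))
    return stripped, notes


def reattach(translated, notes):
    """Append each comment to the first output line with the same description."""
    out = list(translated)
    leftover = []
    for desc, comment in notes:
        for i, line in enumerate(out):
            if describe(COMMENT.sub("", line)) == desc:
                out[i] = line + " " + comment
                break
        else:
            # No line in the output matched; keep the comment for manual placement.
            leftover.append(comment)
    return out, leftover


# Usage: the "translated" lines stand in for whatever the real translator emits.
source = "main() /* comment2 */\n{\n    int i = 0; /* counter */\n}"
stripped, notes = record_comments(source)
translated = ["int main(void)", "{", "    long i = 0;", "}"]
result, leftover = reattach(translated, notes)
print("\n".join(result))
```

The point of matching on descriptions rather than on tokens is that the
comment never has to be tied to a particular AST node, so it does not
matter whether RPAREN or any other token survives AST construction.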
Andy