[antlr-interest] Preserving ALL comments!

Wed Feb 22 00:50:16 PST 2006

 >To just remove comments/whitespace shouldn't be more than a few lines of
 >code. Just loop through the List
 >of tokens, removing all the ones of type Newline, Whitespace, Comment,
 >and CPPComment. I wouldn't call
 >that a whole "layer", just a few lines of code that you'd quickly write
 >in Java (or whatever) that will accomplish
 >what you want.

 >And the hidden channels seem like the wrong solution...they split the
 >comments and whitespace into
 >a separate stream of tokens to be independently processed. But if you
 >really want to know which
 >comments go with which code, you can't do your processing >independently.
 >You need to treat
 >the comments and whitespace in the context of the stream of "real" 
 >tokens.

We agree that it is easy to remove all Newline, Whitespace and Comment 
tokens from the token stream.
Our problem is that we don't know how to "programmatically" determine 
which comment goes with which code.
So far, our main objective has been to have comments attached as hidden 
tokens to the corresponding nodes in the AST. And at the same time we 
would NOT like to change the grammar file.
For example, if we have:
	…
	main()   /* comment2 */
	…
comment2 has to be "reassigned" not to BLANK, not to RPAREN but to ID 
because, according to the AST construction in the grammar, neither BLANK 
nor RPAREN will be present in the AST.  So, it seems that we have to know 
(from inspecting the grammar and the AST construction) that RPAREN will 
not be in the AST, and skip it as we already skipped the BLANK token.
As far as we can see, if a comment goes with a token that will not be 
present in the AST, we have to go back and reassign the comment to the 
next candidate token (one that will be present in the AST). And yet, we 
don't know whether that new candidate token will be present in the AST.
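
To make the reassignment concrete, here is a minimal sketch of the 
backward walk described above. It does not use ANTLR at all: the Tok 
class, the attach() method and the astTypes set are hypothetical names 
invented for this illustration, and the token list for "main()" 
(including LPAREN) is assumed. The set of token types that survive into 
the AST is simply given by hand here, and obtaining that set 
automatically is exactly the part we do not know how to do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CommentAttacher {

    /** Hypothetical minimal token: a type name and the matched text. */
    static final class Tok {
        final String type;
        final String text;
        Tok(String type, String text) { this.type = type; this.text = text; }
    }

    /** Token types treated as comments (names taken from the quoted mail). */
    static final Set<String> COMMENT_TYPES =
            new HashSet<>(Arrays.asList("Comment", "CPPComment"));

    /**
     * For every comment token, walk backwards to the nearest token whose
     * type is in astTypes and record the comment against that token.
     * astTypes is an ASSUMPTION supplied by the caller: the set of token
     * types known to end up as AST nodes. A comment with no such
     * preceding token is left unattached.
     */
    static Map<Tok, List<Tok>> attach(List<Tok> tokens, Set<String> astTypes) {
        // Identity map: two tokens with the same text are still distinct owners.
        Map<Tok, List<Tok>> owners = new IdentityHashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            Tok comment = tokens.get(i);
            if (!COMMENT_TYPES.contains(comment.type)) {
                continue;
            }
            for (int j = i - 1; j >= 0; j--) {
                Tok candidate = tokens.get(j);
                if (astTypes.contains(candidate.type)) {
                    owners.computeIfAbsent(candidate, k -> new ArrayList<>())
                          .add(comment);
                    break;
                }
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        // Assumed token sequence for: main()   /* comment2 */
        List<Tok> tokens = Arrays.asList(
                new Tok("ID", "main"),
                new Tok("LPAREN", "("),
                new Tok("RPAREN", ")"),
                new Tok("BLANK", "   "),
                new Tok("Comment", "/* comment2 */"));
        // Hand-written assumption: only ID survives into the AST.
        Set<String> astTypes = new HashSet<>(Arrays.asList("ID"));

        for (Map.Entry<Tok, List<Tok>> e : attach(tokens, astTypes).entrySet()) {
            for (Tok c : e.getValue()) {
                System.out.println(e.getKey().text + " <- " + c.text);
            }
        }
    }
}
```

With the hand-written astTypes set, the walk skips BLANK, RPAREN and 
LPAREN and attaches comment2 to the ID token "main". Everything depends 
on that set being correct, which is the information we would otherwise 
have to extract from the grammar.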

Is it possible? Are we asking too much?
Should we reformulate our objective? (To preserve comments as HIDDEN 
tokens attached to "normal" AST nodes).

Thank you for your answer(s).

Damir