[antlr-interest] Preserving ALL comments!

Andy Tripp antlr at jazillian.com
Mon Feb 20 07:05:47 PST 2006


Damir Kirasić wrote:

> Hello,
>
> We want to construct an AST for Standard C programs
> and to have ALL comments preserved.
>
> We started with StdCParser.g from examples
> (we use CommonHiddenStreamToken, CommonASTWithHiddenTokens
> and all the stuff)
> and it works fine but not ALL the comments
> seem to be present in the AST.
>
> For example, for the following snippet:
>
> /* comment1 */
> main()  /* comment2  */
> {
>     printf("Hello");
> }
> /* comment3 */
>
>
> only comment1 can be found in the AST.
>
> It seems that comment2 is tied to RPAREN
> and comment3 to RCURLY.
> RPAREN and RCURLY are not included in the AST
> and comment2 and comment3 are lost.
>
> Am I right?
> How can we get ALL the comments into the AST?
>
> Thank you for your time.
>
>
> Damir
>
What I did was change the grammar so that it does not discard
Whitespace, Newline, Comment, and CPPComment tokens.
Then, after lexing, I make a pass through the token list, using
the whitespace to figure out which piece of code each comment
really seems to go with. For example, your comment1 seems to
go with the "main" token, but if there had been a blank line
after it, it would "go with" the top of the file.

During this same pass through the token list, I also remove all of the
Whitespace, Newline, Comment, and CPPComment tokens.
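
A rough sketch of that pass, in Python rather than as ANTLR code, might
look like the following. Everything in it is an assumption made for
illustration: the Token class, the attach_comments function, the
"leading"/"trailing" lists, and the exact rules (a comment on the same
line as code goes with the preceding token; otherwise it goes with the
next code token unless a blank line intervenes). It is not the actual
implementation, and it does not use any ANTLR classes.

```python
from dataclasses import dataclass, field

# Token type names taken from the lexer rules mentioned above.
COMMENT_TYPES = {"Comment", "CPPComment"}


@dataclass
class Token:
    """Hypothetical token record; not an ANTLR class."""
    type: str
    text: str
    leading: list = field(default_factory=list)   # comments placed before this token
    trailing: list = field(default_factory=list)  # comments placed after this token


def attach_comments(tokens):
    """Attach each comment to a code token and drop all hidden tokens.

    Returns (code_tokens, file_comments), where file_comments holds
    comments that seem to belong to the top of the file.
    """
    code, file_comments, pending = [], [], []
    prev_code = None       # most recent code token, if any
    on_code_line = False   # True until a Newline follows a code token
    newlines = 0           # Newline tokens seen since the last comment or code token

    def flush_pending():
        # Pending comments that turned out not to belong to the next
        # code token go to the previous one, or to the file itself.
        target = prev_code.trailing if prev_code is not None else file_comments
        target.extend(pending)
        pending.clear()

    for tok in tokens:
        if tok.type == "Whitespace":
            continue
        if tok.type == "Newline":
            newlines += 1
            on_code_line = False
            if newlines >= 2 and pending:   # blank line after a comment
                flush_pending()
        elif tok.type in COMMENT_TYPES:
            if on_code_line:
                prev_code.trailing.append(tok.text)  # e.g. comment2
            else:
                pending.append(tok.text)
            newlines = 0
        else:  # ordinary code token
            tok.leading.extend(pending)
            pending.clear()
            code.append(tok)
            prev_code, on_code_line, newlines = tok, True, 0

    flush_pending()  # comments at end of input, e.g. comment3
    return code, file_comments


# Abbreviated token list for the snippet from the question
# (the printf statement is left out to keep the example short).
example = [
    Token("Comment", "/* comment1 */"), Token("Newline", "\n"),
    Token("ID", "main"), Token("LPAREN", "("), Token("RPAREN", ")"),
    Token("Whitespace", "  "), Token("Comment", "/* comment2  */"),
    Token("Newline", "\n"),
    Token("LCURLY", "{"), Token("Newline", "\n"),
    Token("RCURLY", "}"), Token("Newline", "\n"),
    Token("Comment", "/* comment3 */"), Token("Newline", "\n"),
]

code_tokens, file_comments = attach_comments(example)
for t in code_tokens:
    print(t.type, t.leading, t.trailing)
```

In this sketch the comments end up stored on the surviving code tokens,
so the hidden tokens themselves no longer need to be in the list. Note
that comment2 and comment3 still land on RPAREN and RCURLY here, so the
attachment rule would have to pick tokens that actually make it into the
AST; which token to prefer is a design choice this sketch does not
settle.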

Andy

