[antlr-interest] "Comments" token from source to the target language

Terence Parr parrt at cs.usfca.edu
Tue Nov 13 11:04:41 PST 2007


On Nov 12, 2007, at 10:59 PM, Austin Hastings wrote:

> Ter,
>
> I suppose it depends on how he plans to deal with comments. But if  
> the objective is to link the comments to the nearest-entity  
> (statement, subexpression, docblock, etc.) then he's going to have  
> special case handling for comments on essentially every node of  
> *some* tree.

Hi Austin.  Turns out my mechanism is very simple.  Keep a list of  
all tokens, comments included, in order.  AST nodes point into that  
list for their payload.  There is no node in the tree for a comment.  
I just look into the token buffer to get the tokens to the left or  
right of the token associated with an AST node. :)  Simple.
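
Here is a minimal sketch of that lookup, not ANTLR's actual API.  The  
Tok, Node, and Kind types and the tokens(), commentsBefore(), and  
commentsAfter() helpers are all hypothetical names made up for  
illustration, and the "lexer" is just a crude classifier over strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CommentLookup {
    // Hypothetical token kinds for this sketch; not ANTLR's own token types.
    enum Kind { COMMENT, WHITESPACE, OTHER }

    // A token in the complete, ordered token list.
    static final class Tok {
        final Kind kind;
        final String text;
        Tok(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // An AST node whose payload is just an index into the token list.
    static final class Node {
        final int tokenIndex;
        Node(int tokenIndex) { this.tokenIndex = tokenIndex; }
    }

    // Build the full token list, classifying each piece of text crudely.
    static List<Tok> tokens(String... texts) {
        List<Tok> out = new ArrayList<>();
        for (String s : texts) {
            Kind k = s.startsWith("(*") ? Kind.COMMENT
                   : s.trim().isEmpty() ? Kind.WHITESPACE
                   : Kind.OTHER;
            out.add(new Tok(k, s));
        }
        return out;
    }

    // Scan away from the node's token (step -1 = left, +1 = right),
    // collecting comments until an ordinary token or the list end is hit.
    private static List<String> scan(List<Tok> tokens, Node node, int step) {
        List<String> found = new ArrayList<>();
        for (int i = node.tokenIndex + step; i >= 0 && i < tokens.size(); i += step) {
            Tok t = tokens.get(i);
            if (t.kind == Kind.OTHER) break;
            if (t.kind == Kind.COMMENT) found.add(t.text);
        }
        return found;
    }

    static List<String> commentsBefore(List<Tok> tokens, Node node) {
        List<String> found = scan(tokens, node, -1);
        Collections.reverse(found); // restore source order
        return found;
    }

    static List<String> commentsAfter(List<Tok> tokens, Node node) {
        return scan(tokens, node, +1);
    }

    public static void main(String[] args) {
        List<Tok> toks = tokens("(* maybe do something *)", "\n", "if", " ", "(",
                                "a", ">", "1", " ", "(* if a is set *)");
        System.out.println(commentsBefore(toks, new Node(2))); // node for "if"
        System.out.println(commentsAfter(toks, new Node(7)));  // node for "1"
    }
}
```

The tree never changes shape here: a translator that wants the comment  
attached to a node asks the token list at the moment it emits output  
for that node.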

> That is, consider:
>
> (* maybe do something *)
> if (a > 1 (* if a is set *)
>   or b > 1 (* if b is set *)
>  or c > 1) then (* or c *)
> begin
>  doSomething; (* defined in other file *)
> end
>
> The corresponding Java is, in this case, fairly straightforward.  
> But in order to map the comments correctly, he'll have to preserve  
> "shape" as well as sequence.

List<Token> works great :)  Not sure what you mean by shape.  Just as  
comments mess up input structure, they mess up tree structure.   
Remember, they are part of the lexical language but are NOT part of  
the grammatical structure.  A crucial distinction.  It implies they  
should alter neither the parse tree nor the AST.
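
To make the lexical-vs-grammatical split concrete, here is a rough  
sketch loosely modeled on the idea of putting comment tokens on a  
hidden channel.  The Tok class, the channel numbers, and the lexAll()  
and parserView() helpers are assumptions for illustration only, not  
ANTLR's classes or constants.

```java
import java.util.ArrayList;
import java.util.List;

public class HiddenComments {
    // Hypothetical channel numbers for this sketch.
    static final int DEFAULT = 0;
    static final int HIDDEN = 1;

    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    // Lexer side: every token, comments included, lands in one ordered list.
    static List<Tok> lexAll(String... texts) {
        List<Tok> all = new ArrayList<>();
        for (String s : texts) {
            all.add(new Tok(s, s.startsWith("(*") ? HIDDEN : DEFAULT));
        }
        return all;
    }

    // Parser side: only default-channel tokens are visible, so the
    // grammar never needs to mention comments.
    static List<String> parserView(List<Tok> all) {
        List<String> visible = new ArrayList<>();
        for (Tok t : all) {
            if (t.channel == DEFAULT) visible.add(t.text);
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Tok> withComment = lexAll("a", ">", "1", "(* if a is set *)", "or", "b", ">", "1");
        List<Tok> without = lexAll("a", ">", "1", "or", "b", ">", "1");
        // Both inputs look identical to the parser.
        System.out.println(parserView(withComment).equals(parserView(without)));
    }
}
```

The comment is still sitting at its original position in the full  
list, so nothing is lost; it just never reaches the grammar.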

Can you explain how you avoid having COMMENT? everywhere?  I'm  
actually curious to know if I've missed a cool idea.

Ter

