[antlr-interest] "Comments" token from source to the target language

Austin Hastings Austin_Hastings at Yahoo.com
Mon Nov 12 22:59:05 PST 2007


Ter,

I suppose it depends on how he plans to deal with comments. But if the 
objective is to link each comment to the nearest entity (statement, 
subexpression, docblock, etc.), then he's going to need special-case 
handling for comments on essentially every node of *some* tree.

That is, consider:

(* maybe do something *)
if (a > 1 (* if a is set *)
   or b > 1 (* if b is set *)
   or c > 1) then (* or c *)
begin
  doSomething; (* defined in other file *)
end

The corresponding Java is, in this case, fairly straightforward. But in 
order to map the comments correctly, he'll have to preserve "shape" as 
well as sequence.

(IF
    (COND
        (OR
            (a > 1 (COMMENT "if a is set"))
            (b > 1 (COMMENT "if b is set"))
            (c > 1 (COMMENT "or c"))))
    (BLOCK
        (STMT_PROC doSomething (COMMENT "defined in other file")))
    (COMMENT "maybe do something"))

Note that in the AST above, "or c" is likely in the wrong position: in 
the source it follows "then", so it may belong to the whole condition 
rather than to c > 1.
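
To make the flat-token view concrete, here is a minimal Java sketch of 
the "ask for the hidden tokens between real tokens" idea, run over the 
Pascal fragment above. Everything in it is an assumption made for 
illustration: Tok, attach, toJavaComment and example are made-up names, 
not ANTLR API, and the rule "a comment belongs to the nearest real token 
before it" is only one possible policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CommentAttach {
    // Hypothetical stand-in for a lexer token; this is not ANTLR's Token class.
    static final class Tok {
        final String text;
        final boolean hidden; // true for comments sent to an "off" channel
        Tok(String text, boolean hidden) { this.text = text; this.hidden = hidden; }
    }

    // Assumed rule: a comment belongs to the nearest real token before it;
    // comments ahead of the first real token go to that first real token.
    // Returns a map from real-token index to the comments attached to it.
    static Map<Integer, List<String>> attach(List<Tok> toks) {
        Map<Integer, List<String>> out = new TreeMap<>();
        List<String> pending = new ArrayList<>();
        int lastReal = -1;
        for (int i = 0; i < toks.size(); i++) {
            Tok t = toks.get(i);
            if (!t.hidden) {
                if (!pending.isEmpty()) {
                    out.computeIfAbsent(i, k -> new ArrayList<>()).addAll(pending);
                    pending.clear();
                }
                lastReal = i;
            } else if (lastReal < 0) {
                pending.add(t.text);
            } else {
                out.computeIfAbsent(lastReal, k -> new ArrayList<>()).add(t.text);
            }
        }
        return out; // comments in a stream with no real tokens are dropped
    }

    // Rewrites "(* text *)" as "/* text */"; assumes both delimiters are present.
    static String toJavaComment(String pascal) {
        String body = pascal.substring(2, pascal.length() - 2).trim();
        return "/* " + body.replace("*/", "* /") + " */";
    }

    // The Pascal fragment from this message, tokenized by hand.
    static List<Tok> example() {
        String[] src = {
            "(* maybe do something *)", "if", "(", "a", ">", "1", "(* if a is set *)",
            "or", "b", ">", "1", "(* if b is set *)",
            "or", "c", ">", "1", ")", "then", "(* or c *)",
            "begin", "doSomething", ";", "(* defined in other file *)", "end"
        };
        List<Tok> toks = new ArrayList<>();
        for (String s : src) toks.add(new Tok(s, s.startsWith("(*")));
        return toks;
    }

    public static void main(String[] args) {
        List<Tok> toks = example();
        for (Map.Entry<Integer, List<String>> e : attach(toks).entrySet()) {
            for (String c : e.getValue()) {
                System.out.println(toks.get(e.getKey()).text + "  " + toJavaComment(c));
            }
        }
    }
}
```

Under this rule "or c" ends up on "then" and "defined in other file" on 
the ";", which is the point: sequence alone doesn't say which entity a 
comment describes, so some knowledge of the tree's shape is still needed.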

=Austin


Terence Parr wrote:
> Hi gang, doesn't this make it hard on the parser grammar? You must 
> have a COMMENT? subrule after every single token in case there is 
> a comment on the input stream.  I specifically designed hidden 
> channels to avoid this. It's literally a single array of tokens, but 
> the parser ignores tokens on "off channels".
>
> I'm pretty sure recognizing comments is not the right approach, but 
> I'm very open to other options if the "optional COMMENT rule" issue 
> can be addressed.
>
> Ter
>
> On Nov 12, 2007, at 4:38 PM, Austin Hastings wrote:
>
>> Mateus,
>>
>> I'd recommend that you not hide the comment tokens. Instead, 
>> recognize them, translate them, and then include them in your AST. 
>> You should do an early-stage AST rewrite to get the comments out of 
>> the way - attach them to the nodes instead. Then you can go back to 
>> AST rewriting for the language itself.
>>
>> =Austin
>>
>>
>> Mateus Baur da Silva wrote:
>>> Hi Ter,
>>>
>>> I understand that the parser will ignore the tokens if I set the 
>>> token to be sent to the parser through the hidden channel 
>>> ($channel=HIDDEN;).
>>>
>>> By reading your message (and your book), I know I can check the 
>>> hidden channel for the comment tokens inside my actions. However, I 
>>> don't know how to do that. Is there some sample implementing this 
>>> behavior?
>>>
>>> If not, could you (or someone else) let me know how I should 
>>> implement that inside my actions?
>>>
>>> Thanks and Regards,
>>> Mateus
>>>
>>>
>>> On Nov 12, 2007 8:03 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
>>>
>>>
>>>     On Nov 12, 2007, at 11:38 AM, Mateus Baur da Silva wrote:
>>>
>>>     > Hi Guys,
>>>     >
>>>     > As I mentioned in another email, I'm writing a translator from a
>>>     > Pascal subset to Java. Currently, I'm ignoring the "comments" by
>>>     > using skip() on the lexer rule that defines the "comments".
>>>     >
>>>     > However, I would like to translate the comments from Pascal to Java
>>>     > code as well. I was wondering if I could do that by using the
>>>     > HIDDEN_CHANNEL or some other feature to properly translate the
>>>     > comments. Does someone have any clue on how to do that?
>>>     >
>>>
>>>     Yep, use the hidden token thing.  Your actions then ask for the
>>>     hidden tokens between real tokens.  Parser ignores them.
>>>
>>>     Ter
>>>
>>>
>>>
>>
>
>
>


