[antlr-interest] "Comments" token from source to the target language

Tue Nov 13 21:24:05 PST 2007

Mateus Baur da Silva wrote:
> Hi Ter,
> 
> I apologize if the questions are stupid and if I'm taking too long to 
> figure out this stuff. However, I'm a beginner on ANTLR and 
> languages/parsers stuff.
> 
> The problem you mentioned in the other message ("...You must have a 
> COMMENT? subrule after every single token in case there is a comment on 
> the input stream...") is exactly what I'm trying to avoid.
> 
> I still don't understand how I can do that. What else do I need to do 
> besides including the $channel=HIDDEN in the COMMENTS token? What do I 
> need to do on the parser side? Could you show me a real sample, a rule 
> that has this implemented?

You don't need anything special in the parser rules; you just have to 
put comments on a channel other than the default (channel 0), which is 
done in the lexer rule. You can have an arbitrary number of channels, 
so you can either put comments on the HIDDEN channel and throw 
whitespace away with skip() (keeping WS and COMMENTS on the same channel 
will probably mess things up), or use a designated channel for comments. 
Channels are just integers; HIDDEN is 99 in ANTLR 3.
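
To make the channel idea concrete, here is a minimal sketch in plain Java. 
It is a self-contained model, not the ANTLR runtime: the Tok class, the 
onChannel() helper and COMMENT_CHANNEL = 1 are all made up for 
illustration. Only the value 99 for the hidden channel is taken from 
ANTLR 3. It shows that the parser's view is just the token buffer 
filtered to channel 0, while the comments are still in the buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a token buffer with channels (not the ANTLR classes). */
class ChannelSketch {
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 99;  // same value as ANTLR 3's hidden channel
    static final int COMMENT_CHANNEL = 1;  // hypothetical dedicated comment channel

    /** Hypothetical token: just its text and the channel the lexer put it on. */
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    /** Returns the texts of all tokens on one channel, in buffer order. */
    static List<String> onChannel(List<Tok> buffer, int channel) {
        List<String> out = new ArrayList<>();
        for (Tok t : buffer) {
            if (t.channel == channel) out.add(t.text);
        }
        return out;
    }

    /** Token buffer for the Pascal fragment "{ init } x := 1;". */
    static List<Tok> sample() {
        return Arrays.asList(
            new Tok("{ init }", COMMENT_CHANNEL),
            new Tok(" ", HIDDEN_CHANNEL),
            new Tok("x", DEFAULT_CHANNEL),
            new Tok(" ", HIDDEN_CHANNEL),
            new Tok(":=", DEFAULT_CHANNEL),
            new Tok(" ", HIDDEN_CHANNEL),
            new Tok("1", DEFAULT_CHANNEL),
            new Tok(";", DEFAULT_CHANNEL));
    }

    public static void main(String[] args) {
        // What the parser would see: only the default channel.
        System.out.println(onChannel(sample(), DEFAULT_CHANNEL));
        // What an action can still recover from the buffer: the comments.
        System.out.println(onChannel(sample(), COMMENT_CHANNEL));
    }
}
```
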
In your AST, every node has getTokenStartIndex() and getTokenStopIndex(), 
which point to the first and last token of its subtree in the token 
stream. The tree for "1 + 2" will always have 1 as its start token and 
2 as its stop token, no matter what the tree actually looks like. You 
can then get a list of all comment tokens anywhere within the source 
fragment of this tree using 
CommonTokenStream.getTokens(startIndex, stopIndex, COMMENT).
So when you generate the output, you can get the list of comments in all 
'meaningful' locations and output them in the 'appropriate' way; this 
depends heavily on the syntax of the output language. One thing you 
probably want is, when you have a list of statements, to look at the 
comments between statements, i.e. in the token range 
currentStmt.getTokenStopIndex()+1 ... nextStmt.getTokenStartIndex()-1, 
and emit them between the corresponding statements of the output 
language.
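
Here is a rough sketch of that between-statements idea, again in plain 
Java rather than against the real ANTLR classes. Everything in it is an 
assumption made for illustration: Tok and Stmt stand in for the token 
buffer and the AST statement nodes, tokensOfType() imitates what 
getTokens(start, stop, type) is described as doing above, and the 
Pascal-to-Java comment conversion only handles simple { ... } comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model: emit comments found in the token gaps between statements. */
class CommentGapSketch {
    static final int COMMENT = 1, OTHER = 2;  // hypothetical token types

    /** Hypothetical token; its index is its position in the buffer list. */
    static final class Tok {
        final int type; final String text;
        Tok(int type, String text) { this.type = type; this.text = text; }
    }

    /** Hypothetical statement node: token start/stop index plus translated Java. */
    static final class Stmt {
        final int start, stop; final String java;
        Stmt(int start, int stop, String java) { this.start = start; this.stop = stop; this.java = java; }
    }

    /** Texts of all tokens of the given type with index in [start, stop]. */
    static List<String> tokensOfType(List<Tok> buf, int start, int stop, int type) {
        List<String> out = new ArrayList<>();
        for (int i = Math.max(start, 0); i <= stop && i < buf.size(); i++) {
            if (buf.get(i).type == type) out.add(buf.get(i).text);
        }
        return out;
    }

    /** Turns a Pascal "{ text }" comment into a Java block comment. */
    static String toJavaComment(String pascal) {
        return "/* " + pascal.substring(1, pascal.length() - 1).trim() + " */";
    }

    /** Emits each statement, preceded by the comments in the gap before it. */
    static List<String> emit(List<Tok> buf, List<Stmt> stmts) {
        List<String> out = new ArrayList<>();
        int gapStart = 0;  // first token index after the previous statement
        for (Stmt s : stmts) {
            for (String c : tokensOfType(buf, gapStart, s.start - 1, COMMENT)) out.add(toJavaComment(c));
            out.add(s.java);
            gapStart = s.stop + 1;
        }
        // Comments after the last statement.
        for (String c : tokensOfType(buf, gapStart, buf.size() - 1, COMMENT)) out.add(toJavaComment(c));
        return out;
    }

    /** Tokens for "{ init } x := 1; { bump } x := x + 1;" (whitespace omitted). */
    static List<Tok> sampleTokens() {
        return Arrays.asList(
            new Tok(COMMENT, "{ init }"), new Tok(OTHER, "x"), new Tok(OTHER, ":="),
            new Tok(OTHER, "1"), new Tok(OTHER, ";"),
            new Tok(COMMENT, "{ bump }"), new Tok(OTHER, "x"), new Tok(OTHER, ":="),
            new Tok(OTHER, "x"), new Tok(OTHER, "+"), new Tok(OTHER, "1"), new Tok(OTHER, ";"));
    }

    /** The two statements with their token ranges and hand-written translations. */
    static List<Stmt> sampleStmts() {
        return Arrays.asList(new Stmt(1, 4, "x = 1;"), new Stmt(6, 11, "x = x + 1;"));
    }

    public static void main(String[] args) {
        for (String line : emit(sampleTokens(), sampleStmts())) System.out.println(line);
    }
}
```

In a real translator the Stmt ranges would come from the tree nodes and 
the lookup from the token stream; the loop over the gaps would stay 
roughly the same.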

At least that's how I imagine it should work. A real example would be 
nice though ;)

> On Nov 12, 2007 11:19 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
> 
>     Hi. Sure.  Each token has an index into the token buffer via
>     t.getTokenIndex().  Then ask for the token at index -1, -2, etc...
>     You can ask for its channel number too.  Just scan :)
> 
>     Ter
>     On Nov 12, 2007, at 4:13 PM, Mateus Baur da Silva wrote:
> 
>      > Hi Ter,
>      >
>      > I understand that the parser will ignore the tokens if I send them
>      > to the parser on the hidden channel ($channel=HIDDEN;).
>      >
>      > By reading your message (and your book), I know I can check the
>      > hidden channel for the comments token inside my actions. However, I
>      > don't know how to do that. Is there some sample implementing this
>      > behavior?
>      >
>      > If not, could you (or someone else) let me know how I should
>      > implement that inside my actions?
>      >
>      > Thanks and Regards,
>      > Mateus
>      >
>      >
>      > On Nov 12, 2007 8:03 PM, Terence Parr <parrt at cs.usfca.edu> wrote:
>      >
>      > On Nov 12, 2007, at 11:38 AM, Mateus Baur da Silva wrote:
>      >
>      > > Hi Guys,
>      > >
>      > > As I mentioned in one of my other emails, I'm doing a translator
>      > > from a Pascal subset to Java. Currently, I'm ignoring the
>      > > "comments" by using skip() on the lexer rule that defines them.
>      > >
>      > > However, I would like to translate the comments from Pascal to
>      > > Java code as well. I was wondering if I could do that by using the
>      > > HIDDEN_CHANNEL or some other feature to properly translate the
>      > > comments. Does someone have any clue on how to do that?
>      > >
>      >
>      > Yep, use the hidden token thing.  Your actions then ask for the
>      > hidden tokens between real tokens.  Parser ignores them.
>      >
>      > Ter
>      >
>      >
> 
>