[antlr-interest] "Comments" token from source to the target language
Curtis Clauson
NOSPAM at TheSnakePitDev.com
Mon Nov 12 19:56:05 PST 2007
Mateus Baur da Silva wrote:
> As I mentioned in one of my other emails, I am writing a translator
> from a Pascal subset to Java. Currently, I'm ignoring the "comments" by
> using skip() on the lexer rule that defines the "comments".
>
> However, I would like to translate the comments from Pascal to Java code
> as well. I was wondering if I could do that by using the HIDDEN_CHANNEL
> or some other feature to properly translate the comments. Does someone
> have any clue on how to do that?
Another way to look at this is to consider input vs. output. In a
programming-language parser, you parse the input source text into
units that can be compiled or executed. In this context, comments have
no meaning and are skipped or shuttled to the HIDDEN token channel.
However, in your situation, you are translating one source language into
another. In this context, comments not only have meaning, they are part
of the output. As such, they should be handled by the parser as part of
the source language and not punted by the lexer.
To do this, make the components of Pascal comments ordinary tokens,
add parser rules that match each Pascal comment syntax from those
tokens, and use the comment-text tokens to emit Java-style comments.
This allows you to distinguish between single-line and multi-line
comments, and even to prepend " * " to the interior lines of
multi-line comments.
In this manner, the comment tokens are a valid part of the main token
stream and there is no need to use any special code to read alternative
token stream channels.
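As a rough illustration of the emit step only, here is a minimal sketch
in plain Java. It assumes a parser action hands it the full text of one
Pascal comment, delimiters included. The class and method names
(PascalCommentTranslator, body, toJava) are made up for this example and
are not part of ANTLR, and the choice to accept { }, (* *), and // as
comment forms is an assumption about the Pascal subset being translated.

```java
// Hypothetical helper, not part of ANTLR: converts the text of one
// Pascal comment into the equivalent Java comment.
final class PascalCommentTranslator {

    // Strips the Pascal delimiters ({ }, (* *), or //) and returns the
    // trimmed comment body.
    static String body(String pascal) {
        String s = pascal.trim();
        if (s.startsWith("(*") && s.endsWith("*)") && s.length() >= 4) {
            return s.substring(2, s.length() - 2).trim();
        }
        if (s.startsWith("{") && s.endsWith("}") && s.length() >= 2) {
            return s.substring(1, s.length() - 1).trim();
        }
        if (s.startsWith("//")) {
            return s.substring(2).trim();
        }
        throw new IllegalArgumentException("Not a Pascal comment: " + pascal);
    }

    // Emits "// text" for a one-line body, or a block comment with
    // " * " prepended to every line for a multi-line body.
    static String toJava(String pascal) {
        String text = body(pascal);
        String[] lines = text.split("\\R");
        if (lines.length == 1) {
            return text.isEmpty() ? "//" : "// " + text;
        }
        StringBuilder out = new StringBuilder("/*\n");
        for (String line : lines) {
            // Keep "*/" inside the text from closing the Java comment early.
            String t = line.trim().replace("*/", "* /");
            out.append(t.isEmpty() ? " *" : " * " + t).append('\n');
        }
        return out.append(" */").toString();
    }
}
```

A parser action for a comment rule could then call something like
PascalCommentTranslator.toJava(commentText) and write the result to the
output at the point where the comment was matched.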
I hope that helps.
-- Curtis