[antlr-interest] Parsing whole-line comments?
Christian Convey
christian.convey at gmail.com
Sun Jun 6 04:14:14 PDT 2010
>> That is, <beginning of line> <the letter C> <zero or more
>> non-end-of-line characters> <end-of-line>
>>
>> My problem is, to my knowledge ANTLR won't let me define tokens that
>> match on the beginning of a line ('^').
>>
>> Any suggestions?
>
>
> There is no need to match such positions: when you match a certain line (a
> token that ends with a line break), the next character will be the first in
> a (new) line.
> Something like this should do the trick:
>
> grammar Test;
> parse
> : (Comment | Line)+ EOF
> ;
> Comment
> : 'C' ~('\r' | '\n')* (NewLine | EOF)
> ;
> Line
> : ~'C' ~('\r' | '\n')* (NewLine | EOF)
> ;
> fragment
> NewLine
> : '\r'? '\n'
> | '\r'
> ;
Thanks, that may work for my particular language, because I may have
no other tokens that begin with a capital letter 'C'.
But let me wax hypothetical for a minute. Suppose that in other,
non-comment lines, I need to support another token that begins
with a capital C, for example 'CALL'. So my DSL might have a
program like this:
C My test
E CALL FOO
CALL This is a comment because 'C' is in first column.
Any suggestions for how an ANTLR lexer/grammar should handle this?
My impression is that something like Flex, whose token regexes can
anchor on the beginning of a line, would just let me do something
like this (pseudo-notation, not actual Flex syntax):
CommentToken ::= ^C.*$
CallToken ::= ~(^)CALL
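To pin down the behaviour I am after, independent of any particular
tool, here is a rough Python sketch. It is not ANTLR or Flex code: the
function tokenize and the token names COMMENT, CALL and WORD are made up
for illustration, and it assumes that tokens in non-comment lines are
separated by whitespace.

```python
# Hypothetical token kinds, invented for this sketch only.
COMMENT = "COMMENT"
CALL = "CALL"
WORD = "WORD"


def tokenize(text):
    """Return (kind, text) pairs for a program in the toy DSL.

    A 'C' in the first column turns the whole line into one comment
    token, even when the line happens to start with 'CALL'.
    """
    tokens = []
    for line in text.splitlines():
        if line.startswith("C"):
            # Column-0 check comes first, before any keyword matching.
            tokens.append((COMMENT, line))
            continue
        # Anywhere else on a line, 'CALL' is an ordinary keyword.
        for word in line.split():
            kind = CALL if word == "CALL" else WORD
            tokens.append((kind, word))
    return tokens


if __name__ == "__main__":
    program = (
        "C My test\n"
        "E CALL FOO\n"
        "CALL This is a comment because 'C' is in first column.\n"
    )
    for token in tokenize(program):
        print(token)
```

The point is that the column check happens before any keyword matching,
so a 'CALL' at the start of a line never reaches the CALL rule, while a
'CALL' later in a line is still recognized as the keyword.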