[antlr-interest] Newbie trying to tame lexer
Bart Kiers
bkiers at gmail.com
Sat Aug 4 11:20:13 PDT 2012
Hi forumer,
You'd normally create a single rule for a block comment, like this:
BLOCK_COMMENT
: '/*' .* '*/'
;
(Note that in ANTLR 3, `.*` and `.+` are non-greedy by default, so the loop
stops at the first `*/`.)
Be careful, though: you can't end a rule with `.*` or `.+`. With nothing
after it to stop on, it will consume the entire input.
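As a sketch of what the rule could look like in full (assuming ANTLR 3 with
the Java target, where the hidden channel is spelled `HIDDEN`; keep the
spelling your own target uses, such as the `Hidden` in your grammar), you can
write the non-greedy loop explicitly and send the comment to the hidden
channel:

```
// Sketch for ANTLR 3, Java target assumed.
// The explicit greedy=false option spells out what `.*` already does
// by default when something follows it.
BLOCK_COMMENT
  : '/*' ( options { greedy = false; } : . )* '*/'
    { $channel = HIDDEN; }  // optional: keeps comments away from the parser
  ;
```

The closing `'*/'` is what gives the loop a place to stop. Drop the action if
your highlighter wants the comment tokens on the default channel.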
Also, you can't negate a sequence of two characters: `~('*/')` is wrong,
because `~` only applies to a single character or a set of single
characters. And you should never have a lexer rule that can match an empty
string (your `CONTINUE_COMMENT` does that): your lexer might go into an
infinite loop, since there are infinitely many empty strings in any input,
after all.
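If you'd rather spell the comment body out with negation, here is a sketch
(not run against a particular ANTLR release) that only ever negates single
characters. Neither rule can match an empty string, because each one starts
with a literal:

```
// The body is either a non-star character, or a run of stars followed by
// a character that is neither '*' nor '/'. The final run of stars plus '/'
// closes the comment.
BLOCK_COMMENT
  : '/*' ( ~'*' | '*'+ ~('*' | '/') )* '*'+ '/'
  ;

// Making the line ending optional covers a comment on the last line of the
// file, so a single alternative is enough.
LINE_COMMENT
  : '//' ~('\n' | '\r')* ( '\r\n' | '\r' | '\n' )?
  ;
```

Add your `$channel` action to either rule if you want the tokens hidden from
the parser.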
Regards,
Bart.
On Sat, Aug 4, 2012 at 12:58 AM, <forumer at smartmobili.com> wrote:
> Hi,
>
> I would like to use ANTLR to generate a lexer to highlight some keywords
> and comments, and so far
> everything is fine as long as I don't try to handle multiline comments.
> To solve my problem I wrote the following lines:
>
>
> BLOCK_COMMENT
> : '/*'
> ;
>
> CONTINUE_COMMENT
> : ~('*/')* // DOESN'T WORK
> ;
>
> END_BLOCK_COMMENT
> : '*/'
> ;
>
> LINE_COMMENT
> : '//' ~('\n'|'\r')* ('\r\n' | '\r' | '\n')
> {
> $channel = Hidden;
> }
> | '//' ~('\n'|'\r')* // a line comment could appear at the end of the file without CR/LF
> {
> $channel = Hidden;
> }
> ;
>
> The problem is with BLOCK_COMMENT, CONTINUE_COMMENT and
> END_BLOCK_COMMENT rules so my question is:
>
> Once the lexer is inside BLOCK_COMMENT, how do I tell it to move on to the
> CONTINUE_COMMENT rule,
> and then how do I tell CONTINUE_COMMENT to eat everything except '*/'?
>
>
>
> Thanks
>
>