[antlr-interest] Newbie trying to tame lexer

Bart Kiers bkiers at gmail.com
Sat Aug 4 11:20:13 PDT 2012


Hi forumer,

You'd normally create a single rule for a block comment, like this:

BLOCK_COMMENT
 : '/*' .* '*/'
 ;

(Note that `.*` and `.+` are non-greedy by default, so the rule stops at
the first `*/` it finds!)
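
For a highlighter you will probably also want the comment on the hidden
channel, as your LINE_COMMENT already does. A minimal, untested sketch of the
same rule, assuming ANTLR 3 and the C# target (the Java target spells the
constant `HIDDEN`), with the non-greedy behaviour written out explicitly:

```antlr
BLOCK_COMMENT
 : '/*' ( options {greedy=false;} : . )* '*/'
   {
     // Same action your LINE_COMMENT uses; `Hidden` is taken from your grammar.
     $channel = Hidden;
   }
 ;
```

The explicit `greedy=false` option should behave the same as the plain `.*`
above; it only makes the intent visible to the reader.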

Be careful, though: you can't end a rule with `.*` or `.+`. With nothing
after it to stop at, it will consume the entire input.
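
As a sketch of the difference (the commented-out rule is a hypothetical bad
example, not something from your grammar), compare an unbounded wildcard with
a negated set that stops by itself at the end of the line:

```antlr
// Problematic: nothing follows `.*`, so there is nothing for it to stop at.
// LINE_COMMENT : '//' .* ;

// Bounded: the negated set cannot match a line break, so the loop ends there.
LINE_COMMENT
 : '//' ~('\n' | '\r')*
 ;
```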

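Also, you can't negate two characters: `~('*/')` is wrong, because `~` only
negates a single character or a set of single characters, not a
multi-character string. And you should never have a lexer rule that can
match an empty string (your `CONTINUE_COMMENT` does that, since everything
in it is wrapped in `( ... )*`): your lexer might go into an infinite loop.
There are an infinite number of empty strings in any input, after all.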
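
If you would rather spell the comment body out with negated sets than rely on
`.*`, the following untested sketch uses only single-character negation. It
is meant as a replacement for the `.*` version of BLOCK_COMMENT, not an
addition to it, and it assumes ANTLR 3 can build the lookahead needed to
scan past a run of `*` characters:

```antlr
BLOCK_COMMENT
 : '/*'
   ( ~'*'                    // any character that is not a star
   | '*'+ ~('*' | '/')       // stars followed by something other than '/'
   )*
   '*'+ '/'                  // closing stars and slash; handles '**/' too
 ;
```

Because the rule always has to match at least `/*` and `*/`, it can never
match the empty string.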

Regards,

Bart.


On Sat, Aug 4, 2012 at 12:58 AM, <forumer at smartmobili.com> wrote:

> Hi,
>
> I would like to use antlr to generate a lexer to highlight some keyword
> and comments and so far
> everything is fine as long as I don't try to handle multiline comments.
> To solve my problem I wrote the following lines :
>
>
> BLOCK_COMMENT
>      : '/*'
>         ;
>
> CONTINUE_COMMENT
>      : ~('*/')*            // DOESN'T WORK
>      ;
>
> END_BLOCK_COMMENT
>      :   '*/'
>         ;
>
> LINE_COMMENT
>      :   '//' ~('\n'|'\r')*  ('\r\n' | '\r' | '\n')
>              {
>                   $channel = Hidden;
>              }
>      |   // a line comment could appear at the end of the file
>          // without CR/LF
>          '//' ~('\n'|'\r')*
>              {
>                   $channel = Hidden;
>              }
>      ;
>
> The problem is with BLOCK_COMMENT, CONTINUE_COMMENT and
> END_BLOCK_COMMENT rules so my question is:
>
> Once the lexer is inside BLOCK_COMMENT how do I tell him to pass to
> CONTINUE_COMMENT rule
> and then how do I tell CONTINUE_COMMENT to eat everything except '*/' ?
>
>
>
> Thanks
>
>

