[antlr-interest] Lexing nested comments

Gavin Lambert antlr at mirality.co.nz
Wed Feb 10 12:20:14 PST 2010


At 08:35 11/02/2010, Michael Siff wrote:
 > NESTED : '/*' (NESTED | .)* '*/' { $channel = HIDDEN } ;
 >
 >However, the language in question has the need to consider tokens
 >like:
 >
 > /*:bool:*/
 >
 >as a way of specifying explicit type information. Currently, what I
 >have gets the nested comments correctly, but then ignores the
 >/*:bool:*/ as if it is a comment even though I have a separate
 >rule like:
 >
 >  BOOL : '/*:bool:*/' ;
 >
 >Is there an easy way around this problem?

First, ensure your BOOL rule is listed before your NESTED rule. 
When two lexer rules can match the same input, ANTLR gives 
preference to the one listed first, so this may be enough by 
itself to get the behaviour you want.

Failing that, the usual solution to this sort of thing is to be 
more explicit about what you expect a comment to look like. For 
example, if you want to treat anything of the form /*:xxxx:*/ as 
a processing instruction rather than a comment, you can do this:

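// Never matched on its own; declared only so the PROC_INSTR token type exists.
fragment PROC_INSTR: '/*:...:*/';
fragment NESTED: '/*' (NESTED | .)* '*/';
COMMENT: '/*' ( (':') => ':' .* ':*/' { $type = PROC_INSTR; }
              // ordinary comment: hide it, as in the original NESTED rule
              | (NESTED | .)* '*/' { $channel = HIDDEN; }
              )
       ;
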
(Some refinement may be needed to handle error cases such as 
/*:foo*/.)
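
To make the intended behaviour concrete, including the /*:foo*/ 
case, here is a minimal sketch in Python rather than ANTLR. It is 
not generated lexer code, only a hand-written model of what the 
COMMENT rule is meant to decide. The names scan_comment and 
PROC_INSTR_RE, the 'PROC_INSTR'/'COMMENT' strings, and the 
restriction that the text between the colons contains no ':', '*' 
or '/' are all assumptions made for illustration.

```python
import re

# Assumption: a processing instruction is '/*:' + one or more characters
# other than ':', '*' and '/' + ':*/'. Anything else is a plain comment.
PROC_INSTR_RE = re.compile(r"/\*:[^:*/]+:\*/")


def scan_comment(text, pos=0):
    """Classify the token starting at text[pos], which must begin with '/*'.

    Returns (kind, end): kind is 'PROC_INSTR' or 'COMMENT', and end is the
    index just past the token. Raises ValueError if the comment never closes.
    """
    if not text.startswith("/*", pos):
        raise ValueError("not at a comment opener")

    # Try the stricter form first, like listing the specific rule first.
    m = PROC_INSTR_RE.match(text, pos)
    if m:
        return "PROC_INSTR", m.end()

    # Otherwise fall back to a nested comment, tracking nesting depth.
    depth = 0
    i = pos
    while i < len(text):
        if text.startswith("/*", i):
            depth += 1
            i += 2
        elif text.startswith("*/", i):
            depth -= 1
            i += 2
            if depth == 0:
                return "COMMENT", i
        else:
            i += 1
    raise ValueError("unterminated comment")
```

With this model a malformed instruction such as /*:foo*/ simply 
falls through to the comment branch instead of swallowing input 
up to the next ':*/'. Whether that is the right recovery for your 
language is a design choice; reporting an error is equally 
defensible.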


