[antlr-interest] File comments aka. unclosed comments
Jim Idle
jimi at temporal-wave.com
Tue Oct 30 10:03:04 PDT 2007
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Ramon Verbruggen
> Sent: Tuesday, October 30, 2007 7:46 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] File comments aka. unclosed comments
>
> I have spent quite some time trying to figure this out, and searched
> the internet extensively (including the antlr mailing list archives)
> but could not find anything related, so as a last resort I am posting
> this here.
I think I have posted this before, but I couldn't find it in search either
;-).
Here is a lexer construct that deals with embedded (nested) /* comments (in
this case for T-SQL, but the principle is exactly the same, of course). Note
that this rule does not explicitly check for a missing trailing '*/', though
it could be made to do that. I will leave that as an exercise though :-)
// A multiline comment is akin to a C style comment and is bounded
// by /* and */. However the T-SQL lexer allows for, and checks
// embedded comments. See how here we use a fragment rule to define
// the lexical construct, as this does not try to create tokens and
// hence can be called recursively by itself. The actual token making
// rule here then, just calls that fragment rule.
//
ML_COMMENT
    : ML_COMFRAG
      {
          $channel = HIDDEN;
      }
    ;
///////////////////////////////////////////////////////////////////////
// This rule is a fragment so that it can call itself recursively
// and deal with multiple embedded comments.
//
fragment ML_COMFRAG
    :
      '/*' ( options { greedy=false; }
               // The predicate looks for the start of an embedded comment
               // and this triggers a recursive call of this rule
               // and therefore automatically matches /* and */ pairs.
               //
             : {(input.LA(1) == '/' && input.LA(2) == '*')}? ML_COMFRAG
             | .
           )*
      '*/'
    ;
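
For anyone who wants to see the recursion outside of ANTLR, here is a rough hand-written Java sketch of the same idea. It is not what ANTLR generates: the class NestedCommentScanner, the method matchComment, and the exception type are all hypothetical names made up for illustration. It also shows one possible way to do the "exercise" above, by reporting a comment that is opened but never closed.

```java
// Hypothetical hand-written analogue of the ML_COMFRAG rule.
// This is an illustrative sketch, not ANTLR-generated code.
final class NestedCommentScanner {

    // Thrown when a comment is opened but its closing delimiter is missing.
    static final class UnclosedCommentException extends RuntimeException {
        final int openedAt; // offset of the opening delimiter that was never closed

        UnclosedCommentException(int openedAt) {
            super("unclosed comment opened at offset " + openedAt);
            this.openedAt = openedAt;
        }
    }

    // Matches one comment whose opening delimiter begins at 'start'.
    // Returns the offset just past the matching closing delimiter.
    static int matchComment(String text, int start) {
        if (!text.startsWith("/*", start)) {
            throw new IllegalArgumentException("no comment at offset " + start);
        }
        int pos = start + 2; // skip the opening delimiter
        while (pos < text.length()) {
            if (text.startsWith("*/", pos)) {
                // Closing delimiter: this comment is complete.
                return pos + 2;
            }
            if (text.startsWith("/*", pos)) {
                // Same test as the predicate in the grammar: an embedded
                // comment starts here, so recurse and resume after it.
                pos = matchComment(text, pos);
            } else {
                // Any other character, like the '.' alternative.
                pos++;
            }
        }
        // Ran off the end of the input without seeing the close.
        throw new UnclosedCommentException(start);
    }
}
```

Calling NestedCommentScanner.matchComment(text, offset) with offset pointing at a "/*" gives back the offset just past the matching "*/", so a caller can skip the whole comment in one step. Because the sketch recurses once per nesting level, pathologically deep nesting could exhaust the stack; a depth counter in a loop would avoid that.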
Jim