[antlr-interest] parsing only inside of C Comments?

Jim Idle jimi at temporal-wave.com
Tue Jul 31 11:26:05 PDT 2007



> -----Original Message-----
> From: Martin Kortmann [mailto:email at kortmann.de]
> Sent: Tuesday, July 31, 2007 11:10 AM
> To: Jim Idle
> Cc: antlr-interest Interest
> Subject: Re: [antlr-interest] parsing only inside of C Comments?
> 
> Hello Jim,
> 
> Jim Idle wrote:
> > Martin,
> >
> > I should think that the easiest way to do this is to write the grammar
> > that parses those patterns, then write a separate filtering lexer that
> > invokes the parser on the discovered text, in the way that island
> > grammars do when you can invoke them from the lexing phase. There is an
> > example of using island grammars with the Java and C targets (others?)
> > in the downloadable examples jar at the download page:
> 
> I already have a handwritten lexer that skips everything
> outside the C comments and tokenizes everything inside the
> comments to feed the (also handwritten) parser. Now I would
> like to replace the handwritten parser with some other code.
> I am surprised that it is simple to ignore everything
> inside a comment, but it seems not so easy to ignore
> everything outside.

That's where the filtering lexer comes in. Just define the C comment
rule, then when it matches, invoke your new ANTLR parser, with its own
lexer, string stream and so on:

options
{
    filter = true;
}

COMMENT
    :   '/' '*' ( options {greedy=false;} : . )*
        {
            // Use the existing input stream here and invoke your
            // lexer->parser sequence.
        }
        '*' '/'  // The loop stops when it sees '*/', which is
                 // therefore hidden from the new parser.
    ;

LINE_COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
        {
            // Same deal.
        }
    ;

Everything that isn't a comment will just be skipped over.
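
To make the control flow concrete, here is a minimal plain-Java sketch of what the filtering lexer does: skip everything outside comments and hand each comment body to a callback, which stands in for the inner lexer->parser invocation. It does not use ANTLR at all. The class name CommentFilter and the methods scan and commentBodies are made up for illustration, and, like the grammar above, it does not try to recognize comment markers inside string literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the filtering lexer; not generated by ANTLR.
public class CommentFilter {

    // Walks the source once. Text outside comments is skipped one
    // character at a time, the way filter mode skips input that no
    // rule matches. Each comment body goes to the handler.
    public static void scan(String src, Consumer<String> handler) {
        int i = 0;
        while (i < src.length()) {
            if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                if (end >= 0) {
                    // The closing "*/" is not part of the body.
                    handler.accept(src.substring(i + 2, end));
                    i = end + 2;
                    continue;
                }
                // Unterminated comment: treat it as no match.
            } else if (src.startsWith("//", i)) {
                int end = i + 2;
                while (end < src.length()
                        && src.charAt(end) != '\n'
                        && src.charAt(end) != '\r') {
                    end++;
                }
                handler.accept(src.substring(i + 2, end));
                i = end;
                continue;
            }
            i++; // not a comment: skip this character
        }
    }

    // Convenience wrapper that collects the bodies instead of parsing them.
    public static List<String> commentBodies(String src) {
        List<String> bodies = new ArrayList<>();
        scan(src, bodies::add);
        return bodies;
    }

    public static void main(String[] args) {
        String src = "int x; /* a b */ y // c\nz";
        // In the real setup the handler would build a string stream,
        // lexer and parser for the inner grammar and invoke its start rule.
        scan(src, body -> System.out.println("comment body: [" + body + "]"));
    }
}
```

In the generated ANTLR lexer the same thing happens inside the rule actions; the handler here only marks the place where the second lexer and parser would be created and run.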

Jim

