[antlr-interest] Creating a token that spans multiple lines . . .

Jeff Vincent JVincent at Novell.Com
Wed May 28 15:58:34 PDT 2003


I dredged through the FAQ and didn't notice anything specific to this problem (I could have missed it).  Hopefully this isn't obvious; I am just returning to ANTLR after a long absence. Ack!

The basic question I am trying to answer is:  "How do I create a rule to grab everything between two delimiting tokens?"   I am trying to create a construct in my grammar that parses the following:

group (
   id : x.y.z;
   description : This is a free-form blob of text
     that ends with a semicolon
     ;
   enabled : true;
   )
{
   //block
}

I want the values associated with the "description" and "id" literals to be free-form.  In other words, everything between the ":" and the ";" should be returned as is, since the values are just strings.  I created the following literals/rules in my lexer (I've simplified most of it):

tokens {
	GROUP="group";
	ID="id";
	DESC="description";
	ENABLED="enabled";
}
LPAREN : '(';
RPAREN : ')';
COLON : ':';
SEMICOLON : ';';

protected ESCAPE_CHAR :
	'\\' ';'  { $setText(";");  }
	;

TEXT_BLOB :
	':'! ( ESCAPE_CHAR | ~( ';' | '\\') )* ';'!
	{	// Trim surrounding whitespace from the captured text.
		String temp = $getText;
		$setText(temp.trim());
	}
	;
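
To make the intent of TEXT_BLOB concrete, here is a minimal plain-Java sketch of what the rule is meant to produce: the text between ":" and the first unescaped ";", with "\;" collapsed to ";" and the result trimmed. The class and method names (TextBlobSketch, extractBlob) are hypothetical and exist only for illustration; this is not what ANTLR generates.

```java
class TextBlobSketch {
    // Hypothetical helper mirroring the TEXT_BLOB rule above.
    // The input must start at the ':'; scanning stops at the first unescaped ';'.
    static String extractBlob(String input) {
        if (input.isEmpty() || input.charAt(0) != ':') {
            throw new IllegalArgumentException("expected ':' at start");
        }
        StringBuilder out = new StringBuilder();
        int i = 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\') {
                // Like ESCAPE_CHAR, only "\;" is accepted; it becomes a literal ';'.
                if (i + 1 < input.length() && input.charAt(i + 1) == ';') {
                    out.append(';');
                    i += 2;
                } else {
                    throw new IllegalArgumentException("lone backslash at index " + i);
                }
            } else if (c == ';') {
                // Terminating semicolon: drop it and trim, as the rule's action does.
                return out.toString().trim();
            } else {
                out.append(c); // newlines are kept, so the blob may span lines
                i++;
            }
        }
        throw new IllegalArgumentException("missing terminating ';'");
    }

    public static void main(String[] args) {
        System.out.println(extractBlob(": This is a free-form blob of text\n  that ends with a semicolon\n  ;"));
    }
}
```

Note that the sketch, like the rule, starts at the colon itself, which is exactly why the lexer cannot tell a TEXT_BLOB from a plain COLON.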

My parser then has (simplified to help reduce clutter):

	statement :
		(	. . .  //Some other statement rules
		|	GROUP^ LPAREN! groupInfo RPAREN! statement
 		)
		;
	
	groupInfo :
		(	ID^ id:TEXT_BLOB
		|	ENABLED^ COLON! bool SEMICOLON!
		|	DESC^ desc:TEXT_BLOB
		)+ 
		;

This works fine for "description" and "id", but when I hit the ENABLED case, the leading COLON is matched as the start of a TEXT_BLOB token (as expected).  What I really want (in pseudo grammar code) is:

	groupInfo :
		(	ID^ COLON! <allTheStuffInTheMiddle>  SEMICOLON!
		|	ENABLED^ COLON! bool SEMICOLON!
		|	DESC^  COLON! <allTheStuffInTheMiddle>  SEMICOLON!
		)+ 
		;
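
For illustration only, the behavior I am after can be sketched as a hand-written scan in plain Java: the key seen before the colon decides how the text up to the semicolon is treated. This is not ANTLR code; the names GroupInfoSketch and parseGroupInfo are hypothetical, and escape handling for "\;" is left out to keep the sketch short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GroupInfoSketch {
    // Hypothetical helper: scans "key : value;" entries from the text inside the group parentheses.
    static Map<String, String> parseGroupInfo(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        int pos = 0;
        while (true) {
            // Skip whitespace between entries.
            while (pos < body.length() && Character.isWhitespace(body.charAt(pos))) {
                pos++;
            }
            if (pos >= body.length()) {
                break;
            }
            int colon = body.indexOf(':', pos);
            if (colon < 0) {
                throw new IllegalArgumentException("expected ':' after key");
            }
            String key = body.substring(pos, colon).trim();
            int semi = body.indexOf(';', colon + 1);
            if (semi < 0) {
                throw new IllegalArgumentException("missing ';' after " + key);
            }
            String text = body.substring(colon + 1, semi).trim();
            if (key.equals("id") || key.equals("description")) {
                // Free-form: everything between ':' and ';' is kept as one string.
                fields.put(key, text);
            } else if (key.equals("enabled")) {
                // Not free-form: the value must be a boolean literal.
                if (!text.equals("true") && !text.equals("false")) {
                    throw new IllegalArgumentException("enabled expects true or false");
                }
                fields.put(key, text);
            } else {
                throw new IllegalArgumentException("unknown key: " + key);
            }
            pos = semi + 1;
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseGroupInfo("id : x.y.z;\n enabled : true;"));
    }
}
```

The point of the sketch is only that the decision is made by the preceding key, not by the colon.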

Can anyone point me to a FAQ entry (that I may have missed) or offer any hints as to how to grab all the stuff between COLON and SEMICOLON, but only if preceded by ID or DESC?

Much thanks,

Jeff



 
