[antlr-interest] Embedding ANTLR in my grammar

Mark Bednarczyk voytechs at yahoo.com
Sat Feb 3 12:51:31 PST 2007


Hope you guys don't mind if I share some excitement.

I have an interesting idea. I'm embedding ANTLR into my own grammar (NPL)
and adding a few extras to make it work nicely as an NPL file.
 
Here is a sample of the resulting syntax mix; it does not work yet. I'm
still working on the grammar, but I'm getting very excited over here.

Quick NPL intro: NPL is used to convert binary buffers full of network
packet data (byte streams) into an object-oriented packet/header/field tree
structure. Users can access any of the declared members, fields and
headers from the resulting Packet object.
 
But what to do with textual protocols? This has been a major problem in
previous versions of jNetStream. My idea is to embed a full parser and let it
handle all the complexity of parsing text input and then generate the proper
output expected by the runtime:

// HTTP NPL definition with an embedded ANTLR grammar definition
// First creates a parser with HTTP EBNF grammar
// At the end of the NPL definition, we activate the parser

public header HTTP {
	
	/*
	 * RFC 2616
	 * Define a grammar for HTTP messages using ANTLR v3 syntax.
	 * This is a type definition for the parser; the parser is not
	 * instantiated here.
	 */
	public typedef parser HTTPParser {
		generic					// RFC 822 - generic message format
			:	Request | Response	// HTTP/1.1 messages
			;
			
		message
			:	
			
			start 
				(header CRLF)* 
				CRLF
				body?
			;
			
		start
			:	request | status
			;
			
		public final field string 
		request
			:	method SP requestURI SP Version CRLF
				(
					(	generalHeader
					|	requestHeader
					|	entityHeader
					) CRLF
				)*
				CRLF
				body?
			;
				
		public final field uri 
		requestURI
			:	'*'
			|	absoluteURI
			|	absPath
			|	authority
			;
				
		method
			:	'OPTIONS'
			|	'GET'
			|	'HEAD'
			|	'POST'
			|	'PUT'
			|	'DELETE'
			|	'TRACE'
			|	'CONNECT'
			|	extensionMethod
			;
			
		extensionMethod
			:	token
			;
	}; 
	
	/*
	 * Allocate our parser statically; we don't want to allocate one per
	 * header but one for all HTTP headers, and tell the runtime to use
	 * the parser as input.
	 * The parser will export all rules that are marked with the "field"
	 * modifier as fields to the HTTP header. Any of its subrules will
	 * become subfields.
	 * So be careful not to go overboard, or we could potentially create
	 * fields down to individual tokens or, worse, down to each character.
	 */
	public static final HTTPParser parser = new HTTPParser();
	
	/*
	 * We only need to invoke the parser at runtime to kick-start the
	 * parsing.
	 * We pass in a reference to this header so that the parser can
	 * export the fields it generates. The input is already aligned at
	 * the start of the HTTP data and ready for parsing.
	 */
	parser.generic(this);
}

Notice that certain rules are prefixed with "public final field". That
prefix would be filtered out and replaced by an appropriate action that
produces a "field" and exports it into the parent header, which is passed
as an argument on the last line of code. After the filtering, what remains
is a pure ANTLR grammar that can be fed into the ANTLR compiler. NPL
definitions are cross-compiled to normal Java, so the NPL compilation step
already involves invoking javac as a second stage.
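
To make the filtering step concrete, here is a rough sketch in Java. It is
only an illustration under my own assumptions: the class name
FieldModifierFilter and the regex-based approach are hypothetical, nothing
like this exists in NPL or ANTLR yet. It strips the "public final field
<type>" prefix and remembers which rules were marked, but it does not yet
insert the ANTLR action that would export the field.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-processing pass; not part of NPL or ANTLR. */
final class FieldModifierFilter {

    // Matches "public final field <type>" followed by the rule name.
    // Naive on purpose: it would also match inside comments or literals.
    private static final Pattern FIELD_RULE =
        Pattern.compile("public\\s+final\\s+field\\s+(\\w+)\\s+(\\w+)");

    /** Rule name -> declared field type, in declaration order. */
    final Map<String, String> fields = new LinkedHashMap<>();

    /** Returns the grammar text with the field modifiers removed. */
    String strip(String grammar) {
        Matcher m = FIELD_RULE.matcher(grammar);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Remember the marked rule, then keep only the rule name.
            fields.put(m.group(2), m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(2)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Feeding it the requestURI rule from the sample above should leave a plain
"requestURI : ..." rule and record requestURI as a field of type uri. A
real implementation would more likely work on the NPL parse tree than on
raw text.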

If the above were to compile, you would use it from a Java program like
this. The NPL compiler generates Java classes for headers, and fields are
accessible as class members.

	// Capture 10 HTTP packets
	LiveCapture capture = Captures.openLive(10, new PcapFilter("http"));

	for (Packet packet: capture) {

	  // A Java class is generated for every NPL header statement,
	  // with the same name
	  HTTP http = packet.getHeader(HTTP.class);

	  // ANTLR rule "request: ..." turned into HTTP.request() and
	  // HTTP.hasRequest()
	  if (http.hasRequest()) {  
	    System.out.printf("HTTP packet was a request [%s] to URI=%s\n",
	           http.request().toString(), http.requestURI().toString());
	  } else {
	    System.out.printf("HTTP packet was something else\n");
	  }
	}
	capture.close();
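
For the curious, the generated header class could look roughly like the
sketch below. This is a guess at the shape, not actual NPL compiler output:
the class name HttpHeaderSketch, the export() method and the map-based
storage are all assumptions made for illustration, and the accessors return
plain strings instead of real field objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical shape of the class generated for the NPL "header HTTP". */
final class HttpHeaderSketch {

    // Text matched by each rule marked "field", keyed by rule name.
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Called by a (hypothetical) parser action when a field rule matches. */
    void export(String ruleName, String matchedText) {
        fields.put(ruleName, matchedText);
    }

    /** True if the "request" rule matched while parsing this header. */
    boolean hasRequest() {
        return fields.containsKey("request");
    }

    /** Text matched by the "request" rule, or null if it did not match. */
    String request() {
        return fields.get("request");
    }

    /** Text matched by the "requestURI" rule, or null if it did not match. */
    String requestURI() {
        return fields.get("requestURI");
    }
}
```

The parser actions would call export() once per matched field rule, so
hasRequest() is false for a header in which the "request" rule never
matched.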

I think this is a pretty neat way to leverage the power of ANTLR.

Cheers,
Mark...

http://jnetstream.sf.net



