[antlr-interest] Building an HTTP parser

Mon Dec 29 07:02:11 PST 2008

> How embedded?  Are we talking "HC11 with 4K of RAM and 16K of OTP ROM"
> embedded or "Coldfire with all the RAM and FLASH I could ever possibly need"
> embedded?

The device is a PCI card running a PowerPC based CPU with approx 140MB
usable RAM, it runs a stripped BSD based kernel cross-compiled with a
custom GCC compiler. I doubt that we will have a problem running the
code generated by ANTLR/C, and a brief look over the generated code
doesn't raise any red flags at this stage.

> ANTLR (and for that matter the LEXes and YACCs of the world) are eminently
> suitable for doing exactly what you propose.  The "unstructured vs.
> structured" concern you stated is largely beside the point.  Parsing is, by
> definition, the taking of a stream of text with some sort of defined
> structure and interpreting it.  The only variable is how complex and/or
> sloppy the specification of the language in question is.  Parsing HTML with
> tools such as ANTLR should be a relatively straightforward endeavor.

I don't care about the HTML contained in the body, and I might only
ever be interested in breaking up POST body data in an HTTP request. I
am, however, interested in the HTTP request/status line (first line),
the headers (KEY: VALUE) and obtaining the body in its entirety as a
blob. The HTTP spec is pretty daunting, but as far as I can see this
much should simply be a matter of breaking up tokens on the CRLFs in
the input.
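
As a rough sketch of that CRLF-based split (Python purely for
illustration; this is not an ANTLR grammar or ANTLR/C output, and the
function name split_http_message is made up for this example), the
first line, the headers and the body blob could be pulled apart like
this. It assumes the whole message is already in memory and ignores
folded header lines, chunked transfer encoding and Content-Length
handling, all of which a real parser would have to deal with:

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into (start_line, headers, body).

    Hypothetical helper for illustration only.
    """
    # The head (first line plus headers) ends at the first empty line.
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete message: no blank line after headers")

    lines = head.split(b"\r\n")
    start_line = lines[0].decode("latin-1")

    headers = []
    for line in lines[1:]:
        # Each header is "Name: value"; split on the first colon only.
        name, colon, value = line.partition(b":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        headers.append((name.decode("latin-1").strip().lower(),
                        value.decode("latin-1").strip()))

    # The body is returned untouched as an opaque blob.
    return start_line, headers, body


raw = (b"POST /submit HTTP/1.1\r\n"
       b"Host: example.org\r\n"
       b"Content-Length: 7\r\n"
       b"\r\n"
       b"a=1&b=2")
start_line, headers, body = split_http_message(raw)
print(start_line)
print(headers)
print(body)
```

Header names are lower-cased here because HTTP header names are
case-insensitive; the values and the body are left as they arrived.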

> Hope this helps Tom.

Thanks Greg, this does indeed. My next steps will be to try and get my
hands on the ANTLR book, and see if I can hunt down my 10-year-old
varsity course notes on BNF and compiler-compilers.

All the best for 2009!

-- 
http://www.tomwells.org