[antlr-interest] Building an HTTP parser

Mon Dec 29 13:06:37 PST 2008

At 04:02 30/12/2008, Tom Wells wrote:
 >I don't care about the HTML contained in the body, and I might
 >only ever be interested in breaking up POST body data in an
 >HTTP request. I am however interested in the HTTP status (first
 >line), headers (KEY=VALUE) and obtaining the body in its
 >entirety as a blob. The HTTP spec is pretty daunting, but as
 >far as I can see should be simply breaking up tokens based on
 >CRLFs in the input.

Hand-rolling a parser for an HTTP/1.0 envelope (which is usually
sufficient) is fairly trivial, although the spec can be a bit hard
to read in places.  (I've done it a few times before.)
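
As a rough illustration of how little an HTTP/1.0 envelope needs,
here is a minimal Python sketch (not taken from any particular
library).  The function name parse_http_message is made up for
this example.  It assumes the whole message is already in memory,
that lines end in CRLF, and that headers use the "Name: value"
form with optional folded continuation lines; it does not look at
Content-Length or validate the first line.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body).

    Hypothetical helper for illustration only. The body is returned
    as an opaque blob of bytes, exactly as received.
    """
    # The envelope ends at the first blank line (CRLF CRLF).
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete message: no blank line after headers")

    # ISO-8859-1 maps every byte to a character, so decoding cannot fail.
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]  # request line or status line

    headers = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # Folded continuation line: append to the previous header value.
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
            continue
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        # Header names are case-insensitive, so normalise to lower case.
        headers.append((name.strip().lower(), value.strip()))

    return start_line, headers, body


if __name__ == "__main__":
    raw = (b"POST /submit HTTP/1.0\r\n"
           b"Host: example.com\r\n"
           b"Content-Length: 7\r\n"
           b"\r\n"
           b"a=1&b=2")
    print(parse_http_message(raw))
```

Headers are kept as a list of pairs rather than a dict because the
same header name may legitimately appear more than once.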

Supporting HTTP/1.1 (with chunked encoding and persistent
connections) is a bit trickier, but probably still doable.  (I
usually haven't bothered, but it depends on what your specific
requirements are.)
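
To give a feel for the extra work chunked encoding adds, here is a
hedged Python sketch of decoding a chunked body.  The name
decode_chunked is invented for this example, and it assumes the
complete body is already buffered; a real implementation would
read incrementally from the socket.  Chunk extensions and trailer
headers are simply ignored.

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a complete 'Transfer-Encoding: chunked' body.

    Hypothetical helper for illustration only.
    """
    body = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a line holding its size in hexadecimal.
        eol = data.find(b"\r\n", pos)
        if eol < 0:
            raise ValueError("truncated chunk-size line")
        # Anything after ';' is a chunk extension; it is ignored here.
        size_text = data[pos:eol].split(b";", 1)[0].strip()
        size = int(size_text.decode("ascii"), 16)
        if size < 0:
            raise ValueError("negative chunk size")
        pos = eol + 2
        if size == 0:
            break  # last chunk; any trailer headers are not parsed
        chunk = data[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk data")
        body += chunk
        pos += size
        # The chunk data is followed by its own CRLF.
        if data[pos:pos + 2] != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
        pos += 2
    return bytes(body)


if __name__ == "__main__":
    print(decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))
```

Persistent connections are the other half of the problem: the
parser then has to know exactly where one message ends (from
Content-Length or the final zero-size chunk) so it can start on
the next one.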

But as Jim said, there are quite a few HTTP-capable libraries 
already out there, unsurprisingly enough.  He already mentioned 
libcurl; their own site mentions a few more:
   <http://curl.haxx.se/libcurl/competitors.html>

Regardless, while it's certainly *possible* to parse HTTP in 
ANTLR, it's probably not something I would use it for; it seems 
like overkill.