[antlr-interest] big XML file support
Sam Barnett-Cormack
sdb at geekworld.co.uk
Mon May 15 16:01:22 PDT 2006
Brannon King wrote:
> Suppose I have a file that looks like this:
>
> <a>
> <b>
> <c>
> <d /> <d /> <d /> ... For a few GB worth
> </c>
> <c binary="true">
> <![CDATA[ about 10GB of binary data ]]>
> </c>
> </b>
> </a>
>
> I need a parser to go through and build up a structure with the tree but
> without any <d> or binary data. Instead, I just want to record the file
> locations for those and I'll go pull them from the file when I need them. Is
> ANTLR a good tool to do that or am I better off parsing by hand? Or should I
> use Xerces? Or, the real question, does ANTLR have some ability to do
> XML-type structures easily? What are the largest files you've parsed using
> ANTLR? I'm using C++. Thanks for your time.
I know I'm sidestepping the question, but it might be worth using a
dedicated XML parser. Xerces-C++ from Apache is probably a good bet for
you, and it supports SAX, DOM, and DOMasSAX, IIRC. For a file that size
you'd want the SAX side, since it streams events instead of building the
whole tree in memory. Otherwise it's easy to write a DOMasSAX layer
yourself...
Sam