XML parsing (was RE: [antlr-interest] Places where Antlr can be used ....)

Oliver Zeigermann oliver.zeigermann at gmail.com
Fri Jun 24 07:14:41 PDT 2005


On 6/24/05, Scott Stanchfield <scott at javadude.com> wrote:
> > I like this and think it is superior to the parsing part of
> > XPA. But where is the tree transformation part?
> >
> > Oliver
> 
> I'm not doing any tree transformation. I'm planning on keeping this separate
> from XPA, though if you're interested we could work on putting tree
> transformation in ANTXR, or set up XPA as a separate tree transformer that
> could be integrated with ANTXR output. I must admit I haven't looked at the
> XPA tree transformation support very closely, so I don't know how easy this
> would be. If we found the right place for hooks, XPA and ANTXR could work
> together quite nicely.
> 
> The ANTLR AST support still exists in ANTXR, and ANTXR should be able to
> create trees. I haven't tried it yet, though, so I may have inadvertently
> broken something. I don't think that was hit though. Standard ANTLR tree
> parsers should work on ANTXR output, though again, I haven't tried it.

XPA tree transformation works by simply bringing XML into an AST and
doing the processing with ordinary tree transformation rules. The
mapping from XML element to AST type is done using the TokenTypes.txt
file, which is needed to parse the XML into an AST.
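
To make the idea concrete, here is a minimal sketch of the mapping step.
It is not XPA's actual code or API: the class XmlToAstSketch, the Ast node
type and the simplified NAME=number token-types format are all assumptions
made for illustration, and the JDK's DOM parser stands in for whatever XML
reader is really used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XmlToAstSketch {

    /** Hypothetical AST node: an integer token type, the element name, and child nodes. */
    static final class Ast {
        final int type;
        final String text;
        final List<Ast> children = new ArrayList<>();

        Ast(int type, String text) {
            this.type = type;
            this.text = text;
        }

        /** Lisp-style rendering, e.g. (parent child1 child2). */
        String toStringTree() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Ast child : children) sb.append(' ').append(child.toStringTree());
            return sb.append(')').toString();
        }
    }

    /** Reads simplified NAME=number lines (an assumed format); anything else is skipped. */
    static Map<String, Integer> loadTokenTypes(String content) {
        Map<String, Integer> types = new HashMap<>();
        for (String line : content.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq <= 0 || line.startsWith("//")) continue;
            try {
                types.put(line.substring(0, eq).trim(),
                          Integer.parseInt(line.substring(eq + 1).trim()));
            } catch (NumberFormatException ignored) {
                // Not a NAME=number line; skip it.
            }
        }
        return types;
    }

    /** Parses the XML text and converts its root element into an Ast. */
    static Ast parse(String xml, Map<String, Integer> types) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        return toAst(root, types);
    }

    /** Maps one element to a node whose type comes from the token-type table. */
    static Ast toAst(Element element, Map<String, Integer> types) {
        Integer type = types.get(element.getTagName());
        if (type == null) {
            throw new IllegalArgumentException(
                    "No token type for element: " + element.getTagName());
        }
        Ast node = new Ast(type, element.getTagName());
        // Only child elements become AST children; text and comments are ignored here.
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) node.children.add(toAst((Element) c, types));
        }
        return node;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> types = loadTokenTypes("order=4\nitem=5\nnote=6\n");
        Ast ast = parse("<order><item/><item><note/></item></order>", types);
        System.out.println(ast.toStringTree()); // prints (order item (item note))
    }
}
```

Once every element carries a token type like this, a tree grammar can
match on those types the same way it would for any other AST, which is
all the transformation step relies on.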
 
> About 6 weeks ago I realized things would be a lot more user friendly if I
> split from ANTLR. Ter wasn't enthusiastic about it, but understood that it
> could be better served that way.

Have you evaluated bringing this to 3.0? XPA simply works with ANTLR
without modifications, but the ANTLR grammar syntax is less than
optimal for XML parsing. As I said, your solution seems superior in
that respect.

> I now have it working with XMLPULL parsers as well (sooooo much easier than
> the wonderful thread synch we did for SAX, but the validation support is
> weak in most of the XMLPULL parsers I looked at). I've got simple token
> stream versions for Xerces, Crimson and kxml so users don't need to
> configure them for basic XML parsing. They can still use the more complex
> sax and xmlpull streams if full control is desired.

Sounds cool.

> (I pulled out the non-Java code generators. Eventually I may add them back
> in if I get the chance to learn XML parsing on those languages.)
> 
> What I *really* need to do is an XML Schema->ANTXR grammar generator...

That would be cool and very useful for consuming XML into Java, yes.
Cool idea. Might be fun as well. That said, I think DTD is still much
more common than Schema. What about an ANTLR DTD parser? (Maybe even
ANTLR 3.0?)


Oliver

