[antlr-interest] ANTLR3 tutorial

Dominik Holenstein dholenstein at gmail.com
Thu Aug 3 02:04:51 PDT 2006


Martin,

I appreciate your knowledge and expertise on XML.
But I think there is one good reason for developing my own XML parser
with ANTLR: learning!

As a newbie to Java, ANTLR and XML, I can learn a lot by developing my
own XML parser (or my own specific XML parser to extract just parts of
an XML file).

This is just my opinion and I am very happy and thankful that Oliver
has added this tutorial.

Regards,
Dominik





On 8/3/06, Martin Probst <mail at martin-probst.com> wrote:
> Hi,
>
> I'm kind of repeating myself here, but whatever. I consider myself
> quite an expert on XML (heck, I'm developing an XML database!), so
> here are my 0.02 € on this.
>
> First, you should at all times avoid tricking people into writing a
> parser for XML. There is absolutely no reason for this. None
> whatsoever. There are XML parsers out there for all languages and a
> variety of different profiles: SAX, DOM, XML Pull, whatever. They are
> highly optimized and it's extremely unlikely you will get something
> faster using ANTLR. Plus, you totally spoil the whole XML thing (the
> input may not contain processing instructions, comments, CDATA,
> entities, ...; apart from that you don't even support Unicode and
> have quite a few errors in that lexer). XML was invented (among other
> things) to save people from having to write their own parser!
>
> Second, there are appropriate techniques to create bindings from XML
> to the language of choice for custom vocabularies, e.g. XML beans and
> friends, which do all the parsing plus validation and create the
> domain-specific objects. Again, this is faster than anything you can
> write (in a reasonable amount of time), with fewer errors, with
> validation, and some are even language independent (YMMV).
>
> Third, if you really know what you are doing (i.e. you have spent
> years on XML and have a very specific case in which everything is
> different; this is almost certainly not you, whoever might read
> this :-) ), then don't start by writing an XML lexer, but rather use
> an existing one, e.g. any SAX parser you like. Use that one as the
> lexer, convert its events to tokens, and implement the appropriate
> ANTLR interface. This might be even easier if you use XML Pull as the
> underlying technology. This way you might get one of the most
> important things (encoding & Unicode) right, which will save your
> users a lot of pain, and you have also solved the
> escaping/entity/etc. thing. And again, it's going to be a lot faster
> than anything ANTLR can generate.
>
> So please don't tell people how to generate their own XML parser.
> Tell them that they have come to the wrong place and should rather
> use a pre-built XML parser. It's always the same: the old lex/yacc
> people who think "I need to parse something" and start off with a
> compiler toolkit, trying to solve the same problems over and over
> again. And then they fail and complain about XML being complex
> because they didn't get Unicode or entities right ...
>
> Martin
>
> On 02.08.2006 at 19:33, Oliver Zeigermann wrote:
>
> > Hi folks!
> >
> > I finished the first part of my Parsing XML using ANTLR3 tutorial:
> >
> > http://www.antlr.org/wiki/display/ANTLR3/Parsing+XML
> >
> > And the first part:
> >
> > http://www.antlr.org/wiki/display/ANTLR3/Lexer
> >
> > However, and most frustratingly, the Wiki made such a mess of that
> > page that I could not fix it even after an hour of work :( :( :(
> > I'll keep trying to find a solution; any hints are highly appreciated.
> >
> > Anyway, because of this I have the intro and the first part on lexing
> > here as well:
> >
> > http://zeigermann.de/antlr/Intro.html
> > http://zeigermann.de/antlr/Lexing.html
> >
> > Comments/Improvements on the tutorial itself are highly welcome. Are
> > the most important questions answered? Can you even follow? Is this
> > complete crap? Let me know :)
> >
> > Also, stay tuned for the next part, on parsing, which I am already
> > working on.
> >
> > Cheers
> >
> > Oliver
> >
>
>
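
Martin's third point above (let an existing XML parser do the lexing
and turn its events into tokens) could look roughly like the Java
sketch below. It is only an illustration under assumptions: it uses
the StAX pull parser that ships with the JDK (javax.xml.stream) as the
"XML Pull" layer, and the class XmlPullTokens, its Kind enum and its
Token type are hypothetical names made up for this example, not part
of ANTLR or of Oliver's tutorial. To feed an ANTLR parser, the
resulting tokens would still have to be wrapped in whatever token
interface ANTLR expects, which is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not an ANTLR class.
final class XmlPullTokens {

    // Hypothetical token kinds; a real grammar would define its own.
    enum Kind { START_TAG, END_TAG, TEXT }

    // Minimal token: a kind plus the element name or the text content.
    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")";
        }
    }

    // Converts pull-parser events into a flat token list.
    // Encoding, entities and CDATA are handled by the XML parser itself.
    static List<Token> tokenize(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        List<Token> tokens = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            tokens.add(new Token(Kind.START_TAG, reader.getLocalName()));
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            tokens.add(new Token(Kind.END_TAG, reader.getLocalName()));
                            break;
                        case XMLStreamConstants.CHARACTERS:
                        case XMLStreamConstants.CDATA:
                            // Skip whitespace-only text between elements.
                            if (!reader.isWhiteSpace()) {
                                tokens.add(new Token(Kind.TEXT, reader.getText()));
                            }
                            break;
                        default:
                            // Comments, processing instructions etc. are
                            // ignored in this sketch.
                            break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Malformed input is reported as an unchecked exception here.
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        return tokens;
    }
}
```

Calling XmlPullTokens.tokenize("<a>hi &amp; bye</a>") yields a start
tag token, text token(s) in which the entity is already resolved, and
an end tag token. The sketch drops attributes and namespaces to stay
short; a real adapter would have to carry them along as well.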

