[antlr-interest] antlr suitable for xml languages? (like bpel and are there existing grammars)

Scott Stanchfield scott at javadude.com
Wed Dec 14 06:55:21 PST 2005


Context, readability, maintainability and "non-missability".

First, I'll discount DOM right away for memory concerns, readability, and
"missability". Quite simply, you can forget to traverse a child tag...

Now a few details RE SAX:
Context: Instead of having monster if/else in a SAX parser and having to
keep track of "in which context is that <name> tag?", the rules do it for
you.

Readability: Nice grammar vs single large method with long if/else and
accompanying structure to keep track of location.

Maintainability: If you need to add handling for a new subtag or attribute,
is it easier to find the right spot in the grammar or in a hand-written
if/else-based method?

Non-missability: With SAX you can easily have a trailing else that ignores
some tags you didn't mean to ignore. With ANTXR, you must specify each tag.
(You can specify an explicit ANY_TAG if you really need to, but it's more
explicit and can have nested subtags)
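
For comparison, here is a rough sketch of what the hand-written SAX version of
the people example tends to look like. It is only an illustration, not code
from any real project: the class name PeopleSaxSketch, the handler, the parse()
helper and the <note> tag are all made up for this example. Notice the stack
needed to know which context a tag is in, and the trailing else that quietly
swallows the unexpected <note> tag.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, for illustration only.
public class PeopleSaxSketch {

    /** Hand-written handler: it must track its own location in the document. */
    static class PeopleHandler extends DefaultHandler {
        final List<String> seen = new ArrayList<>();
        private final Deque<String> path = new ArrayDeque<>(); // open tags, innermost first
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            String parent = path.peek(); // null at the document root
            path.push(qName);
            text.setLength(0);
            if (parent == null && qName.equals("people")) {
                // root element, nothing to record
            } else if ("people".equals(parent) && qName.equals("person")) {
                seen.add("ssn=" + atts.getValue("ssn"));
            } else if ("person".equals(parent) && qName.equals("firstName")) {
                // text is collected in characters() and recorded in endElement()
            } else if ("person".equals(parent) && qName.equals("lastName")) {
                // same as firstName
            } else {
                // Trailing else: any tag not listed above is silently ignored.
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            path.pop();
            boolean nameTag = qName.equals("firstName") || qName.equals("lastName");
            if ("person".equals(path.peek()) && nameTag) {
                seen.add(qName + "=" + text.toString().trim());
            }
            text.setLength(0);
        }
    }

    /** Parses the XML string and returns what the handler recorded. */
    static List<String> parse(String xml) {
        try {
            PeopleHandler handler = new PeopleHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.seen;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // <note> is a made-up tag that the handler was never taught about.
        String xml = "<people><person ssn=\"111-11-1111\">"
            + "<firstName>Terence</firstName><note>ignored</note>"
            + "<lastName>Parr</lastName></person></people>";
        System.out.println(parse(xml));
    }
}
```

The <note> tag goes through without any complaint, which is the "missability"
problem: nothing tells you the handler skipped something.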


Quick ANTXR example:

XML:
<?xml version="1.0"?>
<people>
  <person ssn="111-11-1111">
    <firstName>Terence</firstName>
    <lastName>Parr</lastName>
  </person>
  <person ssn="222-22-2222">
    <firstName>Scott</firstName>
    <lastName>Stanchfield</lastName>
  </person>
  <person ssn="333-33-3333">
    <firstName>James</firstName>
    <lastName>Stewart</lastName>
  </person>
</people>

ANTXR grammar:
header {
package com.javadude.antlr.sample.xml;
}

class PeopleParser extends Parser;

document
  : <people> EOF;

<people>
  : (<person>)*
  ;

<person> 
  { System.out.println("ssn=" + @ssn); }
  : ( <firstName>
    | <lastName>
    )*
  ;

<firstName>
  : PCDATA
  ;

<lastName>
  : PCDATA
  ;


You can add actions, semantic and syntactic preds, etc, just like in normal
ANTLR, but with some extra syntax help. For a more detailed example, look
near the bottom of the ANTXR page for the GUIParser example.

Of course YMMV, but everyone I've shown this to (& who's actually used it)
thinks it's significantly faster to write and easier to maintain an XML
parser this way.

The best way to find out is to try a few simple parsers and see how it
feels...

Later,
-- Scott



> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Micheal J
> 
> Scott, what would one gain from using ANTXR rather than a 
> standard XML parser validating with an XSD/DTD?
> 
> Micheal
> 