[antlr-interest] Re: XML/XSD parser generators and processing

iank at bearcave.com iank at bearcave.com
Wed Mar 5 09:36:44 PST 2003


>>   As someone pointed out, similar tools seem to have been built for
>>   XML Document Type Definitions (DTDs).  So far, such a tool does not
>>   seem to exist for XML schemas, which represent a more powerful
>>   grammar than DTDs.
>
> I just stumbled over this. I haven't downloaded an eval,
> but it sounds very much like what you're suggesting.
>
> http://www.roguewave.com/developer/evaluations/xol/

  Thanks for posting this.  RogueWave's XML Object Link looks
  interesting.  "Object Link" is sort of an XML ANTLR: it reads an
  XML Schema (rather than an ANTLR grammar) and generates a C++
  parser.  What I'm thinking of is closer to an XML YACC: an XML
  schema is read and a state table is created, which is then used by
  a DFA to parse the XML described by the schema.  This allows dynamic
  loading of schemas without a compilation step, but still supports
  very fast parsing.
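
  To make the state table idea a little more concrete, here is a
  minimal sketch in Python.  Everything in it is an assumption made
  for illustration: the <order> element, its content model (one id,
  then one or more item, then an optional note) and all of the names.
  The table is written by hand; in the scheme above it would be
  generated from the schema at load time.  It only checks the order
  of the direct children of one element, nothing more.

```python
import xml.etree.ElementTree as ET

# Hypothetical states for the content model of <order>:
#   id, item+, note?
START, AFTER_ID, IN_ITEMS, AFTER_NOTE = range(4)

# Hand-written state table: (state, child element name) -> next state.
# A schema-driven tool would build this dictionary at run time.
TRANSITIONS = {
    (START, "id"): AFTER_ID,
    (AFTER_ID, "item"): IN_ITEMS,
    (IN_ITEMS, "item"): IN_ITEMS,
    (IN_ITEMS, "note"): AFTER_NOTE,
}

# States in which the list of children may legally end.
ACCEPTING = {IN_ITEMS, AFTER_NOTE}


def children_valid(child_names, transitions=TRANSITIONS,
                   accepting=ACCEPTING, start=START):
    """Run the DFA over a sequence of child element names."""
    state = start
    for name in child_names:
        key = (state, name)
        if key not in transitions:
            return False          # no transition: child not allowed here
        state = transitions[key]
    return state in accepting


def validate_order(xml_text):
    """Parse the text and check the children of the root element."""
    root = ET.fromstring(xml_text)
    return children_valid([child.tag for child in root])
```

  Since the table is plain data, swapping in a different schema means
  building a different dictionary, with no compilation step.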

  This whole idea is kind of "high concept".  That is, concept without
  details, and as we all know, the devil is in the details.

  I wrote my original note before I had dived deeply into XML
  schemas.  I bought the book "The XML Schema Complete Reference" by
  Binstock et al., which is over 900 pages of documentation on schemas.
  Schemas are far more complicated than parser generator grammars.  So
  creating a parser generator for XML schemas is a major task.

  Sometimes I wonder if parser generator issues ever occurred to the
  people who designed the schemas.  For example, just as it is
  possible to create an ambiguous grammar with ANTLR, an ambiguous
  schema can be created for an XML document.  I have not seen any
  discussion on this issue (that is not to say that it does not exist,
  just that I have not seen it).  Given that creating an error-free
  grammar for a language like Java is a lot of work, it seems that
  schemas might be even worse, since the "grammar" is so complicated.
  I have not used validation on Xerces (or any other XML processor),
  so I don't know how they handle errors for ambiguous "grammars".
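
  As a rough illustration of the kind of ambiguity I mean, here is a
  small Python sketch.  It treats a choice in a content model as a
  list of alternative element sequences and reports any element name
  that begins more than one alternative, much like a one-token
  lookahead conflict between grammar alternatives.  The element names
  and the list representation are made up for the example, and this
  says nothing about what Xerces or any other processor actually
  does.

```python
from collections import defaultdict


def first_conflicts(alternatives):
    """Return {name: [alternative indices]} for every element name
    that starts two or more alternatives of a choice."""
    starts = defaultdict(list)
    for index, sequence in enumerate(alternatives):
        if sequence:                      # skip empty alternatives
            starts[sequence[0]].append(index)
    return {name: idxs for name, idxs in starts.items() if len(idxs) > 1}


# Hypothetical choice: (title, author) | (title, editor).
# On seeing <title>, one element of lookahead cannot pick a branch.
ambiguous = [["title", "author"], ["title", "editor"]]

# Hypothetical choice: (author, title) | (editor, title).
# The first element decides the branch.
clean = [["author", "title"], ["editor", "title"]]
```

  The first choice could be repaired by factoring out the common
  title element, the same way one would left-factor a grammar rule.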

  Ian


 
