[antlr-interest] Re: XPA and ANTLR

ooobles oobles at hotmail.com
Thu Aug 12 19:05:01 PDT 2004


Hi Oliver,

Nice work on XPA!  I looked at the token parser, but it looks like I'd
need to do more work to find start and end tags, etc.  I really like
SAXDrivenASTParser giving me an AST ready to use.

After I turned off some of my debug output, I noticed that the parser
does print error messages saying that additional nodes were found in
the tree.  I was able to hunt down a few extra tokens I had skipped in
the grammar I posted, e.g. MaxLength and MinLength in restrictions.
But it's difficult to pinpoint where in the tree they are this way.
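
One way to narrow down where the extra nodes sit is to walk the tree
separately and record a path to every child the grammar does not handle.
The following is only a minimal sketch in plain Java, not ANTLR or XPA
code: TagNode, UnexpectedNodeReport and the "allowed" table are
hypothetical names made up for illustration, and the table would have to
be filled in by hand from the grammar rules.

```java
import java.util.*;

// Hypothetical stand-in for an AST node; this is NOT XPA's XMLAST class.
class TagNode {
    final String tag;
    final List<TagNode> children = new ArrayList<>();

    TagNode(String tag) { this.tag = tag; }

    TagNode add(TagNode child) { children.add(child); return this; }
}

class UnexpectedNodeReport {
    // "allowed" maps a parent tag to the child tags the grammar handles.
    // Returns a slash-separated path for every child not in that set.
    // A parent with no entry in the table has all of its children reported.
    static List<String> find(TagNode root, Map<String, Set<String>> allowed) {
        List<String> problems = new ArrayList<>();
        walk(root, root.tag, allowed, problems);
        return problems;
    }

    private static void walk(TagNode node, String path,
                             Map<String, Set<String>> allowed,
                             List<String> problems) {
        Set<String> ok = allowed.get(node.tag);
        for (int i = 0; i < node.children.size(); i++) {
            TagNode child = node.children.get(i);
            // The index records the child's position under its parent.
            String childPath = path + "/" + child.tag + "[" + i + "]";
            if (ok == null || !ok.contains(child.tag)) {
                problems.add(childPath);
            }
            walk(child, childPath, allowed, problems);
        }
    }

    public static void main(String[] args) {
        TagNode root = new TagNode("xsd:simpleType")
            .add(new TagNode("xsd:restriction")
                .add(new TagNode("xsd:enumeration"))
                .add(new TagNode("xsd:maxLength")));
        Map<String, Set<String>> allowed = new HashMap<>();
        allowed.put("xsd:simpleType", new HashSet<>(Arrays.asList("xsd:restriction")));
        allowed.put("xsd:restriction", new HashSet<>(Arrays.asList("xsd:enumeration")));
        for (String p : find(root, allowed)) {
            System.out.println("unexpected: " + p);
        }
    }
}
```

The path includes each child's index under its parent, which should make
it easier to match a report back to the XML source than a bare "additional
nodes" message.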


On processing the actual Schema, I've made some progress.  To get the
Schema into a more generic form I've started writing multiple tree
walkers which rearrange the tree.  For instance, I've been able to
remove all "groupDef" and "groupRef" tokens from the tree by capturing
groupDef subtrees and reinserting them at groupRef points.
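
The two-pass idea can be sketched in plain Java as follows.  This is an
illustration under assumptions, not the actual tree walker: SchemaNode
and GroupInliner are hypothetical names, the node class is a stand-in for
the real AST type, and a group is told apart as a definition or a
reference by whether it carries a name or a ref value.  References nested
inside a definition are not expanded here.

```java
import java.util.*;

// Hypothetical stand-in for an AST node; this is NOT XPA's XMLAST class.
class SchemaNode {
    final String tag;   // e.g. "xsd:group"
    final String name;  // name="..." on a definition, otherwise null
    final String ref;   // ref="..." on a reference, otherwise null
    final List<SchemaNode> children = new ArrayList<>();

    SchemaNode(String tag, String name, String ref) {
        this.tag = tag;
        this.name = name;
        this.ref = ref;
    }

    SchemaNode add(SchemaNode child) { children.add(child); return this; }

    // Deep copy, so one definition can be inserted at several references.
    SchemaNode copy() {
        SchemaNode c = new SchemaNode(tag, name, ref);
        for (SchemaNode child : children) {
            c.add(child.copy());
        }
        return c;
    }
}

class GroupInliner {
    // Pass 1: detach top-level group definitions and index them by name.
    static Map<String, SchemaNode> collectGroupDefs(SchemaNode schema) {
        Map<String, SchemaNode> defs = new HashMap<>();
        List<SchemaNode> kept = new ArrayList<>();
        for (SchemaNode child : schema.children) {
            if (child.tag.equals("xsd:group") && child.name != null) {
                defs.put(child.name, child);
            } else {
                kept.add(child);
            }
        }
        schema.children.clear();
        schema.children.addAll(kept);
        return defs;
    }

    // Pass 2: replace each known group reference with copies of the
    // children of the matching definition.
    static void inlineGroupRefs(SchemaNode node, Map<String, SchemaNode> defs) {
        List<SchemaNode> rewritten = new ArrayList<>();
        for (SchemaNode child : node.children) {
            boolean isRef = child.tag.equals("xsd:group") && child.ref != null;
            if (isRef && defs.containsKey(child.ref)) {
                for (SchemaNode defChild : defs.get(child.ref).children) {
                    rewritten.add(defChild.copy());
                }
            } else {
                inlineGroupRefs(child, defs);
                rewritten.add(child);
            }
        }
        node.children.clear();
        node.children.addAll(rewritten);
    }
}
```

Building a new child list and swapping it in at the end avoids modifying
a list while iterating over it.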

Eventually I will run into recursion problems where the AST
refers back to itself.  I assume I will need to handle this
manually in my grammars?
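
A group that references itself, directly or through other groups, would
make naive inlining loop forever, so one option is to check the reference
graph before rewriting the tree.  The following is a minimal sketch of
such a check using a depth-first search; GroupCycleCheck and the "refs"
map (group name to the names of the groups it references) are hypothetical
and would have to be built from the schema in an earlier pass.

```java
import java.util.*;

class GroupCycleCheck {
    // Returns the chain of group names that ends at a repeated group,
    // or an empty list if no group refers back to itself.
    static List<String> findCycle(Map<String, List<String>> refs) {
        Set<String> done = new HashSet<>();
        for (String start : refs.keySet()) {
            List<String> path = new ArrayList<>();
            if (visit(start, refs, done, path)) {
                return path;
            }
        }
        return new ArrayList<>();
    }

    private static boolean visit(String name, Map<String, List<String>> refs,
                                 Set<String> done, List<String> path) {
        if (path.contains(name)) {   // already on the current chain: a cycle
            path.add(name);
            return true;
        }
        if (done.contains(name)) {   // fully explored earlier, known to be safe
            return false;
        }
        path.add(name);
        for (String target : refs.getOrDefault(name, new ArrayList<>())) {
            if (visit(target, refs, done, path)) {
                return true;
            }
        }
        path.remove(path.size() - 1); // backtrack: nothing below refers back
        done.add(name);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("A", Arrays.asList("B"));
        refs.put("B", Arrays.asList("A"));
        System.out.println("cycle: " + findCycle(refs));
    }
}
```

Groups that turn out to be recursive could then be left as references
instead of being inlined.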

I'm also having a few problems cutting AST elements out of the
tree with this approach.  I wrote another message about that.

Thanks for your help,
David.


> Hi David,
> 
> which tokens do you think are skipped?
> 
> Anyway, you could also try to use the XPA token parser. There you can be
> very sure nothing is missed as you will have to consume each and every
> token.
> 
> Oliver
> 
> ooobles wrote:
> 
> > Hi all,
> > 
> > It's been a while since I've got to play with ANTLR in a new way. 
> > I've been happily making up new grammars for a while now. :)  I now
> > have the new challenge of parsing XML. After writing a 1200-line parser
> > that reads over a dom4j tree, I decided there must be a better way. 
> > Thankfully, I found XPA and have started writing a tree walker for
> > reading XML Schemas (XSD).
> > 
> > One thing I find with tree walkers is that I can't be sure if I missed
> > nodes in the tree.  A tree walker can silently skip child nodes
> > because the grammar has already been met.  Is there any way to force
> > the parser to report an error when additional nodes have been found in
> > the tree?
> > 
> > I've been using one of the XPA examples to write the tree parser (see
> > below). It *seems* to read XSD, but I'm quite sure it is skipping
> > some elements.
> > 
> > As an aside, has anyone already written an XSD parser that generates a
> >  nice internal model? :)  I'm guessing I'll need to do a few passes
> > over the XSD files to resolve all the data types, groups, elements, etc.
> > 
> > Thanks,
> > David.
> > 
> > PS I read over a few other messages mentioning that there aren't many
> > fans of XML here.  I'm definitely not a fan either, but when you
> > work in a group that only does XML, you don't get much choice. :)
> > 
> > -------------- XSD Tree Parser ----------------
> > 
> > class ComponentTreeParser extends TreeParser;
> > options 
> > {
> >    buildAST = true;
> >    ASTLabelType = "XMLAST";
> > }
> > 
> > // enable wildcard processing for xtal and 
> > // set wildcard element type to "<wildcard>"
> > tokens 
> > {
> >     "<wildcard>";
> > }
> > 
> > schema : #(c:"<xsd:schema>" ( schemaImport | schemaInclude | element |
> > complexType | complexContent | groupDef )* )
> >     ;
> > 
> > schemaImport : imp:"<xsd:import>" 
> > 	;
> > 
> > schemaInclude : inc:"<xsd:include>"
> > 	;	
> > 
> > groupDef: #("<xsd:group>" sequence )
> > 	;
> > 
> > annotation: #("<xsd:annotation>" documentation )
> > 	;	
> > 
> > documentation: "<xsd:documentation>"
> > 	;
> > 
> > complexType: #( "<xsd:complexType>"  ( annotation | sequence |
> > attribute | simpleContent | complexContent )* )
> > 	;
> > 
> > complexContent: #( "<xsd:complexContent>" extension )
> > 	;
> > 
> > simpleContent: #( "<xsd:simpleContent>" extension )
> > 	;
> > 	
> > extension: #( "<xsd:extension>" ( attribute | sequence ) )
> > 	;
> > 	
> > sequence: #( "<xsd:sequence>" ( sequence | element | choice | groupRef
> > )* )
> > 	;
> > 	
> > choice: #( "<xsd:choice>" ( element | sequence )* )
> > 	;	
> > 
> > groupRef: "<xsd:group>" 	
> > 	;
> > 	
> > attribute: #( "<xsd:attribute>" (simpleType)? )
> > 	;	
> > 	
> > simpleType: #( "<xsd:simpleType>" restriction )	
> > 	;
> > 	
> > restriction: #( "<xsd:restriction>" enumeration )
> > 	;
> > 	
> > enumeration: "<xsd:enumeration>"	
> > 	;
> > 



 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
    antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/
 


