XML parsing (was RE: [antlr-interest] Places where Antlr can be
used ....)
Oliver Zeigermann
oliver.zeigermann at gmail.com
Fri Jun 24 04:12:16 PDT 2005
I like this and think it is superior to the parsing part of XPA. But
where is the tree transformation part?
Oliver
On 6/24/05, Scott Stanchfield <scott at javadude.com> wrote:
> > For example - I already have a completely mind-boggling
> > feature planned to support XML parsing through ANTLR!!
> >
> > PRASHANT
>
> FYI - I'll be releasing a beta of my XML parsing this weekend (if all goes
> as planned). It's an offshoot of ANTLR called ANTXR (ANother Tool for Xml
> Recognition), pronounced "Ant-zer". (I've copied & modified the antlreclipse
> plugin to support this as well.)
>
> (Perhaps we should chat about what you plan and see if it makes more sense
> to integrate with ANTXR or pursue what you're planning)
>
>
> Basically I've modified the ANTLR syntax slightly so you can parse
>
> <?xml version="1.0"?>
> <people>
> <person ssn="111-11-1111">
> <first-name>Terence</first-name>
> <last-name>Parr</last-name>
> </person>
> <person ssn="222-22-2222">
> <first-name>Scott</first-name>
> <last-name>Stanchfield</last-name>
> </person>
> <person ssn="333-33-3333">
> <first-name>James</first-name>
> <last-name>Stewart</last-name>
> <sponge>Haha</sponge>
> <p>This is a <i>nested</i> other tag data</p>
> </person>
> </people>
>
> using the following grammar. (Note: I'm still working on the "any" tag.
> I'm trying to come up with a nice shortcut syntax, but the listed syntax is
> the verbose way of doing it.)
>
> The rules with <ruleName> automatically match the begin and end tag with
> their name. I'm still working on getting tags with dots in their names to
> work this way.
>
> Attributes are referenced using "@attributeName" in an action.
>
> ----------
> header {
> package com.javadude.antlr.xml.sample;
>
> import java.util.List;
> import java.util.ArrayList;
> }
>
> class PeopleParser extends Parser;
>
> document returns [List results = null]
> : results=<people> EOF
> ;
>
> <people> returns [List results = new ArrayList() ]
> { Person p; }
> : (p=<person> {results.add(p);} )*
> ;
>
> <person> returns [Person p = new Person() ]
> {
> String first, last;
> p.setSsn(@ssn);
> }
> : (
> first=<first-name>
> { p.setFirstName(first); }
> |
> last=<last-name>
> { p.setLastName(last); }
> |
> otherTag
> )*
> ;
>
> <first-name> returns [String value=null]
> : pcdata:PCDATA { value = pcdata.getText(); }
> ;
>
> <last-name> returns [String value=null]
> : pcdata:PCDATA { value = pcdata.getText(); }
> ;
>
> otherTag
> : other:OTHER_TAG
> ( otherTag
> | pcData:PCDATA
> )*
> XML_END_TAG
> ;
> ----------
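
For comparison, here is a rough sketch of what the grammar above computes, written against the standard Java DOM API rather than ANTXR. It is not ANTXR's generated code. The `Person` bean is hypothetical (the grammar only shows its three setters), and the class and method names are made up for illustration. Unknown child tags such as <sponge> and <p> are skipped, which is the job `otherTag` does in the grammar.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class PeopleDomSketch {

    // Hypothetical bean: only the setters appear in the grammar's actions.
    static final class Person {
        private String ssn, firstName, lastName;
        void setSsn(String v) { ssn = v; }
        void setFirstName(String v) { firstName = v; }
        void setLastName(String v) { lastName = v; }
        String getSsn() { return ssn; }
        String getFirstName() { return firstName; }
        String getLastName() { return lastName; }
    }

    // Plays the role of the <people> rule: collect one Person per <person> child.
    static List<Person> parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        List<Person> results = new ArrayList<>();
        Element people = doc.getDocumentElement();
        for (Node n = people.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && "person".equals(n.getNodeName())) {
                results.add(person((Element) n));
            }
        }
        return results;
    }

    // Plays the role of the <person> rule: read @ssn, then dispatch on child tags.
    private static Person person(Element e) {
        Person p = new Person();
        p.setSsn(e.getAttribute("ssn"));
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue; // whitespace between tags
            }
            switch (n.getNodeName()) {
                case "first-name": p.setFirstName(n.getTextContent()); break;
                case "last-name":  p.setLastName(n.getTextContent());  break;
                default: break; // any other tag is ignored, like otherTag
            }
        }
        return p;
    }
}
```

Calling `PeopleDomSketch.parse(...)` on the sample document should give three `Person` objects. The point of the comparison is that the grammar states the same dispatch declaratively, one rule per tag.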
>
> This example didn't use namespaces, but you can add something like
>
> options {
> xmlns="http://www.somedomain.com";
> xmlns:stuff="http://www.crunchyfrog.com/plah/foo";
> }
>
> and then use
>
> <someTag> ("somedomain" namespace)
> <stuff:someTag> ("crunchyfrog" namespace)
>
> in the grammar rules.
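
To make the namespace distinction concrete, here is a small sketch using the standard namespace-aware Java DOM parser, not ANTXR itself. It assumes a hypothetical document whose root declares the same two namespaces as the options block above, and it shows that <someTag> and <stuff:someTag> share a local name and differ only in namespace URI. Whether ANTXR resolves tags exactly this way internally is an assumption on my part.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

final class NamespaceSketch {

    // Returns "{namespaceURI}localName" for each child element of the root.
    static List<String> qualifiedChildNames(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
        List<String> names = new ArrayList<>();
        Node root = doc.getDocumentElement();
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add("{" + n.getNamespaceURI() + "}" + n.getLocalName());
            }
        }
        return names;
    }
}
```

Fed a root element carrying `xmlns="http://www.somedomain.com"` and `xmlns:stuff="http://www.crunchyfrog.com/plah/foo"`, the two children come back with the same local name `someTag` under different URIs. The prefix is only shorthand, so a grammar rule keyed on the URI keeps working if the document picks a different prefix.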
>
> I've been using an earlier version of this for several months with huge
> success. I plan to convert my work code to use this new grammar syntax soon
> (it uses the same constructs under the covers).
>
> I used to have the rules look like
>
> person options {xmlTag="person";}
> : ...
> ;
>
> but I thought that was redundant.
>
> Anyway, more when I release it.
>
> Later,
> -- Scott
>
>
>