[antlr-interest] Generating XML from a antlr grammer!

Oliver Zeigermann oliver at zeigermann.de
Sun Jun 17 07:15:44 PDT 2007


Hi Johannes!

To me your solution looks nice. Maybe the error reporting could be
more beautiful ;)

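Anyway, the only things I could find were where the scope begins (it
should be declared on element, in my opinion, so that startTag and
endTag share the same scope instance) and the check for equality in
endTag (in Java, == compares object references; equals() compares the
text).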
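
On the equality check, here is a minimal sketch of why the comparison
matters in Java. The class and method names are made up for
illustration and are not part of the grammar; it only assumes that the
token text is a String object distinct from the stored name.

```java
class StringEqualityDemo {
    // Hypothetical helper: compares references, as the == predicate did.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Hypothetical helper: compares contents, as the equals() predicate does.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String openName = "name";
        // new String(...) guarantees a distinct object with the same characters,
        // standing in for text taken from a different token.
        String closeName = new String("name");
        System.out.println(sameReference(openName, closeName)); // false
        System.out.println(sameContent(openName, closeName));   // true
    }
}
```

Run as written, main prints false and then true.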

This grammar, modified as described, works for me:

parser grammar xmlParser;
options {
    tokenVocab=xmlLexer;
    output=AST;
}

tokens {
    ELEMENT;
    ATTRIBUTE;
}

scope ElementScope {
    String currentElementName;
}

document : element ;

element
scope ElementScope;
    : ( startTag^
            (element
            | PCDATA
            )*
            endTag!
        | emptyElement
        )
    ;

startTag
    : TAG_START_OPEN GENERIC_ID attribute* TAG_CLOSE
            {$ElementScope::currentElementName = $GENERIC_ID.text; }
        -> ^(ELEMENT GENERIC_ID attribute*)
    ;

attribute
    : GENERIC_ID ATTR_EQ ATTR_VALUE
        -> ^(ATTRIBUTE GENERIC_ID ATTR_VALUE)
    ;

endTag!
    // LT(1) is TAG_END_OPEN, so LT(2) is the GENERIC_ID of the end tag.
    : { $ElementScope::currentElementName.equals(input.LT(2).getText()) }?
        TAG_END_OPEN GENERIC_ID TAG_CLOSE
    ;

emptyElement : TAG_START_OPEN GENERIC_ID attribute* TAG_EMPTY_CLOSE
        -> ^(ELEMENT GENERIC_ID attribute*)
    ;
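
To show why the scope belongs on element, here is a rough Java sketch
of the same check done by hand. ANTLR 3 keeps a dynamic scope as a
stack: a fresh instance is pushed when the declaring rule is entered
and popped when it exits, so one instance per element lets the end tag
see the name stored by the start tag. The class name and the event
strings ("<name" opens, "</name" closes) are invented for this example;
this is not the code ANTLR generates.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ElementScopeSketch {
    /**
     * Returns the index of the first end tag that does not match the most
     * recently opened element, events.length if elements are left open,
     * or -1 if every tag is balanced.
     */
    static int firstMismatch(String[] events) {
        // One entry per open element, like one ElementScope per element call.
        Deque<String> scopes = new ArrayDeque<>();
        for (int i = 0; i < events.length; i++) {
            String event = events[i];
            if (event.startsWith("</")) {
                String closing = event.substring(2);
                // Leaving the element pops its scope; compare text with equals().
                if (scopes.isEmpty() || !scopes.pop().equals(closing)) {
                    return i;
                }
            } else {
                // Entering an element pushes a scope holding its name.
                scopes.push(event.substring(1));
            }
        }
        return scopes.isEmpty() ? -1 : events.length;
    }

    public static void main(String[] args) {
        // The sample document from the quoted message, as tag events.
        String[] doc = {"<doc", "<assembly", "<name", "</Name", "</assembly", "</doc"};
        System.out.println(firstMismatch(doc)); // 3, the </Name> tag
    }
}
```

For the sample document the sketch stops at </Name>, the only end tag
whose name differs from its start tag.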



2007/6/11, Johannes Luber <jaluber at gmx.de>:
> Oliver Zeigermann wrote:
> > 2007/6/11, Johannes Luber <jaluber at gmx.de>:
> >
> > Hey, Johannes!
> >
> > Good observation.
> >
> > Maybe we can have a version that checks the order using validating
> > semantic predicates. That really would be a good example for their
> > use.
> >
> > What do you think?
>
> I've created an implementation (see grammar below), but the error
> reporting AND recovery are weak. For the XML file
>
> <doc>
>     <assembly>
>         <name>Util</Name>
>     </assembly>
> </doc>
>
> the following is being output:
>
> "line 3:18 rule endTag failed predicate: {
> $ElementScope::currentElementName == input.LA(2).text }?
> line 4:4 rule endTag failed predicate: {
> $ElementScope::currentElementName == input.LA(2).text }?"
>
> The second message is a conundrum: If the value of currentElementName is
> still "name", why does it accept the closing </doc>? Maybe you have a
> better idea regarding this problem.
>
> Best regards,
> Johannes Luber
>
>
> parser  grammar XMLParser;
>
> options {      tokenVocab=XMLLexer; }
>
> scope ElementScope {
> String currentElementName;
> }
>
> document  : element ;
>
> element
>     : startTag
>         (element
>         | PCDATA
>         )*
>         endTag
>     | emptyElement
>     ;
>
> startTag
> scope ElementScope;
>         :       TAG_START_OPEN GENERIC_ID (attribute)* TAG_CLOSE {
> $ElementScope::currentElementName = $GENERIC_ID.text; }
>         ;
>
> attribute  : GENERIC_ID ATTR_EQ ATTR_VALUE ;
>
> endTag
> scope ElementScope;
>         :       { $ElementScope::currentElementName == input.LT(2).getText() }?
> TAG_END_OPEN GENERIC_ID TAG_CLOSE ;
>
> emptyElement : TAG_START_OPEN GENERIC_ID  (attribute)* TAG_EMPTY_CLOSE ;
>

