[antlr-interest] newbie greedy option question

Jim Idle jimi at temporal-wave.com
Fri Aug 28 05:18:41 PDT 2009


I suggest that you do not pick XHTML as your first grammar: it is
too convoluted and you will get frustrated. Read the getting-started
guides on the wiki. I see others have answered these specific points.
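
For the archive, here is a minimal sketch of how text_element might be
reworked. It assumes ANTLR 3 and the token names from the grammar quoted
below, and it is untested against the full grammar. In a parser rule,
~OPEN_TAG already matches every token type except OPEN_TAG, WS included,
so the "| WS" alternative can never be chosen (that is the error 201).
The remaining warning comes from the loop exit: element* allows one
text_element to follow another, so ANTLR cannot tell whether the next
token continues the current text_element or starts a new one.

    // ~OPEN_TAG covers WS too, so no separate WS alternative is needed.
    // greedy=true spells out the default choice (stay in the loop);
    // as far as I recall it also silences the ambiguity warning.
    text_element
        :    ( options {greedy=true;} : ~OPEN_TAG )+
        ;

Note that the WS lexer rule calls skip(), so the parser never sees
whitespace tokens at all; keeping whitespace inside text would need a
change in the lexer as well.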

Jim

On Aug 28, 2009, at 3:42 AM, stephane richard <kabnot at gmail.com> wrote:

> Hi all.
>
> I'm trying to build a simple XHTML recognizer (for whitespace
> compression) for the purpose of learning ANTLR. Here's a sample of what
> I'd like to match:
>
> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
>    <head>
>        <title>
>            XHTML
>            Example
>        </title>
>    </head>
>    <body>
>        <p>
>            Please Choose a Day:
>            <br /><br />
>            <select name="day">
>                <option selected="selected">Monday</option>
>                <option>Tuesday</option>
>                <option>Wednesday</option>
>            </select>
>        </p>
>    </body>
> </html>
>
>
> This is the grammar:
>
> grammar Html;
>
> options {
>    output=AST;
>    ASTLabelType=CommonTree;
> }
>
> prog    :    tag_element *
>    ;
>
> element    :    tag_element
>    |    text_element
>    ;
>
> tag_element
>    :    open_tag element* close_tag
>    |    empty_tag
>    ;
>
> open_tag:    OPEN_TAG name attribute* CLOSE_TAG
>    ;
>
> close_tag
>    :    OPEN_TAG '/' name CLOSE_TAG
>    ;
>
> empty_tag
>    :    OPEN_TAG name '/' CLOSE_TAG
>    ;
>
> attribute
>    :    namespace? ID '=' '"' (options{greedy=false;}: .)* '"'
>    ;
>
> namespace
>    :    ID ':'
>    ;
>
>
> name    :    ID
>    ;
> text_element
>    :    (~(OPEN_TAG) | WS)+
>    ;
>
>
> ID          : ('a'..'z'|'A'..'Z')+ ;
> INT        : '0'..'9'+ ;
> NEWLINE        : '\r'? '\n' ;
> WS        : (' '|'\t'|'\n'|'\r')+ {skip();} ;
> OPEN_TAG    : '<';
> CLOSE_TAG    : '>';
>
>
> My problem is with the text_element rule. I'd like to match everything,
> including whitespace, until the recognizer finds an OPEN_TAG. While the
> current rule works, it gives me these warnings and this error:
>
> [10:32:12] warning(200): Html.g:43:21: Decision can match input such
> as "WS" using multiple alternatives: 1, 2, 3
> As a result, alternative(s) 3,2 were disabled for that input
> [10:32:12] warning(200): Html.g:43:21: Decision can match input such
> as "{CLOSE_TAG..ID, INT..':'}" using multiple alternatives: 1, 3
> As a result, alternative(s) 3 were disabled for that input
> [10:32:12] error(201): Html.g:43:21: The following alternatives can
> never be matched: 2
>
> How can I handle this case properly?
>
> Regards,
> Kabnot

