[antlr-interest] newbie greedy option question
stephane richard
kabnot at gmail.com
Fri Aug 28 01:42:44 PDT 2009
Hi all,
I'm trying to build a simple XHTML recognizer (for whitespace
compression) as a way of learning ANTLR. Here's a sample of what
I'd like to match:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>
XHTML
Example
</title>
</head>
<body>
<p>
Please Choose a Day:
<br /><br />
<select name="day">
<option selected="selected">Monday</option>
<option>Tuesday</option>
<option>Wednesday</option>
</select>
</p>
</body>
</html>
This is the grammar:
grammar Html;
options {
    output=AST;
    ASTLabelType=CommonTree;
}
prog    : tag_element *
        ;
element : tag_element
        | text_element
        ;
tag_element
        : open_tag element* close_tag
        | empty_tag
        ;
open_tag: OPEN_TAG name attribute* CLOSE_TAG
        ;
close_tag
        : OPEN_TAG '/' name CLOSE_TAG
        ;
empty_tag
        : OPEN_TAG name '/' CLOSE_TAG
        ;
attribute
        : namespace? ID '=' '"' (options{greedy=false;}: .)* '"'
        ;
namespace
        : ID ':'
        ;
name    : ID
        ;
text_element
        : (~(OPEN_TAG) | WS)+
        ;
ID      : ('a'..'z'|'A'..'Z')+ ;
INT     : '0'..'9'+ ;
NEWLINE : '\r'? '\n' ;
WS      : (' '|'\t'|'\n'|'\r')+ {skip();} ;
OPEN_TAG  : '<';
CLOSE_TAG : '>';
My problem is with the text_element rule. I'd like to match everything
up to the next OPEN_TAG, including whitespace. The current rule works,
but ANTLR reports these warnings and this error:
[10:32:12] warning(200): Html.g:43:21: Decision can match input such
as "WS" using multiple alternatives: 1, 2, 3
As a result, alternative(s) 3,2 were disabled for that input
[10:32:12] warning(200): Html.g:43:21: Decision can match input such
as "{CLOSE_TAG..ID, INT..':'}" using multiple alternatives: 1, 3
As a result, alternative(s) 3 were disabled for that input
[10:32:12] error(201): Html.g:43:21: The following alternatives can
never be matched: 2
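A note on these messages, offered as a reading rather than a confirmed
diagnosis: in a parser rule, ~(OPEN_TAG) matches any token other than
OPEN_TAG, and that set already contains WS, so the separate WS
alternative can never be chosen. In addition, WS calls skip(), so no WS
token ever reaches the parser in the first place. The variant below is
an untested sketch of one direction to try. It assumes the Java target,
keeps a single alternative in the loop, marks the loop greedy to resolve
the loop-or-exit choice explicitly, and sends whitespace to the hidden
channel so that it stays in the token stream without being seen by the
parser rules.

```
// Sketch only: would replace text_element and WS in the grammar above.
text_element
        : (options {greedy=true;} : ~OPEN_TAG)+   // ~OPEN_TAG already covers every other token
        ;

// Assumption: Java target. Hidden-channel tokens are kept in the token
// stream but are not seen by parser rules.
WS      : (' '|'\t'|'\n'|'\r')+ {$channel=HIDDEN;} ;
```

With this variant, whitespace between the first and last token of a
text run should still be recoverable from the token stream, while
leading and trailing whitespace would need separate handling.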
How could I handle this case properly?
Regards,
Kabnot