[antlr-interest] Why do html comments ruin my grammar?
Gavin Lambert
antlr at mirality.co.nz
Sat Jun 30 03:16:41 PDT 2007
At 19:03 30/06/2007, Ruth Karl wrote:
>Hi, I wonder if this message has ever been read or if I shall
>send it again? Does anyone have an idea about this problem? I
>really need some help there....
[...]
>> But when I add the lexer rule
>>
>> HTMLCOMMENT : '<!--' ( options {greedy=false;} : . )*
>> '-->' {$channel=HIDDEN;} ;
>>
>> to my grammar (see attachment), the interpreter in ANTLRworks
>> will start to see '<!' (like in '<!DOCTYPE html ...') as part
>> of a TEXT item, even though TEXT is defined as
>>
>> TEXT options {greedy=false;}
>>   : (~('<'|'>'|'%'|'/'|'"'|'\''|'('|')'|'['|']'|'{'|'}'|'\n'|'\t'|'\r'))+
>>   ;
>>
>> which is confusing not only me but the parser as well... ;-)
Try removing the greedy option from the TEXT rule. I don't think
the option will actually work as intended there, since TEXT is a
top-level lexer rule and nothing follows the (...)+ loop within
the rule itself. (Though I could be wrong.)
But anyway, with those two rules you've posted, the ! will match
TEXT, since ! is not one of the characters TEXT excludes, assuming
the < has already matched some other token.
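As a rough sketch of the suggestion above, this is what the TEXT rule
might look like with the greedy option dropped. The excluded character
set is copied unchanged from the quoted rule; the rest of the grammar
is assumed to stay as it is, and whether this alone changes what the
ANTLRworks interpreter shows has not been verified.

```
// TEXT without the greedy option (sketch only; the rest of the grammar is assumed unchanged).
// Note that '!' is not in the excluded set, so TEXT can still match it
// once '<' has been consumed by some other token.
TEXT
    : ( ~('<'|'>'|'%'|'/'|'"'|'\''|'('|')'|'['|']'|'{'|'}'|'\n'|'\t'|'\r') )+
    ;
```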