[antlr-interest] Why do html comments ruin my grammar?
Ruth Karl
ruth.karl at gmx.de
Sat Jun 30 04:23:52 PDT 2007
Gavin Lambert wrote:
> At 19:03 30/06/2007, Ruth Karl wrote:
> >Hi, I wonder if this message has ever been read, or if I should
> >send it again? Does anyone have an idea about this problem? I
> >really need some help there....
> [...]
> >> But when I add the lexer rule
> >>
> >> HTMLCOMMENT : '<!--' ( options {greedy=false;} : . )*
> >> '-->' {$channel=HIDDEN;} ;
> >>
> >> to my grammar (see attachment), the interpreter in ANTLRworks
> >> will start to see '<!' (like in '<!DOCTYPE html ...') as part
> >> of a TEXT item, even though TEXT is defined as
> >>
> >> TEXT options {greedy=false;}
> >> :
> >> (~('<'|'>'|'%'|'/'|'"'|'\''|'('|')'|'['|']'|'{'|'}'|'\n'|'\t'|'\r'))+
> >> ;
> >>
> >> which is confusing not only me but the parser as well... ;-)
>
> Try removing the greedy option from the TEXT rule. I don't think it
> will actually work there, since that's a top-level lexer rule and you
> don't have any following characters within the rule itself. (Though I
> could be wrong.)
>
> But anyway, with those two rules you've posted, the ! will match TEXT,
> assuming the < has already matched some other token.
>
Hi Gavin,
thanks a lot for your help. Removing the greedy option from the TEXT rule
did not help, but I have now found a solution myself, and it is so simple:
I just added another lexer rule that matches the whole DOCTYPE declaration:
DOCTYPE : '<!DOCTYPE' ( options {greedy=false;} : . )* '>' ;
Thanks anyway, and have a nice day,
Ruth
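
For readers who want to see why '!' ends up in TEXT, and why a dedicated
DOCTYPE rule avoids it, here is a rough Python sketch. It is not ANTLR and
does not reproduce how an ANTLR 3 lexer predicts tokens: it uses regular
expressions and a plain longest-match policy purely to show which characters
each rule is able to consume. The LT and GT rules are assumptions standing in
for whatever the attached grammar uses to match '<' and '>'. The names
tokenize, BASE_RULES and WITH_DOCTYPE are made up for this illustration.

```python
import re

# Characters excluded by the TEXT rule in the post. Note that '!' is not
# among them, so TEXT is free to consume it.
EXCLUDED = "<>%/\"'()[]{}\n\t\r"
TEXT = "[^" + re.escape(EXCLUDED) + "]+"

# Non-greedy bodies stand in for ( options {greedy=false;} : . )*
HTMLCOMMENT = r"<!--.*?-->"
DOCTYPE = r"<!DOCTYPE.*?>"

# LT and GT are hypothetical: the real grammar was an attachment.
BASE_RULES = [("LT", "<"), ("GT", ">"),
              ("HTMLCOMMENT", HTMLCOMMENT), ("TEXT", TEXT)]
WITH_DOCTYPE = [("DOCTYPE", DOCTYPE)] + BASE_RULES


def tokenize(source, rules):
    """Return (rule name, matched text) pairs using longest match.

    This is a simplification chosen for the sketch, not ANTLR's algorithm.
    Ties go to the rule listed first.
    """
    compiled = [(name, re.compile(pattern, re.DOTALL)) for name, pattern in rules]
    tokens, pos = [], 0
    while pos < len(source):
        best_name, best_end = None, pos
        for name, regex in compiled:
            match = regex.match(source, pos)
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, source[pos:best_end]))
        pos = best_end
    return tokens


if __name__ == "__main__":
    # Without a DOCTYPE rule, '<' is its own token and '!' starts a TEXT token.
    print(tokenize("<!DOCTYPE html>", BASE_RULES))
    # With the DOCTYPE rule, the whole declaration becomes one token.
    print(tokenize("<!DOCTYPE html>", WITH_DOCTYPE))
```

Under these assumptions, the first call splits the declaration into '<',
'!DOCTYPE html' and '>', which matches Gavin's remark that '!' will match
TEXT once '<' has gone to some other token. The second call yields a single
DOCTYPE token, which is the effect of the rule added above.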