[antlr-interest] Why do html comments ruin my grammar?
Ruth Karl
ruth.karl at gmx.de
Sat Jun 30 00:03:25 PDT 2007
Hi, I wonder whether this message has been read, or whether I should send it
again. Does anyone have an idea about this problem? I really need some
help here....
Thanks.
Ruth
Ruth Karl wrote:
> Hello out there, I need some help....
>
> I have spent hours trying to find a way to exclude HTML comments from
> further analysis with my JSP parser.
> But when I add the lexer rule
>
> HTMLCOMMENT : '<!--' ( options {greedy=false;} : . )* '-->'
> {$channel=HIDDEN;} ;
>
> to my grammar (see attachment), the interpreter in ANTLRWorks
> starts to treat '<!' (as in '<!DOCTYPE html ...') as part of a TEXT
> item, even though TEXT is defined as
>
> TEXT options {greedy=false;}
> :
> (~('<'|'>'|'%'|'/'|'"'|'\''|'('|')'|'['|']'|'{'|'}'|'\n'|'\t'|'\r'))+
> ;
>
> which is confusing not only me but the parser as well... ;-)
>
>
> For the same reason, adding the HTMLCOMMENT lexer rule also causes
> problems with the generated C# (!) code:
>
> A MismatchedTokenException is thrown in the mHTMLCOMMENT() method of
> the lexer class when it reaches the line
> Match("<!--");
>
> I thought I should somehow add a backtracking option and exception
> handling there, but I could not find out how... (the backtracking option
> does not seem to be allowed...???)
>
>
>
> I would really appreciate any kind of help, thanks a lot in advance!
> Ruth
>
>
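To make the intended behaviour concrete, here is a minimal sketch outside ANTLR, in Python. It is not the grammar from the attachment and not a fix for the generated lexer. The names HTML_COMMENT and tokenize are hypothetical, and the "OTHER" kind merely stands in for whatever the remaining lexer rules would produce. The point it illustrates: the scanner should commit to a comment only once the full '<!--' opener (and a closing '-->') is present, so that '<!DOCTYPE ...' falls through to the other rules.

```python
import re

# Non-greedy comment pattern, analogous in spirit to
#   '<!--' ( options {greedy=false;} : . )* '-->'
# DOTALL lets '.' span newlines, as '.' does in the lexer rule.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def tokenize(source):
    """Hypothetical toy scanner yielding (kind, text) pairs.

    A comment is recognised only when the whole pattern matches at the
    current position; anything else is passed through as "OTHER" up to
    the next possible comment opener.
    """
    pos = 0
    while pos < len(source):
        match = HTML_COMMENT.match(source, pos)
        if match:
            yield ("HTMLCOMMENT", match.group())
            pos = match.end()
        else:
            # Not a comment here (e.g. '<!DOCTYPE'): consume up to the
            # next '<!--' or to the end of the input.
            nxt = source.find("<!--", pos + 1)
            end = nxt if nxt != -1 else len(source)
            yield ("OTHER", source[pos:end])
            pos = end


def visible_text(source):
    """Drop comment tokens, roughly what $channel=HIDDEN achieves."""
    return "".join(text for kind, text in tokenize(source) if kind != "HTMLCOMMENT")
```

For example, list(tokenize('<!DOCTYPE html><!-- a --><p>x</p>')) keeps the DOCTYPE in an "OTHER" token and isolates only '<!-- a -->' as a comment.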