[antlr-interest] Why do html comments ruin my grammar? ;-)
Ruth Karl
ruth.karl at gmx.de
Thu Jun 28 07:55:43 PDT 2007
Hello out there, I need some help.
I have spent hours trying to find a way to exclude HTML comments from
further analysis by my JSP parser.
But when I add the lexer rule
HTMLCOMMENT : '<!--' ( options {greedy=false;} : . )* '-->'
{$channel=HIDDEN;} ;
to my grammar (see attachment), the interpreter in ANTLRWorks starts
to treat '<!' (as in '<!DOCTYPE html ...') as part of a TEXT item, even
though TEXT is defined as
TEXT options {greedy=false;}
:
(~('<'|'>'|'%'|'/'|'"'|'\''|'('|')'|'['|']'|'{'|'}'|'\n'|'\t'|'\r'))+
;
which is confusing not only me but the parser as well... ;-)
For the same reason, adding the HTMLCOMMENT lexer rule also causes
problems with the generated C# (!) code:
a MismatchedTokenException is thrown in the mHTMLCOMMENT() method of
the lexer class when it reaches the line
Match("<!--");
I thought I should somehow add a backtracking option and exception
handling there, but I could not find out how (a backtracking option
does not seem to be allowed there?).
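To make clear what I expect from the HTMLCOMMENT rule, here is a minimal
sketch in plain Java. It only illustrates the intended behavior and is not
the code ANTLR generates; the class and method names (HtmlCommentSketch,
startsComment, stripComments) are made up for this example. A comment
should start only where the full '<!--' is present, so '<!DOCTYPE' is left
alone, and it should end at the first '-->'.

```java
// Hypothetical illustration of the intended behavior; NOT ANTLR-generated code.
final class HtmlCommentSketch {

    private static final String OPEN = "<!--";
    private static final String CLOSE = "-->";

    /** True only if the full "<!--" starts at pos, so "<!DOCTYPE" does not qualify. */
    static boolean startsComment(String input, int pos) {
        return input.startsWith(OPEN, pos);
    }

    /** Returns the input with every HTML comment removed (stops at the first "-->"). */
    static String stripComments(String input) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            if (startsComment(input, pos)) {
                int end = input.indexOf(CLOSE, pos + OPEN.length());
                if (end < 0) {
                    // Assumption of this sketch: an unterminated comment is kept as text.
                    out.append(input, pos, input.length());
                    break;
                }
                // Skip the whole comment, including the closing "-->".
                pos = end + CLOSE.length();
            } else {
                out.append(input.charAt(pos));
                pos++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The DOCTYPE stays, the comment disappears.
        System.out.println(stripComments("<!DOCTYPE html><!-- hidden --><p>text</p>"));
    }
}
```

In other words, the decision to enter the comment rule should depend on all
four characters of '<!--', not just on '<!'.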
I would really appreciate any kind of help, thanks a lot in advance!
Ruth
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: JSP.g
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20070628/c29eb673/attachment-0001.pl