[antlr-interest] Languages within HTML

Darien Hager darien.hager at etelos-inc.com
Thu Jan 31 10:15:23 PST 2008


I'm experimenting with ANTLR to try to solve a particular problem, and I'd
like to check some assumptions and ask for any suggestions.

Situation: I have an HTML file with boundaries defining blocks of embedded
code, such as PHP and JSP. More than one language can be embedded.

Suppose PHP blocks are encapsulated with <? ?> markers, and JSP blocks in <%
%> markers.

What I want to do is analyze the file and create an AST that begins with a
line of siblings, one for each segment (e.g. HTML, PHP, HTML, JSP, PHP, HTML,
PHP).

However, I don't want it to be so naive that a properly quoted end-marker is
wrongly matched, e.g.: <? echo("?>"); ?>
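
To make the pitfall concrete, here is a minimal sketch in Python (not ANTLR)
of a splitter that produces the flat list of segments and skips end-markers
inside string literals. It is an illustration under loud assumptions, not a
robust solution: the names MARKERS and split_segments are hypothetical, only
single- and double-quoted strings with backslash escapes are recognized, and
comments, heredocs, and any other language-specific constructs are ignored.

```python
# Hypothetical marker table: opening marker -> (segment label, closing marker).
MARKERS = {"<?": ("PHP", "?>"), "<%": ("JSP", "%>")}


def split_segments(text):
    """Return a flat list of (label, source) pairs in document order."""
    segments = []
    pos = 0
    n = len(text)
    while pos < n:
        # Locate the nearest opening marker at or after pos.
        starts = [(text.find(m, pos), m) for m in MARKERS]
        starts = [(i, m) for i, m in starts if i != -1]
        if not starts:
            segments.append(("HTML", text[pos:]))
            break
        start, opener = min(starts)
        if start > pos:
            segments.append(("HTML", text[pos:start]))
        label, closer = MARKERS[opener]
        body_start = start + len(opener)
        i = body_start
        quote = None  # the quote character of the string we are inside, if any
        while i < n:
            ch = text[i]
            if quote:
                if ch == "\\":
                    i += 2  # skip the escaped character (assumed escape rule)
                    continue
                if ch == quote:
                    quote = None
            elif ch in "\"'":
                quote = ch
            elif text.startswith(closer, i):
                break  # a real end-marker, outside any string literal
            i += 1
        # An unterminated block simply runs to the end of the input.
        segments.append((label, text[body_start:i]))
        pos = i + len(closer)
    return segments
```

Called on the example above, split_segments('<? echo("?>"); ?>') yields a
single PHP segment, whereas a plain search for the first "?>" would stop
inside the string. Each construct the sketch ignores is another place where
it can still be fooled.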

Question: Is the only robust way to do this to create (or re-use) grammars
for PHP and JSP? I'm assuming the answer is yes, in which case it's no longer
a small experiment.

-- 
Darien Hager
Developer
Etelos, Inc.
darien at etelos.com

http://www.etelos.com
"Revolutionizing the way applications are developed, distributed and
consumed."
