[antlr-interest] Languages within HTML

Monty Zukowski monty at codetransform.com
Thu Jan 31 12:24:02 PST 2008


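You could probably get pretty far without full grammars: for each
embedded language, handle just the strings, comments & escape sequences,
since those are the main places an end marker can show up without
actually closing the block.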
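
To make that concrete, here is a rough sketch in plain Python rather than
ANTLR. It only illustrates the idea under some assumptions: the marker
table, the names find_close and segment, and the choice to recognise only
quoted strings with backslash escapes and /* */ comments are all made up
for this example. Real PHP and JSP have more cases (heredocs, line
comments, and so on) that would need their own rules.

```python
from typing import List, Tuple

# Hypothetical marker table: open marker -> (segment label, close marker).
MARKERS = {"<?": ("PHP", "?>"), "<%": ("JSP", "%>")}


def find_close(src: str, i: int, close: str) -> int:
    """Return the index just past `close`, skipping strings and /* */ comments."""
    n = len(src)
    while i < n:
        ch = src[i]
        if ch in "\"'":
            # Quoted string: advance to the matching quote, stepping over
            # backslash escapes such as \" so they do not end the string.
            quote = ch
            i += 1
            while i < n and src[i] != quote:
                i += 2 if src[i] == "\\" else 1
            i += 1
        elif src.startswith("/*", i):
            # Block comment: jump past its terminator (or to end of input).
            end = src.find("*/", i + 2)
            i = n if end == -1 else end + 2
        elif src.startswith(close, i):
            return i + len(close)
        else:
            i += 1
    return n  # Unterminated block: treat the rest of the input as code.


def segment(src: str) -> List[Tuple[str, str]]:
    """Split `src` into a flat list of (label, text) sibling segments."""
    segments: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(src):
        # Locate the earliest open marker at or after `pos`.
        hits = [(src.find(m, pos), m) for m in MARKERS]
        hits = [(idx, m) for idx, m in hits if idx != -1]
        if not hits:
            segments.append(("HTML", src[pos:]))
            break
        start, marker = min(hits)
        if start > pos:
            segments.append(("HTML", src[pos:start]))
        label, close = MARKERS[marker]
        end = find_close(src, start + len(marker), close)
        segments.append((label, src[start:end]))
        pos = end
    return segments


if __name__ == "__main__":
    page = '<p><? echo("?>"); ?></p><% out.print("%>"); %>'
    print([label for label, _ in segment(page)])
    # ['HTML', 'PHP', 'HTML', 'JSP']
```

The flat list it returns corresponds to the line of siblings you describe,
and the quoted "?>" in your example stays inside the PHP segment. Each
segment's text could then be handed to a proper per-language parser later
if you ever need more than the boundaries.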

Monty

On Jan 31, 2008 10:15 AM, Darien Hager <darien.hager at etelos-inc.com> wrote:
> I'm experimenting with ANTLR to try to solve a particular problem, and I'd
> like to check some assumptions and ask for any suggestions.
>
> Situation: I have an HTML file with boundaries defining blocks of embedded
> code, such as PHP and JSP. More than one language can be embedded.
>
> Suppose PHP blocks are encapsulated with <? ?> markers, and JSP blocks in <%
> %> markers.
>
> What I want to do is analyze the file and create an AST that begins with a
> line of siblings, one for each segment. (e.g. HTML, PHP, HTML, JSP, PHP,
> HTML, PHP)
>
> However, I don't want it to be so naive that a properly quoted end marker
> is wrongly treated as the end of a block, e.g.: <? echo("?>"); ?>
>
> Question: Is the only robust way to do this to create (or re-use) grammars
> for PHP and JSP?
> I'm assuming the answer is yes, in which case it's no longer a small
> experiment.
>
> --
> Darien Hager
> Developer
> Etelos, Inc.
> darien at etelos.com
>
> http://www.etelos.com
> "Revolutionizing the way applications are developed, distributed and
> consumed."

