[antlr-interest] Parsing Nested Multi-line text

Kurt Rayner Kurt at AlphaSoftware.com
Tue Feb 28 09:36:55 PST 2006


While re-implementing a parser for a variant of BASIC, I got stuck on this
one.

 

The existing hand-coded parser supports nested multi-line string literals
(mainly for dynamic code generation) using the following form:

<<%identifier%
...
%identifier%

A lame example:

MyCodeSegment = <<%code%
if a < 12 then
    evaluate_template(<<%code%
a = 12
%code%
else
    evaluate_template(<<%code%
a = 14
%code%
end if
%code%

One would expect the names in the nested blocks to be different, but the
existing parser doesn't seem to care, and I have found exceptions.

Also note that the embedded text does NOT have to be parseable.

Here's what I've tried most recently to make the lexer handle the syntax.
ANTLR obviously doesn't like it, because it doesn't have sufficient
look-ahead.

protected
MultiLineLiteralIdentifier
    : '%' Identifier '%'
    ;

protected
MultiLineLiteralInitiator
    : "<<" MultiLineLiteralIdentifier
    ;

protected
EmbeddedMultiLineLiteral
    : MultiLineLiteralInitiator
      (options { greedy=false; } : (EmbeddedMultiLineLiteral | .) )*
      MultiLineLiteralIdentifier
    ;

ASCIIStringLiteral
    : ('"'! ( ('\\'! '\\') | ('\\'! '"') | ('"'! '"') | (~'"') )* '"'!)
    | MultiLineLiteralInitiator!
      (options { greedy=false; } : (EmbeddedMultiLineLiteral | .) )*
      MultiLineLiteralIdentifier!
    ;

If I were using Flex, I would just take control of the input stream, but I
would prefer to use something a little more elegant.
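
For illustration only, "taking control of the input stream" could look
roughly like the Python sketch below. It is not the existing parser's
logic. The function name scan_literal, the identifier pattern, and the
choice to treat every <<%name% inside the body as a nested opener are all
assumptions made for the sketch. It keeps a stack of open names, so a
closing %name% only counts when it matches the innermost open block, and
the %name% that is part of an opener is never mistaken for a closer.

```python
import re

# Assumed identifier syntax; the real Identifier rule is not shown in this post.
OPENER = re.compile(r"<<%([A-Za-z_]\w*)%")


def scan_literal(text, start=0):
    """Scan one multi-line literal beginning at text[start].

    Returns (body, end), where body is the raw text between the outermost
    opener and its closer, and end is the offset just past that closer.
    Raises ValueError if no literal starts here or it is never closed.
    """
    m = OPENER.match(text, start)
    if m is None:
        raise ValueError("no multi-line literal at offset %d" % start)
    stack = [m.group(1)]          # names of the currently open blocks
    body_start = pos = m.end()
    while pos < len(text):
        m = OPENER.match(text, pos)
        if m:                     # nested opener: remember its name
            stack.append(m.group(1))
            pos = m.end()
            continue
        closer = "%" + stack[-1] + "%"
        if text.startswith(closer, pos):
            stack.pop()           # closes the innermost open block
            if not stack:         # that was the outermost block
                return text[body_start:pos], pos + len(closer)
            pos += len(closer)
            continue
        pos += 1                  # any other character is opaque body text
    raise ValueError("unterminated multi-line literal")
```

Because the body is never parsed, only scanned for openers and closers,
the embedded text can be arbitrary. The trade-off is that a stray <<%name%
in the body with no matching closer would make the outer literal look
unterminated.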

 

Thanks in advance for any ideas.

 

 

Kurt Rayner
Development
Alpha Software, Inc.
83 Cambridge Street, Suite 3B
Burlington, MA 01803-4483
kurt at AlphaSoftware.com
(781) 229-4500 X 27

 
