[antlr-interest] v3: How could I construct a parser for an "active" language such as ASP.Net, PHP or (in my case) Active RTF?

Peter Crowther peter.crowther at melandra.com
Thu Apr 5 07:58:05 PDT 2012

Aha!  Island grammars!  Thank you - that's the pointer I needed.

- Peter
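
For anyone following the island-grammar pointer, here is a minimal sketch of the idea in plain Python rather than ANTLR: the RTF is treated as opaque "water" and copied through, and only the processing instructions are picked out as "islands". The instruction syntax `{/name args}` and the names `INSTRUCTION` and `split_islands` are assumptions made up for illustration; the thread only says the commands use a forward slash rather than a backslash.

```python
import re
from typing import Iterator, Tuple

# ASSUMPTION: instructions look like {/name optional-args}. The real syntax
# is not given in this thread; only the forward slash is mentioned.
INSTRUCTION = re.compile(r"\{/(\w+)([^}]*)\}")


def split_islands(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield ("water", chunk, "") for opaque text and
    ("island", name, args) for each processing instruction."""
    pos = 0
    for match in INSTRUCTION.finditer(text):
        # Everything before the instruction is passed through untouched.
        if match.start() > pos:
            yield ("water", text[pos:match.start()], "")
        yield ("island", match.group(1), match.group(2).strip())
        pos = match.end()
    # Trailing opaque text after the last instruction, if any.
    if pos < len(text):
        yield ("water", text[pos:], "")


if __name__ == "__main__":
    sample = r"{\rtf1 Dear {/field customer}, {\b hello}}"
    for kind, first, second in split_islands(sample):
        print(kind, repr(first), repr(second))
```

Ordinary RTF groups such as `{\b hello}` stay inside the water chunks because they start with a backslash, so the RTF itself never has to be parsed. In ANTLR the same split would presumably be done in the lexer, with a separate parser handling only the island tokens.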

On 5 April 2012 15:09, Eric <researcher0x00 at gmail.com> wrote:

> I do not really understand your question but I will give you some ideas
> that come to mind. As I am busy, I won't be able to elaborate past giving
> these ideas.
> 1. Check out combined grammars, i.e. a grammar with an embedded grammar,
> e.g. JavaScript inside of HTML. I believe the examples from the downloads
> page include one.
> 2. Token stream rewriting between the lexer and the parser. ANTLR includes
> methods to help do this; they are noted in the API.
> On Thu, Apr 5, 2012 at 7:40 AM, Peter Crowther <
> peter.crowther at melandra.com> wrote:
>> Thanks for the input!  I read the RTF spec some time ago, believe I know
>> how RTF works, and (luckily) I'm doing a different job.
>> This application doesn't need to understand RTF. Other than the few
>> commands I outline (that aren't RTF tags - note the forward slash not
>> backslash), the RTF is presently opaque to the application.  It's just a
>> bunch of bytes that need to be output in that order until a processing
>> instruction is encountered.  I believe some parser other than the RTF
>> parser is appropriate for the job I wish to do with these processing
>> instructions :-).
>> Cheers,
>> - Peter
>> On 5 April 2012 11:09, Eric <researcher0x00 at gmail.com> wrote:
>>> Just my thoughts or thinking out loud on this.
>>> I played around with RTF a few years ago because I wanted to make an
>>> editor/diff tool that could add tags to change the properties of the text,
>>> i.e. color, underlining, etc., something like HTML. If I remember
>>> correctly, one of the interesting aspects of RTF was that it was designed
>>> to be read and processed sequentially in one pass. If the tags were done
>>> correctly, you could start at a marked boundary and continue processing,
>>> even in the middle of the stream. This was highly advantageous when working
>>> with long documents of hundreds of pages or more.
>>> From the Rich Text Format (RTF) Specification Version 1.9.1:
>>> "A sample RTF parsing reader program is given in Appendix A: Sample RTF
>>> Reader Application. This sample RTF reader is designed for use in
>>> conjunction with this document to assist those interested in developing
>>> their own RTF readers. The sample RTF reader is not a for-sale product, and
>>> Microsoft does not provide technical support or any other kind of support
>>> for the sample RTF parsing reader code or this document."
>>> I would seriously read the RTF specs first and truly understand how RTF
>>> works before using ANTLR with it. You may be walking through a door you
>>> wish you never opened.
>>> Do not take this as implying ANTLR is terrible. It is merely that you
>>> should use the right tool for the job, and here ANTLR is not the right
>>> tool; the tool you need is outlined in the RTF spec.
