[antlr-interest] Skipping sections of the input stream

Thu Jun 28 14:11:07 PDT 2012

I figured out a way to do what I was thinking using a chunk of code like so:

sectiona: SECTIONA name {
        // Compare string contents, not pointers (assumes expectedName is a char *).
        if (strcmp((const char *)$name.text->chars, expectedName) != 0) {
            // consumeUntilSet for SECTIONA || SECTIONB
            // throw SKIP_SECTION_EXCEPTION
        }
    }  LCURLY subsectiona* RCURLY -> ...
    ;
    [ SKIP_SECTION_EXCEPTION] {
        PSRSTATE->error = ANTLR3_FALSE;
        PSRSTATE->failed = ANTLR3_FALSE;
    }
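
For readers who want to see the "consume until a token in a set" step in isolation, here is a minimal sketch in plain C. It does not use the ANTLR runtime: the token stream is modeled as an array of ints, and the token names and the function consume_until_set are hypothetical, made up for illustration only.

```c
#include <stddef.h>

/* Hypothetical token types; a generated parser would define its own. */
enum { TOK_EOF = 0, TOK_SECTIONA, TOK_SECTIONB, TOK_NAME,
       TOK_LCURLY, TOK_RCURLY, TOK_OTHER };

/* Advance from index pos until the token under the cursor is in stop_set
   or is TOK_EOF. The array must be terminated by TOK_EOF.
   Returns the index of the token that stopped the scan. */
size_t consume_until_set(const int *tokens, size_t pos,
                         const int *stop_set, size_t stop_count)
{
    while (tokens[pos] != TOK_EOF) {
        for (size_t i = 0; i < stop_count; i++) {
            if (tokens[pos] == stop_set[i])
                return pos;            /* next section header found */
        }
        pos++;                         /* discard this token */
    }
    return pos;                        /* ran off the end of the input */
}
```

Note that this only skips correctly if the section keywords can never appear as tokens inside a section body; otherwise the skip would stop too early and the braces would have to be counted instead.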

Sent to the list for future reference.

--
Burton Samograd

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Burton Samograd
Sent: Thursday, June 28, 2012 10:34 AM
To: Mike Lischke
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Skipping sections of the input stream

I simplified my example; we actually have two top-level section types, which I think would greatly increase the complexity of a splitter. It would mean writing a baby parser that parses quite a bit of the input structure before passing the text to the actual parser.

Not an impossible task, but I think that if it could be done completely in the grammar using the standard input stream, it would avoid creating such a dependency between the baby parser and the actual parser. That is, unless you can suggest a better way to do the splitting.

--
Burton Samograd

-----Original Message-----
From: Mike Lischke [mailto:mike at lischke-online.de]
Sent: Thursday, June 28, 2012 10:27 AM
To: Burton Samograd
Subject: Re: [antlr-interest] Skipping sections of the input stream

Burton,

> It has been requested that we only parse the section that has been
> specified by the user.  I am thinking that a strategy where we have a rule like:
>
> section: SECTION quoted_string {
>    // if quoted_string != requested_section
>    // skip entire section by matching { and } parens
>    }
>    | SECTION quoted_string LCURLY subsection* RCURLY -> ...
>    ;
>
> First off, is what I would like to do possible and is my approach
> reasonable using ANTLR?  If it is, can I skip over and discard large sections of the input stream like I have outlined above?

It might simplify things a lot (and make it much faster) if you ran your input through a splitter that takes out anything unwanted and feeds only the relevant text to the parser. Skipping over text while taking quotes etc. into account is simple stuff and can be extremely fast in a (relatively) simple loop.
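
A rough sketch of what such a splitter loop could look like in C follows. It rests on several assumptions that are not from this thread: a section is a keyword, a double-quoted name, and a brace-delimited body; braces may nest; quoted strings use backslash escapes; comments are not handled. The function names (skip_quoted, skip_block, extract_section) are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer just past the quoted string starting at p (p[0] == '"').
   Stops at the end of input if the string is unterminated. */
static const char *skip_quoted(const char *p)
{
    p++;                                   /* opening quote */
    while (*p != '\0' && *p != '"') {
        if (*p == '\\' && p[1] != '\0')
            p++;                           /* skip the escaped character */
        p++;
    }
    return (*p == '"') ? p + 1 : p;
}

/* Return a pointer just past the '}' closing the block opened at p
   (p[0] == '{'), ignoring braces inside quoted strings. */
static const char *skip_block(const char *p)
{
    int depth = 0;
    while (*p != '\0') {
        if (*p == '"') { p = skip_quoted(p); continue; }
        if (*p == '{') depth++;
        if (*p == '}' && --depth == 0) return p + 1;
        p++;
    }
    return p;                              /* unbalanced input: stop at end */
}

/* Copy the first top-level section whose name equals wanted into out.
   keyword must be non-empty. Returns the number of bytes copied
   (excluding the NUL), or 0 if no such section exists or out is too small. */
size_t extract_section(const char *in, const char *keyword, const char *wanted,
                       char *out, size_t out_size)
{
    size_t klen = strlen(keyword), wlen = strlen(wanted);
    const char *p = in;

    while (*p != '\0') {
        if (strncmp(p, keyword, klen) != 0) {
            /* Not at a header: step over one character or one whole string. */
            p = (*p == '"') ? skip_quoted(p) : p + 1;
            continue;
        }
        const char *start = p;             /* first byte of the keyword */
        const char *name = p + klen;
        while (*name == ' ' || *name == '\t' || *name == '\n') name++;
        if (*name != '"') { p = name; continue; }  /* not a header after all */
        const char *name_end = skip_quoted(name);
        const char *body = strchr(name_end, '{');
        if (body == NULL) break;           /* header without a body */
        const char *end = skip_block(body);
        if ((size_t)(name_end - name) == wlen + 2 &&
            strncmp(name + 1, wanted, wlen) == 0) {
            size_t n = (size_t)(end - start);
            if (n + 1 > out_size) return 0;
            memcpy(out, start, n);
            out[n] = '\0';
            return n;
        }
        p = end;                           /* skip the unwanted section */
    }
    return 0;
}
```

The extracted text could then be handed to the parser as its whole input. With two section keywords, the function could be called once per keyword, at the cost of the splitter needing to know about the header syntax.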

Mike
--
www.soft-gems.net

This e-mail, including accompanying communications and attachments, is strictly confidential and only for the intended recipient. Any retention, use or disclosure not expressly authorised by Markit is prohibited. This email is subject to all waivers and other terms at the following link: http://www.markit.com/en/about/legal/email-disclaimer.page

Please visit http://www.markit.com/en/about/contact/contact-us.page? for contact information on our offices worldwide.

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
