[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Wed Apr 18 11:20:55 PDT 2012

If you post some sample input and describe what you are trying to do, then
perhaps I can help. However, without knowing the input, what you are trying
to achieve, and how you are currently doing it, it is difficult to advise.
I suspect the real question is more fundamental than the one you are asking.

Jim

> -----Original Message-----
> From: Burton Samograd [mailto:burton.samograd at markit.com]
> Sent: Wednesday, April 18, 2012 10:20 AM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: RE: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> Jim,
>
> Personally, I think we should be using flex/bison because I don't think
> that Antlr is the right tool for the job.  Our initial requirements
> have been exceeded by over 2 orders of magnitude and I just don't think
> that the Antlr architecture is up for the task, and I feel that we
> might need to be able to handle another order of magnitude on top of
> that in the not so near future.  For what we are creating I don't think
> we need all the features that Antlr provides, but it was chosen before
> I got to my new job and before these new requirements came up.
> --
> Burton Samograd
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Jim Idle
> Sent: Monday, April 16, 2012 2:52 PM
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> 3.1? You should try 3.3 or maybe 3.4
>
> However, your best bet is to partition your input. I have suggested
> some ways to do that in past posts but if your input is not a
> calculator, then I can't be any more specific. However, the fact that
> you can process the TreeNodeStream (which is what is produced anyway)
> suggests that you can partition this up before the lexer.
>
> Jim
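
The suggestion to partition the input before the lexer could look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes the input can be split safely at line boundaries (nothing in the thread confirms this for the real data), it uses Java to match the Calc example rather than the C target actually in use, and the names InputPartitioner and forEachChunk are hypothetical. The handler is where a fresh lexer, parser, and tree walker would be created for each chunk, so that only one chunk's tokens and tree are alive at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits the input into independent chunks before lexing.
public class InputPartitioner {

    /**
     * Reads the input line by line and passes each group of up to maxLines
     * lines to the handler. Only the current chunk is buffered.
     * Returns the number of chunks handed to the handler.
     */
    public static int forEachChunk(Reader in, int maxLines, Consumer<String> handler) {
        if (maxLines < 1) {
            throw new IllegalArgumentException("maxLines must be >= 1");
        }
        BufferedReader reader = new BufferedReader(in);
        StringBuilder chunk = new StringBuilder();
        int linesInChunk = 0;
        int chunks = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.append(line).append('\n');
                if (++linesInChunk == maxLines) {
                    handler.accept(chunk.toString());
                    chunks++;
                    chunk.setLength(0);   // drop the chunk before reading more
                    linesInChunk = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (linesInChunk > 0) {           // trailing partial chunk
            handler.accept(chunk.toString());
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        String input = "a = 1\nb = 2\nc = a + b\n";
        List<String> seen = new ArrayList<>();
        // Assumption: in a real program the handler would build a new
        // ANTLRStringStream, lexer, parser, and tree walker per chunk
        // instead of just collecting the text.
        int n = forEachChunk(new StringReader(input), 2, seen::add);
        System.out.println(n + " chunks processed");
    }
}
```

Whether this helps depends entirely on the grammar: if a construct can span a chunk boundary, the split point has to be chosen from the input's own structure (for example a statement terminator) rather than a fixed line count.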
>
>
>
> > -----Original Message-----
> > From: Burton Samograd [mailto:burton.samograd at markit.com]
> > Sent: Monday, April 16, 2012 1:48 PM
> > To: Eric; Jim Idle
> > Cc: antlr-interest at antlr.org
> > Subject: RE: [antlr-interest] Can an Antlr Parser return a
> > TreeNodeStream so as to not have to parse the whole file at once?
> >
> > Antlr 3.1.3, C target wrapped for a C++ program.
> >
> > --
> > Burton Samograd
> >
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Eric
> > Sent: Monday, April 16, 2012 2:46 PM
> > To: Jim Idle
> > Cc: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Can an Antlr Parser return a
> > TreeNodeStream so as to not have to parse the whole file at once?
> >
> > Nice catch Jim.
> >
> > Burton, which ANTLR target are you using? e.g. ANTLR 3.4 C# 3, ANTLR
> > 3.4 Java, ...
> >
> > On Mon, Apr 16, 2012 at 4:41 PM, Jim Idle <jimi at temporal-wave.com> wrote:
> >
> > > This isn't the C target unless someone added 'new' to the ANSI C
> > > standard when I was not looking.
> > >
> > > Jim
> > >
> > > > -----Original Message-----
> > > > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > > > bounces at antlr.org] On Behalf Of Eric
> > > > Sent: Monday, April 16, 2012 1:12 PM
> > > > To: Burton Samograd
> > > > Cc: antlr-interest at antlr.org
> > > > Subject: Re: [antlr-interest] Can an Antlr Parser return a
> > > > TreeNodeStream so as to not have to parse the whole file at once?
> > > >
> > > > I just noticed that you are using an earlier version of the C target.
> > > > There have been lots of messages here about running out of memory for
> > > > that version. Check the mailing list for old posts. Since I don't use
> > > > the C target, and Jim Idle created it, is the expert on it, and is
> > > > here regularly, he might jump in on this. Anything he suggests is
> > > > worth the trouble of looking into, even if it means a few days of work.
> > > >
> > > > Eric
> > > >
> > > > On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:
> > > >
> > > > >
> > > > >
> > > > > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > > > > burton.samograd at markit.com> wrote:
> > > > >
> > > > >> Hello,
> > > > >> In the following Antlr example, the parser is used to generate an
> > > > >> AST which is then converted into a CommonTreeNodeStream, which is
> > > > >> then passed to the checker.
> > > > >>
> > > > >> public static void main(String[] args) throws Exception {
> > > > >>
> > > > >>     CalcLexer lex = new CalcLexer(
> > > > >>                         new ANTLRInputStream(System.in));
> > > > >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> > > > >>     CalcParser parser = new CalcParser(tokens);
> > > > >>
> > > > >>     CalcParser.program_result result = parser.program();
> > > > >>     CommonTree tree = (CommonTree) result.getTree();
> > > > >>
> > > > >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> > > > >>     CalcChecker checker = new CalcChecker(nodes);
> > > > >>     checker.program();
> > > > >>
> > > > >>     // Fresh node stream for the second tree walk.
> > > > >>     CommonTreeNodeStream interpreterNodes = new CommonTreeNodeStream(tree);
> > > > >>     CalcInterpreter interpreter = new CalcInterpreter(interpreterNodes);
> > > > >>     interpreter.program();
> > > > >> }
> > > > >>
> > > > >> Is it possible to get the parser to return a
> > > > >> CommonTreeNodeStream that can be then passed to the Checker so
> > > > >> that the whole file does not have to be lexed and parsed at
> > > > >> once and rather as a stream of tokens and then tree nodes?
> > > > >>
> > > > > If I am understanding this correctly, you want to do partial
> > > > > parsing and then generate a partial AST because the file is
> > > > > too large. Since the lexer has to lex/scan the entire text file
> > > > > to create the tokens for the parser, you cannot do a partial
> > > > > lexing of the input.
> > > > >
> > > > > Ter did something with scannerless parsing several months ago.
> > > > > Since I never worked with it, I cannot say it will help. It is
> > > > > something I personally would look into for your problem, but not
> > > > > expect to work. I have had stranger suggestions that worked.
> > > > >
> > > > > I would also profile the running of the grammar to see which
> > > > > part of the grammar is using too much memory and try altering
> > > > > the grammar and/or adding actions to correct the problem.
> > > > >
> > > > > Usually one wants the entire AST before doing analysis, so I am
> > > > > curious as to what you would do with the AST tokens being
> > > > > processed as a stream instead of a DOM.
> > > > >
> > > > > As a worst case, you could switch to overriding parts of the ANTLR
> > > > > parser with hand-written code or, even worse, switch to a
> > > > > different type of parser, e.g. LR, parser combinator, or fully
> > > > > hand-written recursive descent.
> > > > >
> > > > > You can also contract for support from Ter.
> > > > >
> > > > >  Eric.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >> I ask because we are running into a problem with an extremely
> > > > >> large file being passed into our Antlr parser and it is causing
> > > > >> memory exhaustion in the parsing phase. I am thinking that
> > > > >> using a TreeNodeStream would solve this problem if it is even
> > > > >> possible.
> > > > >> --
> > > > >> Burton Samograd
> > > > >>
> > > > >>
> > > > >> ________________________________
> > > > >> This e-mail, including accompanying communications and
> > > > >> attachments, is strictly confidential and only for the intended
> > > > >> recipient. Any retention, use or disclosure not expressly
> > > > >> authorised by Markit is prohibited. This email is subject to all
> > > > >> waivers and other terms at the following link:
> > > > >> http://www.markit.com/en/about/legal/email-disclaimer.page
> > > > >>
> > > > >> Please visit http://www.markit.com/en/about/contact/contact-us.page?
> > > > >> for contact information on our offices worldwide.
> > > > >>
> > > > >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > >> Unsubscribe:
> > > > >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > > > >>
> > > > >
> > > > >
> > > >