[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Mon Apr 16 13:51:34 PDT 2012

3.1? You should try 3.3 or maybe 3.4

However, your best bet is to partition your input. I have suggested some
ways to do that in past posts but if your input is not a calculator, then
I can't be any more specific. However, the fact that you can process the
TreeNodeStream (which is what is produced anyway) suggests that you can
partition this up before the lexer.

Jim
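
A minimal sketch of the "partition before the lexer" idea, in Java to match the example in the original question. It assumes the input can be cut at a delimiter character (here ';') that never occurs inside a token such as a string literal. That assumption is not confirmed anywhere in this thread. The names ChunkedInput and forEachChunk are hypothetical, not part of ANTLR. Each chunk would get its own lexer, parser and tree walk inside the handler, so only one chunk's tokens and AST are alive at a time.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not an ANTLR class: splits a character stream into
// delimiter-terminated chunks and hands each chunk to a callback.
final class ChunkedInput {

    // Reads 'in' one character at a time and calls 'handler' once per chunk.
    // A chunk ends at (and includes) 'delimiter'. Non-blank trailing text
    // without a delimiter is delivered as a final chunk.
    // Returns the number of chunks handled.
    static int forEachChunk(Reader in, char delimiter, Consumer<String> handler) {
        StringBuilder buf = new StringBuilder();
        int count = 0;
        try {
            int c;
            while ((c = in.read()) != -1) {
                buf.append((char) c);
                if (c == delimiter) {
                    // In a real program this is where a lexer and parser
                    // would be created for just this chunk.
                    handler.accept(buf.toString());
                    buf.setLength(0); // drop the chunk so memory stays bounded
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!buf.toString().trim().isEmpty()) {
            handler.accept(buf.toString());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        int n = forEachChunk(
                new StringReader("a = 1;\nb = a + 2;\nprint b"), ';', seen::add);
        System.out.println(n + " chunks: " + seen);
    }
}
```

For a large file, wrap the Reader in a BufferedReader so the per-character reads stay cheap. State that must survive across chunks (for example a symbol table) has to live outside the handler.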

> -----Original Message-----
> From: Burton Samograd [mailto:burton.samograd at markit.com]
> Sent: Monday, April 16, 2012 1:48 PM
> To: Eric; Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: RE: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> Antlr 3.1.3, C target wrapped for a C++ program.
>
> --
> Burton Samograd
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Eric
> Sent: Monday, April 16, 2012 2:46 PM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> Nice catch Jim.
>
> Burton, which ANTLR target are you using? e.g. ANTLR 3.4 C# 3, ANTLR
> 3.4 Java, ...
>
> On Mon, Apr 16, 2012 at 4:41 PM, Jim Idle <jimi at temporal-wave.com>
> wrote:
>
> > This isn't the C target unless someone added 'new' to the ANSI C
> > standard when I was not looking.
> >
> > Jim
> >
> > > -----Original Message-----
> > > > From: antlr-interest-bounces at antlr.org
> > > > [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Eric
> > > Sent: Monday, April 16, 2012 1:12 PM
> > > To: Burton Samograd
> > > Cc: antlr-interest at antlr.org
> > > Subject: Re: [antlr-interest] Can an Antlr Parser return a
> > > TreeNodeStream so as to not have to parse the whole file at once?
> > >
> > > > I just noticed that you are using an earlier version of the C
> > > > target. There have been lots of messages here about running out of
> > > > memory for that version. Check the mailing list for old posts.
> > > > Since I don't use the C target, and Jim Idle created it, is the
> > > > expert on it, and is here regularly, he might jump in on this.
> > > > Anything he suggests is worth the trouble of looking into, even if
> > > > it means a few days of work.
> > >
> > > Eric
> > >
> > > > On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com>
> > > > wrote:
> > >
> > > >
> > > >
> > > > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > > > burton.samograd at markit.com> wrote:
> > > >
> > > > >> Hello,
> > > > >>
> > > > >> In the following Antlr example, the parser is used to generate an
> > > > >> AST which is then converted into a CommonTreeNodeStream, which is
> > > > >> then passed to the checker.
> > > > >>
> > > > >> public static void main(String[] args) throws Exception {
> > > > >>     CalcLexer lex = new CalcLexer(
> > > > >>             new ANTLRInputStream(System.in));
> > > > >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> > > > >>     CalcParser parser = new CalcParser(tokens);
> > > > >>
> > > > >>     CalcParser.program_result result = parser.program();
> > > > >>     CommonTree tree = (CommonTree) result.getTree();
> > > > >>
> > > > >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> > > > >>     CalcChecker checker = new CalcChecker(nodes);
> > > > >>     checker.program();
> > > > >>
> > > > >>     // Fresh node stream for the second tree walk.
> > > > >>     nodes = new CommonTreeNodeStream(tree);
> > > > >>     CalcInterpreter interpreter = new CalcInterpreter(nodes);
> > > > >>     interpreter.program();
> > > > >> }
> > > > >>
> > > > >> Is it possible to get the parser to return a
> > > > >> CommonTreeNodeStream that can then be passed to the Checker so
> > > > >> that the whole file does not have to be lexed and parsed at once,
> > > > >> but rather as a stream of tokens and then tree nodes?
> > > >>
> > > > > If I am understanding this correctly, you want to do partial
> > > > > parsing and then generate a partial AST because the file is too
> > > > > large. Since the lexer has to lex/scan the entire text file to
> > > > > create the tokens for the parser, you cannot do a partial lexing
> > > > > of the input.
> > > >
> > > > > Ter did something with scannerless parsing several months ago.
> > > > > Since I never worked with it I cannot say it will help, but it is
> > > > > something I personally would look into for your problem, without
> > > > > expecting it to work. I have had stranger suggestions that worked.
> > > >
> > > > I would also profile the running of the grammar to see which part
> > > > of the grammar is using too much memory and try altering the
> > > > grammar and/or adding actions to correct the problem.
> > > >
> > > > > Usually one wants the entire AST before doing analysis, so I am
> > > > > curious as to what you would do with the AST tokens being
> > > > > processed as a stream instead of a DOM.
> > > >
> > > > > As a worst case, you could switch to overriding parts of the
> > > > > ANTLR parser with hand-written code, or even worse, switch to a
> > > > > different type of parser, e.g. LR, parser combinator, or fully
> > > > > hand-written recursive descent.
> > > >
> > > > You can also contract for support from Ter.
> > > >
> > > >  Eric.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > >> I ask because we are running into a problem with an extremely
> > > > >> large file being passed into our Antlr parser and it is causing
> > > > >> memory exhaustion in the parsing phase. I am thinking that using
> > > > >> a TreeNodeStream would solve this problem if it is even
> > > > >> possible.
> > > >> --
> > > >> Burton Samograd
> > > >>
> > > >>
> > > >> ________________________________
> > > > >> This e-mail, including accompanying communications and
> > > > >> attachments, is strictly confidential and only for the intended
> > > > >> recipient. Any retention, use or disclosure not expressly
> > > > >> authorised by Markit is prohibited. This email is subject to all
> > > > >> waivers and other terms at the following link:
> > > > >> http://www.markit.com/en/about/legal/email-disclaimer.page
> > > > >>
> > > > >> Please visit http://www.markit.com/en/about/contact/contact-us.page?
> > > > >> for contact information on our offices worldwide.
> > > > >>
> > > > >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > >> Unsubscribe:
> > > > >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > > >>
> > > >
> > > >
> > >
> > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > Unsubscribe:
> > > http://www.antlr.org/mailman/options/antlr-interest/your-
> > > email-address
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe:
> > http://www.antlr.org/mailman/options/antlr-interest/your-email-
> address
> >
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address