[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Mon Apr 16 13:41:45 PDT 2012

This isn't the C target unless someone added 'new' to the ANSI C standard
when I was not looking.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Eric
> Sent: Monday, April 16, 2012 1:12 PM
> To: Burton Samograd
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> I just noticed that you are using an earlier version of the C target.
> There have been a lot of messages here about running out of memory with
> that version. Check the mailing list archive for old posts. I don't use
> the C target, but Jim Idle created it, is the expert on it, and is here
> regularly, so he might jump in on this. Anything he suggests is worth the
> trouble of looking into, even if it means a few days of work.
>
> Eric
>
> On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:
>
> >
> >
> > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > burton.samograd at markit.com> wrote:
> >
> >> Hello,
> >> In the following Antlr example, the parser is used to generate an AST
> >> which is then converted into a CommonTreeNodeStream, which is then
> >> passed to the checker.
> >>
> >> public static void main(String[] args) throws Exception {
> >>     // Lex and parse the whole input into an AST.
> >>     CalcLexer lex = new CalcLexer(new ANTLRInputStream(System.in));
> >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> >>     CalcParser parser = new CalcParser(tokens);
> >>
> >>     CalcParser.program_result result = parser.program();
> >>     CommonTree tree = (CommonTree) result.getTree();
> >>
> >>     // First tree walk: check the AST.
> >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> >>     CalcChecker checker = new CalcChecker(nodes);
> >>     checker.program();
> >>
> >>     // Second tree walk: interpret the AST (fresh node stream, renamed
> >>     // because 'nodes' is already declared above).
> >>     CommonTreeNodeStream interpreterNodes = new CommonTreeNodeStream(tree);
> >>     CalcInterpreter interpreter = new CalcInterpreter(interpreterNodes);
> >>     interpreter.program();
> >> }
> >>
> >> Is it possible to get the parser to return a CommonTreeNodeStream that
> >> can then be passed to the Checker, so that the whole file does not have
> >> to be lexed and parsed at once but is instead processed as a stream of
> >> tokens and then tree nodes?
> >>
> > If I am understanding this correctly, you want to do partial parsing
> > and then generate a partial AST because the file is too large. Since
> > the lexer has to lex/scan the entire text file to create the tokens
> > for the parser, you cannot do a partial lexing of the input.
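
To make the idea of partial parsing concrete: if the input can be cut into independent statements (an assumption here: one statement per line), each piece can be lexed, parsed and evaluated on its own, so only one small tree exists at a time. Below is a minimal plain-Java sketch of that idea. `ChunkedCalc`, `evalLine` and `process` are hypothetical names, and `evalLine` is only a stand-in for running a generated lexer, parser and tree walker on one statement. None of this is ANTLR API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ChunkedCalc {
    // Hypothetical stand-in for "lex + parse + interpret one statement".
    // It only understands sums of non-negative integers such as "1 + 2 + 3".
    static long evalLine(String line) {
        long sum = 0;
        for (String term : line.split("\\+")) {
            sum += Long.parseLong(term.trim());
        }
        return sum;
    }

    // Reads one statement (line) at a time, so the text of earlier
    // statements can be garbage collected before later ones are read.
    // Only the per-statement results are kept.
    static List<Long> process(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        List<Long> results = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            results.add(evalLine(line));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(process(new StringReader("1 + 2\n\n3 + 4 + 5\n")));
    }
}
```

Run as is, this prints `[3, 12]`. Whether a real grammar allows the input to be split like this depends entirely on the language being parsed; it does not work if statements refer to each other in ways the checker must see all at once.
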
> >
> > Ter did something with scannerless parsing several months ago. I have
> > never worked with it, so I cannot say it will help. I would personally
> > look into it for your problem without expecting it to work, but I have
> > had stranger suggestions that worked.
> >
> > I would also profile the running of the grammar to see which part of
> > the grammar is using too much memory and try altering the grammar
> > and/or adding actions to correct the problem.
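
Before reaching for a full profiler, a rough first look at which phase allocates the most can be had by measuring heap use around each step. The sketch below uses only the standard `Runtime` API. `PhaseMemory`, `usedHeap` and `measure` are hypothetical names, and the two phases in `main` are placeholders for the real lexing and parsing calls. The numbers are approximate, because `System.gc()` is only a hint to the JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

class PhaseMemory {
    // Approximate number of heap bytes currently in use.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs one phase, prints how much the used heap changed, and
    // returns the phase's result so the next phase can consume it.
    static <T> T measure(String name, Callable<T> phase) throws Exception {
        System.gc(); // a hint only; the JVM may ignore it
        long before = usedHeap();
        T result = phase.call();
        long after = usedHeap();
        System.out.printf("%-8s heap delta: %d KiB%n", name, (after - before) / 1024);
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder phase: allocate many small objects, as a lexer might.
        List<int[]> tokens = measure("lex", () -> {
            List<int[]> list = new ArrayList<>();
            for (int i = 0; i < 100_000; i++) {
                list.add(new int[4]);
            }
            return list;
        });
        // Placeholder phase: consume the previous phase's result.
        Integer count = measure("parse", () -> tokens.size());
        System.out.println(count);
    }
}
```

In the example from the original question, the placeholders would be replaced by calls such as `parser.program()` and `checker.program()`. `Callable` is used rather than `Supplier` so that phases which throw checked exceptions can be wrapped directly.
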
> >
> > Usually one wants the entire AST before doing analysis, so I am
> > curious as to what you would do with the AST tokens being processed
> > as a stream instead of a DOM.
> >
> > As a worst case, you could override parts of the ANTLR parser with
> > hand-written code or, even worse, switch to a different type of
> > parser, e.g. LR, parser combinator, or fully hand-written recursive
> > descent.
> >
> > You can also contract for support from Ter.
> >
> >  Eric.
> >
> >
> >
> >
> >
> >> I ask because we are running into a problem with an extremely large
> >> file being passed into our Antlr parser and it is causing memory
> >> exhaustion in the parsing phase. I am thinking that using a
> >> TreeNodeStream would solve this problem if it is even possible.
> >> --
> >> Burton Samograd
> >>
> >>
> >>
> >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
> >
>