[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Burton Samograd burton.samograd at markit.com
Mon Apr 16 13:45:02 PDT 2012


Jim,

I just used that as an example.  It is similar to the process that we are using in our lexer/parser.

--
Burton

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: Monday, April 16, 2012 2:42 PM
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

This isn't the C target unless someone added 'new' to the ANSI C standard when I was not looking.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Eric
> Sent: Monday, April 16, 2012 1:12 PM
> To: Burton Samograd
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> I just noticed that you are using an earlier version of the C target.
> There have been lots of messages here about running out of memory with
> that version. Check the mailing list for old posts. Since I don't use
> the C target, and Jim Idle created it, is the expert on it, and is here
> regularly, he might jump in on this. Anything he suggests is worth the
> trouble of looking into, even if it means a few days of work.
>
> Eric
>
> On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:
>
> >
> >
> > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > burton.samograd at markit.com> wrote:
> >
> >> Hello,
> >> In the following Antlr example, the parser is used to generate an AST
> >> which is then converted into a CommonTreeNodeStream, which is then
> >> passed to the checker.
> >>
> >> public static void main(String[] args) throws Exception {
> >>     // Lex and parse all of standard input into a single AST.
> >>     CalcLexer lex = new CalcLexer(new ANTLRInputStream(System.in));
> >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> >>     CalcParser parser = new CalcParser(tokens);
> >>
> >>     CalcParser.program_result result = parser.program();
> >>     CommonTree tree = (CommonTree) result.getTree();
> >>
> >>     // First tree walk: check the AST.
> >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> >>     CalcChecker checker = new CalcChecker(nodes);
> >>     checker.program();
> >>
> >>     // Second tree walk: interpret the same AST with a fresh node stream.
> >>     nodes = new CommonTreeNodeStream(tree);
> >>     CalcInterpreter interpreter = new CalcInterpreter(nodes);
> >>     interpreter.program();
> >> }
> >>
> >> Is it possible to get the parser to return a CommonTreeNodeStream
> >> that can then be passed to the Checker, so that the whole file does
> >> not have to be lexed and parsed at once but is instead handled as a
> >> stream of tokens and then tree nodes?
> >>
> > If I am understanding this correctly, you want to do partial
> > parsing and then generate a partial AST because the file is too
> > large. Since the lexer has to lex/scan the entire text file to
> > create the tokens for the parser, you cannot do a partial lexing of the input.
> >
> > Ter did something with scannerless parsing several months ago. Since
> > I never worked with it I cannot say it will help; it is something I
> > personally would look into for your problem, though I would not
> > expect it to work. I have had stranger suggestions that worked.
> >
> > I would also profile the running of the grammar to see which part of
> > the grammar is using too much memory and try altering the grammar
> > and/or adding actions to correct the problem.
> >
> > Usually one wants the entire AST before doing analysis, so I am
> > curious as to what you would do with the AST tokens being processed
> > as a stream instead of a DOM.
> >
> > As a worst case, you could switch to overriding parts of the ANTLR
> > parser with hand-written code or, even worse, switch to a different
> > type of parser, e.g. LR, parser combinator, or fully hand-written
> > recursive descent.
> >
> > You can also contract for support from Ter.
> >
> >  Eric.
> >
> >
> >
> >
> >
> >> I ask because we are running into a problem with an extremely large
> >> file being passed into our Antlr parser and it is causing memory
> >> exhaustion in the parsing phase. I am thinking that using a
> >> TreeNodeStream would solve this problem if it is even possible.
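
A minimal sketch of one way to bound memory in a situation like this,
assuming the input can be split into independent statements, one per line
(an assumption; the thread does not say what the file looks like). It does
not use ANTLR: StatementAtATime, LineParser and processAll are hypothetical
stand-ins for a generated parser and tree walker. Each statement is parsed
into its own small tree, processed, and dropped before the next line is
read, so memory use follows the largest statement rather than the whole
file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.LongConsumer;

// Hypothetical stand-in for a generated parser plus tree walker; not ANTLR API.
class StatementAtATime {

    // Small AST node: a number leaf (op == '\0') or a binary operation.
    static final class Node {
        final char op;
        final long value;
        final Node left, right;
        Node(long value) { this('\0', value, null, null); }
        Node(char op, Node left, Node right) { this(op, 0, left, right); }
        private Node(char op, long value, Node left, Node right) {
            this.op = op; this.value = value; this.left = left; this.right = right;
        }
    }

    // Recursive-descent parser over ONE statement (assumed to fit on one line).
    static final class LineParser {
        private final String s;
        private int pos = 0;
        LineParser(String s) { this.s = s; }

        Node parse() {
            Node n = expr();
            if (peek() != '\0') throw new IllegalArgumentException("unexpected input at " + pos);
            return n;
        }

        // expr : term (('+' | '-') term)*
        private Node expr() {
            Node left = term();
            while (peek() == '+' || peek() == '-') {
                char op = s.charAt(pos++);
                left = new Node(op, left, term());
            }
            return left;
        }

        // term : factor (('*' | '/') factor)*
        private Node term() {
            Node left = factor();
            while (peek() == '*' || peek() == '/') {
                char op = s.charAt(pos++);
                left = new Node(op, left, factor());
            }
            return left;
        }

        // factor : INT | '(' expr ')'
        private Node factor() {
            if (peek() == '(') {
                pos++;
                Node inner = expr();
                if (peek() != ')') throw new IllegalArgumentException("')' expected at " + pos);
                pos++;
                return inner;
            }
            int start = pos;
            while (pos < s.length() && Character.isDigit(s.charAt(pos))) pos++;
            if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
            return new Node(Long.parseLong(s.substring(start, pos)));
        }

        // Skips blanks and returns the next character, or '\0' at end of line.
        private char peek() {
            while (pos < s.length() && s.charAt(pos) == ' ') pos++;
            return pos < s.length() ? s.charAt(pos) : '\0';
        }
    }

    // Stand-in for the tree walker: evaluates one statement's tree.
    static long eval(Node n) {
        switch (n.op) {
            case '\0': return n.value;
            case '+': return eval(n.left) + eval(n.right);
            case '-': return eval(n.left) - eval(n.right);
            case '*': return eval(n.left) * eval(n.right);
            case '/': return eval(n.left) / eval(n.right);
            default: throw new IllegalStateException("unknown operator " + n.op);
        }
    }

    // Reads one line at a time; only the current statement's tree is reachable.
    static void processAll(Reader in, LongConsumer sink) {
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) continue;      // skip blank lines
                Node tree = new LineParser(line).parse(); // tree for this statement only
                sink.accept(eval(tree));                  // hand the result on, keep nothing
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        processAll(new InputStreamReader(System.in), System.out::println);
    }
}
```

Results go to a callback rather than a list so that nothing accumulates
across statements. Whether this applies depends on the grammar: it only
helps if statements can be checked independently of one another.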
> >> --
> >> Burton Samograd
> >>
> >>
> >> ________________________________
> >> This e-mail, including accompanying communications and attachments,
> >> is strictly confidential and only for the intended recipient. Any
> >> retention, use or disclosure not expressly authorised by Markit is
> >> prohibited. This email is subject to all waivers and other terms at
> >> the following link:
> >> http://www.markit.com/en/about/legal/email-disclaimer.page
> >>
> >> Please visit http://www.markit.com/en/about/contact/contact-us.page?
> >> for contact information on our offices worldwide.
> >>
> >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> Unsubscribe:
> >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >>
> >
> >
>

