[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Mon Apr 16 13:45:35 PDT 2012

Nice catch, Jim.

Burton, which ANTLR target are you using? e.g. ANTLR 3.4 C# 3, ANTLR 3.4
Java, ...
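
If it does turn out to be the Java target, and the input can be split into independent top-level statements, one generic way to keep memory bounded is to parse and process one statement at a time instead of building a tree for the whole file. Below is a minimal sketch of that idea in plain Java. It does not use the ANTLR runtime at all; the class name `StatementStream`, the one-statement-per-line input format, and the toy expression grammar are all assumptions made for illustration and are not taken from the Calc grammar in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

public class StatementStream {

    /** Tiny AST node: a number leaf (op == 'n') or a binary operator. */
    static final class Node {
        final char op;
        final long value;      // meaningful only when op == 'n'
        final Node left, right;
        Node(long value) { this.op = 'n'; this.value = value; this.left = null; this.right = null; }
        Node(char op, Node left, Node right) { this.op = op; this.value = 0; this.left = left; this.right = right; }
    }

    /** Hand-written recursive-descent parser for ONE statement (hypothetical grammar). */
    static final class Parser {
        private final String s;
        private int pos = 0;
        Parser(String s) { this.s = s.replaceAll("\\s+", ""); }

        Node parse() {
            Node n = expr();
            if (pos != s.length()) throw new IllegalArgumentException("unexpected input at " + pos);
            return n;
        }
        // expr : term (('+' | '-') term)*
        private Node expr() {
            Node n = term();
            while (pos < s.length() && (s.charAt(pos) == '+' || s.charAt(pos) == '-')) {
                char op = s.charAt(pos++);
                n = new Node(op, n, term());
            }
            return n;
        }
        // term : atom (('*' | '/') atom)*
        private Node term() {
            Node n = atom();
            while (pos < s.length() && (s.charAt(pos) == '*' || s.charAt(pos) == '/')) {
                char op = s.charAt(pos++);
                n = new Node(op, n, atom());
            }
            return n;
        }
        // atom : DIGIT+
        private Node atom() {
            int start = pos;
            while (pos < s.length() && Character.isDigit(s.charAt(pos))) pos++;
            if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
            return new Node(Long.parseLong(s.substring(start, pos)));
        }
    }

    /** Walks one statement's tree and computes its value. */
    static long eval(Node n) {
        switch (n.op) {
            case 'n': return n.value;
            case '+': return eval(n.left) + eval(n.right);
            case '-': return eval(n.left) - eval(n.right);
            case '*': return eval(n.left) * eval(n.right);
            default:  return eval(n.left) / eval(n.right);
        }
    }

    /**
     * Reads one statement per line, parses it, hands the result to the sink,
     * and lets the tree go out of scope before the next line is read.
     * Returns the number of statements processed.
     */
    static int process(Reader in, Consumer<Long> sink) throws IOException {
        BufferedReader lines = new BufferedReader(in);
        int count = 0;
        String line;
        while ((line = lines.readLine()) != null) {
            if (line.trim().isEmpty()) continue;   // skip blank lines
            Node tree = new Parser(line).parse();  // tree for this statement only
            sink.accept(eval(tree));
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // prints 7, then 4
        process(new StringReader("1 + 2 * 3\n10 / 2 - 1\n"), v -> System.out.println(v));
    }
}
```

With this shape, peak memory depends on the largest single statement rather than on the size of the file. It only helps when later statements do not need the trees of earlier ones; anything that must survive across statements (a symbol table, say) would have to be kept separately from the per-statement tree.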

On Mon, Apr 16, 2012 at 4:41 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> This isn't the C target unless someone added 'new' to the ANSI C standard
> when I was not looking.
>
> Jim
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Eric
> > Sent: Monday, April 16, 2012 1:12 PM
> > To: Burton Samograd
> > Cc: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Can an Antlr Parser return a
> > TreeNodeStream so as to not have to parse the whole file at once?
> >
> > I just noticed that you are using an earlier version of the C target.
> > There have been lots of messages here about running out of memory with
> > that version. Check the mailing list for old posts. I don't use the C
> > target, but Jim Idle created it, is the expert on it, and is here
> > regularly, so he might jump in on this. Anything he suggests is worth the
> > trouble of looking into, even if it means a few days of work.
> >
> > Eric
> >
> > On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:
> >
> > >
> > >
> > > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > > burton.samograd at markit.com> wrote:
> > >
> > >> Hello,
> > >> In the following Antlr example, the parser is used to generate an AST
> > >> which is then converted into a CommonTreeNodeStream, which is then
> > >> passed to the checker.
> > >>
> > >> public static void main(String[] args) throws Exception {
> > >>
> > >>     CalcLexer lex = new CalcLexer(
> > >>                         new ANTLRInputStream(System.in));
> > >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> > >>     CalcParser parser = new CalcParser(tokens);
> > >>
> > >>     CalcParser.program_result result = parser.program();
> > >>     CommonTree tree = (CommonTree) result.getTree();
> > >>
> > >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> > >>     CalcChecker checker = new CalcChecker(nodes);
> > >>     checker.program();
> > >>
> > >>     // fresh node stream for the second tree walk
> > >>     nodes = new CommonTreeNodeStream(tree);
> > >>     CalcInterpreter interpreter = new CalcInterpreter(nodes);
> > >>     interpreter.program();
> > >> }
> > >>
> > >> Is it possible to get the parser to return a CommonTreeNodeStream that
> > >> can then be passed to the Checker, so that the whole file does not have
> > >> to be lexed and parsed at once but is instead processed as a stream of
> > >> tokens and then tree nodes?
> > >>
> > > If I am understanding this correctly, you want to do partial parsing
> > > and then generate a partial AST because the file is too large. Since
> > > the lexer has to lex/scan the entire text file to create the tokens
> > > for the parser, you cannot do a partial lexing of the input.
> > >
> > > Ter did something with scannerless parsing several months ago. I have
> > > never worked with it, so I cannot say it will help. It is something I
> > > personally would look into for your problem, though I would not
> > > expect it to work. I have had stranger suggestions that worked.
> > >
> > > I would also profile the running of the grammar to see which part of
> > > the grammar is using too much memory and try altering the grammar
> > > and/or adding actions to correct the problem.
> > >
> > > Usually one wants the entire AST before doing analysis, so I am
> > > curious as to what you would do with the AST tokens being processed
> > > as a stream instead of a DOM.
> > >
> > > As a worst case, you could switch to overriding parts of the ANTLR
> > > parser with hand-written code or, even worse, switch to a different
> > > type of parser, e.g. LR, parser combinator, or fully hand-written
> > > recursive descent.
> > >
> > > You can also contract for support from Ter.
> > >
> > >  Eric.
> > >
> > >> I ask because we are running into a problem with an extremely large
> > >> file being passed into our Antlr parser and it is causing memory
> > >> exhaustion in the parsing phase. I am thinking that using a
> > >> TreeNodeStream would solve this problem if it is even possible.
> > >> --
> > >> Burton Samograd
> > >>
> > >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > >> Unsubscribe:
> > >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > >>