[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Burton Samograd burton.samograd at markit.com
Mon Apr 16 13:47:59 PDT 2012


Antlr 3.1.3, C target wrapped for a C++ program.

--
Burton Samograd

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Eric
Sent: Monday, April 16, 2012 2:46 PM
To: Jim Idle
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Nice catch Jim.

Burton, which ANTLR target are you using? e.g. ANTLR 3.4 C# 3, ANTLR 3.4 Java, ...

On Mon, Apr 16, 2012 at 4:41 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> This isn't the C target unless someone added 'new' to the ANSI C
> standard when I was not looking.
>
> Jim
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Eric
> > Sent: Monday, April 16, 2012 1:12 PM
> > To: Burton Samograd
> > Cc: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Can an Antlr Parser return a
> > TreeNodeStream so as to not have to parse the whole file at once?
> >
> > I just noticed that you are using an earlier version of the C target.
> > There have been many messages here about running out of memory with
> > that version. Check the mailing list archive for old posts. I don't
> > use the C target, but Jim Idle created it, is the expert on it, and
> > is here regularly, so he might jump in on this. Anything he suggests
> > is worth the trouble of looking into, even if it means a few days of work.
> >
> > Eric
> >
> > On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:
> >
> > >
> > >
> > > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > > burton.samograd at markit.com> wrote:
> > >
> > >> Hello,
> > >> In the following Antlr example, the parser is used to generate an
> > >> AST, which is then converted into a CommonTreeNodeStream, which is
> > >> then passed to the checker.
> > >>
> > >> public static void main(String[] args) throws Exception {
> > >>     CalcLexer lex = new CalcLexer(new ANTLRInputStream(System.in));
> > >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> > >>     CalcParser parser = new CalcParser(tokens);
> > >>
> > >>     CalcParser.program_result result = parser.program();
> > >>     CommonTree tree = (CommonTree) result.getTree();
> > >>
> > >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> > >>     CalcChecker checker = new CalcChecker(nodes);
> > >>     checker.program();
> > >>
> > >>     // Build a fresh node stream for the second tree walk.
> > >>     nodes = new CommonTreeNodeStream(tree);
> > >>     CalcInterpreter interpreter = new CalcInterpreter(nodes);
> > >>     interpreter.program();
> > >> }
> > >>
> > >> Is it possible to get the parser to return a CommonTreeNodeStream
> > >> that can then be passed to the Checker, so that the whole file does
> > >> not have to be lexed and parsed at once and is instead processed
> > >> as a stream of tokens and then tree nodes?
> > >>
> > > If I understand this correctly, you want to do partial parsing
> > > and then generate a partial AST because the file is too large.
> > > Since the lexer has to lex/scan the entire text file to create the
> > > tokens for the parser, you cannot do a partial lexing of the input.
> > >
> > > Ter did something with scannerless parsing several months ago. I
> > > have never worked with it, so I cannot say whether it will help. I
> > > would look into it for your problem, but I would not expect it to
> > > work. Then again, I have had stranger suggestions that worked.
> > >
> > > I would also profile the running of the grammar to see which part
> > > of the grammar is using too much memory and try altering the
> > > grammar and/or adding actions to correct the problem.
> > >
> > > Usually one wants the entire AST before doing analysis, so I am
> > > curious as to what you would do with the AST nodes being processed
> > > as a stream instead of as a DOM.
> > >
> > > As a worst case, you could override parts of the ANTLR parser
> > > with hand-written code or, even worse, switch to a different type
> > > of parser, e.g. LR, parser combinator, or fully hand-written
> > > recursive descent.
> > >
> > > You can also contract for support from Ter.
> > >
> > >  Eric.
> > >
> > >
> > >
> > >
> > >
> > >> I ask because we are running into a problem with an extremely
> > >> large file being passed into our Antlr parser and it is causing
> > >> memory exhaustion in the parsing phase. I am thinking that using
> > >> a TreeNodeStream would solve this problem if it is even possible.
> > >> --
> > >> Burton Samograd
> > >>
> > >>
> > >> ________________________________
> > >> This e-mail, including accompanying communications and
> > >> attachments, is strictly confidential and only for the intended
> > >> recipient. Any retention, use or disclosure not expressly
> > >> authorised by Markit is prohibited. This email is subject to all
> > >> waivers and other terms at the following link:
> > >> http://www.markit.com/en/about/legal/email-disclaimer.page
> > >>
> > >> Please visit http://www.markit.com/en/about/contact/contact-us.page?
> > >> for contact information on our offices worldwide.
> > >>
> > >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > >> Unsubscribe:
> > >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > >>
> > >
> > >
> >
>
>
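
The "fully hand-written recursive descent" option mentioned in the quoted reply can be sketched as follows. This is only an illustration in Java (the language of the quoted example), not ANTLR code and not the C target: the class name StreamingCalc, the toy grammar (arithmetic expressions terminated by ';'), and the evaluate method are all assumptions, since the actual Calc grammar is not shown in this thread. The point is that tokens are pulled from the Reader on demand and each statement is handed off as soon as it is complete, so memory use is bounded by the largest single statement rather than by the file.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.DoubleConsumer;

/** Hypothetical sketch: evaluates "expr ;" statements one at a time from a Reader. */
final class StreamingCalc {
    private static final int NUM = -2, EOF = -1; // non-char token kinds
    private final PushbackReader in;
    private int tok;     // current token: NUM, EOF, or a single char such as '+'
    private double num;  // numeric value when tok == NUM

    private StreamingCalc(Reader r) { in = new PushbackReader(r, 1); }

    /** Reads statements lazily; only the current statement's state is kept. */
    static void evaluate(Reader r, DoubleConsumer sink) throws IOException {
        StreamingCalc p = new StreamingCalc(r);
        p.next();
        while (p.tok != EOF) {
            double v = p.expr();
            p.expect(';');
            sink.accept(v); // hand off the result before reading further input
        }
    }

    /** Lexer: reads exactly one token from the underlying Reader. */
    private void next() throws IOException {
        int c = in.read();
        while (c != -1 && Character.isWhitespace(c)) c = in.read();
        if (c == -1) { tok = EOF; return; }
        if (Character.isDigit(c) || c == '.') {
            StringBuilder sb = new StringBuilder();
            while (c != -1 && (Character.isDigit(c) || c == '.')) {
                sb.append((char) c);
                c = in.read();
            }
            if (c != -1) in.unread(c); // give back the char that ended the number
            num = Double.parseDouble(sb.toString()); // malformed numbers throw
            tok = NUM;
            return;
        }
        tok = c;
    }

    // expr : term (('+' | '-') term)*
    private double expr() throws IOException {
        double v = term();
        while (tok == '+' || tok == '-') {
            int op = tok;
            next();
            double r = term();
            v = (op == '+') ? v + r : v - r;
        }
        return v;
    }

    // term : atom (('*' | '/') atom)*
    private double term() throws IOException {
        double v = atom();
        while (tok == '*' || tok == '/') {
            int op = tok;
            next();
            double r = atom();
            v = (op == '*') ? v * r : v / r;
        }
        return v;
    }

    // atom : NUM | '(' expr ')'
    private double atom() throws IOException {
        if (tok == NUM) { double v = num; next(); return v; }
        expect('(');
        double v = expr();
        expect(')');
        return v;
    }

    private void expect(int c) throws IOException {
        if (tok != c) throw new IllegalStateException("expected '" + (char) c + "'");
        next();
    }

    public static void main(String[] args) throws IOException {
        // Sample input; a FileReader or InputStreamReader works the same way.
        evaluate(new StringReader("1 + 2 * 3; (4 - 1) / 2;"), System.out::println);
    }
}
```

Whether this shape fits a real grammar depends on the input having natural statement boundaries at which results can be emitted and state discarded. If the checker needs the whole tree at once, streaming of this kind does not help.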



More information about the antlr-interest mailing list