[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Burton Samograd burton.samograd at markit.com
Wed Apr 18 10:19:50 PDT 2012


Jim,

Personally, I think we should be using flex/bison because I don't think that Antlr is the right tool for the job.  Our initial requirements have been exceeded by over 2 orders of magnitude and I just don't think that the Antlr architecture is up for the task, and I feel that we might need to be able to handle another order of magnitude on top of that in the not so near future.  For what we are creating I don't think we need all the features that Antlr provides, but it was chosen before I got to my new job and before these new requirements came up.
--
Burton Samograd

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: Monday, April 16, 2012 2:52 PM
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

3.1? You should try 3.3 or maybe 3.4

However, your best bet is to partition your input. I have suggested some ways to do that in past posts but if your input is not a calculator, then I can't be any more specific. However, the fact that you can process the TreeNodeStream (which is what is produced anyway) suggests that you can partition this up before the lexer.
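
A minimal sketch of what "partition before the lexer" could look like, written in Java to match the Calc example quoted below (the same idea would apply to the C target). The names `InputPartitioner` and `forEachChunk` are hypothetical. The sketch assumes that independent records are separated by blank lines, which is not something the thread states, so the real boundary has to come from the actual input format. The sketch only does the splitting: the handler marks the place where a lexer, token stream, parser and tree walker would be built for one chunk at a time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Sketch: split a large input into independent chunks before any lexer sees it.
 * Assumption (not from the thread): chunks are separated by blank lines.
 */
public class InputPartitioner {

    /**
     * Reads the input line by line and passes each blank-line-delimited chunk
     * to the handler. Only the current chunk is held in memory.
     * Returns the number of chunks handled.
     */
    public static int forEachChunk(Reader in, Consumer<String> handler) {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder chunk = new StringBuilder();
        int count = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    // A blank line closes the current chunk, if there is one.
                    if (chunk.length() > 0) {
                        handler.accept(chunk.toString());
                        count++;
                        chunk.setLength(0);
                    }
                } else {
                    chunk.append(line).append('\n');
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Flush the last chunk when the input does not end with a blank line.
        if (chunk.length() > 0) {
            handler.accept(chunk.toString());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "a = 1\nb = a + 2\n\n\nc = 3\n";
        List<String> seen = new ArrayList<>();
        // In a real program the handler would run the lexer, parser and tree
        // walker on this chunk only, then drop them before the next chunk.
        int n = forEachChunk(new StringReader(input), seen::add);
        System.out.println(n + " chunks: " + seen);
    }
}
```

Whether this helps depends on the grammar: it only works if a chunk can be parsed and checked without seeing the chunks around it.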

Jim



> -----Original Message-----
> From: Burton Samograd [mailto:burton.samograd at markit.com]
> Sent: Monday, April 16, 2012 1:48 PM
> To: Eric; Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: RE: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> Antlr 3.1.3, C target wrapped for a C++ program.
>
> --
> Burton Samograd
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Eric
> Sent: Monday, April 16, 2012 2:46 PM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Can an Antlr Parser return a
> TreeNodeStream so as to not have to parse the whole file at once?
>
> Nice catch Jim.
>
> Burton, which ANTLR target are you using? e.g. ANTLR 3.4 C# 3, ANTLR
> 3.4 Java, ...
>
> On Mon, Apr 16, 2012 at 4:41 PM, Jim Idle <jimi at temporal-wave.com>
> wrote:
>
> > This isn't the C target unless someone added 'new' to the ANSI C
> > standard when I was not looking.
> >
> > Jim
> >
> > > -----Original Message-----
> > > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > > bounces at antlr.org] On Behalf Of Eric
> > > Sent: Monday, April 16, 2012 1:12 PM
> > > To: Burton Samograd
> > > Cc: antlr-interest at antlr.org
> > > Subject: Re: [antlr-interest] Can an Antlr Parser return a
> > > TreeNodeStream so as to not have to parse the whole file at once?
> > >
> > > > I just noticed that you are using an earlier version of the C
> > > > target. There have been lots of messages here about running out of
> > > > memory with that version. Check the mailing list for old posts. I
> > > > don't use the C target, but Jim Idle created it, is the expert on
> > > > it, and is here regularly, so he might jump in on this. Anything he
> > > > suggests is worth the trouble of looking into, even if it means a
> > > > few days of work.
> > >
> > > Eric
> > >
> > > > On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:
> > >
> > > >
> > > >
> > > > On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> > > > burton.samograd at markit.com> wrote:
> > > >
> > > >> Hello,
> > > > >> In the following Antlr example, the parser is used to generate an
> > > > >> AST which is then converted into a CommonTreeNodeStream, which is
> > > > >> then passed to the checker.
> > > > >>
> > > > >> public static void main(String[] args) throws Exception {
> > > > >>     CalcLexer lex = new CalcLexer(new ANTLRInputStream(System.in));
> > > > >>     CommonTokenStream tokens = new CommonTokenStream(lex);
> > > > >>     CalcParser parser = new CalcParser(tokens);
> > > > >>
> > > > >>     // Parse the whole input and build the complete AST.
> > > > >>     CalcParser.program_result result = parser.program();
> > > > >>     CommonTree tree = (CommonTree) result.getTree();
> > > > >>
> > > > >>     // First pass over the tree: the checker.
> > > > >>     CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
> > > > >>     CalcChecker checker = new CalcChecker(nodes);
> > > > >>     checker.program();
> > > > >>
> > > > >>     // Second pass needs a fresh node stream over the same tree.
> > > > >>     nodes = new CommonTreeNodeStream(tree);
> > > > >>     CalcInterpreter interpreter = new CalcInterpreter(nodes);
> > > > >>     interpreter.program();
> > > > >> }
> > > > >>
> > > > >> Is it possible to get the parser to return a CommonTreeNodeStream
> > > > >> that can then be passed to the Checker, so that the whole file does
> > > > >> not have to be lexed and parsed at once but is instead handled as a
> > > > >> stream of tokens and then tree nodes?
> > > >>
> > > > > If I am understanding this correctly, you want to do partial
> > > > > parsing and then generate a partial AST because the file is
> > > > > too large. Since the lexer has to lex/scan the entire text file
> > > > > to create the tokens for the parser, you cannot do a partial
> > > > > lexing of the input.
> > > >
> > > > > Ter did something with scannerless parsing several months ago.
> > > > > Since I never worked with it I cannot say it will help, but it is
> > > > > something I personally would look into for your problem without
> > > > > expecting it to work. I have had stranger suggestions that worked.
> > > >
> > > > I would also profile the running of the grammar to see which
> > > > part of the grammar is using too much memory and try altering
> > > > the grammar and/or adding actions to correct the problem.
> > > >
> > > > > Usually one wants the entire AST before doing analysis, so I am
> > > > > curious as to what you would do with the AST tokens being
> > > > > processed as a stream instead of a DOM.
> > > >
> > > > > As a worst case, you could switch to overriding parts of the
> > > > > ANTLR parser with hand-written code, or even worse, switch to a
> > > > > different type of parser, i.e. LR, parser combinator, or fully
> > > > > hand-written recursive descent.
> > > >
> > > > You can also contract for support from Ter.
> > > >
> > > >  Eric.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > >> I ask because we are running into a problem with an extremely
> > > > >> large file being passed into our Antlr parser and it is causing
> > > > >> memory exhaustion in the parsing phase. I am thinking that
> > > > >> using a TreeNodeStream would solve this problem, if it is even
> > > > >> possible.
> > > >> --
> > > >> Burton Samograd
> > > >>
> > > >>
> > > > >> ________________________________
> > > > >> This e-mail, including accompanying communications and
> > > > >> attachments, is strictly confidential and only for the intended
> > > > >> recipient. Any retention, use or disclosure not expressly
> > > > >> authorised by Markit is prohibited. This email is subject to all
> > > > >> waivers and other terms at the following link:
> > > > >> http://www.markit.com/en/about/legal/email-disclaimer.page
> > > > >>
> > > > >> Please visit http://www.markit.com/en/about/contact/contact-us.page?
> > > > >> for contact information on our offices worldwide.
> > > > >>
> > > > >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > >> Unsubscribe:
> > > > >> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > > >>
> > > >
> > > >
> > >


