[antlr-interest] Can an Antlr Parser return a TreeNodeStream so as to not have to parse the whole file at once?

Eric researcher0x00 at gmail.com
Mon Apr 16 13:12:00 PDT 2012


I just noticed that you are using an earlier version of the C target. There
have been lots of messages here about running out of memory with that
version. Check the mailing list archives for old posts. I don't use the C
target, but Jim Idle created it, is the expert on it, and is here regularly,
so he might jump in on this. Anything he suggests is worth the trouble of
looking into, even if it means a few days of work.
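
One workaround for the "cannot do a partial lexing" point in my earlier
reply (quoted below), as a rough sketch only: if the top-level statements in
your input are independent of each other, you could split the input at
statement boundaries before it ever reaches the lexer, and then run the
lexer, parser and tree walker on one statement at a time. The Java below
uses no ANTLR classes at all. StatementChunker, StatementHandler and
forEachStatement are names I made up for illustration, and the sketch
assumes every statement ends with ';' and that ';' never appears inside a
string literal or comment.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical illustration only: none of these names come from ANTLR.
 * Assumes each top-level statement ends with ';' and can be processed
 * independently of the others.
 */
public class StatementChunker {

    /** Hypothetical callback that would lex, parse and check one statement. */
    public interface StatementHandler {
        void handle(String statementText);
    }

    /**
     * Reads characters up to each ';' and hands the trimmed statement text
     * to the handler, so only one statement's text is buffered at a time.
     * Returns the number of non-empty statements seen.
     */
    public static int forEachStatement(Reader in, StatementHandler handler)
            throws IOException {
        StringBuilder current = new StringBuilder();
        int count = 0;
        int c;
        while ((c = in.read()) != -1) {
            if (c == ';') {
                String text = current.toString().trim();
                current.setLength(0);          // drop the finished statement
                if (!text.isEmpty()) {
                    handler.handle(text);
                    count++;
                }
            } else {
                current.append((char) c);
            }
        }
        // Accept a final statement that is missing its terminating ';'.
        String tail = current.toString().trim();
        if (!tail.isEmpty()) {
            handler.handle(tail);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        List<String> seen = new ArrayList<>();
        // Here the handler only records the text it was given.
        int n = forEachStatement(
                new StringReader("a = 1 + 2;\nprint a;\n"), seen::add);
        System.out.println(n + " statements: " + seen);
    }
}
```

In real use the handler would build a fresh lexer and parser over just that
statement's text and walk the small AST it produces, so only one small tree
is alive at a time. Whether that is possible depends entirely on your
grammar; if later statements need information from earlier ones, the handler
would have to carry that state itself.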

Eric

On Mon, Apr 16, 2012 at 3:47 PM, Eric <researcher0x00 at gmail.com> wrote:

>
>
> On Mon, Apr 16, 2012 at 3:03 PM, Burton Samograd <
> burton.samograd at markit.com> wrote:
>
>> Hello,
>> In the following Antlr example, the parser is used to generate an AST
>> which is then converted into a CommonTreeNodeStream, which is then passed
>> to the checker.
>> public static void main(String[] args) throws Exception {
>>
>>     // Lex and parse the whole input into an AST.
>>     CalcLexer lex = new CalcLexer(
>>                         new ANTLRInputStream(System.in));
>>     CommonTokenStream tokens = new CommonTokenStream(lex);
>>     CalcParser parser = new CalcParser(tokens);
>>
>>     CalcParser.program_result result = parser.program();
>>     CommonTree tree = (CommonTree) result.getTree();
>>
>>     // First tree walk: check the AST.
>>     CommonTreeNodeStream checkerNodes = new CommonTreeNodeStream(tree);
>>     CalcChecker checker = new CalcChecker(checkerNodes);
>>     checker.program();
>>
>>     // Second tree walk: interpret the same AST via a fresh node stream.
>>     CommonTreeNodeStream interpreterNodes = new CommonTreeNodeStream(tree);
>>     CalcInterpreter interpreter = new CalcInterpreter(interpreterNodes);
>>     interpreter.program();
>> }
>> Is it possible to get the parser to return a CommonTreeNodeStream that
>> can then be passed to the Checker, so that the whole file does not have
>> to be lexed and parsed at once but is instead processed as a stream of
>> tokens and then tree nodes?
>>
> If I am understanding this correctly, you want to do partial parsing and
> then generate a partial AST because the file is too large. Since the lexer
> has to lex/scan the entire text file to create the tokens for the parser,
> you cannot do a partial lexing of the input.
>
> Ter did something with scannerless parsing several months ago. I have never
> worked with it, so I cannot say whether it will help. It is something I
> personally would look into for your problem, though I would not expect it
> to work. I have had stranger suggestions that worked.
>
> I would also profile the running of the grammar to see which part of the
> grammar is using too much memory and try altering the grammar and/or adding
> actions to correct the problem.
>
> Usually one wants the entire AST before doing analysis, so I am curious as
> to what you would do with the AST nodes being processed as a stream
> instead of a DOM.
>
> As a worst case, you could override parts of the ANTLR parser with
> hand-written code or, even worse, switch to a different type of parser,
> e.g. LR, parser combinator, or fully hand-written recursive descent.
>
> You can also contract for support from Ter.
>
>  Eric.
>
>
>
>
>
>> I ask because we are running into a problem with an extremely large file
>> being passed into our Antlr parser and it is causing memory exhaustion in
>> the parsing phase. I am thinking that using a TreeNodeStream would solve
>> this problem if it is even possible.
>> --
>> Burton Samograd
>
>

