[antlr-interest] ANTLR running out of memory while parsing huge files

Andreas Meyer andreas.meyer at smartshift.de
Tue Apr 21 07:53:55 PDT 2009


No. In any case, you should avoid parsing the whole file in one go! That
would basically mean loading the whole file into main memory. Use a lexer,
a custom tokenizer, or whatever else works for separating the entries in
the body section. Then invoke the parser for each line, possibly reusing
the existing instance.
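
A minimal sketch of the line-by-line idea, in plain Python and without any
ANTLR API. The marker names BODY_START and BODY_END and both function names
are made up for illustration; parse_entry only stands in for wherever you
would call your generated parser on a single line. It assumes the body
consists of "var = data" lines, one entry per line.

```python
import io
from typing import Iterator, TextIO, Tuple


def parse_entry(line: str) -> Tuple[str, str]:
    """Split one 'var = data' line into (name, value).

    Hypothetical stand-in for invoking a real parser on a single line.
    """
    name, sep, value = line.partition("=")
    if not sep or not name.strip():
        raise ValueError(f"not a 'var = data' line: {line!r}")
    return name.strip(), value.strip()


def iter_body_entries(stream: TextIO,
                      start: str = "BODY_START",   # assumed marker name
                      end: str = "BODY_END"        # assumed marker name
                      ) -> Iterator[Tuple[str, str]]:
    """Yield body entries one at a time, holding only the current line."""
    in_body = False
    for raw in stream:              # text streams are read lazily, line by line
        line = raw.strip()
        if not in_body:
            in_body = (line == start)   # skip everything before the body
        elif line == end:
            return                      # stop at the end of the body
        elif line:                      # ignore blank lines inside the body
            yield parse_entry(line)


if __name__ == "__main__":
    sample = io.StringIO("header\nBODY_START\nx = 1\ny = hello\nBODY_END\n")
    for name, value in iter_body_entries(sample):
        print(name, value)
```

With a real file you would pass the object returned by open() instead of
the StringIO; memory use then depends on the longest line, not on the file
size.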

Best,
Andreas

Nick Vlassopoulos wrote:
> Hi Jim!
>
> Thanks for your replies!!
>
> The input lines are of the form
> "var = data"
> so they are pretty simple!
> If I got this right, you suggest using something like a
> body_set :
>    body_start (probably a "greedy" option here?) body_end
> rule and then just add code to parse the intermediate lines (which are 
> pretty simple) manually??
>
> Thanks!
>
> Nikos
>
> On Tue, Apr 21, 2009 at 3:31 PM, Jim Idle <jimi at temporal-wave.com 
> <mailto:jimi at temporal-wave.com>> wrote:
>
>     Nick Vlassopoulos wrote:
>     > Hi Andreas,
>     >
>     > Thanks for your fast reply!
>     > So it should be something like a "line parser" that's instantiated
>     > for each line of the BODY section!
>     >
>     No - you don't want to do this really, you will create millions of
>     malloc/free calls - go with the custom input stream I mentioned and
>     you will be fine. It sounds like you can easily pick out the faked
>     EOF points without parsing them.
>
>     What is the input? If it is just millions of data elements, then you
>     could parse the headers, then have the input stream traverse the data
>     points with a little custom code, until the next header is seen.
>
>     Jim
>
>     List: http://www.antlr.org/mailman/listinfo/antlr-interest
>     Unsubscribe:
>     http://www.antlr.org/mailman/options/antlr-interest/your-email-address


