[antlr-interest] ANTLR running out of memory while parsing huge files

Nick Vlassopoulos nvlassopoulos at gmail.com
Tue Apr 21 08:52:08 PDT 2009


Andreas, Jim,

Yes, this seems like the right way to do it, since the actual "body data" is
pretty trivial!
I'll try it the way you suggested!

Again, thanks for your replies!

Nikos

On Tue, Apr 21, 2009 at 4:11 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> Nick Vlassopoulos wrote:
> > Hi Jim!
> >
> > Thanks for your replies!!
> >
> > The input lines are of the form
> > "var = data"
> > so they are pretty simple!
> > If I got this right, you suggest using something like a
> > body_set :
> >    body_start (probably a "greedy" option here?) body_end
> > rule and then just add code to parse the intermediate lines (which are
> > pretty simple) manually??
> Actually, do you need a parser? Perhaps you can do this all in the lexer
> and not create tokens for the data but just use the input stream in your
> own lexer action code.
>
> But I was thinking this:
>
> 1) Copy my input stream code and name it for yourself;
> 2) Have it respond to LA() using buffered reads until it finds the token
> that starts the body, say it is 'BODY', then it returns EOF;
> 3) Invoke the parser/lexer/inputstream stack and it will set up the
> information you need for the incoming data and stop, the input stream
> remembers where it was;
> 4) Process the data using a little custom C code that works with the
> input stream until you see the data has ended, tell the input stream
> where to restart;
> 5) Tell the input stream to set up for the next header starting at the
> data end location. If it wasn't at real EOF, then go to 3)
> 6) End
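The six steps above can be sketched in plain C roughly as follows. This is only an illustration under several assumptions: it does not use the ANTLR C runtime at all, the names window_stream, ws_open_header, ws_la, ws_consume, totals and process are made up for the example, the whole input sits in one NUL-terminated string instead of being read through buffered file reads, and the 'END' line that closes a body is a hypothetical marker (only 'BODY' is suggested in the message). The header "parser" is a stand-in loop that just consumes characters until the stream reports EOF.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WS_EOF (-1)

/* Hypothetical stand-in for a custom input stream. */
typedef struct {
    const char *data;   /* whole input; a real version would use buffered reads */
    size_t      len;    /* total length of the input */
    size_t      pos;    /* current read position */
    size_t      limit;  /* position at which the stream pretends to be at EOF */
} window_stream;

/* Steps 2 and 5: start a header at `start` and fake EOF at the next BODY line.
 * Simplification: a line that merely ends in "BODY" would also match. */
static void ws_open_header(window_stream *ws, size_t start) {
    const char *m = strstr(ws->data + start, "BODY\n");
    ws->pos = start;
    ws->limit = m ? (size_t)(m - ws->data) : ws->len;
}

/* Lookahead, 1-based like LA(): returns EOF once the fake limit is reached. */
static int ws_la(const window_stream *ws, size_t i) {
    assert(i >= 1);
    size_t p = ws->pos + i - 1;
    return p < ws->limit ? (unsigned char)ws->data[p] : WS_EOF;
}

static void ws_consume(window_stream *ws) {
    if (ws->pos < ws->limit) ws->pos++;
}

typedef struct { size_t headers, header_chars, body_lines; } totals;

static totals process(const char *text) {
    window_stream ws = { text, strlen(text), 0, 0 };
    totals t = { 0, 0, 0 };
    size_t start = 0;

    while (start < ws.len) {
        /* Step 3: the parser/lexer would run here; it stops at the fake EOF. */
        ws_open_header(&ws, start);
        while (ws_la(&ws, 1) != WS_EOF) { ws_consume(&ws); t.header_chars++; }
        t.headers++;
        if (ws.limit == ws.len) break;          /* real EOF, no body follows */

        /* Step 4: walk the body lines by hand until the (assumed) END line. */
        const char *p = ws.data + ws.limit + strlen("BODY\n");
        while (*p != '\0' && strncmp(p, "END\n", 4) != 0) {
            const char *nl = strchr(p, '\n');
            t.body_lines++;
            if (nl == NULL) { p += strlen(p); break; }
            p = nl + 1;
        }
        if (*p != '\0') p += 4;                 /* skip the END line */

        /* Step 5: the next header starts where the data ended. */
        start = (size_t)(p - ws.data);
    }
    return t;
}
```

Calling process() on an input with two header/body sections returns the number of headers seen, the header characters handed to the stand-in parser, and the body lines that were walked without creating any tokens. Only the header text ever passes through ws_la, which is the point of the approach: the large bodies never reach the lexer or parser.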
>
> It sounds more complicated written in an email than it will be in the C
> code ;-) You can also do the same thing without a custom input stream,
> but then you would be reading the entire file and pre-scanning and so on.
>
> If your headers are pretty simple, you might also find that an awk
> script or just plain C code is a better method ;-)
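As a rough idea of what the "just plain C code" route could look like for lines of the form "var = data", here is a minimal sketch. The function names parse_assignment and copy_trimmed are invented for this example, and it assumes one assignment per line, a single '=' separating name from data, and that an empty name or empty data counts as a malformed line.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy [begin, end) into out with surrounding whitespace trimmed.
 * Returns 0 if the trimmed text is empty or does not fit in cap bytes. */
static int copy_trimmed(const char *begin, const char *end,
                        char *out, size_t cap) {
    while (begin < end && isspace((unsigned char)*begin)) begin++;
    while (end > begin && isspace((unsigned char)end[-1])) end--;
    size_t n = (size_t)(end - begin);
    if (n == 0 || n >= cap) return 0;
    memcpy(out, begin, n);
    out[n] = '\0';
    return 1;
}

/* Split one "var = data" line at the first '='.
 * Returns 1 on success, 0 if the line is malformed or a buffer is too small. */
static int parse_assignment(const char *line,
                            char *name, size_t name_cap,
                            char *value, size_t value_cap) {
    assert(line != NULL && name != NULL && value != NULL);
    const char *eq = strchr(line, '=');
    if (eq == NULL) return 0;
    return copy_trimmed(line, eq, name, name_cap)
        && copy_trimmed(eq + 1, line + strlen(line), value, value_cap);
}
```

A caller would read the body one line at a time (for example with fgets), pass each line to parse_assignment, and handle the name/value pair immediately, so memory use stays constant regardless of file size.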
>
> Jim