[antlr-interest] C target memory usage

Thu Dec 22 20:00:02 PST 2011

Hi,

We have been using ANTLR with the C target successfully for some
time. However, we have recently noticed that memory consumption can be
quite large - up to 150 times the size of the input file. Is this factor of
~150 to be expected, or does it indicate that we may be doing something
wrong? For the vast majority of inputs this does not cause a
problem, but some input files can be as large as 0.5 GB, giving a peak
memory usage of 75 GB - not exactly feasible on most machines!

Does anyone have any examples of using a custom lexer that provides a token
buffer rather than storing all tokens in memory?
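
To make the question concrete, here is a minimal sketch of the kind of
bounded token buffer meant above. It is plain C and does not use the ANTLR C
runtime at all: every name in it (TokenWindow, tw_peek, tw_next, WINDOW) is
hypothetical, and the "lexer" just splits on whitespace. The idea is that only
a fixed number of small tokens exist at any time, each one recording offsets
into the input rather than owning a copy of the text.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch; none of these names come from the ANTLR C runtime. */
enum { WINDOW = 4 };            /* maximum number of tokens held at once */

typedef struct {
    size_t start;               /* offset of the token's first character */
    size_t len;                 /* length in characters; 0 marks end of input */
} Token;

typedef struct {
    const char *src;            /* NUL-terminated input text */
    size_t pos;                 /* offset of the next unread character */
    Token ring[WINDOW];         /* fixed-size lookahead window */
    size_t head;                /* ring index of the oldest buffered token */
    size_t count;               /* number of tokens currently buffered */
} TokenWindow;

static void tw_init(TokenWindow *tw, const char *src) {
    tw->src = src;
    tw->pos = 0;
    tw->head = 0;
    tw->count = 0;
}

/* Lex one whitespace-delimited token straight from the input. */
static Token tw_lex(TokenWindow *tw) {
    while (isspace((unsigned char)tw->src[tw->pos])) {
        tw->pos++;
    }
    Token t = { tw->pos, 0 };
    while (tw->src[tw->pos] != '\0' &&
           !isspace((unsigned char)tw->src[tw->pos])) {
        tw->pos++;
    }
    t.len = tw->pos - t.start;
    return t;
}

/* Look at the k-th upcoming token (k = 0 is the next one), lexing on
 * demand. Returns NULL if k does not fit in the window. */
static const Token *tw_peek(TokenWindow *tw, size_t k) {
    if (k >= WINDOW) {
        return NULL;
    }
    while (tw->count <= k) {
        tw->ring[(tw->head + tw->count) % WINDOW] = tw_lex(tw);
        tw->count++;
    }
    return &tw->ring[(tw->head + k) % WINDOW];
}

/* Consume the next token; its ring slot becomes free for reuse. */
static Token tw_next(TokenWindow *tw) {
    Token t = *tw_peek(tw, 0);
    tw->head = (tw->head + 1) % WINDOW;
    tw->count--;
    return t;
}
```

With this shape, token storage stays at WINDOW * sizeof(Token) regardless of
input size. The obvious catch is that a parser can never look ahead or rewind
further than WINDOW tokens, so it would presumably only suit grammars with
bounded lookahead and no backtracking over long spans.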

Cheers,

Richard