[antlr-interest] C target memory usage

A Z asicaddress at gmail.com
Tue Jan 10 08:07:46 PST 2012


Here are all the changes I made. IIRC, the setText/getText functions have
many dependencies, so it wasn't as simple as a search-and-replace to change
those. The startIndex/stopIndex functions are used by the generated code, so
I left those alone.
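
For anyone who can't open the attachment, the sketch below shows the general
shape of the change. It is illustrative only - it is not the contents of
smallToken.tar.gz, the real token is ANTLR3_COMMON_TOKEN in
antlr3commontoken.h with far more fields than shown, and the demo_* names
are made up for the example:

#include <stdint.h>

/* Roughly the shape of the stock token: a handful of data fields plus one
 * function pointer per accessor. On a 64-bit build each pointer is 8 bytes,
 * which is how the struct ends up at 200+ bytes. */
typedef struct demo_fat_token
{
    uint32_t    type;
    uint32_t    channel;
    uint32_t    line;
    int32_t     charPositionInLine;
    int64_t     start;              /* startIndex */
    int64_t     stop;               /* stopIndex  */

    uint32_t  (*getType)    (struct demo_fat_token *tok);
    void      (*setType)    (struct demo_fat_token *tok, uint32_t type);
    uint32_t  (*getChannel) (struct demo_fat_token *tok);
    void      (*setChannel) (struct demo_fat_token *tok, uint32_t channel);
    /* ...and a dozen or so more accessor pairs in the real struct... */
}
demo_fat_token;

/* The slimmed token keeps only the data fields: */
typedef struct demo_slim_token
{
    uint32_t    type;
    uint32_t    channel;
    uint32_t    line;
    int32_t     charPositionInLine;
    int64_t     start;
    int64_t     stop;
}
demo_slim_token;

/* Call sites change from indirect calls to direct field access, e.g.
 *     tok->getType(tok)        becomes   tok->type
 *     tok->setChannel(tok, c)  becomes   tok->channel = c;
 * getText/setText have too many dependants to swap mechanically, and the
 * generated code still calls the startIndex/stopIndex accessors, which is
 * why those two were left alone. */

A quick sizeof() printout of the two structs is an easy way to see how much
each removed pointer buys you on your platform.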


On Tue, Jan 10, 2012 at 2:01 PM, Richard Gildea <rgildea at gmail.com> wrote:

> Hi,
>
> Could you possibly give more details about the modifications you made? I
> found it was possible to remove the user1, user2, user3 fields and the
> custom function pointer with only minimal changes in other source files.
> This gave approximately a 10 percent reduction in memory usage. Removing
> function pointers looks to be a lot more involved to me.
>
> Cheers,
>
> Richard
>
>
> On 23 December 2011 19:09, A Z <asicaddress at gmail.com> wrote:
>
>> Hi Richard,
>>
>>   I see about 140:1 for the ratio of memory use to input size on a 64-bit
>> system. This is after I hacked commontoken to remove most of the function
>> pointers, which halved the size of the tokens. I didn't investigate any
>> further as I recall an email thread about ANTLR 4 indicating it would use
>> 8-byte tokens instead of the 200+ byte tokens 3.4 uses.
>>
>>
>> Looks like it may only be for C#.
>> http://markmail.org/message/eggfcjt3a6qdzkvc
>>
>> Ad
>>
>>
>> On Fri, Dec 23, 2011 at 10:00 AM, Richard Gildea <rgildea at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> We have been successfully using ANTLR, in the form of the C target, for
>>> some time; however, we have recently noticed that the memory consumption
>>> can be quite large - up to 150 times the size of the input file. Is this
>>> factor of ~150 to be expected, or does it indicate that we may be doing
>>> something wrong? For the vast majority of possible inputs this does not
>>> cause a problem, but some input files can be as large as 0.5 GB, giving a
>>> peak memory usage of 75 GB - not exactly feasible on most machines!
>>>
>>> Does anyone have any examples of using a custom lexer that provides a
>>> token
>>> buffer rather than storing all tokens in memory?
>>>
>>> Cheers,
>>>
>>> Richard
>>>
>>>
>>
>>
>
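
On the token buffer question quoted above: the snippet below is only a
concept sketch of a fixed-size token window that lexes lazily and recycles
slots as the parser consumes - none of the names in it come from the ANTLR3
C API, and hooking the idea into the C target would mean supplying your own
token stream in place of the default one, which keeps every token of the
file in memory.

#include <stddef.h>
#include <stdint.h>

#define WINDOW 64   /* must cover the parser's maximum lookahead/rewind */

typedef struct demo_token
{
    uint32_t type;
    int64_t  start;   /* offset of the token's first character in the input */
    int64_t  stop;    /* offset of its last character                       */
} demo_token;

typedef struct demo_token_window
{
    demo_token  ring[WINDOW];
    size_t      next_index;                /* next token to hand to the parser */
    size_t      lexed;                     /* how many tokens have been lexed  */
    demo_token (*lex_next)(void *lexer);   /* hypothetical "run the lexer once" */
    void       *lexer;
} demo_token_window;

/* LT(k): relative lookahead, lexing lazily only as far as needed. */
static demo_token *window_lt(demo_token_window *w, size_t k)
{
    size_t wanted = w->next_index + k - 1;  /* absolute index of the wanted token */
    while (w->lexed <= wanted)
    {
        w->ring[w->lexed % WINDOW] = w->lex_next(w->lexer);
        w->lexed++;
    }
    return &w->ring[wanted % WINDOW];
}

/* consume(): advance the cursor; old slots are overwritten when the ring
 * wraps, so memory stays at WINDOW tokens no matter how big the input is. */
static void window_consume(demo_token_window *w)
{
    w->next_index++;
}

The catch is that anything that wants a token after its slot has been
recycled (error messages pointing at an old token, tree nodes that keep
token pointers, backtracking past the window) will see stale data, so the
window either has to cover the worst case or such tokens have to be copied
out before they are overwritten.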
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smallToken.tar.gz
Type: application/x-gzip
Size: 53641 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20120110/4bdfccb5/attachment.gz 

