[antlr-interest] C target memory usage

Richard Gildea rgildea at gmail.com
Tue Jan 10 15:05:56 PST 2012


Hi,

Thanks, your changes were useful. I have managed to get a further ~30%
reduction in memory usage, although it is still quite a bit larger than
we would like - approximately 50-60 times the input file size.

Cheers,

Richard

On 10 January 2012 08:07, A Z <asicaddress at gmail.com> wrote:

> Here are all the changes I made. IIRC, the setText/getText functions have
> many dependencies, so it wasn't as easy as a search-and-replace to change
> those. The startIndex/stopIndex functions are used by the generated code,
> so I left those alone.
>
>
>
> On Tue, Jan 10, 2012 at 2:01 PM, Richard Gildea <rgildea at gmail.com> wrote:
>
>> Hi,
>>
>> Could you possibly give more details about the modifications you made? I
>> found it was possible to remove the user1, user2, user3 fields and the
>> custom function pointer with only minimal changes in other source files.
>> This gave approximately a 10 percent reduction in memory usage. Removing
>> function pointers looks to be a lot more involved to me.
>>
>> Cheers,
>>
>> Richard
>>
>>
>> On 23 December 2011 19:09, A Z <asicaddress at gmail.com> wrote:
>>
>>> Hi Richard,
>>>
>>>   I see a ratio of about 140:1 for memory use to input size on a
>>> 64-bit system. This is after I hacked CommonToken to remove most of the
>>> function pointers, which halved the size of the tokens. I didn't
>>> investigate any further, as I recall an email thread about ANTLR 4
>>> indicating it would use 8-byte tokens instead of the 200+ byte tokens
>>> that 3.4 uses.
>>>
>>>
>>> Looks like it may only be for C#.
>>> http://markmail.org/message/eggfcjt3a6qdzkvc
>>>
>>> Ad
>>>
>>>
>>> On Fri, Dec 23, 2011 at 10:00 AM, Richard Gildea <rgildea at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> We have been successfully using ANTLR in the form of the C target for
>>>> some time; however, we have recently noticed that the memory
>>>> consumption can be quite large - up to 150 times the size of the input
>>>> file. Is this factor of ~150 to be expected, or does it indicate that
>>>> we may be doing something wrong? For the vast majority of possible
>>>> inputs this does not cause a problem, but some input files can be as
>>>> large as 0.5 GB, giving a peak memory usage of 75 GB - not exactly
>>>> feasible on most machines!
>>>>
>>>> Does anyone have any examples of using a custom lexer that provides a
>>>> token buffer rather than storing all tokens in memory?
>>>>
>>>> Cheers,
>>>>
>>>> Richard
>>>>
>>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>>> Unsubscribe:
>>>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>>
>>>
>>>
>>
>
