[antlr-interest] Why does Lexer of C++ run time target eat so much memory

Jim Idle jimi at temporal-wave.com
Tue Dec 16 16:09:41 PST 2008


On Tue, 16 Dec 2008 15:54:58 -0800, chain one <chainone at gmail.com> wrote:

> Still waiting for help
> I just wanna know, if c runtime target is suitable for large input?

Yes.


>
> On 12/16/08, chain one <chainone at gmail.com> wrote:
>> Hi,
>> These days I am writing a parser for a kind of data file using C++. The
>> format of the data file is simple, so the rules are simple.
>> But when I feed a data file of about 20M to the parser, the parser eats
>> almost 600M+ of memory.
>> I am surprised by this result, and I found that most of the memory and time
>> were consumed by the Lexer.

There is possibly something not quite right with this then. 

However, a 20M input file is going to generate a lot of tokens, and you need all of the tokens in order to parse the input, so you will use a lot of memory, especially if many of your tokens are only a few characters long. If every token were a single character you would need 20M tokens; that is the worst case, and your case will be something less than this.
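To put rough numbers on that, here is a minimal sketch of the arithmetic. The function name and the per-token byte cost are assumptions made for illustration only; the real size of a token depends on the runtime version and platform, so measure it rather than trusting the figure used here.

```cpp
#include <cstdint>

// Rough estimate of token-stream memory.
// bytesPerToken is an ASSUMED figure supplied by the caller, not a value
// taken from the ANTLR runtime.
std::uint64_t estimateTokenMemory(std::uint64_t inputBytes,
                                  std::uint64_t avgTokenChars,
                                  std::uint64_t bytesPerToken)
{
    if (avgTokenChars == 0) {
        return 0;  // avoid dividing by zero on nonsensical input
    }
    // Number of tokens, rounded up.
    std::uint64_t tokens = (inputBytes + avgTokenChars - 1) / avgTokenChars;
    return tokens * bytesPerToken;
}
```

With a hypothetical 64 bytes per token, estimateTokenMemory(20000000, 1, 64) gives 1,280,000,000 bytes for the one-character worst case, and an average token length of 4 characters brings that down to 320,000,000 bytes. The point is only that the token count, not the file size, drives the memory use.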

One way to reduce the number of tokens is to use the SKIP(); macro on tokens that you don't need the parser to see, such as ',' or ' ' and so on. Otherwise they are sitting in your token stream for no reason. Only mark them as hidden and keep them in the token stream if you will need to examine them later. Otherwise SKIP them.
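To see how much skipping separators might save, a toy scanner like the one below can be run over a sample of the input. This is not the ANTLR lexer and uses none of its API; the struct, the function name, and the tokenization rules (runs of alphanumerics form one token, commas and whitespace are separators, anything else is a one-character token) are all assumptions made for the sake of the illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: tokens the parser would see versus
// separator tokens that could be skipped in the lexer.
struct TokenCounts {
    std::size_t kept;
    std::size_t skipped;
};

TokenCounts countTokens(const std::string& input)
{
    TokenCounts counts{0, 0};
    std::size_t i = 0;
    while (i < input.size()) {
        unsigned char ch = static_cast<unsigned char>(input[i]);
        if (std::isalnum(ch)) {
            // Consume a whole run of letters/digits as one token.
            while (i < input.size() &&
                   std::isalnum(static_cast<unsigned char>(input[i]))) {
                ++i;
            }
            ++counts.kept;
        } else if (ch == ',' || std::isspace(ch)) {
            // Separator: one token each if kept in the stream.
            ++counts.skipped;
            ++i;
        } else {
            // Any other character is a one-character token.
            ++counts.kept;
            ++i;
        }
    }
    return counts;
}
```

For the input "a, b, c" this counts 3 kept tokens and 4 separators, so in that toy case more than half of the tokens in the stream would be ones the parser never needs.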

>>
>> Is there anything wrong with my grammar, or is it a performance issue of
>> the ANTLR3 C++ runtime?
>> I hope there is some way to get my parser more lightweight.
>>
>> I attached the .g file to this mail and the data file(.txt format) 
>> could be

I must have missed the original emails, sorry about that. Can you resend me your .g file?

Jim
>> got from this link:https://download.yousendit.com/Q01FSU5ONEhZY1IzZUE9PQ
>>
>> The test main function is:
>> ========================================
>> int main(int argc, char * argv[])
>> {


