[antlr-interest] Why does Lexer of C++ run time target eat so much memory

chain one chainone at gmail.com
Mon Dec 15 23:32:25 PST 2008


Hi,
I have been writing a parser in C++ for a kind of data file. The format of
the data file is simple, so the grammar rules are simple too.
However, when I feed the parser a data file of about 20 MB, it uses more
than 600 MB of memory.
I was surprised by this result, and I found that most of the memory and time
was consumed by the lexer.

Is there something wrong with my grammar, or is this a performance issue in
the ANTLR3 C++ runtime?
I hope there is some way to make my parser more lightweight.

I have attached the .g file to this mail, and the data file (.txt format) can
be downloaded from this link: https://download.yousendit.com/Q01FSU5ONEhZY1IzZUE9PQ

The test main function is:
========================================
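     /* Headers generated by ANTLR from ExpressData.g */
     #include "ExpressDataLexer.h"
     #include "ExpressDataParser.h"

     int main(int argc, char * argv[])
     {
        pANTLR3_INPUT_STREAM           input;
        pExpressDataLexer               lex;
        pANTLR3_COMMON_TOKEN_STREAM    tokens;
        pExpressDataParser              parser;

        /* The data file path is expected as the first argument */
        if (argc < 2)
            return 1;

        /* Build the pipeline: file -> lexer -> token stream -> parser */
        input  = antlr3AsciiFileStreamNew          ((pANTLR3_UINT8)argv[1]);
        lex    = ExpressDataLexerNew                (input);
        tokens = antlr3CommonTokenStreamSourceNew  (ANTLR3_SIZE_HINT, TOKENSOURCE(lex));
        parser = ExpressDataParserNew               (tokens);

        /* Invoke the start rule */
        parser  ->syntax(parser);

        /* Release everything in reverse order of creation */
        parser ->free(parser);
        tokens ->free(tokens);
        lex    ->free(lex);
        input  ->close(input);

        return 0;
     }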
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ExpressData.g
Type: application/octet-stream
Size: 3384 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20081216/8c63e969/attachment.obj 

