[antlr-interest] More, Status of C++ backend?

Ruslan Zasukhin sunshine at public.kherson.ua
Wed Jan 2 03:09:33 PST 2008


On 2/1/08 4:49 AM, "Mark Wright" <markwright at internode.on.net> wrote:

>> 2) I also ran into all these issues because I again profiled
>> 100K-1M INSERTs, and I see big trouble in the ANTLR 2.7.2 C++ runtime.
>> As far as I can see, the problem comes from LA(), which makes a lot of
>> NextToken() calls; each one creates a std::string, which causes a call
>> to new() (and later free()).
>> 
>> The ANTLR parser appears to run 8-10 times slower than YACC or Lemon,
>> and I think this is the reason.
>> 
>> I have checked, and it looks like a few hard days of work to try to
>> remove std::string from the ANTLR 2.7.2 C++ runtime and use instead
>> just a pair  { char*, length }
>>     where char* points right into the string we are parsing.
>>     I do not see any need today to COPY each token string.
>>     btw, how have you implemented this in the C runtime for v3?
> 
> Hello Ruslan and Jim,
> 
> Another idea is:  maybe it might be easier to find a way to optionally
> plug in Andrei Alexandrescu's flex_string instead of std::string.
> flex_string is used in the Boost Wave project, presumably for the
> same reason.

But does flex_string require allocation by new()?
I assume it does.

And again, the profiler shows that the main problems come from these
new/delete => malloc()/free() allocations.

The profiler shows a deep call stack for our SQL parser, e.g. 25-30 methods.
Each method by itself takes e.g. 0.9% of the time.
    0.8% is in fact LA() calls.

So there is no single obvious bottleneck. It is spread over all the calls.

In SQLite's Lemon parser the stack depth is only about 7-10 methods, and I am
sure it does not do these extra allocations.
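
A rough sketch of the { char*, length } idea from the quoted message above.
All names here (TokenView, ViewLexer, lex_all) are made up for illustration
and are not part of the ANTLR 2.7.2 runtime, and the whitespace-only
splitting is an assumption to keep the example short. The point is only
that a token can refer to the input buffer instead of owning a copy:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical token type: a view into the input buffer, not a copy.
// It is valid only while the input buffer is alive and unchanged.
struct TokenView {
    const char* begin;   // points right into the text being parsed
    std::size_t length;  // number of characters in the token

    // Compare against a NUL-terminated literal without allocating.
    bool equals(const char* s) const {
        return std::strlen(s) == length && std::memcmp(begin, s, length) == 0;
    }
};

// Toy lexer (an assumption for illustration): splits on whitespace only.
// A real SQL lexer would also classify tokens; what matters here is that
// next() itself never calls new()/free().
class ViewLexer {
public:
    ViewLexer(const char* text, std::size_t size)
        : cur_(text), end_(text + size) {
        assert(text != nullptr);
    }

    // Returns false at end of input; otherwise fills 'out'.
    bool next(TokenView& out) {
        while (cur_ != end_ && std::isspace(static_cast<unsigned char>(*cur_)))
            ++cur_;
        if (cur_ == end_) return false;
        const char* start = cur_;
        while (cur_ != end_ && !std::isspace(static_cast<unsigned char>(*cur_)))
            ++cur_;
        out.begin = start;
        out.length = static_cast<std::size_t>(cur_ - start);
        return true;
    }

private:
    const char* cur_;
    const char* end_;
};

// Helper: lex a whole NUL-terminated buffer into views.
// Only the vector's own storage is allocated, not one string per token.
inline std::vector<TokenView> lex_all(const char* text) {
    ViewLexer lexer(text, std::strlen(text));
    std::vector<TokenView> tokens;
    TokenView t;
    while (lexer.next(t)) tokens.push_back(t);
    return tokens;
}
```

The trade-off is lifetime: such tokens must not outlive the SQL text they
point into, so a std::string copy is still needed for anything kept after
the parse.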


--------
Aha, I see what you mean, Mark.

Probably flex_string uses POINTERs and does not copy internally?
Well, this may give about 50% of the potential speedup.

I will try this approach. It should be relatively easy.


---------
And one more thing.

    MEMORY POOL

I believe that the C/C++ runtimes of ANTLR should always come equipped with
this, and let us develop this way:

    pool -> all AST nodes, and maybe even our SQL nodes, are allocated from it.
        then we trash the whole tree with a single call to pool.free_all()

I know that a few developers have used this approach, and they say it improves
speed a lot. So why should we all reinvent the wheel?
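
A minimal sketch of such a pool, to make the idea concrete. Pool, AstNode
and make_node are hypothetical names, not anything the ANTLR runtimes
provide; only free_all() is taken from the text above. This version assumes
nodes are trivially destructible, since free_all() releases memory without
running destructors:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical bump-pointer pool: allocation is a pointer increment,
// and the whole tree is released with one free_all() call.
class Pool {
public:
    explicit Pool(std::size_t block_size = 64 * 1024)
        : block_size_(block_size), cur_(nullptr), left_(0) {
        assert(block_size > 0);
    }
    ~Pool() { free_all(); }
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    void* allocate(std::size_t size) {
        const std::size_t a = alignof(std::max_align_t);
        size = (size + a - 1) / a * a;   // keep every chunk max-aligned
        if (size > left_) grow(size);    // rest of the old block is abandoned
        void* p = cur_;
        cur_ += size;
        left_ -= size;
        return p;
    }

    // Release every block at once; all nodes become invalid.
    void free_all() {
        for (std::size_t i = 0; i < blocks_.size(); ++i) std::free(blocks_[i]);
        blocks_.clear();
        cur_ = nullptr;
        left_ = 0;
    }

    std::size_t block_count() const { return blocks_.size(); }

private:
    void grow(std::size_t min_size) {
        const std::size_t n = min_size > block_size_ ? min_size : block_size_;
        blocks_.push_back(nullptr);      // reserve the slot before malloc
        char* b = static_cast<char*>(std::malloc(n));
        if (!b) { blocks_.pop_back(); throw std::bad_alloc(); }
        blocks_.back() = b;
        cur_ = b;
        left_ = n;
    }

    std::size_t block_size_;
    char* cur_;                  // next free byte in the current block
    std::size_t left_;           // bytes left in the current block
    std::vector<char*> blocks_;  // every block obtained from malloc()
};

// Illustrative node type; must stay trivially destructible.
struct AstNode {
    int type;
    AstNode* left;
    AstNode* right;
};

inline AstNode* make_node(Pool& pool, int type,
                          AstNode* l = nullptr, AstNode* r = nullptr) {
    void* mem = pool.allocate(sizeof(AstNode));
    return new (mem) AstNode{type, l, r};  // placement new into pool memory
}
```

With this shape, parsing one statement costs a handful of malloc() calls
for whole blocks instead of one new/delete pair per node.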


-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]



