[antlr-interest] [C] my v3 Parser no reuse() slower 20% than v2. With reuse() 2GB leaks, oops.

Jim Idle jimi at temporal-wave.com
Wed Nov 16 08:50:20 PST 2011


All your assumptions below are correct - the methods you are calling there
are public to grammar programmers for this reason. Just lose the $text and
have your own helper methods - for instance you only want the text when it
is time to actually do something with it, and not just to create a new
token that is the same text and position and so on. Your helper methods
can take a token, a start and stop token, a tree node with a payload, and
a tree node with a start and stop span. Even in Java you find that you
need these for good error reporting.
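
As a rough sketch of the helper idea (this is not code from the runtime):
demo_token below is a hypothetical stand-in for a real token and only holds
pointers to the first and last character of its text in the input buffer,
which is an assumption about how you would expose the start and stop
indexes. The names demo_token, span_length, span_text and token_text are
made up for illustration. The point is only that the text gets copied when
you ask for it, into a buffer you own, so nothing is allocated per token.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a token; NOT the real ANTLR3_COMMON_TOKEN.
 * It records the token type and an inclusive [start, stop] span that
 * points straight into the original input buffer. */
typedef struct {
    int         type;
    const char *start;  /* first character of the token text */
    const char *stop;   /* last character of the token text (inclusive) */
} demo_token;

/* Number of characters covered from the start of `first` to the stop of
 * `last`. Returns 0 for missing or inverted spans. */
static size_t span_length(const demo_token *first, const demo_token *last)
{
    if (first == NULL || last == NULL ||
        first->start == NULL || last->stop == NULL ||
        last->stop < first->start) {
        return 0;
    }
    return (size_t)(last->stop - first->start) + 1;
}

/* Copy the text covered by first..last into buf, truncating to fit and
 * always NUL-terminating. Returns the number of characters copied. */
static size_t span_text(const demo_token *first, const demo_token *last,
                        char *buf, size_t bufsize)
{
    size_t n;

    assert(buf != NULL && bufsize > 0);

    n = span_length(first, last);
    if (n > bufsize - 1) {
        n = bufsize - 1;
    }
    if (n > 0) {
        memcpy(buf, first->start, n);
    }
    buf[n] = '\0';
    return n;
}

/* Single-token convenience wrapper around span_text. */
static size_t token_text(const demo_token *tok, char *buf, size_t bufsize)
{
    return span_text(tok, tok, buf, bufsize);
}
```

A tree-node version would do the same thing after fetching the payload
token (or the start and stop tokens) from the node, and you would call it
only at the point where the text is really needed.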

Sorry that the C runtime takes a lot more grokking, but there isn't all
that object infrastructure to help you. I am still inclined to make a very
streamlined C runtime, that does not allow overrides of much at all, but
is very fast.

Jim


> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Ruslan Zasukhin
> Sent: Wednesday, November 16, 2011 8:36 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] [C] my v3 Parser no reuse() slower 20%
> than v2. With reuse() 2GB leaks, oops.
>
> On 11/16/11 6:00 PM, "Jim Idle" <jimi at temporal-wave.com> wrote:
>
> > [C] my v3 Parser no reuse() slower 20% than v2. With reuse() 2GB
> > leaks, oops.
> >
> > Do not use the $text annotations if you want performance, they are
> > purely for convenience - I must have said this 5000 times and I wish I
> > had never added that bit ;) I also told you 3 or 4 times in various
> > emails not to use it. I think that that is in the API docs somewhere,
> > but I should make sure that it is, if it is not.
>
> Right you told ...
>
> But in docs, ANTLR books, examples, everywhere present this
>
>     hex_string_literal
>
>     :    s = HEX_NUMBER  -> CONST_STR_HEX[$s.text->chars]
>
> Yes, I have checked the C API docs even today, but have not found any
> special page that says
>
>     Java guys do this
>     C guys do this.
>
>
> > There is no memory leak, but the auto string stuff does not release
> > until you free the string factory, which only happens when you free
> > the parser, not when you reuse it. Because it allocates small strings
> > all the time, it kills performance, and then you will page.
>
> Clear.
>
> So when I "fix" all the places that use .text, the memory problem
> should disappear by itself.
>
>
> > xxx: s=HEX_NUMBER { $s.type = CONST_STR_HEX; } ;
>
> > I think that the field name is type but you get the idea.
>
> Yes, I will try this asap and give feedback.
> I have 40 such places in parser. And some number in the tree parser.
>
>
> > Don't use the
> > fake object oriented stuff when you want performance, use the structs
> > directly - you will find that it is many times faster than the v2 C++,
> > not slower - this is C and you should get as close to the metal as you
> > can.
>
> I very much hope so :-)
>
> For the PARSER I think I see how I can use this $s.type; I will check
> the other 39 places in the parser right now :)
>
> =====================================
> It is not clear to me what we can do with Tree Parser ??
>
> So I have some token, e.g. a date, a time, or another literal.
> I put a label on it, and now I need to get its TEXT.
>
> general_literal returns [ENode_Const_Ptr res]
>
>     : cd=CONST_DATE
>             { res=make_enode_date ( GET_FBL_STRING($cd.text) );  }
>
>
>
> So far I have found that I can do something like
>
> general_literal returns [ENode_Const_Ptr res]
>
>     : cd=CONST_DATE
>       {
>               pANTLR3_COMMON_TOKEN pToken = $cd->getToken( $cd );
>               ANTLR3_MARKER pStart = pToken ->getStartIndex( pToken );
>               ANTLR3_MARKER pEnd  = pToken->getStopIndex( pToken );
>              .... Do some job ...
>       }
>
>
> Does such code in a TreeParser look correct to you?
>
> Is it really safe, and do getStartIndex / getStopIndex always return
> correct pointers?
>
> Of course this can be extracted into a special function to be used in
> many places in one line of code ...
>
> I just believe there is no example in C, and no docs page, that
> discusses this for TreeParser and C. If one exists, please point me to
> it :-)
>
>
> --
> Best regards,
>
> Ruslan Zasukhin
> VP Engineering and New Technology
> Paradigma Software, Inc
>
> Valentina - Joining Worlds of Information http://www.paradigmasoft.com
>
> [I feel the need: the need for speed]
>
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

