[antlr-interest] Antlr v4 - C++ target

Jim Idle jimi at temporal-wave.com
Thu Jan 12 10:07:26 PST 2012


I do plan on doing that in fact. However I would like to respond to the
criticisms here as follows:

1) I wrote the C runtime in under two weeks because I needed it for a
project, and at that time ANTLR v3 had not been released (it was in beta).
Hence, if we wait until the v4 runtime is stable, we should get some
cleaner runtimes.
2) I did not really know how anyone else would want to use it, so I
made absolutely everything dynamic. Since that time there have been lots
of memory and performance tweaks, but I am sure there are more I can do.
3) I basically copied the Java model as is with the idea being that it
would be easier to follow changes that were made to the Java runtime in
the C runtime.
4) There are performance enhancements you can turn on such as adding
defines for ANTLR3_INLINE_INPUT_8BIT or ANTLR3_INLINE_INPUT_16BIT and
defining SKIP_FOLLOW_SETS to avoid stacking rule descriptors only used by
error reporting.
5) All my tests, and most everyone else's, find the C v3 runtime to be
faster than the C++ runtime, so I can only conclude that there is
something different about one or two grammar files.
6) I did implement reuse (other than for trees), and that helps most of
the use cases where the initial memory allocation takes time and so you
don't want to tear it down and re-allocate it.
7) It is a lot easier to start with someone else's code than it is to
start with vi and a blank screen. Where's the love?
8) ANTLR is naturally more heavyweight than some other tools, but it is
usually easier to use it.
9) Why not wait for v4, where some of these things are addressed as a
natural consequence of the design?


A minimum token needs the type and a pointer to the text, plus either a
pointer to the end of the text or the length. If you use a length, then
with encodings like UTF-8 you will need to traverse the text to extract
that many characters. There are always tradeoffs. Pointers are 64 bits,
not 32 bits, when you compile for a 64-bit target. You can compile in
32-bit mode if you don't need 64-bit stuff.

Jim


> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of A Z
> Sent: Wednesday, January 11, 2012 5:38 PM
> To: Ruslan Zasukhin
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Antlr v4 - C++ target
>
> The realistic minimum I see for commontoken in the existing 3.4 code is
> 32 bytes on a 64-bit architecture. This would involve modifications to
> the code generator to no longer use the function pointers(for
> setStart/setStopIndex/setType) and using a smaller data type for the
> channel, factory and type members. There is still an additional
> 16B/token used by the vector data structure holding the tokens.
>
>
>
> On Wed, Jan 11, 2012 at 5:09 PM, Ruslan Zasukhin <
> ruslan_zasukhin at valentina-db.com> wrote:
>
> > On 1/11/12 11:12 AM, "Loring Craymer" <lgcraymer at yahoo.com> wrote:
> >
> > > If Jim did not implement the vtable indirection (that could be
> > > easily
> > changed,
> > > if so), then there is a little more opportunity for optimization,
> > > but
> > still
> > > the problem is that state information takes up much more memory
> than
> > does the
> > > text in tokens.
> >
> > Right,
> >
> > Well, let's look at antlr3commontoken.h
> >
> > API:
> >        19   pointers to func
> >                        32 bit os    19 * 4  = 76 bytes
> >
> > And about
> >        11 * 4 bytes  of useful info
> >
> >
> > So there is a chance that, in C++ style
> > OR with a single pointer to a vtable-like structure, the token will go
> >
> >    from 118 bytes to 48 bytes
> >
> >
> >
> > --
> > Best regards,
> >
> > Ruslan Zasukhin
> > VP Engineering and New Technology
> > Paradigma Software, Inc
> >
> > Valentina - Joining Worlds of Information
> http://www.paradigmasoft.com
> >
> > [I feel the need: the need for speed]
> >
> >
> >
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe:
> > http://www.antlr.org/mailman/options/antlr-interest/your-email-
> address
> >
>

