[antlr-interest] Suggestion for List size in BaseTree

Loring Craymer lgcraymer at yahoo.com
Tue Dec 2 07:31:22 PST 2008


Memory allocators do not tend to allocate exact sizes; they allocate in blocks and pick the nearest block size to what was requested.  Java is a memory hog, and I expect that blocks are allocated in 128 or 256-byte increments--if not larger--which would mean that your suggested change would have no impact on actual allocation sizes.

--Loring



----- Original Message ----
> From: Martin Probst <mail at martin-probst.com>
> To: antlr-interest at antlr.org
> Sent: Tuesday, December 2, 2008 1:05:39 AM
> Subject: [antlr-interest] Suggestion for List size in BaseTree
> 
> Hi all,
> 
> I have a suggestion for the java.util.List children in
> org.antlr.runtime.tree.BaseTree.
> 
> It appears to be constructed with the default ArrayList constructor,
> which will give you an empty list with space for 10 elements.
> 
> In my personal experience, for 99% of my language's statements this is
> complete overkill: most have either two or three children, very few
> ones have more. Constructing the tree with smaller array sizes might
> have a noticeable impact on memory usage (and thus performance) of the
> system.
> 
> A CommonTree currently has 3 int fields, plus a reference to parent, a
> reference to token, and the reference to the children array, which in
> turn has a reference to the array, a size and a modcount field. So for
> each AST node we have 4x reference size plus 5x int size, e.g., on a
> 64 bit system about 52 bytes, plus the object and array overhead of
> the JVM (something like 32 bytes?). So that's 84 bytes not counting
> the contents of the array. This calculation is entirely done on a
> rumours and phantasies basis, but I think it should be roughly giving
> the right direction.
> 
> Now if the array is size == 3 by default, we have 24 bytes addition,
> if it's size 10 by default, we have 80 bytes, giving 108 bytes against
> 168 bytes. If you're AST is entirely 3 children or less, you'll save
> ~60% of memory (not counting token objects).
> 
> Of course this is a difficult tradeoff, and it totally depends on the
> application in case. Functional languages will probably get much
> larger savings than procedural, where you commonly have many
> statements as children of a function body. Still, all your typical
> expressions (arithmetic, boolean, ...) will usually have two children,
> most function calls have less than ten parameters, and so forth. Also,
> the "global" performance characteristics depend on how expensive it is
> to increase the array size (no clue about that), relative to the cost
> of the memory.
> 
> So maybe there should be some way to change BaseTree's behaviour here?
> Or a reasonable default that's significantly lower?
> 
> Martin
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: 
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address



      


More information about the antlr-interest mailing list