[antlr-interest] Java code generator memory optimization

Akhilesh Mritunjai virtualaspirin at yahoo.com
Sun Sep 25 08:25:10 PDT 2005


Hi Terence

My comments inline:

--- Terence Parr <parrt at cs.usfca.edu> wrote:
> In the ANTLR v3 version, I have tokens point at the
> start/stop index  
> into a single char buffer that has the entire input
> text (well, that  
> is the default anyway).  So, you have duplicates
> still in the sense  
> that all references to identifier "salary" are not
> shared, but at  
> least there are not multiple copies as there are now
> by default. :)   

AFAIK, that's how the current one works too. The lexer
makes strings from the chars it gets from the input
stream, so for every identifier in the stream you get
an entirely different string object with its own char
array. Of course, they won't be duplicated more often
than they occur in the input stream... but there is no
sharing at all, and there won't be with that approach
in v3.0 either.
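
A minimal sketch of what I mean by "no sharing", assuming the lexer copies each token's text out of the input buffer. The class and the textOf() helper are made up for illustration; they are not ANTLR API:

```java
public class TokenTextDemo {
    // Hypothetical helper (not ANTLR API): builds a token's text by
    // copying input[start..stop] (inclusive) into a fresh String.
    public static String textOf(char[] input, int start, int stop) {
        return new String(input, start, stop - start + 1);
    }

    public static void main(String[] args) {
        char[] input = "salary = salary + 1".toCharArray();

        // Two occurrences of the same identifier in the input.
        String first = textOf(input, 0, 5);
        String second = textOf(input, 9, 14);

        // Same contents, but two separate String objects on the heap.
        System.out.println(first.equals(second)); // true
        System.out.println(first == second);      // false
    }
}
```

Every occurrence of "salary" pays for its own String object, so the cost grows with the number of identifier tokens, not with the number of distinct identifiers.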

> If your file is 1M, it's probably pretty big and
> that's just not  
> enough memory to worry about these days.  Wow, I

Um... The certification for my product will run on an
input file set of around 37 MB, and some people out
there must be doing continuous stream parsing.

The current suggestion comes from my observation of
processing an 8 MB automatically generated, sadistic,
pathological example that I made, for which the parse
tree contains a total of 5.7M nodes... 40% are
identifier subtree nodes and every one has its own
string object. I interned the node texts and, bam!!, it
saved me 150 MB of memory  :)
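
Roughly what the interning change looks like, as a sketch only. The Node class here is a made-up stand-in for a tree node, not an ANTLR class; String.intern() is the standard JDK method:

```java
public class InternedNodeDemo {
    // Hypothetical tree node (not an ANTLR class) that keeps only the
    // canonical copy of its text.
    public static final class Node {
        public final String text;

        public Node(String text) {
            // intern() returns the one pooled instance for this content,
            // so the freshly copied String can be garbage collected.
            this.text = text.intern();
        }
    }

    public static void main(String[] args) {
        char[] input = "salary salary salary".toCharArray();

        // Each new String(...) is a separate copy, as a lexer would make.
        Node n1 = new Node(new String(input, 0, 6));
        Node n2 = new Node(new String(input, 7, 6));
        Node n3 = new Node(new String(input, 14, 6));

        // All three nodes now point at the same String object.
        System.out.println(n1.text == n2.text); // true
        System.out.println(n2.text == n3.text); // true
    }
}
```

The saving depends on how often the same identifier repeats; with few distinct names and millions of identifier nodes, nearly all of the per-node strings collapse into a handful of shared ones.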

Uh, I dunno how to put it, but somehow, Terence, you
seem to underestimate the reach, potential and
influence of all the kickass tools you've made. I did
a lot of research and will have a solid testimony once
I complete this thing... one of them is making the
difference between a product ending in success and a
sad failure.

> remember when my 16k  
> machine was great! ;)  Anybody remember which
> processor was 1.077 mhz  

God, I'm young... my first was a 640k, 16-something MHz
machine on which I learnt BASIC and MS-DOS 3.3 & 5.0
more than a decade back :)

- Akhilesh

