[antlr-interest] Re: ANTLR Java Code Generation

Wed Nov 14 03:00:00 PST 2001

Hi,

I really appreciate the proposal to change the Java code generation so
that the generated byte code becomes much smaller! This would allow my
compiler to be built in a jdk1.3 environment, which is currently not
possible. So I'm less worried about a possible performance drawback.

I use a large charVocabulary, because I need some UNICODE support in my
compiler. The bitsets in the generated lexer become really big. Using
Sun's jdk1.3, the generated byte code is too big to be loaded into the
VM: 
  Exception in thread "main" java.lang.ClassFormatError:
  com/sun/jdori/common/query/jdoqlc/JDOQLLexer (Code of a method longer than 65535 bytes)
The only workaround I know so far is to use the jdk1.2 javac. So the
build environment has to ensure that the lexer is never compiled with
the jdk1.3 javac (not even indirectly).
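
To illustrate the size issue, here is a minimal sketch (not ANTLR's
actual generated code) of the two initialization styles under
discussion. The class name TokenSetInitSketch, the method names, and the
array length of 8 are made up for illustration; the three values are the
ones from Tom's example quoted below. I'm assuming the proposal amounts
to filling runs of identical values in a loop, which may not match the
real change exactly.

```java
import java.util.Arrays;

// Hypothetical sketch: compares a literal array initializer with a
// run-based fill. Not taken from ANTLR's generated output.
class TokenSetInitSketch {

    // Style 1: literal initializer. javac emits a store sequence for
    // every element, so the method's bytecode grows with the array.
    static long[] literalData() {
        return new long[] {
            -549755813896L, -268435457L, -1L, -1L, -1L, -1L, -1L, -1L
        };
    }

    // Style 2: distinct values are stored individually and the run of
    // identical values is filled in one call (assumed length: 8).
    static long[] compactData() {
        long[] data = new long[8];
        data[0] = -549755813896L;
        data[1] = -268435457L;
        Arrays.fill(data, 2, 8, -1L); // indices 2..7 all get -1L
        return data;
    }

    public static void main(String[] args) {
        // Both styles should yield identical contents.
        System.out.println(Arrays.equals(literalData(), compactData()));
    }
}
```

Both methods build the same array. The second needs a roughly fixed
amount of bytecode per run instead of per element, which is what should
help a lexer with a large charVocabulary stay under the 65535-byte
method limit.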

Regards Michael 

> 
> Will there be a performance gain? There will be a bytecode size
> decrease, which will decrease memory usage on platforms like Unix
> which memory-map classes. But won't there be a performance decrease
> because instead of a nice super clean:
> 1. create Array long _tokenSet_0_data_[] with size n
> 2. put -549755813896L at 0
> 3. put -268435457L at 1
> 4. put -1L at 2
> .....
> n. put -1L at n
> with O[n] time you'll have:
> 1. create Array long _tokenSet_0_data_[] with size n
> 2. put -549755813896L at 0
> .
> .
> .
> 6. put -268435457L at 1
> .
> .
> .
> 10. put -1L at 2
> .....
> n*4. put -1L at n
> With all the extra loop overhead. Thus your time is now O[x*n] (where
> x is the number of instructions to do for each entry). Is this actually
> going to be a gain? You still have to add the same number of entries.
> It just trades off class size vs. speed, doesn't it? And I would have
> thought for many cases speed was more important than class size. You
> only store/load the class once, but you have to do the BitSet creation
> every parse (if you did 2 passes with one init, then you'd double the
> amount of computation (and thus overhead) but have the same memory
> overhead).
> 
> And surely the JDK's gonna be better at optimizing the initializer
> because it knows everything that's going on, whereas the second one
> must deal with any side effects.
> 
> Still might be worth offering it as an option, but I would have
> thought it would be best to check performance before using only this
> method.
> 
> It doesn't cut down the initialization, does it? You've still got to
> initialize every value; in fact it makes it worse because it's x times
> more ops. It just cuts down class size.
> 
> Tom.
> --- In antlr-interest at y..., Christian Ernst <christian.ernst at p...>
> wrote:
> > Hi!
> >
> > Stdiobe wrote:
> >
> > > Seems to me like a good suggestion. Any drawbacks
> > > performance-wise?
> >
> > We didn't do any performance measures.
> > But there should be a performance gain.
> >
> > For example for the Java Lexer 1.3.
> > It cuts down the required initialization of 6 BitSets
> > from about 11,000 long values
> > to only about 4,000 long values.
> >
> > Regards
> > Christian
> 
> 
> 
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/

-- 
Michael Bouschen		Tech at Spree Software Technology GmbH
mailto:mbo.tech at spree.de	http://www.tech.spree.de/
Tel.:++49/30/235 520-33		Buelowstr. 66			
Fax.:++49/30/2175 2012		D-10783 Berlin
