[antlr-interest] 1st look at Java classfile parser

Andreas Rueckert a_rueckert at gmx.net
Tue Jan 22 06:26:21 PST 2002


Hi!

Sorry, but it seems the following text was missing from my last mail...?

Terence liked the idea of posting an early version of the classfile parser here,
because some of you might be interested in taking a look and commenting.

There are a couple of issues with this version:

- I started with an AST that was very close to the classfile spec. It was very
fast, but not very easy to understand for those who are only familiar with the
Java (tree) parser. I'm trying to generate an AST that is closer to a tree
representing a Java source file. This transition is not yet complete, which is
why there's nothing like a classblock in the type definition.

- The AST generation in general is not very elegant at some points (e.g.
compared to the Java grammar). I want to work on this as my skills get
better.

- There are still lots of comments missing that explain what's going on inside
the classfile parser. But even with the comments, you'll need some knowledge of
the classfile spec to understand most of the actions. The idea behind them is
to leave the constant pool completely out of the AST and to transform the info
in it into a structure that is close(r) to a Java source file.

- At this point, I completely ignore bytecode and just parse it as a byte array.
AFAIK, that's why I cannot parse some features, like var initializers. That's
not a big problem for _me_, since I'm mainly interested in generating a nice
class diagram for some libs. I'm not sure if there's a good workaround, since
adding a recompiler would bloat the grammar to a degree that would make it
hard to maintain and maybe too slow to actually use. Performance was an
issue for me, since some Argo users are spoiled and want to import
megabyte-sized jars into Argo. That's the reason why I couldn't use an
Antlr-generated lexer. It simply added too much overhead and slowed down the
reverse engineering (RE) process.

- I cannot really tell you if the attached source code will work for you. That
is because the version I use here is an Argo module, which means that it's
cluttered with code to generate all kinds of metamodel objects and add them to
the current Argo model. I tried to remove all this code (and package info) from
the Antlr files before I zipped them. If you think that some essential part
is missing, it's very likely that I was too ambitious with the deleting
process :-). Just ask then.
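
For readers who don't know the classfile layout, here is a minimal sketch of
what "leaving the constant pool out" can mean in practice: the pool is read
once into lookup tables, and only the resolved, source-like name comes out.
This is not taken from the attached parser. The names ClassNameReader and
readClassName are hypothetical, and it uses java.io.DataInputStream for
brevity rather than a hand-written lexer.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ClassNameReader {

    /**
     * Reads the classfile header and constant pool, then resolves
     * this_class to a dotted name as it would appear in Java source.
     */
    static String readClassName(byte[] classfile) throws IOException {
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(classfile));

        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a classfile (bad magic number)");
        }
        in.readUnsignedShort(); // minor_version
        in.readUnsignedShort(); // major_version

        // Constant pool entries are numbered 1 .. count-1.
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];      // CONSTANT_Utf8 values
        int[] classNameIndex = new int[count];  // CONSTANT_Class -> Utf8 index

        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1:  // Utf8: u2 length + modified UTF-8 bytes
                    utf8[i] = in.readUTF();
                    break;
                case 7:  // Class: u2 name_index
                    classNameIndex[i] = in.readUnsignedShort();
                    break;
                case 8: case 16: case 19: case 20:  // one u2 operand
                    in.skipBytes(2);
                    break;
                case 15: // MethodHandle: u1 + u2
                    in.skipBytes(3);
                    break;
                case 3: case 4: case 9: case 10:
                case 11: case 12: case 17: case 18: // four bytes of operands
                    in.skipBytes(4);
                    break;
                case 5: case 6: // Long/Double: 8 bytes, occupy two pool slots
                    in.skipBytes(8);
                    i++;
                    break;
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }

        in.readUnsignedShort();                 // access_flags
        int thisClass = in.readUnsignedShort(); // index of a CONSTANT_Class

        // Internal names use '/', source names use '.'.
        return utf8[classNameIndex[thisClass]].replace('/', '.');
    }
}
```

Calling readClassName with the bytes of a compiled class returns the name
with dots instead of slashes. A fuller reader would go on to the fields,
methods and attributes in the same way, and could skip each Code attribute
by its length field, which corresponds to treating bytecode as an opaque
byte array.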

Feel free to flame,

Andreas





 
