[antlr-interest] Re: Antlr grammar to parse Java classfile?

Terence Parr parrt at jguru.com
Wed Dec 5 11:54:10 PST 2001

Hi Gang,

Nice discussion of what's best to use and thanks for supporting ANTLR!  
One thing that you might also consider is simply using Java reflection 
to pull apart the class files...if I'm not mistaken we built (at jGuru) 
a search system (no longer with us) for classes/methods and so on.  
There was some really easy built in Java thing that let it happen.  
Janne Leppanen is probably still wandering Europe as a gypsy so I can't 
ask him; oh, maybe I'll dig up that branch of the repository. :)
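
As a rough illustration of the reflection idea (not the jGuru search code, which isn't shown here), a minimal sketch in modern Java could list the methods a class declares. The class name ClassLister and the helper declaredMethodNames are made up for this example. It also assumes the class can be loaded by the JVM; reflection works on loaded classes, not on raw .class file bytes.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists what reflection reports for one loaded class.
final class ClassLister {

    /** Returns the sorted names of the methods declared directly by cls. */
    static List<String> declaredMethodNames(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Method m : cls.getDeclaredMethods()) {
            names.add(m.getName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class.forName loads the class by name; the default is only an example.
        String target = args.length > 0 ? args[0] : "java.lang.Runnable";
        Class<?> cls = Class.forName(target);
        System.out.println(target + ": " + declaredMethodNames(cls));
    }
}
```

The same Class object also exposes getDeclaredFields() and getDeclaredConstructors(), so a search index over classes and members could be built along these lines.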

Anyway, concerning the "match n times" thing.  You're right...it would 
be pretty useful.  What syntax is appropriate, and how do you say 
0..n vs 1..n?  Perhaps, for uniformity, we use my "element modifier" 
syntax (e.g., "INT<AST=INTNode>"):

ids4 : ( ID )+<n=4> ;  // weird looking thing

ids4 : ( ID )+<4> ; // a little better ("n" would be the default)

ids2opt : ( ID )*<2> ; // 0..2 not 1..2

The implementation would be pretty simple, I guess: just define a 
counter like the (...)+ loop does and generate an error if you don't get n. 
A counter<=n test would need to go in the while loop, but that should be easy.
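
To make the counter idea concrete, here is a hand-written sketch of what such a bounded loop might do. This is not ANTLR's generated code. The names BoundedLoopSketch and matchIds, the token type constants, and the min/max parameters are all assumptions for illustration. It reads ( ID )*<2> as min=0, max=2. For ( ID )+<4> it shows the "exactly 4" reading (min=4, max=4); the 1..4 reading would just pass min=1.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the loop a parser generator might emit.
final class BoundedLoopSketch {
    static final int ID = 1;    // made-up token type codes
    static final int SEMI = 2;

    /**
     * Consumes ID tokens starting at index start, stopping once max have
     * been matched. Returns how many matched; throws if fewer than min.
     */
    static int matchIds(List<Integer> tokens, int start, int min, int max) {
        int count = 0;
        int pos = start;
        // The upper bound lives in the loop condition.
        while (count < max && pos < tokens.size() && tokens.get(pos) == ID) {
            pos++;
            count++;
        }
        // The lower bound is checked once the loop exits.
        if (count < min) {
            throw new IllegalStateException(
                "expected at least " + min + " ID tokens, got " + count);
        }
        return count;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(ID, ID, ID, ID, ID, SEMI);
        System.out.println(matchIds(input, 0, 4, 4)); // like ( ID )+<4>
        System.out.println(matchIds(input, 0, 0, 2)); // like ( ID )*<2>
    }
}
```

Note that the loop stops at max and leaves any further ID tokens for whatever follows in the rule, so a fifth ID is not an error here by itself.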


On Wednesday, December 5, 2001, at 12:11  AM, Andreas Rueckert wrote:

> Hi!
> On Wed, 05 Dec 2001 J. Stephen Riley Silber wrote:
> --<snip>--
>>> I know that predicates are one of the features ANTLR has. I think
>>> this could be the only salvation...
>> Semantic predicates would indeed handle the "match n-times" problem.
> Antlr turns every rule into a method, right? And you could call such a
> method in a production? Why not do something like
> =======================================
> // Parse the method block in the classfile
> methods
> { int mCount = 0; }
> 	:  mCount = methodCount
> 	     { for( int i=0; i < mCount; i++) { method(); } }
>             ;
> // Parse one method in the classfile
> method
>             : ....
> =======================================
> where methodCount returns the number of methods in the classfile and
> the method rule parses a method in the classfile. Then construct a nice
> AST, so the tree grammar could be really clean.
>> The thing is, and sorry Terry--nothing personal, I just don't think
>> ANTLR is really the right tool for this kind of thing.
> I agree that the 'call the method rule n times' option is missing
> somehow. It would make the creation of binary parsers somewhat easier.
>> Building the .class file parser in any language with decent byte and
>> bit analysis constructs is gonna be easy--tedious, definitely, but
>> pretty easy.
>> The real advantage of building a parser in ANTLR is when you have to
>> *change* the parser.  And in this case, for something so binary, I
>> think changing the ANTLR parser vs. changing a C parser (for example)
>> would end up being a wash.
> Well, data abstraction also has the advantage of being more readable
> than a hand-coded parser. And if you have all the basic types, like u2, u4
> etc., defined, you could basically retype the classfile definition from
> the JVM specs to write the parser.
> classfile
> 	: magic_number
> 	  version_number
> 	  constant_pool
> 	  access_flags
> 	  this_class
> 	  super_class
> 	  ...
>> Of course, I might be way off base here.  :-)
>> In any event, I still think ANTLR is the best thing going in the
>> parser-generator world, by far!
> I agree. And I've used lex,yacc,JavaCC etc.
> Ciao,
> Andreas
> Your use of Yahoo! Groups is subject to 
> http://docs.yahoo.com/info/terms/
Chief Scientist & Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org


