[antlr-interest] DFA's encoded directly in java bytecodes

matthew ford Matthew.Ford at forward.com.au
Tue Oct 19 13:14:31 PDT 2004


What about debugging.
You always claimed Antlr's code was readable.  
This is worse the XML :-(
matthew

----- Original Message ----- 
From: "Terence Parr" <parrt at cs.usfca.edu>
To: <antlr-interest at yahoogroups.com>
Sent: Wednesday, October 20, 2004 5:29 AM
Subject: [antlr-interest] DFA's encoded directly in java bytecodes


> 
> Howdy,
> 
> For the Java target of ANTLR 3.0, I've been concerned about the hideous 
> nature of the code generated for DFA machines.  Without a goto 
> instruction in Java, you have to use a massive switch (slow as death) 
> or use method calls (slow); plus you have to create like an object for 
> each of thousands of states when the Lexer or Parser is loaded.  Ick.
> 
> Sriram Srinivasan at the ANTLR2004 workshop suggested generating byte 
> code directly.  There is a goto bytecode of course. :)  The beauty of 
> these suckers is that the CPU program counter will be used (once 
> compiled to native code) to traverse the states of the DFA.  That will 
> be insanely fast; about the same speed as the C DFAs Ric generates for 
> 3.0.
> 
> So, I will generate byte code assembly for the DFAs from the ANTLR code 
> generator using StringTemplate so people can tweak the byte codes 
> easily.  For example, you might want to inline some of the char fetch 
> stuff.  That will haul ass!
> 
> The only problem was that I didn't want to use BCEL or Jasmin or any 
> other package since it means ANTLR depends on those packages (i hate 
> that).  So, I spent the weekend and yesterday building a .class file 
> writer and then a byte code assembler.  Both are tiny as they are very 
> specific to my needs (only handles like 50 bytecodes).  Nonetheless, 
> 900 total lines of code is nice. :)  An excellent price to pay to avoid 
> other people's libraries.
> 
> Here is a sample (partial) state in my assembly code
> 
>      aload 0
>      iconst 1
>      invokeinterface IntegerStream.LA 2
>      istore 1
>      iload 1
>      iconst 98
>      if_icmplt x
>      iload 1
>      ireturn
> x:
>      iconst -1
>      ireturn
> 
> I'm guessing that a minimal state will be about 20-ish bytes, allowing 
> for a max of about 3000 states in any specific DFA.  Not a serious 
> limitation probably.
> 
> I also hope to use this code in my CS652 grad prog lang class whenever 
> I teach that again.  A nice real world problem.
> 
> Ter
> PS yes, the .class file format is pretty complicated, but not 
> unreasonable given the problem
> --
> CS Professor & Grad Director, University of San Francisco
> Creator, ANTLR Parser Generator, http://www.antlr.org
> Cofounder, http://www.jguru.com
> Cofounder, http://www.knowspam.net enjoy email again!
> 
> 
> 
> 
> 
>  
> Yahoo! Groups Links
> 
> 
> 
>  
> 
> 
> 


 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
    antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/
 





More information about the antlr-interest mailing list