[antlr-interest] generating cyclic state machines in Java

Terence Parr parrt at cs.usfca.edu
Wed Feb 8 14:56:23 PST 2006


Hi Jeff,

Well, I'm not sure I fully understand what you have, but while  
sitting in the beautiful California sun today I thought about how I  
would redo them.  Perhaps this is similar to what you would suggest.

First, my goals:

1. don't create a lot of classes
2. don't create a lot of instances
3. avoid method calls per character just to transition
4. be as small as possible

I thought about just making a huge switch on state number that jumped  
to a CASE:

switch ( s ) {
   case 0 : ...
   case 1 : ...
   ...
}

But, with say 500 states I'm not sure java would do this efficiently  
and it would limit the size of our DFAs due to java method size  
limits.  For states that have semantic predicates are truly  
complicated transitions, we could do this.

As an optimization, I would like something tight like this:

while ( s != accept state ) {
   s = states[s][input.LA(1)];
   if ( s==-1 ) error;
}
return predictedAlt[s];

which works well enough for 8 bits. Actually when the range from low  
to high of the edge labels is > threshold then you'd decide the array  
would be too big and you resort to another case in your switch.  So  
for single char labels very high into the unicode range, you could  
still use this approach.  You need min/max arrays too:

while ( s != accept state ) {
   c = input.LA(1);
   if ( c>max[s] ) error;
   s = states[s][c-min[s]];
   if ( s==-1 ) error;
}
return predictedAlt[s];

For most states then, you could do this fast scheme and then resort  
to a big switch for the others or even use methods for s0(), s32(),  
etc.. and jump to them to reduce the size of the method containing  
the switch.

Do you see any problems with this?

One could be defining the stuff; java does what...ok, web says I can  
do this:

int dfa1_min[] = {...};
int dfa1_max[] = {...};
...
int dfa1_states[][] = {
	{ 0,0,0,3,0,2,0,... }, // state 1 transitions where on char x?
	{ ... },			// state 2
	...
};

Java creates an array of arrays, which is a lot of objects and  
hopefully it doesn't wipe to 0 first and then add these default  
values. :(

Anybody have experience with this in Java?

Ter


More information about the antlr-interest mailing list