[antlr-interest] generating cyclic state machines in Java
Terence Parr
parrt at cs.usfca.edu
Wed Feb 8 14:56:23 PST 2006
Hi Jeff,
Well, I'm not sure I fully understand what you have, but while
sitting in the beautiful California sun today I thought about how I
would redo them. Perhaps this is similar to what you would suggest.
First, my goals:
1. don't create a lot of classes
2. don't create a lot of instances
3. avoid method calls per character just to transition
4. be as small as possible
I thought about just making a huge switch on state number that jumped
to a CASE:
switch ( s ) {
case 0 : ...
case 1 : ...
...
}
But, with say 500 states I'm not sure java would do this efficiently
and it would limit the size of our DFAs due to java method size
limits. For states that have semantic predicates are truly
complicated transitions, we could do this.
As an optimization, I would like something tight like this:
while ( s != accept state ) {
s = states[s][input.LA(1)];
if ( s==-1 ) error;
}
return predictedAlt[s];
which works well enough for 8 bits. Actually when the range from low
to high of the edge labels is > threshold then you'd decide the array
would be too big and you resort to another case in your switch. So
for single char labels very high into the unicode range, you could
still use this approach. You need min/max arrays too:
while ( s != accept state ) {
c = input.LA(1);
if ( c>max[s] ) error;
s = states[s][c-min[s]];
if ( s==-1 ) error;
}
return predictedAlt[s];
For most states then, you could do this fast scheme and then resort
to a big switch for the others or even use methods for s0(), s32(),
etc.. and jump to them to reduce the size of the method containing
the switch.
Do you see any problems with this?
One could be defining the stuff; java does what...ok, web says I can
do this:
int dfa1_min[] = {...};
int dfa1_max[] = {...};
...
int dfa1_states[][] = {
{ 0,0,0,3,0,2,0,... }, // state 1 transitions where on char x?
{ ... }, // state 2
...
};
Java creates an array of arrays, which is a lot of objects and
hopefully it doesn't wipe to 0 first and then add these default
values. :(
Anybody have experience with this in Java?
Ter
More information about the antlr-interest
mailing list