[antlr-interest] Re: updated "antlr 2 bashing list"

Wed Mar 10 03:49:06 PST 2004

On Tue, Mar 09, 2004 at 05:22:25PM -0800, Terence Parr wrote:
> > I hate to say it, but I think that that's a cop out.  Building
> > exception objects and stack frames costs, and there is no point when
> > in guessing mode.

Have to agree with Loring here.

> This never happens unless you fail and is a minor cost compared to the
> cost of nested predicates, right?

Well in C++ you can compile without exception support which gives you a
faster and smaller binary, people in the embedded business might like that.
I'm not sure though wether I would support two versions of the runtime...

> Doesn't bother me in Java, but other languages might be a drag.

The argument for retargeting to a exceptionless generated code would also
make a C codegenerator an (easy) option. (Hmm wonder how far you'd get with
longjmp and co. talk about ugly...)

> I can investigate not using exceptions to see what the code looks like.
>   Actually somebody who has used JavaCC could tell us what it looks like
> ;)  They do not use exceptions.

What I recall of the code generated by ell from cocktail it's not that bad
although it seems to have an advantage due to its error recovery code. It
doesn't even return an error code for a rule. Basically it looks like
(simplified of course and a bit translated to antlr-ish data structures the
original uses C with a distinct modula2-to-C convertor taste)

void rule( Token incomingtoken, BitSet recoverySet )
{
   Token someTokenInRule;
   Token someOtherTokenInRule;

	for(;;) {
		if( curToken.getType() == .. ) { }
		if( curToken.getType() == .. ) { }
		unexpectedTokenHandler( recoverySet );
	}
	unexpectedTokenHandler( anotherRecoverySet );
}

It would use labels and goto's to jump over unexpectedHandlers but that's
something that could be done differently. Similar to our guessing mode
counter it has a RepairMode flag which is set when the parser is
inserting/skipping tokens to get to a restart point. At certain points
repairmode is reset (probably for the rules that are at restart points).
Sometimes inside rules some extra checks for repairmode and recovery are
inserted. In between rule invocations the recovery sets are 'tweaked' (all
simple index updating). There are a few flavours of unexpectedTokenHandlers
used (think mostly to handle negated bitsets).

The unexpectedTokenHandler calls would correspond to antlr's catch blocks
and to some extend match calls. I guess the guessing could also be wrapped
into them. Most (all more likely) guessing handling could be done in antlr
with shuffling around a few if/else parts. Maybe add a stack of active
predicates (to replace the guessing counter) then maintain the stack during
guessing and you can even keep track of what predicate failed where. Or
make the predicate a parameter to the rule.

> > and hamper code generation for languages which do not have
> > exceptions.  It would be better to have a "synpredFailed flag as part
> > of the generated classes-the flag will always be in cache, so is not a
> > performance issue--or to have a return flag from rules and wrap rule
> > invocations (foo) in
> > if (foo() == false) return;
> >
> > and add
> > if (guessing)
> >     return false;
> > else {
> >     <throw exception>
> > }
> >
> > around the <throw exception>s in ANTLR.  Come to think of it,
> > implementing this with the current ANTLR runtime would be only about
> > 10 lines of code or so (maybe a little more--I think that each
> > exception would have to be fixed separately).
>
> Uh...how you gonna redo return values in 10 lines ;)

I don't think you'd even need return values (other than how we use them
already for rule return values). An error attribute on the parser class
would probably already do the trick or a queue of them. Or suppose you'd
use something like:

rule { int val; } :
	( NUM NUM NUM (ID|COMMA) ) => val=someRule
	| NUM NUM NUM
;

-- start of predicate test --
Predicate p = new Predicate("NUM NUM NUM (ID|COMMA)");
val = someRule( p )
if( p.failed() )
 ...
-- end --

In the rule itself

----snip----
int someRule( Predicate p ) {
 ...
 if( p != null )  // equivalent to our inputState.guessing != 0
----snip----

When nested predicates occur you could create a linked list of predicates
giving you a fail trace. Or just pass the failed status. If you tinker with
the predicate class you can probably do some nice stuff too (debugging vs.
production code version).

The average syntactic predicate check now looks like:

--------snip--------
         boolean synPredMatched6 = false;
         if (((LA(1)==TK_int||LA(1)==TK_char||LA(1)==ID))) {
            int _m6 = mark();
            synPredMatched6 = true;
            inputState.guessing++;
            ///// doesn't this bit look a bit excessive ? ///////////
				try {
               {
                  variable();
                  otherrule();
                  otherrule();
                  match(SOME_TOKEN);
               }
            }
            catch (RecognitionException pe) {
               synPredMatched6 = false;
            }
            /////////////////////////////////////////////////////////
            rewind(_m6);
            inputState.guessing--;
         }
         if ( synPredMatched6 ) {
            variable();
         }
--------snip--------

The exception is always caught in the level directly above the call. So a
bunch of if-else-if's would be just as practical.

In C++ mode the approach with the predicate class could be quite nice. I
could probably do the mark/rewind stuff in the constructor/destructor and
get automatic rewinding for free (and exception safe to boot). Java is out
of luck there though ;)

Another option is to generate rules like:

RULE_RETURN_VALUE_TYPE
   someRule( InternalReturnValue ret, Predicate p, <other rule params>  )
{
 ...
}

or

InternalReturnValue
   someRule( RULE_RETURN_VALUE_TYPE ret, Predicate p, <other rule params>  )
{
 ...
}

In java you'd have to use an Integer wrapper or something for the internal
return value.

Cheers,

Ric
--
-----+++++*****************************************************+++++++++-------
    ---- Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893722  ----
-----+++++*****************************************************+++++++++-------
  Wo das Chaos auf die Ordnung trifft, gewinnt meist das Chaos, weil es
  besser organisiert ist. --- Friedrich Nietzsche

Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/