[antlr-interest] Implementing "break" statement in antlr-based interpreter

Tue Oct 23 13:43:47 PDT 2012

A much better/simpler approach when writing any interpreter with control
flow is to do it in two phases:

1. parse the file and construct an interpreter
2. run the interpreter

It looks like you're intermingling these; I think you're building a
statement list and passing it to the loop -- what if one of the statements
in that list is a nested loop?

You can implement the control-flow very easily using something like the GoF
Interpreter pattern. A break or continue can easily be handled by throwing
an exception that is caught by containing loops, or by having your
evaluate() methods return an object that has a flag marking break/continue.
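
As a rough sketch of the first option (the exception route), something like
the following could work. Every name here (BreakSignal, Node, Emit,
BreakNode, Block, Repeat) is made up for illustration and is not from the
thread, and Repeat is a simplified counted loop standing in for a real
while/for node:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: every name in this block is hypothetical.
class ExceptionBreakDemo {

  /** Thrown by a break statement and caught by the nearest enclosing loop. */
  static class BreakSignal extends RuntimeException {
    BreakSignal() {
      // No message, cause, suppression or stack trace: this is control flow, not an error.
      super(null, null, false, false);
    }
  }

  interface Node {
    void run(List<String> out);
  }

  /** Appends a fixed string to the output; stands in for a print statement. */
  static class Emit implements Node {
    private final String text;
    Emit(String text) { this.text = text; }
    @Override public void run(List<String> out) { out.add(text); }
  }

  static class BreakNode implements Node {
    @Override public void run(List<String> out) { throw new BreakSignal(); }
  }

  static class Block implements Node {
    private final List<Node> nodes;
    Block(Node... nodes) { this.nodes = Arrays.asList(nodes); }
    @Override public void run(List<String> out) {
      for (Node node : nodes) {
        node.run(out);  // a BreakSignal simply propagates out of the block
      }
    }
  }

  /** Runs its body at most `limit` times and stops early on a BreakSignal. */
  static class Repeat implements Node {
    private final int limit;
    private final Node body;
    Repeat(int limit, Node body) { this.limit = limit; this.body = body; }
    @Override public void run(List<String> out) {
      try {
        for (int i = 0; i < limit; i++) {
          body.run(out);
        }
      } catch (BreakSignal signal) {
        // Swallowed here, so only the innermost loop is exited.
      }
    }
  }

  static List<String> demo() {
    // repeat 2 { emit "outer"; repeat 5 { emit "inner"; break; emit "unreachable" } }
    Node inner = new Repeat(5,
        new Block(new Emit("inner"), new BreakNode(), new Emit("unreachable")));
    Node outer = new Repeat(2, new Block(new Emit("outer"), inner));
    List<String> out = new ArrayList<>();
    outer.run(out);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The trade-off is that ordinary control flow now travels through exception
handling; the flag-based variant avoids that at the cost of checking the
flag in every container node.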

Think about something like the following (not tested, but it should
demonstrate the concept):

  import java.util.ArrayList;  // both imports are needed by StatementList
  import java.util.List;

  // Context is some object that tracks current state, possibly scoped
  //   variables, runtime stack, etc.
  public interface Evaluator {
    Status evaluate(Context context);
  }
  public class Status {
    private Object value;
    private boolean isBreak;
    public Object getValue() {
      return value;
    }
    public boolean isBreak() {
      return isBreak;
    }
    public Status(Object value, boolean isBreak) {
      this.value = value;
      this.isBreak = isBreak;
    }
  }
  public class PrintStatement implements Evaluator {
    private Evaluator value;
    public PrintStatement(Evaluator value) {
      this.value = value;
    }
    @Override public Status evaluate(Context context) {
      Object v = value.evaluate(context).getValue();
      System.out.println(v);
      return new Status(v, false);
    }
  }
  public class BreakStatement implements Evaluator {
    @Override public Status evaluate(Context context) {
      return new Status(null, true);
    }
  }
  public class StatementList implements Evaluator {
    private List<Evaluator> evaluators = new ArrayList<Evaluator>();
    public void add(Evaluator statement) {
      evaluators.add(statement);
    }
    @Override public Status evaluate(Context context) {
      Status status = null;
      for (Evaluator evaluator : evaluators) {
        status = evaluator.evaluate(context);
        // Stop at the first break and pass it up to the enclosing loop.
        if (status.isBreak())
          return status;
      }
      // An empty list yields a null value; otherwise return the last value.
      if (status == null)
        status = new Status(null, false);
      else
        status = new Status(status.getValue(), false);
      return status;
    }
  }
  public class WhileStatement implements Evaluator {
    private Evaluator condition;
    private Evaluator body;
    public WhileStatement(Evaluator condition, Evaluator body) {
      this.condition = condition;
      this.body = body;
    }
    @Override public Status evaluate(Context context) {
      Status result = null;
      while(Boolean.TRUE.equals(condition.evaluate(context).getValue())) {
        result = body.evaluate(context);
        // A break ends this loop only.
        if (result.isBreak())
          break;
      }
      // Clear the break flag so that enclosing loops keep running.
      if (result == null)
        result = new Status(null, false);
      else
        result = new Status(result.getValue(), false);
      return result;
    }
  }

Using similar objects, all you need to do is have your parser create
instances of these evaluators, then call evaluate(context) on the top level.
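
To make the two phases concrete, here is a compact sketch that builds a tree
by hand (standing in for what parser actions would construct) and then runs
it. It restates the types in shortened form so that it compiles on its own;
Context's fields, Literal, Positive, Decrement and TwoPhaseDemo are
assumptions added for the example, and Print records into the context
rather than writing to System.out so the result is easy to check:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only. Literal, Positive, Decrement and the Context fields are hypothetical helpers.
class TwoPhaseDemo {

  static class Context {
    final Map<String, Integer> vars = new HashMap<>();
    final List<Object> printed = new ArrayList<>();  // captured output
  }

  static class Status {
    final Object value;
    final boolean isBreak;
    Status(Object value, boolean isBreak) { this.value = value; this.isBreak = isBreak; }
  }

  interface Evaluator {
    Status evaluate(Context context);
  }

  static class Literal implements Evaluator {
    private final Object value;
    Literal(Object value) { this.value = value; }
    @Override public Status evaluate(Context c) { return new Status(value, false); }
  }

  /** True while the named variable is greater than zero. */
  static class Positive implements Evaluator {
    private final String name;
    Positive(String name) { this.name = name; }
    @Override public Status evaluate(Context c) { return new Status(c.vars.get(name) > 0, false); }
  }

  /** Subtracts one from the named variable. */
  static class Decrement implements Evaluator {
    private final String name;
    Decrement(String name) { this.name = name; }
    @Override public Status evaluate(Context c) {
      int next = c.vars.get(name) - 1;
      c.vars.put(name, next);
      return new Status(next, false);
    }
  }

  static class Print implements Evaluator {
    private final Evaluator value;
    Print(Evaluator value) { this.value = value; }
    @Override public Status evaluate(Context c) {
      Object v = value.evaluate(c).value;
      c.printed.add(v);
      return new Status(v, false);
    }
  }

  static class Break implements Evaluator {
    @Override public Status evaluate(Context c) { return new Status(null, true); }
  }

  static class StatementList implements Evaluator {
    private final List<Evaluator> statements = new ArrayList<>();
    StatementList add(Evaluator e) { statements.add(e); return this; }
    @Override public Status evaluate(Context c) {
      Object last = null;
      for (Evaluator e : statements) {
        Status s = e.evaluate(c);
        if (s.isBreak) return s;  // hand the break to the enclosing loop
        last = s.value;
      }
      return new Status(last, false);
    }
  }

  static class While implements Evaluator {
    private final Evaluator condition;
    private final Evaluator body;
    While(Evaluator condition, Evaluator body) { this.condition = condition; this.body = body; }
    @Override public Status evaluate(Context c) {
      Object last = null;
      while (Boolean.TRUE.equals(condition.evaluate(c).value)) {
        Status s = body.evaluate(c);
        if (s.isBreak) break;     // consumed here, outer loops never see it
        last = s.value;
      }
      return new Status(last, false);
    }
  }

  /** Phase 1: the tree for
   *  while (n > 0) { n--; while (true) { print "inner"; break; } print "outer"; } */
  static Evaluator build() {
    Evaluator innerLoop = new While(new Literal(true),
        new StatementList().add(new Print(new Literal("inner"))).add(new Break()));
    return new While(new Positive("n"),
        new StatementList().add(new Decrement("n")).add(innerLoop)
            .add(new Print(new Literal("outer"))));
  }

  /** Phase 2: evaluate the finished tree against a fresh context. */
  static List<Object> run() {
    Context context = new Context();
    context.vars.put("n", 2);
    build().evaluate(context);
    return context.printed;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```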

This also makes it *very* easy to debug, as you can separate the "did I
parse it and create the structure correctly" question from the "am I
running it correctly" question. It also lends itself much more cleanly to
setting up a debugger for your target language.
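
The same flag idea can probably be stretched to the tri-state indicator
raised in the original question (quoted below) by swapping the boolean for
an enum. This is only a sketch under that assumption; Signal, SignalDemo and
the factory methods are invented names, and the loop here has no condition
so that the example stays short:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Signal and every factory method below are hypothetical names.
class SignalDemo {

  enum Signal { NORMAL, BREAK, RETURN }

  static final class Status {
    final Object value;
    final Signal signal;
    Status(Object value, Signal signal) { this.value = value; this.signal = signal; }
  }

  interface Evaluator {
    Status evaluate(List<String> trace);
  }

  static Evaluator emit(String text) {
    return trace -> { trace.add(text); return new Status(null, Signal.NORMAL); };
  }

  static Evaluator breakStmt() {
    return trace -> new Status(null, Signal.BREAK);
  }

  static Evaluator returnStmt(Object value) {
    return trace -> new Status(value, Signal.RETURN);
  }

  static Evaluator block(Evaluator... statements) {
    return trace -> {
      for (Evaluator statement : statements) {
        Status status = statement.evaluate(trace);
        // BREAK and RETURN both abandon the rest of the block.
        if (status.signal != Signal.NORMAL) return status;
      }
      return new Status(null, Signal.NORMAL);
    };
  }

  /** Loops until the body signals BREAK or RETURN. */
  static Evaluator loop(Evaluator body) {
    return trace -> {
      while (true) {
        Status status = body.evaluate(trace);
        if (status.signal == Signal.BREAK) return new Status(null, Signal.NORMAL);  // consumed
        if (status.signal == Signal.RETURN) return status;                          // passed up
      }
    };
  }

  /** Function boundary: the only place where RETURN is consumed. */
  static Evaluator function(Evaluator body) {
    return trace -> new Status(body.evaluate(trace).value, Signal.NORMAL);
  }

  static Object demo(List<String> trace) {
    Evaluator program = function(block(
        loop(block(emit("a"), breakStmt(), emit("skipped"))),  // break exits one loop
        emit("b"),
        loop(loop(block(emit("c"), returnStmt(42)))),           // return exits both loops
        emit("never")));
    return program.evaluate(trace).value;
  }

  public static void main(String[] args) {
    List<String> trace = new ArrayList<>();
    System.out.println(demo(trace) + " " + trace);
  }
}
```

A loop turns BREAK back into NORMAL but lets RETURN through, so a return
leaves the function no matter how many loops it is nested in.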

Hope this helps,
-- Scott

----------------------------------------
Scott Stanchfield
http://javadude.com

On Tue, Oct 23, 2012 at 3:51 PM, Juancarlo Añez <apalala at gmail.com> wrote:

> Michael,
>
> Simplest is best.
>
> One option is to split the rules for statement sequences into two: those
> allowed within loops, and everything else.
>
> The other is to allow a "break" in any statement sequence, and deal with
> its validity later.
>
> There are a lot of semantic nuances to a programming language that are very
> difficult to resolve at the syntactic level.
>
> -- Juanca
>
> On Tue, Oct 23, 2012 at 3:16 PM, Michael Cooper <tillerman35 at yahoo.com> wrote:
>
> > Say you have your basic "for" or "while" loop, e.g.
> >
> > for(i=0; i<10; i++) {
> >   print i
> > }
> >
> >
> > In the "pie" example, the author has a while loop that uses a "defer"
> > parameter to indicate that the interpreter will do the job of evaluating
> > the expr that determines if the loop proceeds.
> >
> >
> >    |   'while' expr[true] slist[true]
> >         {if (!defer) interp.whileloop($expr.start, $slist.start);}
> >
> > I would like to be able to break out of a "for" or "while" loop, as is
> > done in many programming languages, e.g.
> >
> > // Read rows from a cursor and print out the contents of field 1 until it
> > // says "stop"
> > while cursor.hasrows() {
> >   cursor.getrow()
> >   // When field 1 says "stop" we should exit the "while" loop.
> >   if cursor.getfield[1] = "stop" then break
> >   print cursor.getfield[1]
> > }
> >
> > What I think I need to do is add a boolean to my interpreter that
> > indicates that a break statement has been encountered, and then test for
> > that condition = true in each rule action.
> >
> > so in my "statement" rule, I would add an alternative:
> >
> > | 'break' { if(!defer && !interp.breakfound) interp.breakfound = true; }
> >
> > and then add the "&& !interp.breakfound" test into every rule.  That way,
> > the parser would not execute any interpreter functions until the
> > breakfound condition was reset.
> >
> > I would also need to save the break condition prior to entering
> > break-able constructs (loops and functions are the only ones I can think
> > of) so that I could restore it after the end of the construct.  That way
> > the break statement only exits the loop it executes in.
> >
> > e.g. in class Interpreter:
> >
> > public void whileloop(Token expr_start, Token slist_start) {
> >   boolean saved_breakfound = this.breakfound;
> >     ...Handle the loop stuff here...
> >   this.breakfound = saved_breakfound;
> > }
> >
> > Does this make sense?  Is there a better way that I'm missing?  I imagine
> > the same idea could be used to implement a "return(value)" statement in a
> > function as well.  The only difference would be that a return statement
> > would exit the function no matter how deep it was into loops.  For that,
> > I'd need some kind of tri-state indicator, with values like 0 => continue,
> > 1 => break out of loop, 2 => return from function.
> >
> > Any thoughts?
> >
> > Thanks!
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe:
> > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
>
>
>
> --
> Juancarlo *Añez*