[antlr-interest] writing an interpreter

Sat Mar 25 19:06:19 PST 2006

Compiling to threaded code has many advantages over interpreting the tree
directly and costs no more to implement.  Basically, threaded code is a
sequence of instructions and arguments:

instr1 instr2 (instr3 arg1 arg2) instr4 ...

The instr items might be bytes containing an instruction code, addresses of
code to be executed, or some more indirect approach.  The arguments
(parentheses in the example just indicate that arg1 and arg2 are associated
with instr3; the above is a linear list) might be word-sized integers,
pointers, indices into tables (strings in Java, for example, where there are
no pointers to take advantage of), or whatever.

The approach is to do a tree walk to compile to threaded code; there is a
separate interpreter (a big case statement is the norm when the interpreter
is written in a high level language) in which each case represents an instr
value.

if..else statements compile to
<if> <offset to else code> <sequence of statements to execute if true>
<branch> <offset to skip else code> <statements to execute if false>

A one pass (treewalk) compiler lays down <if> <hole> ... code and backfills
the hole with an offset when the else or end of the if statement is
encountered.

Most modern interpreted languages (python, perl, ghc, etc.) use threaded
code; if you look at the source code, you can see the compiler.  Timothy
Budd's Little Smalltalk (he wrote a short book which may still be findable)
is a good example to learn from, but there are probably others.

--Loring 

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Maciej Zawadzinski
> Sent: Saturday, March 25, 2006 12:45 PM
> To: ANTLR Interest
> Subject: Re: [antlr-interest] writing an interpreter
> 
> On 3/25/06, Bryan Ewbank <ewbank at gmail.com> wrote:
> > On 3/25/06, Maciej Zawadziński <mzawadzinski at gmail.com> wrote:
> > > I am gonna write an interpreter of simple language. I am thinking of
> > > implementing it as a treewalker, but I encountered some problems with
> loops.
> > >
> > > How can I force ANTLR treewalker not to parse some nodes of tree?
> (e.g.
> > > "body" of the if/while loop if the condition is not satisfied?)
> >
> > Assuming your if node looks like this:
> >     #( IF e1 s1 s2 )
> >         -- e1 is the expression to evaluate
> >         -- s1 is the statement to evaluate if e1 is true
> >         -- s2 is the statement to evaluate if e2 is false
> >
> > The ANTLR for driving evaluation is:
> >
> >     if_statement
> >     { bool res; }
> >     :
> >         #( IF res=exprbool
> >             ( {res == true}?  statement .   // i.e., skip "s2"
> >             |                 . statement   // i.e., skip "s1"
> >             )
> >         )
> >     ;
> >
> 
> Thanks a lot! I am close to understanding antlr now ;)
> 
> But nothing comes so quickly. I had some problems with optional "else"
> phrase, but I finally wrote something like this that works:
> 
> ifStatement { int e = 0; }:
>   ( #("if" expr stmt "else") ) => #("if" e=expr ( {e == true}? stmt |
> . "else" stmt ) )
>  | ( #("if" expr stmt) ) => #("if" e=expr ( {e == true}? stmt | . ) )
> 
> 
> It looks a bit ugly, is there a way to make it without syntactic
> predicates?
> 
> greets,
> 
> --
> Maciej Zawadziński