[antlr-interest] Re: Semantic predicates that aren't & hoisting

David Jung jungdl at ornl.gov
Fri Mar 11 12:29:59 PST 2005


John D. Mitchell said:
>>>>>> "David" == David Jung <jungdl at ornl.gov> writes:
>> John D. Mitchell said:
>>>>>>>> "David" == David Jung <jungdl at ornl.gov> writes:
> [...]
>
>> yes. That is the intention.  A program in an expr, so are namespaces,
>> classes etc. etc.
>
> Interesting.
>
> Well, if you're going that far, why not go whole hog to e.g., Lisp or even
> a pure functional language?  You'll get a much cleaner, more consistent,
> and (mostly likely) a more powerful solution.

The language has some functional features already - functions can be
marked 'pure' which disallows side-effects.  That also means that
pure functions with const args can be computed at compile time
and result in further const values.
Types, being first-class values, are just literals assigned to
variables.  If the variables are const then static type checking
can be performed on the type, otherwise the types are dynamic
(but everything is still strongly typed - with the exception
of the 'any' type).
Hence, pure functions that operate on types and are called with
const types in initializers allow something like Java/C++ generics/templates
(strong, static, parameterized types).  If such a function isn't pure
then the type is dynamically generated instead (using java.lang.reflect).

The language is intended to be 'familiar' to Java/C++/Fortran(!) programmers
as well as having a shallow learning, but still full-featured.
(static & dynamic typing, strong & weak typing, OO, procedural,
Design by Contract support, operator overloading, rich builtin
types (matrix, vector, complex, list, map, etc.), good string
processing (regexs) etc.)
But the current interpreter is hidiously slow (and obviously removes
the distinction between static(compile-time)/dynamic(run-time)).
Hence the need to write a compiler.

I've put our little test grammar into ANTLR, see below..

>
>
> [...]
>>>> Secondly, and importantly, expr won't match:
>
>>>> "{if {a>0} then {f();g();} x();}", will it? (or am I going blind?)
>
>>> Of course it will:
>
>>> expr will match exprList.
>
>>> exprList will match expr will match if followed by an expr matching an
>>> exprList followed by then followed by an expr matching an exprList
>>> containing two exprs matching a list of function call expressions;
>>> followed by a function call expression.
>
>> I still don't get it.  It looks like your exprListOrStmt *requires* a
>> ';'
>> to follow an expr.  So, if the above entire expression is an exprList
>> containing two exprs (first is 'if'... and second is x();), why isn't a
>> ';' required to follow the 'if' expr?
>
> You're (mis-)reading the wrong part.  The *FIRST* alternative in
> exprListOrStmt matches the *nested* exprList's.  That's the whole point
> (and why that's the first alternative and the statement is the second
> alternative -- remember that ANTLR does top-down (aka predictive)
> parsers).
>
> I suggest that you write a small, simple grammar that has this and turn on
> the tracing and run it over some input to see it in action.

This is what I give ANTLR:

program : expr EOF ;
expr : ifExpr | addExpr ;
ifExpr : "if" expr "then" expr ;
addExpr : condExpr ( (PLUS | MINUS) expr )? ;
condExpr: callExpr ( GTHAN expr )? ;
callExpr : primaryExpr ( LPAREN RPAREN )? ;
primaryExpr : exprList | constant | IDENT ;
exprList : LCURLY ( exprListOrStmt )* RCURLY ;
exprListOrStmt : (RCURLY)=>exprList | expr SEMI ;

which is a small variation on my previous post, mainly to
make it readable and to use my already defined lexer tokens.

Now, if I try to match "{if {a>0;} then {f();g();} x();}",
like I want (save the extra ';' I has to put after the 0), I get
this:

> program; LA(1)=={, LA(2)==if
  > expr; LA(1)=={, LA(2)==if
   > addExpr; LA(1)=={, LA(2)==if
    > condExpr; LA(1)=={, LA(2)==if
     > callExpr; LA(1)=={, LA(2)==if
      > primaryExpr; LA(1)=={, LA(2)==if
       > exprList; LA(1)=={, LA(2)==if
        > exprListOrStmt; LA(1)==if, LA(2)=={
         > expr; LA(1)==if, LA(2)=={
          > ifExpr; LA(1)==if, LA(2)=={
           > expr; LA(1)=={, LA(2)==a
            > addExpr; LA(1)=={, LA(2)==a
             > condExpr; LA(1)=={, LA(2)==a
              > callExpr; LA(1)=={, LA(2)==a
               > primaryExpr; LA(1)=={, LA(2)==a
                > exprList; LA(1)=={, LA(2)==a
                 > exprListOrStmt; LA(1)==a, LA(2)==>
                  > expr; LA(1)==a, LA(2)==>
                   > addExpr; LA(1)==a, LA(2)==>
                    > condExpr; LA(1)==a, LA(2)==>
                     > callExpr; LA(1)==a, LA(2)==>
                      > primaryExpr; LA(1)==a, LA(2)==>
                      < primaryExpr; LA(1)==>, LA(2)==0
                     < callExpr; LA(1)==>, LA(2)==0
                     > expr; LA(1)==0, LA(2)==;
                      > addExpr; LA(1)==0, LA(2)==;
                       > condExpr; LA(1)==0, LA(2)==;
                        > callExpr; LA(1)==0, LA(2)==;
                         > primaryExpr; LA(1)==0, LA(2)==;
                          > constant; LA(1)==0, LA(2)==;
                          < constant; LA(1)==;, LA(2)==}
                         < primaryExpr; LA(1)==;, LA(2)==}
                        < callExpr; LA(1)==;, LA(2)==}
                       < condExpr; LA(1)==;, LA(2)==}
                      < addExpr; LA(1)==;, LA(2)==}
                     < expr; LA(1)==;, LA(2)==}
                    < condExpr; LA(1)==;, LA(2)==}
                   < addExpr; LA(1)==;, LA(2)==}
                  < expr; LA(1)==;, LA(2)==}
                 < exprListOrStmt; LA(1)==}, LA(2)==then
                < exprList; LA(1)==then, LA(2)=={
               < primaryExpr; LA(1)==then, LA(2)=={
              < callExpr; LA(1)==then, LA(2)=={
             < condExpr; LA(1)==then, LA(2)=={
            < addExpr; LA(1)==then, LA(2)=={
           < expr; LA(1)==then, LA(2)=={
           > expr; LA(1)=={, LA(2)==f
            > addExpr; LA(1)=={, LA(2)==f
             > condExpr; LA(1)=={, LA(2)==f
              > callExpr; LA(1)=={, LA(2)==f
               > primaryExpr; LA(1)=={, LA(2)==f
                > exprList; LA(1)=={, LA(2)==f
                 > exprListOrStmt; LA(1)==f, LA(2)==(
                  > expr; LA(1)==f, LA(2)==(
                   > addExpr; LA(1)==f, LA(2)==(
                    > condExpr; LA(1)==f, LA(2)==(
                     > callExpr; LA(1)==f, LA(2)==(
                      > primaryExpr; LA(1)==f, LA(2)==(
                      < primaryExpr; LA(1)==(, LA(2)==)
                     < callExpr; LA(1)==;, LA(2)==g
                    < condExpr; LA(1)==;, LA(2)==g
                   < addExpr; LA(1)==;, LA(2)==g
                  < expr; LA(1)==;, LA(2)==g
                 < exprListOrStmt; LA(1)==g, LA(2)==(
                 > exprListOrStmt; LA(1)==g, LA(2)==(
                  > expr; LA(1)==g, LA(2)==(
                   > addExpr; LA(1)==g, LA(2)==(
                    > condExpr; LA(1)==g, LA(2)==(
                     > callExpr; LA(1)==g, LA(2)==(
                      > primaryExpr; LA(1)==g, LA(2)==(
                      < primaryExpr; LA(1)==(, LA(2)==)
                     < callExpr; LA(1)==;, LA(2)==}
                    < condExpr; LA(1)==;, LA(2)==}
                   < addExpr; LA(1)==;, LA(2)==}
                  < expr; LA(1)==;, LA(2)==}
                 < exprListOrStmt; LA(1)==}, LA(2)==x
                < exprList; LA(1)==x, LA(2)==(
               < primaryExpr; LA(1)==x, LA(2)==(
              < callExpr; LA(1)==x, LA(2)==(
             < condExpr; LA(1)==x, LA(2)==(
            < addExpr; LA(1)==x, LA(2)==(
           < expr; LA(1)==x, LA(2)==(
          < ifExpr; LA(1)==x, LA(2)==(
         < expr; LA(1)==x, LA(2)==(
        < exprListOrStmt; LA(1)==x, LA(2)==(
       < exprList; LA(1)==x, LA(2)==(
      < primaryExpr; LA(1)==x, LA(2)==(
     < callExpr; LA(1)==x, LA(2)==(
    < condExpr; LA(1)==x, LA(2)==(
   < addExpr; LA(1)==x, LA(2)==(
  < expr; LA(1)==x, LA(2)==(
 < program; LA(1)==x, LA(2)==(
exception during compilation:
tests/test.sg:0:0: unexpected token: x

As I expected.  However, if I add the extra ';' after the exprList
"{if {a>0;} then {f();g();}; x();}"
(which I don't want in the syntax), I get a sucessful parse:

 > program; LA(1)=={, LA(2)==if
  > expr; LA(1)=={, LA(2)==if
   > addExpr; LA(1)=={, LA(2)==if
    > condExpr; LA(1)=={, LA(2)==if
     > callExpr; LA(1)=={, LA(2)==if
      > primaryExpr; LA(1)=={, LA(2)==if
       > exprList; LA(1)=={, LA(2)==if
        > exprListOrStmt; LA(1)==if, LA(2)=={
         > expr; LA(1)==if, LA(2)=={
          > ifExpr; LA(1)==if, LA(2)=={
           > expr; LA(1)=={, LA(2)==a
            > addExpr; LA(1)=={, LA(2)==a
             > condExpr; LA(1)=={, LA(2)==a
              > callExpr; LA(1)=={, LA(2)==a
               > primaryExpr; LA(1)=={, LA(2)==a
                > exprList; LA(1)=={, LA(2)==a
                 > exprListOrStmt; LA(1)==a, LA(2)==>
                  > expr; LA(1)==a, LA(2)==>
                   > addExpr; LA(1)==a, LA(2)==>
                    > condExpr; LA(1)==a, LA(2)==>
                     > callExpr; LA(1)==a, LA(2)==>
                      > primaryExpr; LA(1)==a, LA(2)==>
                      < primaryExpr; LA(1)==>, LA(2)==0
                     < callExpr; LA(1)==>, LA(2)==0
                     > expr; LA(1)==0, LA(2)==;
                      > addExpr; LA(1)==0, LA(2)==;
                       > condExpr; LA(1)==0, LA(2)==;
                        > callExpr; LA(1)==0, LA(2)==;
                         > primaryExpr; LA(1)==0, LA(2)==;
                          > constant; LA(1)==0, LA(2)==;
                          < constant; LA(1)==;, LA(2)==}
                         < primaryExpr; LA(1)==;, LA(2)==}
                        < callExpr; LA(1)==;, LA(2)==}
                       < condExpr; LA(1)==;, LA(2)==}
                      < addExpr; LA(1)==;, LA(2)==}
                     < expr; LA(1)==;, LA(2)==}
                    < condExpr; LA(1)==;, LA(2)==}
                   < addExpr; LA(1)==;, LA(2)==}
                  < expr; LA(1)==;, LA(2)==}
                 < exprListOrStmt; LA(1)==}, LA(2)==then
                < exprList; LA(1)==then, LA(2)=={
               < primaryExpr; LA(1)==then, LA(2)=={
              < callExpr; LA(1)==then, LA(2)=={
             < condExpr; LA(1)==then, LA(2)=={
            < addExpr; LA(1)==then, LA(2)=={
           < expr; LA(1)==then, LA(2)=={
           > expr; LA(1)=={, LA(2)==f
            > addExpr; LA(1)=={, LA(2)==f
             > condExpr; LA(1)=={, LA(2)==f
              > callExpr; LA(1)=={, LA(2)==f
               > primaryExpr; LA(1)=={, LA(2)==f
                > exprList; LA(1)=={, LA(2)==f
                 > exprListOrStmt; LA(1)==f, LA(2)==(
                  > expr; LA(1)==f, LA(2)==(
                   > addExpr; LA(1)==f, LA(2)==(
                    > condExpr; LA(1)==f, LA(2)==(
                     > callExpr; LA(1)==f, LA(2)==(
                      > primaryExpr; LA(1)==f, LA(2)==(
                      < primaryExpr; LA(1)==(, LA(2)==)
                     < callExpr; LA(1)==;, LA(2)==g
                    < condExpr; LA(1)==;, LA(2)==g
                   < addExpr; LA(1)==;, LA(2)==g
                  < expr; LA(1)==;, LA(2)==g
                 < exprListOrStmt; LA(1)==g, LA(2)==(
                 > exprListOrStmt; LA(1)==g, LA(2)==(
                  > expr; LA(1)==g, LA(2)==(
                   > addExpr; LA(1)==g, LA(2)==(
                    > condExpr; LA(1)==g, LA(2)==(
                     > callExpr; LA(1)==g, LA(2)==(
                      > primaryExpr; LA(1)==g, LA(2)==(
                      < primaryExpr; LA(1)==(, LA(2)==)
                     < callExpr; LA(1)==;, LA(2)==}
                    < condExpr; LA(1)==;, LA(2)==}
                   < addExpr; LA(1)==;, LA(2)==}
                  < expr; LA(1)==;, LA(2)==}
                 < exprListOrStmt; LA(1)==}, LA(2)==;
                < exprList; LA(1)==;, LA(2)==x
               < primaryExpr; LA(1)==;, LA(2)==x
              < callExpr; LA(1)==;, LA(2)==x
             < condExpr; LA(1)==;, LA(2)==x
            < addExpr; LA(1)==;, LA(2)==x
           < expr; LA(1)==;, LA(2)==x
          < ifExpr; LA(1)==;, LA(2)==x
         < expr; LA(1)==;, LA(2)==x
        < exprListOrStmt; LA(1)==x, LA(2)==(
        > exprListOrStmt; LA(1)==x, LA(2)==(
         > expr; LA(1)==x, LA(2)==(
          > addExpr; LA(1)==x, LA(2)==(
           > condExpr; LA(1)==x, LA(2)==(
            > callExpr; LA(1)==x, LA(2)==(
             > primaryExpr; LA(1)==x, LA(2)==(
             < primaryExpr; LA(1)==(, LA(2)==)
            < callExpr; LA(1)==;, LA(2)==}
           < condExpr; LA(1)==;, LA(2)==}
          < addExpr; LA(1)==;, LA(2)==}
         < expr; LA(1)==;, LA(2)==}
        < exprListOrStmt; LA(1)==}, LA(2)==null
       < exprList; LA(1)==null, LA(2)==null
      < primaryExpr; LA(1)==null, LA(2)==null
     < callExpr; LA(1)==null, LA(2)==null
    < condExpr; LA(1)==null, LA(2)==null
   < addExpr; LA(1)==null, LA(2)==null
  < expr; LA(1)==null, LA(2)==null
 < program; LA(1)==null, LA(2)==null

Can you see my problem?

-David.




More information about the antlr-interest mailing list