[antlr-interest] parsing dynamic infix operators

Eric Northup digitale at digitaleric.net
Mon Mar 13 11:54:52 PST 2006


Hi all,

I need to parse a language where the list of operators can change
dynamically (as in ML or Prolog).  For example, after a statement

:- operator(400, yfx, ~).

// 400 is the precedence
// yfx means an infix, left-associative operator
// ~ is the newly bound operator name

the fragment

foo(5~6, X~Y~Z)

would be parsed as:
foo(~(5,6), ~(~(X,Y),Z))

or, after the statements

:- operator(500, yfx, plus).
:- operator(400, yfx, times).
:- operator(300, fy, cos).
:- operator(200, yf, squared).

the fragment

X plus cos Y times cos (cos Z) squared

would parse as:

plus(X, times(cos(Y), cos(squared(cos(Z)))))



As in Prolog, the operator types (xfx,yfx,xfy,fx,fy,yf,xf) specify
the operators arity and associativity.

f: the operator (required)
x: this sub-expression must be of lower precedence than f's
y: this sub-expression must be of less than or equal precedence than
   f's

For example : fy is a recursive prefix operator, but fx is a
non-recurisve prefix operator.

Any expression enclosed in parenthesis has a precedence of zero.



An operator-precedence parser which pushes tokens on a stack while
precedence is decreasing seems like the obvious way to go.  It seems
like it might be possible to do this in ANTLR with syntactic and/or
semantic predicates, but it's not immediately clear to me how this
would work.

The best I've come up with would use semantic predicates to check
whether the precedence matches as expected, but would continue
checking other alternatives if the check failed (see ***).
Something along the lines of the grammar sketched below:

pattern returns [int precedence]
        {int arg1, arg2}
        : ATOM { precedence=0; }
        | '(' pattern ')' { precedence=0; }
        | fx:ATOM arg1=pattern { is_fxoperator(fx.getText())
                                 && opr_prec(fx.getText()) <= arg1 }?
                                // ***
                                // if the above check fails, I'd like
                                // to continue checking other matching
                               { precedence = opr_prec(fx.getText(); }

        | arg1=pattern xfy:ATOM arg1=pattern
              { is_xfyoperator(xfy.getText())
                && opr_prec(xfy.getText()) < arg1
                && opr_prec(xfy.getText()) <= arg2}?
                // ***
              { precedence = opr_prec(fx.getText(); }
        | ... // other productions


Alternately, I could use syntactic predicates, but there doesn't
seem to be a way to add native code checks to them (checking that
the precedence is decreasing).

Does anyone have suggestions for persuading ANTLR to parse such a
language?

-Eric



More information about the antlr-interest mailing list