[antlr-interest] parsing operators with priorities via attribute grammar??

W.Pasman@tudelft.nl w.pasman at tudelft.nl
Fri Dec 14 02:05:46 PST 2007


Hi,
I am trying to write a parser for Prolog.
In Prolog, operators have assigned priorities. There are
dozens of operators, and writing a grammar rule for each one would
greatly clutter the (already large) Prolog grammar.
The Prolog reference implementation ("Prolog: The Standard" by Deransart
et al.) uses an attribute grammar, and we want to copy it as closely as
possible into the ANTLR grammar describing Prolog.

Is there a way to do this operators-with-priority parsing using attributes?





Below is a description of the attempts I have already made. I ran into lots of
problems using semantic predicates, attributes, etc. I'm not sure which
issues share the same underlying problem, whether I
misunderstand parts, or whether there are ANTLR bugs involved. So let me
just discuss what I tried.
A very basic example illustrating the problems is this grammar:

term::=NUMBER extraterms
extraterms::=(OP NUMBER)* (leaving out the priorities for a moment)
NUMBER::=1|2|3
OP::=a|b|c

To incorporate the priority handling in ANTLR you would want to write 
something like

extraterms[P]:
        (nextop=OP { prio(nextop)>P }?) =>
        OP NUMBER extraterms[prio(nextop)]
    |
    ;

So basically, we use the attribute P to limit the priority of the term
that is being parsed. What we want is for the parser to try to read the
next OP and, if it has the right priority, continue parsing. If it does
not have the right priority, the parser should take the epsilon
production instead.
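
To make the intended behaviour of extraterms[P] concrete, here is a small
hand-written Java sketch of the same idea. It is not code generated by
ANTLR; the class name, the helper methods, and the priority values
(a=1, b=2, c=3) are assumptions made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hand-written sketch of the intended behaviour of extraterms[P].
// NOT ANTLR output; the priority values below are assumed for illustration.
class ExtraTermsSketch {
    // Hypothetical priority table for the toy operators a, b, c.
    static final Map<String, Integer> PRIO = new HashMap<String, Integer>();
    static {
        PRIO.put("a", 1);
        PRIO.put("b", 2);
        PRIO.put("c", 3);
    }

    private final List<String> tokens;
    private int pos = 0;

    ExtraTermsSketch(List<String> tokens) { this.tokens = tokens; }

    // term ::= NUMBER extraterms[0]
    void term() {
        expectNumber();
        extraterms(0);
    }

    // extraterms[P] ::= OP NUMBER extraterms[prio(OP)]   (only if prio(OP) > P)
    //                 | (epsilon)
    void extraterms(int p) {
        if (pos < tokens.size()) {
            Integer prio = PRIO.get(tokens.get(pos)); // one token of lookahead
            if (prio != null && prio > p) {
                pos++;            // consume OP
                expectNumber();   // consume NUMBER
                extraterms(prio); // continue with the new priority limit
                return;
            }
        }
        // Epsilon alternative: consume nothing and return without an error.
    }

    void expectNumber() {
        if (pos >= tokens.size() || !tokens.get(pos).matches("[123]")) {
            throw new IllegalStateException("NUMBER expected at token " + pos);
        }
        pos++;
    }

    // Parses a whitespace-separated token string; returns how many tokens term() consumed.
    static int consumed(String input) {
        ExtraTermsSketch s = new ExtraTermsSketch(Arrays.asList(input.trim().split("\\s+")));
        s.term();
        return s.pos;
    }

    public static void main(String[] args) {
        System.out.println(consumed("1 a 2 b 3")); // priorities rise, so everything is consumed
        System.out.println(consumed("1 b 2 a 3")); // stops before "a", whose priority is too low
    }
}
```

The point of the sketch is the second case: when the next operator has the
wrong priority, the rule returns through the empty alternative rather than
raising an error, which is the behaviour wanted from the ANTLR rule.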

Note: I tested the parsers with ANTLRWorks 1.1.5.
In some cases my grammar seemed to work as intended in the
ANTLRWorks debugger, but it failed when running in the interpreter.






I noticed a number of problems with this and similar attempts:

1. Grammar fragments like {Token op=input.LA(1); getprio(op.getText())>0}?
compile to Java code
            if ( !(Token op=input.LA(1); getprio(op.getText())>0) ) { .... }
which fails to compile (Java does not allow a variable declaration inside
a boolean test).
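
As a minimal sketch of why that generated test is rejected and what a
compilable form looks like: the predicate text is pasted into an if
condition, so it has to be a single boolean expression. In the Java below,
Tok and Stream are stand-ins written for this example (they are not ANTLR
classes), and the priorities returned by getprio are assumed values.

```java
import java.util.Arrays;
import java.util.List;

// Self-contained sketch: Tok and Stream are stand-ins written for this
// example, not ANTLR classes, and getprio uses assumed priority values.
class PredicateExpressionSketch {
    static class Tok {
        private final String text;
        Tok(String text) { this.text = text; }
        String getText() { return text; }
    }

    static class Stream {
        private final List<Tok> toks;
        private int pos = 0;
        Stream(Tok... toks) { this.toks = Arrays.asList(toks); }
        // k-th token of lookahead (1-based); assumes that token exists.
        Tok LT(int k) { return toks.get(pos + k - 1); }
    }

    // Hypothetical priorities: a=1, b=2, c=3, anything else 0.
    static int getprio(String op) {
        switch (op) {
            case "a": return 1;
            case "b": return 2;
            case "c": return 3;
            default:  return 0;
        }
    }

    static boolean predicate(Stream input, int p) {
        // Rejected by javac, because a declaration is a statement, not an expression:
        //   if ( !(Tok op = input.LT(1); getprio(op.getText()) > p) ) { ... }
        // Accepted, because the whole test is one boolean expression:
        return getprio(input.LT(1).getText()) > p;
    }

    public static void main(String[] args) {
        Stream input = new Stream(new Tok("b"), new Tok("2"));
        System.out.println(predicate(input, 1)); // b has assumed priority 2, and 2 > 1
        System.out.println(predicate(input, 2)); // 2 > 2 does not hold
    }
}
```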

2. Splitting out the initializer and the real test, as in
    extraterms[int P]:
    {Token op=input.LT(1);} { getprio(op.getText())>0}? OP
    |
    ;
    does compile, but the resulting parser just checks for OP in the
    input stream. In fact, it seems my grammar is now interpreted as if I
    had written

    extraterms[int P]:
    OP {Token op=input.LT(1);} { getprio(op.getText())>0}?
    |
    ;

which parses the terms incorrectly: if an operator comes with the
wrong priority, the parse fails with a FailedPredicateException instead
of succeeding via the epsilon production.

After more attempts in this direction, I got the impression that I might
need syntactic predicates to get what I want. So, the next attempt:

4. extraterms[...]: .... (p=OP {getprio(p.getText())>P}?)=> OP | ...
The predicate test (the term left of =>) calls a separate function
testsynpred(). The problems now are:
a. The code does not pass the attribute P to testsynpred(), so the ">"
    cannot be evaluated in testsynpred
    (hence the generated Java code does not even compile).
b. Similarly, the variable p (lowercase p) is declared in the extraterms
    code section, not in testsynpred where the parser now wants to use it
    (another reason why the generated Java code does not compile).

5. As an attempted workaround to get the attribute P to testsynpred,
we define a global currentP and write
{ currentP=P; }
        ( {input.LA(1)==OP && getprio(input.LT(1).getText())<currentP }?)
But this results in a code block
    if (backtracking==0) { currentP =P; }
so currentP is probably not set at all, defeating the workaround.




