[antlr-interest] Calling a production returns null? And newline as a separator?

bill robertson bill at tekbot.com
Fri Aug 6 22:51:15 PDT 2010


I'm working on a grammar, and I'm trying to create it such that
expressions can either be separated by newlines or by semicolons.

While working on unit tests, I found something that I don't
understand.  I have a "program" that is a sequence of "expressions."
If I ask the parser for the program, I get an AST with all of the
expressions.  However, if I ask for the expressions one by one,
sometimes I get back a null tree.

Here's sample code...

   public void foo() throws Exception {
       String script =
               "1*2\n"              // expr 1
               + "3+4*55\n"         // expr 2
               + "(5-6)*(77+88);\n" // expr 3
               + "7\n"
               + "/\n"
               + "8\n"              // expr 4
               + "9\n"
               + "\n"
               + "+\n"
               + "10;\n"            // expr 5
               + "11-12\n"
               + ";\n"              // expr 6
               + "13%14\n"          // expr 7
               + "15+16; 17+18\n";  // expr 8, expr 9

       // create() is a test helper (not shown here) that returns a FooParser for the script
       FooParser parser = create(script);
       CommonTree all = (CommonTree)parser.prog().getTree();
       System.out.println("Child count = " + all.getChildCount());
                        //  should be 9

       FooParser parser2 = create(script);
       CommonTree expr1 = (CommonTree)parser2.expr().getTree();
       CommonTree expr2 = (CommonTree)parser2.expr().getTree();
       CommonTree expr3 = (CommonTree)parser2.expr().getTree();
       CommonTree expr4 = (CommonTree)parser2.expr().getTree();
       CommonTree expr5 = (CommonTree)parser2.expr().getTree();

       System.out.println("expr1 " + expr1 + " " + expr1.getChild(0));
       System.out.println("expr2 " + expr2 + " " + expr2.getChild(0));
       System.out.println("expr3 " + expr3 + " " + expr3.getChild(0));
       System.out.println("expr4 " + expr4);   //  expr4 is null
       System.out.println("expr5 " + expr5 + " " + expr5.getChild(0));

   }

It prints the following output (hopefully enough to show which expr is
being returned each time):

Child count = 9
expr1 * 1
expr2 + 3
expr3 * -
expr4 null
expr5 / 7

Is this normal?  Or is there something wrong with the grammar?  (The
grammar is appended at the end of this message.)
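
One possible reading of the output, offered as a guess rather than a
confirmed diagnosis: expr 3 ends at the ";", so the "\n" right after it
is still unread, and the next call to expr() may match the bare NL!
alternative, which has nothing to build a tree from.  The sketch below
is a toy model in plain Java, not ANTLR.  The class ExprCallSketch and
its helpers (tokenize, isOperator, callExpr, callUntilEof) are
hypothetical names made up for illustration, and the model only
approximates the expr rule (for instance, it does not handle newlines
just inside parentheses).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (no ANTLR): imitates how the expr rule seems to eat tokens.
class ExprCallSketch {

    // The script from the test above.
    static final String SCRIPT =
            "1*2\n" + "3+4*55\n" + "(5-6)*(77+88);\n"
            + "7\n" + "/\n" + "8\n"
            + "9\n" + "\n" + "+\n" + "10;\n"
            + "11-12\n" + ";\n"
            + "13%14\n"
            + "15+16; 17+18\n";

    // Numbers, "\n", and single-character symbols; blanks are skipped.
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isDigit(c)) {
                int start = i;
                while (i < s.length() && Character.isDigit(s.charAt(i))) i++;
                tokens.add(s.substring(start, i));
            } else {
                if (c != ' ' && c != '\t' && c != '\r') tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    static boolean isOperator(String t) {
        return t != null && t.length() == 1 && "+-*/%".contains(t);
    }

    // One simulated call to expr(); pos[0] is the cursor into the token list.
    // Returns the expression text, or null when only a bare newline was eaten.
    static String callExpr(List<String> tokens, int[] pos) {
        if (tokens.get(pos[0]).equals("\n")) {   // second alternative: NL!
            pos[0]++;
            return null;                         // nothing to build a tree from
        }
        StringBuilder text = new StringBuilder();
        boolean lastWasOperator = false;
        while (pos[0] < tokens.size()) {
            String t = tokens.get(pos[0]);
            if (t.equals("\n")) {
                int j = pos[0];
                while (j < tokens.size() && tokens.get(j).equals("\n")) j++;
                String next = j < tokens.size() ? tokens.get(j) : null;
                if (lastWasOperator || isOperator(next)) {
                    pos[0] = j;                  // newlines around an operator
                    continue;
                }
                // "NL* SEM" ends the expression; otherwise a single NL does.
                pos[0] = ";".equals(next) ? j + 1 : pos[0] + 1;
                return text.toString();
            }
            pos[0]++;
            if (t.equals(";")) return text.toString();
            text.append(t);
            lastWasOperator = isOperator(t);
        }
        return text.toString();                  // ended by EOF
    }

    // Keep calling until the input is used up, like repeated parser2.expr().
    static List<String> callUntilEof(String script) {
        List<String> tokens = tokenize(script);
        int[] pos = {0};
        List<String> results = new ArrayList<>();
        while (pos[0] < tokens.size()) results.add(callExpr(tokens, pos));
        return results;
    }

    public static void main(String[] args) {
        List<String> results = callUntilEof(SCRIPT);
        for (int i = 0; i < results.size(); i++) {
            System.out.println("call " + (i + 1) + ": " + results.get(i));
        }
    }
}
```

Under these assumptions the fourth call comes back null and the fifth
is 7/8, which lines up with the output above.  Over the whole script
the model makes 12 calls, 9 of them non-null, which would also be
consistent with prog() reporting 9 children.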

Also, is there a better way to allow expressions to be separated by
newlines and semicolons, yet still allow newlines in the middle of
expressions?   I tried as many ways as I could think of, and the only
one I got to work was the extremely verbose approach in the grammar
below.

Thanks!

Grammar follows...

grammar Foo;

options {
output=AST;
}

prog
   :    expr+
   ;

expr
   :   expr_plus_minus ( NL! | NL!* SEM! | EOF!)
   |   NL!
   ;


expr_plus_minus
   :    expr_mult_div_mod
       ( NL!* PLUS^ NL!* expr_mult_div_mod
       | NL!* DASH^ NL!* expr_mult_div_mod
       )*
   ;

expr_mult_div_mod
   :    expr_unary
       ( NL!* STAR^ NL!* expr_unary
       | NL!* SLASH^ NL!* expr_unary
       | NL!* PCT^ NL!* expr_unary
       )*
   ;

expr_unary
   :    expr_atom
   |    ('~' | '`' | BANG) expr_unary // use temp symbols for now
// |    (DASH | PLUS | BANG) expr_unary
//      requires backtracking - don't want to enable yet...
   ;

expr_atom
   :    BASE10
   |    LPAR! NL!* expr_plus_minus NL!* RPAR!
   ;

BASE10  : ('0'..'9'+ | '0'..'9'* DOT '0'..'9'+);

SEM :   ';';
PLUS:   '+';
DASH:   '-';
STAR:   '*';
LPAR:   '(';
RPAR:   ')';
SLASH : '/';
PCT     :       '%';
DOT     :       '.';
BANG:   '!';

NL      :   '\r'? '\n';
WS      :   (' '|'\t'|'\f')+    {$channel=HIDDEN;} ;

