[antlr-interest] Parsing Operators as Atoms?

Wed Nov 14 17:51:25 PST 2012

See attached. The grammar is ambiguous, but it still seems to work.

hope this helps...
    -jbb

On 11/14/2012 05:22 PM, DJB MASTER wrote:
> Given a list...
>
> +
> +
> 2
> +
> 3
> +
> 4
> +
> +
> 6
> +
>
> ...How can I parse it so that it picks out all the infix trees
> (e.g. ^(+ ^(+ 2 3) 4)) and keeps the rest as single trees?
>
> I've tried this rule...
>
> expr: (a=atom -> $a)
>       (op='+' b=atom
>          -> {$a.text != "+" && $b.text != "+"}? ^($op $expr $b) // infix
>          -> {$b.text != "+"}? // HAVING TROUBLE COMING UP WITH THIS CORRECT REWRITE!
>          -> $expr $op $b)*; // simple list
>
> atom: INT | '+';
> INT : '0'..'9'+;
>
> ...and I think I'm almost there. I've been working on this for a couple of
> days with no luck.

-------------- next part --------------
grammar Test;

options {
   output = AST;
   ASTLabelType = CommonTree;
}

@members {

   // test data - each string in the following array is parsed separately
   private static final String [] x = new String[] {
      "+\n+\n2\n+\n3\n+\n4\n+\n+\n6\n+\n",
      "++2+3+4++6++7+8+9+",
      "1++2++3+4++5+6+7",
   };

   public static void main(String [] args) {
      for( int i = 0; i < x.length; ++i ) {
         try {
            System.out.println("about to parse:`"+x[i]+"`");

            TestLexer lexer = new TestLexer(new ANTLRStringStream(x[i]));
            CommonTokenStream tokens = new CommonTokenStream(lexer);

            // System.out.format("dump of the token stream:\%n");
            // tokens.fill();
            // int j = 0;
            // for( Object obj : tokens.getTokens() ) {
            //    Token tok = (Token) obj;
            //    int typ = tok.getType();
            //    System.out.format("\%d: type = \%s, text = `\%s`\%s\%n",
            //                      j++,
            //                      typ==EOF?"EOF":tokenNames[typ],
            //                      tok.getText(),
            //                      tok.getChannel()==HIDDEN?" (HIDDEN)":"");
            // }
            // System.out.format("now performing the parse\n");

            TestParser parser = new TestParser(tokens);
            TestParser.test_return p_result = parser.test();

            CommonTree ast = p_result.tree;
            if( ast == null ) {
               System.out.println("resultant tree: is NULL");
            } else {
               System.out.println("resultant tree: " + ast.toStringTree());
            }
            System.out.println();
         } catch(Exception e) {
            e.printStackTrace();
         }
      }
   }
}

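// Entry point: one list followed by end of input (EOF is kept out of the tree).
test : list EOF! ;

// A list is any mix of stand-alone '+' tokens and infix expressions.
list : (PLUS | expr)+ ;

// Left-associative infix chain; PLUS^ makes each '+' the root of its subtree.
// This is the ambiguous spot: after an INT, a following "PLUS INT" could either
// continue this loop or exit and start a new list element. ANTLR should warn
// and keep consuming here, which is the grouping wanted.
expr : INT (PLUS^ INT)* ;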

PLUS : '+' ;

INT : DIGIT+ ;
fragment DIGIT : '0' .. '9' ;

IDENTIFIER : LETTER ('_'|LETTER|DIGIT)* ;
fragment LETTER : 'A' .. 'Z' | 'a' .. 'z' ;

// Whitespace -- ignored
WS : ( ' ' | '\t' | '\f' | '\r' | '\n' )+ { $channel=HIDDEN; } ;
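
For anyone who wants to check the intended grouping without running ANTLR, here is a minimal hand-written sketch in plain Java. It is not generated by ANTLR and is not part of the attached grammar; the class name InfixGrouper and its methods are made up for illustration. It assumes the same rule the grammar relies on: an INT followed by "+ INT" is folded into a left-associative (+ a b) tree, and every other '+' is kept as a single node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not ANTLR output.
final class InfixGrouper {

    // Split the input into INT and '+' tokens, skipping whitespace.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '+') {
                tokens.add("+");
                i++;
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add(input.substring(start, i));
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }

    // True if position i exists and holds an INT token (anything but "+").
    static boolean isInt(List<String> t, int i) {
        return i < t.size() && !t.get(i).equals("+");
    }

    // Mirrors list : (PLUS | expr)+ with expr : INT (PLUS^ INT)*,
    // taking the "+ INT" continuation whenever it is available.
    static String group(String input) {
        List<String> t = tokenize(input);
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < t.size()) {
            if (!isInt(t, i)) {          // stand-alone '+'
                out.add("+");
                i++;
                continue;
            }
            String tree = t.get(i++);    // first operand of an expr
            while (i < t.size() && t.get(i).equals("+") && isInt(t, i + 1)) {
                tree = "(+ " + tree + " " + t.get(i + 1) + ")";
                i += 2;
            }
            out.add(tree);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // prints: + + (+ (+ 2 3) 4) + + 6 +
        System.out.println(group("+\n+\n2\n+\n3\n+\n4\n+\n+\n6\n+\n"));
    }
}
```

The lookahead check in the inner loop (a '+' only joins the expression when an INT follows it) plays the same role as ANTLR's resolution of the ambiguity in expr. A '+' followed by another '+' or by end of input falls through to the outer loop and is emitted on its own.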