[antlr-interest] Inserting missing nodes

Thu May 5 01:50:40 PDT 2011

How about something like this:

grammar MyGrammar;

options {
  output=AST;
}

tokens {
  DEFAULT_OP;
}

query
  :  andExpression EOF -> andExpression
  ;

andExpression
  :  (orExpression -> orExpression)
     ( AND e=orExpression              -> ^(AND $e $andExpression)
     | (orExpression)=> e=orExpression -> ^(DEFAULT_OP $e $andExpression)
     )*
  ;

orExpression
  :  negation (OR^ negation)*
  ;

negation
  :  NOT operand -> ^(NOT operand)
  |  operand
  ;

operand
  :  WORD
  |  '(' andExpression ')' -> andExpression
  ;

AND   : 'AND';
OR    : 'OR';
NOT   : 'NOT';
WORD  : 'a'..'z'+;
SPACE : (' ' | '\t' | '\r' | '\n') {skip();};

Test class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in =
            new ANTLRStringStream("software engineer OR java programmer");
        MyGrammarLexer lexer = new MyGrammarLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        MyGrammarParser parser = new MyGrammarParser(tokens);
        MyGrammarParser.query_return returnValue = parser.query();
        CommonTree tree = (CommonTree)returnValue.getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}
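
To see what shape of tree the andExpression rule above should build, without
generating the parser first, here is a rough hand-written sketch in plain
Java. It is only my approximation of the grammar, not ANTLR output: the class
name ImplicitOperatorSketch, the whitespace tokenizer and the string-based
trees are all made up for illustration. Like the rewrite rules, it puts each
newly parsed operand first and the tree built so far second.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated parser; trees are plain LISP-style strings.
final class ImplicitOperatorSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private ImplicitOperatorSketch(String input) {
        // Crude tokenizer (an assumption): pad parentheses, then split on whitespace.
        String padded = input.replace("(", " ( ").replace(")", " ) ").trim();
        for (String t : padded.split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
    }

    static String parse(String input) {
        ImplicitOperatorSketch p = new ImplicitOperatorSketch(input);
        String tree = p.andExpression();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // Plays the role of the (orExpression)=> predicate: can an operand start here?
    private boolean startsOperand() {
        String t = peek();
        return t != null && !t.equals("AND") && !t.equals("OR") && !t.equals(")");
    }

    // andExpression : orExpression ( AND orExpression | orExpression )*
    private String andExpression() {
        String tree = orExpression();
        while (true) {
            if ("AND".equals(peek())) {
                pos++;
                tree = "(AND " + orExpression() + " " + tree + ")";
            } else if (startsOperand()) {
                // No explicit operator between two operands: insert DEFAULT_OP.
                tree = "(DEFAULT_OP " + orExpression() + " " + tree + ")";
            } else {
                return tree;
            }
        }
    }

    // orExpression : negation (OR^ negation)*
    private String orExpression() {
        String tree = negation();
        while ("OR".equals(peek())) {
            pos++;
            tree = "(OR " + tree + " " + negation() + ")";
        }
        return tree;
    }

    // negation : NOT operand | operand
    private String negation() {
        if ("NOT".equals(peek())) {
            pos++;
            return "(NOT " + operand() + ")";
        }
        return operand();
    }

    // operand : WORD | '(' andExpression ')'
    private String operand() {
        String t = peek();
        if (t == null) throw new IllegalArgumentException("unexpected end of input");
        pos++;
        if (t.equals("(")) {
            String inner = andExpression();
            if (!")".equals(peek())) throw new IllegalArgumentException("missing ')'");
            pos++;
            return inner;
        }
        if (!t.matches("[a-z]+")) throw new IllegalArgumentException("unexpected token: " + t);
        return t;
    }

    public static void main(String[] args) {
        System.out.println(parse("abc def zyx toto"));
        System.out.println(parse("software engineer OR java programmer"));
    }
}
```

Either way there is always a single root node, so the whole query comes back
as one tree. If you want the operands in source order, swap the two operands
in the rewrite rules (and in the sketch).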

Regards,

Bart.

On Wed, May 4, 2011 at 4:51 PM, Jean-Sebastien Vachon <
jean-sebastien.vachon at wantedtech.com> wrote:

> Thanks for your input. So here is the whole thing with two use cases that
> are not giving me the expected results...
> (Sorry for the long post)
>
> INPUT = abc def zyx toto
> RESULT = (DEFAULT_OP abc def) (DEFAULT_OP zyx toto)
> EXPECTED = (DEFAULT_OP (DEFAULT_OP abc def) (DEFAULT_OP zyx toto))
>
> INPUT = software engineer OR java programmer
> RESULT = (DEFAULT_OP software (OR engineer java)) programmer
> EXPECTED = (DEFAULT_OP (DEFAULT_OP software (OR engineer java)) programmer)
>
> I'm also having some trouble using the Interpreter within Eclipse.
> The same expressions are not working in the interpreter. It fails to
> generate the tree with a "NoViableAltException at input 'abc'" (for the
> first case). I don't think this is related to my other problem since I
> can't get it to generate any tree.
>
> Thanks again for your time and comments
>
>
> ----------------------------------------------------------------------
> Grammar (validation by building a tree and trying to insert missing operators)
> ----------------------------------------------------------------------
> grammar MyGrammar;
>
> options {
>  language = Java;
>  output = AST;
>  ASTLabelType = CommonTree;
> }
>
> // Rules to build the tree representation of our expression...
>
> query
>  : and_expr+ EOF!
>  ;
>
> // Each AND expression can contain OR expressions...
> and_expr
>  : (expr expr+) => default_op
>  | (u1=or_expr (AND^ u2=or_expr)*)
>  ;
>
> // An OR expression contains one or more expressions
> or_expr
>  : u1=expr (OR^ u2=expr)*
>  ;
>
> default_op
>  : (e1=or_expr e2=or_expr) -> ^(DEFAULT_OP $e1 $e2)
>  ;
>
> expr
>  : (NOT^)? (operand)
>  ;
>
> // The leaves of the tree: words, sentences and so on...
> // Note that an expression such as '-word' is rewritten in its 'NOT word' form
> operand
>  : (f=FIELD^)(o=operand)
>   | PREFIX
>  | WORD
>  | SENTENCE
>  | WORDLIST
>  | NEGATIVE(w=PREFIX|w=WORD|w=SENTENCE|w=WORDLIST) -> ^(NOT $w)
>  | MUST
>   | LPAREN! and_expr RPAREN!
>  ;
>
> // Lexer ...
> NEGATIVE    : '-';
> LPAREN      : '(' ;
> RPAREN      : ')' ;
> DOUBLEQUOTE : '"';
> STAR          : '*';
> AND         : 'AND' | '+';
> OR          : 'OR';
> NOT         : 'NOT';
> DEFAULT_OP  : 'DEF_OP';
> FIELD       : ('title'|'TITLE'|'Title')(FIELDSEPARATOR);
> WS          : (WSCHAR)+ { $channel=HIDDEN; };
> PREFIX      : WORDCHAR+(STAR);
> WORD        : WORDCHAR+(('-'|'+')WORDCHAR*)*;
> SENTENCE    : ((DOUBLEQUOTE)(~(DOUBLEQUOTE))*(DOUBLEQUOTE));
> WORDLIST    : ((PREFIX | WORD | SENTENCE)(','(WS)* (PREFIX | WORD | SENTENCE))+);
> MUST          : '+'(PREFIX|WORD|SENTENCE|WORDLIST);
> fragment WORDCHAR       : (~( WSCHAR | LPAREN | RPAREN | '-' |':' | '+' | ',' | STAR | DOUBLEQUOTE) );
> fragment FIELDSEPARATOR : ':';
> fragment WSCHAR         : ( ' ' | '\t' | '\r' | '\n');
>
>
>
> ================================= END OF GRAMMAR ==========================
>
>
>
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:
> antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
> Sent: May-04-11 10:21 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Inserting missing nodes
>
> On Wed, May 4, 2011 at 4:12 PM, Jean-Sebastien Vachon <
> jean-sebastien.vachon at wantedtech.com> wrote:
>
> > No one can help me with this? :S
> > Let me know if something is not clear. I need to fix this issue as
> > soon as I can.
> >
> > Thanks
>
>
> The fact that you didn't provide the lexer rules (although they might be
> straight-forward as you mentioned), and you didn't mention what input you're
> specifically having problems with parsing (the following is a bit
> vague: *"... but I can't get it to parse everything I'm throwing at it
> ..."*), might be some reasons why you haven't been answered.
>
> Regards,
>
> Bart.
>