[antlr-interest] Inserting missing nodes

Jean-Sebastien Vachon jean-sebastien.vachon at wantedtech.com
Thu May 5 07:08:01 PDT 2011


Hi,

I've integrated your solution into my whole grammar and it works perfectly.

Thanks for your help.

From: Bart Kiers [mailto:bkiers at gmail.com]
Sent: May-05-11 4:51 AM
To: Jean-Sebastien Vachon
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Inserting missing nodes

How about something like this:

grammar MyGrammar;

options {
  output=AST;
}

tokens {
  DEFAULT_OP;
}

query
  :  andExpression EOF -> andExpression
  ;

andExpression
  :  (orExpression -> orExpression) ( AND e=orExpression              -> ^(AND $e $andExpression)
                                    | (orExpression)=> e=orExpression -> ^(DEFAULT_OP $e $andExpression)
                                    )*
  ;

orExpression
  :  negation (OR^ negation)*
  ;

negation
  :  NOT operand -> ^(NOT operand)
  |  operand
  ;

operand
  :  WORD
  |  '(' andExpression ')' -> andExpression
  ;

AND   : 'AND';
OR    : 'OR';
NOT   : 'NOT';
WORD  : 'a'..'z'+;
SPACE : (' ' | '\t' | '\r' | '\n') {skip();};

Test class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream("software engineer OR java programmer");
        MyGrammarLexer lexer = new MyGrammarLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        MyGrammarParser parser = new MyGrammarParser(tokens);
        MyGrammarParser.query_return returnValue = parser.query();
        CommonTree tree = (CommonTree)returnValue.getTree();
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}
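
For readers without ANTLR at hand, the accumulation done by the andExpression rule in the grammar above can be approximated with a small hand-written recursive-descent parser in plain Java. This is only a sketch under stated assumptions: the class name ImplicitOpSketch and its parse method are hypothetical, tokens are split on whitespace (so any token that is not AND, OR, NOT or a parenthesis is treated as a WORD), and the syntactic predicate is reduced to a one-token check. It keeps the operand order of the rewrite rules as written ($e first, then the tree built so far), so every implicit operator wraps the previous result and the output always has a single root.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written sketch; this is not code generated by ANTLR.
public class ImplicitOpSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private ImplicitOpSketch(String input) {
        // Simplification: pad parentheses, then split on whitespace.
        String padded = input.replace("(", " ( ").replace(")", " ) ");
        for (String t : padded.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
    }

    // Parses the whole input and returns the tree in LISP-style text form.
    public static String parse(String input) {
        ImplicitOpSketch p = new ImplicitOpSketch(input);
        String tree = p.andExpression();
        if (p.pos != p.tokens.size()) {
            throw new IllegalStateException("unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // Stand-in for the (orExpression)=> predicate: can an operand start here?
    private boolean startsOperand(String t) {
        return t != null && !t.equals("AND") && !t.equals("OR") && !t.equals(")");
    }

    // andExpression: orExpression ( AND orExpression | orExpression )*
    private String andExpression() {
        String tree = orExpression();
        while (true) {
            if ("AND".equals(peek())) {
                pos++;
                String e = orExpression();
                tree = "(AND " + e + " " + tree + ")";        // ^(AND $e $andExpression)
            } else if (startsOperand(peek())) {
                String e = orExpression();
                tree = "(DEFAULT_OP " + e + " " + tree + ")"; // ^(DEFAULT_OP $e $andExpression)
            } else {
                return tree;
            }
        }
    }

    // orExpression: negation (OR^ negation)*
    private String orExpression() {
        String tree = negation();
        while ("OR".equals(peek())) {
            pos++;
            tree = "(OR " + tree + " " + negation() + ")";
        }
        return tree;
    }

    // negation: NOT operand | operand
    private String negation() {
        if ("NOT".equals(peek())) {
            pos++;
            return "(NOT " + operand() + ")";
        }
        return operand();
    }

    // operand: WORD | '(' andExpression ')'
    private String operand() {
        String t = peek();
        if (t == null || t.equals("AND") || t.equals("OR") || t.equals("NOT") || t.equals(")")) {
            throw new IllegalStateException("operand expected but found: " + t);
        }
        pos++;
        if (t.equals("(")) {
            String inner = andExpression();
            if (!")".equals(peek())) {
                throw new IllegalStateException("missing ')'");
            }
            pos++;
            return inner;
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(parse("abc def zyx toto"));
        System.out.println(parse("software engineer OR java programmer"));
    }
}
```

With this sketch, "abc def zyx toto" comes out as (DEFAULT_OP toto (DEFAULT_OP zyx (DEFAULT_OP def abc))). The nesting differs from a balanced tree, but there is exactly one root node, which is the property the original grammar was missing.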

Regards,

Bart.

On Wed, May 4, 2011 at 4:51 PM, Jean-Sebastien Vachon <jean-sebastien.vachon at wantedtech.com> wrote:
Thanks for your input. So here is the whole thing with two use cases that are not giving me the expected results...
(Sorry for the long post)

INPUT = abc def zyx toto
RESULT = (DEFAULT_OP abc def) (DEFAULT_OP zyx toto)
EXPECTED = (DEFAULT_OP (DEFAULT_OP abc def) (DEFAULT_OP zyx toto))

INPUT = software engineer OR java programmer
RESULT = (DEFAULT_OP software (OR engineer java)) programmer
EXPECTED =  (DEFAULT_OP (DEFAULT_OP software (OR engineer java)) programmer)

I'm also having some trouble using the Interpreter within Eclipse.
The same expressions are not working in the interpreter. It fails to generate the
tree with a "NoViableAltException at input 'abc' " (for the first case).
I don't think this is related to my other problem since I can't get it to generate any tree.

Thanks again for your time and comments

----------------------------------------------------------------------------------------------------------
Grammar (validation by building a tree and trying to insert missing operators)
----------------------------------------------------------------------------------------------------------
grammar MyGrammar;

options {
 language = Java;
 output = AST;
 ASTLabelType = CommonTree;
}
// Rules to build the tree representation of our expression...

query
 : and_expr+ EOF!
 ;

// Each AND expression can contain OR expressions...
and_expr
 : (expr expr+) => default_op
 | (u1=or_expr (AND^ u2=or_expr)*)
 ;
// An OR expression contains one or more expressions
or_expr
 : u1=expr (OR^ u2=expr)*
 ;

default_op
 : (e1=or_expr e2=or_expr) -> ^(DEFAULT_OP $e1 $e2)
 ;

expr
 : (NOT^)? (operand)
 ;
// The leaves of the tree: words, sentences and so on...
// Note that an expression such as '-word' is rewritten in its 'NOT word' form
operand
 : (f=FIELD^)(o=operand)
 | PREFIX
 | WORD
 | SENTENCE
 | WORDLIST
 | NEGATIVE(w=PREFIX|w=WORD|w=SENTENCE|w=WORDLIST) -> ^(NOT $w)
 | MUST
 | LPAREN! and_expr RPAREN!
 ;

// Lexer ...
NEGATIVE    : '-';
LPAREN      : '(' ;
RPAREN      : ')' ;
DOUBLEQUOTE : '"';
STAR        : '*';
AND         : 'AND' | '+';
OR          : 'OR';
NOT         : 'NOT';
DEFAULT_OP  : 'DEF_OP';
FIELD       : ('title'|'TITLE'|'Title')(FIELDSEPARATOR);
WS          : (WSCHAR)+ { $channel=HIDDEN; };
PREFIX      : WORDCHAR+(STAR);
WORD        : WORDCHAR+(('-'|'+')WORDCHAR*)*;
SENTENCE    : ((DOUBLEQUOTE)(~(DOUBLEQUOTE))*(DOUBLEQUOTE));
WORDLIST    : ((PREFIX | WORD | SENTENCE)(','(WS)* (PREFIX | WORD | SENTENCE))+);
MUST        : '+'(PREFIX|WORD|SENTENCE|WORDLIST);
fragment WORDCHAR       : (~( WSCHAR | LPAREN | RPAREN | '-' |':' | '+' | ',' | STAR | DOUBLEQUOTE) );
fragment FIELDSEPARATOR : ':';
fragment WSCHAR         : ( ' ' | '\t' | '\r' | '\n');



================================= END OF GRAMMAR ==========================





-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
Sent: May-04-11 10:21 AM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Inserting missing nodes

On Wed, May 4, 2011 at 4:12 PM, Jean-Sebastien Vachon <jean-sebastien.vachon at wantedtech.com> wrote:

> No one can help me with this? :S
> Let me know if something is not clear. I need to fix this issue as
> soon as I can.
>
> Thanks


The fact that you didn't provide the lexer rules (although they might be straightforward, as you mentioned), and that you didn't mention what input you're specifically having problems parsing (the following is a bit vague: "... but I can't get it to parse everything I'm throwing at it ..."), might be some of the reasons why you haven't been answered.

Regards,

Bart.