[antlr-interest] Inserting missing nodes
Bart Kiers
bkiers at gmail.com
Thu May 5 01:50:40 PDT 2011
How about something like this:
grammar MyGrammar;

options {
  output=AST;
}

tokens {
  DEFAULT_OP;
}

query
  : andExpression EOF -> andExpression
  ;

andExpression
  : (orExpression -> orExpression)
    ( AND e=orExpression              -> ^(AND $e $andExpression)
    | (orExpression)=> e=orExpression -> ^(DEFAULT_OP $e $andExpression)
    )*
  ;

orExpression
  : negation (OR^ negation)*
  ;

negation
  : NOT operand -> ^(NOT operand)
  | operand
  ;

operand
  : WORD
  | '(' andExpression ')' -> andExpression
  ;

AND   : 'AND';
OR    : 'OR';
NOT   : 'NOT';
WORD  : 'a'..'z'+;
SPACE : (' ' | '\t' | '\r' | '\n') {skip();};
Test class:
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    // Lex and parse the sample query.
    ANTLRStringStream in = new ANTLRStringStream("software engineer OR java programmer");
    MyGrammarLexer lexer = new MyGrammarLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    MyGrammarParser parser = new MyGrammarParser(tokens);
    MyGrammarParser.query_return returnValue = parser.query();

    // Print the resulting AST in DOT format (viewable with Graphviz).
    CommonTree tree = (CommonTree)returnValue.getTree();
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}
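
For readers without ANTLR at hand, the tree-building loop in andExpression can be imitated by a small hand-written parser. This is only a sketch under assumptions: the class ImplicitOpSketch and its methods (parse, startsOperand and so on) are made up for illustration and are not generated by or part of ANTLR, the tokenizer is a simplification of the lexer rules, and trees are returned as LISP-style strings rather than CommonTree objects. It mirrors the child order of the rewrite rules as posted (new operand first, previously built tree second).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-written sketch of the grammar's tree shapes; hypothetical, not ANTLR output.
class ImplicitOpSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private ImplicitOpSketch(String input) {
        // Words and parentheses become tokens; everything else is skipped.
        Matcher m = Pattern.compile("[()]|[A-Za-z]+").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** Parses a query and returns its tree in LISP-style notation. */
    static String parse(String input) {
        ImplicitOpSketch p = new ImplicitOpSketch(input);
        String tree = p.andExpression();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.peek());
        }
        return tree;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // Plays the role of the (orExpression)=> predicate: can an operand start here?
    private boolean startsOperand() {
        String t = peek();
        return t != null && !t.equals(")") && !t.equals("AND") && !t.equals("OR");
    }

    private String andExpression() {
        String tree = orExpression();
        while (true) {
            if ("AND".equals(peek())) {
                pos++; // explicit operator
                tree = "(AND " + orExpression() + " " + tree + ")";
            } else if (startsOperand()) {
                // No operator between two operands: insert DEFAULT_OP.
                tree = "(DEFAULT_OP " + orExpression() + " " + tree + ")";
            } else {
                return tree;
            }
        }
    }

    private String orExpression() {
        String tree = negation();
        while ("OR".equals(peek())) {
            pos++;
            tree = "(OR " + tree + " " + negation() + ")";
        }
        return tree;
    }

    private String negation() {
        if ("NOT".equals(peek())) {
            pos++;
            return "(NOT " + operand() + ")";
        }
        return operand();
    }

    private String operand() {
        String t = peek();
        if (t == null) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        pos++;
        if (t.equals("(")) {
            String inner = andExpression();
            if (!")".equals(peek())) {
                throw new IllegalArgumentException("expected ')'");
            }
            pos++;
            return inner; // parentheses are dropped from the tree
        }
        if (t.equals(")") || t.equals("AND") || t.equals("OR") || t.equals("NOT")) {
            throw new IllegalArgumentException("unexpected token: " + t);
        }
        return t;
    }

    public static void main(String[] args) {
        // prints (DEFAULT_OP toto (DEFAULT_OP zyx (DEFAULT_OP def abc)))
        System.out.println(parse("abc def zyx toto"));
        // prints (DEFAULT_OP programmer (DEFAULT_OP (OR engineer java) software))
        System.out.println(parse("software engineer OR java programmer"));
    }
}
```

With the rewrite rules as posted, each new operand becomes the first child, so the operands appear in reverse input order. If source order is wanted, as in the EXPECTED trees quoted below, swapping the two references (for example ^(DEFAULT_OP $andExpression $e)) should give that shape.
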
Regards,
Bart.
On Wed, May 4, 2011 at 4:51 PM, Jean-Sebastien Vachon <
jean-sebastien.vachon at wantedtech.com> wrote:
> Thanks for your input. So here is the whole thing with two use cases that
> are not giving me the expected results...
> (Sorry for the long post)
>
> INPUT = abc def zyx toto
> RESULT = (DEFAULT_OP abc def) (DEFAULT_OP zyx toto)
> EXPECTED = (DEFAULT_OP (DEFAULT_OP abc def) (DEFAULT_OP zyx toto))
>
> INPUT = software engineer OR java programmer
> RESULT = (DEFAULT_OP software (OR engineer java)) programmer
> EXPECTED = (DEFAULT_OP (DEFAULT_OP software (OR engineer java)) programmer)
>
> I'm also having some trouble using the interpreter within Eclipse.
> The same expressions do not work there: it fails to generate the tree
> with a "NoViableAltException at input 'abc'" (for the first case).
> I don't think this is related to my other problem, since I can't get the
> interpreter to generate any tree at all.
>
> Thanks again for your time and comments
>
>
> ----------------------------------------------------------------------------------------------------------
> Grammar (validation by building a tree and trying to insert missing operators)
> ----------------------------------------------------------------------------------------------------------
> grammar MyGrammar;
>
> options {
> language = Java;
> output = AST;
> ASTLabelType = CommonTree;
> }
>
> // Rules to build the tree representation of our expression...
>
> query
> : and_expr+ EOF!
> ;
>
> // Each AND expression can contain OR expressions...
> and_expr
> : (expr expr+) => default_op
> | (u1=or_expr (AND^ u2=or_expr)*)
> ;
>
> // A OR expression contains one or more expression
> or_expr
> : u1=expr (OR^ u2=expr)*
> ;
>
> default_op
> : (e1=or_expr e2=or_expr) -> ^(DEFAULT_OP $e1 $e2)
> ;
>
> expr
> : (NOT^)? (operand)
> ;
>
> // The leaves of the tree: words, sentences and so on.
> // Note that an expression such as '-word' is rewritten in its 'NOT word' form
> operand
> : (f=FIELD^)(o=operand)
> | PREFIX
> | WORD
> | SENTENCE
> | WORDLIST
> | NEGATIVE(w=PREFIX|w=WORD|w=SENTENCE|w=WORDLIST) -> ^(NOT $w)
> | MUST
> | LPAREN! and_expr RPAREN!
> ;
>
> // Lexer ...
> NEGATIVE : '-';
> LPAREN : '(' ;
> RPAREN : ')' ;
> DOUBLEQUOTE : '"';
> STAR : '*';
> AND : 'AND' | '+';
> OR : 'OR';
> NOT : 'NOT';
> DEFAULT_OP : 'DEF_OP';
> FIELD : ('title'|'TITLE'|'Title')(FIELDSEPARATOR);
> WS : (WSCHAR)+ { $channel=HIDDEN; };
> PREFIX : WORDCHAR+(STAR);
> WORD : WORDCHAR+(('-'|'+')WORDCHAR*)*;
> SENTENCE : ((DOUBLEQUOTE)(~(DOUBLEQUOTE))*(DOUBLEQUOTE));
> WORDLIST : ((PREFIX | WORD | SENTENCE)(','(WS)* (PREFIX | WORD | SENTENCE))+);
> MUST : '+'(PREFIX|WORD|SENTENCE|WORDLIST);
> fragment WORDCHAR : (~( WSCHAR | LPAREN | RPAREN | '-' |':' | '+' | ',' | STAR | DOUBLEQUOTE) );
> fragment FIELDSEPARATOR : ':';
> fragment WSCHAR : ( ' ' | '\t' | '\r' | '\n');
>
>
>
> ================================= END OF GRAMMAR ==========================
>
>
>
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:
> antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
> Sent: May-04-11 10:21 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Inserting missing nodes
>
> On Wed, May 4, 2011 at 4:12 PM, Jean-Sebastien Vachon <
> jean-sebastien.vachon at wantedtech.com> wrote:
>
> > No one can help me with this? :S
> > Let me know if something is not clear. I need to fix this issue as
> > soon as I can.
> >
> > Thanks
>
>
> You didn't provide the lexer rules (although they might be straightforward,
> as you mentioned), and you didn't say which input you're specifically
> having problems parsing ("... but I can't get it to parse everything I'm
> throwing at it ..." is a bit vague). Those might be some reasons why you
> haven't been answered.
>
> Regards,
>
> Bart.
>