[antlr-interest] Using antlr to retreive logical expressions

Miguel Almeida migueldealmeida at gmail.com
Fri Mar 2 11:34:56 PST 2012


For those interested, I made some improvements in my grammar file [1].

Current limitations:
- only works with &&/||, not AND/OR (not very problematic)
- you can't have spaces between the parenthesis and the &&/|| (I solve that
by replacing " (" with ")" and ") " with ")" in the source String before
feeding the lexer)



[1] My grammar file
grammar Logic;

options {
  output = AST;
}

tokens {
  AND = '&&';
  OR  = '||';
  NOT = '~';
}

// parser/production rules start with a lower case letter
parse
  :  expression EOF!    // omit the EOF token
  ;

expression
  :  or
  ;

or
  :  and (OR^ and)*    // make `||` the root
  ;

and
  :  not (AND^ not)*      // make `&&` the root
  ;

not
  :  NOT^ atom    // make `~` the root
  |  atom
  ;

atom
  :  ID
  |  '('! expression ')'!    // omit both `(` and `)`
  ;

// lexer/terminal rules start with an upper case letter
ID
  :
    (
    'a'..'z'
    | 'A'..'Z'
    | '0'..'9' | ' '
    | SYMBOL
  )+
  ;

SYMBOL
  :
    ('+'|'-'|'*'|'/'|'_')
 ;

//Space : (' ' | '\t' | '\r' | '\n')+ {$channel=HIDDEN;};


On Fri, Mar 2, 2012 at 11:58 AM, Martijn Reuvers
<martijn.reuvers at gmail.com>wrote:

> Hello Miguel,
>
> Ok, well I am not sure if that is possible at once in one single parse
> as you want to treat the space as either separator for tokens, but for
> one token you don't want that (the complete expression), maybe others
> know this?
>
> The tree visitor could still combine the expression for you, so you
> can evaluate it the way you want (same pattern but instead just
> combining into a single string).
>
> What you could do perhaps too is either rewrite the ast, and combine
> the expression into a single node (this could be done by either tree
> grammars I guess, or just manually when visiting) or you could also do
> it in the parser directly, though you may not want to use the ast
> feature then and build a tree yourself (directly).
>
> E.g. quick sample parser below the mail (this just combines the text,
> you probably want to create a node with the text).
>
> Regards,
> Martijn
>
> grammar test;
>
> @header {
>  package test;
>
>  import java.util.List;
>  import java.util.ArrayList;
> }
>
> @members {
>
>  final List list = new ArrayList();
>
> }
>
> r_start :       x=INT EQ y=INT { list.add($x.text + $EQ.text + $y.text);
> System.out.println(list); };
>
> EQ      :       'eq';
>
> INT     :       ('0'..'9')+;
>
> WS      :       (' ' | '\t' | '\r' | '\n')+ { skip(); };
>
>
> On Fri, Mar 2, 2012 at 12:22 PM, Miguel Almeida
> <migueldealmeida at gmail.com> wrote:
> > Hi Martin,
> >
> > That's the thing - I don't think I want that. I already have the code
> that
> > evaluates my expressions given a String with the clause (eg: given "1 eq
> 1"
> > it returns true, "2000-01-01 gt 2011-12-12" evaluates to false).
> >
> > That is why I'd like the token to be the complete expression and not
> >   eq
> > 1      5
> >
> > I'm comfortable with the tree visitor/walker. What I do need is the tree
> to
> > be built like
> >
> > http://i286.photobucket.com/albums/ll86/wild_oscar/get_preview.png
> >
> >
> >
> >
> > On Fri, Mar 2, 2012 at 11:09 AM, Martijn Reuvers <
> martijn.reuvers at gmail.com>
> > wrote:
> >>
> >> Hello Miguel,
> >>
> >> You probably do want an ast like (the last thing you mentioned):
> >>
> >>   eq
> >> 1      5
> >>
> >> These are relatively easy to evaluate, some pseudo code when using a
> >> treevisitor/walker:
> >>
> >> function walkEQ(..) {
> >>  walk(nodeWith1)
> >>  walk(nodeWith5)
> >>
> >>  // Then compare the the results of these 2 nodes and do with the
> >> result what you want (you can bind the result e.g. inside the node
> >> directly or work with return values, there's multiple ways to Rome
> >> after all).
> >> }
> >>
> >> Martijn
> >>
> >> On Fri, Mar 2, 2012 at 11:32 AM, Miguel Almeida
> >> <migueldealmeida at gmail.com> wrote:
> >> > Thank you for your answers,
> >> >
> >> > Both links touch parts of the problem -
> >> >
> >> >
> http://www.alittlemadness.com/2006/06/05/antlr-by-example-part-1-the-language/
> >> > is pretty similar to what I have. However, it doesn't deal with
> >> > precisely
> >> > what I'm having problems with: whitespaces.
> >> >
> >> > Every possible token in these examples doesn't have whitespace. In my
> >> > case,
> >> > however, I want a token with spaces (eg: "1 eq 1"). Where should I
> >> > include
> >> > this so that the correct expression is in the tree leaf? I take it is
> >> > should be in the ID lexer, but I'm not quite sure how.
> >> > If, on the other hand, I try to go the parser rule route, like:
> >> > clause: ID operator ID;
> >> > operator: 'eq' | 'neq' | 'gt' | 'lt' ;
> >> >
> >> > Then I won't have a leaf like "1 eq q", but a node "clause" composed
> of
> >> > 3
> >> > child leafs: "1", "eq", "1", which I take it is not appropriate. I
> can't
> >> > just test if the tree node is a clause an combine all children if it
> is,
> >> > can I?
> >> >
> >> > I appreciate your feedback,
> >> >
> >> > Miguel
> >> >
> >> > On Thu, Mar 1, 2012 at 7:30 PM, Bart Kiers <bkiers at gmail.com> wrote:
> >> >
> >> >> Only that tutorial is based on ANTLR v2.7, whose syntax differs
> >> >> considerably from the most recent ANTLR version (3.4).
> >> >>
> >> >> Regards,
> >> >>
> >> >> Bart.
> >> >>
> >> >>
> >> >> On Thu, Mar 1, 2012 at 8:12 PM, Kunal Naik <kunal.a.naik at gmail.com>
> >> >> wrote:
> >> >>
> >> >>> Hi Miguel,
> >> >>>
> >> >>> I think this blog post might be beneficial to you:
> >> >>>
> >> >>>
> >> >>>
> http://www.alittlemadness.com/2006/06/05/antlr-by-example-part-1-the-language/
> >> >>> In part 4 he explains the process of writing a tree grammar where
> you
> >> >>> can
> >> >>> inject your code that can evaluate your expressions.
> >> >>>
> >> >>> HTH,
> >> >>> Kunal
> >> >>>
> >> >>> On Thu, Mar 1, 2012 at 1:31 PM, Miguel Almeida
> >> >>> <migueldealmeida at gmail.com
> >> >>> >wrote:
> >> >>>
> >> >>> > Dear all,
> >> >>> >
> >> >>> > I need to parse and evaluate expressions which are in the format:
> >> >>> > - x eq 1 && y eq 10
> >> >>> > - (x lt 10 && x gt 1) || x eq -1
> >> >>> >
> >> >>> > I have the evaluator part working (ie, I have code that evaluates
> >> >>> > all
> >> >>> the
> >> >>> > gt/lt/eq/neq expressions. All I need is the part that breaks the
> >> >>> > clause
> >> >>> > into these expressions and then applies the logical and/or's
> >> >>> >
> >> >>> > I saw a recommendation on ANTLR to do this. My idea is to:
> >> >>> > 1) Build a tree
> >> >>> > 2) Execute the leafs using my already existing code (eg, replace
> "1
> >> >>> > eq
> >> >>> 10"
> >> >>> > with false)
> >> >>> > 3) Execute a method that then applies the logical operation to get
> >> >>> > the
> >> >>> > result of the tree
> >> >>> >
> >> >>> > While I've spend the last couple of days reading things about
> ANTLR,
> >> >>> > I
> >> >>> am
> >> >>> > kind of lost at the moment: I can't seem to be able to get a tree
> >> >>> structure
> >> >>> > whose tokens hold either the && and || or the complete
> expressions.
> >> >>> >
> >> >>> > My current grammar is [1]. An example test case is [2]
> >> >>> > - If I omit the | ' ' from the ID, "x eq 1" will be 3 tokens
> instead
> >> >>> > of
> >> >>> the
> >> >>> > one token I need
> >> >>> > - If I leave it there, then for example this "1 eq 1 && (bb eq 1)"
> >> >>> > expression won't work (No viable alternative at input '(' )
> >> >>> >
> >> >>> >
> >> >>> > Could you shed some light on what could be wrong?
> >> >>> >
> >> >>> > [1]
> >> >>> > grammar Logic;
> >> >>> >
> >> >>> > options {
> >> >>> >  output = AST;
> >> >>> > }
> >> >>> >
> >> >>> > tokens {
> >> >>> >  AND = '&&';
> >> >>> >  OR  = '||';
> >> >>> > }
> >> >>> >
> >> >>> > // parser/production rules start with a lower case letter
> >> >>> > parse
> >> >>> >  :  expression EOF!    // omit the EOF token
> >> >>> >  ;
> >> >>> >
> >> >>> > expression
> >> >>> >  :  implication
> >> >>> >  ;
> >> >>> >
> >> >>> > implication
> >> >>> >  :  or ('->'^ or)*    // make `->` the root
> >> >>> >  ;
> >> >>> >
> >> >>> > or
> >> >>> >  :  and ('||'^ and)*    // make `||` the root
> >> >>> >  ;
> >> >>> >
> >> >>> > and
> >> >>> >  :  not ('&&'^ not)*      // make `&&` the root
> >> >>> >  ;
> >> >>> >
> >> >>> > not
> >> >>> >  :  '~'^ atom    // make `~` the root
> >> >>> >  |  atom
> >> >>> >  ;
> >> >>> >
> >> >>> > atom
> >> >>> >  :  ID+
> >> >>> >  |  '('! expression ')'!    // omit both `(` and `)`
> >> >>> >  ;
> >> >>> >
> >> >>> >
> >> >>> > // lexer/terminal rules start with an upper case letter
> >> >>> > ID
> >> >>> >  :
> >> >>> >    (
> >> >>> >    'a'..'z'
> >> >>> >    | 'A'..'Z'
> >> >>> >    | '0'..'9'
> >> >>> >    | ' '
> >> >>> >  )+
> >> >>> >  ;
> >> >>> >
> >> >>> > Space : (' ' | '\t' | '\r' | '\n')+ {$channel=HIDDEN;};
> >> >>> >
> >> >>> >
> >> >>> >
> >> >>> > [2] Example test case
> >> >>> >    @Test
> >> >>> >    public void complexAndOr() throws RecognitionException{
> >> >>> >        // the expression
> >> >>> >        String src = "(1 eq 1 && 2 eq 2) || 3 eq 3";
> >> >>> >
> >> >>> >        // create a lexer & parser
> >> >>> >        LogicLexer lexer = new LogicLexer(new
> >> >>> > ANTLRStringStream(src));
> >> >>> >        LogicParser parser = new LogicParser(new
> >> >>> CommonTokenStream(lexer));
> >> >>> >
> >> >>> >        // invoke the entry point of the parser (the parse()
> method)
> >> >>> > and
> >> >>> > get the AST
> >> >>> >        CommonTree tree = (CommonTree)parser.parse().getTree();
> >> >>> >
> >> >>> >        assertEquals("||",tree.getText());
> >> >>> >        Tree child1 = tree.getChild(0);
> >> >>> >        assertEquals("&&",or.getText());
> >> >>> >        assertEquals("1 eq 1",child1.getChild(0));
> >> >>> >        assertEquals("2 eq 2",child1.getChild(1));
> >> >>> >        assertEquals("3 eq 3",tree.getChild(1).getText());
> >> >>> >    }
> >> >>> >
> >> >>> > Thank you,
> >> >>> >
> >> >>> > Miguel
> >> >>> >
> >> >>> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> >>> > Unsubscribe:
> >> >>> >
> >> >>> >
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >> >>> >
> >> >>>
> >> >>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> >>> Unsubscribe:
> >> >>>
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >> >>>
> >> >>
> >> >>
> >> >
> >> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> > Unsubscribe:
> >> >
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
> >
>


More information about the antlr-interest mailing list