[antlr-interest] Help controlling parser decisions

Wed Jul 25 09:52:49 PDT 2007

Thank you Tom, for having devoted time to this already.

The major difference between our grammars is that yours does not have any
lexer rule for the operator NEAR, so there is no conflict. Adding the BOOL_OP
lexer rule back in breaks that example.
Here's a stripped-down grammar with your semantic predicate for 'near'. If I
comment out the BOOL_OP rule, it works fine by treating everything as WCHAR.
If I uncomment the BOOL_OP rule, it illustrates how the semantic predicate
fails to resolve the NEAR operator to a WCHAR when the conflicting lexer
rule exists.

grammar WQL;

options{
   output=AST;
   ASTLabelType=CommonTree;
}

query :  tag '=' keyBOOL terms+
      ;

terms  : WCHAR+
       ;

tag    : WCHAR
       ;

keyBOOL: near
       ;

near:   {input.LT(1).getText().toLowerCase().equals("near")}? WCHAR
       ;

BOOL_OP :  'NEAR'; //comment this out to get working
WS      : (' '|'\t'|'\r'|'\n')+ {skip();};
WCHAR   : ~('='|'('| ')'|'"'|' '|'\t'|'\n'|'\r'|'#')+;
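
One direction that might get around the conflict, sketched here untested and
only as an assumption on my part: leave BOOL_OP in the lexer, and let the
parser accept a BOOL_OP token anywhere a plain word is allowed. The rule name
"word" below is made up for illustration; the other rules and the predicate
are the ones from the grammar above.

```
// Sketch only, not a tested fix. 'word' is a hypothetical helper rule.
// Assumes the text NEAR reaches the parser as a BOOL_OP token, not WCHAR.
word    : WCHAR
        | BOOL_OP
        ;

terms   : word+
        ;

tag     : word
        ;

// Upper-case NEAR matches via BOOL_OP; other spellings such as "near"
// still arrive as WCHAR and are checked by the semantic predicate.
near    : BOOL_OP
        | {input.LT(1).getText().toLowerCase().equals("near")}? WCHAR
        ;
```

With something like this, whether NEAR is an operator or an ordinary term
would be decided by its position in the parser rules rather than by the
lexer.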

Thank you again. The help is very much appreciated.

Ted

On 7/25/07, Thomas Brandon <tbrandonau at gmail.com> wrote:
>
> On 7/25/07, Ted Villalba <ted.villalba at gmail.com> wrote:
> > Thanks for the responses.
> > Seems straightforward enough to create the disambiguating semantic
> > predicate, but perhaps I'm not starting out with the right assumptions.
> >
> > If I want to accept near as a term if it begins (or ends) a sentence, then
> > I thought I could do something like this:
> >
> > value   :  value_ -> ^(VALUE value_) ;
> >
> > value_  :  keyBOOL terms* (operator^ value)*
> >             | LPAREN! value RPAREN! ( operator^ value)*
> >             ;
> >
> > keyBOOL : {input.LT(1).getText().equals("NEAR")}? terms;
> >
> > terms   : WCHAR+  -> ^(TERMS WCHAR+ )
> >            ;
> >
> > But when I try to enter SO=(NEAR apples oranges), the parser no likey.
> > Still getting:
> >      line 1:5 no viable alternative at input 'NEAR'.
> >
> > Am I missing an obvious puzzle piece?
> > I tried instead to assume all booleans were terms and then tested each of
> > the terms in a similar approach, but wasn't successful yet at
> > differentiating, on demand, the operators from the terms.
> The problem in the above rules isn't obvious to me. Looks like it
> should work, though you seem to have some unneeded +'s and *'s given that
> terms is already WCHAR+, but that shouldn't break it as far as I can
> see.
> Were you running it under the ANTLRWorks interpreter? That won't do
> actions so it won't work there. You need to use the debugger in this
> case.
> I got the following grammar to parse your example fine as well as to
> parse "near=(near apples oranges)", correctly handling the first
> "near" as a tag and the second as a keyBOOL:
> grammar WQL;
>
> options{
>     output=AST;
>     ASTLabelType=CommonTree;
> }
>
> tokens{ TAG; VALUE; TERMS;} //imaginary token types
>
> @header{
> import java.util.HashMap ;
> }
>
> @members {
> HashMap fieldMap = new HashMap();
> }
>
> start
>    :(   query
>                 {System.out.println("AST:\n"+$query.tree.toStringTree());}
>         )+
>         ;
>
> query
>         :       field
>     ;
>
> field
>         :       tag '=' LPAREN value RPAREN -> ^('=' tag value)
>     |   tag '=' terms -> ^('=' tag terms)
>     |   qid
>         ;
>
> value
>         :       keyBOOL terms?
>     ;
>
> keyBOOL
>         :       near
>         |       far
>         ;
>
> terms
>         :       WCHAR+  -> ^(TERMS WCHAR+ )
>     ;
>
> tag     :       WCHAR
>     ;
>
> qid     : '#'! DIGIT
>     ;
>
> near:   {input.LT(1).getText().toLowerCase().equals("near")}? WCHAR
>         ;
>
> far     :       {input.LT(1).getText().toLowerCase().equals("far")}? WCHAR
>         ;
>
> DIGIT   : ('0'..'9');
> WS      : (' '|'\t'|'\r'|'\n')+ {skip();};
> LPAREN    : '(' ;
> RPAREN    : ')' ;
> QUOTE    : '"';
> WCHAR   : ~('='|'('| ')'|'"'|' '|'\t'|'\n'|'\r'|'#')+;
>
> AFAICT there are no changes to the basic method you gave. I removed
> some stuff to simplify putting together the grammar and I cleaned up
> some of the unnecessary +s and *s. But no major changes.
>
> Tom.
> >
> > Thank you for the help,
> > Ted
> >
> >
> >
>