[antlr-interest] Help controlling parser decisions
Ted Villalba
ted.villalba at gmail.com
Tue Jul 24 16:19:06 PDT 2007
Thanks for the responses.
Creating the disambiguating semantic predicate seems straightforward enough,
but perhaps I'm not starting out with the right assumptions.
If I want to accept near as a term when it begins (or ends) a sentence, then I
thought I could do something like this:
value : value_ -> ^(VALUE value_) ;
value_ : keyBOOL terms* (operator^ value)*
| LPAREN! value RPAREN! ( operator^ value)*
;
keyBOOL : {input.LT(1).getText().equals("NEAR")}? terms;
terms : WCHAR+ -> ^(TERMS WCHAR+ )
;
But when I try to enter SO=(NEAR apples oranges), the parser no likey.
Still getting:
line 1:5 no viable alternative at input 'NEAR'.
Am I missing an obvious puzzle piece?
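
One way to see why the predicate alone may not help is to model the lexer on its own. The sketch below is a simplified Python stand-in, not ANTLR itself: the names `RULES`, `lex`, and `starts_terms` are hypothetical, words are classified whole, and the first matching rule wins, which is an assumption made to mirror the rule order quoted in this thread. It suggests that NEAR reaches the parser already typed as WOK_OP, so a rule that expects WCHAR cannot start there regardless of what the predicate says.

```python
import re

# Hypothetical stand-ins for the lexer rules quoted in this thread.
# Assumption: the first rule whose pattern matches the whole word wins.
RULES = [
    ("BOOL_OP", re.compile(r"AND|and|OR|or|NOT|not")),
    ("WOK_OP", re.compile(r"SAME|same|(?:NEAR|near)(?:/[0-9]+)*")),
    ("WCHAR", re.compile(r'[^=()"\s#]+')),
]
PUNCT = {"(": "LPAREN", ")": "RPAREN", "=": "EQ"}


def lex(text):
    """Return (type, text) pairs; no parser context is consulted."""
    tokens = []
    # Split into punctuation and words; whitespace is skipped.
    for piece in re.findall(r'[()=]|[^=()"\s#]+', text):
        if piece in PUNCT:
            tokens.append((PUNCT[piece], piece))
            continue
        for name, pattern in RULES:
            if pattern.fullmatch(piece):
                tokens.append((name, piece))
                break
    return tokens


def starts_terms(tokens, i):
    """Mimic `terms : WCHAR+`: true only if the token at i is typed WCHAR."""
    return i < len(tokens) and tokens[i][0] == "WCHAR"


tokens = lex("SO=(NEAR apples oranges)")
print(tokens[3])                # the token the error message points at
print(starts_terms(tokens, 3))  # whether `terms` could begin there
```

In this model the text of the token is still "NEAR", so a check on `getText()` would pass, but the token type has already been fixed by the lexer before any parser rule runs.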
I tried instead to treat all booleans as terms and then test each of
the terms in a similar way, but haven't yet managed to differentiate,
on demand, the operators from the terms.
Thank you for the help,
Ted
On 7/24/07, Thomas Brandon <tbrandonau at gmail.com> wrote:
>
> On 7/25/07, Ted Villalba <ted.villalba at gmail.com> wrote:
> > Hi,
> >
> > I have a grammar that contains tokens that are sometimes operators,
> > sometimes not, depending on the context. The set of operators overlaps with
> > the set of all words that can be acceptable tokens. Trouble is, depending on
> > the order of my lexer rules, the parser will recognize all such tokens
> > (AND, OR, NEAR) as operators, or will recognize none of them as operators.
> >
> > So if my lexer rules are:
> > BOOL_OP : 'AND'|'and'|'OR'|'or'|'NOT'|'not';
> > WOK_OP :
> > 'SAME'|'same'|'NEAR'('/'DIGIT+)*|'near'('/'DIGIT+)*;
> > ...
> > WCHAR : ~('='|'('| ')'|'"'|' '|'\t'|'\n'|'\r'|'#')+;
> >
> > In this order, if any of the tokens from the first 2 rules are encountered,
> > the parser assumes the token to be an operator, even where there is no
> > grammar rule to support the notion (and will follow with a NoViableAlt
> > exception). If the rules are reversed, it will not recognize any of the
> > wchar+ as operators.
> >
> > So if I try to parse something like:
> > SO=(BY THE AIRPORT), then it works fine, but if I try SO=(NEAR THE AIRPORT)
> > it throws the exception, trying to force "NEAR" into the role of operator,
> > even if the grammar does not support the idea of an operator at the
> > beginning of a phrase.
> Lexing occurs independently of parsing so parser context does not
> influence which tokens are matched.
> See http://www.antlr.org/wiki/pages/viewpage.action?pageId=1741 for
> the two possible solutions.
>
> Tom.
> >
> > Here is my complete grammar:
> >
> > grammar WQL;
> >
> > options{
> > output=AST;
> > ASTLabelType=CommonTree;
> > }
> >
> > tokens{ TAG; VALUE; TERMS;} //imaginary token types
> >
> > @header{
> > import java.util.HashMap ;
> > }
> >
> > @members {
> >
> > HashMap fieldMap = new HashMap();
> >
> > }
> >
> >
> >
> >
> >
> > start : ( query
> > {System.out.println("AST:\n"+$query.tree.toStringTree());}
> > )+
> > ;
> >
> >
> > query : field (BOOL_OP^ query)*
> > | LPAREN! query RPAREN! ( BOOL_OP^ query)*
> > ;
> >
> > field : tag '=' LPAREN value RPAREN -> ^('=' tag value)
> > | tag '=' terms+ -> ^('=' tag terms)
> > | qid
> > ;
> >
> > value : value_ -> ^(VALUE value_) ;
> >
> > value_ : terms+ (operator^ value)*
> > | LPAREN! value RPAREN! ( operator^ value)*
> > ;
> >
> > tag : WCHAR
> > ;
> >
> > terms : WCHAR+ -> ^(TERMS WCHAR+ )
> > | QUOTE WCHAR+ QUOTE -> ^(TERMS WCHAR+ ) // strip QUOTEs
> > ;
> >
> >
> > qid : '#'!DIGIT
> > ;
> >
> > operator: BOOL_OP|WOK_OP;
> >
> >
> > BOOL_OP : 'AND'|'and'|'OR'|'or'|'NOT'|'not';
> > WOK_OP :
> > 'SAME'|'same'|'NEAR'('/'DIGIT+)*|'near'('/'DIGIT+)*;
> > DIGIT : ('0'..'9');
> > WS : (' '|'\t'|'\r'|'\n')+ {skip();};
> > LPAREN : '(' ;
> > RPAREN : ')' ;
> > QUOTE : '"';
> > WCHAR : ~('='|'('| ')'|'"'|' '|'\t'|'\n'|'\r'|'#')+;
> >
> >
> > Thanks a million for the help.
> >
> > Ted
> >
> >
> >
>