[antlr-interest] Starting two parser rules with the same token

Jim Idle jimi at temporal-wave.com
Tue Feb 28 19:21:15 PST 2012


Wrong! ;)

This just means that your grammar rules are not organized in proper LL precedence order and you need to restructure. If you look at the Java grammar, you should see how to construct the expression tree. Your parentheses should be in the final rule, which has the highest precedence. This will let you have unlimited nesting. The parentheses should only be in your atom rule, and it will work out.
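
A minimal sketch of that restructuring, assuming ANTLR v3 and the token and lexer definitions from the grammar quoted below. The rule name relationalExpression is hypothetical and stands in for simpleFilterExpression; this has not been run against the full grammar, so treat it as an illustration of the precedence chain rather than a tested fix.

```
// Sketch only: relationalExpression is a hypothetical rule name that
// replaces simpleFilterExpression; every other name comes from the
// grammar quoted below.
compoundFilterExpression : orFilterExpression EOF ;

orFilterExpression  : andFilterExpression (OR^ andFilterExpression)* ;

andFilterExpression : relationalExpression (AND^ relationalExpression)* ;

// The comparison is optional, so this rule needs no second alternative
// that begins with a parenthesis.
relationalExpression
    : additiveExpression ((EQ|DEQ|NEQ|GT|GTE|LT|LTE)^ additiveExpression)?
    ;

additiveExpression : multiplicativeExpression ((ADD|SUB)^ multiplicativeExpression)* ;

multiplicativeExpression : atom ((MULT|DIV)^ atom)* ;

// The only parenthesized alternative in the grammar: it re-enters the
// lowest-precedence rule, which is what allows unlimited nesting.
atom
    : COLUMN_NAME
    | FLOAT
    | STRING
    | LEFT_PARENTHESIS! orFilterExpression RIGHT_PARENTHESIS!
    ;
```

The trade-off is that a single chain accepts more than the original intended, for example (a > b) + 1 or a bare column name used as a filter. Those would have to be rejected while walking the AST, alongside the numeric versus string check the grammar comments already mention.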

Jim

On Feb 28, 2012, at 19:06, Kunal Naik <kunal.a.naik at gmail.com> wrote:

> Hello,
> 
> So the subject text is probably already getting most of you ready to yell
> "wrong!" but hear me out.  I'm trying to write a grammar that supports
> something like the following:
> (1*2/(3-variableOne) >= variableTwo OR variableThree != 4) AND variableFour > 5
> 
> Basically I want to be able to use parentheses to group the mathematical
> operations [(1*2/(3-variableOne) above] as well as use parentheses to group
> the boolean operations [binding the two operations around OR above].  The
> way the grammar is laid out, we can have an unlimited number of opening
> parentheses, so ANTLR can't immediately tell if it's at the start of a
> grouped mathematical statement or boolean statement.  If I could limit the
> number of nested parentheses, I think I could probably set k in the options
> to that same limit and that might help, but I haven't come up with an
> elegant way of enforcing a limit.
> 
> I feel like this has to be possible because the Java grammar allows me to
> do something like:
> if((1*2/(3-variableOne) >= variableTwo || variableThree != 4) &&
> variableFour > 5) { //do something}
> and there is apparently an example Java.g for ANTLR so perhaps it has been
> implemented?  (although I haven't actually compiled and tested against it,
> just read Java.g and couldn't figure out how they accomplished it)
> 
> ANTLR is throwing the following error: "rule simpleFilterExpression has
> non-LL(*) decision due to recursive rule invocations reachable from alts
> 1,2.  Resolve by left-factoring or using syntactic predicates or using
> backtrack=true option." which makes sense now that I've wrapped my head
> around the problem.  After much Googling, I even tried setting the
> backtrack setting to true but that didn't seem to help.
> I'm pasting the grammar below if anyone would like to take a stab at it.
> 
> Thanks,
> Kunal
> 
> Grammar:
> 
> options
> {
>    output=AST;
>    ASTLabelType=CommonTree;
> }
> 
> tokens {
>    ADD  = '+' ;
>    SUB  = '-' ;
>    MULT = '*' ;
>    DIV  = '/' ;
>    EQ   = '=';
>    DEQ  = '==';
>    NEQ  = '!=';
>    GT   = '>';
>    GTE  = '>=';
>    LT   = '<';
>    LTE  = '<=';
>    LEFT_PARENTHESIS  = '(';
>    RIGHT_PARENTHESIS = ')';
> }
> 
> //////////////
> // Parser rules
> //////////////
> // entry point
> compoundFilterExpression : orFilterExpression EOF;
> 
> // AND takes precedence over OR
> orFilterExpression : andFilterExpression (OR^ andFilterExpression)*;
> 
> andFilterExpression : simpleFilterExpression (AND^ simpleFilterExpression)*;
> 
> simpleFilterExpression
>    : additiveExpression (EQ|DEQ|NEQ|GT|GTE|LT|LTE)^ additiveExpression
>    | LEFT_PARENTHESIS! orFilterExpression RIGHT_PARENTHESIS!
>    ;
> 
> // * and / take precedence over + and -
> additiveExpression : multiplicativeExpression ((ADD|SUB)^
> multiplicativeExpression)*;
> 
> multiplicativeExpression : atom ((MULT|DIV)^ atom)*;
> 
> // There is no way to differentiate between a numeric and string column
> // in the grammar so we have to group them together for now and do an
> // explicit check while walking the AST
> atom
>    : COLUMN_NAME
>    | FLOAT
>    | STRING
>    | LEFT_PARENTHESIS! additiveExpression RIGHT_PARENTHESIS!
>    ;
> 
> //////////////
> // Lexer rules (plus the tokens at the top)
> //////////////
> OR
>    : 'or'
>    | 'OR'
>    | '||'
>    | '|'
>    ;
> 
> AND
>    : 'and'
>    | 'AND'
>    | '&&'
>    | '&'
>    ;
> 
> COLUMN_NAME : ('a'..'z'|'A'..'Z')+ ; // anything from a-z and A-Z
> 
> FLOAT
>    : ('0'..'9')+ '.' ('0'..'9')+    // 123.456
>    | '.' ('0'..'9')+ //.456
>    | ('0'..'9')+  // 123
>    ;
> 
> STRING
>    :  '"' ( ESC_SEQ | ~('\\'|'"') )+ '"'
>    ;
> 
> fragment
> HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;
> 
> fragment
> ESC_SEQ
>    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
>    |   UNICODE_ESC
>    |   OCTAL_ESC
>    ;
> 
> fragment
> OCTAL_ESC
>    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
>    |   '\\' ('0'..'7') ('0'..'7')
>    |   '\\' ('0'..'7')
>    ;
> 
> fragment
> UNICODE_ESC
>    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
>    ;
> 
> WS : ( ' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;} ;
> 

