[antlr-interest] Non-determinism (was: Can I force a token to have precedence in the lexer?)
Bart Kiers
bkiers at gmail.com
Tue Apr 20 23:21:43 PDT 2010
On Wed, Apr 21, 2010 at 2:41 AM, Andy Hull <andyh at sunrunhome.com> wrote:
> Wow, thanks for the article. I was able to redefine the language to avoid
> the problem in order to keep the parser as simple as possible (now using
> "to" instead of "..." ).
>
> My parser needs to be able to handle nested array expressions like so
>
> {1,2,{5 to 10}, {3,6,9}, 4}
>
> I have the following grammar:
>
> arrayExpression
> : LEFT_BRACKET! arrayInitializer? RIGHT_BRACKET!;
> arrayInitializer
> : (e+=expression (',' e+=expression)*)+ -> ^(ELEMENTLIST $e*)
> | expression AUTO expression -> ^(AUTO expression expression)
> ;
>
> expression
> : arrayExpression
> /* | other types of expression */
> ;
>
> which, as expected, ANTLR reports as non-LL(*) because "arrayInitializer"
> depends on the recursive rule expression. Setting backtrack to true doesn't
> resolve this, contrary to what I expected.
>
> x={1,2,3,4};
>
> yields the correct tree but...
>
> x={1 to 3};
>
> yields the error:
>
> BR.recoverFromMismatchedToken
> line 1:5 mismatched input 'to' expecting RIGHT_BRACKET
>
> arrayInitializer behaves as expected when it contains only a single subrule
> (either the element list or the range initializer).
>
> Is backtracking the right solution to the non-determinism? Am I doing
> something wrong?
>
How about something like this:

grammar Test;

parse
  :  array ';' EOF
  ;

array
  :  '{' (arrayAtom (',' arrayAtom)*)? '}'
  ;

arrayAtom
  :  Number
  |  array
  |  range
  ;

range
  :  Number 'to' Number
  ;

Number
  :  '0'..'9'+
  ;

Space
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

Regards,
Bart.