[antlr-interest] Why does antlr not know alternative?
Gavin Lambert
antlr at mirality.co.nz
Mon Jan 9 16:53:35 PST 2012
At 11:46 10/01/2012, James Ladd wrote:
>^ 12.
>
>I can see that NUMBER has a component of it that can be a '.' hence
>the grammar issue.
>Is it the start of the decimal part of a number or the end of a
>statement.
Yes, that's probably the problem. While generating a NUMBER token
the lexer is consuming the dot as the start of a decimal part (and
then probably suffering a sync error when no digit follows).
>NUMBER: ((NUMBER_LEFT)? ('-')? DIGITS (NUMBER_RIGHT_P1)?
>(NUMBER_RIGHT_P2)?);
[...]
>fragment NUMBER_LEFT: DIGITS 'r';
>fragment NUMBER_RIGHT_P1: '.' DIGITS;
>fragment NUMBER_RIGHT_P2: 'e' ('-')? DIGITS;
>fragment DIGIT: '0'..'9';
>fragment DIGITS: DIGIT+;
Unfortunately v3 lexers are a little too optimistic when faced
with subrules and */+ sequences: they tend to use only one-char
lookahead when they should be using more. In this case, the
lexer decides whether to take the NUMBER_RIGHT_P1 branch based
solely on whether the next character is a dot. It doesn't look
one character further ahead to check that a digit follows, so
for input like "12." it commits to the branch and then throws an
error while trying to match the DIGITS subrule.
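To make the failure mode concrete, here is a rough Python sketch of
that one-character decision. It is a hand-written simulation, not the
code ANTLR generates; the function names are made up for illustration,
and it covers only the DIGITS ('.' DIGITS)? part of NUMBER.

```python
def is_digit(ch):
    """DIGIT: '0'..'9'"""
    return "0" <= ch <= "9"


def match_digits(text, pos):
    """DIGITS: DIGIT+ ; return the position after the digits or raise."""
    start = pos
    while pos < len(text) and is_digit(text[pos]):
        pos += 1
    if pos == start:
        raise SyntaxError(f"expected a digit at offset {pos}")
    return pos


def match_number_one_char(text, pos=0):
    """DIGITS (NUMBER_RIGHT_P1)? with the optional branch chosen
    by looking at a single character (hypothetical simulation)."""
    pos = match_digits(text, pos)
    # Decision uses only the next character: dot or not-dot.
    if pos < len(text) and text[pos] == ".":
        # Committed to '.' DIGITS; there is no backing out from here.
        pos = match_digits(text, pos + 1)
    return pos
```

With this sketch, "12.5" and "12" match, but "12." raises because the
dot has already been consumed by the time the missing digit is noticed.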
You should be able to force the issue by making the lookahead
explicit with a syntactic predicate:
NUMBER: ((NUMBER_LEFT)? ('-')? DIGITS (('.' DIGIT) =>
NUMBER_RIGHT_P1)? (NUMBER_RIGHT_P2)?);
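As a rough illustration of what the ('.' DIGIT) => check buys you,
the Python sketch below peeks at two characters before committing to
the decimal part. Again this is a hand-written simulation under my own
assumptions (made-up function names, only the DIGITS ('.' DIGITS)?
part of NUMBER), not what ANTLR actually emits.

```python
def is_digit(ch):
    """DIGIT: '0'..'9'"""
    return "0" <= ch <= "9"


def match_digits(text, pos):
    """DIGITS: DIGIT+ ; return the position after the digits or raise."""
    start = pos
    while pos < len(text) and is_digit(text[pos]):
        pos += 1
    if pos == start:
        raise SyntaxError(f"expected a digit at offset {pos}")
    return pos


def match_number_predicated(text, pos=0):
    """DIGITS (('.' DIGIT) => NUMBER_RIGHT_P1)? (hypothetical simulation)."""
    pos = match_digits(text, pos)
    # Take the branch only if a dot AND a digit follow.
    if (pos + 1 < len(text)
            and text[pos] == "."
            and is_digit(text[pos + 1])):
        pos = match_digits(text, pos + 1)
    return pos


print(match_number_predicated("12."))   # 2: the dot is left unconsumed
print(match_number_predicated("12.5"))  # 4: the whole decimal is matched
```

For "12." the number ends after "12", so the trailing dot is still
available to be matched as a separate token.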