[antlr-interest] Lexer rule for INTEGER and COMMA_INTEGER

Bernard Kaiflin bkaiflin.ruby at gmail.com
Sat Nov 10 06:00:33 PST 2012


Oh, I see what you mean. You may have been fooled because comma_integer is
inside atom.

Yes, a sequence like "1  ,2" INSIDE a function will be recognized as a list
(comma_integer=INT, separator=SPACE_COMMA, comma_integer=INT). This is not
a single comma_integer but three tokens.

No, a sequence like "1  ,2" OUTSIDE a function cannot be recognized as a
comma_integer (INT WS COMMA INT); it will fail. As we are outside a
function, the rule piece will call comma_integer, which will consume `1` as
an INT. Then the parser will try the loop ( COMMA INT )*, so it sends a
getToken() request to the lexer.

The pointer in the input stream is at the first space after 1. The space is
ambiguous because the lexer has the choice between SPACE_COMMA and WS, so
the lexer peeks at the next character to see if it is a comma. It is not,
so there is no more ambiguity: the lexer creates a WS token on the hidden
channel and starts the process again.

The pointer in the input stream is now at the second space after 1, just
before the comma. The space is again ambiguous between SPACE_COMMA and WS,
so the lexer peeks at the next character to see if it is a comma. This time
it is, so there is no more ambiguity: the lexer emits a SPACE_COMMA token
and returns. As the parser is waiting for a COMMA, it fails to match and
reports a message like "no viable alternative at input ' ,'".
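
To make the walk-through above concrete, here is a small Python sketch that
imitates it. It is not ANTLR and not the grammar from this thread: the token
patterns (INT as digits, SPACE_COMMA as one space plus a comma, WS as a
single hidden space, COMMA as a bare comma) and the helper names tokenize
and matches_comma_integer are my assumptions, reconstructed from the
description. The lexer simply takes the longest match at each position,
which is a simplification of what the real lexer does.

```python
import re

# Assumed token definitions, reconstructed from the explanation above.
# The real grammar is not shown in this message.
TOKEN_RULES = [
    ("INT", re.compile(r"[0-9]+")),
    ("SPACE_COMMA", re.compile(r" ,")),  # one space immediately followed by a comma
    ("COMMA", re.compile(r",")),
    ("WS", re.compile(r" ")),            # a single space, sent to the hidden channel
]
HIDDEN = {"WS"}


def tokenize(text):
    """Return the visible (name, text) tokens, picking the longest match each time."""
    pos = 0
    tokens = []
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_RULES:
            m = pattern.match(text, pos)
            # Keep the longest match; on a tie the earlier rule wins.
            if m and (best is None or len(m.group()) > len(best[1])):
                best = (name, m.group())
        if best is None:
            raise ValueError("no token matches at position %d" % pos)
        pos += len(best[1])
        if best[0] not in HIDDEN:
            tokens.append(best)
    return tokens


def matches_comma_integer(tokens):
    """Check the token list against: comma_integer : INT ( COMMA INT )* ;"""
    names = [name for name, _ in tokens]
    if not names or names[0] != "INT":
        return False
    i = 1
    while i + 1 < len(names) and names[i] == "COMMA" and names[i + 1] == "INT":
        i += 2
    # The whole input must be consumed for the match to succeed.
    return i == len(names)


if __name__ == "__main__":
    for sample in ["1,2", "1  ,2"]:
        toks = tokenize(sample)
        print(repr(sample), toks, matches_comma_integer(toks))
```

With these assumed rules, "1  ,2" gives INT, SPACE_COMMA, INT on the visible
channel (the first space goes to the hidden channel), so the ( COMMA INT )*
loop never sees a COMMA and the input is not a comma_integer, whereas "1,2"
is.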

Hope it's now clear.
Bernard


2012/11/10 Zhaohui Yang <yezonghui at gmail.com>

> The main ambiguity here is that a sequence like "1  ,2" can either be
> recognized as a comma_integer (INT WS COMMA INT) or a list
> (comma_integer=INT, separator=SPACE_COMMA, comma_integer=INT).
>
> I guess the simplicity of the V4 version comes from some default priority /
> greedy policy that favours comma_integer (over separator in list). Or does
> ANTLR V4 have a unified ambiguity analysis that considers all lexer and
> parser rules together?
>
> --
> Regards,
>
> Yang, Zhaohui
>
>

