[antlr-interest] Why don't parsers support character ranges?
Johannes Luber
jaluber at gmx.de
Wed Apr 23 05:21:06 PDT 2008
Hannes Schmidt wrote:
> Hi all,
>
> I would like to use character ranges in a parser as illustrated in the
> following example (a very reduced version of my real-world grammar):
>
> grammar test1;
> foo : before '@' after;
> before : 'a'..'z';
> after : 'm'..'z';
>
> ANTLR generates a parser that ignores the range as if the grammar were
>
> grammar test2;
> foo : before '@' after;
> before : ;
> after : ;
>
> IOW, the grammar fails to parse the input "a@m". If I break the grammar
> up into a lexer and a parser as in
>
> grammar test3;
> foo : BEFORE '@' AFTER;
> BEFORE : 'a'..'z';
> AFTER : 'm'..'z';
>
> the generated code fails to parse "a@m" with a MismatchedTokenException
> at the 'm'. This is because ANTLR silently prioritizes BEFORE even
> though its set of characters intersects that of AFTER. Swapping BEFORE
> and AFTER would generate a parser that fails to recognize "m@m".
You could alternatively use:
grammar test4;
foo : before '@' after;
before : A_TO_L | M_TO_Z;
after : M_TO_Z;
A_TO_L : 'a'..'l';
M_TO_Z : 'm'..'z';
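To see why splitting the alphabet into the non-overlapping ranges 'a'..'l' and 'm'..'z' helps, here is a rough Python sketch of a lexer that resolves overlaps purely by rule order. It is only a simplified model, not ANTLR's actual implementation, and token_type is a made-up helper name.

```python
# Simplified model of a lexer that resolves overlaps by rule order.
# This is NOT ANTLR's implementation, only an illustration.

def token_type(ch, rules):
    """Return the name of the first rule whose inclusive range contains ch."""
    for name, lo, hi in rules:
        if lo <= ch <= hi:
            return name
    return None  # no rule matches this character

# Overlapping ranges, as in test3: 'm'..'z' is covered by both rules.
overlapping = [("BEFORE", "a", "z"), ("AFTER", "m", "z")]

# Disjoint ranges: every letter belongs to exactly one token type.
disjoint = [("A_TO_L", "a", "l"), ("M_TO_Z", "m", "z")]

print(token_type("m", overlapping))  # BEFORE, so AFTER is never produced
print(token_type("m", disjoint))     # M_TO_Z
print(token_type("a", disjoint))     # A_TO_L
```

With the overlapping rules no character can ever come out as AFTER, which matches the behaviour described for test3. With the disjoint rules each letter has exactly one possible token type, so rule order no longer matters.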
But I suppose error messages become easier if you also accept A_TO_L
after the '@' and check for correctness at a later stage, for example:
grammar test5;
foo : ALPHA '@' ALPHA;
ALPHA: 'a'..'z';
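The later-stage check could look roughly like the Python sketch below. It does not use ANTLR at all: check_foo is a hypothetical stand-in for whatever code runs after the generated parser has accepted the input, and the error texts are made up for illustration.

```python
def check_foo(text):
    """Accept <letter>@<letter> permissively, then validate the second letter."""
    # Stage 1: permissive structural check, mirroring ALPHA '@' ALPHA.
    if len(text) != 3 or text[1] != "@":
        raise ValueError("expected <letter>@<letter>, got %r" % text)
    before, after = text[0], text[2]
    for ch in (before, after):
        if not "a" <= ch <= "z":
            raise ValueError("%r is not a lowercase letter" % ch)
    # Stage 2: the semantic restriction that the grammar no longer enforces.
    if not "m" <= after <= "z":
        raise ValueError("%r after '@' must be in 'm'..'z'" % after)
    return before, after
```

Here "a@m" and "m@m" both pass, while "a@b" is rejected in the second stage with a message that names the offending letter instead of a generic token mismatch.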
Johannes