[antlr-interest] Lexer rule for INTEGER and COMMA_INTEGER

Jim Idle jimi at temporal-wave.com
Sat Nov 3 20:13:22 PDT 2012


You will need to use gated semantic predicates, I think. Unless you are in
charge of the language, in which case you can stop it being so dumb ;)

The predicates require that you cover both the positive and the negative
alts, basically, or you will get the failed predicate message.
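
To illustrate what covering both alts means, here is a minimal hand-written
sketch in plain Java. It is not ANTLR-generated code and not a grammar fix;
the class name CommaIntegerSketch and the helpers isDigit and scanNumber are
hypothetical names made up for this example. It only shows the two-character
lookahead decision: after the leading digits, a comma group is consumed when
the next character is ',' and the one after it is a digit (positive case);
otherwise the scan simply stops and the token is a plain INTEGER (negative
case) instead of raising an error.

```java
class CommaIntegerSketch {
    // Hypothetical helper: true if index i is in range and holds '0'..'9'.
    static boolean isDigit(String s, int i) {
        return i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9';
    }

    // Scans one number starting at pos (assumed to point at a digit) and
    // returns "TYPE:text", where TYPE is INTEGER or COMMA_INTEGER.
    static String scanNumber(String s, int pos) {
        int i = pos;
        while (isDigit(s, i)) {
            i++; // leading digits, shared by both token types
        }
        boolean sawCommaGroup = false;
        // Positive case: ',' followed by a digit, so consume ",ddd".
        // Same test as LA(1)==',' && LA(2) in '0'..'9' in the predicate.
        while (i < s.length() && s.charAt(i) == ',' && isDigit(s, i + 1)) {
            i++; // the comma
            while (isDigit(s, i)) {
                i++; // digits of this group
            }
            sawCommaGroup = true;
        }
        // Negative case: the test failed, so stop here and leave the comma
        // (if any) for the next token rather than reporting a failure.
        String type = sawCommaGroup ? "COMMA_INTEGER" : "INTEGER";
        return type + ":" + s.substring(pos, i);
    }

    public static void main(String[] args) {
        System.out.println(scanNumber("1,234,567", 0)); // COMMA_INTEGER:1,234,567
        System.out.println(scanNumber("F(1, x)", 2));   // INTEGER:1
    }
}
```

Note that with this test an input such as F(1,2) with no space after the
comma would still scan "1,2" as a COMMA_INTEGER; that ambiguity is in the
language itself, not in the lookahead.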

Jim

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Zhaohui Yang
Sent: Saturday, November 03, 2012 11:27 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Lexer rule for INTEGER and COMMA_INTEGER

Hi,

I have a lexer grammar that has to recognize INTEGER like 1234 and
COMMA_INTEGER like 1,234,567. The latter integer token has commas in it,
and of course the language has other places that use a comma, e.g. F(1, x)
is valid, which contains "1," that should be recognized as an INTEGER 1
followed by a comma.

This is similar to the "lexer rule for floating point, integer and range
operator" example given in ANTLR wiki. There the conflict is around
period, here it is comma.

However, I tried the ways suggested by the example, but cannot get it
right. The following is one version of my lexer rules, using semantic
predicate:
    COMMA_INTEGER
        : ( ('0'..'9')+
            {input.LA(1)==',' && input.LA(2)>='0' && input.LA(2)<='9'}?=>
            (',' ('0'..'9')+)+
          )
        ;
    INTEGER
        : ('0'..'9')+
        ;
This version results in the error
    "rule COMMA_INTEGER failed predicate: {input.LA(1)==',' && input.LA(2)>='0' && input.LA(2)<='9'}?"
for input "1, " as in F(1, x).

The following version uses a syntactic predicate
    COMMA_INTEGER
        : ( ('0'..'9')+
            (',' ('0'..'9')+)=>
            (',' ('0'..'9')+)+
          )
        ; //TODO-COMMA_integer different from RES
    INTEGER
        : ('0'..'9')+
        ;
and results in the error
    "required (...)+ loop did not match anything at character ' '"
(character SPACE).

Swapping the order of INTEGER and COMMA_INTEGER does not change the
errors.

So it looks like the lexer is predicting the next token without running the
predicates, i.e. it goes directly to matching COMMA_INTEGER upon seeing a
comma after some digits.

Any suggestions? Thanks!

--
Regards,

Yang, Zhaohui
