[antlr-interest] behaviour of lexer

Bart Kiers bkiers at gmail.com
Tue Mar 2 10:07:09 PST 2010


On Tue, Mar 2, 2010 at 7:00 PM, Bart Kiers <bkiers at gmail.com> wrote:

>
>
> On Tue, Mar 2, 2010 at 5:25 PM, Philippe Frankson <
> Philippe.Frankson at frsglobal.com> wrote:
>
>> ...
>>
>> @int('444') is a function converting a string into integer.
>> If I don't have parentheses, then it is not a function, it is only a
>> column name. Ex.: @test, @integer, @in, ....
>>
>> Here is a part of lexer rules:
>>
>> fragment DIGIT  : ('0'..'9');
>> fragment ALPHA  : ('a'..'z'|'A'..'Z'|'_');
>>
>> OB              : '(';
>> // so I check if there is an open parenthesis to return INTTOKEN.
>> INTTOKEN        : ('@int' OB)=> '@int';
>> AT              : '@';
>> NAME            : ALPHA (ALPHA | DIGIT)*;   ...
>>
>
> Why not just include the OB in your lexer rule?
> Something like this:
>
> INT_METHOD      : AT 'int' OB; // or: AT 'int' OB STRING CB;
> COLUMN          : AT NAME;
>
> OB              : '(';
>
> AT              : '@';
> NAME            : ALPHA (ALPHA | DIGIT)*;
>
> fragment DIGIT  : ('0'..'9');
> fragment ALPHA  : ('a'..'z'|'A'..'Z'|'_');
>
> and because lexer rules are matched from top to bottom, '@int(' will be
> matched as INT_METHOD, while '@int' without a following parenthesis will
> be matched as COLUMN (that is, 'AT NAME').
>
> Regards,
>
> Bart.
>

But it may be better to move that "responsibility" from the lexer to the
parser, so the parser decides between a method call and a column name:

parse           :  method | column;

method          : AT NAME OB STRING CB;
column          : AT NAME;

STRING          : '\'' ~('\'')* '\''; // single-quoted, as in @int('444')
OB              : '(';
CB              : ')';
AT              : '@';
NAME            : ALPHA (ALPHA | DIGIT)*;

fragment DIGIT  : ('0'..'9');
fragment ALPHA  : ('a'..'z'|'A'..'Z'|'_');
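
To show how the parser-side decision plays out, here is a rough Python
sketch of the same idea. It is not ANTLR output, only a hand-written
approximation: the token names come from the grammar above, while the
regular expressions and the tokenize/parse helpers are my own
illustration. As an assumption, STRING accepts either quote style so that
the original @int('444') example tokenizes.

```python
import re

# Token names follow the grammar above; the regexes are an approximation
# written for this sketch, not something generated by ANTLR.
TOKEN_SPEC = [
    ("STRING", r"\"[^\"]*\"|'[^']*'"),  # assumption: both quote styles allowed
    ("OB", r"\("),
    ("CB", r"\)"),
    ("AT", r"@"),
    ("NAME", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("WS", r"\s+"),                     # assumption: whitespace is skipped
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Split text into (token_type, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(text):
    """parse : method | column;  chosen by the token sequence, not the lexer."""
    tokens = tokenize(text)
    kinds = [kind for kind, _ in tokens]
    # method : AT NAME OB STRING CB;
    if kinds == ["AT", "NAME", "OB", "STRING", "CB"]:
        name = tokens[1][1]
        argument = tokens[3][1][1:-1]  # strip the surrounding quotes
        return ("method", name, argument)
    # column : AT NAME;
    if kinds == ["AT", "NAME"]:
        return ("column", tokens[1][1])
    raise ValueError(f"cannot parse {text!r}")


if __name__ == "__main__":
    for sample in ["@int('444')", "@int", "@integer", "@in"]:
        print(sample, "->", parse(sample))
```

The lexer here never needs to know whether 'int' is a function: it always
emits AT followed by NAME, and only the presence of OB STRING CB afterwards
makes the parser pick the method alternative.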

Regards,

Bart.

