[antlr-interest] behaviour of lexer

Philippe Frankson Philippe.Frankson at Frsglobal.com
Tue Mar 2 10:20:18 PST 2010


In fact, I have many functions (@int(), @float(), @left(), @right(), ...).
For each of those, I build a specific object.
If I do what you are suggesting, I will have to code a kind of switch(...) to pick the right kind of object.
I prefer to have that 'intelligence' in the parser and not in my code.

Here is a piece of my grammar to make it clearer:

floatexpr:		FLOATTOKEN OB arithexpr CB { stack.push(new DTExprFloat(this.colDefinition, (DTExpr)stack.pop())); };

leftexpr:		LEFTTOKEN OB arithexpr COMMA arithexpr CB { stack.push(new DTExprLeft(this.colDefinition, (DTExpr)stack.pop(), (DTExpr)stack.pop())); };

rightexpr: 		RIGHTTOKEN OB arithexpr COMMA arithexpr CB { stack.push(new DTExprRight(this.colDefinition, (DTExpr)stack.pop(), (DTExpr)stack.pop())); };

replaceexpr:	REPLACETOKEN OB arithexpr COMMA arithexpr COMMA arithexpr COMMA arithexpr CB { 
				// var,start,length,replacestring
				stack.push(new DTExprReplace(this.colDefinition, (DTExpr)stack.pop(), (DTExpr)stack.pop(), (DTExpr)stack.pop(), (DTExpr)stack.pop())); }
			;

substringexpr:	SUBSTRING OB arithexpr COMMA arithexpr COMMA arithexpr CB { stack.push(new DTExprSubstring(this.colDefinition, (DTExpr)stack.pop(), (DTExpr)stack.pop(), (DTExpr)stack.pop())); };

....
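
A note on the actions above: Java evaluates constructor arguments left to right, so with inline stack.pop() calls the first argument receives the operand that was pushed last. Whether that is intended depends on the DTExpr constructors, which are not shown here. The sketch below only illustrates the effect, using hypothetical stand-ins (PairNode, PopOrderDemo, and a java.util.ArrayDeque in place of whatever type `stack` really is); it is not the actual DTExpr code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a two-argument node such as DTExprLeft;
// the real constructor signature is not shown in this thread.
class PairNode {
    final String first;
    final String second;

    PairNode(String first, String second) {
        this.first = first;
        this.second = second;
    }
}

class PopOrderDemo {
    // Simulates parsing @left(text, length): operands are pushed left to right.
    static Deque<String> operands() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("text");
        stack.push("length");
        return stack;
    }

    // Mirrors the action style in the rules above: both pops are written inline,
    // so the first constructor argument receives the operand pushed last.
    static PairNode inlinePops(Deque<String> stack) {
        return new PairNode(stack.pop(), stack.pop());
    }

    // Pops into locals first so the constructor sees the operands in source order.
    static PairNode sourceOrder(Deque<String> stack) {
        String second = stack.pop(); // pushed last
        String first = stack.pop();  // pushed first
        return new PairNode(first, second);
    }
}
```

With the operands pushed as "text" and then "length", inlinePops builds the pair (length, text) while sourceOrder builds (text, length). If the real constructors expect source order, popping into local variables first is one way to get it.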


Regards
Philippe




-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
Sent: 02 March 2010 19:07
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] behaviour of lexer

On Tue, Mar 2, 2010 at 7:00 PM, Bart Kiers <bkiers at gmail.com> wrote:

>
>
> On Tue, Mar 2, 2010 at 5:25 PM, Philippe Frankson <
> Philippe.Frankson at frsglobal.com> wrote:
>
>> ...
>>
>> @int('444') is a function converting a string into integer.
>> If I don't have parentheses, then it is not a function, it is only a
>> column name. Ex.: @test, @integer, @in, ....
>>
>> Here is a part of lexer rules:
>>
>> fragment DIGIT  : ('0'..'9');
>> fragment ALPHA  : ('a'..'z'|'A'..'Z'|'_');
>>
>> OB              : '(';
>> INTTOKEN        : ('@int' OB)=> '@int'; // so I check if there is an open parenthesis to return INTTOKEN.
>> AT              : '@';
>> NAME            : ALPHA (ALPHA | DIGIT)*;   ...
>>
>
> Why not just include the OB in your lexer rule?
> Something like this:
>
> INT_METHOD      : AT 'int' OB; // or: AT 'int' OB STRING CB;
> COLUMN          : AT NAME;
>
> OB              : '(';
>
> AT              : '@';
> NAME            : ALPHA (ALPHA | DIGIT)*;
>
> fragment DIGIT  : ('0'..'9');
> fragment ALPHA  : ('a'..'z'|'A'..'Z'|'_');
>
> and because lexer rules are matched from top to bottom, '@int(' will be
> matched as INT_METHOD, while a plain '@int' will still be matched as
> COLUMN ('AT NAME').
>
> Regards,
>
> Bart.
>

But, maybe better, move the "responsibility" to the parser instead of the
lexer:

parse           :  method | column;

method          : AT NAME OB STRING CB;
column          : AT NAME;

STRING          : '"' ~('"')* '"';
OB              : '(';
CB              : ')';
AT              : '@';
NAME            : ALPHA (ALPHA | DIGIT)*;

fragment DIGIT  : ('0'..'9');
fragment ALPHA  : ('a'..'z'|'A'..'Z'|'_');
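
With a single generic method rule like this, choosing which object to build does move into application code, which is the switch(...) concern raised in the reply at the top of this thread. One way that dispatch could look is a lookup table keyed by the NAME text. This is only a sketch, assuming Java 8 or later: Node, FunctionRegistry, and the string arguments are hypothetical stand-ins, not the DTExpr classes from the thread.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical expression node; stands in for the DTExpr classes,
// whose API is not shown in this thread.
class Node {
    final String kind;
    final List<String> args;

    Node(String kind, List<String> args) {
        this.kind = kind;
        this.args = args;
    }
}

class FunctionRegistry {
    private final Map<String, Function<List<String>, Node>> builders = new HashMap<>();

    FunctionRegistry() {
        // One entry per supported function name; a new function is one more line here.
        builders.put("int", args -> new Node("IntExpr", args));
        builders.put("float", args -> new Node("FloatExpr", args));
        builders.put("left", args -> new Node("LeftExpr", args));
    }

    // Intended to be called from the action of a generic method rule,
    // with the NAME text and the already-parsed arguments.
    Node build(String name, List<String> args) {
        Function<List<String>, Node> builder = builders.get(name);
        if (builder == null) {
            throw new IllegalArgumentException("unknown function: @" + name);
        }
        return builder.apply(args);
    }
}
```

A table like this keeps the grammar small, but argument counts and types are then checked in code rather than by the grammar. One rule per function, as in the floatexpr and leftexpr rules, lets the parser itself reject a call with the wrong number of arguments.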

Regards,

Bart.
