[antlr-interest] Optional spaces question

Thu Jan 19 02:17:26 PST 2012

You are right: I need to make ID a fragment rule (I copied the two lines
from the older, similar question without checking their validity).

Your suggestion is interesting: putting action code inside the lexer, where
the whitespace has not yet been filtered out and is therefore still visible.
I'll try that, thanks.

Best regards,

-Thomas

2012/1/18 Gokulakannan Somasundaram <gokul007 at gmail.com>

> Hi,
>    First of all, you cannot have a lexer rule inside another lexer rule on
> the left. Either make ID a fragment lexer rule or make FUNCTION_CALL a
> parser rule.
>    Remember, even if there is a skip rule for spaces, it won't skip a
> space that occurs between fragment lexer rules inside a lexer rule. So,
> following the first suggestion, if ID is a fragment lexer rule, you can
> write:
>
> FUNCTION_CALL : ID ( SPACE { /* do whatever you want here */ } )* '(' ;
>
> Hope it helps.
>
> Gokul.
>
> On Wed, Jan 18, 2012 at 9:17 PM, Thomas Thomsen <thomas at t-t.dk> wrote:
>
>> I am pretty new to ANTLR and am writing a DSL. I like ANTLR a lot, but I
>> am struggling with a problem regarding optional whitespace. My problem is
>> that I need to distinguish between "f(x)" and "f  (x)" -- note the space
>> between "f" and "(x)" in the latter (I am putting whitespace on the hidden
>> channel, and I want to continue to do that). The former is a function
>> call, the latter something different.
>>
>> I found a post on this list from 2007 ("Handling optional spaces") which
>> addresses the exact same question. One suggestion was to have the lexer
>> absorb the left parenthesis if there is no space in between:
>>
>> ID : ('a'..'z') + ;
>> FUNCTION_CALL: ID '(' ;
>>
>> Then the lexer would return "f(" as a FUNCTION_CALL token if there is no
>> space in between. This works, but it is not too pretty and complicates
>> things elsewhere in my code. The other suggestion was to check the hidden
>> channel for whitespace tokens by means of Java code (actually C# in my
>> case). But since I am not yet too familiar with the inner workings of
>> ANTLR, this scares me a bit.
>>
>> So I was thinking of a third strategy: Have a simple preprocessor look
>> through the input file, and if a letter is directly followed by a left
>> parenthesis, put some special character in between. So the preprocessor
>> transforms "f(x)" into "f&(x)", where "&" is a (glue) character not used
>> elsewhere in the grammar. And afterwards, it would be much easier to
>> distinguish between "f&(x)" and "f  (x)" in ANTLR.
>>
>> Is this question or strategy completely stupid for some reason?
>>
>> Best regards, and thanks for all the good work on ANTLR,
>>
>> -Thomas Thomsen
>
>
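
A note on the third strategy from the original question (a preprocessor that
puts a glue character between a letter and a directly following left
parenthesis): it could be sketched roughly as below. This is only a minimal
illustration in Python, not code from the thread. The function name
insert_glue and the regular expression are assumptions; the letter range
follows the ID rule quoted above ('a'..'z'), and the sketch does not skip
string literals or comments, which a real preprocessor would probably need
to do.

```python
import re

# Glue character from the original question; it is assumed not to be used
# anywhere else in the grammar.
GLUE = "&"

# A lowercase letter (as in the thread's ID rule, 'a'..'z') immediately
# followed by '(' with no whitespace in between.
_CALL_SITE = re.compile(r"([a-z])\(")


def insert_glue(source: str, glue: str = GLUE) -> str:
    """Insert `glue` between a letter and a directly following '('."""
    # A function is used as the replacement so that `glue` is always
    # inserted literally, whatever character is chosen.
    return _CALL_SITE.sub(lambda m: m.group(1) + glue + "(", source)


if __name__ == "__main__":
    print(insert_glue("f(x)"))    # f&(x)
    print(insert_glue("f  (x)"))  # f  (x)
```

With input rewritten this way, the grammar can keep whitespace on the hidden
channel and treat the glue character as an ordinary token that marks a
function call.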