[antlr-interest] Location dependent token?

Oliver Zeigermann oliver.zeigermann at gmail.com
Wed Dec 31 09:30:52 PST 2008


Hi Mats!

If you really need to distinguish monetary units from measurement
units in the _lexer_ - which I doubt, for the same reasons as the
others who answered - you could add a semantic predicate.

Modifying Jared's grammar might lead to this:

@lexer::members {
  protected boolean isMonetarySymbol = true;
}


expression : NUMBER MONETARY_SYMBOL (SLASH MEASUREMENT_SYMBOL)? ;

SLASH : '/' { isMonetarySymbol = false; };

MONETARY_SYMBOL
   : {isMonetarySymbol}? SYMBOL
   ;

MEASUREMENT_SYMBOL
   : {!isMonetarySymbol}? SYMBOL { isMonetarySymbol = true; }
   ;

fragment SYMBOL : 'A'..'Z' 'A'..'Z' 'A'..'Z' ;


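Be careful to set the flag that the predicates test in the lexer, not
in the parser, though! The lexer usually runs ahead of the parser, so
a flag set from a parser action would come too late.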
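
To make the flag handling concrete, here is a small hand-written Java
sketch of what the lexer rules above do. It is not ANTLR-generated
code; the class name UnitLexerSketch, the method names, and the
whitespace skipping are my own assumptions for illustration (the
grammar above has no whitespace rule).

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; this is not code generated by ANTLR.
class UnitLexerSketch {

    // Mirrors the isMonetarySymbol flag from @lexer::members.
    private boolean isMonetarySymbol = true;

    // Returns the token type names for an input such as "10 EUR / TNE".
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // assumption: blanks are skipped
            } else if (c >= '0' && c <= '9') {
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add("NUMBER");
            } else if (c == '/') {
                i++;
                tokens.add("SLASH");
                isMonetarySymbol = false; // same as the action on SLASH
            } else if (isSymbolAt(input, i)) {
                i += 3;
                if (isMonetarySymbol) {
                    tokens.add("MONETARY_SYMBOL");
                } else {
                    tokens.add("MEASUREMENT_SYMBOL");
                    isMonetarySymbol = true; // reset, as in MEASUREMENT_SYMBOL
                }
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    // True if three uppercase ASCII letters start at the given index.
    private static boolean isSymbolAt(String s, int start) {
        if (start + 3 > s.length()) {
            return false;
        }
        for (int k = start; k < start + 3; k++) {
            char ch = s.charAt(k);
            if (ch < 'A' || ch > 'Z') {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // prints [NUMBER, MONETARY_SYMBOL, SLASH, MEASUREMENT_SYMBOL]
        System.out.println(new UnitLexerSketch().tokenize("10 EUR / TNE"));
    }
}
```

The point is only that the same three letters come out as a different
token type depending on whether a slash was seen just before them.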

Oliver

2008/12/29 Jared Bunting <jared.bunting at peachjean.com>:
> If the three-letter words can be anything, can you just define one token
> that matches 3 uppercase letters?  Your parser should be able to tell
> what's what based on context.
>
> maybe something like this?
>
> expression : NUMBER SYMBOL ('/' SYMBOL)? ;
>
> SYMBOL : 'A'..'Z' 'A'..'Z' 'A'..'Z' ;
>
> -Jared
>
> Mats Ekberg wrote:
>> Ok, maybe I was a bit unclear.
>> Monetary units are expressed as three-letter words: EUR, GBP and so on.
>> Measurement units are also expressed with three letters: TNE, KGM and
>> so on.
>>
>> The only way to know which is which is where the three letters are
>> located. In one location it's a monetary unit and in another it's a
>> measurement unit.
>>
>> ok?
>>
>> regards
>> mats
>>
>> On Mon 2008-12-29 at 08:10 -0600, Gary R. Van Sickle wrote:
>>> > From: Mats Ekberg
>>> >
>>> > Let's say a three-letter word in uppercase can mean one of two
>>> > things, like:
>>> >
>>> >   10  EUR
>>> > where EUR means a monetary unit
>>> >
>>> >   10 EUR / TNE
>>> > where EUR still means a monetary unit but the three letters
>>> > TNE now mean a measurement unit.
>>> >
>>> > How can that be expressed in a grammar??
>>> >
>>> > /mats
>>>
>>> Your question must be missing some information, because what you're asking
>>> is the most basic of lexing/parsing issues:
>>>
>>>
>>> Lexer does something like this:
>>>
>>> NUMBER : '0'..'9'+ ;
>>>
>>> EUR : 'EUR' ;
>>>
>>> TNE : 'TNE' ;
>>>
>>>
>>> Parser does something like this:
>>>
>>> num_with_monetary_unit_and_optional_per_unit
>>>     : NUMBER monetary_unit ('/' measurement_unit)?
>>>     ;
>>>
>>> monetary_unit
>>>     : EUR
>>>     | <<whatever other monies you support>>
>>>     ;
>>>
>>> measurement_unit
>>>     : TNE
>>>     | <<whatever other measurement units you support>>
>>>     ;
>>>
>>>
>>> But was that really your question?
>>>
>>>
>> ------------------------------------------------------------------------
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>