[antlr-interest] Location dependent token?
Oliver Zeigermann
oliver.zeigermann at gmail.com
Wed Dec 31 09:30:52 PST 2008
Hi Mats!
If you really need to distinguish monetary units from measurement units in
the _lexer_ (which I doubt, for the same reasons as the others who
answered), you could add a semantic predicate.
Modifying Jared's grammar might lead to this:
@lexer::members {
protected boolean isMonetarySymbol = true;
}
expression : NUMBER MONETARY_SYMBOL (SLASH MEASUREMENT_SYMBOL)? ;
SLASH : '/' { isMonetarySymbol = false; };
MONETARY_SYMBOL
: {isMonetarySymbol}? SYMBOL
;
MEASUREMENT_SYMBOL
: {!isMonetarySymbol}? SYMBOL { isMonetarySymbol = true; }
;
fragment SYMBOL : 'A'..'Z' 'A'..'Z' 'A'..'Z' ;
Be careful to set the flag that the predicate tests in the lexer, though!
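
To make the flag's behaviour easier to follow, here is a rough hand-written
Java sketch of what the lexer rules above are meant to do. It is not code
generated by ANTLR; the class name UnitLexerSketch and the tokenize method
are made up for illustration, and it assumes whitespace is simply skipped
and NUMBER is a run of digits (neither rule is shown in the grammar above).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lexer; mirrors the rules above by hand.
class UnitLexerSketch {
    // Same role as the isMonetarySymbol member in @lexer::members.
    private boolean isMonetarySymbol = true;

    // Returns the token type names in input order.
    List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // assumption: whitespace is skipped
            } else if (c >= '0' && c <= '9') {
                // assumption: NUMBER is one or more digits
                while (i < input.length()
                        && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                tokens.add("NUMBER");
            } else if (c == '/') {
                // SLASH action: the next symbol is a measurement unit
                isMonetarySymbol = false;
                tokens.add("SLASH");
                i++;
            } else if (isSymbolAt(input, i)) {
                if (isMonetarySymbol) {
                    tokens.add("MONETARY_SYMBOL");
                } else {
                    tokens.add("MEASUREMENT_SYMBOL");
                    isMonetarySymbol = true; // reset for the next expression
                }
                i += 3;
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character at index " + i + ": " + c);
            }
        }
        return tokens;
    }

    // True if three uppercase ASCII letters start at index i (the SYMBOL fragment).
    private static boolean isSymbolAt(String s, int i) {
        if (i + 3 > s.length()) {
            return false;
        }
        for (int k = i; k < i + 3; k++) {
            char c = s.charAt(k);
            if (c < 'A' || c > 'Z') {
                return false;
            }
        }
        return true;
    }
}
```

Calling new UnitLexerSketch().tokenize("10 EUR / TNE") gives
[NUMBER, MONETARY_SYMBOL, SLASH, MEASUREMENT_SYMBOL], while "10 EUR" alone
gives [NUMBER, MONETARY_SYMBOL]. The point is that the decision depends only
on state the lexer itself has seen, never on what the parser is doing.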
Oliver
2008/12/29 Jared Bunting <jared.bunting at peachjean.com>:
> If the three-letter words can be anything, can you just define one token
> that matches 3 uppercase letters? Your parser should be able to tell
> what's what based on context.
>
> maybe something like this?
>
> expression : NUMBER SYMBOL ('/' SYMBOL)? ;
>
> SYMBOL : 'A'..'Z' 'A'..'Z' 'A'..'Z' ;
>
> -Jared
>
> Mats Ekberg wrote:
>> Ok, maybe I was a bit unclear.
>> Monetary units are expressed as three-letter words: EUR, GBP and so on.
>> Measurement units are also expressed with three letters: TNE, KGM and
>> so on.
>>
>> The only way to know which is which is where the three letters are
>> located. In one location it's a monetary unit and in another it's a
>> measurement unit.
>>
>> ok?
>>
>> regards
>> mats
>>
>> Mon 2008-12-29 at 08:10 -0600, Gary R. Van Sickle wrote:
>>> > From: Mats Ekberg
>>> >
>>> > Let's say a three-letter word in uppercase can mean one of two
>>> > things, like:
>>> >
>>> > 10 EUR
>>> > where EUR means a monetary unit
>>> >
>>> > 10 EUR / TNE
>>> > where EUR still means a monetary unit but the three letters
>>> > TNE now mean a measurement unit.
>>> >
>>> > How can that be expressed in a grammar??
>>> >
>>> > /mats
>>>
>>> Your question must be missing some information, because what you're asking
>>> is the most basic of lexing/parsing issues:
>>>
>>>
>>> Lexer does something like this:
>>>
>>> NUMBER : ('0'..'9')+ ;
>>>
>>> EUR : 'EUR' ;
>>>
>>> TNE : 'TNE' ;
>>>
>>>
>>> Parser does something like this:
>>>
>>> num_with_monetary_unit_and_optional_per_unit
>>> : NUMBER monetary_unit ('/' measurement_unit)?
>>> ;
>>>
>>> monetary_unit
>>> : EUR
>>> | <<whatever other monies you support>>
>>> ;
>>>
>>> measurement_unit
>>> : TNE
>>> | <<whatever other measurement units you support>>
>>> ;
>>>
>>>
>>> But was that really your question?
>>>
>>>
>> ------------------------------------------------------------------------
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>