[antlr-interest] Keywords Vs Identifiers.
Monty Zukowski
monty at codetransform.com
Wed May 19 15:13:09 PDT 2004
If you want to handle that in the lexer, you need to do it by calling
the rule that tests the literals table. Here's an example from the C
grammar:

IDMEAT
        :       i:ID
                {
                    if ( i.getType() == LITERAL___extension__ ) {
                        $setType(Token.SKIP);
                    }
                    else {
                        $setType(i.getType());
                    }
                }
        ;

protected ID
        options
                {
                testLiterals = true;
                }
        :       ( 'a'..'z' | 'A'..'Z' | '_' | '$')
                ( 'a'..'z' | 'A'..'Z' | '_' | '$' | '0'..'9' )*
        ;

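
To make the literals-table step concrete, here is a minimal Java sketch of the lookup that testLiterals implies: once an identifier has been matched, its text is looked up in a table and the token type is switched on a hit. This is not ANTLR's generated code; the class name, the token type numbers, and the "TIME"/"time" entries are assumptions chosen to match the question quoted below.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the literals-table check; not ANTLR's own code.
class LiteralsTableSketch {
    // Illustrative token type numbers (assumed, not taken from any grammar).
    static final int ID = 1;
    static final int LITERAL_TIME = 2;

    // Literal text -> token type. In a real grammar this table would be
    // filled from the literals the grammar declares.
    static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("TIME", LITERAL_TIME);
        LITERALS.put("time", LITERAL_TIME);
    }

    // Return the token type for text that already matched the ID rule.
    static int typeOf(String text) {
        Integer literalType = LITERALS.get(text);
        return literalType != null ? literalType : ID;
    }

    public static void main(String[] args) {
        System.out.println(typeOf("TIME"));   // literal hit
        System.out.println(typeOf("TIMER"));  // ordinary identifier
    }
}
```

The point of the design is that the lexer only ever matches one identifier rule, so there is no clash between TIME and IDENTIFIER; the distinction is made after the match, by text.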
It's actually tricky to figure out how to lex the following whitespace
and integer without using a syntactic predicate, but a syntactic
predicate here will be a performance problem. I would actually recommend
using a parser filter; see http://www.codetransform.com/filterexample.html
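
As a rough illustration of the filter idea (not the code from the linked page), the sketch below sits between lexer and parser and retypes an identifier to TIME only when its text is TIME or time and the next token is an integer. The Tok class, the token type numbers, and the assumption that the lexer has already skipped whitespace are all hypothetical; with ANTLR 2 the same logic would live in a class implementing antlr.TokenStream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token-stream filter; names and type numbers are assumptions.
class TimeFilterSketch {
    static final int ID = 1, INT = 2, TIME = 3;

    // Minimal token: a type plus the matched text.
    static final class Tok {
        final int type;
        final String text;
        Tok(int type, String text) { this.type = type; this.text = text; }
    }

    // Retype an ID whose text is TIME/time to TIME, but only when the next
    // token is an INT. Every other token passes through untouched.
    static List<Tok> filter(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < in.size(); i++) {
            Tok t = in.get(i);
            boolean isTimeWord = t.type == ID
                    && (t.text.equals("TIME") || t.text.equals("time"));
            boolean followedByInt =
                    i + 1 < in.size() && in.get(i + 1).type == INT;
            out.add(isTimeWord && followedByInt ? new Tok(TIME, t.text) : t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> in = new ArrayList<>();
        in.add(new Tok(ID, "TIME"));
        in.add(new Tok(INT, "99"));
        for (Tok t : filter(in)) {
            System.out.println(t.type + " " + t.text);
        }
    }
}
```

Because the one-token lookahead happens on tokens rather than characters, the lexer needs no syntactic predicate, and an identifier that merely happens to be spelled TIME stays an ordinary identifier elsewhere.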
By the way, your parser solution works just fine too, and is probably
the easiest.
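
For reference, the parser-side check described in the quoted message below boils down to something like this Java sketch: accept the identifier, inspect its text, and reject anything that is not TIME. The method name, the exception type, and the decision to accept lowercase "time" as well are assumptions made for illustration; in a real ANTLR grammar the text would come from the matched token's getText().

```java
// Hypothetical parser-side check; not generated ANTLR code.
class TimeCheckSketch {
    // idText is the text of the matched identifier token, intText the text
    // of the integer token that follows it. Returns the integer value.
    static int timeLiteral(String idText, String intText) {
        if (!idText.equals("TIME") && !idText.equals("time")) {
            throw new IllegalArgumentException(
                    "expected TIME but found: " + idText);
        }
        return Integer.parseInt(intText);
    }

    public static void main(String[] args) {
        System.out.println(timeLiteral("TIME", "99"));
    }
}
```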
Monty
On May 19, 2004, at 2:55 PM, Bharath wrote:
> Hi Monty,
>
> I did. I figured a way out too but I am not sure if it's an efficient
> solution. I set a rule in the parser which accepts an identifier and I
> extracted the identifier input into a string. If the string is not
> "TIME", I throw an exception, otherwise I accept it (using the
> getText() method).
>
> Please let me know if this is bad practice.
>
> Thanks!
>
> Bharath.
>
> -----Original Message-----
> From: Monty Zukowski [mailto:monty at codetransform.com]
> Sent: Wednesday, May 19, 2004 4:41 PM
> To: antlr-interest at yahoogroups.com
> Cc: Monty Zukowski
> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>
> See the documentation about "literals"
>
> Monty
>
> On May 19, 2004, at 8:25 AM, Bharath S wrote:
>
>> Hi Antlers,
>>
>> I have some rules in my grammar for time literals which require that
>> 'TIME' or "time" be prepended to the rule. For example, a time can be
>> represented as TIME 99secs. The problem is, "TIME" is not a keyword, so
>> I can't have it in the parser. If I put it in the lexer, it causes a
>> clash with the IDENTIFIER rule, because the lexer sees the rules as
>>
>> TIME: 'T' 'I' 'M' 'E' (Integer) ; and
>> IDENTIFIER: ('a'..'z'|'A'..'Z')+;
>>
>> as expected. Is there a common workaround for this?
>>
>> I can solve this problem by moving a whole bunch of rules in the
>> parser back to the lexer, just to make the TIME rule protected. But it
>> doesn't make sense at all.
>>
>> Any comments are most welcome.
>>
>> Bharath.
> Monty Zukowski
>
> ANTLR & Java Consultant -- http://www.codetransform.com
> ANSI C/GCC transformation toolkit --
> http://www.codetransform.com/gcc.html
> Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html