[antlr-interest] Keywords Vs Identifiers.

Monty Zukowski monty at codetransform.com
Thu May 20 10:05:17 PDT 2004


I'm sorry, I was in a hurry.  If you inspect the generated code, you
will see in the ID rule where ANTLR tests the token text against the
literals table and assigns the token type.  To use that in a rule you
may need a semantic predicate.  This is a little tricky because you need
the predicate to choose an alternative.  Hmmm, maybe you could get by
with calling the lexer rules directly in your action code.  Yes: in your
action, where you see the TIME id, call the WS rule and then the INT
rule.  If either fails that's OK; it was not the TIME keyword, it was an
ID, so change the type back.  If both succeed, call your s, m, ms rule.
The text will still be appended to the token buffer and make it through
to the parser.  Try it out and ask when you hit a problem.  I wish I had
another 15 minutes to explain fully...
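
For anyone reading this in the archive, the flow described above can be
sketched roughly as follows.  This is a hand-written Java stand-in, not
ANTLR-generated code and not taken from any grammar in this thread.  The
class name TimeLexerSketch, the token constants, the "TYPE:text" output
format, and the explicit rewind to a saved position when WS or INT does
not match are all assumptions made for illustration; real ANTLR 2 lexer
rules signal failure by throwing an exception instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative stand-in for the lexer logic; all names are hypothetical. */
class TimeLexerSketch {
    // Hypothetical token types; ANTLR would generate its own constants.
    static final int ID = 1, LITERAL_TIME = 2, TIME = 3;

    // Stand-in for the literals table that the ID rule is tested against.
    static final Map<String, Integer> LITERALS = new HashMap<>();
    static { LITERALS.put("TIME", LITERAL_TIME); }

    private final String in;
    private int pos = 0;

    TimeLexerSketch(String in) { this.in = in; }

    /** Returns the tokens of the input, each formatted as "TYPE:text". */
    static List<String> tokenize(String input) {
        return new TimeLexerSketch(input).run();
    }

    private List<String> run() {
        List<String> out = new ArrayList<>();
        while (true) {
            ws();                                   // skip separators
            if (pos >= in.length()) return out;
            int start = pos;
            if (!isIdStart(in.charAt(pos))) {
                throw new IllegalArgumentException("unexpected char at " + pos);
            }
            pos++;
            while (pos < in.length() && isIdPart(in.charAt(pos))) pos++;
            // The "testLiterals" step: look the text up, default to ID.
            int type = LITERALS.getOrDefault(in.substring(start, pos), ID);
            if (type == LITERAL_TIME) {
                int mark = pos;
                if (ws() && integer() && unit()) {
                    type = TIME;                    // whole "TIME 99ms" is one token
                } else {
                    pos = mark;                     // assumption: rewind on failure
                    type = ID;                      // it was an ID after all
                }
            }
            out.add((type == TIME ? "TIME" : "ID") + ":" + in.substring(start, pos));
        }
    }

    // Each helper consumes what it can and reports whether it matched anything.
    private boolean ws() {
        int s = pos;
        while (pos < in.length() && Character.isWhitespace(in.charAt(pos))) pos++;
        return pos > s;
    }

    private boolean integer() {
        int s = pos;
        while (pos < in.length() && Character.isDigit(in.charAt(pos))) pos++;
        return pos > s;
    }

    private boolean unit() {
        // "ms" is tried first so that it is not cut short by the "m" alternative.
        for (String u : new String[] {"ms", "s", "m"}) {
            if (in.startsWith(u, pos)) { pos += u.length(); return true; }
        }
        return false;
    }

    private static boolean isIdStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || c == '$';
    }

    private static boolean isIdPart(char c) {
        return isIdStart(c) || (c >= '0' && c <= '9');
    }
}
```

With this sketch, TimeLexerSketch.tokenize("TIME 99ms") yields a single
TIME token, while tokenize("TIME flies") yields two ID tokens because
the INT step does not match and the type is changed back.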

Monty

On May 20, 2004, at 6:30 AM, Bharath S wrote:

> Hi Monty,
>
> I am unclear about the ID token here. Let's say that the lexer sees
> "abc", which is a token of type ID. Please correct me if my
> understanding is not right.
>
> 1. The if (i.getType( )) statement is used to test against literals.
> So, if ID was "INT" instead of "abc", it would return LITERAL_INT and
> it would skip that token. Otherwise, it sets the type of "abc" to ID.
> Though ID by itself has the testLiterals option set, the IDMEAT rule
> would allow me to have both ID and the (TIME : "TIME" Integer;) rule
> co-exist in the lexer.
>
> 2. This is a better solution because if I had 's', 'm', 'ms' etc. to
> denote seconds, minutes and milliseconds, I would have to write a
> separate rule for each one of them in the parser (if I follow my
> solution) to prevent conflict with the ID rule. Doing it via IDMEAT
> will solve the issue and make life easier.
>
> Thanks for your comments and clarifications!
>
> Bharath.
> ----- Original Message -----
> From: "Monty Zukowski" <monty at codetransform.com>
> To: <antlr-interest at yahoogroups.com>
> Cc: "Monty Zukowski" <monty at codetransform.com>
> Sent: Wednesday, May 19, 2004 5:13 PM
> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>
>
>> If you want to handle that in the lexer you need to do it by calling
>> the rule that tests the literals table, here's an example from the C
>> grammar:
>>
>> IDMEAT
>>          :       i:ID
>>                  {
>>                      if ( i.getType() == LITERAL___extension__ ) {
>>                          $setType(Token.SKIP);
>>                      }
>>                      else {
>>                          $setType(i.getType());
>>                      }
>>                  }
>>          ;
>>
>> protected ID
>>          options
>>                  {
>>                  testLiterals = true;
>>                  }
>>          :       ( 'a'..'z' | 'A'..'Z' | '_' | '$')
>>                  ( 'a'..'z' | 'A'..'Z' | '_' | '$' | '0'..'9' )*
>>          ;
>>
>> It's actually tricky to figure out how to lex the following whitespace
>> and integer without using a syntactic predicate, but a syn pred here
>> will be a performance problem.  I would actually recommend using a
>> parser filter; see http://www.codetransform.com/filterexample.html
>>
>> By the way, your parser solution works just fine too, and is probably
>> the easiest.
>>
>> Monty
>>
>> On May 19, 2004, at 2:55 PM, Bharath wrote:
>>
>>> Hi Monty,
>>>
>>> I did. I figured a way out too, but I am not sure if it's an
>>> efficient solution. I set a rule in the parser which accepts an
>>> identifier, and I extracted the identifier input into a string (using
>>> the getText() method). If the string is not "TIME", I throw an
>>> exception; otherwise I accept it.
>>>
>>> Please let me know if this is bad practice.
>>>
>>> Thanks!
>>>
>>> Bharath.
>>>
>>> -----Original Message-----
>>> From: Monty Zukowski [mailto:monty at codetransform.com]
>>> Sent: Wednesday, May 19, 2004 4:41 PM
>>> To: antlr-interest at yahoogroups.com
>>> Cc: Monty Zukowski
>>> Subject: Re: [antlr-interest] Keywords Vs Identifiers.
>>>
>>> See the documentation about "literals"
>>>
>>> Monty
>>>
>>> On May 19, 2004, at 8:25 AM, Bharath S wrote:
>>>
>>>> Hi Antlers,
>>>>
>>>> I have some rules in my grammar for time literals which require
>>>> that 'TIME' or "time" be prepended to the rule. For example, a time
>>>> can be represented as TIME 99secs. The problem is, "TIME" is not a
>>>> keyword and so I can't have it in the parser. If I throw it in the
>>>> lexer, it causes a clash with the IDENTIFIER rule, because the lexer
>>>> sees the rules as
>>>>
>>>> TIME: 'T' 'I' 'M' 'E' (Integer) ; and
>>>> IDENTIFIER: ('a'..'z'|'A'..'Z')+;
>>>>
>>>> as expected. Is there a common workaround for this?
>>>>
>>>> I can solve this problem by moving a whole bunch of rules in the
>>>> parser back to the lexer, just to make the TIME rule protected. But
>>>> it doesn't make sense at all.
>>>>
>>>> Any comments are most welcome.
>>>>
>>>> Bharath.
>>> Monty Zukowski
>>>
>>> ANTLR & Java Consultant -- http://www.codetransform.com
>>> ANSI C/GCC transformation toolkit --
>>> http://www.codetransform.com/gcc.html
>>> Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html



 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/
 


