[antlr-interest] Keywords Vs Identifiers.

Bharath S bharath at starthis.com
Thu May 20 06:30:31 PDT 2004


Hi Monty,

I am unclear about the ID token here. Let's say that lexer sees "abc" which
is a token of type ID. Please correct me if my understanding is not right.

1. The if (i.getType() == ...) statement is used to test against literals. So,
if the ID were "INT" instead of "abc", it would return LITERAL_INT and skip
that token. Otherwise, it sets the type of "abc" to ID. Although ID itself has
the testLiterals option set, the IDMEAT rule would allow both ID and a
(TIME : "TIME" Integer;) rule to coexist in the lexer.

2. This is a better solution because, if I had 's', 'm', 'ms', etc. to denote
seconds, minutes, and milliseconds, I would have to write a separate rule for
each of them in the parser (if I followed my solution) to prevent conflicts
with the ID rule. Doing it via IDMEAT solves the issue and makes life
easier.
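
As a rough illustration of the mechanism discussed in points 1 and 2, the
following Java sketch mimics what a testLiterals lookup followed by the IDMEAT
action does. It is a plain-Java stand-in, not ANTLR-generated code: the class
name, the token type numbers, and the table entries ("TIME", "s", "m", "ms")
are assumptions made up for the example. In this sketch only the literal that
the action compares against is skipped; any other literal keeps its own type,
and everything else stays an ID.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an ANTLR 2 lexer's literals table.
// All names and token type numbers below are invented for illustration.
final class LiteralsSketch {
    static final int SKIP = -1;                  // plays the role of Token.SKIP
    static final int ID = 1;
    static final int LITERAL_TIME = 2;
    static final int LITERAL_s = 3;
    static final int LITERAL_m = 4;
    static final int LITERAL_ms = 5;
    static final int LITERAL___extension__ = 6;

    // One table holds every keyword-like word, including the time units.
    static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("TIME", LITERAL_TIME);
        LITERALS.put("s", LITERAL_s);
        LITERALS.put("m", LITERAL_m);
        LITERALS.put("ms", LITERAL_ms);
        LITERALS.put("__extension__", LITERAL___extension__);
    }

    // Roughly what testLiterals = true does for the protected ID rule:
    // look the matched text up in the table and fall back to ID.
    static int testLiterals(String text) {
        Integer type = LITERALS.get(text);
        return type != null ? type : ID;
    }

    // Roughly what the IDMEAT action does: skip one specific literal,
    // otherwise pass along whatever type the lookup produced.
    static int idMeat(String text) {
        int type = testLiterals(text);
        return type == LITERAL___extension__ ? SKIP : type;
    }

    public static void main(String[] args) {
        for (String word : new String[] {"abc", "TIME", "ms", "__extension__"}) {
            System.out.println(word + " -> " + idMeat(word));
        }
    }
}
```

With a single table like this, supporting another unit is one more table entry
rather than one more parser rule.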

Thanks for your comments and clarifications!

Bharath.
----- Original Message ----- 
From: "Monty Zukowski" <monty at codetransform.com>
To: <antlr-interest at yahoogroups.com>
Cc: "Monty Zukowski" <monty at codetransform.com>
Sent: Wednesday, May 19, 2004 5:13 PM
Subject: Re: [antlr-interest] Keywords Vs Identifiers.


> If you want to handle that in the lexer, you need to do it by calling
> the rule that tests the literals table. Here's an example from the C
> grammar:
>
> IDMEAT
>          :
>                  i:ID
>                  {
>                      if ( i.getType() == LITERAL___extension__ ) {
>                          $setType(Token.SKIP);
>                      }
>                      else {
>                          $setType(i.getType());
>                      }
>                  }
>          ;
>
> protected ID
>          options
>                  {
>                  testLiterals = true;
>                  }
>          :       ( 'a'..'z' | 'A'..'Z' | '_' | '$')
>                  ( 'a'..'z' | 'A'..'Z' | '_' | '$' | '0'..'9' )*
>          ;
>
> It's actually tricky to figure out how to lex the following whitespace
> and integer without using a syntactic predicate, but a syn pred here
> will be a performance problem.  I would actually recommend using a
> parser filter; see http://www.codetransform.com/filterexample.html
>
> By the way, your parser solution works just fine too, and is probably the
> easiest.
>
> Monty
>
> On May 19, 2004, at 2:55 PM, Bharath wrote:
>
> > Hi Monty,
> >
> > I did. I figured out a way too, but I am not sure if it's an efficient
> > solution. I set up a rule in the parser which accepts an identifier, and I
> > extract the identifier's text into a string (using the getText() method).
> > If the string is not "TIME", I throw an exception; otherwise I accept it.
> >
> > Please let me know if this is bad practice.
> >
> > Thanks!
> >
> > Bharath.
> >
> > -----Original Message-----
> > From: Monty Zukowski [mailto:monty at codetransform.com]
> > Sent: Wednesday, May 19, 2004 4:41 PM
> > To: antlr-interest at yahoogroups.com
> > Cc: Monty Zukowski
> > Subject: Re: [antlr-interest] Keywords Vs Identifiers.
> >
> > See the documentation about "literals"
> >
> > Monty
> >
> > On May 19, 2004, at 8:25 AM, Bharath S wrote:
> >
> >> Hi Antlers,
> >>
> >> I have some rules in my grammar for time literals, which require that
> >> 'TIME' or "time" be prepended to the rule. For example, a time can be
> >> represented as TIME 99secs. The problem is that "TIME" is not a keyword,
> >> so I can't have it in the parser. If I put it in the lexer, it causes a
> >> clash with the IDENTIFIER rule, because the lexer sees the rules as
> >> TIME: 'T' 'I' 'M' 'E' (Integer) ; and
> >> IDENTIFIER: ('a'..'z'|'A'..'Z')+;
> >>
> >> as expected. Is there a common workaround for this?
> >>
> >> I can solve this problem by moving a whole bunch of rules from the
> >> parser back to the lexer, just to make the TIME rule protected. But that
> >> doesn't make sense at all.
> >>
> >> Any comments are most welcome.
> >>
> >> Bharath.
> > Monty Zukowski
> >
> > ANTLR & Java Consultant -- http://www.codetransform.com
> > ANSI C/GCC transformation toolkit --
> > http://www.codetransform.com/gcc.html
> > Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html



 