[antlr-interest] Tokenising for context specific reserved words

Thu Jul 17 16:49:52 PDT 2008

That is one solution; however, semantic predicates, such as { input.LT(1).getText().equals("foo") }? ID, are much to be preferred when there are lots of potential keywords. They also cost less in terms of performance, since they avoid the call to the id rule's method in the general case.  (Or should cost less:  ANTLR 3 currently does not reduce the generated conditionals.)
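
For the 'format' example from the original question, a minimal sketch of the predicate approach might look like the following. It assumes the Java target, and formatOption is a hypothetical rule name (the rule cannot literally be called options, since ANTLR uses that word in its own grammar syntax). It also assumes 'format' no longer appears as a literal anywhere in the parser rules, so the lexer returns it as a plain ID.

formatOption
    // Matches an ID only when its text is "format"; everywhere else
    // the same input is still an ordinary identifier.
    : { input.LT(1).getText().equals("format") }? ID INTEGER
    ;

ID : LETTER (LETTER | DIGIT)* ;
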
--Loring

----- Original Message ----
From: Jim Idle <jimi at temporal-wave.com>
To: antlr-interest <antlr-interest at antlr.org>
Sent: Thursday, July 17, 2008 4:32:42 PM
Subject: Re: [antlr-interest] Tokenising for context specific reserved words

On Thu, 2008-07-17 at 16:22 -0700, Roshan James wrote: 
Hello,

I am trying to parse a language where there are words that have keyword status in only some contexts/rules. In any other context those words can be used as identifiers. However the default behavior of the lexer is that it will generate special tokens for these always. Is there some way to work around this? 

As an example consider the rule called options below:
options: 'format' INTEGER

I define identifiers as:
ID : LETTER (LETTER | DIGIT)*;

However when I do this, the lexer generates a special token that has type 'format'. Thus, in any other part of the grammar where I expect to parse the input string 'format' as an identifier the parser complains. 

The solution that comes to mind is to change the above rule to be
options: ID INTEGER
and then to insert an appropriate semantic check. 

Always generate the keywords, then use parser rules to allow them as variables in specific contexts:

For instance, when compiling LINQ in VB.Net, you can't use the LINQ keywords as variables within LINQ, but you can everywhere else, so you have:

linq_id
    : ID
    ;

id
    : ID
    | linq_keywords
    ;

linq_keywords
    : SELECT -> ID[SELECT]
    | WHERE -> ID[WHERE]
    | etc
    ;

Then you use the appropriate parser rule as context requires.

Jim 
