[antlr-interest] Tokenising for context specific reserved words

Jim Idle jimi at temporal-wave.com
Thu Jul 17 16:32:42 PDT 2008


On Thu, 2008-07-17 at 16:22 -0700, Roshan James wrote:
> Hello,
> 
> I am trying to parse a language where there are words that have
> keyword status in only some contexts/rules. In any other context those
> words can be used as identifiers. However the default behavior of the
> lexer is that it will generate special tokens for these always. Is
> there some way to work around this? 
> 
> As an example consider the rule called options below:
> options: 'format' INTEGER
> 
> I define identifiers as:
> ID : LETTER (LETTER | DIGIT)*;
> 
> However when I do this, the lexer generates a special token that has
> type 'format'. Thus, in any other part of the grammar where I expect
> to parse the input string 'format' as an identifier the parser
> complains. 
> 
> The solution that comes to mind is to change the above rule to be
> options: ID INTEGER
> and then inserting an appropriate semantic check. 


Always have the lexer generate the keyword tokens, then use parser rules
to accept those tokens as variable names in the contexts where that is
allowed:


For instance, when compiling LINQ in VB.Net, you can't use the LINQ
keywords as variables within the LINQ query itself, but you can
everywhere else, so you have:


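// Inside LINQ: the keywords are reserved, so only a real ID will do.
linq_id
    : ID
    ;

// Everywhere else: an ID, or a LINQ keyword rewritten as an ID.
id
    : ID
    | linq_keywords
    ;

// Rewrite each keyword token as an ID node built from that token.
linq_keywords
    : SELECT -> ID[$SELECT]
    | WHERE  -> ID[$WHERE]
    // ... and so on for the remaining LINQ keywords
    ;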

Then you use the appropriate parser rule as context requires.
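
To see the idea outside of ANTLR, here is a minimal Python sketch of the same approach. It is an illustration only, not generated ANTLR code. The keyword table, the function names (tokenize, linq_id, any_id, format_option) and the regular expression are all assumptions made for the example. The lexer always emits a keyword token; the parser-level functions decide whether that token may stand in for an identifier.

```python
import re

# Hypothetical keyword table: maps reserved words to token types.
# The thread only mentions 'format', SELECT and WHERE.
KEYWORDS = {"format": "FORMAT", "select": "SELECT", "where": "WHERE"}

# ID : LETTER (LETTER | DIGIT)* ;  INTEGER is assumed to be DIGIT+.
TOKEN_RE = re.compile(r"\s*(?:(?P<INTEGER>\d+)|(?P<WORD>[A-Za-z][A-Za-z0-9]*))")


def tokenize(text):
    """Lexer: a reserved word ALWAYS becomes its keyword token."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.group("INTEGER") is not None:
            tokens.append(("INTEGER", m.group("INTEGER")))
        else:
            word = m.group("WORD")
            tokens.append((KEYWORDS.get(word, "ID"), word))
        pos = m.end()
    return tokens


def linq_id(token):
    """Strict context: keywords are reserved, only a real ID is accepted."""
    kind, text = token
    if kind != "ID":
        raise SyntaxError(f"{text!r} is reserved in this context")
    return ("ID", text)


def any_id(token):
    """Permissive context: an ID, or a keyword retagged as an ID."""
    kind, text = token
    if kind == "ID" or kind in KEYWORDS.values():
        return ("ID", text)
    raise SyntaxError(f"expected an identifier, got {text!r}")


def format_option(tokens):
    """The questioner's rule: 'format' INTEGER."""
    kinds = [kind for kind, _ in tokens]
    if kinds != ["FORMAT", "INTEGER"]:
        raise SyntaxError("expected: format INTEGER")
    return int(tokens[1][1])
```

With this split, format_option(tokenize("format 10")) treats 'format' as a keyword, any_id(tokenize("format")[0]) accepts the same word as an identifier, and linq_id rejects it.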

Jim