[antlr-interest] IDs and keywords

Jim Idle jimi at temporal-wave.com
Thu Oct 20 08:44:10 PDT 2011


This is well covered in this forum, so use the search engine. But:

1) Do not use 'LITERALS' in parser rules; create real lexer tokens.
2) Use a parser rule id instead of the token ID.
3) The id rule has ID and all the keywords as alternatives; if you are
producing an AST, rewrite the token type to ID.
4) Where this introduces ambiguity, use a one-token predicate or an
explicit k=1.
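
For point 4, a one-token syntactic predicate might look like the sketch
below. The rule name flags is hypothetical and not part of the original
grammar; id, CPA and SEMI are assumed to be defined as in the example
further down. Because id also accepts CPA, a leading CPA could be read
either as the keyword or as an identifier, and the predicate tells ANTLR
to take the keyword reading whenever the next token is CPA.

```antlr
// Hypothetical rule, for illustration only.
// Without the predicate, the optional CPA conflicts with id*,
// because the id rule also accepts the CPA token.
flags
    : ( (CPA)=> CPA )? id* SEMI
    ;
```

This is the same pattern commonly used for the dangling-else case: the
predicate only looks at one token, so it costs very little at parse time.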


Assuming AST...

id:  ID
  |  CPA    -> ID[$CPA]
  |  ASSOC  -> ID[$ASSOC]
... etc


model: MODEL id ASSOC cpa_vars id? id? SEMI ;

CPA  : 'CPA';
PSAT : 'PSAT';

etc...
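
Putting the pieces together, a complete version of the grammar could look
roughly like the sketch below. This is an assumption-laden reconstruction,
not a tested grammar: the output and ASTLabelType options, the MODEL, ASSOC
and SEMI token names, and the WS rule are additions that do not appear in
the original extract, and the return value and actions of cpa_vars are left
out for brevity. The keyword rules are listed before ID so that the lexer
prefers them when both match the same text.

```antlr
grammar MFL;

options {
    language     = C;
    output       = AST;                 // assumed, since the id rule uses rewrites
    ASTLabelType = pANTLR3_BASE_TREE;   // usual tree type for the C target
}

model
    : MODEL id ASSOC cpa_vars id? id? SEMI
    ;

cpa_vars
    : CPA
    | PSAT
    ;

// Use this rule wherever an identifier is allowed.
// Keywords are accepted too and retyped to ID in the tree.
id  : ID
    | CPA    -> ID[$CPA]
    | PSAT   -> ID[$PSAT]
    | ASSOC  -> ID[$ASSOC]
    | MODEL  -> ID[$MODEL]
    ;

// Real tokens for the keywords, defined before ID.
MODEL : 'MODEL';
ASSOC : 'ASSOC';
CPA   : 'CPA';
PSAT  : 'PSAT';
SEMI  : ';';

ID  : ('a'..'z'|'A'..'Z'|'0'..'9'|'_')
      ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-'|','|'+'|'.')*
    ;

// Assumed whitespace handling; the original extract does not show one.
WS  : (' '|'\t'|'\r'|'\n')+ { $channel = HIDDEN; } ;
```

With this layout, input such as MODEL CPA ASSOC CPA BIP1 ; should parse,
because the first CPA is matched through id while the second is matched by
cpa_vars. ANTLR may still warn about the two consecutive optional ids in
model, which were already present in the original rule.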

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Nuno Pedrosa
> Sent: Thursday, October 20, 2011 2:32 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] IDs and keywords
>
> Hello everyone,
>
> I have just started a project to convert our current way of processing
> the file generated by our program into a more elegant way by using a
> parser generator.
>
> ANTLR has so far proven to be quite powerful, but I think I have hit a
> bit of a wall.
>
> Here is an extract of my grammar:
> >>>>>>>>>>>>>>>>>>>>>>>
> grammar MFL;
>
> options{
>  language = C;
> }
>
> model:
> 'MODEL'  ID  'ASSOC' cpa_vars id1=ID? id2=ID? ';'
> ;
>
> cpa_vars returns [long var]:
>     'CPA'     {$var = JVCPA;}
>     | 'PSAT'  {$var = JVSVP;}
> ;
>
> ID
>     :('a'..'z'|'A'..'Z'|'0'..'9'|'_')
> ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'-'|','|'+'|'.')*
> ;
> >>>>>>>>>>>>>>>>>>>>>>>>
>
> I think this should be enough to explain my problem.
>
> The rule model should match input like the examples below, but there are
> situations where it does not work as I expected.
> Example: this matches OK:
> MODEL mymodel ASSOC CPA BIP1 ;
>
> but this one reports <missing ID> where cpa is:
> MODEL cpa ASSOC CPA BIP1 ;
>
> Debugging the code, I understand that the lexer has assigned the token
> type of the literal in the cpa_vars rule instead of the more generic ID
> token type.
>
> My question is: how do I make sure I match ID instead of the 'CPA'
> literal from another rule in this case?
>
> The configuration file I am trying to parse is structured so that,
> depending on where they appear, words are either treated as actual
> keyword tokens or as general identifiers.
>
> I sure will appreciate any help on this.
>
> Best regards,
>
> Nuno

