[antlr-interest] So I wish one token has two types

孙纪刚 Jigang (Robert) Sun sunjigang1965 at yahoo.com.cn
Tue Jun 6 07:14:58 PDT 2006


I have two kinds of tokens to identify, CHAR and ID:

CHAR: LowerCaseChar;
ID: (LowerCaseChar)+;

A single char, e.g. 's', could be of either type, CHAR or ID.

The following lexer rule can only assign one type:

ID: LowerCaseChar {$setType(CHAR);} ( LowerCaseChar)* {$setType(ID);} //get only one type
protected LowerCaseChar: 'a'..'z';

In the parser I need 's' to be scanned as CHAR or as ID, depending on the context.

Micheal J once told me to look at the rule INT_LITERAL in the csharp_v1 example. I think that method
can only distinguish cases like

     INTEGER: (DIGIT)+;
from
     REAL: (DIGIT)+ '.' (DIGIT)+;

by using a rule like

VALUE: (DIGIT)+ {$setType(INTEGER);} ('.' (DIGIT)+ {$setType(REAL);})?;

If the same text, such as "32", is to be treated as INTEGER in one place but needs to be recognised
as REAL in another, the above method cannot work.
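
To make the limitation concrete, here is a rough sketch in plain Java. It is not ANTLR-generated
code, and the names ValueLexerSketch and classify are made up for illustration. It only mimics what
the VALUE rule does: the type is chosen from the matched text alone, so "32" comes out as INTEGER
every time, whatever the parser expects at that point.

```java
// Plain-Java sketch, not ANTLR output; all names here are hypothetical.
final class ValueLexerSketch {
    enum Type { INTEGER, REAL }

    // Mimics: VALUE: (DIGIT)+ {$setType(INTEGER);} ('.' (DIGIT)+ {$setType(REAL);})?;
    // The only input is the matched text, so the result cannot depend on
    // where the parser currently is.
    static Type classify(String text) {
        if (text.matches("[0-9]+")) {
            return Type.INTEGER;
        }
        if (text.matches("[0-9]+\\.[0-9]+")) {
            return Type.REAL;
        }
        throw new IllegalArgumentException("not a VALUE: " + text);
    }

    public static void main(String[] args) {
        System.out.println(classify("32"));   // INTEGER
        System.out.println(classify("3.2"));  // REAL
    }
}
```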

Probably the answer is in the example, but the csharp_v1 grammar is very large and I have not found
the solution.

So I wish one token could have two types, so that the 's' fed to the parser is either CHAR or ID,
depending on my needs.
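
What I have in mind is roughly the following, again as a plain-Java sketch and not anything I know
ANTLR to provide. DualTypeTokenSketch, Token, scan and is are hypothetical names. The lexer would
attach every type the text could have, and the parser would ask for the one it needs in its
current context.

```java
import java.util.EnumSet;
import java.util.Set;

// Plain-Java sketch of the wish; hypothetical names, not ANTLR API.
final class DualTypeTokenSketch {
    enum Type { CHAR, ID }

    static final class Token {
        final String text;
        final Set<Type> types;

        Token(String text, Set<Type> types) {
            this.text = text;
            this.types = types;
        }

        // The parser asks whether the token may be used as the wanted type.
        boolean is(Type wanted) {
            return types.contains(wanted);
        }
    }

    // Assumes the input is one or more lower-case letters ('a'..'z').
    static Token scan(String text) {
        if (!text.matches("[a-z]+")) {
            throw new IllegalArgumentException("not lower-case letters: " + text);
        }
        Set<Type> types = EnumSet.of(Type.ID);   // any run of letters is an ID
        if (text.length() == 1) {
            types.add(Type.CHAR);                // a single letter is also a CHAR
        }
        return new Token(text, types);
    }

    public static void main(String[] args) {
        Token s = scan("s");
        System.out.println(s.is(Type.CHAR) + " " + s.is(Type.ID));  // true true
    }
}
```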

Or could forming the composite token be left to the parser,

id:(LowerCaseChar)+;

so that a string "name" could be converted into a single token, not a tree, in the parser?
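
As a plain-Java sketch of this second idea (hypothetical names again, not ANTLR API): the lexer
would emit only single-character CHAR tokens, and the parser would join a run of them into one ID
token instead of building a tree.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch; JoinCharsSketch, Token and joinAsId are made-up names.
final class JoinCharsSketch {
    static final class Token {
        final String type;
        final String text;

        Token(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Mimics a parser rule id: (LowerCaseChar)+ that yields one token, not a tree.
    static Token joinAsId(List<Token> chars) {
        if (chars.isEmpty()) {
            throw new IllegalArgumentException("need at least one CHAR token");
        }
        StringBuilder text = new StringBuilder();
        for (Token t : chars) {
            if (!"CHAR".equals(t.type)) {
                throw new IllegalArgumentException("expected CHAR, got " + t.type);
            }
            text.append(t.text);
        }
        return new Token("ID", text.toString());
    }

    public static void main(String[] args) {
        List<Token> chars = Arrays.asList(
                new Token("CHAR", "n"), new Token("CHAR", "a"),
                new Token("CHAR", "m"), new Token("CHAR", "e"));
        Token id = joinAsId(chars);
        System.out.println(id.type + " " + id.text);  // ID name
    }
}
```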

Many thanks.

Robert
     



