Re: [antlr-interest] So I wish one token has two types

孙纪刚 Jigang (Robert) Sun sunjigang1965 at yahoo.com.cn
Wed Jun 7 06:19:41 PDT 2006


Thanks, the action works.

However, I am worried that this method has a disadvantage. As far as I know, ANTLR does not check
the semantics of actions. In this example, ID matches a larger set of strings than CHAR, so the
parser generator sees the token ID rather than CHAR. Nondeterminism warnings would then be given in
places where the original CHAR would not cause them, especially in a large grammar.
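
A minimal sketch of the concern, assuming a hypothetical parser rule `item` that used to choose
between CHAR and ID. The names `DemoParser`, `DemoLexer`, `item` and `singleChar` are made up for
illustration and do not come from the real grammar. The rule is called `singleChar` rather than
`char` because ANTLR 2 turns each rule into a Java method and `char` is a reserved word in Java.

```
// Hypothetical ANTLR 2 grammar, for illustration only.
class DemoParser extends Parser;

// Before the change this rule would have read:  item : CHAR | ID ;
// and the two alternatives were told apart by token type.
item
    : singleChar
    | ID
    ;

// Suggested replacement for CHAR: match an ID, then check its length.
// The grammar analysis does not look inside the action, so for
// lookahead purposes this rule is simply "ID".
singleChar
    : id:ID
      {
        if (id.getText().length() > 1) {
            throw new RecognitionException("expected a single character");
        }
      }
    ;

class DemoLexer extends Lexer;

ID  : (LowerCaseChar)+ ;

protected
LowerCaseChar
    : 'a'..'z'
    ;
```

Both alternatives of `item` now begin with ID, so I would expect ANTLR to report a nondeterminism
between them, whereas the CHAR/ID version had distinct lookahead tokens.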

Robert 

   
--- Martin Probst <mail at martin-probst.com> wrote:

> You should probably consider CHAR to be a special case of ID and give  
> an error if it's longer than 1, e.g.
> 
> char: id:ID { if (#id.getText().length() > 1) throw new  
> RecognitionException(...); };
> 
> and replace references to CHAR with references to char and have the  
> CHAR rule in the lexer be protected.
> 
> On 06.06.2006 at 16:14, 孙纪刚 Jigang (Robert) Sun wrote:
> 
> > I have two kinds of tokens, CHAR and ID, to identify
> >
> > CHAR: LowerCaseChar;
> > ID: (LowerCaseChar)+;
> >
> > A single character, e.g. 's', could be of either type, CHAR or ID,
> >
> > but the following lexer rule can assign only one type:
> >
> > ID: LowerCaseChar {$setType(CHAR);} (LowerCaseChar)* {$setType(ID);}
> > // gets only one type
> > protected LowerCaseChar: 'a'..'z';
> >
> > In the parser I need 's' to be scanned as type CHAR or ID depending  
> > on context.
> >
> > Micheal J once told me to look at the INT_LITERAL rule of the csharp_v1  
> > example. I think that method can only distinguish a situation like
> >
> >      INTEGER: (DIGIT)+;
> > from
> >      REAL: (DIGIT)+ '.' (DIGIT)+;
> >
> > by using a rule like
> >
> > VALUE: (DIGIT)+ {$setType(INTEGER);} ('.' (DIGIT)+ {$setType(REAL);})?;
> >
> > If the same text, such as "32", is treated as INTEGER in one place  
> > but needs to be recognised as REAL in another, the above method  
> > does not work.
> >
> > Probably the answer is in the example, but the csharp_v1 grammar is  
> > large and I have not found the solution.
> >
> > So I wish one token could have two types, so that the 's' fed to the  
> > parser is CHAR or ID depending on my needs.
> >
> > Or could the forming of composite tokens be left to the parser,
> >
> > id:(LowerCaseChar)+;
> >
> > so that a string "name" is converted into a single token, not a tree,  
> > in the parser?
> >
> > Many thanks.
> >
> > Robert
> >
> >
> >
> >
> 
> 



