[antlr-interest] lexer doubts ..

Wed Oct 16 10:29:38 PDT 2002

===== 2 ====== has been solved like this:

DOT : '.' ;

protected DIGIT : '0'..'9' ;

ID : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | DIGIT)* ;

protected SUFFIX   : ('a'|'f'|'p'|'n'|'u'|'m'|'k'|'M'|'g'|'t') ;
protected EXPONENT : ( ('e') ('+'|'-')? (DIGIT)+ ) ;
protected INT      : (DIGIT)+ ;   /// match the integral part 123
protected FLOATING : ( '.' (DIGIT)+ ) ;

NUM   : INT ( FLOATING  ) ? ( SUFFIX )? (EXPONENT ) ?     ;      
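
For anyone comparing this with the earlier grammar, here is a rough Python transcription of the rules above as regular expressions. It is only a sketch: the variable names mirror the rules, is_num is a hypothetical helper, and Python's re module backtracks where an ANTLR 2 lexer predicts with fixed lookahead, so it shows which strings the rules describe, not how the generated lexer behaves.

```python
import re

# Regex transcription of the reposted rules (an approximation, see above).
DIGIT = r"[0-9]"
SUFFIX = r"[afpnumkMgt]"
EXPONENT = rf"(?:e[+-]?{DIGIT}+)"
INT = rf"{DIGIT}+"
FLOATING = rf"(?:\.{DIGIT}+)"

# NUM : INT ( FLOATING )? ( SUFFIX )? ( EXPONENT )? ;
NUM = re.compile(rf"{INT}{FLOATING}?{SUFFIX}?{EXPONENT}?")


def is_num(text: str) -> bool:
    """Hypothetical helper: True if the whole string is matched by NUM."""
    return NUM.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["123", "123.45", "123k", "1.5e-10", "1.5ke3", ".123", "1.5d3"]:
        print(sample, is_num(sample))
```

Read this way, the new NUM is not quite the old rule reorganised: it accepts a suffix without a fractional part (123k) and a suffix followed by an exponent (1.5ke3), and it no longer covers the leading-dot form (.123) or the 'd' exponent marker.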

I still don't understand why this works and the other one doesn't. All I
did was reorganise it and use some more sub-rules to make it more
readable.
If anyone understands, please explain.
cheers
ram
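
One possible reading of the difference, offered as a guess rather than a diagnosis: the exponent marker in the original grammar was written ('e'|'d'|), and the trailing '|' adds an empty alternative. With the marker optional, a digit after DOT (DIGIT)+ could either continue the fraction loop or start the exponent, which would fit the "between alt 1 and exit branch of block" warnings on '0'..'9'. The reposted EXPONENT rule has only 'e', so that choice disappears. The Python sketch below counts how many ways the text after the dot can be split under each version; count_parses, ORIGINAL and REWRITTEN are hypothetical names, and regexes only approximate what the ANTLR 2 analysis does.

```python
import re

SUFFIX = r"[afpnumkMgt]"


def count_parses(tail: str, exponent_prefix: str) -> int:
    """Count the splits of `tail` (the text after the '.') into one or more
    fraction digits followed by an optional suffix or exponent."""
    exponent = rf"{exponent_prefix}[+-]?[0-9]+"
    rest = re.compile(rf"(?:{SUFFIX}|{exponent})?")
    ways = 0
    for cut in range(1, len(tail) + 1):
        frac, remainder = tail[:cut], tail[cut:]
        if re.fullmatch(r"[0-9]+", frac) and rest.fullmatch(remainder):
            ways += 1
    return ways


ORIGINAL = r"(?:e|d|)"  # third alternative is empty, as in ('e'|'d'|)
REWRITTEN = r"e"        # as in the reposted EXPONENT rule

if __name__ == "__main__":
    for tail in ["234", "5e-10"]:
        print(tail, count_parses(tail, ORIGINAL), count_parses(tail, REWRITTEN))
```

Under these assumptions the fraction "234" can be split three ways with the original marker and only one way with the rewritten one, while "5e-10" has a single reading in both.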

--- Sriram Durbha <cintyram at yahoo.com> wrote:
> 1. How can I give the same character two names?
>  e.g.:
>  BOOLEAN_OR : '|' ;
>  PIPE       : '|' ;
> 
> But ANTLR complains about this.
> ====== 2 ============================
> Also, I have a grammar which I don't know how to debug, so please
> help.
> 
> DOT : '.' ;
> protected
> DIGIT :	'0'..'9' ;
> NUM   :	(DIGIT) + /// match the integral part  123
>        (         /// the floating part is optional 
>         DOT (DIGIT)+  ///  123. at least one digit after dot
>         (    /// scientific calculators have  k for kilo etc
>          ('a'|'f'|'p'|'n'|'u'|'m'|'k'|'M'|'g'|'t')  
>         |
>          ( ('e'|'d'|)('+'|'-')?(DIGIT)+ ) 
> 
>         )? /// end of scientific
>        )? ///end of  optional floating part 
>       |
>        (   /// use this alt for .123e-10  no integral part
>         DOT (DIGIT)+ 
>         (
>          ('a'|'f'|'p'|'n'|'u'|'m'|'k'|'M'|'g'|'t')  
>         |
>          ( ('e'|'d'|)('+'|'-')?(DIGIT)+ ) 
>         )?
>        ) /// end of alt2 for only floating part 
>  	;
>  
>  ID : ('a'..'z' | 'A'..'Z'|'_')('a'..'z' | 'A'..'Z'|'_'|DIGIT)*
>  	;
>  
> I am getting lexical nondeterminism warnings, but I don't see why I
> should get any except between DOT and NUM's second alt.
> 
> warning: lexical nondeterminism between rules DOT and NUM upon
> scical.g:0:        k==1:'.'
> scical.g:0:        k==2:<end-of-token>
> scical.g:0:        k==3:<end-of-token>
> scical.g:0:        k==4:<end-of-token>
> scical.g:0:        k==5:<end-of-token>
> scical.g:204: warning: lexical nondeterminism upon
> scical.g:204:      k==1:'0'..'9'
> scical.g:204:      k==2:<end-of-token>,'+','-','0'..'9','e','u'
> scical.g:204:      k==3:<end-of-token>,'0'..'9','g'
> scical.g:204:      k==4:<end-of-token>,'0'..'9'
> scical.g:204:      k==5:<end-of-token>,'0'..'9'
> scical.g:204:      between alt 1 and exit branch of block
> scical.g:212: warning: lexical nondeterminism upon
> scical.g:212:      k==1:'0'..'9'
> scical.g:212:      k==2:<end-of-token>,'+','-','0'..'9','e','u'
> scical.g:212:      k==3:<end-of-token>,'0'..'9','g'
> scical.g:212:      k==4:<end-of-token>,'0'..'9'
> scical.g:212:      k==5:<end-of-token>,'0'..'9'
> scical.g:212:      between alt 1 and exit branch of block
> 
> If I have to use a syntactic predicate, how do I use it, and for what
> reason? Please explain. I could not understand it clearly from
> the example and documentation.
> 
> ====== 3 ============================
> 
> If in future I add structures and complex data types, is it better to
> recognize structure definitions at lexing time or parsing time? Same
> for structure references.
> should it be TYPE_STRUCT 
> or KWD_STRUCT LBRACE TYPE ID and so on... 
> 
> 
> should it be KWD_STRUCT DOT ID
> or STRUCT_REF 
> 
> ==========
> 
> thank you 
> ram
