# [antlr-interest] lexer doubts ..

Sriram Durbha cintyram at yahoo.com
Wed Oct 16 10:29:38 PDT 2002

```
===== 2 ====== has been solved like this

DOT           :  '.'  ;
protected
DIGIT :	'0'..'9' ;

ID : ('a'..'z' | 'A'..'Z'|'_')('a'..'z' | 'A'..'Z'|'_'|DIGIT)*   ;

protected SUFFIX      :   ('a'|'f'|'p'|'n'|'u'|'m'|'k'|'M'|'g'|'t')   ;
protected EXPONENT :   ( ('e')('+'|'-')?(DIGIT)+ )   ;
protected INT            :	(DIGIT) +    ;  /// match the integral part 123
protected FLOATING  :   ( '.'  (DIGIT)+ ) ;

NUM   : INT ( FLOATING  ) ? ( SUFFIX )? (EXPONENT ) ?     ;

I still don't understand why this works and the other one doesn't. All I
did was reorganise it and use some more subrules to make it more
If anyone understands, please explain.
cheers
ram

--- Sriram Durbha <cintyram at yahoo.com> wrote:
> 1. How can I give the same character two names?
>  e.g.:
>  BOOLEAN_OR : '|' ;
>  PIPE       : '|' ;
>
> But ANTLR complains about this.
> ====== 2 ============================
> ..
>
> DOT : '.' ;
> protected
> DIGIT :	'0'..'9' ;
> NUM   :	(DIGIT) + /// match the integral part  123
>        (         /// the floating part is optional
>         DOT (DIGIT)+  ///  123. at least one digit after dot
>         (    /// scientific calculators have  k for kilo etc
>          ('a'|'f'|'p'|'n'|'u'|'m'|'k'|'M'|'g'|'t')
>         |
>          ( ('e'|'d'|)('+'|'-')?(DIGIT)+ )
>
>         )? /// end of scientific
>        )? ///end of  optional floating part
>       |
>        (   /// use this alt for .123e-10  no integral part
>         DOT (DIGIT)+
>         (
>          ('a'|'f'|'p'|'n'|'u'|'m'|'k'|'M'|'g'|'t')
>         |
>          ( ('e'|'d'|)('+'|'-')?(DIGIT)+ )
>         )?
>        ) /// end of alt2 for only floating part
>  	;
>
>  ID : ('a'..'z' | 'A'..'Z'|'_')('a'..'z' | 'A'..'Z'|'_'|DIGIT)*
>  	;
>
> I am getting lexical nondeterminism warnings, but I don't see why I
> should get any except for DOT and NUM's second alt.
>
> warning: lexical nondeterminism between rules DOT and NUM upon
> scical.g:0:        k==1:'.'
> scical.g:0:        k==2:<end-of-token>
> scical.g:0:        k==3:<end-of-token>
> scical.g:0:        k==4:<end-of-token>
> scical.g:0:        k==5:<end-of-token>
> scical.g:204: warning: lexical nondeterminism upon
> scical.g:204:      k==1:'0'..'9'
> scical.g:204:      k==2:<end-of-token>,'+','-','0'..'9','e','u'
> scical.g:204:      k==3:<end-of-token>,'0'..'9','g'
> scical.g:204:      k==4:<end-of-token>,'0'..'9'
> scical.g:204:      k==5:<end-of-token>,'0'..'9'
> scical.g:204:      between alt 1 and exit branch of block
> scical.g:212: warning: lexical nondeterminism upon
> scical.g:212:      k==1:'0'..'9'
> scical.g:212:      k==2:<end-of-token>,'+','-','0'..'9','e','u'
> scical.g:212:      k==3:<end-of-token>,'0'..'9','g'
> scical.g:212:      k==4:<end-of-token>,'0'..'9'
> scical.g:212:      k==5:<end-of-token>,'0'..'9'
> scical.g:212:      between alt 1 and exit branch of block
>
> If I have to use a syntactic predicate, how do I use it, and for what
> reason? Please explain; I could not understand it clearly from
> the example and documentation.
>
> ====== 3 ============================
>
> If in future I add structures and complex data types, is it better to
> recognize structure definitions at lexing time or parsing time? Same
> for structure references.
> Should it be TYPE_STRUCT
> or KWD_STRUCT LBRACE TYPE ID and so on?
>
>
> should it be KWD_STRUCT DOT ID
> or STRUCT_REF
>
> ==========
>
> thank you
> ram


```
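
One possible reading of the warnings quoted above, offered as a sketch rather than a confirmed diagnosis: in the original NUM rule the exponent marker `('e'|'d'|)` has an empty alternative, so when the fraction's `(DIGIT)+` loop sees another digit it can either keep looping or exit and start an exponent with no marker. The reworked EXPONENT requires a literal 'e', which removes that choice. The Python below is not ANTLR and ignores lookahead depth; it only transcribes the reworked rules into a regular expression and lists the competing splits of the old fraction. The helper names `is_num` and `old_fraction_parses` are hypothetical, chosen for illustration.

```python
import re

# Regex transcription of the reworked rules. The fragment names mirror the
# `protected` rules in the post; this is an approximation, not ANTLR output.
DIGIT = r"[0-9]"
INT = rf"{DIGIT}+"
FLOATING = rf"\.{DIGIT}+"
SUFFIX = r"[afpnumkMgt]"
EXPONENT = rf"e[+-]?{DIGIT}+"
NUM = re.compile(rf"{INT}(?:{FLOATING})?(?:{SUFFIX})?(?:{EXPONENT})?")


def is_num(text: str) -> bool:
    """Return True when the whole string is one NUM under the reworked rule."""
    return NUM.fullmatch(text) is not None


def old_fraction_parses(digits: str) -> list[tuple[str, str]]:
    """List the (fraction, exponent) splits the ORIGINAL rule allows for the
    digits after the dot, assuming the empty exponent marker is chosen."""
    if not digits or not digits.isdigit():
        raise ValueError("expected a non-empty run of digits")
    # First parse: every digit stays in the fraction loop, optional block skipped.
    parses = [(digits, "")]
    # Other parses: leave the loop early; the rest becomes a marker-less exponent.
    for cut in range(1, len(digits)):
        parses.append((digits[:cut], digits[cut:]))
    return parses


if __name__ == "__main__":
    for sample in ["123", "1.5k", "1.5e-10", "1ke3", ".123", "1.5d3"]:
        print(sample, is_num(sample))
    print(old_fraction_parses("123"))
```

Under this transcription `1.5k` and `1ke3` are accepted, while `.123` and `1.5d3` are not, because the reworked rule needs an integral part and no longer has the 'd' marker. For the old rule, `old_fraction_parses("123")` returns three splits of the same digits, which is the kind of choice the "between alt 1 and exit branch of block" warnings appear to describe.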