# [antlr-interest] Basic Lexical Analysis Problem

Bogdan Mitu bogdan_mt at yahoo.com
Mon Jun 24 01:53:29 PDT 2002

```--- v_vivekg <v_vivekg at yahoo.com> wrote:
> Hello All,
>
> I am new to ANTLR and am having problems distinguishing binary digits,
> hex digits and normal digits in the lexer. The lexer code is:
>
> NUMBER   : ('0'..'9')+ ;
>
> UPPER
> options { testLiterals = true; }
>          : ('A'..'Z') ( 'a'..'z' | 'A'..'Z' | '-' | '0'..'9' )* ;
>
> LOWER
> options { testLiterals = true; }
>          : ('a'..'z') ( 'a'..'z' | 'A'..'Z' | '-' | '0'..'9' )* ;
>
> B_STRING : SINGLE_QUOTE ('0'..'1')* SINGLE_QUOTE 'B' ;
> H_STRING : SINGLE_QUOTE ('0'..'9' | 'A'..'F' | 'a'..'f')+ SINGLE_QUOTE 'H' ;
>
> Compiling this grammar produces the warnings shown below.
> I have tried various manipulations of this grammar without success.
> Also, this grammar is not able to differentiate between hex and binary
> digits. It gives an error for binary digits and works OK for hex digits.

B_STRING sequences are also valid H_STRING sequences, except for the last
character ('B' or 'H'), so the lexer cannot decide which rule to apply until
it sees the end of the sequence. Because the sequence can be arbitrarily
long, no finite lookahead is enough. Increasing the k value, as it seems you
did, won't help. You need to use syntactic predicates (see the ANTLR
documentation for details).

Bogdan

When the input is a B-STRING, the lexer
Rules B_STRING and H_STRING can not be

> Kindly help
>
> Regards
> Vivek
>
>
>
> ===============================================================
>
>
> ANTLR Parser Generator   Version 2.7.2a2 (20020112-1)   1989-2002 jGuru.com
> warning: lexical nondeterminism between rules B_STRING and H_STRING upon
>        k==1:'\''
>        k==2:'0','1'
>        k==3:'\'','0','1','B'
>        k==4:'\'','0','1','B'
>        k==5:<end-of-token>,'\'','0','1','B'
>        k==6:<end-of-token>,'\'','0','1','B'
>        k==7:<end-of-token>,'\'','0','1','B'
>        k==8:<end-of-token>,'\'','0','1','B'
>        k==9:<end-of-token>,'\'','0','1','B'
>        k==10:<end-of-token>,'\'','0','1','B'
>        k==11:<end-of-token>,'\'','0','1','B'
> 145: warning: lexical nondeterminism upon
> 145:   k==1:'-'
> 145:   k==2:'\n','-'
> 145:   k==3:<end-of-token>
> 145:   k==4:<end-of-token>
> 145:   k==5:<end-of-token>
> 145:   k==6:<end-of-token>
> 145:   k==7:<end-of-token>
> 145:   k==8:<end-of-token>
> 145:   k==9:<end-of-token>
> 145:   k==10:<end-of-token>
> 145:   k==11:<end-of-token>
> 145:   between alt 1 and exit branch of block
> 155: warning: lexical nondeterminism upon
> 155:   k==1:'a'..'z'
> 155:   k==2:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==3:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==4:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==5:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==6:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==7:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==8:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==9:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==10:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==11:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   between alt 1 and exit branch of block
> 155: warning: lexical nondeterminism upon
> 155:   k==1:'A'..'Z'
> 155:   k==2:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==3:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==4:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==5:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==6:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==7:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==8:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==9:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==10:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   k==11:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 155:   between alt 2 and exit branch of block
> 159: warning: lexical nondeterminism upon
> 159:   k==1:'a'..'z'
> 159:   k==2:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==3:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==4:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==5:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==6:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==7:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==8:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==9:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==10:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==11:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   between alt 1 and exit branch of block
> 159: warning: lexical nondeterminism upon
> 159:   k==1:'A'..'Z'
> 159:   k==2:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==3:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==4:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==5:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==6:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==7:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==8:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==9:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==10:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
> 159:   k==11:<end-of-token>,'"','-','0'..'9','A'..'Z','a'..'z'
>
> =================================================================
>
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
>
>
>


```
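
The syntactic-predicate approach suggested in the reply could look roughly like the sketch below. It assumes ANTLR 2.x lexer syntax. QUOTED_STRING is a hypothetical rule name, not part of the original grammar, and SINGLE_QUOTE is assumed to be a protected helper rule, since its definition does not appear in the post. The idea is to make B_STRING and H_STRING protected, and let one visible rule try B_STRING speculatively before falling back to H_STRING.

```
// Sketch only, not the poster's grammar. Assumes ANTLR 2.x lexer syntax.

// Assumption: SINGLE_QUOTE is a protected helper (its definition is not
// shown in the original post).
protected SINGLE_QUOTE : '\'' ;

// Protected rules are not returned to the parser on their own; they are
// only invoked from other lexer rules.
protected B_STRING
    : SINGLE_QUOTE ('0'..'1')* SINGLE_QUOTE 'B' ;

protected H_STRING
    : SINGLE_QUOTE ('0'..'9' | 'A'..'F' | 'a'..'f')+ SINGLE_QUOTE 'H' ;

// QUOTED_STRING is a hypothetical name. The predicate (B_STRING)=> scans
// ahead for a complete B_STRING, however long, and rewinds; if that
// speculative match fails, the lexer tries the H_STRING alternative.
QUOTED_STRING
    : (B_STRING) => B_STRING { $setType(B_STRING); }
    | H_STRING               { $setType(H_STRING); }
    ;
```

With this arrangement the parser still sees B_STRING and H_STRING token types, because the action resets the type after the match. The predicate costs a second pass over each binary string, which is usually acceptable for short literals.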