[antlr-interest] Token class in lexer - lexical nondeterminism error

Jiho Han jhan at InfinityInfo.com
Tue Jul 25 08:32:59 PDT 2006


Thanks for the tip.  In fact, I have proceeded on that path but then what is all this business with token class?

The reason token classes interested me in the first place is that an element label can't be applied to a parser rule reference for use in a semantic action, as in:

expr: (a:ID op:operator value:LITERAL) ;

If, however, I had an OPERATOR token class, I could write:

expr: (a:ID op:OPERATOR value:LITERAL) ;

I am trying to avoid:

expr: (a:ID (eq:OP_EQ | neq:OP_NEQ | gt:OP_GT | lt:OP_LT | ge:OP_GE | le:OP_LE) value:LITERAL) ;

Something like that.  I want to be able to apply a single label to a group of tokens (a token class) and use it in the semantic action that follows.
Thanks
Jiho

-----Original Message-----
From: james_cataldo at agilent.com [mailto:james_cataldo at agilent.com] 
Sent: Tuesday, July 25, 2006 11:22 AM
To: Jiho Han; antlr-interest at antlr.org
Subject: RE: [antlr-interest] Token class in lexer - lexical nondeterminism error

The problem is that if the lexer sees an equals sign, it has two tokens it can create.  One is an OP_EQ token, and the other is an OPERATOR token.  I suggest you make a rule called operator in your parser:

> operator:
>    (OP_EQ | OP_NEQ | OP_GT | OP_LT | OP_GE | OP_LE) ;

Then you shouldn't get any problem about nondeterminism, because tokens will be sent to the parser from the lexer, not characters. Character streams are sent to the lexer, which then creates token streams to send to the parser.  Hope this helps.
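
To make the two stages concrete, here is a minimal sketch in Python. It is not ANTLR output and does not use any ANTLR API; the token patterns for ID and LITERAL, and the helper names tokenize, operator and expr, are assumptions made up for illustration. The lexer stage turns characters into exactly one token type per match, and the parser-level operator rule then accepts any of the six operator tokens.

```python
import re

# Hypothetical token table mirroring the lexer rules in this thread.
# Longer operators come first so "<>" and "<=" win over "<".
TOKEN_SPEC = [
    ("OP_NEQ", r"<>"),
    ("OP_GE", r">="),
    ("OP_LE", r"<="),
    ("OP_EQ", r"="),
    ("OP_GT", r">"),
    ("OP_LT", r"<"),
    ("LITERAL", r"\d+|'[^']*'"),   # assumed literal syntax
    ("ID", r"[A-Za-z_]\w*"),       # assumed identifier syntax
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

OPERATORS = {"OP_EQ", "OP_NEQ", "OP_GT", "OP_LT", "OP_GE", "OP_LE"}


def tokenize(text):
    """Lexer stage: characters in, (type, text) tokens out."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":          # skip whitespace
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def expect(tokens, i, kinds):
    """Return the token at index i if its type is in kinds, else fail."""
    if i >= len(tokens) or tokens[i][0] not in kinds:
        raise SyntaxError(f"expected one of {sorted(kinds)} at token {i}")
    return tokens[i]


def operator(tokens, i):
    """Parser stage, like: operator : OP_EQ | OP_NEQ | ... ;"""
    return expect(tokens, i, OPERATORS), i + 1


def expr(tokens):
    """Parser stage, like: expr : ID operator LITERAL ;"""
    _, ident = expect(tokens, 0, {"ID"})
    op, i = operator(tokens, 1)
    _, value = expect(tokens, i, {"LITERAL"})
    if i + 1 != len(tokens):
        raise SyntaxError("trailing input after expression")
    return ident, op, value
```

Because the grouping happens over token types in the parser stage, the lexer never has to choose between two rules for the same character, which is what the nondeterminism warning is about.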

Cheers,
Adam

________________________________________
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jiho Han
Sent: Tuesday, July 25, 2006 6:16 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Token class in lexer - lexical nondeterminism error

I am new to ANTLR. 
I have the following defined in my lexer. 
OP_EQ    : '=' ;
OP_NEQ   : "<>" ;
OP_GT    : '>' ;
OP_LT    : '<' ;
OP_GE    : ">=" ;
OP_LE    : "<=" ;
LPAREN   : '(' ;
RPAREN   : ')' ;
OPERATOR : (OP_EQ | OP_NEQ | OP_GT | OP_LT | OP_GE | OP_LE) ;
When I run it through antlr I get:
FilterExpression.g: warning:lexical nondeterminism between rules OP_EQ and OPERATOR upon
FilterExpression.g:     k==1:'='
FilterExpression.g:     k==2:<end-of-token>
And a bunch of others like it.
I tried to create OPERATOR as a token class as mentioned in the antlr documentation in the section titled, Meta Language.
So that I can do this in the parser: 
expr: ID OPERATOR^ VALUELITERAL ;
Instead of,
expr: ID (OP_EQ | OP_NEQ | OP_GT | OP_LT | OP_GE | OP_LE) VALUELITERAL ;
What am I missing?
Thanks
Jiho Han
Senior Software Engineer
Infinity Info Systems
The Sales Technology Experts
Tel: 212.563.4400 x216
Fax: 212.760.0540
jhan at infinityinfo.com
www.infinityinfo.com 