[antlr-interest] ANTLR 3 newbie question: Decision can match using multiple alternatives

Ivo Jimenez ivojimenez at gmail.com
Fri Apr 27 02:51:46 PDT 2007


Hi,

I have a question regarding a non-determinism. I'm new to ANTLR 3 and
I'm trying to write the grammar for SQL. I don't want to put the whole
set of rules since I have implemented only a subset of the select and
create statements and its already 800 lines long but if you consider
that it would be better if I post it please let me know. Well, so I
will try to explain myself using only the non-deterministic rules.

Everything was going fine until I decided to expand the language and
include the recognition of expression lists (that appear in the where
condition), for example I want to recognize something like this:

select partnumber, vendornumber
   from purchdb.supplyprice
   where partnumber = any ('1343-d-01', '1623-td-01', '1723-ad-01',
'1733-ad-01')
      and not vendornumber = 9011;

I need to recognize a expression list (which can or can't be enclosed
by parenthesis), where each expression is separated by a comma.

The rules I have for these are the following:

predicate
   : row_value_constructor ( comparison_predicate | in_predicate |
null_predicate )
   | exists_predicate
   ;

comparison_predicate
   : ( '=' | '<>' | '<' | '<=' | '>' | '>=' ) ( quantifier )?
            ( row_value_constructor | '(' subquery ')' )
   ;

in_predicate
   : ('not')? 'in' '(' subquery ')'
   ;

null_predicate
   : 'is' ('not')? 'null'
   ;

exists_predicate
   : 'exists' '(' subquery ')'
   ;

row_value_constructor
   : expression_list
   ;

quantifier
   : 'any' | 'some' | 'all'
   ;

expression_list
   : expression ( ',' expression )*
   ;

expression
   : character_expression
   | numeric_expression
   ;

character_expression
   : atom ( '|' '|' atom )*
   ;

numeric_expression
   : numeric_term ( ( '+' | '-' ) numeric_term )*
   ;

numeric_term
   : numeric_factor ( ( '*' | '\\' ) numeric_factor )*
   ;

numeric_factor
   : ( '+' | '-' )? atom
   ;

atom
   : NUMBER
   | STRING
   | column_reference
   | aggregate_function
   | other_function
   | '(' expression ')'
   ;

// Lexer
WS    : (' ' |'\t' |'\n' |'\r' )+ { skip(); } ;
NL    : '\r' ? '\n';
NUMBER: ( '0'..'9' )+ ( '.' ( '0'..'9' )+ )?;
STRING: '\'' ( LETTER | '0'..'9' | '_' | '\\' | ' ' | '-' | '>' | '='
| '<' | '+' )+ '\'';
ID    : LETTER ( LETTER | '0'..'9' | '_' )*;
LETTER: ('a'..'z' | 'A'..'Z');

When I call ANTLR I get the following:

ANTLR Parser Generator  Version 3.0b7 (April 12, 2007)  1989-2007
warning(200): SqlV3.g:248:34: Decision can match input such as "','
{ID, 'avg'..'nullifzero'}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(208): SqlV3.g:842:1: The following token definitions are
unreachable: NL,LETTER

Where line 248 is the expression_list: expression ( ',' expression )*;
line and 842 the LETTER line. I've even bought the ANTLR 3 book in
search of an answer and I know it is there (section 11.5 Ambiguities
and Non-determinisms) but I can't see it.

Thanks a lot for your time.

-Ivo


More information about the antlr-interest mailing list