[antlr-interest] Advice on SQL grammar

Brian Lavender brian at brie.com
Wed Oct 12 12:14:22 PDT 2011


I am analyzing a set of SQL queries to see what tables and columns are
touched.  They are select queries in Oracle using a handful of functions
such as NVL2, DECODE, and other functions. While I have seen grammars
for Oracle, it seems that digging through them has been confusing not
to mention one used all my heap!  So, I decided to construct my own
grammar. The H2 website has a pretty nice layout of a select grammar,
so I am using that as my model. I have a couple questions.

First, I tried making my start rule 'select' which was later recursively
called.  It appears that I can't do that? I corrected my grammar with
a superceding start rule and all appears fine. Can someone explain the
problem with recursively calling the start rule.

How does my grammar look so far? I have just been "stubbing" it out. 

Any suggestions for doing this?

brian

Sample grammar I am referencing.
http://www.h2database.com/html/grammar.html#select

grammar BooSQL;

options {
  language = Java;
}

@header {
  package grammar;
}

@lexer::header {
  package grammar;
}

sqlStatement
	: select
	;

select
	: 
	'SELECT' ('DISTINCT'|'ALL')? selectExpression (',' selectExpression)* 
	'FROM' tableExpression (',' tableExpression)*
	 ('WHERE' expression)?
	 ('GROUP' 'BY' expression (',' expression)*)?
	 ('HAVING' expression)?
// This where I originally had the problem
	 (('UNION' ('ALL')? | 'MINUS' | 'EXCEPT' | 'INTERSECT') select)?
// Fake end to statment for now
	'ENDSQL'
	;
	
selectExpression
	: IDENT+
	;

tableExpression
	: IDENT+
	;

expression
	: IDENT+
	;





fragment LETTER : ('a'..'z' | 'A'..'Z') ;
fragment DIGIT : '0'..'9';
INTEGER : DIGIT+ ;
IDENT : LETTER (LETTER | DIGIT)*;
QUOTE_IDENT : '"' LETTER (LETTER | DIGIT | ' ')* '"';
WS : (' ' | '\t' | '\n' | '\r' | '\f')+ {$channel = HIDDEN;};
COMMENT : '--' .* ('\n'|'\r') {$channel = HIDDEN;};


-- 
Brian Lavender
http://www.brie.com/brian/

"There are two ways of constructing a software design. One way is to
make it so simple that there are obviously no deficiencies. And the other
way is to make it so complicated that there are no obvious deficiencies."

Professor C. A. R. Hoare
The 1980 Turing award lecture


More information about the antlr-interest mailing list