[antlr-interest] Problem generating the Java parser for Oracle PL/SQL grammar

Andrew Haritonkin thikone at gmail.com
Wed Nov 12 03:44:41 PST 2008


Hi, Javier Luis Cánovas!

You are welcome!

Yep, I also use ANTLRNoCaseFileStream for case insensitivity, so all
my keywords are upper case. Beyond that, I made a small
improvement to the join clause of the select statement; see the attached
file.
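
For readers who have not seen the trick, here is a minimal sketch of the
idea behind a case-insensitive character stream. It is self-contained and
does not use the ANTLR runtime: the class UpperCaseLookahead and all of its
methods are hypothetical names that only mimic the approach (upper-case the
lookahead used for matching, keep the original text untouched).

```java
// Hypothetical stand-in for a case-insensitive char stream; NOT the ANTLR class.
final class UpperCaseLookahead {
    static final int EOF = -1;

    private final char[] data;
    private int pos = 0;

    UpperCaseLookahead(String input) {
        this.data = input.toCharArray();
    }

    /** Lookahead used for matching: always upper case. Valid for i >= 1. */
    int LA(int i) {
        int index = pos + i - 1;
        if (i < 1 || index >= data.length) {
            return EOF;
        }
        return Character.toUpperCase(data[index]);
    }

    /** Advances by one character, stopping at the end of the input. */
    void consume() {
        if (pos < data.length) {
            pos++;
        }
    }

    /** Original text between start and stop (inclusive), case preserved. */
    String substring(int start, int stop) {
        return new String(data, start, stop - start + 1);
    }

    /** True if the upcoming characters spell the given upper-case keyword. */
    boolean lookingAt(String upperKeyword) {
        for (int i = 0; i < upperKeyword.length(); i++) {
            if (LA(i + 1) != upperKeyword.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

With this arrangement a lexer only ever compares against upper-case
literals, while the text handed to later stages keeps whatever case the
user typed.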

I would also appreciate any comments or suggestions from your side.
It would be especially helpful to know of any valid PL/SQL that cannot be
parsed by this grammar, since I'm very interested in improving it.

What I already know is that XML SQL functions are not supported,
and that the following statement fails (it seems the multiple levels of
'( )' cause the trouble):

select * from
(((( a inner join b on a.x = b.x )
left outer join c on a.x = c.x )
left outer join d on a.x = d.x )
left outer join e on a.x = e.x );

There is also a problem with some non-reserved keywords. Most of
them are ID tokens with a gate predicate that checks the text of the
token, like this:

keyWHILE : {PLSQL3jParser.this.input.LT(1).getText().toUpperCase().equals("WHILE")}?
ID;

but this particular keyword, and some others, I had to replace with
a literal token instead:

keyWHILE : 'WHILE';

Otherwise the parser is not able to make the right decision in
some cases. As a result, these words cannot be used as identifiers,
while in fact they can:

CREATE TABLE test (while NUMBER);

BEGIN
   UPDATE test SET while = while + 1;
END;
/

This will be accepted by Oracle.
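
The trade-off between the two keyword strategies can be shown with a tiny
model. This is only a sketch: KeywordSketch, Tok, and the helper methods are
hypothetical names invented for illustration, and none of it is
ANTLR-generated code. It assumes an identifier rule that accepts only ID
tokens.

```java
// Hypothetical miniature of the two keyword strategies; NOT ANTLR-generated code.
final class KeywordSketch {
    enum Type { ID, WHILE }

    static final class Tok {
        final Type type;
        final String text;

        Tok(Type type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Strategy 1: every word is lexed as ID; the parser decides by token text.
    static Tok lexAsId(String word) {
        return new Tok(Type.ID, word);
    }

    // Strategy 2: 'WHILE' is a literal token, so the lexer always claims the word.
    static Tok lexWithLiteral(String word) {
        return word.equalsIgnoreCase("WHILE")
                ? new Tok(Type.WHILE, word)
                : new Tok(Type.ID, word);
    }

    // Mirrors the predicate on the keyword rule: an ID whose text is the keyword.
    static boolean matchesKeyword(Tok t, String keyword) {
        return t.type == Type.ID
                && t.text.toUpperCase(java.util.Locale.ROOT).equals(keyword);
    }

    // Assumed identifier rule: accepts ID tokens only.
    static boolean matchesIdentifier(Tok t) {
        return t.type == Type.ID;
    }
}
```

Under strategy 1 the word "while" can satisfy both the keyword rule and the
identifier rule, so the parser has to pick by context. Under strategy 2 the
token type is already WHILE when the parser sees it, so the identifier rule
rejects it, which is why a column named "while" stops parsing.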

Recently I saw an article on the wiki which might help me solve this
and also improve the speed (I think I already know what to do):
http://www.antlr.org/wiki/display/ANTLR3/How+to+remove+global+backtracking+from+your+grammar

Andrew

On Wed, Nov 12, 2008 at 11:18 AM, Javier Luis Cánovas Izquierdo
<zirrus at gmail.com> wrote:
> Hi Andrew!
>
> Thanks for the advice. It has been useful for improving my grammar
> definition.
>
> I had to modify the ANTLR Ant task to run the ANTLR parser tool
> (for memory reasons) and some elements of the grammar definition (as you
> said in your mail): the options section, the members section, and some
> grammar rules. The only thing I have done differently is the definition of
> the rules for keywords. I use the ANTLRNoCaseFileStream Java class defined in
> http://www.antlr.org/wiki/pages/viewpage.action?pageId=1782. This way,
> keywords can be written in uppercase, lowercase, or mixed case; the lexer
> sees them in uppercase, so these rules only work with uppercase words.
>
> Regards!
>
> 2008/11/10 Andrew Haritonkin <thikone at gmail.com>:
>> Hi, Javier Luis Cánovas Izquierdo!
>>
>> You don't need so much memory for my grammar, really :) 256 MB is
>> enough for ANTLR v3.1.1. Well, I actually use 512 MB...
>>
>> You need to change one rule though, to make it compatible with ANTLR 3.1.x:
>>
>> column_spec
>>   :       sql_identifier ( DOT sql_identifier )*;
>>
>> For some reason, ANTLR 3.1.x cannot compile it, raising an error:
>>
>> error(206): PLSQL3.g:791:4: Alternative 2: after matching input such
>> as ID DOT ID DOT ID DOT ID DOT decision cannot predict what comes next
>> due to recursion overflow to expr_add from sql_expression and to
>> expr_mul from expr_add
>>
>> With ANTLR 3.0.1 it compiled just fine... Anyway, replace
>> it with this:
>>
>> column_spec
>>   :       sql_identifier ( DOT sql_identifier ( DOT sql_identifier )? )?;
>>
>> As for the Java target, there is not much you need to change, only
>> the members declaration and some gate predicates:
>>
>> options {
>>       language=Java;
>>       k=*;
>>       backtrack=true;
>>       memoize=true;
>>       output=AST;
>> }
>>
>> @members {
>>   private boolean is_sql = false;
>> }
>>
>> and all parser rules for keywords should look like this:
>>
>> keyA : {PLSQL3Parser.this.input.LT(1).getText().toUpperCase().equals("A")}? ID;
>>
>> Here I have to reference the parser class, because this predicate can
>> also be embedded in the DFA, but there only the token type stream is
>> available, where LT(1) returns a token ID (an integer)... not very
>> convenient. I am going to write a separate topic about this, eventually.
>>
>> I also use Java target, mainly to debug the grammar in ANTLRWorks -
>> works perfectly.
>>
>> Andrew
>>
>
>
>
> --
> Javier Luis Cánovas Izquierdo
> http://zirrus.es
> zirrus at gmail.com
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PLSQL3j.g
Type: application/octet-stream
Size: 44748 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20081112/fd91859f/attachment.obj 