[antlr-interest] Very general question, might require a book as an answer

Mon Apr 2 09:48:01 PDT 2007

Thanks Miguel,

You can see how many of your ideas I have already implemented at
http://sourceforge.net/projects/quantum
Check out the CVS repository and look in
/quantum-plugin/src/com/quantum/core/sql/grammar

Regards,

Jan

Miguel Ping wrote:
> Hey there,
>
> Sorry for replying only now; I must confess that until recently I hadn't
> spent much time on the antlr mailing list. I'm trying to develop an
> antlr grammar for sql for some personal stuff, and since I didn't know
> anything about antlr, I'm learning antlr too. So if it's not too late :p
> I'll try to answer your questions based on my short experience.
>
>> 1) Should I be using antlr, or should I stick with the stuff eclipse offers?
> Antlr makes it really easy to build your language tools. I dunno what
> eclipse offers except a jdom toolkit for manipulating Java ASTs...
>> 2) Is antlr the right tool of choice for a project in which each
>> database vendor speaks its own dialect? In other words will the
>> inheritance feature deliver the promise of less coding? And of course,
>> what should be the base grammar of this all? And could we have one lexer?
> My guess is that inheritance will indeed solve many problems. I think
> the base grammar should be a plain ANSI SQL'92 grammar reduced to the
> lowest common denominator of all dialects. As for the lexer, one
> lexer may not suffice because of the special keywords of the various
> dialects. In one dialect you may want to recognize a word as a
> function; in another it may be an identifier...
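
If a single lexer is still wanted, one possible approach is to match every word as a generic identifier and then look it up in a per-dialect keyword table. The sketch below is plain Java, not antlr or Quantum code; the class name `SqlDialect`, the keyword subset and the vendor keyword `LIMIT` are assumptions chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: one classifier, several dialect keyword tables.
class SqlDialect {
    // Keywords assumed to be shared by every dialect (tiny illustrative subset).
    private static final Set<String> BASE_KEYWORDS =
            new HashSet<>(Arrays.asList("SELECT", "FROM", "WHERE"));

    private final Set<String> keywords = new HashSet<>(BASE_KEYWORDS);

    // A dialect adds its own reserved words on top of the base set.
    SqlDialect(String... extraKeywords) {
        for (String k : extraKeywords) {
            keywords.add(k.toUpperCase(Locale.ROOT));
        }
    }

    // Classify a word after it has been matched as a generic identifier.
    String classify(String word) {
        return keywords.contains(word.toUpperCase(Locale.ROOT)) ? "KEYWORD" : "IDENTIFIER";
    }

    public static void main(String[] args) {
        SqlDialect base = new SqlDialect();
        SqlDialect vendor = new SqlDialect("LIMIT"); // assumed vendor-specific keyword
        System.out.println(base.classify("limit"));
        System.out.println(vendor.classify("limit"));
    }
}
```

The same word is classified differently depending on which dialect object is asked, so the dialect-specific part shrinks to a list of extra keywords.
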
>> 3) We would like to support scripts, some dialects have statement
>> separators, others do not. Does this mean I need to write something that
>> separates the statements first, or is there a smarter way?
> Dunno
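
For dialects that do have a separator, a pre-pass that splits the script is one option. The following is only a minimal sketch in plain Java under stated assumptions: the separator is a single character, string literals use single quotes, and comments are not handled. The class name `StatementSplitter` is hypothetical. Dialects without a separator would still need the parser itself to find statement boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split a script on a separator character,
// ignoring separators that appear inside single-quoted string literals.
class StatementSplitter {
    static List<String> split(String script, char separator) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            if (c == '\'') {
                // An escaped quote ('') toggles twice, so the state stays correct.
                inString = !inString;
            }
            if (c == separator && !inString) {
                addIfNotBlank(statements, current);
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // A trailing statement without a separator is kept as well.
        addIfNotBlank(statements, current);
        return statements;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            out.add(s);
        }
    }

    public static void main(String[] args) {
        for (String s : split("SELECT 'a;b' FROM t; DELETE FROM u;", ';')) {
            System.out.println(s);
        }
    }
}
```
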
>> 4) I never used the tree walker classes in antlr. I must admit, I do not
>> understand the value of it yet. However I think they are what I need
>> because after lexing and parsing I want to interpret the results. So far
>> my AST is not a tree but a list. What are the benefits of using tree
>> walkers?
> I haven't written a tree walker, but for some languages I guess it's
> easier to do stuff with structured trees, for instance determining
> scopes, operator precedence, etc.
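
To illustrate what a structured tree gives you over a flat list, here is a rough plain-Java sketch of a walker that collects every table referenced in a statement, including those in a nested subquery. It does not use antlr's tree classes; `Node`, `TableCollector` and the node type names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical AST node: a type tag, the matched text, and child nodes.
class Node {
    final String type;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String type, String text, Node... kids) {
        this.type = type;
        this.text = text;
        children.addAll(Arrays.asList(kids));
    }
}

// Hypothetical walker: visits the tree depth-first and records TABLE nodes.
class TableCollector {
    static List<String> tables(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(Node n, List<String> out) {
        if (n.type.equals("TABLE")) {
            out.add(n.text);
        }
        for (Node child : n.children) {
            walk(child, out); // nesting in the tree mirrors nesting in the query
        }
    }

    public static void main(String[] args) {
        // Roughly: SELECT name FROM person WHERE id IN (SELECT owner FROM pet)
        Node sub = new Node("SELECT", "select",
                new Node("COLUMN", "owner"), new Node("TABLE", "pet"));
        Node root = new Node("SELECT", "select",
                new Node("COLUMN", "name"), new Node("TABLE", "person"),
                new Node("WHERE", "where", sub));
        System.out.println(tables(root));
    }
}
```

With a flat list, the walker would have to rediscover which SELECT each table belongs to; in the tree that relationship is already encoded by the parent and child links.
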
>> 5) Why should I upgrade to v3?
> V3 deals with some problems of V2, both in terms of design and
> internals; it's faster and the algorithm is a lot more powerful.
> In v2 you must specify a fixed lookahead depth k; v3 has an LL(*)
> algorithm that can look ahead arbitrarily far.
>
> Bear in mind that I am an antlr newbie and I am still learning lots of
> stuff, so the answers I provided may not be the correct answers to
> your questions...
>
> On 1/21/07, Jan van der Ven <jhvdven at xs4all.nl> wrote:
>> Dear list,
>>
>> I have been involved in the Quantum DB eclipse plugin for quite some time
>> now. This is a plugin for uniform access to databases that have a JDBC driver
>> (http://sourceforge.net/projects/quantum). I joined the development team
>> out of frustration with queries that contained errors
>> that could have been caught at 'compile' time rather than at 'runtime'.
>> This, imho, means that the syntax of the statement(s) needs to be
>> checked, which means a parser.
>>
>> As you may be aware, SQL comes in many dialects and our plugin aims to
>> offer support for all of them.
>>
>> So, I looked around and found antlr. It seemed well suited to the job as
>> it allows inheritance. As plugin developers we deliver a base and allow
>> others to enhance/specialize that for their specific database.
>>
>> I got started, looked at some .g files, borrowed some constructs and got
>> stuck on a very basic point: all the tables, columns, aliases and so on
>> were recognized as identifiers only. Once the statement was parsed
>> 'successfully', it was not possible to reason about it, as the parser did
>> not deliver the tables involved, the columns belonging to those tables and
>> so on. (I think you would call this an interpreter...)
>>
>> I posted to this list about 'promoting' certain tokens and, thanks to
>> you all, received an answer: promote them by adding an action:
>>
>> <pre>
>> column_alias
>>     :
>>     i:id {#i.setType(COLUMN_IDENTIFIER_ALIAS);}
>>     ;
>> </pre>
>>
>> This way my id gets promoted to a column alias whenever this rule executes.
>>
>> I got this to work, detecting errors even when the syntax was
>> ok (misspelled table and column names, wrong relationships), but it
>> seemed to me that I had not unleashed the complete power of antlr.
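
The kind of check described here (misspelled table and column names) could look roughly like the plain-Java sketch below once identifiers have been promoted to tables and columns. It is not the Quantum implementation; `SchemaChecker`, the schema map and the example table are hypothetical, and a real plugin would presumably fill the schema from JDBC metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: validate promoted identifiers against a known schema.
class SchemaChecker {
    // schema maps a table name to the set of its column names.
    static List<String> check(Map<String, Set<String>> schema,
                              String table, List<String> columns) {
        List<String> errors = new ArrayList<>();
        Set<String> known = schema.get(table);
        if (known == null) {
            // Without a known table the columns cannot be checked.
            errors.add("unknown table: " + table);
            return errors;
        }
        for (String column : columns) {
            if (!known.contains(column)) {
                errors.add("unknown column: " + table + "." + column);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> schema = new HashMap<>();
        schema.put("person", new HashSet<>(Arrays.asList("id", "name")));
        System.out.println(check(schema, "person", Arrays.asList("id", "nmae")));
        System.out.println(check(schema, "persn", Arrays.asList("id")));
    }
}
```
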
>>
>> So, now, finally, I come to the questions that may require a book (or
>> two) to answer:
>>
>> 1) Should I be using antlr, or should I stick with the stuff eclipse offers?
>> 2) Is antlr the right tool of choice for a project in which each
>> database vendor speaks its own dialect? In other words will the
>> inheritance feature deliver the promise of less coding? And of course,
>> what should be the base grammar of this all? And could we have one lexer?
>> 3) We would like to support scripts, some dialects have statement
>> separators, others do not. Does this mean I need to write something that
>> separates the statements first, or is there a smarter way?
>> 4) I never used the tree walker classes in antlr. I must admit, I do not
>> understand the value of it yet. However I think they are what I need
>> because after lexing and parsing I want to interpret the results. So far
>> my AST is not a tree but a list. What are the benefits of using tree
>> walkers?
>> 5) Why should I upgrade to v3?
>>
>> If anyone could find the time to answer even one of my questions, that
>> would be greatly appreciated.
>>
>> Kind regards,
>>
>>
>>
>>
>> Jan