[antlr-interest] Very general question, might require a book as an answer
Jan van der Ven
jhvdven at xs4all.nl
Sun Jan 21 14:27:50 PST 2007
Dear list,
I have been involved in the Quantum DB Eclipse plugin for quite some time now.
This is a plugin for uniform access to databases that have a JDBC driver
(http://sourceforge.net/projects/quantum). I joined the development team
out of frustration with queries whose errors could have been caught at
'compile' time rather than at 'runtime'.
This, imho, means that the syntax of the statement(s) needs to be
checked, which means a parser.
As you may be aware, SQL comes in many dialects and our plugin aims to
offer support for all of them.
So, I looked around and found ANTLR. It seemed well suited to the job as
it allows grammar inheritance. As plugin developers we deliver a base and
allow others to enhance/specialize that for their specific database.
I got started, looked at some .g files, borrowed some constructs and got
stuck on a very basic point: all the tables, columns, aliases and so on
were recognized as identifiers only. Once the statement was parsed
'successfully', it was not possible to reason about it, as the parser did
not deliver the tables involved, the columns belonging to those tables and
so on. (I think you would call this an interpreter...)
I posted to this list about 'promoting' certain tokens and, of course
(thank you all), received an answer: promote them by adding an action:
<pre>
column_alias
    :   i:id {#i.setType(COLUMN_IDENTIFIER_ALIAS);}
    ;
</pre>
This way my id gets promoted to a column alias whenever this rule executes.
I got this to work, catching errors even when the syntax was valid
(misspelled table and column names, wrong relationships), but it
seemed to me that I had not unleashed the full power of ANTLR.
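To make the kind of check described above concrete, here is a minimal
sketch in plain Java. It is not Quantum's code and uses no ANTLR classes:
the names SemanticCheckSketch, Kind and Ref are hypothetical, Ref stands in
for a token after promotion, and the hand-built schema map stands in for
whatever metadata the JDBC driver would actually report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from Quantum or ANTLR.
class SemanticCheckSketch {

    // Token kinds after "promotion"; an unpromoted identifier stays ID.
    enum Kind { ID, TABLE, COLUMN }

    // A promoted token: its kind, its text and, for columns, the owning table.
    static final class Ref {
        final Kind kind;
        final String text;
        final String table;

        Ref(Kind kind, String text, String table) {
            this.kind = kind;
            this.text = text;
            this.table = table;
        }
    }

    // Returns one message per table or column that is missing from the schema.
    static List<String> check(List<Ref> refs, Map<String, Set<String>> schema) {
        List<String> errors = new ArrayList<>();
        for (Ref r : refs) {
            if (r.kind == Kind.TABLE && !schema.containsKey(r.text)) {
                errors.add("unknown table: " + r.text);
            } else if (r.kind == Kind.COLUMN) {
                Set<String> columns = schema.get(r.table);
                if (columns == null || !columns.contains(r.text)) {
                    errors.add("unknown column: " + r.table + "." + r.text);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Assumed schema: one table "customer" with columns "id" and "name".
        Map<String, Set<String>> schema = new HashMap<>();
        schema.put("customer", new HashSet<>(Arrays.asList("id", "name")));

        List<Ref> refs = Arrays.asList(
            new Ref(Kind.TABLE, "customer", null),
            new Ref(Kind.COLUMN, "name", "customer"),
            new Ref(Kind.COLUMN, "nmae", "customer")); // misspelled on purpose

        System.out.println(check(refs, schema));
        // prints [unknown column: customer.nmae]
    }
}
```

The statement here is syntactically fine; the misspelled column is only
caught because the promoted tokens carry enough type information to be
looked up.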
So, now, finally, I come to the questions that may require a book (or
2) to answer:
1) Should I be using ANTLR, or should I stick with the stuff Eclipse offers?
2) Is ANTLR the right tool for a project in which each
database vendor speaks its own dialect? In other words, will the
inheritance feature deliver on the promise of less coding? And of course,
what should be the base grammar for all of this? And could we have one lexer?
3) We would like to support scripts. Some dialects have statement
separators, others do not. Does this mean I need to write something that
separates the statements first, or is there a smarter way?
4) I have never used the tree walker classes in ANTLR. I must admit, I do
not understand their value yet. However, I think they are what I need
because after lexing and parsing I want to interpret the results. So far
my AST is not a tree but a list. What are the benefits of using tree
walkers?
5) Why should I upgrade to v3?
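To make question 3 concrete, the "separate the statements first" option
might look roughly like the sketch below. It is only an illustration under
stated assumptions: the class name StatementSplitterSketch is hypothetical,
the separator is a single character such as ';', string literals are
single-quoted with '' as the escaped quote, and comments and
dialect-specific quoting are ignored. It obviously does not help for
dialects without a separator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-pass, not part of Quantum or ANTLR.
class StatementSplitterSketch {

    // Splits a script on the separator, ignoring separators that occur
    // inside single-quoted string literals. A doubled quote ('') inside a
    // literal toggles the flag twice, so the scanner stays inside it.
    static List<String> split(String script, char separator) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < script.length(); i++) {
            char c = script.charAt(i);
            if (c == '\'') {
                inString = !inString;
            }
            if (c == separator && !inString) {
                addIfNotBlank(statements, current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // Keep a trailing statement that has no closing separator.
        addIfNotBlank(statements, current.toString());
        return statements;
    }

    private static void addIfNotBlank(List<String> out, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            out.add(trimmed);
        }
    }

    public static void main(String[] args) {
        String script = "SELECT 'a;b' FROM t; DELETE FROM t WHERE x = 'it''s';";
        for (String statement : split(script, ';')) {
            System.out.println(statement);
        }
    }
}
```

Each resulting string could then be handed to the parser on its own, at the
cost of duplicating some lexical knowledge (quoting rules) outside the
grammar.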
If anyone could find the time to answer even one of my questions, that
would be greatly appreciated.
Kind regards,
Jan