[antlr-interest] OO design question
Terence Parr
parrt at jguru.com
Thu Jun 13 14:09:29 PDT 2002
Folks,
Let's imagine an optimal world where I'm contemplating building a
simpler internal structure for ANTLR. Please suspend disbelief for the
duration of this email. :) ;) :D
Ok, so ANTLR works like this:
1. preprocess the grammar, expanding any inheritance relationships.
2. parse the grammar and build an internal graph/network representation;
the graph looks like a syntax diagram you would build to describe a
grammar. A token manager tracks token type definitions and is
associated with the graph.
3. try to generate code, which drives the grammar analysis engine. In
other words, when I try to generate a decision in the output parser, I
on-demand ask the analyzer to compute the lookahead sets and then use
that to generate decision code. This is the same principle for html
output, text output, java, C++, whatever.
In the current implementation this stuff is so hopelessly interwined
that it's pretty hard to separate what is going on in your head.
Changes in one area may have broad effects on others.
Optimally, ANTLR would be re-engineered to separate the various phases
as much as possible for ease of specification, ease of debugging, and
ease picking and choosing what phases you wanted to make a cool tool.
For example, you might want to generate a pretty printer. You could
take the front-end that generated the graph and hook a tool on the back
that just dumps the grammar, ignoring code gen and analysis etc... You
could even build an interpreted version of antlr from just the front end
and the analysis phase, ignoring the code gen.
I imagine that you'd have
1. front end generates grammar graph and vocabulary manager; might even
have a persistence text format so you don't even need to know antlr's
internal data structure format--you could use your own or even use the
text format to connect it to a tool written in another language.
2. analyzer, computes approximate pred-LL(k) lookahead information at
every decision point in the grammar. Again, you could have a text
output format for debugging purposes or to use as an interface to
completely separate tool. This would no longer be driven by the code
generator, though the code generator could ask for the information
naturally.
3. various back ends such as Java, C++, html, etc... I have lots more
to say about this, but I need to firm up my "output grammar" ideas more
first.
My specific question at the moment relates to what you might call
aspect-oriented programming vs oo programming. Consider the data
structure representing your grammar (i.e., the graph). It's very neatly
organized as a set of GrammarElement subclasses that include subrules
and atoms like token-refs. Now, how do you perform analysis on this
graph? Conceptually you would like to keep the analyzer separate (so
somebody could build a different one and for good programming practice
reasons). I would call this an aspect not an object encapsulation
idea. Anyway, currently I have added a method to each GrammarElement
called lookahead() or some such that is supposed to compute lookahead
for that type of element. Works great except for the fact that now my
analyzer is not a big chunk (an aspect of ANTLR) it is completely
interwined with the graph data structure. The reason of course is that
it is really nice to ask a node for it's lookahead. In order to
separate out the analyzer, you'd have ask the analyzer to look at a
node: Analyzer.lookahead(GrammarElement). But, you'd lose polymorphism
and have to have a switch in lookahead that asked what kind of node it
was. Ick.
So! How do you get a good separable chunk called an Analyzer without
making the implementation fully of crappy switch-statements? I.e., how
do I use aspect programming w/o losing the polymorphism convenience?!
Anybody wanna lend me some smarts?
Thanks,
Ter
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
More information about the antlr-interest
mailing list