[antlr-interest] Re: Seperating Grammar and Actions..

Tue Jan 14 13:30:37 PST 2003

>>>>> "cintyram" == cintyram <cintyram at yahoo com> <cintyram at yahoo.com> writes:

[...]

>> What does the implementation choice of direct, n-ary tree nodes versus
>> the current child-sibling node structure have to do with what you can
>> achieve with tree walkers?

> if a different application requires a different processing , it would
> only be a little less difficult starting with the wrongly structured tree
> than starting with teh original token stream ;

(A) The direct n-ary vs. child-sibling implementation detail has nothing to
do with that issue.

(B) I'm not saying that the parser should be elided or otherwise does not
still need to do (potentially) a good bit of work to create good, useful
initial ASTs and what not.  I am saying that that is basically all the
parser should do.

(C) Good ASTs make things explicit and regular.  Doing a good job of that
in creating the initial AST in the parser makes for fewer clean up
treewalkers.  Where to draw the line depends on a lot of factors.

> i wanted to say that even the initial AST and auxilliary structures if
> any are not written in stone for all possible applications that can be
> based on the same parser/lexer

Of course! That's one of the main points that started this thread --
separate the actions from the grammar itself.

I've never said that we should only develop a single, fixed AST from a
parser.  Of course that depends on what your needs are.  I will, however,
say that for a given source language, developing a good initial AST will
provide you with the flexibility to do almost everything.

> etc .. lets look at it like this : Let us assign the rank 0 to Lexer , 1
> to Parser , 2 to any immediately subsequent phase [like tree walker or
> tree Maker] 3 for a first pass over the tree ..

> assume there are many such stages upto rank say 15 or more ..

Been there, don't that. :-)

> if say rank 5 op[operation] is only to act as a bridge between the tree
> output by rank4 op, and a different format expected by rank6 op , and it
> is possible to write one simple processing unit which does this without
> making a tree which is any way not used .. but directly by making a tree
> that is expected by rank6 op from the input to the rank4 op ; we should
> have an easy framework where we can do this ;

You should play around more with the capabilities of treewalkers.  You can
do what you're talking about now.

IME, in building systems with many treewalker phases, most of the phases
end up using the same AST structure.  Many peephole and local optimizations
will simplify the actual tree but the tree grammar hasn't/doesn't, a
priori, need to be changed for every phase.

In fact, having a number of e.g., optimization phases that all use the same
grammar and auxiliary structures is actually a boon since some types of
optimizations interact with others in various ways so being able to
re-order the phases and/or run particular phases more than once can be a
good thing.

That leads back to a basic guideline that each phase should have one
primary job and do that well.  [I think one of the problems that so many
parser generators have with complexity is that they are trying to do (or
allow) too damn many things at the same time.  Remember, pipelines are your
friends. :-)]

> if we "brack project" this idea to the initial stages ie rank 0 and 1, we
> find that generalisation is only possible to some extent , but by being
> able to recognize and express that possibility we can have a toolset
> which is more suitable for a case where the higher ranked ops , can be
> replaced to achieve a different target/application ;

I must be missing something because, as I understand what you're trying to
say, you can already do that.  Perhaps it would help if you had or can
create some additional, simple examples to illustrate what you're talking
about.

> for example if say we start with a java compiler with 5 ranks , now if we
> replace say rank 4 and rank 5 ops with some other operations may be three
> new processing units, and end up with a code analyzer, which just detects
> the pattern of use of white space .. etc .. or looks for some pattern of
> use of comments [ just as an example ] if we had started out with a
> different ir , which makes it easier to access the white space and
> comments rather than the actual code, may be we end up with a faster app
> ;

You can already do that.

> another example is if we want regular expression based syntax highliting
> in the source editor, which can also act as compiler's front end etc ..we
> want two sets of output one which can be used by teh renderer and the
> other for the compiler.  this might be a bad example but now im not able
> to come with anything better ;

No, that's fine.  You can do that now too.  However, hooking it into a IDE
to try to do that as you're typing doesn't fit well into ANTLR's current
way of processing (-- which is pull-based streams at the lexer level).

However, if you want to talk about building better code editors, that's a
different topic.  :-)

Take care,
	John

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/