[antlr-interest] Re: Preprocessors - academic question

Fri Jun 28 11:12:53 PDT 2002

--On 28/06/2002 10:05 AM -0700 mzukowski at yci.com wrote:

> That's a tough problem.  You obviously can't analyze all the possible sets
> of preprocessor conditions.

Well, let's qualify that. I think that analyzing all possible sets of
preprocessor conditions is NP-complete. I think it's possible to graph all
possible sets, but as with the Traveling Salesman Problem, it doesn't take
much before analyzing every set becomes impractical. And every set would
have to be analyzed to prove that a particular transformation doesn't break
anything.
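
To put a rough number on "it doesn't take much": if we assume n
preprocessor symbols that are independent and are each either defined or
undefined, there are 2^n sets to check. Here is a minimal Python sketch of
that count. The symbol names are made up, and real conditions are rarely
this independent, so treat it as an upper-bound illustration only.

```python
from itertools import product


def configurations(symbols):
    """Yield every defined/undefined assignment for the given symbols."""
    for values in product((False, True), repeat=len(symbols)):
        yield dict(zip(symbols, values))


def count_configurations(n):
    """Number of assignments for n independent on/off symbols."""
    return 2 ** n


# Hypothetical symbol names, purely for illustration.
for config in configurations(["DEBUG", "UNIX", "GUI"]):
    print(config)

print(count_configurations(20))
```

Three symbols are easy to enumerate; a few dozen are not.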

>  I don't think there is a general solution,
> but try to think about what you would do if you were doing it by hand, but
> analyzing the fully processed code for one specific set of preprocessor
> variables.  As you change nodes, you should be able to trace it back to
> the unpreprocessed code.  Things could get ugly when you have to split a
> preprocessor directive in two so that the change only affects the code you
> want it to, and not all code everywhere.  Managing this for multiple
> configurations of preprocessor variables would be tough too.

Exactly - and it *is* as ugly as it sounds. And Progress's preprocessor
lets you get away with murder.

> Most people hopefully don't use the preprocessor in a brain dead way.

Nice try! The Progress 4GL has had a *lot* of shortcomings in the past, and
still has quite a few. It's not even properly object oriented yet. So, of
course, the preprocessor is the hack which allows programmers to sort-of
work with the language the way they want to work with it. For example,
include files and PATH settings were hacked to make up for the lack of
any language support for inheritance and polymorphism. To make matters
worse, the 4GL programmers are typically business analysts who have no idea
what a mess they are making when they use preprocessing willy-nilly.

> What kind of transformations are you doing?  Maybe you could pick a tough
> example and we could work it through.

Ideally, we wanted to be able to do radical transformations on the code.
For example, we wanted to be able to take old code which mixed UI and
database access, and automatically separate the two so that the code could
more easily be put into an n-tier architecture (application
servers, etc).

We still think that's possible, but with this big limitation: the
preprocessing junk is lost from the source. That will be unacceptable for
most potential projects, but it might be acceptable for the odd thing.

On the other hand, there's a completely different class of transformation
that people are looking for: token twiddling. That will be comparatively
easy. We'll relate nodes in the parse tree back to an unprocessed token
list (whitespace and all). The parse tree can be used for the analysis; the
token list is the place where the token twiddling is done. Between the
parse tree and the token list, it will be obvious if any desired twiddling
crosses preprocessing boundaries, and in that case the twiddling cannot be
done automatically - it will be reported and done by hand. Once the
twiddling is done, the token list is written back out to source.
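
A rough Python sketch of that token twiddling idea, not our actual
implementation. The Token and Node classes, the "region" field (0 for plain
source, anything else for text that belongs to a preprocessing construct),
and the {&VERB} reference are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    text: str        # exact source text, whitespace included
    region: int = 0  # assumption: 0 = plain source, >0 = inside preprocessing


@dataclass
class Node:
    name: str
    first: int  # index of the first token this node covers
    last: int   # index of the last token this node covers (inclusive)


def crosses_boundary(tokens: List[Token], node: Node) -> bool:
    """True if the node's tokens do not all share one region."""
    regions = {t.region for t in tokens[node.first:node.last + 1]}
    return len(regions) > 1


def twiddle(tokens: List[Token], node: Node, index: int, new_text: str,
            report: List[str]) -> bool:
    """Rewrite one token under `node`, or report it for hand editing."""
    if not node.first <= index <= node.last:
        raise IndexError("token index is outside the node")
    if crosses_boundary(tokens, node):
        report.append(f"{node.name} at tokens {node.first}-{node.last}: do by hand")
        return False
    tokens[index].text = new_text
    return True


def write_back(tokens: List[Token]) -> str:
    """Join the token texts to get the source back, whitespace and all."""
    return "".join(t.text for t in tokens)


tokens = [
    Token("find"), Token(" "), Token("customer"), Token("."), Token("\n"),
    Token("{&VERB}", region=1), Token(" "), Token("order"), Token("."), Token("\n"),
]
report: List[str] = []
twiddle(tokens, Node("statement 1", 0, 3), 0, "FIND", report)   # applied
twiddle(tokens, Node("statement 2", 5, 8), 7, "ORDER", report)  # reported only
print(write_back(tokens), end="")
print(report)
```

The first statement sits entirely in plain source, so its keyword is
rewritten in place. The second one starts with a preprocessor reference, so
nothing is changed and it lands in the report for someone to do by hand.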

John
www.joanju.com
john @ joanju dot com
