[antlr-interest] Tweaking a source file.

Tue Oct 17 20:32:02 PDT 2006

I'm new to ANTLR and although I'm generally a quick study when it comes to
languages, I could use a little help because I'm on a hard deadline.

I want to write a preprocessor for a simplified version of the boo language
(boo resembles Python). The preprocessor will process "unit" declarations
for a unit checker. For example, you could write code like this:

def Energy(mass as double `kg`, speed as double `m/s`):
    return 0.5*mass*Square(speed)

def Square(x as double):
    return x*x

Then, the unit checker could automatically deduce that the return value of
Energy() is measured in kg*m^2/s^2. It also has to spit out a new source
file without the unit information:

def Energy(mass as double, speed as double):
    return 0.5*mass*Square(speed)
....
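
To make the deduction concrete, here is a rough sketch of the unit arithmetic
in Python rather than boo. The dictionary representation and the helper names
(mul, show) are my own assumptions for illustration, not part of boo or ANTLR.
It also assumes the checker treats Square() as taking on the unit of its
argument:

```python
# A unit is a dict mapping a base unit to its integer exponent,
# e.g. m/s is {"m": 1, "s": -1}. This representation is an assumption.

def mul(a, b):
    """Multiply two units by adding the exponents of each base unit."""
    out = dict(a)
    for base, exp in b.items():
        out[base] = out.get(base, 0) + exp
    # Drop base units that cancelled out.
    return {base: exp for base, exp in out.items() if exp != 0}

def show(unit):
    """Format a unit as numerator/denominator, e.g. kg*m^2/s^2."""
    def fmt(base, exp):
        return base if exp == 1 else f"{base}^{exp}"
    items = sorted(unit.items())
    num = "*".join(fmt(b, e) for b, e in items if e > 0) or "1"
    den = "*".join(fmt(b, -e) for b, e in items if e < 0)
    return f"{num}/{den}" if den else num

DIMENSIONLESS = {}               # the literal 0.5 carries no unit
KG = {"kg": 1}                   # `kg`
M_PER_S = {"m": 1, "s": -1}      # `m/s`

# Square(speed) is x*x, so its unit is the argument's unit multiplied by itself.
square_unit = mul(M_PER_S, M_PER_S)

# 0.5 * mass * Square(speed)
energy_unit = mul(mul(DIMENSIONLESS, KG), square_unit)
print(show(energy_unit))  # kg*m^2/s^2
```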

So, the unit checker needs to:

1. Strip out unit information such as `m/s`, to produce a new source file.
2. Create an AST, in order to do semantic analysis (i.e. unit checking).
3. Process newlines and indentation as part of the syntax.
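
Step 1 on its own can be approximated without a parser. The following is a
minimal sketch in Python, not ANTLR. The function name strip_units is
hypothetical, and the sketch assumes backticks appear only around unit
annotations (it would also strip backticks inside string literals or
comments, which a real lexer would have to avoid):

```python
import re

# A unit annotation: optional leading spaces/tabs, then text in backticks
# on a single line. Assumes backticks are used for nothing else.
UNIT = re.compile(r"[ \t]*`([^`\n]*)`")

def strip_units(source):
    """Return (source without unit annotations, list of unit strings found)."""
    units = [match.group(1) for match in UNIT.finditer(source)]
    return UNIT.sub("", source), units

source = (
    "def Energy(mass as double `kg`, speed as double `m/s`):\n"
    "    return 0.5*mass*Square(speed)\n"
)
stripped, units = strip_units(source)
print(stripped, end="")
print(units)
```

All other whitespace, including newlines and indentation, is left untouched,
which is the property the rewritten source file needs.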

The first decision I need to make is, should I use ANTLR 2 or ANTLR 3? The
readme for ANTLR 3 says "For example, to read in some input, tweak it, and
write it back out preserving whitespace, is easy in v3." That sounds great
because it is exactly what I need to accomplish.  But how can it be done?
Does an example exist? For my application, I think an observer inserted
between the lexer and parser can do the job; I just don't know the details.

The second big question is, what is the easiest way to make an AST, and is
this task also easier in v3?

The third question is, how can one parse a language where indentation is
syntactically significant?
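
One common approach, which is how Python's own tokenizer handles it, is to
keep a stack of indentation widths and emit synthetic INDENT and DEDENT
tokens that the parser can treat like braces. Below is a rough sketch in
Python, not an ANTLR grammar. The function name layout_tokens and the
(kind, text) token shape are assumptions for illustration, and it assumes
indentation uses spaces only:

```python
def layout_tokens(source):
    """Turn indentation into INDENT/DEDENT tokens around LINE tokens."""
    stack = [0]          # indentation widths of the currently open blocks
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue     # blank lines do not affect block structure
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            # Deeper than the enclosing block: open a new block.
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            # Shallower: close blocks until the widths match again.
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise ValueError(f"inconsistent dedent: {line!r}")
        tokens.append(("LINE", line.strip()))
    # Close any blocks still open at end of input.
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens

source = (
    "def Energy(mass as double, speed as double):\n"
    "    return 0.5*mass*Square(speed)\n"
    "\n"
    "def Square(x as double):\n"
    "    return x*x\n"
)
for kind, text in layout_tokens(source):
    print(kind, text)
```

In an ANTLR setting the same bookkeeping would presumably live in the lexer
or in a filter between the lexer and the parser.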

-- 
- David
http://qwertie.net