[antlr-interest] target language independent action code

Sun Jan 20 18:22:19 PST 2008

On Sun, 20 Jan 2008 21:14:23 +0100
Arnulf Heller <aheller at gmx.at> wrote:

> hi,
> 
> I know this topic was discussed a couple of times here ...
> 
> But as far as I know there is no solution available right now 
> (possibly apart from Loring Cramers yggdrasil).
> 
> I think target language independent action code would be of great
> help because:
> 
> 1. ANTLR provides a steadily growing foundation of grammars for 
> various languages (which is very cool). Unfortunately it's almost 
> certain that the grammar targets a different language ...
> 2. Action code clutters the readability of the grammar - especially 
> if it's in a target language that you don't know.

Hello Arnulf,

The normal approach to avoid cluttering the grammar is to have just
one line of action code that calls a method in the target
language.
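
A minimal sketch of that approach, assuming a hypothetical helper
class SymbolActions (the class, its methods and the "actions" member
are illustrative names, not part of ANTLR). The grammar keeps a single
call per action and everything else lives in an ordinary source file:

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper class; names are illustrative only.
// In the grammar the action shrinks to one call, for example:
//     declaration : type ID ';' { actions.declare($ID.text); } ;
// where "actions" is assumed to be a SymbolActions instance declared
// in the grammar's @members section.
class SymbolActions {
    private final Set<String> declared = new LinkedHashSet<>();

    // The real work is done here, in a file small enough for a
    // debugger to open and set breakpoints in.
    boolean declare(String name) {
        return declared.add(name); // false if the name was already declared
    }

    boolean isDeclared(String name) {
        return declared.contains(name);
    }
}
```

Porting to another target then means rewriting this helper, while the
grammar only needs its one-line calls adjusted.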

> Because ANTLR changes a lot over time, action code should be embedded 
> into ANTLR directly with "on board" tools.

For a parser for a large language the ANTLR generated parser file
is already too large (2.5MB in my case) for the Netbeans debugger
to open when it only has at most one line in each action.  To
then go and embed the action code would stop me from being able to
debug it at all.  Even if the debugger could handle it, there is
no way I want to go searching through megabytes of generated
parser code looking for the place to set my breakpoint.

> So why not use these wonderful string templates?
> 
> Instead of writing
> 
> { myDict.add($ID.text); }
> 
> one could write for instance
> 
> [ DictAdd(ID) ]
> 
> which ANTLR could translate on the fly to target language code at 
> that position.

In practice the action code for a compiler for a large
language is thousands of lines of code just for entering
information into the symbol table, which for an object-oriented
language is a DAG (Directed Acyclic Graph), and for looking that
information up again.  It would be inconvenient
to develop and debug this using string templates.
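
For a one-line action such as the DictAdd example, the proposed
translation amounts to a per-target lookup table. The sketch below
shows that idea in plain Java. It is a stand-in and does not use the
StringTemplate API; ActionTemplates, define, expand and the <arg>
placeholder are all hypothetical names chosen for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a template group: one table of named
// templates per target language. This is NOT the StringTemplate API.
class ActionTemplates {
    private final Map<String, String> templates = new HashMap<>();

    void define(String name, String body) {
        templates.put(name, body);
    }

    // Expands a call such as DictAdd(ID) by substituting the argument
    // for the <arg> placeholder in the named template.
    String expand(String name, String arg) {
        String body = templates.get(name);
        if (body == null) {
            throw new IllegalArgumentException("no template named " + name);
        }
        return body.replace("<arg>", arg);
    }
}
```

A Java group might define DictAdd as "myDict.add(<arg>.getText());"
and a C++ group as "myDict.insert(<arg>->getText());", so the same
grammar text expands differently per target. Every helper used in the
grammar needs an entry in every group.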

> Then the writer of the grammar needs to provide a string template 
> group (with a template "DictAdd") which performs the translation to 
> "his" target language.
> This way targeting a different language amounts to rewriting the 
> string template group.
> This does not alter the original grammar and will hopefully be
> posted :-)
> 
> The target language folks even could provide a minimal toolset for 
> dictionaries and the like.

A dictionary is insufficient for a symbol table for an
object-oriented language.  It would be impractical for the target language
developers to anticipate the symbol table requirements for
every conceivable language.
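
To illustrate why a flat dictionary falls short, here is a minimal
sketch of a scope that resolves a name locally, then through its
supertypes, then through the enclosing scope. The class Scope and its
methods are hypothetical and far simpler than a real compiler's symbol
table. Shared supertypes are what make the structure a DAG rather
than a tree:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of one symbol-table node; names are illustrative.
class Scope {
    private final Scope enclosing;                           // lexical parent, may be null
    private final List<Scope> supertypes = new ArrayList<>(); // inherited scopes
    private final Map<String, String> members = new HashMap<>(); // name -> declared type

    Scope(Scope enclosing) {
        this.enclosing = enclosing;
    }

    void addSupertype(Scope supertype) {
        supertypes.add(supertype);
    }

    void define(String symbol, String type) {
        members.put(symbol, type);
    }

    // Local members first, then inherited members, then the enclosing scope.
    String resolve(String symbol) {
        String type = resolveInherited(symbol);
        if (type != null) {
            return type;
        }
        return enclosing == null ? null : enclosing.resolve(symbol);
    }

    // Looks in this scope and its supertypes only, never in enclosing scopes.
    private String resolveInherited(String symbol) {
        String type = members.get(symbol);
        if (type != null) {
            return type;
        }
        for (Scope supertype : supertypes) {
            type = supertype.resolveInherited(symbol);
            if (type != null) {
                return type;
            }
        }
        return null;
    }
}
```

Even this toy version encodes lookup rules (the order of supertypes,
inheritance before lexical nesting) that differ from one source
language to the next.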

> If there is a good collection of tools, 
> the action code gets structured, documented and well known by the
> time.
> 
> What do you think?

In my dreams I wish there was some magic way to automatically translate
the Java ANTLR runtime and all my Java action code into C++ sometime
in the future when the C++ runtime is available.

Back in the real world, what I am doing at the moment is developing all
of my Java action code modelled in the freeware UML CASE tool called
BOUML:

http://bouml.free.fr/

Then I hope that some time in the future some other kind-hearted
masochists (not me, sorry, I am already one level of indirection away
from real work) will develop a C++ ANTLR runtime, including the tree
wizard, and a C++ string template.

Thanks, Mark
