[antlr-interest] target language independent action code

Sun Jan 20 19:42:38 PST 2008

Mark--

Oh good--someone with a very large parser.  How much of the generated code is DFA definitions?  That is, if you split the file where the DFA classes start appearing, how big are the two pieces?  From the cases I have seen, DFA classes grow non-linearly with the size of the grammar.  For a generated file to be this large, my guess is that most of the code is DFA definitions that could be generated in separate files in a dfa directory to end up with manageable file sizes.
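
A rough way to get those two numbers is sketched below. This is only a sketch under assumptions: the class name DfaSplit is made up, and the marker pattern assumes the Java target emits static tables named like DFA12_eotS followed by "class DFA12 extends DFA". Adjust the pattern if your generated file looks different.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: measures a generated parser file on either side of its first DFA definition. */
final class DfaSplit {
    // Assumed marker: the first line holding a "DFA<n>_xxx =" table
    // or a "class DFA<n> extends DFA" declaration.
    private static final Pattern DFA_START = Pattern.compile(
            "^.*\\b(?:class\\s+DFA\\d+\\s+extends\\s+DFA|DFA\\d+_\\w+\\s*=).*$",
            Pattern.MULTILINE);

    /** Returns {bytes before the first DFA line, bytes from that line on}; the second is 0 if no marker is found. */
    static long[] split(String source) {
        Matcher m = DFA_START.matcher(source);
        int cut = m.find() ? m.start() : source.length();
        long head = source.substring(0, cut).getBytes(StandardCharsets.UTF_8).length;
        long tail = source.substring(cut).getBytes(StandardCharsets.UTF_8).length;
        return new long[] { head, tail };
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the generated parser source file.
        String source = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        long[] parts = split(source);
        System.out.printf("before DFAs: %d bytes, DFA section: %d bytes%n", parts[0], parts[1]);
    }
}
```

Run it with the path to the generated parser as the only argument. If the second number dominates, the DFA definitions are what makes the file so large.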

--Loring

----- Original Message ----
> From: Mark Wright <markwright at internode.on.net>
> To: antlr-interest at antlr.org
> Sent: Sunday, January 20, 2008 6:22:19 PM
> Subject: Re: [antlr-interest] target language independent action code
> 
> On Sun, 20 Jan 2008 21:14:23 +0100
> Arnulf Heller wrote:
> 
> > hi,
> > 
> > I know this topic was discussed a couple of times here ...
> > 
> > But as far as I know there is no solution available right now 
> > (possibly apart from Loring Cramer's yggdrasil).
> > 
> > I think target language independent action code would be of great
> > help because:
> > 
> > 1. ANTLR provides a steadily growing foundation of grammars for 
> > various languages (which is very cool). Unfortunately it's almost 
> > certain that the grammar targets a different language ...
> > 2. Action code clutters the readability of the grammar - especially 
> > if it's in a target language that you don't know.
> 
> Hello Arnulf,
> 
> The normal approach to avoid cluttering the grammar is to just
> have one line of action code that calls a method in the target
> language.
>  
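
A minimal sketch of that delegation approach, for illustration only. ActionHelper and its methods are hypothetical names, not part of ANTLR; the grammar fragment in the comment assumes ANTLR 3 attribute syntax.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Hypothetical helper that grammar actions delegate to. The action in the
 * grammar stays a one-liner, for example:
 *
 *     decl : 'var' ID ';' { helper.declare($ID.text); } ;
 *
 * All real logic lives here, in an ordinary source file that is easy to
 * open and set breakpoints in.
 */
final class ActionHelper {
    private final Set<String> declared = new LinkedHashSet<>();

    /** Records a name; returns false if it was already declared. */
    boolean declare(String name) {
        return declared.add(name);
    }

    /** Reports whether the name has been declared so far. */
    boolean isDeclared(String name) {
        return declared.contains(name);
    }
}
```

Breakpoints then go in the helper class and not in the generated parser.
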
> > Because ANTLR changes a lot over time, action code should be embedded
> > into ANTLR directly with "on board" tools.
> 
> For a parser for a large language the ANTLR generated parser file
> is already too large (2.5MB in my case) for the Netbeans debugger
> to open when it only has at most one line in each action.  To
> then go and embed the action code would stop me from being able to
> debug it at all.  Even if the debugger could handle it, there is
> no way I want to go searching through megabytes of generated
> parser code looking for the place to set my breakpoint.
> 
> > So why not use these wonderful string templates?
> > 
> > Instead of writing
> > 
> > { myDict.add($ID.text); }
> > 
> > one could write for instance
> > 
> > [ DictAdd(ID) ]
> > 
> > which ANTLR could translate on the fly to target language code at 
> > that position.
> 
> In practice the action code for a compiler for a large
> language is thousands of lines of code just for entering the
> information into a symbol table, which for an object orientated
> language is a DAG (Directed Acyclic Graph), for looking up
> information in the symbol table, etc.  It would be inconvenient
> to develop and debug this using string templates.
>  
> > Then the writer of the grammar needs to provide a string template 
> > group (with a template "DictAdd") which performs the translation to 
> > "his" target language.
> > This way targetting a different language amounts to rewriting the 
> > string template group.
> > This does not alter the original grammar and will hopefully be
> > posted :-)
> > 
> > The target language folks even could provide a minimal toolset for 
> > dictionaries and the like.
> 
> A dictionary is insufficient for a symbol table for an object
> orientated language.  It would be impractical for the target language
> developers to anticipate the symbol table language requirements for
> every conceivable language.
> 
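
A small sketch of why a flat dictionary falls short: with inheritance, each scope may have several parents, so lookup walks a graph. Scope and its methods are hypothetical, and the sketch assumes the parent links contain no cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical scope node; parent links (supertypes, enclosing scopes) form a DAG. */
final class Scope {
    private final String name;
    private final List<Scope> parents = new ArrayList<>();
    private final Map<String, String> symbols = new HashMap<>();

    Scope(String name, Scope... parents) {
        this.name = name;
        this.parents.addAll(Arrays.asList(parents));
    }

    /** Enters a symbol and its type into this scope only. */
    void define(String id, String type) {
        symbols.put(id, type);
    }

    /** Looks in this scope first, then in each parent in declaration order; null if absent. */
    String resolve(String id) {
        String type = symbols.get(id);
        if (type != null) {
            return type;
        }
        for (Scope parent : parents) {
            type = parent.resolve(id);
            if (type != null) {
                return type;
            }
        }
        return null;
    }

    String name() {
        return name;
    }
}
```

A class scope created as new Scope("Foo", objectScope, comparableScope) then resolves members inherited from either parent, which a single dictionary cannot express.
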
> > If there is a good collection of tools, 
> > the action code gets structured, documented and well known over
> > time.
> > 
> > What do you think?
> 
> In my dreams I wish there was some magic way to automatically translate
> the Java ANTLR runtime and all my Java action code into C++ sometime
> in the future when the C++ runtime is available.
> 
> Back in the real world, what I am doing at the moment is developing
> all of my Java action code as a model in the freeware UML CASE tool
> called BOUML:
> 
> http://bouml.free.fr/
> 
> Then I hope that some time in the future some other kind-hearted
> masochists (not me, sorry, I am already one level of indirection away
> from real work) will develop a C++ ANTLR runtime, including the tree
> wizard, and a C++ string template.
> 
> Thanks, Mark
> 
> -- 
> 
