[antlr-interest] target language independent action code

Wed Jan 23 11:22:20 PST 2008

Hi!

On Jan 21, 2008, at 11:20 PM, Johannes Luber wrote:

>> Thats because Terence already did the work in these cases (and I  
>> assume that there are hidden string templates that translate that).  
>> ANTLR translates the $variable tokens appropriately.
>
> Ter created the example and someone else than me translated it to  
> CSharp. So I don't know how much had to be changed there. In any  
> case, the $-variables are parsed by ANTLR itself and don't need  
> StringTemplate per se. But the actual output will use ST again.

Just for clarification, it's best to treat $variables in actions as
opaque. A filtering lexer goes through every action looking for the
variables and then uses certain StringTemplates from the respective
Target.stg to translate them into target code.
So ST is in effect used for them, but only because code generation in
ANTLR is based on ST; the variables are not templates themselves.
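
To make that filtering step concrete, here is a toy sketch in Python.
It is not ANTLR's actual implementation: the names TEMPLATES and
translate_action and the template strings are made up for
illustration, and the real templates in each target's .stg file look
different. It only handles references of the form $label.text.

```python
import re

# Hypothetical per-target templates for a "$label.text" reference.
# The real templates live in each target's .stg file and look different.
TEMPLATES = {
    "Java": "{label}.getText()",
    "CSharp": "{label}.Text",
}

# Matches references such as $id.text inside an action body.
ATTR_REF = re.compile(r"\$(\w+)\.text\b")


def translate_action(action, target):
    """Replace every $label.text in the action; leave all other text alone."""
    template = TEMPLATES[target]
    return ATTR_REF.sub(lambda m: template.format(label=m.group(1)), action)


action = "System.out.println($id.text);"
print(translate_action(action, "Java"))    # System.out.println(id.getText());
print(translate_action(action, "CSharp"))  # System.out.println(id.Text);
```

Everything that is not a $variable passes through untouched, which is
why the rest of the action stays tied to one target language.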

I don't think it is wise to try to abstract action code any further
than that: it would either put a heavy burden on the code generation
side or be painful to write (because you'd effectively be inventing a
new programming language for actions, one that could easily be
translated into any other language, or at least into those ANTLR
targets).
One beauty of ANTLR is that it integrates nicely with target-specific
code, so you can call any old API within your actions. Apart from the
"helper variables" ($token et al.), ANTLR doesn't care what you put in
them.
The variables are there to protect the programmer from changes in the
ANTLR implementation. Just imagine you had to write code like
_id.text all the time. It would surely break when you renamed the label
in the grammar, and you would not get a warning from ANTLR, only from
your compiler, saying that there are a gazillion references to an
undeclared variable _id.
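
Because the tool sees every $reference, it can compare them with the
labels a rule actually declares. A toy sketch of such a check in
Python follows; the function name undefined_labels is hypothetical and
this is not ANTLR's own code, just the idea.

```python
import re

# Matches references such as $id or $id.text inside an action body.
LABEL_REF = re.compile(r"\$(\w+)")


def undefined_labels(action, declared):
    """Return the labels referenced in the action that the rule does not declare."""
    referenced = {m.group(1) for m in LABEL_REF.finditer(action)}
    return sorted(referenced - set(declared))


# The action still refers to the old label "id".
action = "names.add($id.text);"
print(undefined_labels(action, {"name"}))  # ['id']  (label was renamed)
print(undefined_labels(action, {"id"}))    # []      (label still exists)
```

With a hard-coded _id.text there is no $ marker to find, so a check
like this never runs and the stale reference only surfaces in the
target compiler.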

That said, we are aware of the pains of cross-language grammars, and
I'm thinking about ways to help with a solution. Sadly, I'm far from
ready to announce anything yet.
One common use case: you find a grammar in the antlr.org grammar list
and use it, someone publishes a bug fix for that grammar, and you are
left with your own copy and the need to merge. In most cases you have
probably modified the grammar heavily already, even if only with
custom actions. Painful.

As for code size: I've long been a fan of having the DFA classes
outside of the generated file, i.e. either in one DFA file or in
separate ones. I guess providing an option like that
(-XmultipleDFAfiles or somesuch) could help in certain situations. Once
upon a time the DFAs were in separate files, IIRC, but that has
changed. I don't remember the reasons, but the code generation classes
in ANTLR's core would need changing to support multiple output files.
So don't expect it to be done quickly (unless, of course, you want to
volunteer and present your solution ;)). We might even convince Ter of
the value :P

cheers,
-k
-- 
Kay Röpke
http://classdump.org/