[antlr-interest] Can I target C and Java from one grammar file?

Thu Jan 22 12:27:57 PST 2009

Jim Idle wrote:
> Johannes Luber wrote:
>> Jim Idle wrote:
>>   
>>> Andy Grove wrote:
>>>     
>>>> Hi,
>>>>
>>>> I need to generate C and Java from an ANTLR grammar containing  
>>>> actions. Is there a preprocessor approach I can use rather than  
>>>> maintaining two versions of the grammar?
>>>>   
>>>>       
>>> I use Perforce and maintain a base grammar that has no actions, then 
>>> change only the base grammar. When ready to test I integrate the changes 
>>> via a prestored branch spec. Other SCCS can do the same sort of thing, 
>>> though Perforce is streets ahead of anything else at the merge process.
>>>
>>> However, occasionally it is a pain to debug remotely when I want to just 
>>> use the ANTLRWorks debugger before integrating a change, so I have 
>>> written a pre-processor as an experiment (it is in ANTLR3 of course), 
>>> and am trying to decide between the C# lexer base approach and the VB/C 
>>> approach (albeit not having the stupidity of the VB pre-processor.)
>>>     
>>
>> A preprocessor would let me keep a single grammar for C# that serves
>> both my own needs and the general ones. I've thought about using C#'s
>> preprocessor, but at best that would let me work around the
>> assignments, not any initializations.
>>   
> I was referring to the design pattern rather than the implementation. C#
> has a minimalist approach to what you can do with the pre-processor, VB
> has some weird and wonderful stuff, and the C pre-processor lets you
> hang yourself. So, I personally would not want a pre-processor that you
> can program, as we are writing grammars, not pre-processor macros ;-0

I think you misunderstood me. Here is one rule in my grammar:

collection_initializer
    :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
    -> ^(OPEN_BRACE element_initializer_list ^(OPTIONAL COMMA?) CLOSE_BRACE)
    ;

A normal parser would probably need only:

collection_initializer
    :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
    -> ^(element_initializer_list)
    ;

With a preprocessor one could combine them:

collection_initializer
    :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
    -> ^(
        #ifdef ALL_TOKENS
        OPEN_BRACE
        #endif

        element_initializer_list

        #ifdef ALL_TOKENS
        ^(OPTIONAL COMMA?) CLOSE_BRACE
        #endif
    )
    ;

A bit ugly, but it gets the job done. Maybe you have another idea to
accomplish this goal?
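
For illustration, the #ifdef/#endif filtering on the combined rule can be
sketched in a few lines of Python. This is only a minimal sketch under my own
assumptions, not an existing ANTLR feature: the function name preprocess and
the set of defined symbols are hypothetical, and only #ifdef and #endif are
handled (no #else, no expressions).

```python
def preprocess(text, defined):
    """Drop lines inside '#ifdef NAME' ... '#endif' when NAME is not defined.

    Hypothetical helper: only #ifdef and #endif are understood.
    """
    out = []
    stack = []  # one bool per open #ifdef: is that block active?
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#ifdef"):
            parts = stripped.split()
            if len(parts) != 2:
                raise ValueError("malformed directive: " + stripped)
            stack.append(parts[1] in defined)
        elif stripped == "#endif":
            if not stack:
                raise ValueError("#endif without matching #ifdef")
            stack.pop()
        elif all(stack):  # keep the line only if every enclosing block is active
            out.append(line)
    if stack:
        raise ValueError("unterminated #ifdef")
    return "\n".join(out)


RULE = """\
collection_initializer
    :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
    -> ^(
        #ifdef ALL_TOKENS
        OPEN_BRACE
        #endif
        element_initializer_list
        #ifdef ALL_TOKENS
        ^(OPTIONAL COMMA?) CLOSE_BRACE
        #endif
    )
    ;
"""

if __name__ == "__main__":
    print(preprocess(RULE, {"ALL_TOKENS"}))  # variant that keeps every token
    print(preprocess(RULE, set()))           # variant for a normal parser
```

The output of either call would then be fed to the ANTLR tool as the actual
grammar file.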

> 
...
>>
>> Instead of doing it with macros, how about using:
>>
>> #ifdef ANTLR_3_1_2
>>   
> Same thing. The tool can pre-define anything it likes of course. You
> need the MAJ and MIN etc as well because you sometimes need to say "This
> version or above"
>>> language = template;
>>>
>>> r1 : %r1predicate(x)%?=>   a=INT b=INT c=INT 
>>>     
>>
>> Can you explain this in more detail? I'm not sure how you arrive at
>> those and what the purpose actually is.
>>   
> Actually you would need:
> 
> language=C;
> actions=template-name;
> 
> The idea is just like the runtime use of templates, but it happens at
> code generation time. The code generation for the target is exactly the
> same, but the actions and code-related things such as semantic
> predicates are generated by a string template supplied by the grammar
> author. Then you can have one grammar and different action templates
> that just return code that is passed on to the target language template.

Ah.
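
To check that I understand it, here is how I picture the idea, written as a
small Python sketch rather than anything ANTLR actually does. Everything in it
is an assumption of mine: the per-target template table, the isEnabled calls,
and wrapping the result in {...} so that it becomes a semantic predicate. Only
the %r1predicate(x)% placeholder is taken from your example.

```python
import re

# Hypothetical per-target action templates; {0} is the first placeholder argument.
ACTION_TEMPLATES = {
    "C": {"r1predicate": "isEnabled(ctx, {0})"},
    "Java": {"r1predicate": "this.isEnabled({0})"},
}

# Matches placeholders of the form %name(arg1, arg2)%.
PLACEHOLDER = re.compile(r"%(\w+)\(([^)]*)\)%")


def expand_actions(grammar, target):
    """Replace each %name(args)% with target-specific code wrapped in braces."""
    templates = ACTION_TEMPLATES[target]

    def render(match):
        name, raw_args = match.group(1), match.group(2)
        args = [a.strip() for a in raw_args.split(",")] if raw_args.strip() else []
        return "{" + templates[name].format(*args) + "}"

    return PLACEHOLDER.sub(render, grammar)


RULE = "r1 : %r1predicate(x)%?=> a=INT b=INT c=INT ;"

if __name__ == "__main__":
    for target in ("C", "Java"):
        print(target, "->", expand_actions(RULE, target))
```

So the grammar stays the same and only the template table differs per target.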
>>> For now though, m4 is your best bet (it is in the Java compiler even), 
>>> or perhaps something simple with gawk.
>>>     
>>
>> What is m4? I find only weapon references.
>>   
> Type:
> 
> man m4
> 
> ;-) It is the pre-processor that was used by C compilers of long past.
> 
> http://www.gnu.org/software/m4/

Thanks,

Johannes
> 
> Jim
>