[antlr-interest] Can I target C and Java from one grammar file?
Andy Grove
andy.grove at codefutures.com
Fri Jan 23 08:59:15 PST 2009
Thanks for the feedback and different options presented.
I need something very simple and pragmatic right now so I went with a
preprocessor approach where I use ANTLR comments such as:
//ifdef JAVA
... java version of syntax
//elifdef CPP
... cpp version of syntax
//endif
At least I have everything in a single file and I can easily compare
the Java and C code side by side, even if there is a little duplication.
In case anyone is interested, here is the complete source for the
preprocessor, written in Ruby.
#!/usr/bin/ruby
# Preprocessor for ANTLR grammar files with multiple language targets
# Written by Andy Grove on 23-Jan-2009

def preprocess(filename, userTarget)
  f = File.open(filename)
  # "*" means we are outside any //ifdef block, so lines apply to all targets
  currentTarget = "*"
  f.each_line {|line|
    if line[0,7] == '//ifdef'
      currentTarget = line[7,line.length].strip
    elsif line[0,9] == '//elifdef'
      currentTarget = line[9,line.length].strip
    elsif line[0,7] == '//endif'
      currentTarget = "*"
    else
      # Directive lines are dropped; other lines are printed only if they
      # are common to all targets or belong to the requested target
      if currentTarget=="*" || currentTarget==userTarget
        puts line
      end
    end
  }
  f.close
end

if ARGV.length < 2
  puts "Usage: preprocess filename target"
else
  preprocess(ARGV[0], ARGV[1])
end
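
To make the behaviour easy to try without a grammar file on disk, here is
a minimal sketch of the same filtering logic applied to an in-memory
string. The method name preprocess_string and the sample grammar fragment
are made up for this illustration; they are not part of the script above.

```ruby
# Hypothetical helper: same directive handling as the script above,
# but it takes the grammar text as a string and returns the filtered text.
def preprocess_string(text, user_target)
  current_target = "*"   # "*" means: outside any //ifdef block
  output = []
  text.each_line do |line|
    if line.start_with?("//ifdef")
      current_target = line[7..-1].strip
    elsif line.start_with?("//elifdef")
      current_target = line[9..-1].strip
    elsif line.start_with?("//endif")
      current_target = "*"
    elsif current_target == "*" || current_target == user_target
      output << line
    end
  end
  output.join
end

# Made-up grammar fragment, for illustration only
SAMPLE = <<~GRAMMAR
  grammar Demo;
  //ifdef JAVA
  options { language = Java; }
  //elifdef CPP
  options { language = C; }
  //endif
  start : ID ;
GRAMMAR

puts preprocess_string(SAMPLE, "JAVA")
```

With the target JAVA, only the shared lines and the Java options line
are kept, and the directive comments themselves are dropped. Since the
original script writes to standard output, it would presumably be run
with the output redirected into the grammar file that is handed to ANTLR.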
Thanks,
Andy Grove
Chief Architect
CodeFutures Corporation
On Jan 22, 2009, at 11:57 PM, Johannes Luber wrote:
> Jim Idle wrote:
>>> Johannes Luber wrote:
>>>
>>> I think you misunderstood me. Here is one rule in my grammar:
>>>
>>> collection_initializer
>>> : OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>> -> ^(OPEN_BRACE element_initializer_list ^(OPTIONAL COMMA?)
>>> CLOSE_BRACE)
>>> ;
>>>
>>> A normal parser would maybe need only:
>>>
>>> collection_initializer
>>> : OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>> -> ^(element_initializer_list)
>>> ;
>>>
>>> With a preprocessor one could combine them:
>>>
>>> collection_initializer
>>> : OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>> -> ^(
>>> #ifdef ALL_TOKENS
>>> OPEN_BRACE
>>> #endif
>>>
>>> element_initializer_list
>>>
>>> #ifdef ALL_TOKENS
>>> ^(OPTIONAL COMMA?) CLOSE_BRACE
>>> #endif
>>> )
>>> ;
>>>
>>> A bit ugly, but it gets the job done. Maybe you have another idea to
>>> accomplish this goal?
>>>
>> Well, you should do this with runtime configuration (I show a parameter
>> here but you should use some grammar global config class set externally):
>>
>> collection_initializer[boolean allTokens]
>> : OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>
>> -> {allTokens}? ^(OPEN_BRACE element_initializer_list
>> ^(OPTIONAL COMMA?) CLOSE_BRACE)
>> -> element_initializer_list
>> ;
>
> While runtime configuration is interesting, the problem remains that
> tree grammars have to handle both possible rewrites. Effectively you
> are duplicating parts of the tree. I've had another idea to make the
> syntax more compact:
>
> #define ALL
>
> collection_initializer
> : OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
> -> ^(ALL.OPEN_BRACE element_initializer_list ^(ALL.OPTIONAL
> ALL.COMMA?) ALL.CLOSE_BRACE)
> ;
>
> Rules and tokens marked with "ALL." end up in the generated code only
> if ALL is defined. The only open question is how to treat "^()". Maybe
> it is enough to say that DOWN and UP are included only if the root
> node is included as well.
>>
>> And you probably don't need that COMMA under a root node ;-)
>
> For my special purpose I really do need all tokens, except non-newline
> whitespace, I think. And using OPTIONAL fixes the general tree
> structure, which makes handling the direct children of the root node
> easier.
>
> Johannes
>>
>> But the general point is good.
>>
>> Jim
>
>