[antlr-interest] Can I target C and Java from one grammar file?

Johannes Luber jaluber at gmx.de
Thu Jan 22 22:57:28 PST 2009


Jim Idle wrote:
>> Johannes Luber wrote:
>>
>> I think you misunderstood me. Here is one rule in my grammar:
>>
>> collection_initializer
>>     :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>     -> ^(OPEN_BRACE element_initializer_list ^(OPTIONAL COMMA?) CLOSE_BRACE)
>>     ;
>>
>> A normal parser would maybe need only:
>>
>> collection_initializer
>>     :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>     -> ^(element_initializer_list)
>>     ;
>>
>> With a preprocessor one could combine them:
>>
>> collection_initializer
>>     :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
>>     -> ^(
>> 	#ifdef ALL_TOKENS
>> 	OPEN_BRACE
>> 	#endif
>>
>> 	element_initializer_list
>>
>> 	#ifdef ALL_TOKENS
>> 	^(OPTIONAL COMMA?) CLOSE_BRACE
>> 	#endif
>> )
>>     ;
>>
>> A bit ugly, but it gets the job done. Maybe you have another idea to
>> accomplish this goal?
>>   
> Well, you should do this with runtime configuration (I show a parameter
> here but you should use some grammar global config class set externally):
> 
> collection_initializer[boolean allTokens]
>     :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
> 
>        -> {allTokens}? ^(OPEN_BRACE element_initializer_list ^(OPTIONAL
> COMMA?) CLOSE_BRACE)
>        -> element_initializer_list
> ;

While runtime configuration is interesting, the problem remains that
tree grammars have to handle both possible rewrites. Effectively you
end up duplicating parts of the tree. I've had another idea to make the
syntax more compact:

#define ALL

collection_initializer
    :   OPEN_BRACE element_initializer_list COMMA? CLOSE_BRACE
    -> ^(ALL.OPEN_BRACE element_initializer_list ^(ALL.OPTIONAL ALL.COMMA?) ALL.CLOSE_BRACE)
    ;

Rules and tokens marked with "ALL." end up in the generated code only if
ALL is defined. The only open question is how "^()" should be treated.
Perhaps it is enough to say that DOWN and UP are included only if the
root node itself is included.
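
A minimal sketch of how such a preprocessing step could behave, written
in Python as a standalone script and not as part of ANTLR. The names
expand and defined are hypothetical. It assumes that any "NAME." prefix
is a marker, and that when a root node is dropped its remaining children
are spliced into the enclosing level, which is one possible reading of
the DOWN/UP question above:

```python
import re

# Tokens of a rewrite: "^(" opens a tree, ")" closes it, anything else is an element.
TOKEN_RE = re.compile(r"\^\(|\)|[^\s()^]+")


def expand(rewrite, defined):
    """Expand "NAME." markers in an ANTLR-style rewrite string (hypothetical helper)."""
    tokens = TOKEN_RE.findall(rewrite)
    pos = 0

    def keep(element):
        # Unmarked elements always survive; marked ones only if the marker is defined.
        marker, dot, rest = element.partition(".")
        if not dot:
            return element
        return rest if marker in defined else None

    def parse_elements():
        nonlocal pos
        out = []
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == ")":
                break
            if tok == "^(":
                out.extend(parse_tree())
            else:
                kept = keep(tok)
                if kept is not None:
                    out.append(kept)
        return out

    def parse_tree():
        nonlocal pos
        root = keep(tokens[pos])
        pos += 1
        children = parse_elements()
        if root is None:
            # Assumption: no root means no DOWN/UP, so the children move up one level.
            return children
        return ["^(" + " ".join([root] + children) + ")"]

    return " ".join(parse_elements())


RULE = ("^(ALL.OPEN_BRACE element_initializer_list "
        "^(ALL.OPTIONAL ALL.COMMA?) ALL.CLOSE_BRACE)")

if __name__ == "__main__":
    print(expand(RULE, {"ALL"}))
    print(expand(RULE, set()))
```

With ALL defined this yields the full rewrite from my grammar, and
without it only element_initializer_list remains.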
> 
> And you probably don't need that COMMA under a root node ;-)

For my particular purpose I really do need all tokens, except non-newline
whitespace, I think. Using OPTIONAL also fixes the general tree
structure, which makes handling the direct children of the root node
easier.
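
To illustrate the point about OPTIONAL, here is a small Python sketch.
The tuple representation of trees and the helper names are assumptions
made only for this example and have nothing to do with the ANTLR
runtime. Because the OPTIONAL node is always present, every child of the
root sits at the same index whether or not a trailing comma was parsed:

```python
# Trees as (root, children) tuples; leaves are plain strings.
WITH_COMMA = ("OPEN_BRACE",
              ["element_initializer_list", ("OPTIONAL", ["COMMA"]), "CLOSE_BRACE"])
WITHOUT_COMMA = ("OPEN_BRACE",
                 ["element_initializer_list", ("OPTIONAL", []), "CLOSE_BRACE"])


def closing_brace(tree):
    # CLOSE_BRACE is always the third child, with or without a comma.
    _, children = tree
    return children[2]


def has_trailing_comma(tree):
    # The OPTIONAL node is always the second child; only its contents vary.
    _, children = tree
    _, optional_children = children[1]
    return bool(optional_children)
```

Without the OPTIONAL wrapper, CLOSE_BRACE would be the second or the
third child depending on the input, and the consumer would have to
check which case it is in.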

Johannes
> 
> But the general point is good.
> 
> Jim


