[antlr-interest] C-Target Token definitions

Sat May 30 13:10:18 PDT 2009

Jochen Wilhelmy wrote:
> Hi Jim,
>
> at first thank you very much for doing great work
> on the C-Target. I recently ported a parser from
> bison to ANTLR3 and it was easy and fun.
> One little thing is problematic in the C-Target:
> all tokens are #defined, therefore pollute the
> global namespace.
>
> Is it possible to prefix all tokens with the grammar
> name and put them into an enum?
> For example the tokens for a Lua grammar would
> look like this:
> enum Tokens
> {
>    LUA_FOR = 5,
>    ...
> };
>
> Another possibility would be to put the tokens into
> the context struct, e.g.
> struct LuaParser_Ctx_struct
> {
>    enum Tokens
>    {
>       FOR = 5,
>       ...
>    };
>
> };
The problem with enum is that it does not really offer much over #define 
in C, and it isn't available on old compilers for embedded systems and so 
on. I want people to be able to compile the code on just about anything.

The #defines are only used within the context of the include file, and 
in practice all you need do is stick a K in front of any TOKEN name that 
clashes with the system, such as FILE etc. So, make that KFILE and all 
is good.
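
As a rough sketch of that workaround (the token names, numeric values and 
helper functions below are made up for illustration and are not taken from 
any real generated header): if a header contained "#define FILE 4", then 
any later use of "FILE *" from <stdio.h> would expand to "4 *" and fail to 
compile. Renaming the token to KFILE in the grammar avoids the clash:

```c
#include <stdio.h>

/* Hypothetical token definitions in the style of a generated header.
 * The names and values are illustrative assumptions only. */
#define KFILE 4   /* renamed from FILE, which would clash with <stdio.h> */
#define FOR   5

/* Hypothetical helper: map a token type to a printable name. */
static const char *token_name(int type)
{
    switch (type) {
    case KFILE: return "KFILE";
    case FOR:   return "FOR";
    default:    return "<unknown>";
    }
}

/* FILE still refers to the stdio type here because no macro redefines it. */
static void print_token(FILE *out, int type)
{
    fprintf(out, "%d -> %s\n", type, token_name(type));
}
```

Calling print_token(stdout, KFILE) writes "4 -> KFILE", and the token name 
stays readable in the generated code and in the debugger.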

Basically, none of the targets attempts to protect you from the target 
language itself, so for instance you can't use a parser rule called package 
in the Java target and so on. The problem with trying to offer that 
protection is that it is never 100% correct anyway. Also, when I 
experimented with this, there was one part of the code gen that did not ask 
the target templates for the token name and so it all fell over. That could 
be fixed, I am sure, but in the end I decided that it is better to see the 
token names without obfuscation when debugging the generated C code.

So, basically, what I am saying is that it will be staying as it is for the 
foreseeable future ;-)

Jim

PS: Please send questions/bugs to the ANTLR group rather than me personally.