[antlr-interest] @init actions in "C Target" have problems with MSVS compilers

Jim Idle jimi at temporal-wave.com
Thu Mar 22 13:19:24 PDT 2007


Alexander,

This is not really how you are supposed to use the @init action :-) It is
only supposed to contain statements that initialize variables declared
elsewhere, and it is generally used in parser rules to initialize
variables declared in scope(s). However, the fact that an @init clause is
accepted on lexer rules suggests that perhaps Ter wanted them to be
supported, and it does make sense to initialize certain things there. You
cannot place declarations there, though, because the declarations have
already been generated by that point.

So, you can use them to initialize lexer stuff declared like this:

@lexer::header
{
    static int      lastCommandNo;
    static int      lastMacroLineNo;
    static int      lastOffset;
}

(But be careful with such things: once you have globals, your lexer is
unlikely to remain thread safe.)
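
One common way to sidestep the thread-safety problem, which is not something this message describes, is to keep such state in a struct owned by each lexer instance instead of in file-scope statics. The sketch below is plain C with hypothetical names (lexer_state, lexer_state_init, lexer_state_record); only the three field names come from the declarations above, and how such a struct would be attached to a generated lexer is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance state; the fields mirror the statics above. */
typedef struct lexer_state
{
    int lastCommandNo;
    int lastMacroLineNo;
    int lastOffset;
} lexer_state;

/* Reset one instance; other instances are untouched. */
void lexer_state_init(lexer_state *st)
{
    assert(st != NULL);
    st->lastCommandNo   = 0;
    st->lastMacroLineNo = 0;
    st->lastOffset      = 0;
}

/* Record where the last command was seen, for this instance only. */
void lexer_state_record(lexer_state *st, int commandNo, int offset)
{
    assert(st != NULL);
    st->lastCommandNo = commandNo;
    st->lastOffset    = offset;
}
```

Because every function takes the state explicitly, two lexers running on different threads never share these counters.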

Or, if you want to use a local variable, declare it in the action:

TAG : '#' num=DIGITS
	{
		int	myNum;
		myNum = $num.text->toInt32($num.text);
	}
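
The reason a declaration is acceptable inside an action but not in @init comes down to the C89 rule that declarations must precede statements within a block. The following is a minimal plain-C sketch, not generated ANTLR output: rule_body and its variables are made up for illustration, and it assumes the action text ends up inside its own brace-delimited block in the generated C.

```c
#include <assert.h>

/* Hypothetical stand-in for a generated rule function. */
int rule_body(int start)
{
    int type;       /* declarations come first in the function body */
    int end;
    int result;

    type = 1;       /* executable statements follow */
    end  = start + type;

    /* Writing "int myNum;" at this point would be rejected by a C89
     * compiler, because statements have already appeared in this block.
     */

    {
        /* A nested block opens a new scope, so a declaration is legal here. */
        int myNum;

        myNum  = end * 2;
        result = myNum;
    }

    assert(result == end * 2);
    return result;
}
```

The same trick works by hand: wrapping late declarations in an extra pair of braces keeps the code valid C89.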

However, because I felt it was quite often the case that one would wish
to do some small thing with a token and pass the result around, I added a
few things to the C implementation that are not in the Java
implementation. In Java you would just derive a new class from
CommonToken and add what you need. You can do this with the C stuff as
well, but it is a heck of a lot more typing ;-). Generally, I felt you
would want to pass around a couple of numbers, or perhaps some
modified string, or perhaps even some huge structure. Because of this, I
gave the common token 3 integers (ANTLR3_UINT32 user1,user2,user3;) and
one void * (void * custom;) that are specifically for use by the grammar
programmer. If you need to free memory when the token is destroyed, you
can also set the address of your cleanup function into a function
pointer (  void (*freeCustom)(void * custom); ).

This saves you having to subclass the lexer token nine times out of ten
and means I have done the hard work for you ;-)

It is easy to use them:


TAG : '#' num=DIGITS
	{
		struct mystruct * ms;
		ltoken()->user1 = $num.text->toInt32($num.text);
		ms = ANTLR3_MALLOC(sizeof(struct mystruct));
		ltoken()->custom = (void *)ms;
		ltoken()->freeCustom = free;   // Or whatever
		ms->xyz = blah;
	}
 
When the token factory is closed, it will ensure that the freeCustom
function is called for any token that has it defined.
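
To make the ownership pattern concrete, here is a self-contained sketch in plain C. It does not use the ANTLR runtime: mock_token is a hypothetical stand-in holding only the user fields named above, and attach_payload, free_mystruct and release_tokens are invented names. release_tokens only approximates what the factory is described as doing on close.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the token's user fields; the real common
 * token has many more members.
 */
typedef struct mock_token
{
    unsigned int user1, user2, user3;
    void        *custom;
    void       (*freeCustom)(void *custom);
} mock_token;

struct mystruct { int xyz; };

static int free_calls = 0;      /* counts cleanups, for illustration only */

static void free_mystruct(void *custom)
{
    free_calls++;
    free(custom);
}

/* Store a number and a heap-allocated payload on the token.
 * Returns 1 on success, 0 if allocation failed.
 */
int attach_payload(mock_token *tok, unsigned int num, int xyz)
{
    struct mystruct *ms;

    ms = (struct mystruct *)malloc(sizeof(struct mystruct));
    if (ms == NULL)
    {
        return 0;
    }
    ms->xyz         = xyz;
    tok->user1      = num;
    tok->custom     = ms;
    tok->freeCustom = free_mystruct;
    return 1;
}

/* Roughly what closing the factory amounts to: call freeCustom on every
 * token that defines it.
 */
void release_tokens(mock_token *toks, size_t count)
{
    size_t i;

    assert(toks != NULL || count == 0);
    for (i = 0; i < count; i++)
    {
        if (toks[i].freeCustom != NULL && toks[i].custom != NULL)
        {
            toks[i].freeCustom(toks[i].custom);
            toks[i].custom = NULL;
        }
    }
}
```

Tokens that never set freeCustom are skipped, so only tokens that actually own a payload pay for cleanup.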

In general though I have found it easier to manipulate things in the
parser, where you have rule scopes that make tracking such things
easier.

baddict
    scope   { ANTLR3_BOOLEAN bad;              }
    @init   { $baddict::bad = ANTLR3_FALSE;  }

    :   (
            UQS { $baddict::bad = ANTLR3_TRUE; } 
        )? 
            bs=STRING
    
    {
        if  ($baddict::bad)
        {
            if  ($query::q->errorList->len > 0)
            {
                $query::q->errorList->addc($query::q->errorList, 0XFE);
            }
            $query::q->errorList->append8($query::q->errorList, "7011");
            $query::q->errorList->addc   ($query::q->errorList, 0XFD);
            $query::q->errorList->appendS($query::q->errorList, $bs.text);

            $query::parseError  = ANTLR3_TRUE;
        }
    }
        ->$bs
    ;

Hope that this helps?

Jim

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Alexander
Stasenko
Sent: Thursday, March 22, 2007 6:34 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] @init actions in "C Target" have problems with
MSVS compilers

Consider the following rule with @init action:

		CharacterConstant @init {
				char c; int o;
		}
...

For C target the following code will be generated:

		void mCharacterConstant(pSisal2Lexer ctx)
		{
	
	
		    ANTLR3_UINT32	    _type;
		    ANTLR3_UINT64	    _start;
		    ANTLR3_UINT64	    _end;
		    ANTLR3_UINT64	    _line;
		    ANTLR3_UINT32	    _charPosition;
		    ANTLR3_UINT32	    _channel;
	
		    /* Initialize rule variables
		     */


		    ctx->pLexer->ruleNestingLevel++;
		    _type	    = CharacterConstant;
		    _start	    = getCharIndex();
		    _end	    = 0;
		    _line	    = getLine();
		    _charPosition   = getCharPositionInLine();
		    _channel	    = ANTLR3_TOKEN_DEFAULT_CHANNEL;

	
		    	char c; int o;
...

The MSVS 2005 compiler cannot compile it, since the "char c; int o;"
declarations are placed after executable statements. This restriction
was relaxed in the C99 standard, which unfortunately is not supported
even by MSVS 2005 in its C mode. Compiling in C++ mode causes
other problems in other places, so using C++ mode is not a solution
here.

Is there any possible workaround? Thanks in advance.

-- 
Best regards, Alexander.

