[antlr-interest] [C] code to change Token type, use char* and lose data when buffer destroyed

Ruslan Zasukhin ruslan_zasukhin at valentina-db.com
Tue Sep 27 14:31:04 PDT 2011


On 9/27/11 9:45 PM, "Jim Idle" <jimi at temporal-wave.com> wrote:

Hi Jim,

As always, thank you a lot for your time.

> Each token contains the char * pointer into the input stream where it
> starts, which is what I generally use, but if you want to use my built-in
> string stuff and have it auto free then it is just:
> 
> csl
> @declarations { pANTLR3_STRING s; }
> : s1=STRING
>      { s= $s1.text; }
>    (
> s2=STRING
> {
> s->append(s, $s2.text);
> }
> 
>    )*
> { $s1->setText(s);  /* Check that, but I think it is this */ }
> 
> ->s1
> ;

Nice to see I am becoming an expert in ANTLR3 :-)

I have tried the approach above. What I like about it is that when there is
only ONE literal, which is true in 99% of cases, we stay efficient: there is
no need to call append() or use other buffers.

This is how Jim's rule above should look in order to really compile:

character_string_literal
@declarations{
    pANTLR3_STRING s;
}
    :    s1 = STRING_LITERAL        { s = $s1.text; }
        ( s2 = STRING_LITERAL       { s->append( s, (const char*) $s2.text->chars ); }
        )*

        { $s1->setText( $s1, s ); }

        -> $s1
    ;


But (!!)

In the latest ANTLR 3.4.1, this rule generates C code that does not compile.
Oops.
This is why I spent the whole of yesterday evening losing hair :-)

Look at the generated code:

// $ANTLR start synpred20_SqlParser_v3
static void synpred20_SqlParser_v3_fragment(pSqlParser_v3Parser ctx )
{
    pANTLR3_COMMON_TOKEN    ;           <<<<<<<   should be s2;

           = NULL;                      <<<<<<<   s2 = NULL;


    // /PARADIGMA/Developer_2/sources/VKernel/VSQL/Parser/v3/grammars/SqlParser_v3.g:644:5: (s2= STRING_LITERAL )
    // /PARADIGMA/Developer_2/sources/VKernel/VSQL/Parser/v3/grammars/SqlParser_v3.g:644:5: s2= STRING_LITERAL
    {
        s2 = (pANTLR3_COMMON_TOKEN) MATCHT(STRING_LITERAL, &FOLLOW_STRING_LITERAL_in_synpred20_SqlParser_v34838);


I don't know why this happens.
It seems to happen only for STRING_LITERAL in my grammar,
but I do not see anything special about this lexer-generated token.
I can send you my parser.g file so you can test it yourself and see where
the trouble is.


Now I have this rule working using the above idea.
GOOD. Thank you, Jim.

And now I am ready to play with the second way:

csl
: s1+=STRING -> $s1+  /* Or, ->^(SLIT $s1+) */
;


I also tried this way yesterday (again, I am glad I had thought of it),
but I was not able to find a way to join all the tokens in the array.

You have given me a nice idea: join them in the TreeParser.
Yes, indeed this can work... So I will play with this way now.
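
For the join step itself, here is a minimal sketch in plain C. It deliberately
does not use the ANTLR3 C runtime: join_literals is a hypothetical helper, and
the pieces array is only a stand-in for the texts of the tokens collected by
s1+=STRING. How the texts are pulled out of the tree nodes is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of the ANTLR3 C runtime.
 * Joins `count` NUL-terminated pieces into one newly allocated,
 * NUL-terminated buffer. The caller owns the result and must free() it.
 * Returns NULL if the allocation fails. */
static char *join_literals(const char *const *pieces, size_t count)
{
    assert(pieces != NULL || count == 0);

    /* First pass: compute the total length so we allocate exactly once. */
    size_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += strlen(pieces[i]);

    char *joined = malloc(total + 1);
    if (joined == NULL)
        return NULL;

    /* Second pass: copy each piece right after the previous one. */
    char *out = joined;
    for (size_t i = 0; i < count; ++i) {
        size_t len = strlen(pieces[i]);
        memcpy(out, pieces[i], len);
        out += len;
    }
    *out = '\0';
    return joined;
}
```

Because the result is a separate copy, it stays valid even after the input
buffer that the token pointers refer to is destroyed.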


-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]


