[antlr-interest] v2->v3 Skip chars in Lexer? For C-target [SOLVED]

Ruslan Zasukhin ruslan_zasukhin at valentina-db.com
Sat Apr 16 09:36:35 PDT 2011


On 4/16/11 1:18 PM, "Bart Kiers" <bkiers at gmail.com> wrote:

Hi All,

Just for the archive, here is the solution I have been able to build so far
for a couple of LEXER rules in our Valentina SQL grammar.

The only thing that is still not clear to me is:
    must I destroy the temporary strings to avoid leaks?

I also still wonder whether a more compact, elegant and efficient solution
exists from the point of view of a C developer? :-)


//------------------------------------------------------------------------------
// an identifier. Note that testLiterals is set to true!  This means
// that after we match the rule, we look in the literals table to see
// if it's a literal or really an identifier
IDENT
    :    ( LETTER | '_' ) ( LETTER | '_' | DIGIT )*
    ;

DELIMITED       // delimited_identifier
    :
    (    DQUOTE ( ~(DQUOTE) | DQUOTE DQUOTE )+ DQUOTE
    |    BQUOTE ( ~(BQUOTE) | BQUOTE BQUOTE )+ BQUOTE
            
    |    LBRACK ( ~(']') )+ RBRACK     // valentina extension   [asasas '' " sd "]
    )    
        {
            // Remove the first and the last chars:
            pANTLR3_STRING pQuotedStr = GETTEXT();
            pANTLR3_STRING pStr = pQuotedStr->subString( pQuotedStr, 1, pQuotedStr->len - 1 );
            
            SETTEXT( pStr );
        }
        { $type = IDENT; }
    ;
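
The index arithmetic behind "remove the first and the last chars" can be
checked outside the lexer. Below is a minimal sketch in plain C; the helper
strip_delimiters() is hypothetical, it is not part of the ANTLR3 C runtime
and does not use pANTLR3_STRING. It only shows that the inner text starts at
offset 1 and is (length - 2) characters long.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the ANTLR3 C runtime.
 * Copies `quoted` without its first and last character into `out`
 * (capacity `out_size`, including the terminating NUL).
 * Returns the number of characters written, or (size_t)-1 if the input
 * is shorter than two characters or the result does not fit.
 */
static size_t strip_delimiters(const char *quoted, char *out, size_t out_size)
{
    size_t len, inner;

    assert(quoted != NULL && out != NULL);

    len = strlen(quoted);
    if (len < 2)
        return (size_t)-1;          /* no room for two delimiters */

    inner = len - 2;                /* characters between the delimiters */
    if (inner + 1 > out_size)
        return (size_t)-1;          /* would overflow the output buffer */

    memcpy(out, quoted + 1, inner); /* skip the opening delimiter */
    out[inner] = '\0';
    return inner;
}
```

The same helper covers all three alternatives of DELIMITED, because the
opening and closing delimiters are one character each in every case.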


And this is the second rule, which is more complex because the string can
contain quotes:

//------------------------------------------------------------------------------
STRING_LITERAL
@init
{
    int dquotes_count = 0;
}
    :    QUOTE 
        (    ESCAPE_SEQUENCE
        |    ~('\'' | '\\')
        |    QUOTE QUOTE            { ++dquotes_count; }
        )* 
        QUOTE 
        
        {
            // Remove the first and the last chars:
            pANTLR3_STRING pQuotedStr = GETTEXT();
            pANTLR3_STRING pStr = pQuotedStr->subString( pQuotedStr, 1, pQuotedStr->len - 1 );
            
            char* pStart = (char*) pStr->chars;
            
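            while( dquotes_count > 0 )
            {
                char* pFirstQuote = strchr( pStart, '\'' );

                if( pFirstQuote == NULL ) // no more quotes in the text
                    break;

                if( *(pFirstQuote + 1) != '\'' ) // second quote?
                {
                    // A single quote (e.g. an escaped one): skip it and
                    // keep searching, otherwise we would find it again.
                    pStart = pFirstQuote + 1;
                    continue;
                }

                // Example: 'aabbcc''def'
                // Count from the start of the buffer, not from pStart,
                // so that the size stays correct after the first loop.
                int CharsOnLeft = (int)( pFirstQuote - (char*) pStr->chars ) + 1;
                int CharsToMove = pStr->len - CharsOnLeft; // includes the final 0

                ANTLR3_MEMMOVE( pFirstQuote + 1, pFirstQuote + 2, CharsToMove );

                // prepare for possible next loop:
                pStart = pFirstQuote + 1;
                pStr->len--;
                --dquotes_count;
            }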
            
            SETTEXT( pStr );
        }
    ;
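
On the question of a more compact solution from the C point of view: one
possible alternative is a single pass with a read pointer and a write
pointer, which avoids strchr() and repeated memmove() calls. This is only a
sketch under assumptions: collapse_doubled_quotes() is a hypothetical helper,
not an ANTLR3 runtime function, it works on a plain NUL-terminated single-byte
buffer, and it does not treat backslash escape sequences specially.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the ANTLR3 C runtime.
 * Collapses every doubled quote character `q` in the NUL-terminated
 * buffer `s` into a single one, in place, and returns the new length.
 */
static size_t collapse_doubled_quotes(char *s, char q)
{
    char *src, *dst;

    assert(s != NULL);

    src = s;    /* next character to read  */
    dst = s;    /* next position to write  */

    while (*src != '\0') {
        *dst++ = *src;                  /* always keep the current char */
        if (src[0] == q && src[1] == q)
            src += 2;                   /* doubled quote: skip its twin */
        else
            src += 1;
    }

    *dst = '\0';
    return (size_t)(dst - s);
}
```

Inside the action above, such a helper would presumably be called on
pStr->chars after the subString() step, with the returned length stored in
pStr->len; the counter dquotes_count would then no longer be needed.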



-- 
Best regards,

Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc

Valentina - Joining Worlds of Information
http://www.paradigmasoft.com

[I feel the need: the need for speed]


