[antlr-interest] converting Java code to C for PHP grammar

Dinesha Balasuriya Weragama dinesha74 at yahoo.com
Mon Jul 19 20:50:16 PDT 2010


I am relatively new to ANTLR and am trying to use the PHP grammar that was written for the ANTLR PHP target.  This grammar handles the first input token separately, as
a first body string, using the following Java code to override the nextToken() function.

@lexer::members{
    // Handle the first token, which will always be a BodyString.
    public Token nextToken(){
        //The following code was pulled out from super.nextToken()
        if (input.index() == 0) {
            try {
                state.token = null;
                state.channel = Token.DEFAULT_CHANNEL;
                state.tokenStartCharIndex = input.index();
                state.tokenStartCharPositionInLine = input.getCharPositionInLine();
                state.tokenStartLine = input.getLine();
                state.text = null;
                mFirstBodyString();
                state.type = BodyString;
                emit();
                return state.token;
            } catch (NoViableAltException nva) {
                reportError(nva);
                recover(nva); // throw out current char and try again
            } catch (RecognitionException re) {
                reportError(re);
                // match() routine has already called recover()
            }    
        }
        return super.nextToken();
    }
}

mFirstBodyString() is the method generated for the following fragment rule in the grammar, which recognizes the first body string.

fragment
FirstBodyString
    : (('<' ~ '?')=> '<' | ~'<' )* '<?' ('php'?)
    ;
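
Read informally, the fragment consumes everything up to and including the first '<?', plus an optional 'php' directly after it, and fails if no '<?' occurs. The following is only a minimal stand-alone sketch of that reading in plain C; the function name first_body_string_len is hypothetical, and the code does not use the ANTLR runtime or reproduce the generated lexer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of ANTLR: returns the number of characters
 * the FirstBodyString fragment would consume from s, or -1 if s contains
 * no "<?" (the point at which the lexer rule would report an error). */
static long first_body_string_len(const char *s)
{
    const char *open;
    long len;

    assert(s != NULL);

    /* The loop (('<' ~'?')=> '<' | ~'<')* stops at the first '<' that is
     * immediately followed by '?', i.e. at the first occurrence of "<?". */
    open = strstr(s, "<?");
    if (open == NULL) {
        return -1;
    }

    /* Consume the "<?" itself. */
    len = (long)(open - s) + 2;

    /* ('php'?) : consume an optional "php" directly after "<?". */
    if (strncmp(s + len, "php", 3) == 0) {
        len += 3;
    }
    return len;
}
```

For an input such as "<html><?php echo 1;" this returns the length of "<html><?php", which corresponds to the text of the first BodyString token.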

I am attempting to convert this to C, but I end up with a runtime error involving a pointer to memory location 0x0.  My attempted code (ignoring the error handling for now) is
given below.

@lexer::apifuncs{
  TOKENSOURCE(ctx)->nextToken=myNextToken;
}

@lexer::members{

#include "phpTest.h"
     pANTLR3_COMMON_TOKEN myNextToken(pANTLR3_TOKEN_SOURCE toksource){
     
        pANTLR3_LEXER lexer;
        
        lexer=(pANTLR3_LEXER)(toksource->super);
        if(lexer->input->istream->index(lexer->input->istream)==0){
              lexer->rec->state->token=NULL;
              lexer->rec->state->channel=ANTLR3_TOKEN_DEFAULT_CHANNEL;
              lexer->rec->state->tokenStartCharIndex=lexer->input->istream->index(lexer->input->istream);
              lexer->rec->state->tokenStartCharPositionInLine=lexer->input->getCharPositionInLine(lexer->input);
              lexer->rec->state->tokenStartLine=lexer->input->getLine(lexer->input);
              lexer->rec->state->text=NULL;
              mFirstBodyString(lex);
              lexer->rec->state->type=BodyString;
              lexer->emit(lexer);
              return lexer->rec->state->token;
        }
        return lexer->rec->state->token;
     }
}

phpTest.h just contains a global declaration of lex, which is of type pphpLexer and is the lexer used in the test rig.  Please let me know what I'm doing wrong.
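
For comparison, the Java version falls through to super.nextToken() after the first token. C has no inheritance, so the usual idiom is to save the original function pointer before replacing it and to call the saved pointer on the fall-through path. The following is only a minimal self-contained sketch of that idiom; TokenSource, defaultNextToken, firstTokenAwareNextToken, installOverride and the token constants are hypothetical stand-ins, not the ANTLR3 C runtime types or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a token source; NOT the ANTLR3 C runtime struct. */
typedef struct TokenSource TokenSource;
struct TokenSource {
    int position;                         /* number of tokens handed out so far */
    int (*nextToken)(TokenSource *self);  /* overridable entry point */
};

enum { TOKEN_FIRST_BODY = 1, TOKEN_ORDINARY = 2 };

/* Holds the implementation that was installed before the override. */
static int (*savedNextToken)(TokenSource *self) = NULL;

/* Plays the role of the "super" implementation. */
static int defaultNextToken(TokenSource *self)
{
    self->position++;
    return TOKEN_ORDINARY;
}

/* The override: special-case the first token, otherwise delegate. */
static int firstTokenAwareNextToken(TokenSource *self)
{
    assert(savedNextToken != NULL);       /* installOverride() must run first */
    if (self->position == 0) {
        self->position++;
        return TOKEN_FIRST_BODY;
    }
    return savedNextToken(self);          /* analogue of super.nextToken() */
}

/* Remember the original pointer, then swap in the override. */
static void installOverride(TokenSource *src)
{
    savedNextToken = src->nextToken;
    src->nextToken = firstTokenAwareNextToken;
}
```

After installOverride() has been called on a source whose nextToken is defaultNextToken, the first call through src->nextToken yields TOKEN_FIRST_BODY and every later call is forwarded to the saved original.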




      

