[antlr-interest] [antlr 3] How to use the a C lexer/parser

Jim Idle jimi at temporal-wave.com
Thu May 17 14:50:40 PDT 2007


Check against the examples (which are VS2005 .sln enabled). However, it
is probable that the beta version you are using is out of date anyway. 

 

You can use the release version as of today and build the runtime from
the dist tgz (as per the wiki). Then you will also get the vs2005
integration rule file.

 

Where did it crash? As in, after which call below? It might be that you
should not be passing 0 to antlr3CommonTokenStreamSourceNew.

 

But, I see you are freeing the input stream directly when you should be
using the close() method. 
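
A minimal sketch of why the distinction matters, assuming that a stream owns memory beyond its own struct and that its close method releases all of it. DemoStream, demo_stream_new and demo_close are hypothetical stand-ins for illustration, not ANTLR3 runtime names:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a runtime "pseudo object"; not an ANTLR3 type. */
typedef struct DemoStream DemoStream;
struct DemoStream {
    char *buffer;                     /* memory owned by the object          */
    void (*close)(DemoStream *self);  /* "method": first arg is the object   */
};

static int live_allocations = 0;      /* blocks allocated and not yet freed  */

/* Releases everything the object owns, then the object itself. */
void demo_close(DemoStream *self)
{
    assert(self != NULL);
    free(self->buffer);
    live_allocations--;
    free(self);
    live_allocations--;
}

/* Allocates the struct plus an internal buffer and installs the method. */
DemoStream *demo_stream_new(const char *text)
{
    DemoStream *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->buffer = malloc(strlen(text) + 1);
    if (s->buffer == NULL) { free(s); return NULL; }
    strcpy(s->buffer, text);
    s->close = demo_close;
    live_allocations += 2;
    return s;
}

/* Number of blocks still allocated after tearing down via the method. */
int leaked_after_close(void)
{
    DemoStream *s = demo_stream_new("int x;");
    if (s == NULL) return -1;
    s->close(s);
    return live_allocations;
}

/* Number of blocks still allocated after a bare free() of the struct. */
int leaked_after_bare_free(void)
{
    DemoStream *s = demo_stream_new("int x;");
    char *orphan;
    int leaked;
    if (s == NULL) return -1;
    orphan = s->buffer;          /* kept only so this demo can clean up */
    free(s);                     /* the mistake: frees the struct alone */
    live_allocations--;
    leaked = live_allocations;   /* the internal buffer is still live   */
    free(orphan);
    live_allocations--;
    return leaked;
}
```

In this sketch the method-based teardown leaves nothing allocated, while the bare free() leaves the internal buffer behind. Whatever the real stream holds internally, the method is the only code that knows about it.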

 

Jim

 

main.c example:

 

int ANTLR3_CDECL
main  (int argc, char *argv[])
{
    // Now we declare the ANTLR related local variables we need.
    // Note that unless you are convinced you will never need thread safe
    // versions for your project, then you should always create such things
    // as instance variables for each invocation.
    // -------------------

    // Name of the input file. Note that we always use the abstract type pANTLR3_UINT8
    // for ASCII/8 bit strings - the runtime library guarantees that this will be
    // good on all platforms. This is a general rule - always use the ANTLR3 supplied
    // typedefs for pointers/types/etc.
    //
    pANTLR3_UINT8     fName;

    // The ANTLR3 character input stream, which abstracts the input source such that
    // it is easy to provide input from different sources such as files, or
    // memory strings.
    //
    // For an ASCII/latin-1 memory string use:
    //          input = antlr3NewAsciiStringInPlaceStream (stringtouse, (ANTLR3_UINT64) length, NULL);
    //
    // For a UCS2 (16 bit) memory string use:
    //          input = antlr3NewUCS2StringInPlaceStream (stringtouse, (ANTLR3_UINT64) length, NULL);
    //
    // For input from a file, see code below
    //
    // Note that this is essentially a pointer to a structure containing pointers to functions.
    // You can create your own input stream type (copy one of the existing ones) and override any
    // individual function by installing your own pointer after you have created the standard
    // version.
    //
    pANTLR3_INPUT_STREAM    input;

    // The lexer is of course generated by ANTLR, and so the lexer type is not upper case.
    // The lexer is supplied with a pANTLR3_INPUT_STREAM from whence it consumes its
    // input and generates a token stream as output.
    //
    pCLexer           lxr;

    // The token stream is produced by the ANTLR3 generated lexer. Again it is a structure based
    // API/Object, which you can customise and override methods of as you wish. A token stream is
    // supplied to the generated parser, and you can write your own token stream and pass this in
    // if you wish.
    //
    pANTLR3_COMMON_TOKEN_STREAM         tstream;

    // The C parser is also generated by ANTLR and accepts a token stream as explained
    // above. The token stream can be any source in fact, so long as it implements the
    // ANTLR3_TOKEN_SOURCE interface. In this case the parser does not return anything
    // but it can of course specify any kind of return type from the rule you invoke
    // when calling it.
    //
    pCParser                        psr;

    // Create the input stream based upon the argument supplied to us on the command line.
    // For this example, the input will always default to ./input if there is no explicit
    // argument.
    //
    if (argc < 2 || argv[1] == NULL)
    {
      fName = (pANTLR3_UINT8)"./input"; // Note in VS2005 debug, working directory must be configured
    }
    else
    {
      fName = (pANTLR3_UINT8)argv[1];
    }

    // Create the input stream using the supplied file name.
    // (antlr3AsciiFileStreamNew handles ASCII/8 bit input; UCS2/16 bit input
    // needs the corresponding 16 bit stream constructor instead.)
    //
    input   = antlr3AsciiFileStreamNew(fName);

    // The input will be created successfully, providing that there is enough
    // memory and the file exists etc.
    // NB: ANTLR3_UINT64 is unsigned, so the "< 0" test below can never be true
    // as written; check how your version of the runtime reports failure.
    //
    if ( (ANTLR3_UINT64)input < 0 )
    {
      switch((ANTLR3_UINT64)input)
      {
      case    ANTLR3_ERR_NOMEM:

          fprintf(stderr, "Unable to open file %s due to malloc() failure1\n", (char *)fName);
          exit(1);
          break;

      default:

          fprintf(stderr, "Failed to open file %s - exit with code %d\n", (char *)fName, (int)((ANTLR3_UINT64)input));
          exit((int)((ANTLR3_UINT64)input));
          break;
      }
    }

    // Our input stream is now open and all set to go, so we can create a new instance of our
    // lexer and set the lexer input to our input stream:
    //  (file | memory | ?) --> inputstream -> lexer --> tokenstream --> parser ( --> treeparser )?
    //
    lxr         = CLexerNew(input);     // CLexerNew is generated by ANTLR

    // Need to check for errors.
    // NB: ANTLR3_UINT64 is unsigned, so the "< 0" test below can never be true
    // as written; check how your version of the runtime reports failure.
    //
    if ( (ANTLR3_UINT64)lxr < 0 )
    {
      switch((ANTLR3_UINT64)lxr)
      {
      case    ANTLR3_ERR_NOMEM:

          fprintf(stderr, "Unable to create the lexer due to malloc() failure1\n");
          exit(1);
          break;

      default:

          fprintf(stderr, "Failed to create lexer - exit with code %d\n", (int)((ANTLR3_UINT64)lxr));
          exit((int)((ANTLR3_UINT64)lxr));
          break;
      }
    }

    // Our lexer is in place, so we can create the token stream from it.
    // NB: Nothing happens yet other than the file has been read. We are just
    // connecting all these things together and they will be invoked when we
    // call the parser rule. ANTLR3_SIZE_HINT can be left at the default usually
    // unless you have a very large token stream/input. Each generated lexer
    // provides a token source interface, which is the second argument to the
    // token stream creator.
    // Note that even if you implement your own token structure, it will always
    // contain a standard common token within it and this is the pointer that
    // you pass around to everything else. A common token has a pointer within
    // it that should point to your own outer token structure.
    //
    tstream = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, lxr->pLexer->tokSource);

    if ((ANTLR3_UINT64)tstream == ANTLR3_ERR_NOMEM)
    {
      fprintf(stderr, "Out of memory trying to allocate token stream\n");
      exit(ANTLR3_ERR_NOMEM);
    }

    // Finally, now that we have our lexer constructed, we can create the parser
    //
    psr         = CParserNew(tstream);  // CParserNew is generated by ANTLR3

    // Check the parser pointer itself (not the token stream) for failure
    //
    if ((ANTLR3_UINT64)psr == ANTLR3_ERR_NOMEM)
    {
      fprintf(stderr, "Out of memory trying to allocate parser\n");
      exit(ANTLR3_ERR_NOMEM);
    }

    // We are all ready to go. Though that looked complicated at first glance,
    // I am sure you will see that in fact most of the code above is dealing
    // with errors and there isn't really that much to do (isn't this always the
    // case in C? ;-).
    //
    // So, we now invoke the parser. All elements of ANTLR3 generated C components
    // as well as the ANTLR C runtime library itself are pseudo objects. This means
    // that they are represented as pointers to structures, which contain any
    // instance data they need, and a set of pointers to other interfaces or
    // 'methods'. Note that in general, these few pointers we have created here are
    // the only things you will ever explicitly free(), as everything else is created
    // via factories that allocate memory efficiently and free() everything they use
    // automatically when you close the parser/lexer/etc.
    //
    // Note that this means only that the methods are always called via the object
    // pointer and the first argument to any method is a pointer to the structure itself.
    // It also has the side advantage, if you are using an IDE such as VS2005 that can do it,
    // that when you type ->, you will see a list of all the methods the object supports.
    //
    psr->translation_unit(psr);

    // We did not return anything from this parser rule, so we can finish. It only remains
    // to close down our open objects, in the reverse order we created them
    //
    psr     ->free  (psr);      psr     = NULL;
    tstream ->free  (tstream);  tstream = NULL;
    lxr     ->free  (lxr);      lxr     = NULL;
    input   ->close (input);    input   = NULL;

    return 0;
}
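
The error checks in the example cast each returned pointer to ANTLR3_UINT64 and test it with "< 0". A small sketch of why an unsigned test like that never fires, and one possible alternative, assuming the constructors report failure as a small integer code cast to a pointer (which is what the switch on ANTLR3_ERR_NOMEM suggests). The DEMO_* codes and helper names are hypothetical, not the runtime's actual values:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes, standing in for the runtime's ANTLR3_ERR_* values. */
enum { DEMO_ERR_NOMEM = 1, DEMO_ERR_NOFILE = 2, DEMO_ERR_MAX = 2 };

/* What a negative code looks like after an unsigned 64 bit cast:
 * it wraps to a very large positive value, so "< 0" cannot match it. */
uint64_t as_unsigned(int64_t code)
{
    return (uint64_t)code;
}

/* One possible check under the stated assumption: treat the pointer as an
 * error when its value falls inside the known range of error codes. */
int is_error_code(const void *p)
{
    uintptr_t v = (uintptr_t)p;
    return v >= 1 && v <= (uintptr_t)DEMO_ERR_MAX;
}

/* Returns the address of a real object, for comparison with error codes. */
const void *demo_valid_pointer(void)
{
    static int target;
    return &target;
}
```

With helpers like these, a caller would test is_error_code(input) before using the stream. The real convention (error code, NULL, or something else) depends on the runtime version, so the headers shipped with it are the place to confirm it.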

 

 

From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Boris Boucher
Sent: Thursday, May 17, 2007 2:28 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] [antlr 3] How to use the a C lexer/parser

 

Hello all,

I have just begun to write a lexer/parser using ANTLR 3.0b7.

My problem is: what steps are needed to invoke the parser?
I have written the following code, but the program crashes inside the ANTLR C
runtime, in a hash that does not seem to have been allocated or initialized.

Here is my code:

void parseInputFile(const std::string &fileName)
{
    // create a file input stream
    pANTLR3_INPUT_STREAM inputStream = antlr3AsciiFileStreamNew((pANTLR3_UINT8)fileName.c_str());
    // create the lexer
    pInputParserLexer inputLexer = InputParserLexerNew(inputStream);
    // create a token stream
    pANTLR3_COMMON_TOKEN_STREAM tokenStream = antlr3CommonTokenStreamSourceNew(0, inputLexer->pLexer->tokSource);
    // create the parser
    pInputParserParser inputParser = InputParserParserNew(tokenStream);

    // call the parser
    inputParser->data(inputParser);

    // freeup resources
    inputParser->free(inputParser); 
    tokenStream->free(tokenStream);
    inputLexer->free(inputLexer);
    // no free method on the input stream ?
    free(inputStream);
}

Did I miss something?

I am using Visual Studio 2005. 

Boris


