[antlr-interest] Could the parser and lexer be reused?

chain one chainone at gmail.com
Mon Jan 5 01:10:09 PST 2009


Thanks for your reply.
As you suggested, I searched the API docs and found reset functions for
the lexer and the input stream (but none for the parser).
I then tried to reuse the input stream and the lexer.
These are the difficulties I ran into:

1. The input stream is reused this way:
    if (!m_input)
    {
        m_input = antlr3NewAsciiStringInPlaceStream(
            (pANTLR3_UINT8)entry_start, p_semi - entry_start + 1, NULL);
    }
    else
    {
        m_input->reset(m_input);
        m_input->data = (pANTLR3_UINT8)entry_start;
        m_input->sizeBuf = p_semi - entry_start + 1;
    }

But at run time this code makes the program print a lot of error
messages.
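
One possible reading of case 1, offered as an assumption rather than a
confirmed diagnosis: if reset() rewinds the read cursor to the buffer that is
installed at that moment, then assigning data afterwards leaves the cursor
pointing into the old buffer. The sketch below models only that ordering
issue. MockStream, mock_reset, mock_reuse and mock_consume are hypothetical
names invented for illustration; they are not part of the ANTLR3 C runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an in-place string stream.
 * This is NOT the ANTLR3 input stream struct. */
typedef struct {
    const char *data;      /* start of the current buffer */
    const char *next_char; /* read cursor inside the buffer */
    size_t      size;      /* number of bytes in the buffer */
} MockStream;

/* Rewind the cursor to the start of whatever buffer is installed now. */
void mock_reset(MockStream *s)
{
    assert(s != NULL);
    s->next_char = s->data;
}

/* Install the new buffer first, then rewind, so the cursor can never
 * be left pointing into the previous buffer. */
void mock_reuse(MockStream *s, const char *buf, size_t len)
{
    assert(s != NULL && buf != NULL);
    s->data = buf;
    s->size = len;
    mock_reset(s);
}

/* Return the next character, or -1 at end of input.
 * Assumes the cursor lies within the current buffer. */
int mock_consume(MockStream *s)
{
    assert(s != NULL);
    if ((size_t)(s->next_char - s->data) >= s->size)
        return -1;
    return (unsigned char)*s->next_char++;
}
```

In this model, calling mock_reset() and only then assigning data and size
(the order used in case 1) leaves next_char different from data, while
mock_reuse() keeps the two in step. Whether the real stream behaves the same
way would need to be checked against the runtime source.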


2. The lexer is reused this way:
    m_input = antlr3NewAsciiStringInPlaceStream(
        (pANTLR3_UINT8)entry_start, p_semi - entry_start + 1, NULL);
    if (!m_lex)
        m_lex = StepDataEntryLexerNew(m_input);
    else
        m_lex->pLexer->setCharStream(m_lex->pLexer, m_input);

However, this makes the program crash at line 482 of antlrtokenstream.c

I don't know how to make this work. I would appreciate it if you could take
a look and offer some suggestions. : )

Best Regards,
Young







On Mon, Jan 5, 2009 at 1:16 PM, Jim Idle <jimi at temporal-wave.com> wrote:

>  chain one wrote:
>
> There are many pieces of input, all of which should be parsed by one
> parser, such as:
> Input 1:
>      Jack 100$
> Input 2:
>     Tom   200$
> Input ......
>
>  However, these inputs do not arrive all at once. They arrive at
> different times. Once an input arrives, it needs to be parsed immediately.
> The following pseudocode shows how I currently process each one:
>
>  void ParseOneInput(const char* data)
> {
>     /* Build a fresh stream, lexer, token stream and parser per input. */
>     input  = antlr3NewAsciiStringInPlaceStream(
>                  (pANTLR3_UINT8)data, strlen(data), NULL);
>     lex    = StepDataEntryLexerNew(input);
>     tokens = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT,
>                  TOKENSOURCE(lex));
>     parser = DataEntryParserNew(tokens);
>     parser->entry(parser, 1);
>     /* Tear everything down again, in reverse order of creation. */
>     parser->free(parser);
>     tokens->free(tokens);
>     lex->free(lex);
>     input->close(input);
> }
>
>  ParseOneInput is called each time an input arrives.
>
>  It works fine.
>
>  The question is: could the parser and lexer in ParseOneInput be reused?
> If they could be reused, it would be unnecessary to create and destroy a
> lexer and a parser every time an input arrives. If not, I believe it
> is inefficient.
>
>  Actually it isn't particularly inefficient; it is just a bit of memory and
> a few pointers being initialized, though when measured relative to the speed
> of parsing/lexing it may appear to be so :-). However, there is no need to
> recreate the lexer and parser: you can reuse them and reset() them, setting
> their input streams as per the API docs.
>
> Jim
>
> ------------------------------
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address