[antlr-interest] Implementing "include" functionality with C runtime

Jim Idle jimi at temporal-wave.com
Tue Jun 19 08:41:20 PDT 2007


If you have not already sent all of the code, please send it to me
off-list and I will take a look. My best guess is that when you finally
turn the tokens into strings, the runtime only has the original input
stream. To produce the text for the tokens that you generated from the
include file, it would need that stream too.

 

So, for this to work, you would have to create the token text at the
same time as you create the tokens. However, it may be that some
difference has crept into setCharStream since I wrote the C version; I
have a note to look at this, in fact. I suspect that the runtime is not
setting the token's input charstream and hence it is stringifying from
the original stream only.
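
To make the suspected failure concrete, here is a minimal sketch in plain
C. It assumes a toy model in which a token stores only start/stop offsets
plus a pointer to the stream it was lexed from. ToyStream, ToyToken,
toy_token_text and toy_token_text_from are hypothetical names for
illustration and are not the ANTLR3 C runtime types or API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; NOT the ANTLR3 C runtime types. */
typedef struct {
    const char *data;   /* entire text of one input file */
    size_t      len;    /* number of characters in data */
} ToyStream;

typedef struct {
    const ToyStream *input;  /* stream the token was lexed from */
    size_t           start;  /* offset of the first character, inclusive */
    size_t           stop;   /* offset of the last character, inclusive */
} ToyToken;

/* Copy the token's text into out (capacity cap) using the token's OWN
 * stream. Returns the number of characters written, excluding the NUL. */
size_t toy_token_text(const ToyToken *t, char *out, size_t cap)
{
    assert(t != NULL && t->input != NULL && out != NULL && cap > 0);
    if (t->start > t->stop || t->stop >= t->input->len) {
        out[0] = '\0';          /* offsets do not fit this stream */
        return 0;
    }
    size_t n = t->stop - t->start + 1;
    if (n > cap - 1) {
        n = cap - 1;            /* truncate to fit the caller's buffer */
    }
    memcpy(out, t->input->data + t->start, n);
    out[n] = '\0';
    return n;
}

/* The suspected failure mode: resolve the same offsets against some
 * other stream, for example the file that contained the include. */
size_t toy_token_text_from(const ToyStream *s, const ToyToken *t,
                           char *out, size_t cap)
{
    ToyToken wrong = *t;
    wrong.input = s;            /* ignore the stream the token came from */
    return toy_token_text(&wrong, out, cap);
}
```

In this toy model, a token covering offsets 0 to 10 of test2.txt
("test-stream") resolved against the first line of test.txt yields
include "te, which is the first garbled fragment in the output quoted
below. That is consistent with the guess above, but it does not prove
that this is what the runtime does.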

 

By a strange coincidence I need to do the same thing this week, so I
bet it will be fixed this week :-)

 

Added:

 

http://www.antlr.org:8888/browse/ANTLR-144

 

Jim

 

From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Cameron Esfahani
Sent: Monday, June 18, 2007 6:55 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Implementing "include" functionality with C
runtime

 

When I was prototyping with the Java runtime, I had implemented the
"include" functionality based on code from the wiki:

 

http://www.antlr.org/wiki/pages/viewpage.action?pageId=557057

 

And this worked great.

 

Now that I've switched over to the C runtime, I'm having some trouble
porting the above solution over.

 

I've hooked into the nextToken() vector of the lexer's token source, and
I set up a simple stack to save and restore the pLexer->input
ANTLR3_INPUT_STREAM.

 

I make sure to call mark() on the current ANTLR3_INPUT_STREAM so when it
gets switched back in by the nextToken() override, I can just call
rewindLast() on it.
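
The save/restore scheme described above can be sketched independently of
ANTLR. The following is a toy model in plain C under stated assumptions:
ToyInput, ToyIncludeLexer, toy_push_include and toy_next_char are
hypothetical names, and toy_mark/toy_rewind only mimic the role that
mark() and rewindLast() play in the description; they are not the
runtime's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPTH 16        /* arbitrary nesting limit for the sketch */

typedef struct {
    const char *data;           /* text of one input */
    size_t      len;            /* number of characters in data */
    size_t      pos;            /* current read position */
    size_t      marked;         /* position saved by toy_mark() */
} ToyInput;

typedef struct {
    ToyInput *items[TOY_MAX_DEPTH];  /* suspended inputs, innermost last */
    int       depth;                 /* number of suspended inputs */
    ToyInput *current;               /* input being read right now */
} ToyIncludeLexer;

static void toy_mark(ToyInput *in)   { in->marked = in->pos; }
static void toy_rewind(ToyInput *in) { in->pos = in->marked; }

/* Suspend the current input and start reading from inc.
 * Returns 0 on success, -1 if the nesting is too deep. */
int toy_push_include(ToyIncludeLexer *lx, ToyInput *inc)
{
    assert(lx != NULL && inc != NULL && lx->current != NULL);
    if (lx->depth >= TOY_MAX_DEPTH) {
        return -1;
    }
    toy_mark(lx->current);               /* remember where to resume */
    lx->items[lx->depth++] = lx->current;
    lx->current = inc;
    return 0;
}

/* Return the next character, or -1 at the real end of input. The loop
 * (rather than a single if) keeps popping when an include ends exactly
 * where its parent ends. */
int toy_next_char(ToyIncludeLexer *lx)
{
    assert(lx != NULL && lx->current != NULL);
    while (lx->current->pos >= lx->current->len) {
        if (lx->depth == 0) {
            return -1;                   /* nothing left to resume */
        }
        lx->current = lx->items[--lx->depth];
        toy_rewind(lx->current);         /* resume at the saved position */
    }
    return (unsigned char)lx->current->data[lx->current->pos++];
}
```

Reading one character of an outer input "ab", pushing an included input
"XY", and then draining the lexer visits the characters in the order a,
X, Y, b before the end-of-input value is returned.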

 

The problem shows up at the end of parsing, when I dump out the AST.
The text from the included file isn't there.  In fact, the original
"include" string is there.  It seems to me that the C runtime doesn't
like how I've swapped out one stream for another.

 

Is there some cached state I need to reset?

 

There are two files, test.txt and test2.txt.  Here is test.txt:

 

include "test2.txt"

 

best = {

                                  "funky" : [ "list" ]

                      }

 

and here is test2.txt

 

test-stream = <123ABC>

 

test-tree3 = "hello"

 

Here is the output:

 

(T_ASSIGN include "te (T_HEXSTREAM .txt"

 

b)) es (T_ASSIGN t =    {

                        " (T_STR ky" : [))  " 

 

(T_ASSIGN best (T_OBJ (T_DEF "funky" (T_ARRAY (T_STR "list"))))) 

 

<EOF>

 

 

Here is the relevant code from the lexer and nextToken hooks:

 

          : 'include' WS? f = STRING {
                ANTLR3_INPUT_STREAM*    Input;
                ANTLR3_UINT8*           FileName;
                int                     Length;

                // Extract out the file name from within the quotes.
                // Length counts everything after the opening quote,
                // including the closing quote.
                Length = strlen( f->getText( f )->chars + 1 );

                // + 1 leaves room for the terminating NUL that strcpy() writes.
                FileName = malloc( Length + 1 );
                strcpy( FileName, f->getText( f )->chars + 1 );

                // Overwrite the closing quote.
                FileName[ Length - 1 ] = 0;

                Input = antlr3AsciiFileStreamNew( FileName );

                // Remember where we are in this stream, and save it.
                gLexer->pLexer->input->istream->mark( gLexer->pLexer->input->istream );
                gIncludeStack->push( gIncludeStack, gLexer->pLexer->input, NULL );

                gLexer->pLexer->setCharStream( gLexer->pLexer, Input );

                free( FileName );
                }
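
The file name extraction in the action above can be isolated in a small
helper, which makes the buffer size easy to check. This is a sketch in
plain C with no ANTLR dependency; toy_unquote is a hypothetical name, and
it assumes the STRING token text is wrapped in exactly one pair of double
quotes with no escape sequences.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of quoted with the leading and trailing
 * double quote removed. Returns NULL if the text is not wrapped in
 * quotes or if allocation fails. The caller frees the result. */
char *toy_unquote(const char *quoted)
{
    assert(quoted != NULL);
    size_t len = strlen(quoted);
    if (len < 2 || quoted[0] != '"' || quoted[len - 1] != '"') {
        return NULL;
    }
    size_t inner = len - 2;             /* characters between the quotes */
    char *out = malloc(inner + 1);      /* + 1 for the terminating NUL */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, quoted + 1, inner);
    out[inner] = '\0';
    return out;
}
```

The result could then be passed to antlr3AsciiFileStreamNew() in place of
FileName and freed afterwards, with a NULL check before opening the file.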

 

ANTLR3_COMMON_TOKEN*
NextToken( ANTLR3_TOKEN_SOURCE* TokenSource )
{
    ANTLR3_COMMON_TOKEN*    Token;
    ANTLR3_INPUT_STREAM*    SavedStream;

    Token = gOriginalNextToken( TokenSource );

    if ( Token == &TokenSource->eofToken )
    {
        // We've reached the end of this file.  Pop anything off the include
        // stack and continue.

        if ( gIncludeStack->size( gIncludeStack ) > 0 )
        {
            SavedStream = gIncludeStack->top;

            gLexer->pLexer->setCharStream( gLexer->pLexer, SavedStream );
            SavedStream->istream->rewindLast( SavedStream->istream );

            Token = gOriginalNextToken( TokenSource );

            gIncludeStack->pop( gIncludeStack );
        }
    }

    if ( ( ( ANTLR3_INT64 ) Token->getStartIndex( Token ) ) < 0 )
    {
        Token = gOriginalNextToken( TokenSource );
    }

    return( Token );
}

 

Cameron Esfahani

dirty at apple.com

 

"I cannot for the life of me understand why, while people without
driver's licenses are not allowed on public roads, in bookstores one can
find any number of books by persons without decency - let alone
knowledge."

 

"His Master's Voice", Stanislaw Lem

 





 
