[antlr-interest] Modifying a tree node stream

tcorbat at hsr.ch tcorbat at hsr.ch
Mon May 10 07:46:59 PDT 2010


Hello

I have a question about reprocessing the result of a rewrite in tree grammars. I'm currently working on a preprocessor for C++. I think I have solved the functional requirements, but there is room for improvement in performance, which I did not consider at first.

While preprocessing C++ code, there are macros to expand. After a macro is expanded, the replacement, together with the rest of the code, is rescanned for further macros to expand. Is there a neat way to implement this behavior with a tree grammar?

In my current approach I have a lexer that tokenizes the files and a parser that creates an AST, introducing some structure. The macro expansion itself is spread over two tree grammars. One traverses the AST created by the parser and determines which groups (lines of code) become output. The other expands the macros. The traversing grammar invokes the expanding grammar to perform all expansions.

After recognizing a group of code lines that will become preprocessor output, an expander is created that takes these lines as input in tree form. The expander contains a rule that consumes the next token (or several, including the arguments, if it is a function-like macro call) and returns the replacement in tree form. When a macro has been expanded, I have to reprocess the replacement and all following tokens, so I take this result together with the remaining nodes from the tree node stream and build a new stream for a new expander. If there is no macro expansion, the printer just consumes the token and continues.

Now, I don't like creating a new stream every time a macro-expansion is encountered. Is there an alternative? Can I somehow mark the position in the current stream, perform the replacement in the stream, rewind the stream and start over with the expander?
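
A minimal sketch of the mark / replace / rewind idea, to make the question concrete. It uses a plain list of strings as a stand-in for the node stream. SpliceBuffer and all of its methods are hypothetical names invented for this illustration; they are not part of ANTLR, and whether a real tree node stream can be spliced like this is exactly what I am asking.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a node stream; NOT an ANTLR class.
class SpliceBuffer {
    private final List<String> nodes;
    private int cursor = 0;

    SpliceBuffer(List<String> initial) {
        nodes = new ArrayList<>(initial);
    }

    boolean hasNext() { return cursor < nodes.size(); }

    String next() { return nodes.get(cursor++); }

    // Remember the current position before consuming a possible macro call.
    int mark() { return cursor; }

    // Replace everything consumed since 'mark' with 'replacement', then
    // move the cursor back to 'mark' so the replacement is read next.
    void replaceAndRewind(int mark, List<String> replacement) {
        nodes.subList(mark, cursor).clear();
        nodes.addAll(mark, replacement);
        cursor = mark;
    }

    // Copy of the current contents, for inspection only.
    List<String> contents() { return new ArrayList<>(nodes); }
}
```

With this, the expander would call mark(), consume the macro name (and arguments), call replaceAndRewind() with the replacement, and simply keep reading from the same buffer. The splice on an ArrayList shifts the remaining elements, so a linked structure might be preferable for long inputs.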

I may have developed tunnel vision while reading the C++ standard and trying to solve this problem, and there may be a much easier approach.


I know my description might be a bit confusing, so I'll try to explain it with a small example:

---
#define X Y
#define Y(a) a
X(2)
---
First, "X" is expanded to "Y", resulting in "Y(2)". Since "Y" is reprocessed together with "(2)", it has to be expanded again, becoming "2".

The Lexer creates the tokens for the Parser. The Parser creates an AST. The Printer traverses that AST. When it reaches the line "X(2)", which has to be expanded, an Expander is created with the subtree representing "X(2)" as input. The Printer invokes the "expand" rule of the Expander and receives the replacement for "X": "Y". Currently, the Printer reads the rest of the stream, "(2)", appends it to the replacement to get "Y(2)", creates a new stream and a new Expander for this input, and invokes the "expand" rule again.
Is it possible to modify the node stream of the expander, instead of creating a new one?
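
The behavior I am after for the "X(2)" example can be traced on a flat token list. This is only a sketch under simplifying assumptions: tokens are plain strings rather than tree nodes, the input is well formed, every macro argument is a single token, and the rule from the standard that stops a macro from being expanded again inside its own replacement is left out (so a self-referential macro would loop forever). RescanDemo, Macro and expand are names made up for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RescanDemo {
    // Hypothetical macro record: params == null means object-like macro.
    static final class Macro {
        final List<String> params;
        final List<String> body;
        Macro(List<String> params, List<String> body) {
            this.params = params;
            this.body = body;
        }
    }

    static List<String> expand(List<String> input, Map<String, Macro> macros) {
        Deque<String> pending = new ArrayDeque<>(input); // tokens still to scan
        List<String> out = new ArrayList<>();
        while (!pending.isEmpty()) {
            String tok = pending.pollFirst();
            Macro m = macros.get(tok);
            if (m == null) {            // not a macro name: emit unchanged
                out.add(tok);
                continue;
            }
            List<String> replacement;
            if (m.params == null) {     // object-like macro
                replacement = m.body;
            } else if ("(".equals(pending.peekFirst())) {
                pending.pollFirst();    // drop "("
                Map<String, String> args = new HashMap<>();
                for (String p : m.params) {
                    args.put(p, pending.pollFirst()); // single-token argument
                    pending.pollFirst();              // drop "," or ")"
                }
                if (m.params.isEmpty()) {
                    pending.pollFirst();              // drop ")" of "()"
                }
                replacement = new ArrayList<>();
                for (String b : m.body) {
                    replacement.add(args.getOrDefault(b, b));
                }
            } else {                    // function-like name without "(": no call
                out.add(tok);
                continue;
            }
            // Push the replacement back in reverse order, so that its first
            // token is scanned next, followed by the rest of the input.
            for (int i = replacement.size() - 1; i >= 0; i--) {
                pending.addFirst(replacement.get(i));
            }
        }
        return out;
    }
}
```

Defining X as an object-like macro with body "Y" and Y as a function-like macro with parameter "a" and body "a", expanding the tokens X ( 2 ) first pushes "Y" back in front of "( 2 )" and then expands "Y(2)" to "2", all within one buffer.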

I hope I did not confuse everybody with my problem.
I appreciate every answer.

Regards,
Thomas

