[antlr-interest] Re: C++ grammar
Terence Parr
parrt at jguru.com
Thu Jun 13 11:36:10 PDT 2002
On Thursday, June 13, 2002, at 11:31 AM, cppljevans wrote:
> --- In antlr-interest at y..., Terence Parr <parrt at j...> wrote:
>> Folks,
>>
>> A number of people are playing with a C++ front end for ANTLR (either
>> from scratch or by converting old PCCTS grammar forward to ANTLR). I
>> might be putting some effort behind making a standard C++ parser for
>> ANTLR and could use any head start people have. So, who's been doing
>> what? :)
>>
> I'm trying to convert Lilley's parser to a pretty printer for c++.
> I'm planning on using c++, and my current focus is getting
> the lexer to work. The main problem is passing the "expanded"
> tokens to the parser; yet, just printing the "unexpanded" tokens.
> By "expanded" token, I mean the tokens that are the result of
> either #include <file> or processing a preprocessor macro.
Preprocessor stuff is typically done as a char stream filter so the C++
lexer is not complicated by the preprocessor. Helps to separate these
tasks. It can be done, of course. You might also just say "I'll use
/lib/cpp" ;) Naturally this makes pretty printing harder as you don't
always know what was the original source ;)
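
To make the char-stream-filter idea concrete, here is a minimal sketch, not a real preprocessor. The function name preprocess is hypothetical, and the sketch assumes object-like #define macros only: no #include, no function-like macros, no rescanning of expanded text, and no awareness of strings or comments. It runs before the lexer and hands it plain expanded text.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ANTLR): reads source text, records
// "#define NAME body" lines, and replaces later uses of NAME with body.
std::string preprocess(std::istream& in) {
    std::map<std::string, std::string> macros;
    std::string line, out;
    while (std::getline(in, line)) {
        std::istringstream ls(line);
        std::string word;
        if (ls >> word && word == "#define") {
            std::string name, body;
            ls >> name;
            std::getline(ls >> std::ws, body);  // rest of the line is the body
            macros[name] = body;
            continue;                           // directive lines are not emitted
        }
        for (std::size_t i = 0; i < line.size();) {
            unsigned char c = static_cast<unsigned char>(line[i]);
            if (std::isalpha(c) || c == '_') {
                // Scan one identifier and substitute it if it names a macro.
                std::size_t j = i;
                while (j < line.size() &&
                       (std::isalnum(static_cast<unsigned char>(line[j])) ||
                        line[j] == '_')) {
                    ++j;
                }
                const std::string id = line.substr(i, j - i);
                const auto it = macros.find(id);
                out += (it != macros.end()) ? it->second : id;
                i = j;
            } else {
                out += line[i++];               // copy everything else verbatim
            }
        }
        out += '\n';
    }
    return out;
}
```

The lexer then only ever sees the filter's output, which is exactly why the original spelling of the source is lost for pretty printing.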
> I haven't coded anything yet (except converting some of Lilley's
> data structures to stl), but I'm thinking of merging some of
> the ideas in http://www.antlr.org/doc/streams.html with
> Lilley's macro expansion methods ( see void
> CPreParserImp::ExpandTokenList in cpre_expand.cpp).
>
> To be more specific, I'm thinking of the lexer as a stack of
> iterators, where each iterator corresponds either to a file or
> a macro invocation. The output tokens would only come from the
> bottom of the stack, whereas the parser would always read from
> the top. Since the bottom corresponds to the original source file,
> only tokens from the original source would be output.
Yeah, a more general queue for TokenStream that let the lexer push more
than one token onto the stream at once would be groovy.
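
A rough sketch of what such a queue might look like, under stated assumptions: QueuedTokenStream, Token, Producer, and kEndOfInput are all hypothetical names for illustration and are not ANTLR's classes. The producer callback stands in for the lexer and may push any number of tokens per call; the parser side pulls them one at a time.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical token type; a real one would also carry line/column info.
struct Token {
    int type;
    std::string text;
};

const int kEndOfInput = -1;  // hypothetical end-of-input token type

class QueuedTokenStream {
public:
    // The producer plays the lexer's role: each call may push zero or
    // more tokens onto the stream.
    using Producer = std::function<void(QueuedTokenStream&)>;

    explicit QueuedTokenStream(Producer p) : produce_(std::move(p)) {}

    void push(Token t) { queue_.push_back(std::move(t)); }

    // Called by the parser. Refills the queue when it runs dry and
    // reports end of input if the producer has nothing more to add.
    Token nextToken() {
        if (queue_.empty()) produce_(*this);
        if (queue_.empty()) return Token{kEndOfInput, "<EOF>"};
        Token t = std::move(queue_.front());
        queue_.pop_front();
        return t;
    }

private:
    Producer produce_;
    std::deque<Token> queue_;
};
```

With this shape, a macro invocation could be handled by pushing the whole expansion in one producer call instead of returning a single token.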
>
> For example, given the following code in test.cpp:
>
> #define DECLB int b
> int a;
> DECLB ;
> int c;
>
> Then the lexer stack, just before the read of b, would contain:
>
> int b
> ^
> int a ; DECLB ; int c ;
> ^
Oh, well you can just push lexer input states for this. There is a
stack mechanism already for nested lexing and parsing.
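
For illustration, the stack design from the quoted message can be sketched as below. This is a toy model under assumptions, not ANTLR's input-state mechanism and not Lilley's code: StackedLexer, Frame, and Tokens are hypothetical names, tokens are plain strings, the macro table is assumed to be built already, and a self-referential macro would recurse forever.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Tokens = std::vector<std::string>;

// One stack entry: a token list (a file or a macro body) and a read position.
struct Frame {
    Tokens toks;
    std::size_t pos;
};

class StackedLexer {
public:
    StackedLexer(Tokens source, std::map<std::string, Tokens> macros)
        : macros_(std::move(macros)) {
        stack_.push_back(Frame{std::move(source), 0});
    }

    // Gives the parser the next expanded token, always read from the top
    // of the stack. Returns false once the original source is exhausted.
    bool next(std::string& out) {
        for (;;) {
            Frame& top = stack_.back();
            if (top.pos == top.toks.size()) {
                if (stack_.size() == 1) return false;  // bottom frame is done
                stack_.pop_back();                     // macro body finished
                continue;
            }
            const std::string tok = top.toks[top.pos++];
            // Only tokens read from the bottom frame count as original source.
            if (stack_.size() == 1) original_.push_back(tok);
            const auto it = macros_.find(tok);
            if (it == macros_.end()) {
                out = tok;
                return true;
            }
            stack_.push_back(Frame{it->second, 0});    // enter the macro body
        }
    }

    // The unexpanded tokens seen so far, for the pretty printer.
    const Tokens& original() const { return original_; }

private:
    std::map<std::string, Tokens> macros_;
    std::vector<Frame> stack_;
    Tokens original_;
};
```

Fed the test.cpp tokens with DECLB mapped to "int b", the parser side sees int a ; int b ; int c ; while original() holds int a ; DECLB ; int c ;.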
> I'd appreciate any feedback on this design.
Cool. Let us know how it goes.
Ter
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org