[antlr-interest] Updated C++ parser
Monty Zukowski
monty at codetransform.com
Thu Jun 10 20:24:30 PDT 2004
Thanks for releasing it! Will make for good discussion at the ANTLR
Revival.
Monty
On Jun 10, 2004, at 9:26 AM, Ric Klaren wrote:
> Hi,
>
> Once upon a time I was planning to give a talk on Monty's tokenstream
> filtering idea. Things went a bit differently, but I ended up with a
> modified version of the C++ parser David Wigg has been working on.
>
> Let me cut and paste from what I added to the MyReadme.txt in the
> tarball:
>
> I grabbed the C++ grammar as a vehicle to play with tokenstream
> filtering. The plan was to attempt to make a drop-in C++ preprocessor
> together with #include and #ifdef support, to demo tokenstream
> filtering for a talk. Just before the deadline I came to the point
> where I had to redesign things to keep them nice and concise for the
> talk, so I had to scrap using this as a talk vehicle. Yet the added
> features more or less worked and can be interesting to look at.
>
> Added features:
>
> - Split up the big monolithic .g file from the original into a
>   separate lexer and parser.
> - Added a Makefile (GNU Make/GCC).
> - Use of a custom Token class with line and file information. This
>   needs a patch on antlr, supplied as antlr.patch in this directory.
>   It will probably be in the next snapshot after the 2.7.4 release.
>   (Once Terence merges the doc changes from 2.7.4 to mainline ;) but
>   that can wait while he's engrossed in hacking on antlr 3 :) )
> - I used a custom stream multiplexer class, CPPStreamStack, that
>   handles closing of streams if needed. Antlr's default one does not
>   handle cleanups (read: direct Java port).
> - Added an ugly, ancient hash table template to hash the filenames
>   encountered. The tokens only store a hash value for the filename.
>   It's loosely based on a Modula-2 implementation of hash tables in
>   GMD's Cocktail (uses the same hash function).
> - Handling of #line directives with the above (might be off by one
>   line here and there; I did not check that for correctness, but as a
>   proof of concept it works).
> - Handling of #include directives with a TokenStream multiplexer.
>   This is done completely inside the lexer. To make this conceptually
>   nice, the preprocessor should become the TS multiplexer.
> - Handling of simple #define/#ifdef/#ifndef/#else/#endif statements.
>   Nesting is supported, but it basically only checks defined or not
>   defined (good enough for #include guards). This actually works quite
>   nicely. It should be easy enough to implement more functionality.
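
The defined/not-defined checking described in the last item can be sketched roughly as follows. This is not code from the tarball: ConditionalState and filterLines are hypothetical names, and the sketch works on whole lines of text rather than on ANTLR tokens, just to show the nesting logic.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the tarball): tracks nested
// #ifdef/#ifndef/#else/#endif blocks and the set of defined names.
class ConditionalState {
public:
    // True if text outside directives should currently be passed on.
    bool active() const { return stack_.empty() || stack_.back().live; }
    std::size_t depth() const { return stack_.size(); }

    // Returns true if the line was a directive (it is then consumed).
    bool directive(const std::string& line) {
        std::istringstream in(line);
        std::string word, name;
        in >> word >> name;
        if (word.empty() || word[0] != '#') return false;
        if (word == "#define") {
            if (active()) defined_.insert(name);  // macro bodies are ignored
        } else if (word == "#ifdef" || word == "#ifndef") {
            bool isDefined = defined_.count(name) != 0;
            bool cond = (word == "#ifdef") ? isDefined : !isDefined;
            // A nested block can only be live if its parent is live.
            stack_.push_back({active(), active() && cond});
        } else if (word == "#else") {
            if (stack_.empty()) throw std::runtime_error("#else without #ifdef");
            Frame& top = stack_.back();
            top.live = top.parentLive && !top.live;
        } else if (word == "#endif") {
            if (stack_.empty()) throw std::runtime_error("#endif without #ifdef");
            stack_.pop_back();
        }
        return true;  // any other directive is silently dropped in this sketch
    }

private:
    struct Frame { bool parentLive; bool live; };
    std::vector<Frame> stack_;
    std::set<std::string> defined_;
};

// Returns the non-directive lines that survive conditional compilation.
std::vector<std::string> filterLines(const std::string& source) {
    ConditionalState state;
    std::vector<std::string> kept;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        if (state.directive(line)) continue;
        if (state.active()) kept.push_back(line);
    }
    if (state.depth() != 0) throw std::runtime_error("missing #endif");
    return kept;
}
```

Fed a header wrapped in a classic #ifndef/#define guard twice, filterLines keeps the body only the first time, which is the #include-guard case mentioned above. Expression evaluation (#if, defined(...)) is deliberately left out.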
>
> As long as the preprocessor stays LL(1) it should be possible to fold
> the tokenstream multiplexer and #include handling into the
> preprocessor, giving a pretty nice layered design, although it sins
> against the "no feedback from parser to lexer" mantra.
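
A minimal sketch of the stream-stack idea, assuming a line-based reader instead of an ANTLR TokenStream. IncludeStack, Opener and mapOpener are hypothetical names and do not reflect the actual CPPStreamStack interface; the point is only that each stream is owned by the stack and is destroyed (and thereby closed) as soon as it runs dry.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using StreamPtr = std::unique_ptr<std::istream>;
// Hypothetical hook that turns an include name into an open stream.
using Opener = std::function<StreamPtr(const std::string&)>;

class IncludeStack {
public:
    explicit IncludeStack(Opener open) : open_(std::move(open)) {
        assert(open_ && "an opener callback is required");
    }

    void push(const std::string& name) {
        StreamPtr stream = open_(name);
        if (!stream) throw std::runtime_error("cannot open " + name);
        streams_.push_back(std::move(stream));
    }

    // Fetches the next ordinary line, descending into #include "..."
    // files. Returns false once every stream is exhausted.
    bool nextLine(std::string& out) {
        static const std::string kw = "#include \"";
        while (!streams_.empty()) {
            if (!std::getline(*streams_.back(), out)) {
                streams_.pop_back();  // unique_ptr destroys and closes it
                continue;
            }
            if (out.compare(0, kw.size(), kw) == 0) {
                std::size_t end = out.find('"', kw.size());
                if (end == std::string::npos)
                    throw std::runtime_error("malformed #include");
                push(out.substr(kw.size(), end - kw.size()));
                continue;
            }
            return true;
        }
        return false;
    }

    std::size_t depth() const { return streams_.size(); }

private:
    Opener open_;
    std::vector<StreamPtr> streams_;
};

// In-memory opener for experiments; a real one would use std::ifstream.
Opener mapOpener(std::map<std::string, std::string> files) {
    return [files](const std::string& name) -> StreamPtr {
        auto it = files.find(name);
        if (it == files.end()) return nullptr;
        return StreamPtr(new std::istringstream(it->second));
    };
}
```

Only the quoted form of #include is recognised here; angle-bracket includes pass through as ordinary lines. Folding the conditional handling into nextLine would give the layered arrangement described above, with the preprocessor itself acting as the multiplexer.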
>
> Releasing this stuff now since I probably won't have much time to
> make it a real release. So tinker with it at your own risk. It needs
> a cleanup. It contains old code from David's #line handling which I
> did not prune, and there are probably some more virtual corpses in
> various virtual closets.
>
> It could be an idea to pass newlines on to the preprocessor. This would
> require some more tinkering. It might become necessary to let the
> preprocessor patch the line numbers from outgoing tokens.
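
To illustrate the line-number patching, here is a small sketch that applies #line directives while tagging output with file and line. It is not the code from the tarball: FileTable, Token and tagLines are hypothetical names, std::unordered_map stands in for the custom hash table, and each non-blank source line is treated as a single "token" to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Interns file names so that tokens only carry a small integer id.
class FileTable {
public:
    int intern(const std::string& name) {
        auto it = ids_.find(name);
        if (it != ids_.end()) return it->second;
        int id = static_cast<int>(names_.size());
        names_.push_back(name);
        ids_.emplace(name, id);
        return id;
    }
    const std::string& name(int id) const {
        assert(id >= 0 && static_cast<std::size_t>(id) < names_.size());
        return names_[static_cast<std::size_t>(id)];
    }

private:
    std::unordered_map<std::string, int> ids_;
    std::vector<std::string> names_;
};

struct Token {
    std::string text;
    int fileId;  // index into a FileTable
    int line;    // line number after #line patching
};

// Tags every non-blank line with its file id and (patched) line number.
std::vector<Token> tagLines(const std::string& source,
                            const std::string& file, FileTable& files) {
    std::vector<Token> tokens;
    int fileId = files.intern(file);
    int line = 1;
    std::istringstream in(source);
    std::string text;
    while (std::getline(in, text)) {
        std::istringstream fields(text);
        std::string word;
        fields >> word;
        if (word == "#line") {
            int number = 0;
            std::string quoted;
            if (!(fields >> number)) { ++line; continue; }  // malformed
            line = number;  // the NEXT source line gets this number
            // File names containing spaces are not handled in this sketch.
            if (fields >> quoted && quoted.size() >= 2 &&
                quoted.front() == '"' && quoted.back() == '"') {
                fileId = files.intern(quoted.substr(1, quoted.size() - 2));
            }
            continue;
        }
        if (!word.empty()) tokens.push_back({text, fileId, line});
        ++line;
    }
    return tokens;
}
```

Because newlines are counted here even when a line produces no token, the numbering stays in step with the source; a preprocessor sitting behind a lexer that swallows newlines would need them passed on to do the same.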
>
> The result of a few days of hacking can be found here:
>
> http://wwwhome.cs.utwente.nl/~klaren/antlr/CPPParser.tar.bz2
>
> This is basically released for the sake of releasing it. It's not a
> finished product, more a proof of concept. I probably won't have time
> to wrap it up into something presentable soon, so that's why I am
> releasing it now.
>
> Cheers,
>
> Ric
> --
> Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893755
> Where chaos meets order, chaos usually wins, because it is better
> organized. --- Friedrich Nietzsche
Monty Zukowski
ANTLR & Java Consultant -- http://www.codetransform.com
ANSI C/GCC transformation toolkit -- http://www.codetransform.com/gcc.html
Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html