[antlr-interest] Updated C++ parser

Monty Zukowski monty at codetransform.com
Thu Jun 10 20:24:30 PDT 2004


Thanks for releasing it!  Will make for good discussion at the ANTLR  
Revival.

Monty

On Jun 10, 2004, at 9:26 AM, Ric Klaren wrote:

> Hi,
>
> Once upon a time I was planning to give a talk on Monty's tokenstream
> filtering idea. Things went differently a bit but I ended up with a
> modified version of the C++ parser David Wigg has been working on.
>
> Let me cut and paste from what I added to the MyReadme.txt in the tar  
> ball:
>
> I Grabbed the C++ grammar as a vehicle to play with tokenstream  
> filtering.
> The plan was to attempt to make a drop in C++ preprocessor together  
> with
> #include #ifdef support. This to demo tokenstream filtering for a talk.
> Just before the deadline I came to the point that I had to redesign  
> things
> to keep things nice and concise for the talk. Therefore I had to  
> scratch
> using this as a talk vehicle. Yet the added features more or less  
> worked
> and can be interesting to look at.
>
> Added features:
>
> - Split up the big monolithic .g file from the original into a  
> separate lexer
>   and parser.
> - Added a Makefile (GNU Make/Gcc)
> - Use of a Custom Token class with line and file information. This  
> needs a
>   patch on antlr. It is supplied as antlr.patch in this directory. It's
>   probably present in the next snapshot after 2.7.4 release. (Once  
> Terence
>   merges the doc changes from 2.7.4 to mainline ;) but that can wait  
> while
>   he's engrossed in hacking on antlr 3 :) )
> - I used a custom stream multiplexer class that handles closing of  
> streams if
>   needed. CPPStreamStack. Antlr's default thing does not handle  
> cleanups
>   (read direct java port).
> - Added a ugly ancient hash table template to hash the filenames  
> encountered.
>   The tokens only store a hash value for filename. It's loosely based  
> on
>   a Modula 2 implementation of hash tables in GMD's cocktail (uses the  
> same
>   hash function).
> - Handling of #line directives with the above (might be off one line  
> here
>   and there I did not check that for correctness, proof of concept it  
> works)
> - Handling of #include directives with a TokenStream multiplexer.
>   this is completely done inside the lexer. To make this nice  
> (conceptually)
>   the preprocessor should become the TS multiplexer.
> - Handling of simple #define/#ifdef/#ifndef/#else/#endif statements.  
> Nested
>   but basically only checking defined or not defined. (good enough for
>   #include guards) This works actually quite nice. Should be easy  
> enough to
>   implement more functionality.
>
> As long as the preprocessor stays LL(1) it should be possible to fold  
> the
> tokenstream multiplexor and #include handling into the preprocessor,  
> giving
> a pretty nice layered design. Although sinning against the no feedback  
> from
> parser to lexer mantra.
>
> Releasing this stuff now since I probably won't have much time to make  
> it a
> real release. So tinker with it at your own risk. It needs a cleanup.  
> It
> contains old code from David's #line handling which I did not prune and
> probably there's some more virtual corpses in various virtual closets..
>
> It could be an idea to pass newlines on to the preprocessor. This would
> require some more tinkering. It might become necessary to let the
> preprocessor patch the line numbers from outgoing tokens.
>
> The result of the few days hacking can be found here:
>
> http://wwwhome.cs.utwente.nl/~klaren/antlr/CPPParser.tar.bz2
>
> This is basically released for the sake of releasing it. It's not a
> finished product, more a proof of concept. I probably don't have time  
> to
> wrap it up into something presentable soon, so that's why I release it.
>
> Cheers,
>
> Ric
> --
> ----- 
> +++++*****************************************************+++++++++---- 
> ---
>     ---- Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893755   
> ----
> ----- 
> +++++*****************************************************+++++++++---- 
> ---
>   Wo das Chaos auf die Ordnung trifft, gewinnt meist das Chaos, weil es
>   besser organisiert ist. --- Friedrich Nietzsche
>
>
>
>
> Yahoo! Groups Links
>
>
>
>
>
>
>
>
Monty Zukowski

ANTLR & Java Consultant -- http://www.codetransform.com
ANSI C/GCC transformation toolkit --  
http://www.codetransform.com/gcc.html
Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html



 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/
 



More information about the antlr-interest mailing list