[antlr-interest] Java Parser to analyze C++ language syntax
David Wigg
wiggjd at bcs.org.uk
Thu Mar 5 03:04:36 PST 2009
I've retired from maintaining this C/C++ parser, but I've been following
the recent discussions with some puzzlement, as at the moment I can't see
what the aim is. I regret that I have lost track of who raised this
question. Perhaps he or she could let us know what text is to be parsed,
what the aim is, and the reason for it.
Before replying, I draw your attention to the information below about the
language definition, which is in my download. It makes clear that the
grammar alone is not sufficient. The words in paragraph 1, below the two
ruled lines, are from Bjarne's book.
Thanks,
David.
C++ Language Definition
Notes:
DW 23/04/03
This data was originally obtained from:
http://www.csci.csusb.edu/dick/C++std/cd2/gram.html
I have converted the original hyphens separating words in names to
underscores, and the opt(ional) indicator at the end of names to -opt, to
aid searching.
As this file is stored in text mode, the textual distinction between
reserved keywords and other words has been lost. The keywords should be
listed in any book about C++.
DW 27/09/04
This data has been modified to match the grammar listed in Appendix A of
"The C++ Programming Language", Third Edition, by Bjarne Stroustrup,
ISBN 0201889544.
Deleted or amended lines of the original (old) text are shown within []
brackets. Lines copied from the book are shown as "Book line".
One line was added from "C++ In a Nutshell", First Edition, ISBN 059600298X.
I would be grateful if anyone could let me know if it needs updating, at
wiggjd at bcs.org.uk.
______________________________________________________________________
Annex 0 (informative)
Grammar summary [gram]
______________________________________________________________________
1 This summary of C++ syntax is intended to be an aid to comprehension.
It is not an exact statement of the language. In particular, the
grammar described here accepts a superset of valid C++ constructs.
Disambiguation rules (_stmt.ambig_, _dcl.spec_, _class.member.lookup_)
must be applied to distinguish expressions from declarations. Further,
access control, ambiguity, and type rules must be used to weed out
syntactically valid but meaningless constructs.
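To illustrate why the grammar alone is not enough, here is a minimal sketch of the expression/declaration ambiguity that the _stmt.ambig_ rule resolves. The struct T and the function names are hypothetical, chosen only for this illustration. The statement "T(a);" could be read either as a function-style cast of a or as a declaration of a variable named a; the rule says it is a declaration whenever it can be one.

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct T {
    int value;
    T() : value(-1) {}               // default: marks "declared, not cast"
    explicit T(int v) : value(v) {}  // conversion used by the cast form
};

// "T(a);" as a complete statement is treated as a DECLARATION,
// equivalent to "T a;". It declares a new, default-constructed T
// that hides the int parameter inside the nested block.
int as_declaration(int a) {
    {
        T(a);            // declaration of a local T named a
        return a.value;  // a is a T here, not the int parameter
    }
}

// Inside a larger expression the same tokens can only be an
// expression: a function-style cast that constructs a T from the int.
int as_expression(int a) {
    return T(a).value;
}

// Small self-check showing the two readings give different results.
void self_check() {
    assert(as_declaration(7) == -1);
    assert(as_expression(7) == 7);
}
```

A parser cannot choose between the two readings without knowing that T names a type, which is the kind of context the grammar summary leaves out. Some compilers warn about the redundant parentheses in the declaration form.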
1.1 Keywords [gram.key]
1 New context-dependent keywords are introduced into a program by
typedef (_dcl.typedef_), namespace (_namespace.def_), class (_class_),
enumeration (_dcl.enum_), and template (_temp_) declarations.
typedef_name:
        identifier
namespace_name:
        original_namespace_name
        namespace_alias
original_namespace_name:
        identifier
namespace_alias:
        identifier
class_name:
        identifier
        template_id
enum_name:
        identifier
template_name:
        identifier
Note that a typedef_name naming a class is also a class_name
(_class.name_).
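As a rough illustration of the note above, the following sketch uses a typedef_name in positions where a class_name is expected. The names Point, Pt, Point3 and dims are hypothetical and appear only in this example; it is not taken from the parser or from the book.

```cpp
#include <cassert>

// Hypothetical class used only for this illustration.
struct Point {
    int x;
    int y;
    static int dimensions() { return 2; }
};

// Pt is introduced by a typedef declaration, so it is a typedef_name.
// Because it names a class, it is also usable as a class_name.
typedef Point Pt;

// A base-specifier needs a class name; the typedef name is accepted.
struct Point3 : Pt {
    int z;
};

// The typedef name also works as a nested-name-specifier.
int dims() {
    return Pt::dimensions();
}

// Small self-check: members inherited through the typedef'd base exist.
void self_check() {
    Point3 p;
    p.x = 1;
    p.y = 2;
    p.z = 3;
    assert(p.x + p.y + p.z == 6);
    assert(dims() == 2);
}
```

Since Pt and Point are both plain identifiers to the lexer, a parser has to record earlier declarations to know which production applies, which is why these names are described as context-dependent keywords.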
Message: 9
Date: Tue, 03 Mar 2009 11:18:49 -0500
From: Andy Tripp <antlr at jazillian.com>
Subject: Re: [antlr-interest] Java Parser to analyze C++ language
syntax
To: Kaleb Pederson <kaleb.pederson at gmail.com>
Cc: antlr-interest at antlr.org
Message-ID: <49AD5869.9030502 at jazillian.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Kaleb Pederson wrote:
> In looking at LLVM a bit more closely, although both the FAQ and the
> features page mention full C++ support,
"full C++ support" - heh, that would be impossible ;)
Seems like pretty much any random sequence of characters would be
valid C++.
> One note on David Wigg's C++ grammar, it seemed well documented
> although, IIRC, it used a slightly patched version of ANTLR or its
> runtime which makes it a bit more difficult to use than would be
> typical.
IIRC, Wigg's C++ grammar chokes if you ever give it enough ugly C++
to chew on. As Terence quips in his book, "Back when C++ was almost
parsable...".
Andy
>
> --Kaleb
>