[antlr-interest] advocacy of C++ support in ANTLR 3.x

Jim Idle jimi at temporal-wave.com
Tue Apr 1 09:53:39 PDT 2008


Please read the comments in the source for common tree adaptor and base tree adaptor before attempting this, as well as http://www.antlr.org/api/C/index.html. 

 

In the C version, all adaptors and so on should return a pointer to pANTLR_BASE_TREE, which should be contained within your own tree nodes (which can contain anything so long as they have an ANTLR_BASE_TREE interface). That interface contains a pointer to the higher level structure, such as COMMON_TREE, which in turn can point to an even higher level tree. But you need to implement an adaptor, which will handle the tree for you and which the generated code will use. The adaptor needs to provide the methods in the BASE_TREE_ADAPTOR. You can probably create a COMMON adaptor, then install pointers to your own methods for those that won't work as is. To be honest though, I don't know of anyone who is doing this, so you may be pioneering here, though the standard implementation uses the same mechanisms, so it must 'work' ;-)
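
The "create a COMMON adaptor, then install pointers to your own methods" idea could look roughly like the sketch below. It is only an illustration of the mechanism: DemoTree, DemoAdaptor and every function name here are made up for the example and are not the ANTLR3 C runtime API. It is written as C++ but deliberately uses C-style function pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tree node; NOT a runtime type.
struct DemoTree {
    std::vector<DemoTree*> children;
};

// Hypothetical adaptor: its "methods" are plain function pointers.
struct DemoAdaptor {
    void (*addChild)(DemoAdaptor* self, DemoTree* parent, DemoTree* child);
    std::size_t (*getChildCount)(DemoAdaptor* self, const DemoTree* t);
};

// Default behaviour: append whatever is passed in.
static void defaultAddChild(DemoAdaptor*, DemoTree* parent, DemoTree* child) {
    assert(parent != nullptr);
    parent->children.push_back(child);
}

static std::size_t defaultGetChildCount(DemoAdaptor*, const DemoTree* t) {
    assert(t != nullptr);
    return t->children.size();
}

// Our replacement: silently ignore null children, otherwise defer to the default.
static void skipNullAddChild(DemoAdaptor* self, DemoTree* parent, DemoTree* child) {
    if (child == nullptr) return;
    defaultAddChild(self, parent, child);
}

// The "common" adaptor with every slot filled in.
DemoAdaptor makeCommonAdaptor() {
    DemoAdaptor a;
    a.addChild = defaultAddChild;
    a.getChildCount = defaultGetChildCount;
    return a;
}

// Start from the common adaptor and replace only the slot that does not fit.
DemoAdaptor makeCustomAdaptor() {
    DemoAdaptor a = makeCommonAdaptor();
    a.addChild = skipNullAddChild;
    return a;
}
```

Code that only ever calls through the adaptor (for example a.addChild(&a, parent, child)) picks up the replacement without knowing it exists, while getChildCount keeps its default implementation.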

 

It would seem that in your case you will want both an adaptor and a tree implementation. You might find it just as easy to implement the standard tree, then use a tree grammar to construct your own tree, though you should not HAVE to do this.
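
To give a feel for the "build the standard tree first, then construct your own tree from it" route, here is a small sketch. Note that it swaps in a hand-written recursive walk where the suggestion above is a tree grammar, purely to keep the example self-contained. GenericNode and AppNode are hypothetical types invented for the illustration, and holding a std::vector of the enclosing type as a member assumes C++17 or a standard library that tolerates it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the tree a parser would build by default.
struct GenericNode {
    std::string text;
    std::vector<GenericNode> children;
};

// Hypothetical application AST node with its own storage.
struct AppNode {
    std::string label;
    std::vector<std::unique_ptr<AppNode>> kids;
};

// Recursively copy the generic tree into the application's own node type.
std::unique_ptr<AppNode> convert(const GenericNode& in) {
    std::unique_ptr<AppNode> out(new AppNode());
    out->label = in.text;
    for (const GenericNode& child : in.children) {
        out->kids.push_back(convert(child));
    }
    // The shape is preserved: one converted child per input child.
    assert(out->kids.size() == in.children.size());
    return out;
}
```

The cost of this route is a second pass and a temporary tree; the benefit is that the existing AST classes do not need to know anything about the parser runtime.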

 

Jim

 

From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Tomas Potrusil
Sent: Tuesday, April 01, 2008 3:58 AM
To: ANTLR
Subject: Re: [antlr-interest] advocacy of C++ support in ANTLR 3.x

 

I was wrong. I do not need to "override" a tree, but a tree adaptor! Investigating the mailing-list and the source code I've found that the generated parser uses just the adapter and not the tree directly. But then there is something strange in the current C runtime:

 

In the Java runtime the tree adaptor interface works with "Object" objects only. Of course it must abstract access to the real tree nodes: it is an adaptor, not just an object factory. Terence Parr says in the documentation: "Rather than have a separate factory and adaptor, I've merged them."

 

The C runtime mirrors the Java version, but instead of working with void* ("Object" in C) it works directly with ANTLR3_BASE_TREE. It is not an adaptor anymore, it is just an object factory. Methods like

ANTLR3_TREE_ADAPTOR::addChild(...adaptor, pANTLR3_BASE_TREE t, pANTLR3_BASE_TREE child)

are useless, because everyone can call t->addChild(child) directly.

 

This prevents me from using our existing AST C++ classes within ANTLR without "subclassing" them from ANTLR3_COMMON_TREE, doesn't it...
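
For comparison, an adaptor in the Java sense, one that only ever sees opaque handles, might be sketched like this in C++. This is a hypothetical design, not something the C runtime provides: OpaqueTreeAdaptor, OurNodeAdaptor and OurNode are names invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// An existing AST class that knows nothing about the parser runtime (hypothetical).
struct OurNode {
    std::string label;
    std::vector<OurNode*> kids;
};

// Adaptor interface that sees nodes only as opaque handles,
// playing the role "Object" plays in the Java interface.
class OpaqueTreeAdaptor {
public:
    virtual ~OpaqueTreeAdaptor() = default;
    virtual void* create(const std::string& text) = 0;
    virtual void addChild(void* parent, void* child) = 0;
};

// The only place that knows the handles are really OurNode objects.
class OurNodeAdaptor : public OpaqueTreeAdaptor {
public:
    void* create(const std::string& text) override {
        owned_.push_back(std::unique_ptr<OurNode>(new OurNode()));
        owned_.back()->label = text;
        return owned_.back().get();
    }

    void addChild(void* parent, void* child) override {
        assert(parent != nullptr && child != nullptr);
        // Unchecked casts: the adaptor trusts that it created both handles.
        static_cast<OurNode*>(parent)->kids.push_back(static_cast<OurNode*>(child));
    }

private:
    std::vector<std::unique_ptr<OurNode>> owned_;  // adaptor owns every node it creates
};
```

With an interface of this shape, addChild on the adaptor is no longer redundant: the caller has no t->addChild to call, because it never learns the node type.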

 

Tom

 

From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: Monday, March 31, 2008 1:39 AM
To: ANTLR
Subject: Re: [antlr-interest] advocacy of C++ support in ANTLR 3.x

 

You will probably find it best to override pANTLR3_COMMON_TREE by encapsulating this within your own structure, as per the docs. This, as all the structures are, is a set of pointers to functions and you need only override the ones that you have to, just as in Java. Runtime type checking 'can' be an overhead, so I am not sure you would want to do that anyway, but I will contemplate your suggestion of course as it has some merit.
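
The encapsulation described here, a base structure of function pointers embedded in your own structure and pointing back at it, might be sketched as follows. All names (DemoBaseTree, SqlNode, the init functions) are hypothetical and chosen for the illustration; the real runtime structures have many more members.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a base tree: data plus overridable function pointers.
struct DemoBaseTree {
    void* super;                                  // the enclosing, higher-level node
    std::string (*toString)(DemoBaseTree* self);  // overridable "method"
};

static std::string baseToString(DemoBaseTree*) { return "<node>"; }

void initBaseTree(DemoBaseTree& t) {
    t.super = nullptr;
    t.toString = baseToString;
}

// Our own node encapsulates the base tree rather than inheriting from it.
struct SqlNode {
    DemoBaseTree base;
    std::string sqlText;
};

// Override: reach the enclosing SqlNode through the super pointer.
static std::string sqlToString(DemoBaseTree* self) {
    assert(self != nullptr && self->super != nullptr);
    return static_cast<SqlNode*>(self->super)->sqlText;
}

// Set up the defaults, then replace only what needs replacing.
// Note: the node must not be copied or moved afterwards, since super points at it.
void initSqlNode(SqlNode& n, const std::string& text) {
    initBaseTree(n.base);
    n.base.super = &n;
    n.base.toString = sqlToString;
    n.sqlText = text;
}
```

Generic code holding only a DemoBaseTree* calls t->toString(t) and gets the SqlNode behaviour, which is the function-pointer equivalent of overriding a virtual method.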

 

Jim

 

From: Tomas Potrusil [mailto:potrto at centrum.cz] 
Sent: Friday, March 28, 2008 5:43 AM
To: Jim Idle
Cc: ANTLR
Subject: RE: [antlr-interest] advocacy of C++ support in ANTLR 3.x

 

Oh yes, I know. I've already made a prototype implementation of a part of the grammar based on the idea I presented below (atom returns [OurNode* result] etc.). It is working but it is a little bit clumsy and I cannot use the resulting AST for tree parsing - of course, I'm creating my own AST.

 

I've been thinking about the new tree adapter (the one I was talking about below) and you are probably right, a few C++ wrappers could do the work. But there is one inconvenience - there is not an "abstract" tree yet. The most abstract tree is ANTLR3_BASE_TREE_struct, which contains the children vector and other attributes. An ANTLR3_TREE_struct with only pointers to functions (something like a Java interface) would suit my needs better. Our existing AST nodes solve the storage already... Could you do it, please?

 

Another problem is safety. When somebody calls ANTLR3_BASE_TREE_struct::addChild(pANTLR3_BASE_TREE tree) for example, I must trust that the tree argument is really the same kind of tree as the one being called. I cannot write dynamic_cast<MyTreeWrapper*>(tree->super), because super is only a void*. This cannot be solved in the current C-based system.
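
A partial mitigation, sketched below, is to make every wrapper derive from one polymorphic base and always store a pointer to that base in super. The check then becomes a static_cast to the common base followed by a dynamic_cast. This does not remove the trust problem, it only narrows it: the code still has to assume that super was set by WrapperBase. All names except MyTreeWrapper are hypothetical, and DemoBaseTree is a stand-in, not the runtime structure.

```cpp
#include <cassert>

// Hypothetical stand-in for the C tree structure with its untyped back-pointer.
struct DemoBaseTree {
    void* super = nullptr;
};

// Common polymorphic base for all wrappers; it alone writes the super pointer.
struct WrapperBase {
    virtual ~WrapperBase() = default;
    WrapperBase(const WrapperBase&) = delete;             // super points at this object,
    WrapperBase& operator=(const WrapperBase&) = delete;  // so copying would dangle
    DemoBaseTree tree;
protected:
    WrapperBase() { tree.super = static_cast<WrapperBase*>(this); }
};

struct MyTreeWrapper : WrapperBase {
    int payload = 0;
};

struct OtherWrapper : WrapperBase {};

// Returns nullptr if the tree is wrapped by something other than MyTreeWrapper.
// Unchecked assumption: a non-null super was stored by WrapperBase's constructor.
MyTreeWrapper* asMyTree(DemoBaseTree* t) {
    assert(t != nullptr);
    if (t->super == nullptr) return nullptr;
    WrapperBase* base = static_cast<WrapperBase*>(t->super);
    return dynamic_cast<MyTreeWrapper*>(base);
}
```

An addChild override could call asMyTree on its argument and reject foreign trees, at the price of the runtime type check mentioned earlier in the thread.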

 

Tom

 

From: Jim Idle

 

ANTLR 3.1 C target can now incorporate C++ code directly into the grammar and so can easily call your existing C++ code. All you do is compile the C output file as C++ (or rename it to .cpp perhaps). 

Can you try using that and let me know if you think that there is anything that you could do if the runtime was C++ that you can't do right now? I don't really think that there will be.

You need to get the latest 3.1 snapshot from the downloads page and use the ANTLR Tool jar in there. Then build the ANTLR 3.1 C runtime from the tar.gz in the dist directory under the runtime/C directory in the snapshot. 3 or 4 people have successfully integrated their C++ code with the C target now and I think you will have similar success :-)

Jim

 

-----

Hello,

 

I'm new to the list. I'm trying to use ANTLR to generate a SQL parser because our current parser, which was generated by Lex/Yacc, doesn't support Unicode input. We use C++ and we have our own complex AST that is already used by a SQL engine... So my idea is to write a tree adapter that would create our existing AST nodes (they would just inherit the ANTLR tree interface).

 

And here comes the problem: ANTLR 3.x doesn't support a "pure" C++ implementation. I've just found Jim Idle's "promise":

 

> Later I may well produce a complete C++ implementation from scratch,
> however, at this point I am not sure that it buys you anything. Please
> let me know if there are things you cannot do with the system as it
> stands (other than access the tokens and so on using C++ objects, which
> will be done later).

 

I know that the problem could be solved with the current system somehow, but it would probably be very ugly. So yes, a complete C++ implementation will buy us something! Or we can use ANTLR 2.x...

 

Right now we will probably try to build the AST by hand:

 

atom returns [OurNode* result]
@init { $result = NULL; }
    : NUMBER
      {
          std::string str((char*)$NUMBER.text->chars, $NUMBER.text->len);
          $result = new OurNumberNode(str);
      }
    ;

 

Or do you have some other ideas?

 

Thanks

 

Tom

 


