[antlr-interest] Manual Tree Walking Vs. Tree Grammars

Patrick Niemeyer pat at pat.net
Tue Nov 2 15:59:31 PDT 2010


I had been building up the experience to post something thoughtful on this topic, but I'll just chime in now :)

First let me say that I'm very impressed with ANTLR and it is making my job a lot easier these days.

I recently had to start a project with a very large grammar, and I looked at the tree grammar pattern / capability that is presented in the ANTLR book.  While I can see that it is a fairly elegant solution from a theoretical perspective, it seems like it would be awkward to maintain in real-world code.  The first problem that I see is that I'd have two very large grammar files that I'd have to keep in sync.  The second problem is that even with a helper facade I'd still be creating a third layer that essentially ties together all of the code...  The solution to this seems to me to be to go with heterogeneous node types, which can encapsulate knowledge of the language in a nice object-oriented way.
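
To make the heterogeneous-node idea concrete, here is a minimal sketch for a toy expression language. It does not use the ANTLR runtime, and every class name in it (ExprNode, IntNode, AddNode, MulNode, HeteroNodesDemo) is hypothetical; it only illustrates how each node type can carry its own behavior so that no separate tree grammar or helper layer is needed for evaluation.

```java
// Hypothetical node hierarchy: each node type knows how to evaluate itself.
abstract class ExprNode {
    abstract int eval();
}

// Leaf node holding an integer literal.
final class IntNode extends ExprNode {
    private final int value;

    IntNode(int value) {
        this.value = value;
    }

    @Override
    int eval() {
        return value;
    }
}

// Interior node for addition; evaluates both children and adds the results.
final class AddNode extends ExprNode {
    private final ExprNode left;
    private final ExprNode right;

    AddNode(ExprNode left, ExprNode right) {
        this.left = left;
        this.right = right;
    }

    @Override
    int eval() {
        return left.eval() + right.eval();
    }
}

// Interior node for multiplication.
final class MulNode extends ExprNode {
    private final ExprNode left;
    private final ExprNode right;

    MulNode(ExprNode left, ExprNode right) {
        this.left = left;
        this.right = right;
    }

    @Override
    int eval() {
        return left.eval() * right.eval();
    }
}

class HeteroNodesDemo {
    public static void main(String[] args) {
        // Tree for 1 + (2 * 3), built by hand here; a parser would normally build it.
        ExprNode tree = new AddNode(new IntNode(1),
                                    new MulNode(new IntNode(2), new IntNode(3)));
        System.out.println(tree.eval());
    }
}
```

The trade-off is that language knowledge moves out of a second grammar file and into the node classes, so adding a construct means adding one class rather than touching a parser grammar, a tree grammar, and a helper facade.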

I was happy to see that there is support in ANTLR for specifying heterogeneous node types in the grammar directly, although I almost missed it because it seems to have been added after the ANTLR book was written (it appears only as a footnote).  I have found that the support is a little preliminary, and using these node types basically means that you can't use the ANTLRWorks debugger at all, among other things.  I'm sure that situation will improve.  (I will volunteer to help if I can.)

Beyond that, my only other request is that I'd like to see a little more solidification of the API for node types and tokens in general.  If you are walking a node stream and want to navigate things like UP/DOWN and error nodes, you have to resort to shenanigans like comparing strings ("UP"/"DOWN") and knowing that node type 0 is reserved as "<invalid>", etc.
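
As an illustration of the kind of API that would avoid those string comparisons, here is a small sketch that walks a flattened node-type stream using named integer constants. It is self-contained and does not call the ANTLR runtime. The constant values are an assumption (they are believed to mirror the DOWN, UP, and invalid token types in ANTLR 3's Token interface, so check them against your runtime version), and NodeTypes, StreamWalker, and the PLUS/MULT/INT type numbers are hypothetical names made up for this example.

```java
// Named constants for the navigation and error node types.
// Assumption: these values mirror ANTLR 3's Token.INVALID_TOKEN_TYPE, Token.DOWN, Token.UP.
final class NodeTypes {
    static final int INVALID = 0;
    static final int DOWN = 2;
    static final int UP = 3;

    private NodeTypes() {
    }
}

final class StreamWalker {
    private StreamWalker() {
    }

    // Returns the maximum nesting depth seen in a flattened stream of node types.
    static int maxDepth(int[] types) {
        int depth = 0;
        int max = 0;
        for (int t : types) {
            if (t == NodeTypes.DOWN) {
                depth++;
                max = Math.max(max, depth);
            } else if (t == NodeTypes.UP) {
                depth--;
            }
        }
        return max;
    }

    // Counts error nodes, identified by the reserved invalid type.
    static int countInvalid(int[] types) {
        int count = 0;
        for (int t : types) {
            if (t == NodeTypes.INVALID) {
                count++;
            }
        }
        return count;
    }
}

class StreamWalkerDemo {
    public static void main(String[] args) {
        // Hypothetical user token types for this example only.
        final int PLUS = 4, MULT = 5, INT = 6;
        // Flattened form of ^(PLUS INT ^(MULT INT INT)).
        int[] stream = {PLUS, NodeTypes.DOWN, INT, MULT, NodeTypes.DOWN, INT, INT,
                        NodeTypes.UP, NodeTypes.UP};
        System.out.println(StreamWalker.maxDepth(stream));
        System.out.println(StreamWalker.countInvalid(stream));
    }
}
```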

I'd be open to being proved wrong on the tree grammar front... if anyone has examples of how it simplified their code, etc.


Pat Niemeyer


On Nov 2, 2010, at 2:48 PM, Amr Muhammad wrote:

> Hello,
> 
> In this post: http://www.antlr.org/pipermail/antlr-interest/2010-October/039862.html
> The following was mentioned:
> 
>> Also, remember to only call external Helper methods from your parsers/tree
>> walkers. Do not embed any code other than the calling code and pass the
>> whole tree or token pointer. This means your calls won't care what gets done
>> by the helper API and the helper API will not care how the parsers decided
>> to call it. Anything else is an unmaintainable mess.
>> 
>> 
> So,
> does this imply that it is easier to walk the AST manually rather than
> embedding actions in the tree grammar ?
> 
> Based on what I have tried so far, it seems that getting the embedded
> actions to work, as expected, is not easy. So, I'd like to know if there is
> some benefit that I would get out of writing embedded actions in tree
> grammars?
> 
> Also, there is this post that seems to advocate manual tree walking:
> http://www.antlr.org/article/1170602723163/treewalkers.html
> 
> So, I'm confused as to whether to continue trying to make tree grammars do what
> I want, or switch to manual tree walking. Appreciate your guidance...
> 
> Thank you for your time :)
> Best Regards,
> 
> Amr Muhammad
> Cairo Univ. Computer Eng. Grad.
> twitter:@amrmuhammad <http://twitter.com/amrmuhammad>
> 

Pat




