[antlr-interest] Annotation tool best practices

Tue Apr 27 09:52:03 PDT 2004

>>>>> "micheal" == micheal jor <open.zone at virgin.net> writes:
[...]

> I have been struggling a little to understand the extent to which the
> annotation tool is useful and whether a different idea is needed to
> support grammar maintenance.

ObDisclaimer:  I haven't tried using Bogdan's Annotation tool.  I have
played with a number of approaches to grammar management.

[...]

> Here, we've had to move the rule name from the template grammar to the
> insert file. Note that this might be required for many rules and, that a
> single rule with multiple occurrences of "dottedName" might be linked to
> multiple flavours via similar inserts. This seems a little "wrong" to me
> somehow. I'd always imagine that as long as the *.tg file contained the
> grammar and just additional tags for the insert points, it wasn't too bad
> a trade-off. I get to view a cleaner, bare bones grammar and with a good
> naming scheme I can even "read into the insert file" just from the
> template grammar.

Basically, that's a simplified "aspect oriented programming" (AOP)
approach.  AOP is currently a fairly hot subject area and so there's a lot
of hype and a fair bit of information floating around.

> Heck, maybe I'm missing some insight here. How do you solve such issues
> in your projects?. Do you use the annotation tool or another
> tool?. Perhaps you have a very different solution based on entirely
> different concepts from the text replacement preprocessing at work
> here. Enquiring minds would like to know....

For a concrete and available example, check out what Monty did with the
GnuC project with "literate programming" (ala Knuth via noweb).  That
allows for nice, cross-phase locality (i.e., all of those versions of
expression right next to each other).  One of my dislikes for that approach
is that the intermingling makes it more difficult to just look at all of
the rules for a single phase (in the editor).  On my play-with to-do list
is to investigate using a "literate" approach with a good outline editor so
that I could fold out all of the stuff that I didn't want to deal with at
any given point in time (think: different, editable views of a single
master source file).

Another "trick" you can do with the tree phases is to create a single,
superset tree definition that encompasses all of the needs of all of the
different phases.  That way, each phase can reuse the exact same tree
grammar and all you're changing in each phase is the actions.  IME, that
works well for sets of phases, but there need to be explicit breaks to
deal cleanly with more fundamental shifts.
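
A minimal sketch of the superset-tree idea, in Python rather than ANTLR
purely to keep it short.  Everything here (Node, Actions, walk, and the two
example phases) is a hypothetical illustration, not code from any real
project: one traversal plays the role of the shared tree grammar, and each
phase only supplies different actions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                                      # one superset of node kinds for all phases
    text: str = ""
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)      # slots that any phase may fill in


class Actions:
    """Default actions do nothing; each phase overrides only what it needs."""

    def leave(self, node):
        pass


def walk(node, actions):
    """The single shared 'tree grammar': the same traversal for every phase."""
    for child in node.children:
        walk(child, actions)
    actions.leave(node)            # post-order: children are done before the parent


class FoldConstants(Actions):
    """Hypothetical phase 1: compute a value for NUM and ADD nodes."""

    def leave(self, node):
        if node.kind == "NUM":
            node.attrs["value"] = int(node.text)
        elif node.kind == "ADD":
            node.attrs["value"] = sum(c.attrs["value"] for c in node.children)


class CountNodes(Actions):
    """Hypothetical phase 2: same walk, entirely different action."""

    def __init__(self):
        self.count = 0

    def leave(self, node):
        self.count += 1


tree = Node("ADD", children=[Node("NUM", "1"), Node("NUM", "2")])
walk(tree, FoldConstants())
counter = CountNodes()
walk(tree, counter)
```

A phase that needs a genuinely different tree shape would get its own
walker, which is the "explicit break" mentioned above.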

In the majority of my experience, I keep it simple (if not simplistic :-)
and have everything in separate files.  When I want to deal with a
single rule across multiple phases, I bring up all of the relevant files
into individual Emacs windows and I draw any needed diagrams on a
whiteboard or pad of paper.  IME, the issue for me is much less about
consciously dealing with the isolated, individual (sets of) rules across
multiple files and much more about needing a true grammar-level "diff"
tool to catch those cases where I wasn't conscious (or smart) enough to
realize that a particular change in one spot needed to be accounted for in
other spots.
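
To make the "diff" wish a bit more concrete, here is a rough sketch of a
rule-level comparison in Python.  It is an assumption-laden toy, not a true
grammar-level diff: it treats a rule as "name : body ;" starting at the
beginning of a line, and it will be fooled by semicolons inside actions or
string literals.  The function names are made up for the example.

```python
import re

# Naive rule matcher: an identifier at the start of a line, a colon, then
# everything up to the next semicolon.  Assumes no ';' inside the rule body.
RULE = re.compile(r"^\s*([A-Za-z_]\w*)\s*:(.*?);", re.MULTILINE | re.DOTALL)


def rules(grammar_text):
    """Map rule name -> body with whitespace collapsed, so layout changes are ignored."""
    return {name: " ".join(body.split()) for name, body in RULE.findall(grammar_text)}


def rule_diff(old_text, new_text):
    """Report which rule names were added, removed, or changed between two grammars."""
    old, new = rules(old_text), rules(new_text)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
    }


old = "expr : term (PLUS term)* ;\nterm : INT ;\n"
new = "expr : term ((PLUS | MINUS) term)* ;\nterm\n  : INT\n  ;\nunary : MINUS term ;\n"
report = rule_diff(old, new)
```

The list of changed rule names is then a checklist of rules to revisit in
the other phases' files.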

One of the keys in my thinking about trees is how they look (i.e.,
structure) before the phase is run and how they look afterwards.  So, the
vast majority of my external documentation for a tree translator consists
of diagrams to help visualize the nasty corners of the transformations.

Finally, on a related subject, there seem to be two schools of thought
on how to use grammar based approaches.  One is the "pure" school which
says that the grammar drives the system.  The other is that the grammar is
a slave to the rest of the system.  The argument, IMHO, isn't one versus
the other but rather (a) which one to use for which needs and (b) whether
or not it's acceptable to mix the approaches.

In terms of (b), I'm unsure about, but concerned by, your dottedNameXYZ
example as it's unclear to me why you (feel that you) need to take that
variation approach.  I.e., why isn't there only a single rule in the
grammar that provides the underlying information such that whatever is
consuming that information can extract it as appropriate?
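
As a sketch of what I mean by a single rule (Python for brevity; I don't
know what your dottedNameXYZ flavours actually distinguish, so the
"package" and "qualified type" consumers below are hypothetical stand-ins):
the one rule only recognizes the name and hands back its parts, and each
consumer decides what those parts mean.

```python
def dotted_name(tokens):
    """Single rule: IDENT ('.' IDENT)*.  Returns the identifier parts as a tuple."""
    if not tokens or not tokens[0].isidentifier():
        raise SyntaxError("expected identifier")
    parts = [tokens[0]]
    i = 1
    while i < len(tokens) and tokens[i] == ".":
        if i + 1 >= len(tokens) or not tokens[i + 1].isidentifier():
            raise SyntaxError("expected identifier after '.'")
        parts.append(tokens[i + 1])
        i += 2
    return tuple(parts)


def as_package(parts):
    """Hypothetical consumer: the whole name is a package path."""
    return ".".join(parts)


def as_qualified_type(parts):
    """Hypothetical consumer: the last part is a type, the rest is its package."""
    return ".".join(parts[:-1]), parts[-1]


parts = dotted_name(["java", ".", "util", ".", "List"])
package = as_package(parts[:-1])
qualified = as_qualified_type(parts)
```

If the flavours really do need different syntax, rather than different
interpretations of the same syntax, then this does not apply.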

Hope this helps,
		John
