[antlr-interest] Appropriate use of honey badger listeners

Kyle Ferrio kferrio at gmail.com
Thu Jan 12 17:28:11 PST 2012


Sam, I think I understand that well enough to use safely.  :)
On Jan 12, 2012 5:23 PM, "Sam Harwell" <sam at tunnelvisionlabs.com> wrote:

> You could use a generic ContextAnnotator type, which can be created as
> necessary to associate an arbitrary type with parse tree nodes in a way
> that
> can be easily shared among multiple listeners:
>
> class ContextAnnotator<T> {
>    private final Map<ParserRuleContext<?>, T> data = new
> IdentityHashMap<ParserRuleContext<?>, T>();
>    public T getData(ParserRuleContext<?> context) { return
> data.get(context); }
>    public void setData(ParserRuleContext<?> context, T value) {
> data.put(context, value); }
> }
>
> class SomeListener extends BlankMyParserListener {
>    private final ContextAnnotator<ExtraInfo> extraInfo;
>
>    public SomeListener(ContextAnnotator<ExtraInfo> extraInfo) {
>        this.extraInfoAnnotator = extraInfoAnnotator;
>    }
> }
>
> --
> Sam Harwell
> Owner, Lead Developer
> http://tunnelvisionlabs.com
>
>
> -----Original Message-----
> From: Terence Parr [mailto:parrt at cs.usfca.edu]
> Sent: Thursday, January 12, 2012 5:40 PM
> To: ANTLR Interest Mailing List
> Subject: Re: [antlr-interest] Appropriate use of honey badger listeners
>
>
> On Jan 11, 2012, at 8:39 PM, Kyle Ferrio wrote:
>
> > Hi Ter,
> >
> > I don't have answers, but I have questions which might at least cut off
> some dead ends...
> >
> > I agree that point #2 seems pretty clear.  And I realize that this:
> >
> >     newtree = listener.someEvent(oldtree);
> >     if ( newtree!=oldtree ) replace-oldtree-with-newtree;
> >
> > is pseudo-code.  You probably have this all worked out, but it occurs to
> me that comparing two trees may be more expensive than necessary.
>
> I would be comparing pointers. if the pointers are different, I would slip
> in the new pointer. only the replace operation is pseudocode. In reality it
> would simply be "set the ith child pointer to the new pointer".
>
> >   Despite the fact that walking a tree of n nodes is O(n), and so
> comparing two trees (non-naively) is also O(n), there may be a lot of data
> in those nodes to compare.  If all we care about is detecting changes, a
> significant speedup is possible if the interface for a tree object (or for
> listeners acting on trees?) includes a flag that gets flipped by any
> operation which has the possibility of changing the tree.  (N.B.
> Possibility, not guarantee.  A guarantee would require the eval we're
> trying
> to avoid.)  This tiny bit of added API built into a Listener baseclass for
> processing tree changes (which are all covered by a small, finite set of
> generic operations) would turn the subsequent O(n) comparison into O(1).
> I'm not sure how this fits with your design.  Of course, all bets are off
> if
> someone messes with the tree directly.  (N.B. Alternatively, a hash on
> trees
> would be almost as cheap and require no extra API, but that only works if
> we're willing to treat topologically equivalent trees with different memory
> layouts (e.g. flipping a tree left-to right) as different.)
>
> I don't think we need to do anything fancy. All we are doing is altering
> pointers in the tree. If you want to delete a subtree, return null. If you
> want to leaves the tree as it is, return the original pointer. If you want
> to alter a subtree, return a pointer to the new tree.
>
> > Ok, now point #1.  Not sure how to approach this, but...
> >
> > Thought: If the root of a rule r which returns a value does not declare
> the type of value expected, two questions arise. One, where/when does the
> antlr tool perform type-checking for the rules which consume rule r?
>
> ANTLR would not be involved in this case because all ANTLR does is create
> the parse trees. Our listener would be annotating those trees.
>
> > Two, where/when does the antlr tool perform type-checking of the retvals
> from listeners for different alts of r?
>
>  Again, I don't think ANTLR is involved here. the only way it would get
> involved as if we defined return values or parameters that get shoved into
> the context objects. And the context objects are the parse tree nodes.
>
> > Perhaps an explicit decoration/declaration of a type T at the root of r
> is
> the easiest, clearest and safest way to communicate to all of the listeners
> of all of the alts of r that they must return type T.  My thinking here is
> probably clouded by my preternaturally strong preference for static typing.
>
>
> I too am leaning toward some static typing, but I hesitate to put "returns
> [T v]" into the grammar because T it is application-specific. and, more
> importantly, it is pass specific. each pass might need a different type so
> we would need to do something like "returns [T v, U w]" but then the
> grammar
> is specifically tied to a particular application and literally the parser
> would not compile without having those types available. there would be no
> way to provide such a generic grammar to users on the ANTLR website.
>
> We need a way to decorate or annotate a node in a type safe way.  To do
> that
> in Java, we have to use of classes or interfaces or something. Certainly,
> we
> can provide a factory that creates the various contexts to use our
> subclasses of ParserRuleContext like T and U. The only problem is that the
> tree node types can only be sent once: on construction.
>
> What if we had some kind of adapter that fed the listeners proxies? So the
> original parse tree has generic nodes like ruleContext and we need T and U
> for 2 different phases. As the parse tree walker moved along, it could wrap
> the generic ruleContext objects with a proxy, of type T, that had the extra
> information we need. T with points at ruleContext instead of using
> inheritance.  If we didn't reuse the proxies, however, we would have to
> create a new object for every listener event. That's not the end of the
> world as I do that during parsing. But, we could think about reusing proxy
> objects, though I doubt we would know when the application was going to
> refer to multiple contexts. As long as the proxy object popped into
> existence for the listener event and popped out of existence afterwards,
> should not burden the garbage collector (which does not walk dead objects).
>
> let's see what this would look like. Here is a chunk of the parse tree
> walker:
>
> ParseTree.RuleNode r = (ParseTree.RuleNode)t;
> enterRule(listener, r);
> int n = r.getChildCount();
> for (int i = 0; i<n; i++) {
>    walk(listener, r.getChild(i));
> }
> exitRule(listener, r);
>
> And let's say we have an adapter that knows how to create wrappers for
> parse
> tree nodes:
>
> adaptor.wrap(t); // returns type T or U or whatever you want
>
> may be like this:
>
> class T {
>    ParserRuleContext delegate; // hide in a superclass somewhere
>    int v; // my field
>    public T(ParserRuleContext ctx) { delegate = ctx; }
> }
>
> class myAdaptor<T> implements ParseTreeWrapper {
>    public T wrap(ParserRuleContext t) { return new T(t); }
> }
>
> Then, all we have to do is change the enter event to look like this:
>
> enterRule(listener, adaptor.wrap(r));
>
> The listener event would have to cast the ParserRuleContext to T or U etc.
>
> public void exitRule(AParser.multContext ctx) {
>    T t = (T)ctx;
>    t.v = t.a.v * t.b.v;
> }
>
> MUCH cleaner and type safe.
>
> OTOH, we could reverse it by adding a pointer to every parse tree node that
> points to an object with extra fields. Only those parse trees that need
> extra fields would have a pointer to an auxiliary object or object
> extension
> if you will.
>
> Or, As Sam said, we could provide a factory for creating the various parse
> tree nodes. If our application needed object extensions, we could pass in a
> factory that created special context objects that had our fields. Those
> objects would have to have a union of all the values needed by all phases,
> but that's not necessarily a bad thing. It's often the case that I want my
> type computation information to persist through multiple phases all the way
> through to code generation. On the other hand, if we have lots of temporary
> values we need just for computation, the tree node would have lots of extra
> fields, mucking up our class. I suppose nothing beats the simplicity of a
> factory creating the right objects. Then we can create a class like this
>
> class T extends ParserRuleContext {
>    int v; // temp value
>    TypeInfo type; // persists across tree passes
> }
>
> The problem is that we would need a new class definition for every context,
> meaning every rule, in the grammar. yuck.  actually, maybe it is not too
> bad
> because we would only need to annotate some of the nodes such as eContext.
> We would still need a typecast in the listeners like:  T t = (T)ctx;
>
> It'd be simpler to add a field to ParserRuleContext and had a generic type
> because it means creating only a single object:
>
> class ParserRuleContext<T> {
>    T extension;
>    . existing stuff ...
> }
>
> All of the various context object types could point at the single type of
> extension object.  I guess it would mean adding code in the listeners like
> this:
>
> public void exitRule(AParser.multContext<T> ctx) {
>    if ( ctx.extension==null ) ctx.extension = new T();
>    T t = ctx.extension;
>    T a = ctx.a.extension; // ugh; kind of a mouthful
>    T b = ctx.b.extension;
>    t.v = a.v * b.v;
> }
>
> Hmm.
>
> one thing I should note about the stack solution. I wonder if it might get
> confusing knowing what is on the stack and what is not, particularly if not
> every event has a listener implementation (that pushed a value). If the
> values are stored in the parse tree contexts, you know that that no does
> not
> have a value. With a stack, you would see a value computed way below that
> you don't necessarily want. You would not know that it is not the value you
> want because it is a disembodied value on the stack. That said, it's
> simplicity is appealing. If we need to annotate the tree anyway, however,
> maybe we do need a solution like I describe above.
>
> > Well, I'm not sure if any of this helps.  But I'm reasonably confident
> that it does not hurt.
>
> "no animals were hurt during the production of this e-mail" :)
>
> Ter
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>


More information about the antlr-interest mailing list