[stringtemplate-interest] StringTemplate Compiler for .NET

Sam Harwell sharwell at pixelminegames.com
Sat Mar 21 12:07:37 PDT 2009


This email describes a proposal that I'm absolutely willing to
implement. As I mention at the end, I'm very interested in feedback on
the idea as will.

Before I work to optimize handling of various cases, I'm modifying
StringTemplate to have a Frozen (immutable) state. Before the object is
frozen, a call to ToString() will internally call GetAsFrozen()
(actually not quite; see #4 below) to get a frozen duplicate of the
current template, and evaluate ToString() on it. A warning will
optionally be issued to stderr when this case occurs.

When a template is frozen, all attributes are locked and the expression
tree is compiled. However, a subsequent call to Clone will return an
non-frozen copy of the template with no attributes set. The advantage
comes from several key characteristics of the cloned copies:

1. The clone gets a copy of the compiled expression tree. As long as the
pattern is not changed, the expression tree will not need recompiling
before evaluation. The calls to Freeze or GetAsFrozen will simply lock
the attribute set.

2. When the expression tree is compiled, a List of objects is created
with a bucket for every referenced attribute of every template. When
SetAttribute is called, the value is placed immediately in the proper
bin for direct lookup during evaluation. Otherwise, it goes in the
dictionary and the values are placed in bins after the expression is
compiled.

3. If a never-frozen template is cloned, the non-compiled template is
copied on write. This means if *either* copy is frozen, both copies will
get the expression tree and bin layout. The frozen copy will move
attributes to the bins immediately; the other will copy them when it is
frozen later, but it won't have to recomputed the bins.

4. Calls to GetAsFrozen copy all frozen subtemplates by reference. Two
additional methods, CloneCurrentValue and GetCurrentValueAsFrozen,
return copies including all current attributes. Clearly a call to
GetCurrentValueAsFrozen on a frozen template will simply "return this;",
and a call to Freeze on a frozen template will immediately return. With
that said, a call to ToString on a non-frozen template will actually
call GetCurrentValueAsFrozen. :)

It's my opinion (open to suggestions) that this implementation is
appropriate for the following reasons:

A. It's conceptually straightforward and therefore straightforward to
implement.
B. It addresses all remaining bottlenecks that have been described here
on the list.
C. It offers excellent performance for a variety of uses/order of
operations. It's easy to add runtime warnings for performance-degrading
uses such as calls to ToString() on non-frozen templates while
simultaneously holding on to the compiled expression so the performance
is strong on later calls to ToString() on that same template.
D. Since the concept is not reliant on either the template compilation
method or syntax/built-in functions/features of the StringTemplate
language itself, it will be portable to version 4. We've also already
shown that multiple compilation techniques can be used targeting the
CLI, and I imagine similar could be done in Java (jealous of our lambdas
yet? =) ).

-----Original Message-----
From: Volkan Ceylan [mailto:volkanceylan at gmail.com] 
Sent: Saturday, March 21, 2009 5:39 AM
To: Sam Harwell; stringtemplate-interest at antlr.org
Subject: Re: [stringtemplate-interest] StringTemplate Compiler for .NET

Hi again,

Thanks a lot Prof. Parr, now i can see Sam's code.

It will take some time before i understand your work fully, but seems
like you are following a similar path with me,
emitting dynamic methods in ActionEvaluator.cs but certainly
proceeding a lot faster than me. If you finish your
work, mine will probably become obsolete but anyway i'll keep on
working, its a practice and brainstorming for me
to perform in evenings :)

Yesterday i made a profiling session to find out why it is working so
slow in StringTemplate.ToString() or
StringTemplate.Write methods using my templates. It seemed like most
calls are to new HashTable(),
new StringTemplate(), StringTemplate.BreakTemplateIntoChunks(),
GetConstructorInfo() methods. I can understand
new StringTemplate() calls caused by GetInstanceOf() methods but why
BreakTemplateIntoChunks()??
It shouldn't be parsing templates in evaluation stage.

In ActionEvaluator.attribute(), for ANONYMOUS_TEMPLATE nodes, there is
a code like below:

StringTemplate valueST = new StringTemplate(self.Group, at.GetText());
valueST.EnclosingInstance = self;
valueST.Name = "<anonymous template argument>";
value = valueST;

This is called for all anonymous templates and all separator, format,
null etc. options in each evaluation.
Caching the template by replacing code with:

if (at.StringTemplate == null)
{
   at.StringTemplate = new StringTemplate(at.getText());
   at.StringTemplate.Name = "<anonymous template argument>";
}
StringTemplate valueST = at.StringTemplate.GetInstanceOf();
valueST.EnclosingInstance = self;
valueST.Group = self.Group;
value = valueST;

removed all these parsing calls, and it saved a lot time alone as
parsing is slow.
Your port seems to have the same code block causing problem.

Reflection method GetConstructorInfo() was called in
BreakTemplateIntoChunks()
while creating an instance of lexer class but seems like you spotted
that performance bottleneck.

There is also another potential problem i spotted while working on the
compiler:

AFAIK, StringTemplate.ToString() calls (i mean evaluation) is
thread-safe, as long as you
use a unique instance created with GetInstanceOf() methods. I see no
reason why it should't.
But in just a few places in ActionEvaluator.cs for ANONYMOUS_TEMPLATE
nodes,
and in ApplyTemplateToListOfAttributes, ApplyAlternatingTemplates
methods, an instance of
anonymous template is not used (by calling of GetInstanceOf()), but
directly anonymous
template itself is modified (this is not true for named templates,
always a new instance of
them is used). This will cause same anonymous template instance to be
modified
in different threads. If threads modify at same time, this may cause
unexpected side-effects.
Am i wrong in assuming string template instance evaluation as
thread-safe or this is a bug??

2009/3/20 Sam Harwell <sharwell at pixelminegames.com>:
> I checked in (CL5952) an update that builds a delegate
(System.Func<T>)
> inside ActionEvaluator instead of basic interpretation. I have not
> implemented caching of the delegates or optimized resolution of
> arguments other special name attributes. However, the implementation
is
> a complete working example of execution of a compiled template that
> passes the same 1200 unit tests that the interpreter does.


More information about the stringtemplate-interest mailing list