[antlr-interest] ST 4.0 planning

Mon Sep 8 17:34:44 PDT 2008

A couple of comments that are not directly tied to each other.

1. Part of putting code into an open source project is letting everybody 
else read it. Another part is letting other people contribute to it.

Before you spend a year writing code, go to some conferences that have 
nothing to do with parsing.

2. There seem to be two models for using ST currently. The first is "let
the templates run free." The second is "call a template, then call
another."

2a. The "templates run free" model is, I guess, the "best way," but
getting all the data structures set up so that the templates can be
happy takes a hell of a lot of work. If you could do that, then one
template could blissfully call another and another, referring to the
giant global data store.

2b. The "call one, call another" model makes more sense from an
immediate code perspective, especially from a parser: here's all the
context I have at the moment, go generate a little bit of text and come
back. I'll pass that string up the chain, and somebody with more or
different information can insert that string into their context.
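
A minimal sketch of the 2b style, using Python's string.Template as a
stand-in for ST. The template text and the function names render_field
and render_class are hypothetical, made up only for illustration:

```python
from string import Template

# Hypothetical templates; string.Template stands in for ST here.
FIELD = Template("    $type $name;")
CLASS = Template("class $name {\n$fields\n}")

def render_field(field_type, name):
    # The caller only knows about one field at this point,
    # so it renders that fragment and hands back a plain string.
    return FIELD.substitute(type=field_type, name=name)

def render_class(name, rendered_fields):
    # A caller further up the chain, with more context, splices
    # the already-rendered strings into its own template.
    return CLASS.substitute(name=name, fields="\n".join(rendered_fields))

fields = [render_field("int", "x"), render_field("String", "label")]
source = render_class("Point", fields)
print(source)
```

By the time render_class runs, each field is just text: the outer
template can no longer ask a field for its type or name.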

It makes sense for there to be a sort of happy medium, 2c, which maybe 
should be built on top of some kind of hierarchical base data object 
class. That way callers with fairly limited context could link a 
template, or a location (anchor? namespace? context) inside a template, 
to a concrete data element. But at the same time, the templates could be 
written to interact with each other, instead of just dumbly returning 
text. In that respect, it seems a little bit like the DOM tree model. 
Except that it shouldn't be. It should be more general purpose -- an 
in-memory database, say. (Seriously: as a thought experiment, pretend 
you have a SQL database as the ONLY thing available to the template 
engine. What do templates look like?)
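
One possible reading of the "hierarchical base data object" idea, as a
rough Python sketch. The Scope class and its child/lookup methods are
assumptions for illustration, not anything ST provides: each caller
attaches the little data it has to its own node, and a lookup walks up
the parents until it finds the name.

```python
class Scope:
    """Hypothetical hierarchical data node with parent-chained lookup."""

    def __init__(self, parent=None, /, **values):
        self.parent = parent
        self.values = dict(values)

    def child(self, **values):
        # A caller with limited context adds only what it knows.
        return Scope(self, **values)

    def lookup(self, name):
        # Walk from the innermost scope outward until the name is found.
        scope = self
        while scope is not None:
            if name in scope.values:
                return scope.values[name]
            scope = scope.parent
        raise KeyError(name)

root = Scope(package="com.example")
cls = root.child(class_name="Point")
method = cls.child(method_name="norm")

# The innermost scope can still see data set far above it.
print(method.lookup("package"), method.lookup("class_name"))
```

A template bound to the innermost node could then refer to names owned
by any enclosing node without the caller having to pass them down.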

Alternatively, think about what happens if template data access uses
XPath syntax.
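
To make the path-based idea concrete, here is a small sketch using the
limited XPath subset in Python's xml.etree.ElementTree. The XML model
and its element and attribute names are invented for the example; a real
engine would have to decide what the data tree looks like.

```python
import xml.etree.ElementTree as ET

# Hypothetical data model; element and attribute names are made up.
MODEL = ET.fromstring("""
<class name="Point">
  <field type="int" name="x"/>
  <field type="String" name="label"/>
</class>
""")

# A path selects the data; the "template" only formats what it gets.
lines = [f"{f.get('type')} {f.get('name')};" for f in MODEL.findall("./field")]

# A predicate narrows the selection without any code in the caller.
int_names = [f.get("name") for f in MODEL.findall("./field[@type='int']")]

print(lines)
print(int_names)
```

The selection logic lives in the path expression, so the calling code
does not need to pre-filter or reshape the data for each template.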

My point is that the stuff *outside* the angle brackets is easy. It's 
the interpolation and invocation and whatnot that costs all the effort. 
At some level, you'll wind up writing a PHP interpreter. What do you 
really want as your requirements?

3. I read a bunch of stuff about how ST was a functional language. And I 
remember thinking, what possible use is this to me? Do I give a rat's 
ass if my text is generated in a functional language? At the time, I was 
trying to implement automatically generated sequence numbers for things 
like test cases and symbols. It was far more difficult than it needed to 
be to inject that single simple feature. I had to support it in my code, 
rather than having it come for free in the templates. And it tended to 
taint my design. There needs to be a way to get stuff done, and if that 
means that a bunch of nerds from the Esoteric Language Institute are 
disappointed, give 'em a Klingon dictionary and send them home.
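
For illustration, one way sequence numbers could come "for free" from
the data side rather than from hand-written bookkeeping. This is plain
Python with string.Template, not ST, and the Numbered class and the
"seq" attribute name are assumptions made up for the sketch:

```python
import itertools
from string import Template

class Numbered(dict):
    """Hypothetical mapping that yields a fresh number each time "seq" is read."""

    def __init__(self, *args, start=1, **kwargs):
        super().__init__(*args, **kwargs)
        self._counter = itertools.count(start)

    def __getitem__(self, key):
        # Reading "seq" has a side effect: it advances the counter.
        if key == "seq":
            return next(self._counter)
        return super().__getitem__(key)

CASE = Template("test_$seq: $name")

data = Numbered()
lines = []
for name in ["parse empty input", "parse nested rule"]:
    data["name"] = name
    lines.append(CASE.substitute(data))

print("\n".join(lines))
```

The attribute read is deliberately stateful, which is exactly the kind
of thing a strictly functional template model rules out.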

=Austin

Terence Parr wrote:
> Dear ST-o-philes and related humans,
>
> I am starting the planning stages for ST 4.0. I begin my sabbatical  
> next May, a few months after I finish this current book. I plan on  
> writing software like crazy (for 15 months!). This will include  
> optimizing ANTLR and hopefully converting ANTLR and ST to be ANTLR v3  
> clean; no more ANTLR v2 requirement.
>
> As part of converting ST to v3 grammars, I took a look at updating  
> things.  As I looked through all the complicated code that manages  
> dynamic scopes, parameters, and nested templates I realized that there  
> is a lot of stuff going on in there.  ST grew organically from a
> simple string with holes in it to a sophisticated tree-based
> interpreter. Tree-based interpreters are much more difficult to build
> than, say, a byte code interpreter. Further, debugging ST stuff is
> quite difficult because all you have are objects and you have to chase  
> a lot of pointers through hash tables and so on to figure out what is  
> going on.  There is no code to step through related to your templates.
>
> I am contemplating moving to a JSP-like model where I generate Java  
> (or C# or Python, ...) instead of doing an interpreter. There are a  
> number of advantages:
>
> 1. In principle, we could use the retargetable architecture pattern of
> ANTLR to generate whatever source code we want; C++ and so on. The
> only requirement would be some sort of reflection still because I
> don't want attributes to be typed in ST. That means that you'd need
> RTTI for C++, which it supposedly has now.
>
> 2.  I would suspect that the templates would go much faster when  
> executing "natively" in Java.
>
> 3. You could debug templates by stepping through them just like you do  
> ANTLR parsers. Templates would translate to Java methods. Groups would  
> translate to objects.  Like JSP, we could automatically compile things  
> in the background. This means that they would go slow the first time  
> you ran the template. Also, I would have to investigate a custom class  
> loader so that I could unload templates.
>
> I'm planning on breaking with absolute backward compatibility to fix a  
> number of design flaws that came about because requirements changed  
> during the last eight years.
>
> So, it is a bit premature, but I like to have things to think about  
> while I'm waiting in line etc...
>
> The idea of generating Java code is growing on me. Note that it would
> either generate Java or stay an interpreter. I would not do both.
> Those are almost two totally separate products in terms of
> implementation.
>
> Ter
>