[antlr-interest] philosophy about translation

Tue Oct 31 08:38:48 PST 2006

Andy,

You seem to be changing your position on the issue of whether (as Anthony
put it):

"If it really *is* great, then the chances are the majority of programmers
*can't* 'easily "get it" '."

I'm leaning towards agreeing with Anthony as I said.

> >Many would argue that Java is a limited implementation of OO 
> principles 
> >as pioneered in LISP et al and later Smalltalk. Having everything inherit 
> >from java.lang.Object is plain ugly and, no multiple inheritance? AOD 
> >(AspectJ and
> >cousins) can be viewed as a series of hacks to try and 
> simulate some of the
> >important OO bits that were thrown out to make Java.
> >  
> >
> I view Java's decision to do a "limited implementation" by avoiding 
> things like multiple inheritance
> as exactly what made it successful.

Being a great tool for the job doesn't guarantee popularity. Popularity is
ultimately a measure of the tool's accessibility to average programmers
(they are the majority). Great tools are often beyond the ability of the
average programmer. Certainly to build. And often to use too.

C++ is a great tool (Java was written using it). Most Java programmers
wouldn't be able to master it. Or the domain expertise needed to build Java
itself.

> By avoiding being 
> "completely pure", 
> Java is accessible
> to average programmers.

My point exactly (not sure about the "pure OO" label though). 

Incidentally "above average" Java programmers understand the value of the
missing features and are forever trying to add them back. As I suggested,
AspectOrientedDesign in Java can be viewed as attempts to hack some of them
back into Java.

> And that's why Java is popular and 
> Smalltalk and 
> LISP are not.
> It's also why people prefer Java over C++.

It is an easier tool to use. Less powerful. But easier.

> >So, I'm tending to agree with Anthony here. Great tools 
> often require 
> >in-depth domain expertise that the majority simply don't have.
> >  
> >
> Sometimes they do, but sometimes they don't. Compilers never require 
> in-depth
> domain expertise.

Try feeding Java code or an ANTLR grammar to a C++ compiler. ;-)

> I know almost nothing about byte-code 
> generation, yet 
> I use javac
> every minute or two. I think the world would benefit from an 
> ANTLR tool 
> that was like that.

For your examples of [general purpose language] compiler and ANTLR [grammar
language compiler], the domain expertise isn't primarily about the internals
of the tool. It's about the syntax, semantics and idioms of the language
recognized by the tool. Knowledge of the tool's internals can elevate those
who have it above the "average" user who doesn't.

> >>I think a tool can be great while being simple enough for most
> >>programmers (e.g. Java).
> >>    
> >>
> >
> >Not without limiting its power.
> >  
> >
> Yes, just as Java's power is "limited" by not supporting MI, 
> pointers, etc. I love to have my power "limited" by not 
> giving me lots of rope to hang 
> myself with.
> And so do most people, judging by the popularity of Java over C++ and 
> every high-level
> language over assembly.

Java's selling point is productivity (for those problems to which it can be
applied). Not power as in flexibility, expressivity or performance.

I use Java/C# for the productivity benefits. If performance, flexibility or
expressivity were *more* important in a particular project, there are better
tools than Java/C# (e.g. C++, OCaml).

> >>I think Terence could make a huge leap forward by not thinking about
> >>ANTLR as "a tool to automate what
> >>a guru would have written by hand", but rather "a tool that 
> hides all 
> >>the details of language manipulation, so that
> >>most any programmer can do it".
> >>    
> >>
> >
> >Don't think so. My point about domain expertise is relevant 
> here. Joe 
> >Average just can't start developing language recognition 
> tools without an 
> >appreciation of the theory that underlies that subject area.
> >  
> >
> Sure he could. Joe average could easily write:
> a + b --> a.add(b)
> and have his tool do the rest (and maybe warn him about cases 
> that might 
> match that he hadn't
> thought about).

Not without understanding the syntax and semantics of the DSL you created.
Not without understanding just what that input string instructs your DSL's
"compiler" to do.
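
To make that concrete, here is a minimal sketch in Python. It is purely illustrative and is not Andy's tool; the names tokenize, translate and left_assoc are made up for this example. It applies the rule "a + b --> a.add(b)" to sums of identifiers and parenthesised sub-sums. Even for a rule this small, whoever writes it has to know whether "a + b + c" groups to the left or to the right, because the two readings produce different output.

```python
import re

# An identifier, '+', '(' or ')', optionally preceded by whitespace.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[+()])")


def tokenize(text):
    """Split the input into identifiers, '+', '(' and ')'."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def translate(text, left_assoc=True):
    """Rewrite sums such as 'a + b' into method calls such as 'a.add(b)'.

    left_assoc is a hypothetical switch that exposes the semantic choice
    hidden inside the one-line rule.
    """
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def operand():
        tok = advance()
        if tok == "(":
            inner = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok is None or tok in ("+", ")"):
            raise SyntaxError(f"expected an operand, got {tok!r}")
        return tok

    def expr():
        # Collect the operands of one sum: operand ('+' operand)*
        operands = [operand()]
        while peek() == "+":
            advance()
            operands.append(operand())
        if left_assoc:
            result = operands[0]
            for right in operands[1:]:
                result = f"{result}.add({right})"
            return result
        result = operands[-1]
        for left in reversed(operands[:-1]):
            result = f"{left}.add({result})"
        return result

    out = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return out
```

With the default, translate("a + b + c") gives "a.add(b).add(c)"; with left_assoc=False it gives "a.add(b.add(c))". Choosing between them is exactly the kind of domain knowledge the rule's author cannot avoid.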

Incidentally, your DSL is just a small part of your particular language
recognition toolkit. My comment refers to someone building the whole
toolkit. Examples abound of tools that offer multiple DSLs to tackle the
various phases/modules of a language processing toolkit (e.g. Cocktail,
Stratego).

> I've had a programmer working with me for a few months now, 
> and he's had 
> no trouble writing
> translation rules without ever learning ANTLR grammar or knowing 
> anything about language
> recognition tools.

He isn't using ANTLR directly (i.e. creating/maintaining ANTLR grammars) so,
no surprise if he hasn't had to learn to use ANTLR. 

He isn't developing a language translation toolkit (you've done that
already) so, no surprise if he has no grounding in formal language
theory.

He _is_ using a DSL you created to encode source-to-source transformations.
You just expressed the opinion that he has acquired the domain expertise
required to use your DSL.

> >>Most programmers use a
> >>compiler without 
> >>ever knowing much more than
> >>"it generates some lower-level code from my code". Similarly, 
> >>it would 
> >>be nice if most programmers working on
> >>language transformation could use ANTLR without knowing much 
> >>more than 
> >>"it generates a lexer/parser from
> >>my grammar".
> >>    
> >>
> >
> >The analogy isn't quite apples-to-apples. Programmers using 
> a compiler 
> >[for a programming language like C/C++] have to understand 
> the syntax 
> >and semantics of the language the compiler recognises. Plus 
> the rules 
> >for using other related tools such as linkers, loaders etc. True, an 
> >IDE and the OS can hide much of that these days but they still exist.
> >
> >Similarly with ANTLR. ANTLR users have to understand the syntax and 
> >semantics of the grammars they develop. ANTLR projects involve *two* 
> >languages - ANTLR's grammar language and a general purpose 
> programming 
> >language such as Java/C/C#/ObjC etc.
> >
> >Beyond that, it's the same user experience: "I feed in some code and 
> >this tool (compiler or ANTLR) generates a whole lotta stuff I don't 
> >need to understand".
> >  
> >
> No, there's a real difference. Yes, you have to know java syntax and 
> semantics to use javac.
> And you have to know ANTLR syntax and semantics to use ANTLR. 
> But with 
> ANTLR, it's
> not enough to know the syntax and semantics of ANTLR. To do anything 
> useful, you almost
> always have to know something about the internals of what ANTLR is 
> doing. I find that I
> often have to look at the generated code to figure out what 
> went wrong 
> or how to do what
> I want to do. I *never* have to look at java byte code - I'm 
> completely 
> hidden from that
> by the design of the compiler.

Not by the design of the compiler. But by how well tested it is. And by how
well documented Java (and javac) is. Many javac users, those who can and care
to look under the hood, have tripped over bugs aplenty in it, yet "average"
javac users don't discover those same bugs in the same tool even when they
write code that triggers them.

Knowledge of Java and bytecode and how javac works means "above average"
users can do more with Java and javac than the average programmer can.

Knowing more lets you do more.

ANTLR is similar and different. Different because it suffers in comparison
by being a less popular tool with fewer resources behind it. Nevertheless,
for someone with a deep knowledge of ANTLR's grammar language, its
limitations (e.g. no predicate hoisting and approximate-LLk in 2.x) and the
available documentation, there really is no need to look at the output code.
You develop your grammar, you develop your tests, you build the whole lot
and, the tests will alert you if you need to change anything. Unless you
trip over a bug in ANTLR of course...
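
As a rough sketch of that workflow (in Python, with made-up names; recognizes is a hand-written stand-in for whatever entry point your generated parser exposes, not an ANTLR API), the tests treat the recogniser as a black box and only compare accept/reject results against a table of expected answers:

```python
import re

# Stand-in for a generated recogniser so that the sketch runs on its own.
# Assumed toy language: ID ('+' ID)*
_SUM = re.compile(r"\s*[A-Za-z_]\w*(\s*\+\s*[A-Za-z_]\w*)*\s*")


def recognizes(text):
    """Return True if the whole input is a sum of identifiers."""
    return _SUM.fullmatch(text) is not None


# (input, should it be accepted?)
CASES = [
    ("a", True),
    ("a + b", True),
    ("a+b+c", True),
    ("", False),
    ("a +", False),
    ("+ a", False),
    ("a b", False),
]


def failing_cases(recognizer, cases):
    """Return the (input, expected, actual) triples the recogniser gets wrong."""
    failures = []
    for text, expected in cases:
        actual = recognizer(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures
```

An empty list from failing_cases(recognizes, CASES) means the grammar still behaves as intended after a change. A non-empty list tells you which inputs to look at, without opening the generated code.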

Now with ANTLR V3, not only can you look at the output code if you wish
but, for the price of a little more knowledge (i.e. domain expertise), you
can change it!

The "average" ANTLR user has no need to change the code and would never do
so but, others will.

Micheal