[antlr-interest] philosophy about translation

Wed Nov 1 22:53:12 PST 2006

Hi Andy,

> >>I don't think I'm changing my position. I still disagree with that
> >>quote. I think almost all the great tools
> >>are the ones that the majority of programmers do "get". In 
> >>fact, that's 
> >>part of what makes them great.

> >I don't think "popular" and "great" (when applied to tools) are 
> >synonyms.
> >  
> >
> I don't either, that's why I said "in part".

I should have been clearer: Popularity isn't a measure of greatness.

> >>I disagree. I view Java as being "great" and C++ not being
> >>"great".

> >If productivity (not power) is the priority, Java/C# is 
> likely to be a 
> >better tool than C++ for those problems to which Java is applicable.
> >
> >Otherwise C++ (or some other tool) is better.
> >  
> >
> And Java is *always* the priority.

It's pointless to argue over value judgements, so I won't.

Incidentally, this statement about Java...

> The only way you'd ever 
> fall back to 
> C++ is if there really
> wasn't enough power in Java. And I've only heard of one or 
> two cases of 
> that.

...contradicts this later statement about Java.

> >>and I think the
> >>vast majority 
> >>of programmers who know both
> >>agree with me.

> >I make no such claim (I have no idea what the vast majority of 
> >competent Java/C++ programmers think).

> Then you should get out more. Talk to 10 co-workers about 
> Java vs. C++, 
> or go to a conference.
> I'd say that less than 5% of those who've actually used both Java and 
> C++ prefer C++.
> That's from my experience of talking to perhaps a few hundred 
> developers 
> about it.

A few hundred (even a few thousand) developers don't amount to the "vast
majority".

C++ and automatic memory management aren't mutually exclusive. GC systems
for C++ predate Java/C#.
My point? Productivity and C++ aren't mutually exclusive either.

> >>Probably the main benefit is that it's easy to use for
> >>"average programmers". That's also why
> >>ANTLR is better than the competition - because it's easier to use.
> >>    
> >>
> >
> >Not to my mind. Coco/R, JavaCC, SLK are equally easy to use (if one 
> >takes the time to learn them).
> >
> >ANTLR's "popularity" is due to a lot of things including: Ter's 
> >predicated
> >LL(k)/LL(*) technology, 
> >
> LL(*) is brand new to V3, so that has nothing to do with it. 

I disagree. V3 is the reason many ANTLR'ers aren't using some other tool.
V2's performance (and a few missing features) eventually became an issue.
But I digress...

> And some of 
> the others are LL(k), so
> I don't think that's it, either. I'm using Javacc now, and 
> it's driving 
> me nuts, just as lex/yacc
> and similar tools did.

Once I learned the syntax/semantics and prevailing idioms, JavaCC was easy
enough.

> >The "vast majority" don't understand the value/utility of 
> MI, mixins, 
> >or the "why?" of AOD  etc.
> >  
> >
> That's a circular argument. If someone "understands the value" of MI, 
> etc. then of course they want it.

It isn't a circular argument. It is perfectly possible to "understand the
value" of a feature and yet not want it. I "understand the value" of MI, for
instance, and I'm not calling for [standard] Java to include it because it
makes dynamic class loading far more difficult to implement. I'd rather
not be introduced to another slew of Java bugs.

Extended variants of Java such as MIJava (a preprocessor) and MultiJava (a
full compiler - uses ANTLR) etc are available for when I need the extra
power in a project.

> It's that "vast majority" who know what they are and don't want them 
> that matters.

I doubt that the vast majority truly understand MI et al.

> >In the context of this thread, "compiling by hand" is not a tool.
> >  
> >
> The point is that just because one approach (whether tool or not) is 
> less powerful doesn't
> mean that it's worse.

It means that it's less powerful. That can mean "useless" if one needs the
missing "power features".

> >How would you change in ANTLR to make it easier?
> >  
> >
> Short answer: hide all the details from me. Make it so that I have no 
> idea that
> there is code being generated to do lexing and parsing. Let 
> me just give 
> it a C grammar
> and a Java grammar, and then dive in and start writing 
> translation logic 
> without any
> generated code or even ASTs in sight. How to do that is left as an 
> exercise for the reader.

Interesting idea. I don't know if it is possible, but it's interesting
nonetheless. ;-)

> >Quite often just getting "something that works" is all that is 
> >required. Getting the best output from a compiler requires 
> knowing more 
> >about what goes on under the hood.
> >  
> >
> Yea, I know. You can do a better job at garbage collection 
> than java's 
> gc. You can write
> better byte code than javac because you've studied javac and bytecode.
> 
> The Java JIT guys say the first rule of performance 
> optimization is to 
> STOP doing whatever
> it is you're doing that you think is producing better 
> bytecode. And what 
> did Terence find
> out about performance when he tried generating his own bytecode?

Regardless of what Ter experienced while generating DFAs as bytecode, what
the Java JIT guys may have said or indeed whether I can beat javac's GC
strategy, what I actually said above remains a fact.

> >>But required knowledge of the tool's internals limits the "average"
> >>user's productivity.
> >
> >A user is already limited if he/she doesn't understand how a 
> tool works. 
> >Whether or not that matters depends on what they are trying to 
> >accomplish.
> >
> Yes, we lower 99% of the programming community are writing 
> sub-par code 
> because we don't
> understand how our compilers work. ;)

Yes we are. ;-)

> >>>I use Java/C# for the productivity benefits. If performance,
> >>>flexibility or expressivity was *more* important in a particular 
> >>>project, there are better tools than Java/C# (e.g. C++, OCaml).
> >>>
> >>Right, so you're just like the rest of us. You've chosen to
> >>limit your 
> >>own "power" by using Java rather
> >>than, say, assembly. So I'm sticking with my claim that
> >>"I think a tool can be great while being simple enough for most 
> >>programmers (e.g. Java)."
> >>and not buying your "Not without limiting it's power" reply.
> >
> >It isn't "my" reply. The fact is:
> >- Java/C# is less powerful than assembler, C or C++ (you 
> need them to 
> >build java/c# in the first place).
> >  
> >
> If that's your definition of "power", I don't see how it relates to 
> anything.

I defined "power" in terms of performance, flexibility and expressivity
(it's still visible above).

Performance: Given equivalent programs written in Java and asm/C/C++, the
Java version would be slower (or, at the very least, it is always possible to
optimize the asm/C/C++ version so it outperforms the Java version).

Flexibility: Any program that can be written in Java can be written in
asm/C/C++ (although one might not want to). The reverse is not true.

Expressivity: Java is less expressive than C++ (even without macros). With
[really!] clever use of macros, the same can be said of C and perhaps asm
too.

> >- For some problems, Java/C# is more productive than assembler, C or 
> >C++.
> >  
> I'd say "for almost all problems" but OK.

It depends on what sort of programming problems you have to solve. A
Windows/Linux device driver developer wouldn't use Java, for instance.

> >I disagree. He is working with code generated by ANTLR. He 
> isn't using 
> >ANTLR.
> >  
> Ah, come on. When someone is using a lexer built using ANTLR, 
> you won't 
> consider that to be
> "using ANTLR?" As in "He's using ANTLR without ever seeing the input 
> grammar". That's
> like saying I'm not "using javac", I'm just using the 
> bytecode that it 
> generates.

Which is precisely what many ANTLR users do when they download the binary
distribution. They aren't using javac (some probably don't even know what
javac is). They are just "using bytecode generated by javac".

> >If he used ANTLR directly (like you did), he could do more. Your DSL 
> >(like
> >Java/C#) favours productivity over power/flexibility.
> >  
> >
> The whole point of building the DSL is because I think if he 
> (or I) used 
> ANTLR directly, we'd actually
> "do less", not "do more". We'd get less accomplished because we'd be 
> struggling with AST
> shapes in our heads. The DSL lets us tackle the same problem 
> at a higher 
> level of abstraction.

As I said, it is a more productive tool (for what you want to do) than using
ANTLR directly.

On the other hand, using ANTLR directly affords more "power".

> >ANTLR *is* a compiler.
> >  
> >
> Right, and as such, I believe it can do what "traditional" 
> compilers do: 
> hide all the underlying
> stuff from the users.

It does. That's why your guy can use the code it generated without knowing
or caring about ANTLR.

> >>Compiler designers take it as a given that users need only know the
> >>syntax/semantics of the input
> >>language. If Ter took it as a given that ANTLR4 users need 
> >>only know the 
> >>syntax/semantics
> >>of the input language, he'd end up with a very different tool.
> >>    
> >When using ANTLR, that is all one needs to know.
> >
> No. To use ANTLR, you not only need to know the input 
> language (say, C) 
> syntax&semantics, you
> also need to know:
> * The ANTLR syntax&semantics
> * How to hook in actions: where do they make sense? What language are 
> they in?

ANTLR's input language is a customized variant of EBNF that can include
embedded "action" code written in one of a few general programming
languages. It is used to describe the syntactic structure of other languages
e.g. your ANTLR grammar for the C language.

Learning where actions can be "hooked in" is part of learning about the
syntax/semantics of ANTLR's input language.
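
As a rough illustration of what "hooking in" an action means, here is a small
hand-written Java sketch. It is not ANTLR output, and the rule shown in the
comment is only ANTLR-like notation; the class name SumParser and its methods
are hypothetical. The point is simply that each action sits at the position
in the rule where the grammar author placed it.

```java
// Hypothetical hand-written parser; NOT actual ANTLR-generated code.
// Rule it mirrors, in illustrative ANTLR-like notation (actions in braces):
//   sum : a=INT { total = a; } ( '+' b=INT { total += b; } )* ;
final class SumParser {
    private final String input;
    private int pos = 0;

    private SumParser(String input) {
        this.input = input;
    }

    /** Parses text such as "1+2+3" and returns the value the actions computed. */
    static int parse(String text) {
        SumParser p = new SumParser(text.replace(" ", ""));
        int total = p.sum();
        if (p.pos != p.input.length()) {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return total;
    }

    // One method per grammar rule. The action bodies appear exactly where
    // the rule placed them: right after the token they refer to.
    private int sum() {
        int total = integer();              // action: total = a
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;                          // match '+'
            total += integer();             // action: total += b
        }
        return total;
    }

    // Stands in for the lexer: matches a single INT token.
    private int integer() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected INT at " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }
}
```

Calling SumParser.parse("1 + 2 + 3") runs both actions and returns the sum;
malformed input such as "1+" is rejected with an IllegalArgumentException.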

> * You often need to know details about the code that's generated to 
> resolve ambiguities

A test suite mitigates this. I agree that approximate lookahead
generates spurious warnings.

> * You need to know how the grammar maps to an AST structure. It's not 
> enough to have a mental
>    picture of the input grammar, you need to be able to form a mental 
> picture of the AST each time
>    you see a chunk of code.

ASTs are optional. You don't use them, for instance. In any case, the user
designs an AST, not ANTLR. ANTLR simply provides a language for specifying
AST construction.
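
To make "the user designs the AST" concrete, here is a minimal Java sketch
with a tree shape chosen by hand. Everything in it (Node, Num, Add,
SumTreeBuilder) is hypothetical and unrelated to ANTLR's own tree classes; a
different user could just as well pick a flat list of operands for the same
input.

```java
// Hypothetical, user-designed AST for sums such as "1+2+3".
// The node types and the left-nested shape are the author's choice.
abstract class Node {
    abstract int eval();
}

final class Num extends Node {
    final int value;

    Num(int value) {
        this.value = value;
    }

    @Override
    int eval() {
        return value;
    }

    @Override
    public String toString() {
        return Integer.toString(value);
    }
}

final class Add extends Node {
    final Node left;
    final Node right;

    Add(Node left, Node right) {
        this.left = left;
        this.right = right;
    }

    @Override
    int eval() {
        return left.eval() + right.eval();
    }

    @Override
    public String toString() {
        return "(+ " + left + " " + right + ")";
    }
}

final class SumTreeBuilder {
    /** Builds a left-nested tree: "1+2+3" becomes (+ (+ 1 2) 3). */
    static Node build(String text) {
        String[] parts = text.replace(" ", "").split("\\+");
        Node tree = new Num(Integer.parseInt(parts[0]));
        for (int i = 1; i < parts.length; i++) {
            tree = new Add(tree, new Num(Integer.parseInt(parts[i])));
        }
        return tree;
    }
}
```

Code that works on the tree then only needs to know this shape, e.g.
SumTreeBuilder.build("1+2+3").eval(), regardless of how the tree was built.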

> >A compiler designer can't determine the best code to 
> generate for every 
> >possible situation in advance.
> >
> He doesn't need to always generate the best code. It's good 
> enough that 
> he just generally do
> better than humans do.

For some users/projects, that is enough. Not for everyone or every project.

> >This feature makes the tool more useful - for
> >those who care to acquire the knowledge required to use it 
> effectively. 
> >It empowers knowledgeable users to tailor the output for any given 
> >situation.
> >
> And yet, there is no equivalent in Java - no bytecode 
> tweaking. And no 
> one seems to mind.

Actually, there is. Not just with javac. Javassist, BCEL etc. do just that.

> And there is an equivalent in C/C++ - embedded asm code. That was 
> popular 20 years ago,
> but today's programmers realize that the assembler is 
> probably better at 
> producing good code,
> and they don't need every last 1% of performance anyway.

Not all the time. When they do, it is reassuring to know that gcc/vc++ still
support it...  ;-)

Micheal

-----------------------
The best way to contact me is via the list/forum. My time is very limited.