[antlr-interest] philosophy about translation

Wed Nov 1 20:10:28 PST 2006

Micheal J wrote:

>Hi Andy,
>
>  
>
>>I don't think I'm changing my position. I still disagree with that
>>quote. I think almost all the great tools are the ones that the
>>majority of programmers "get". In fact, that's part of what makes
>>them great.
>>    
>>
>
>I don't think "popular" and "great" (when applied to tools) are synonyms.
>  
>
I don't either, that's why I said "in part".

>Some "great" tools are accessible to the majority. Others are not.
>
>  
>
>>>Being a great tool for the job doesn't guarantee popularity. Popularity
>>>is ultimately a measure of the tool's accessibility to average
>>>programmers (they are the majority). Great tools are often beyond the
>>>ability of the average programmer. Certainly to build. And often to use
>>>too.
>>> 
>>>
>>>      
>>>
>>I disagree. I view Java as being "great" and C++ not being 
>>"great".
>>    
>>
>
>If productivity (not power) is the priority, Java/C# is likely to be a
>better tool than C++ for those problems to which Java is applicable.
>
>Otherwise C++ (or some other tool) is better.
>  
>
And Java is *always* the priority. The only way you'd ever fall back to 
C++ is if there really
wasn't enough power in Java. And I've only heard of one or two cases of 
that.

>>and I think the vast majority of programmers who know both
>>agree with me.
>>    
>>
>
>I make no such claim (I have no idea what the vast majority of competent
>Java/C++ programmers think).
>  
>
Then you should get out more. Talk to 10 co-workers about Java vs. C++, 
or go to a conference.
I'd say that less than 5% of those who've actually used both Java and 
C++ prefer C++.
That's from my experience of talking to perhaps a few hundred developers 
about it.

>  
>
>>Probably the main benefit is that it's easy to use for 
>>"average programmers". That's also why
>>ANTLR is better than the competition - because it's easier to use.
>>    
>>
>
>Not to my mind. Coco/R, JavaCC, SLK are equally easy to use (if one takes
>the time to learn them).
>
>ANTLR's "popularity" is due to a lot of things including: Ter's predicated
>LL(k)/LL(*) technology, 
>
LL(*) is brand new to V3, so that has nothing to do with it. And some of 
the others are LL(k), so
I don't think that's it, either. I'm using Javacc now, and it's driving 
me nuts, just as lex/yacc
and similar tools did.

>[somewhat] comprehensible code generation for
>multiple target languages, grammar as documentation for all phases of
>translation, its PD or BSD license, funky codegen in V3, etc.
>
>  
>
>>Yes, a few people want to add stuff back, but most do not. It's just 
>>that the few are very vocal.
>>The vast majority don't want MI, operator overloading, or built-in 
>>AspectOrientedDesign.
>>    
>>
>
>The "vast majority" don't understand the value/utility of MI, mixins, or the
>"why?" of AOD  etc.
>  
>
That's a circular argument. If someone "understands the value" of MI, 
etc. then of course they want it.
It's that "vast majority" who know what they are and don't want them 
that matters.

>  
>
>>>>And that's why Java is popular and Smalltalk and LISP are not.
>>>>It's also why people prefer Java over C++.
>>>>   
>>>>        
>>>>
>>>It is an easier tool to use. Less powerful. But easier.
>>> 
>>>      
>>>
>>Right - so I hope there's nothing wrong with me pushing to make ANTLR
>>(or some successor) easier to use. A compiler is easier to use than
>>compiling by hand, but also less powerful. I'm ok with that.
>>    
>>
>
>In the context of this thread, "compiling by hand" is not a tool.
>  
>
The point is that just because one approach (whether tool or not) is 
less powerful doesn't
mean that it's worse.

>What would you change in ANTLR to make it easier?
>  
>
Short answer: hide all the details from me. Make it so that I have no 
idea that
there is code being generated to do lexing and parsing. Let me just give 
it a C grammar
and a Java grammar, and then dive in and start writing translation logic 
without any
generated code or even ASTs in sight. How to do that is left as an 
exercise for the reader.

>
>>That's like having to know the details about the bytecode that
>>javac creates. I don't have to read the manual for that stuff...I'd
>>rather have the tool not force me to know those details.
>>    
>>
>
>Quite often just getting "something that works" is all that is required.
>Getting the best output from a compiler requires knowing more about what
>goes on under the hood.
>  
>
Yea, I know. You can do a better job at garbage collection than java's 
gc. You can write
better byte code than javac because you've studied javac and bytecode.

The Java JIT guys say the first rule of performance optimization is to 
STOP doing whatever
it is you're doing that you think is producing better bytecode. And what 
did Terence find
out about performance when he tried generating his own bytecode?

>  
>
>>>For your examples of [general purpose language] compiler and ANTLR
>>>[grammar language compiler], the domain expertise isn't primarily about
>>>the internals of the tool. It's about the syntax, semantics and idioms
>>>of the language recognized by the tool. Knowledge of the tool's
>>>internals can elevate those who have it above the "average" user who
>>>doesn't.
>>> 
>>>
>>>      
>>>
>>But required knowledge of the tool's internals limits the "average" 
>>user's productivity.
>>    
>>
>
>A user is already limited if he/she doesn't understand how a tool works.
>Whether or not that matters depends on what they are trying to accomplish.
>  
>
Yes, we lower 99% of the programming community are writing sub-par code 
because we don't
understand how our compilers work. ;)

>  
>
>>>Java's swan song is productivity (for those problems to which it can
>>>be applied). Not power as in flexibility, expressivity or performance.
>>>I use Java/C# for the productivity benefits. If performance,
>>>flexibility or expressivity was *more* important in a particular
>>>project, there are better tools than Java/C# (e.g. C++, OCaml).
>>> 
>>>
>>>      
>>>
>>Right, so you're just like the rest of us. You've chosen to limit your
>>own "power" by using Java rather than, say, assembly. So I'm sticking
>>with my claim that
>>"I think a tool can be great while being simple enough for most
>>programmers (e.g. Java)."
>>and not buying your "Not without limiting its power" reply.
>>    
>>
>
>It isn't "my" reply. The fact is:
>- Java/C# is less powerful than assembler, C or C++ (you need them to build
>java/c# in the first place).
>  
>
If that's your definition of "power", I don't see how it relates to 
anything. I need an engine to build
a car. That, to me, doesn't mean the engine is "more powerful" than the 
car. I'd say if anything
the car is "more powerful" (or maybe just "more useful") than the 
engine, as it lets you get the
job done faster.

>- For some problems, Java/C# is more productive than assembler, C or C++.
>  
>
I'd say "for almost all problems" but OK.

>  
>
>>>Incidentally, your DSL is just a small part of your particular language
>>>recognition toolkit.
>>>
>>>      
>>>
>>It is??? How do you know that?
>>    
>>
>
>Relying on what I've learned about similar systems.
>If it isn't I'd like to hear more about it.
>  
>
It depends on how you measure, but I wouldn't say my DSL is a "small part".

>  
>
>>>He isn't using ANTLR directly (i.e. creating/maintaining ANTLR 
>>>grammars) so, no surprise if he hasn't had to learn to use ANTLR.
>>> 
>>>
>>>      
>>>
>>Ah, but he is using ANTLR directly: he spends all day working with the
>>Token streams produced by ANTLR, without having ever seen an ANTLR
>>grammar. That's possible when using ANTLR as a lexer, but that wouldn't
>>be possible using its parser.
>>    
>>
>
>I disagree. He is working with code generated by ANTLR. He isn't using
>ANTLR.
>  
>
Ah, come on. When someone is using a lexer built using ANTLR, you won't 
consider that to be
"using ANTLR?" As in "He's using ANTLR without ever seeing the input 
grammar". That's
like saying I'm not "using javac", I'm just using the bytecode that it 
generates.

>  
>
>>>He _is_ using a DSL you created to encode source-to-source
>>>transformations. You just expressed the opinion that he has acquired the
>>>domain expertise required to use your DSL.
>>> 
>>>
>>>      
>>>
>>Yes, so if you believe me when I say that my DSL is orders of magnitude
>>easier to use than using ANTLR to build and walk ASTs, then you must see
>>my point: He's much more productive.
>>    
>>
>
>But he is also limited to what your DSL allows - cf. "not without limiting
>its power". 
>
>If he used ANTLR directly (like you did), he could do more. Your DSL (like
>Java/C#) favours productivity over power/flexibility.
>  
>
The whole point of building the DSL is because I think if he (or I) used 
ANTLR directly, we'd actually
"do less", not "do more". We'd get less accomplished because we'd be 
struggling with AST
shapes in our heads. The DSL lets us tackle the same problem at a higher 
level of abstraction.

>  
>
>>So I'm building my DSL (and other code) on top of ANTLR/lexer. I think
>>there's an opportunity for Terence to build a better and different tool
>>in place of the ANTLR/parser - one that doesn't require users to know
>>formal language theory or picture ASTs in their heads.
>>    
>>
>
>Personally, I can't see how that is possible.
>  
>
I'm a bit fuzzy on the details, too :)

>  
>
>>>Not by the design of the compiler. But by how well tested it is. And by
>>>how well documented Java (and javacc) is.
>>>
>>>      
>>>
>>No, I do think it's by the design of the compiler - by the design of 
>>compilers in general.
>>    
>>
>
>ANTLR *is* a compiler.
>  
>
Right, and as such, I believe it can do what "traditional" compilers do: 
hide all the underlying
stuff from the users.

>  
>
>>Compiler designers take it as a given that users need only know the
>>syntax/semantics of the input language. If Ter took it as a given that
>>ANTLR4 users need only know the syntax/semantics of the input language,
>>he'd end up with a very different tool.
>>    
>>
>
>When using ANTLR, that is all one needs to know. 
>
No. To use ANTLR, you not only need to know the syntax & semantics of the
input language (say, C), you also need to know:
* The ANTLR syntax & semantics
* How to hook in actions: where do they make sense? What language are
   they in?
* You often need to know details about the code that's generated to
   resolve ambiguities
* You need to know how the grammar maps to an AST structure. It's not
   enough to have a mental picture of the input grammar, you need to be
   able to form a mental picture of the AST each time you see a chunk
   of code.
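
To make that last point concrete, here is a minimal hand-written sketch in
Java. It is not ANTLR output and uses no ANTLR API; the class name AstSketch,
its methods, and the two-rule grammar in the comment are all made up for
illustration. It only shows how the shape of the rules decides the shape of
the tree, which is the mapping a user has to keep in their head.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from ANTLR.
final class AstSketch {
    // Grammar being recognized (hand-written here, not generated):
    //   expr : term ('+' term)* ;
    //   term : INT  ('*' INT)*  ;
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private AstSketch(String input) {
        // Minimal lexer: digit runs become INT tokens, '+' and '*' are operators.
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isDigit(input.charAt(i))) i++;
                tokens.add(input.substring(start, i));
            } else if (c == '+' || c == '*') {
                tokens.add(String.valueOf(c));
                i++;
            } else if (Character.isWhitespace(c)) {
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
    }

    private boolean peek(String t) {
        return pos < tokens.size() && tokens.get(pos).equals(t);
    }

    private String intLiteral() {
        if (pos >= tokens.size() || !Character.isDigit(tokens.get(pos).charAt(0))) {
            throw new IllegalArgumentException("expected INT at token " + pos);
        }
        return tokens.get(pos++);
    }

    // term : INT ('*' INT)* ;  each '*' becomes a subtree root
    private String term() {
        String left = intLiteral();
        while (peek("*")) {
            pos++;
            left = "(* " + left + " " + intLiteral() + ")";
        }
        return left;
    }

    // expr : term ('+' term)* ;  each '+' becomes a subtree root above the terms
    private String expr() {
        String left = term();
        while (peek("+")) {
            pos++;
            left = "(+ " + left + " " + term() + ")";
        }
        return left;
    }

    // Returns the tree in a LISP-like text form, e.g. (+ 1 (* 2 3)).
    static String toAst(String input) {
        AstSketch p = new AstSketch(input);
        String tree = p.expr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("trailing input at token " + p.pos);
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(toAst("1+2*3"));
    }
}
```

Even with two rules, you have to know that '*' nests under '+' before you can
write code that walks the result. Scale that up to a full C grammar and that's
the mental picture I'm complaining about.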

>By knowing even more, it is
>possible to do even more than the "average" ANTLR user.
>
>  
>
>>>Now with ANTLR V3, not only can you look at the output code if you
>>>wish but, for the price of a little more knowledge (i.e. domain
>>>expertise), you can change it!
>>> 
>>>
>>>      
>>>
>>I can change the bytecode generated by javac, too. If javac let me do 
>>that, it would be an indication
>>to the javac designer that his design is less-than-great.
>>    
>>
>
>Or that "the [above average] programmer knows best" to paraphrase the C/C++
>motto.
>
>A compiler designer can't determine the best code to generate for every
>possible situation in advance. 
>
He doesn't need to always generate the best code. It's good enough that
he generally does better than humans do.

>This feature makes the tool more useful - for
>those who care to acquire the knowledge required to use it effectively. It
>empowers knowledgeable users to tailor the output for any given situation.
>  
>
And yet, there is no equivalent in Java - no bytecode tweaking. And no 
one seems to mind.
And there is an equivalent in C/C++ - embedded asm code. That was
popular 20 years ago, but today's programmers realize that the compiler
is probably better at producing good code, and they don't need every
last 1% of performance anyway.

So, I guess we're way off topic :( But thanks for the conversation!
Andy

>
>Micheal
>
>
>  
>