[antlr-interest] language design

Sam Harwell sharwell at pixelminegames.com
Thu May 7 14:05:40 PDT 2009


All of the issues you mentioned below are addressed in other languages.
Do you have a clear view of what feature you need that isn't currently
available? What kinds of problems do you wish to solve with your
programming language? Are you creating a language just so you can learn
about the design/implementation process, or do you plan to actually use
the result?

 

Creating and implementing a simple language as an experiment is a
wonderful learning opportunity. Designing a practical general-purpose
language that successfully addresses a wide range of programming
situations is one of the most difficult tasks in computer science.

 

Here are some things you should definitely do:

 

1.       Complete Project Euler <http://projecteuler.net/>  twice in an
existing language you are evaluating for comparison. Make one of them an
elegant solution in the language and one of them an efficient solution.
This reveals some measure of the "strength" of the language for a very
wide range of tasks. Do note that it does not evaluate other extremely
important items such as asynchronous programming, data binding, user
interfaces, and many more.

2.       Read "Demystifying Magic: High-level Low-level Programming,"
available from IBM Research:
http://domino.research.ibm.com/comm/research_people.nsf/pages/dgrove.vee09.html

3.       Write a compiler front-end from scratch, at least the lexer and
a complete stage 1 parser for AST creation. Choose a well-defined
research language that does not have an ANTLR grammar currently
available. You may reference other grammars, but obviously the big tasks
are on you. Two possibilities are F#
<http://research.microsoft.com/en-us/um/cambridge/projects/fsharp/>  or
Chapel <http://chapel.cs.washington.edu/>. Make sure your final
solution doesn't use semantic predicates or the "backtrack=true"
option. Also attempt to implement a front-end for a language that doesn't
have an official spec <http://www.php.net/> so you understand how
important it is not to fail in this area.

4.       Make sure you understand what it means to solve problems with
language
<http://java.sun.com/docs/books/tutorial/essential/exceptions/runtime.html>
compared to solving problems with tools
<http://www.red-gate.com/products/Exception_Hunter/index.htm>. Here is
another example of problems solved with language
<http://research.microsoft.com/en-us/projects/specsharp/> and with
tools <http://msdn.microsoft.com/en-us/devlabs/dd491992.aspx>.

 

Sam

 

From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Edwards, Waverly
Sent: Thursday, May 07, 2009 3:22 PM
To: 'antlr-interest at antlr.org'
Subject: Re: [antlr-interest] language design

 

<< 

In Java, there are two severe limitations to the ability of primitives.
First, the language doesn't allow user-defined primitives, so there is
no concept of lightweight enums, aggregate data structures, or the
ability to create a special struct based on an int that has special
meaning with no more overhead than the int itself (say for a special
application-specific type of bitfield, or distinguishing a
platform-width integer from a memory address/pointer stored in a
platform-width integer). Second, generics such as ArrayList<T> in Java
are only handled specially by the compiler. At runtime, the storage used
inside the list forces boxing/unboxing of values. Arrays of primitives
(like int[] a) are able to hold unboxed values, so they are lighter.

>> 

 

>> lightweight enums, aggregate data structures

 

These are exactly the things I wish to address in my language. It will
have features of Java, but I found that "final int a = 10;" is not a
sufficient or efficient way to do enums. Also, when I first started
programming in Java I didn't know there weren't aggregate types. I had
never experienced that, so I went on a mad hunt trying to figure out how
to create a struct. It's funny now, but it wasn't funny then. Combining
features of one language with the features of another, along with the
addition of your own ideas, seems like a good way to go.
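
To make the enum complaint concrete, here is a minimal Java sketch that
contrasts the int-constant idiom with Java's own enum type (available
since Java 5). The class and method names (EnumDemo, describeInt,
describeEnum) are made up for illustration. The int version accepts any
integer at compile time; the enum version is type-safe, but each enum
constant is a heap object rather than a value as light as an int, which
is the gap being discussed in this thread.

```java
class EnumDemo {
    // Int-constant idiom: nothing ties these values to a "color" type.
    static final int COLOR_RED = 0;
    static final int COLOR_GREEN = 1;

    // Java's built-in enum: type-safe, but every constant is an object.
    enum Color { RED, GREEN }

    static String describeInt(int color) {
        switch (color) {
            case COLOR_RED:   return "red";
            case COLOR_GREEN: return "green";
            default:          return "unknown"; // any other int still compiles
        }
    }

    static String describeEnum(Color color) {
        switch (color) {
            case RED:   return "red";
            case GREEN: return "green";
            default:    return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeInt(42));            // prints "unknown"
        System.out.println(describeEnum(Color.GREEN));  // prints "green"
        // describeEnum(42) would be rejected by the compiler.
    }
}
```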

 

I'll look into Android. I'm curious now. It's challenging for me to
understand how the folks at Google could make a loser (a big mistake)
like the one you refer to after having so many winners.

 

 

W.

 

 

________________________________

From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Sam Harwell
Sent: Thursday, May 07, 2009 10:51 AM
To: Edwards, Waverly; antlr-interest at antlr.org
Subject: Re: [antlr-interest] language design

While this is somewhat true of Java, it's really not true of C#. The
built-in primitive types in both languages (int, short, double, char,
and bool in C# or boolean in Java) are passed by value in function calls
and kept on the stack as local variables. Boxing in both languages takes
a primitive value and wraps it as an object on the heap, as required for
an operation like the first line below (shown in C#). Unboxing, shown on
the second line, converts it back to a primitive.

 

object o = 3;

int i = (int)o;

 

In Java, there are two severe limitations to the ability of primitives.
First, the language doesn't allow user-defined primitives, so there is
no concept of lightweight enums, aggregate data structures, or the
ability to create a special struct based on an int that has special
meaning with no more overhead than the int itself (say for a special
application-specific type of bitfield, or distinguishing a
platform-width integer from a memory address/pointer stored in a
platform-width integer). Second, generics such as ArrayList<T> in Java
are only handled specially by the compiler. At runtime, the storage used
inside the list forces boxing/unboxing of values. Arrays of primitives
(like int[] a) are able to hold unboxed values, so they are lighter.
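
As a rough illustration of the second limitation, the following Java
sketch stores the same values in a generic list and in a primitive
array. The names (BoxingDemo, sumBoxed, sumPrimitive) are hypothetical
and chosen only for this example. The list can hold only Integer
objects, so every add boxes and every read unboxes, while the int[]
holds the values directly.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Each element is an Integer object; reading it unboxes to int.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v; // implicit unboxing via v.intValue()
        }
        return total;
    }

    // Elements are stored as raw ints; no wrapper objects are involved.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Integer> boxed = new ArrayList<>();
        int[] primitive = new int[n];
        for (int i = 0; i < n; i++) {
            boxed.add(i);     // implicit boxing via Integer.valueOf(i)
            primitive[i] = i; // stored directly in the array
        }
        System.out.println(sumBoxed(boxed));         // prints 499500
        System.out.println(sumPrimitive(primitive)); // prints 499500
    }
}
```

Both loops compute the same result; the difference is in what is
stored, not in what the program prints.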

 

C# shows heavy advantages in several areas by addressing these issues
head-on. First, enums are as lightweight as ints, so using them is an
encouraged, zero-overhead practice. Second, careful use of structs, C#'s
way of creating user-defined unboxed values, leads to immense
performance benefits in places like embedded computing, systems
programming, high-performance applications, and even cases like my
high-performance SlimLexer for ANTLR's C# port, which operates 5x faster
than Lexer in 1/5 the memory (I sent an email to the antlr-dev mailing
list about this). Third, when a generic type is instantiated at runtime
with a value type (like List<int>), the JIT actually produces code that
operates fully on unboxed values, in some cases allowing performance
rivaling C++ templates. These advantages are among the reasons I believe
Google's use of Java for the Android platform was an unacceptable
mistake, a failure to meet the fundamental requirements of their clients
as well as they could. Development teams have a responsibility not to
make decisions like this one when they know full well it will result in
a crippled end-user experience.

 

Sam

 

