In which I wax philosophical about programming languages

“A rose by any other name would smell as sweet” –(Shakespeare 2.2.6)

Alas, the smell of a rose is not the sole bearer of its beauty.  There’s also the color, the gentle visual lines, the nesting of the petals upon each other.  Rightly, one might say that the beauty of a rose is a multifaceted, multi-sensory phenomenon.  The same is true, I’d argue, of a language (in this case, a programming language).

What makes a programming language beautiful?  There are many attributes that contribute to the beauty of a language (since the perception of beauty is clearly somewhat subjective, I’m giving my opinion here).  A beautiful language has a clear and uncluttered syntax — neither more verbose than necessary nor more terse than readability allows.  A beautiful language has clear semantics — the function of a piece of code should be relatively clear just by looking at it; in most cases one should not need to consult the language / library reference just to determine what the code is doing.  An addendum to this is that neither the syntax nor semantics should be plagued with special cases, instances in which the form or meaning of a piece of code is different from the general case.  A beautiful language should be expressive — simple things should be easy and complex things should be possible.  Finally, and this one might stir the pot a bit, a beautiful language should be efficient — to truly admire the other features of a beautiful language, a developer shouldn’t feel guilty for using it because the runtime overhead (in terms of speed, memory, etc.) makes it un-scalable or impractical for large data.

Lately, I’ve been coding a lot of C++ . . . a lot.  And it is, most assuredly, not a beautiful language.  The new standard, C++11, has, in my opinion, greatly improved the language.  Code has become much less verbose with the auto keyword, range-based for loops and initializer lists.  The inclusion of closures in the language makes using the standard library algorithms reasonably nice and is generally a huge win.  Some of the other additions, like delegating constructors, variadic templates and constexpr, have less of an effect on how most client code looks, but they change the game substantially for library implementers.  The truly great thing about these new features, however, is that they bring their associated niceties to C++ without an associated runtime cost.  The one aspect of a beautiful language that C++ undoubtedly does have is speed.  When I choose to implement an idea or method in C++, I don’t waste any time worrying whether my choice of language will fundamentally limit my ability to process large data in a reasonable amount of time and memory.  The new language and library features leave this beautiful property unblemished.
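To make the features named above concrete, here is a minimal sketch that uses auto, a range-based for loop, an initializer list and a closure passed to a standard library algorithm. The helper names (`sample`, `sum`, `count_above`, `demo`) are made up for illustration and are not from any particular code base.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: builds a vector directly from an initializer list.
std::vector<int> sample() {
    return {3, 1, 4, 1, 5, 9, 2, 6};
}

// Hypothetical helper: sums a sequence with a range-based for loop.
int sum(const std::vector<int>& data) {
    int total = 0;
    for (auto x : data) {  // auto deduces int from the element type
        total += x;
    }
    return total;
}

// Hypothetical helper: counts the values strictly greater than threshold.
int count_above(const std::vector<int>& data, int threshold) {
    // The closure captures threshold by value and is handed straight to
    // std::count_if; no hand-written functor class is required.
    auto n = std::count_if(data.begin(), data.end(),
                           [threshold](int x) { return x > threshold; });
    return static_cast<int>(n);
}

// Small usage example; the asserts document the expected results.
void demo() {
    const auto data = sample();
    assert(sum(data) == 31);
    assert(count_above(data, 3) == 4);
}
```

Before C++11, `count_above` would typically have needed a separate function object (or a bind expression) defined away from the call site, which is a large part of why the closures feel like such a win.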

However, what about the other features: syntax, semantics and expressiveness?  Without belaboring the details too much, I’d argue that expressiveness has gone up moderately, while the syntactic and semantic beauty have generally stagnated or even slightly decreased.  Take, for example, the closure syntax:

[cc lang=’cpp’ ]

[=, &var1, this] (
type1 arg1,
type2 arg2, … ) -> return_type { BODY; };

[/cc]

hardly what I would call “beautiful.”  Some of the new additions, since the committee struggled to maintain maximum backwards compatibility, also harm the semantic beauty of the language.  Take, for example, the new strongly-typed enums. With strongly-typed enums, instead of having enums that are implicitly and haphazardly converted to ints all over the place, we have enumerations which represent a unique type, and whose members are properly nested in their enum’s enclosing scope. I’d argue this is the way it should have always been, but for backward compatibility with C. However, take notice of how these new strongly-typed enums work:

[cc lang=’cpp’]

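// Scoped (strongly-typed) enum: members live inside EnumTypesStrong.
enum class EnumTypesStrong {
    StronglyTyped, WeaklyTyped, HowShouldIKnow
};

// Unscoped (weakly-typed) enum: members leak into the enclosing scope.
enum EnumTypesWeak {
    StronglyTyped, WeaklyTyped, HowShouldIKnow
};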

[/cc]

The only thing that marks the top enum as strongly typed is the inclusion of the ‘class’ keyword between the ‘enum’ keyword and the enum’s name. Moreover, the old type of enum still exists, with its old semantics. This means that if I want to access a member of the strongly-typed enum, I must do something along the lines of:

[cc lang=’cpp’] EnumTypesStrong::StronglyTyped [/cc]

Good; this makes sense. However, given the second enum, I can still just say

[cc lang=’cpp’] StronglyTyped [/cc]

which refers to the declaration in ‘EnumTypesWeak’, because, to maintain backward compatibility, the values of weakly-typed enums must still be exported to their enclosing scope.
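The difference in scoping and conversion described above can be sketched as follows. This is only an illustration built on the two enums from the post; the function names `weak_as_int` and `strong_as_int` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

enum class EnumTypesStrong { StronglyTyped, WeaklyTyped, HowShouldIKnow };
enum EnumTypesWeak { StronglyTyped, WeaklyTyped, HowShouldIKnow };

// The bare name resolves to the unscoped enum, whose members leak into the
// enclosing scope; the scoped enum's members do not collide with it.
static_assert(std::is_same<decltype(StronglyTyped), EnumTypesWeak>::value,
              "bare StronglyTyped names the weakly-typed enumerator");

// Only the unscoped enum converts to int implicitly.
static_assert(std::is_convertible<EnumTypesWeak, int>::value,
              "weakly-typed enums convert to int implicitly");
static_assert(!std::is_convertible<EnumTypesStrong, int>::value,
              "strongly-typed enums do not convert to int implicitly");

// Hypothetical helper: the implicit conversion compiles without a cast.
int weak_as_int() {
    int v = WeaklyTyped;  // EnumTypesWeak::WeaklyTyped, silently an int
    return v;
}

// Hypothetical helper: the scoped enum needs an explicit cast.
int strong_as_int() {
    // int v = EnumTypesStrong::WeaklyTyped;  // would not compile
    return static_cast<int>(EnumTypesStrong::WeaklyTyped);
}
```

Both declarations sit side by side in one scope without a name clash, which is exactly the co-existence of the two flavors that the single ‘class’ keyword controls.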

As a result, we’re left with a good (i.e. reasonable, strongly-typed) enum type and a bad (i.e. backwards compatible, weakly typed, implicitly converted, leaky) enum type. They can co-exist in the same language and the same program, and the only thing that delineates them syntactically is the inclusion of the ‘class’ keyword. To me, all of this overloading and clutter, keeping the enum types straight and remembering the semantics of each, decreases the semantic beauty of the language. It’s not really that using the ‘class’ keyword to delineate the two types of enums is a bad decision (given the semantics of ‘class’, it makes sense), it’s that, to maintain backward compatibility, C++ has to keep around the accumulated cruft and sub-optimal decisions from decades in the past.

So what am I really trying to get at here? C++ clearly has one of the aspects of a beautiful language (speed), and while it might be making slow progress in some aspects (expressiveness), it’s stagnating or regressing in others (syntax & semantics). I’m not bashing C++ here; I use it all the time and believe that for many tasks it’s currently the best available choice. Its beauty is largely the result of one design / decision-making principle: use zero-cost abstractions. On the other hand, many of its warts are the result of its success (i.e. poor initial decisions that could not later be reversed because of the language’s popularity and widespread use). However, I believe that a language can be truly beautiful in all regards; not perfect, but truly and holistically beautiful. I believe that a language can be fast (as fast as C++), while simultaneously being expressive, with a syntax that is direct without being cryptic and semantics that are consistent and predictable. As long as you’re willing to learn from your language’s past mistakes (or the mistakes of other languages) and your language is not yet so successful that it precludes substantial evolution, I believe that you can attain the type of multifaceted beauty that makes a language fun to write, easy to read and fast to execute.

In the upcoming posts, I’m going to explore some different languages and how they fare under the given (but, admittedly, somewhat arbitrary) notions of beauty. We’ll see where these different languages excel and where they fall short. We’ll admire their beauty, and learn from their warts, with the ultimate goal of deciding what properties, when (tenably) combined, will make a language that’s beautiful in all regards.


2 Responses to In which I wax philosophical about programming languages

  1. David says:

    I like it. From a professional programmer’s point of view, I think you keyed on some important notes — namely, you want a language that’s both syntactically concise and semantically clear, without sacrificing efficiency.

    But what about someone like me, a non-programmer who needs to decide on a language for himself and others? Should we not judge a language’s value by its ability to improve productivity? Is not optimization of speed premature while productivity languishes?

    What I am hinting at is that I am leaning towards Python. It is a beautiful language that, once written, can be understood and enjoyed by others. It is also not highly domain specific, which is important for a non-professional who doesn’t have the luxury of learning perhaps dozens of languages throughout his career, and is unable to change languages/frameworks on a case-by-case basis.

    Thoughts?

  2. Rob says:

    Hi David,

    Thanks for your comment! I certainly believe that there is a practical aspect to the efficiency question. For example, if I’m going to be implementing a linear (time & space) algorithm and I’m never going to be processing data sets with more than a few thousand elements, then efficiency concerns are, to a large extent, purely academic. Now, I’d still argue that extreme inefficiencies detract from the “beauty” of the language, but in such a case you’d never really be hit by these ugly realities.

    The above is the reason that I still use Python for a number of tasks (e.g. scraping data from the web, analyzing experimental results, plotting, etc.). However, I’ve reached a point where Python is no longer where I turn for even moderate computational challenges. To some extent, this is because I know that most of the time I’ll need to eventually “scale-up” my solution, but even if that’s not the case I tend to shy away from Python if I know I’ll ever be processing more than a relatively small dataset (see my latest blog post for an example where Python memory usage gets out of hand while performing an incredibly simple task).

    All that being said, I do think that there are many tasks where Python is a more than adequate tool. Also, the abundance of high-quality and easy-to-use Python libraries for a host of different tasks makes it a very enticing option if you know the magnitude of the problems you’ll be solving. I guess it comes down to the fact that I do think that optimization is premature if it comes at the cost of a large cut in productivity, but productivity also grinds to a halt if one gets stuck with a tool that can’t scale to the problem sizes one will be facing in practice.
