Why C wins: the cold realities of abstraction

The sad reality underlying many of the handy abstractions we rely on every day is that they work only most of the time - which means that the abstractions we choose to use set limits on the quality of our work.

Joel Spolsky's "The Law of Leaky Abstractions" is so well written I can't effectively abstract it - so here's the whole of his introduction:

There's a key piece of magic in the engineering of the Internet which you rely on every single day. It happens in the TCP protocol, one of the fundamental building blocks of the Internet.

TCP is a way to transmit data that is reliable. By this I mean: if you send a message over a network using TCP, it will arrive, and it won't be garbled or corrupted.

We use TCP for many things like fetching web pages and sending email. The reliability of TCP is why every exciting email from embezzling East Africans arrives in letter-perfect condition. O joy.

By comparison, there is another method of transmitting data called IP which is unreliable. Nobody promises that your data will arrive, and it might get messed up before it arrives. If you send a bunch of messages with IP, don't be surprised if only half of them arrive, and some of those are in a different order than the order in which they were sent, and some of them have been replaced by alternate messages, perhaps containing pictures of adorable baby orangutans, or more likely just a lot of unreadable garbage that looks like the subject line of Taiwanese spam.

Here's the magic part: TCP is built on top of IP. In other words, TCP is obliged to somehow send data reliably using only an unreliable tool.

To illustrate why this is magic, consider the following morally equivalent, though somewhat ludicrous, scenario from the real world.

Imagine that we had a way of sending actors from Broadway to Hollywood that involved putting them in cars and driving them across the country. Some of these cars crashed, killing the poor actors. Sometimes the actors got drunk on the way and shaved their heads or got nasal tattoos, thus becoming too ugly to work in Hollywood, and frequently the actors arrived in a different order than they had set out, because they all took different routes. Now imagine a new service called Hollywood Express, which delivered actors to Hollywood, guaranteeing that they would (a) arrive (b) in order (c) in perfect condition. The magic part is that Hollywood Express doesn't have any method of delivering the actors, other than the unreliable method of putting them in cars and driving them across the country. Hollywood Express works by checking that each actor arrives in perfect condition, and, if he doesn't, calling up the home office and requesting that the actor's identical twin be sent instead. If the actors arrive in the wrong order Hollywood Express rearranges them. If a large UFO on its way to Area 51 crashes on the highway in Nevada, rendering it impassable, all the actors that went that way are rerouted via Arizona and Hollywood Express doesn't even tell the movie directors in California what happened. To them, it just looks like the actors are arriving a little bit more slowly than usual, and they never even hear about the UFO crash.

That is, approximately, the magic of TCP. It is what computer scientists like to call an abstraction: a simplification of something much more complicated that is going on under the covers. As it turns out, a lot of computer programming consists of building abstractions. What is a string library? It's a way to pretend that computers can manipulate strings just as easily as they can manipulate numbers. What is a file system? It's a way to pretend that a hard drive isn't really a bunch of spinning magnetic platters that can store bits at certain locations, but rather a hierarchical system of folders-within-folders containing individual files that in turn consist of one or more strings of bytes.

Back to TCP. Earlier for the sake of simplicity I told a little fib, and some of you have steam coming out of your ears by now because this fib is driving you crazy. I said that TCP guarantees that your message will arrive. It doesn't, actually. If your pet snake has chewed through the network cable leading to your computer, and no IP packets can get through, then TCP can't do anything about it and your message doesn't arrive. If you were curt with the system administrators in your company and they punished you by plugging you into an overloaded hub, only some of your IP packets will get through, and TCP will work, but everything will be really slow.

This is what I call a leaky abstraction. TCP attempts to provide a complete abstraction of an underlying unreliable network, but sometimes, the network leaks through the abstraction and you feel the things that the abstraction can't quite protect you from. This is but one example of what I've dubbed the Law of Leaky Abstractions:

All non-trivial abstractions, to some degree, are leaky.

Abstractions fail. Sometimes a little, sometimes a lot. There's leakage. Things go wrong. It happens all over the place when you have abstractions.
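To make Spolsky's point concrete, here's a minimal sketch in C (C99, not K&R) of a toy "reliable" layer built on top of an unreliable one. Everything in it is invented for illustration: Link, link_send, reliable_send, and MAX_RETRIES are hypothetical names, not real networking APIs, and real TCP is far more elaborate. The only thing it shows is where the abstraction holds and where it leaks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical unreliable link, standing in for IP. */
typedef struct {
    bool cable_cut;       /* total outage: nothing gets through        */
    unsigned drop_every;  /* drop every Nth raw attempt; 0 = no drops  */
    unsigned attempts;    /* raw send attempts made so far             */
} Link;

/* Unreliable layer: returns true only if this attempt got through. */
static bool link_send(Link *link, int packet)
{
    (void)packet;  /* the payload is irrelevant to this sketch */
    link->attempts++;
    if (link->cable_cut)
        return false;
    if (link->drop_every != 0 && link->attempts % link->drop_every == 0)
        return false;
    return true;
}

enum { MAX_RETRIES = 5 };  /* assumed retry budget per packet */

/* "Reliable" layer, standing in for TCP: retransmit each packet until it
 * is delivered or the retry budget runs out.
 * Returns the number of raw attempts used, or -1 if delivery failed. */
static int reliable_send(Link *link, const int *msg, size_t len)
{
    assert(link != NULL && (msg != NULL || len == 0));
    unsigned start = link->attempts;
    for (size_t i = 0; i < len; i++) {
        int retries = 0;
        while (!link_send(link, msg[i])) {
            if (++retries > MAX_RETRIES)
                return -1;  /* the leak: the caller sees the network */
        }
    }
    return (int)(link->attempts - start);
}
```

On a healthy link, four packets cost four raw attempts. With every second attempt dropped, the same four packets still arrive but cost seven attempts, which is the "everything will be really slow" case. With the cable cut, reliable_send gives up and returns -1, and the caller has to deal with the network after all.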

Back in 1954 a guy named Tom Godwin wrote a story under the title "The Cold Equations." The basic plot: a courier pilot on a one-way mission to deliver desperately needed medical supplies has exactly enough fuel to land the vehicle's planned mass. He discovers that the eighteen-year-old sister of one of the people he's been sent to save has stowed away on board, and has to eject her out the airlock to complete his mission and save, among others, her brother.

My copy of the story is in a collection edited by David Hartwell and Kathryn Cramer - here's part of their introduction to it:

[The Cold Equations] is one of the most popular and controversial hard sf stories of the last fifty years, a story that stacks the deck and then plays with the reader's emotions with carefully juxtaposed cliches that imply a deus ex machina - then frustrates that false expectation.

... Godwin's story angered many readers when it appeared in the fifties, nearly all of whom wanted the problem solved by violating some scientific principle or law. ...

The point of the story, of course, is that scientific law cannot be violated under any circumstances, and ignorance of scientific law can kill you, no matter how sincere you are.

Godwin's staging doesn't look reasonable today, but his science does, and the basic lesson in the story is as clear and applicable now as it was then: the rules by which the universe works are real and are ours to discover, and the only moral good lies in aligning our actions with the absolutes governing the universe - not with wishful thinking, social theorizing, or some imagined standard of compassion, but with the reality of what works, and what doesn't.

Magic doesn't work - and the lesson in the story is that neither the author nor the audience can change that.

In Spolsky's context Godwin's equations are non-leaky, non-trivial abstractions - meaning that Spolsky's law should be amended to read something like this:

The probability that an abstraction leaks varies directly with the number of people involved in creating it, and exponentially with the number of abstractions it subsumes.

In other words, the simpler and closer to reality the constructed abstraction is, the less likely it is to leak - or, more succinctly: genius programs in K&R C.