The Drudge Report ran the headline "Emergency Surgery" this weekend, pointing out the various startup failures experienced by the IT backbone behind the Affordable Care Act, better known as Obamacare.
Coming right on the heels of the government shutdown designed to kill it, the first week of ACA registration and sign-up was an unmitigated disaster. Almost nobody was able to sign in to the system. The Web interface failed regularly, and the entire launch of the system was a fiasco.
Let's set aside, for a moment, whether or not Obamacare is good policy. I've written at length about the failings of our health insurance system (see my book for a free read), and while Obamacare does overcome some of the problems, it's really a very, very expensive band-aid pasted over a very, very cancerous organ.
But, like I said, let's not discuss health policy today. Let's just look at what it takes to launch a nationwide IT project with millions of expected users.
This is not breaking new ground. Facebook, Google and even Twitter, not to mention many smaller players, are able to easily support millions of users with impressive responsiveness and reliability.
The science of how to scale the consumer-facing side of a major IT system is not new. There are many examples, not just of the software, but even of how to implement a system to scale in hardware. Facebook has even open-sourced some of their IT infrastructure design, down to the hardware component level.
So, from a technological implementation standpoint, there was no reason that healthcare.gov had to crash and burn so badly.
Ah, but there was a reason it crashed: the nature of politics.
Let's separate out the question of the contractor. There has been some discussion about why an American arm of a Canadian firm was contracted to do this project, but for the purpose of this discussion, we'll leave them out of it.
The fact is, there's one factor involved that may well have doomed any vendor to failure with Obamacare: politics.
Think about this for a second. Healthcare.gov rolled out to the entire United States population on one day. All at once. Of course it blew up.
That is not how you're supposed to roll out a system. Even Facebook started small, with limited functions, and supporting only Harvard students. It is a very bad practice to roll out your entire system to the full breadth of a huge user base all at once. A launch like that pretty much stands no chance of succeeding.
Unfortunately, the nature of politics pretty much doomed the Obamacare rollout to failure. Imagine if HHS had tried to do a staged rollout. They could have chosen one town (like Google has done with fiber), and tested it in a microcosm. But then, the news channels, the bloggers, and the politicians would have screamed favoritism.
Or HHS could have opened the service up to a limited number of initial ticket holders, say a few hundred or a few thousand. But how those initial tickets would have been distributed would have caused an uproar. If they were originally distributed to government employees, more charges of favoritism would have been leveled. If they were offered to a specific category or group of people, again there would have been complaints.
And yet, that's exactly what should have happened. The healthcare.gov infrastructure should have been rolled out to a very small group, tested, refined, and then rolled out to a slightly larger group, with ever more testing and ever more debugging.
Sure, it might have taken a year or more to do the rollout, but it's how this stuff is done. You can't just hang best practices in the cloak room because the system is fraught with political debate.
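For readers who want to see what "small group first, then a slightly larger group" looks like in practice, here is a minimal sketch of a staged rollout gate. To be clear, this is an illustration and not how healthcare.gov was or should have been built: the stage percentages, the error threshold, and the function names (bucket, is_admitted, next_stage) are all assumptions made up for this example.

```python
import hashlib

# Hypothetical rollout stages: the fraction of users admitted at each stage.
STAGES = [0.001, 0.01, 0.10, 0.50, 1.0]


def bucket(user_id: str) -> float:
    """Map a user ID to a stable value in [0, 1) so admission is repeatable."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use 10,000 discrete buckets; the result is always strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") % 10_000) / 10_000


def is_admitted(user_id: str, stage: int) -> bool:
    """A user gets in once the current stage's fraction covers their bucket."""
    return bucket(user_id) < STAGES[stage]


def next_stage(stage: int, error_rate: float, max_error_rate: float = 0.01) -> int:
    """Widen the rollout only if the observed error rate is acceptable."""
    if error_rate <= max_error_rate and stage < len(STAGES) - 1:
        return stage + 1
    return stage  # hold at the current stage and keep debugging
```

Because each user's bucket never changes, anyone admitted at an early stage stays admitted as the rollout widens, and the system only opens up further after the current group's error rate looks healthy.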
Frankly, the biggest mistake here was made by the administration, which didn't firmly set expectations and didn't firmly express that best practices come before politics. On the other hand, you can't entirely blame them, since their entire system has been under extreme hostile fire since the day it was proposed.
That said, that's the purpose of senior management, in an enterprise or in government. Once you decide to deploy a system, it's up to the IT professionals to build it using accepted professional practice. And it's up to senior management to run interference and make it possible for the professionals to do their work.
This didn't happen this time, and it probably won't the next time.
You want to know the worst part of this story? Now that the rollout failed, because it didn't iterate from small to large like it should have, it's being patched. Zack Whittaker has that story. We all know how hard it is to maintain a spaghetti-coded patchwork of rush-job coding fixes.
Because best practices weren't followed from the beginning, we can be pretty well assured that this beast will be a costly problem to maintain from Day One going forward.
The sad part of this story is it didn't have to be this way. Once again, I blame the politicians.