Part of the problem with the documentation and identification issues I talked about last week - and will talk about more later - is that it is very hard to separate information from disinformation.
Disinformation comes in three major forms:
- innocent mistakes;
- intentional disinformation (aka FUD); and,
- (self) delusion.
Delusions are easily the most dangerous of these. In the IT context the most common delusion is simply that what we know is right in general or applicable to some specific issue when, in reality, it isn't. We know, and we act accordingly - with frequently catastrophic results.
FUD, taken as the art of spreading fear, uncertainty, and doubt, is at its most dangerous when it plays on existing certainties to reinforce delusion.
A recent report by Security Innovation comparing Windows and Linux seems to fall squarely into that category.
Reader "Mired in Zealand" brought this thing to my attention late last week and both Dana Blankenhorn and I decided to respond via our askbloggie column. We asked George Ou to speak for Microsoft on this one, but we haven't heard from him yet. Both Dana and I will, however, be filing our comments on the askbloggie site later today.
Meanwhile, here's my contribution to that debunking.
Does Windows Rule?
Here's the reader request:
I rec'd this email newsletter today, and I found it very interesting, and admittedly down right controversial. As a Windows guy, even I was having some trouble believing that Windows is such the slam-dunk winner that it's purported to be over Linux. What are your 2 cents? I'd love to see a blog entry on this. This is of particular interest to my IT shop since we're contemplating a move from our z/OS environment to possibly a Linux environment.
From: WindowsITPro Update
Sent: Tuesday, November 22, 2005 2:19 PM
To: Mired in Zealand
Subject: Microsoft vs Linux 2005: It's All About Reliability
by Paul Thurrott, News Editor, firstname.lastname@example.org ...
OSS proponents have been pushing the supposed security, reliability, and durability advantages of Linux over Windows for years now. My gut feeling has always been that were Linux installed in as many production environments as Windows, it would fall apart as much or more, albeit in different ways. What's lacking, of course, is evidence. Whereas Microsoft has sponsored study after study to examine the competitive advantages of Windows and Linux, the cozy relationships between the software giant and the companies making these studies always made the results less than believable.
Last week, however, I think we reached a turning point in understanding how Linux and Windows differ in the real world. Yes, yet another study is involved, and yes, Microsoft commissioned this one as well. However, the company that performed the study, Security Innovation, is highly regarded for its independence and methodology. In this study, "Reliability: Analysing Solution Uptime as Business Needs Change", [URL added - murph] Security Innovation examines the real-world reliability of Windows and Linux, not abstract and often pointless statistics such as uptime.
As part of the study, sets of experienced Windows and Linux systems administrators were given control of e-commerce environments based on their respective systems. The Windows environments were based on Windows 2000, then upgraded to Windows Server 2003 and any applicable hotfixes and security patches during the simulated year of the study. The Linux environments began life with Novell SuSE Linux Enterprise Server 8 and were upgraded to SuSE 9 and any applicable updates. Both groups of administrators had to configure and maintain the systems over time, introduce new functionality to the e-commerce application over time (including personalisation, dynamic search, and list-targeting features), and perform the major OS version upgrades. Security Innovation examined the performance of the administrators, noting how long they took to execute each task.
At a high level, the Windows systems were dramatically more reliable than the Linux systems. On average, patching Linux took six times longer than patching Windows, and there were almost five times as many patches to apply on Linux (187) as there were on Windows (39). More important, perhaps, the Linux systems suffered from 14 "critical breakages," software dependency failures in which software simply stopped working on those systems. Windows had no dependency failures.
Sounds compelling, doesn't it? I thought so; in fact, I thought both this newsletter and the Reliability study it reports on were among the best things of this kind I've seen. On the other hand there's a challenge to the Linux community here we'd be fools to ignore.
As step one, let's look closely at what the underlying study actually says. Paul Thurrott, the newsletter writer quoted above, captured its central argument very well: it's about comparing what happens when you put both Linux and Windows into a production environment and then upgrade both the OS and the applications suite over the period of a year.
In fact, however, Security Innovation didn't actually do this. Instead they simulated it by compressing all activity into a period of unknown length - whether days or weeks, they don't say.
During that period three people hired as experienced Linux administrators and three Windows people were each given responsibility for a machine and asked to:
- apply security and recommended patches on a simulated monthly release basis;
- upgrade the e-commerce application with new functionality at the end of each simulated quarter (i.e. change it to meet changing business requirements); and,
- upgrade the core OS from SuSE 8.0 to 9.0 and from Windows 2000 server to Windows 2003/XP server at the end of the simulated year.
Here's part of Security Innovation's summary of what came out of this:
- Two of the three Linux administrators were unable to meet all business requirements within the time constraints of the study; in contrast, all three Windows administrators met all business requirements
- On average the three Linux administrators were about 70% slower than their Windows counterparts to fulfill business objectives. This was in part driven by more system failures experienced by the Linux administrators (14 compared to 0 for the Windows administrators) and a greater number of patches that needed to be applied to the Linux systems (in total, 187 compared to 39 for Windows).
- The only Linux administrator who was successful in meeting all requirements installed components and component versions that were not directly supported by the vendor (and in some cases custom compiled) that effectively put his system into an unsupported configuration. While the configuration did meet functionality requirements, the administrator is now "on his own" to resolve potential future system failures. It has also increased the IT administrative burden given that any future patches to the unsupported components would now have to be gathered from alternate sources and in some cases edited at the source code level and recompiled. On the Windows front, the system was maintained by components provided either from Microsoft or from the 3rd party component vendor and all configurations were within the boundary of support.
Not exactly good news for Linux, is it?
And then again, maybe a closer look is required before we draw conclusions.
The first problem is that they don't say which patches they applied. In the period given, July 1st 2004 to June 30th 2005, Novell apparently released 237 patches, not 187. They also don't say which e-commerce application they used, or which third party upgrades were implemented, so we don't know how many patches applied specifically to those elements of the overall configuration.
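To make the gap concrete, here is a minimal arithmetic sketch using only the three counts quoted so far (187, 39, and 237). The variable names are mine, not the report's, and the figures are taken at face value rather than independently verified.

```python
# Counts as quoted above: patches the study says it applied, and the
# number Novell apparently released over the same period.
linux_patches_applied = 187
windows_patches_applied = 39
novell_patches_released = 237

# How many times more patches the Linux side applied than the Windows side.
ratio = linux_patches_applied / windows_patches_applied

# What fraction of Novell's releases the study's Linux count represents.
coverage = linux_patches_applied / novell_patches_released

# Patches released by Novell that the study's count does not account for.
unexplained = novell_patches_released - linux_patches_applied

print(f"Linux/Windows patch ratio: {ratio:.1f}")
print(f"Share of Novell patches applied: {coverage:.0%}")
print(f"Patches unaccounted for: {unexplained}")
```

So the newsletter's "almost five times as many" holds up arithmetically, but 50 of Novell's releases are missing from the study's count with no explanation of how the cut was made.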
Thus the numbers they give suggest they applied some subset of the patches issued by Novell, but they don't tell us which ones. Here's the first five letters' worth of an alphabetical listing of what Novell's 237 patches applied to:
a2ps: Converts ASCII Text into PostScript
aaa_base: SuSE Linux base package
acl: Commands for Manipulating POSIX Access Control Lists
acpid: Executes Actions at ACPI Events
apache2-mod_python: Python module for the Apache 2 web server
arts: Modular software synthesiser
arts-devel: Include Files and Libraries mandatory for Development.
aspell: A Free and Open Source spell checker
aspell-devel: Include Files and Libraries Mandatory for Development
bison: The GNU Parser Generator
bootsplash-theme-SuSE: Default SuSE Bootsplash Theme
bootsplash-theme-SuSE-Home: Default SuSE Linux Enterprise Server Bootsplash Theme
bzip2: A program for compressing files
cadaver: Command-line WebDAV client for Unix
coreutils: GNU Core Utilities
cups: The Common UNIX Printing System
cups-client: CUPS Client Programs
cups-devel: development environment for CUPS
cups-libs: libraries for CUPS
curl-devel: header files and libraries for curl development
cvs: Concurrent Versions System
cyrus-imapd: An IMAP/POP Mailserver
cyrus-sasl: Implementation of Cyrus SASL API
cyrus-sasl-devel: Cyrus SASL API implmentation, Libraries and Header files
dhcp-server: ISC DHCP Server
drbd: Distributed Replicated Block Device
dvd+rw-tools: A Collection of Tools for Mastering DVD+RW/+R Media
emacs: GNU Emacs Base Package
enscript: An ASCII to PostScript(tm) Converter
evolution: The Integrated GNOME Mail, Calendar, and Addressbook Suite
evolution-devel: Include Files and Libraries mandatory for Development.
exim: The Exim mail transfer agent, a replacement for sendmail
ez-ipupdate: A small utility for updating dynamic DNS service
A lot of these are marked as security updates, but almost all of the software they apply to has no place in an e-commerce configuration. With Windows servers you install everything you're licensed to because the dependencies are largely unknown; with Linux you install what you need - because what isn't there doesn't have vulnerabilities, use resources, or require patching.
In other words, knowledgeable Linux people configuring and running those servers might have had to install perhaps five or six Linux-related patches during the year - nothing like 187, and none with recursive dependency tails of the kind that got two out of the three testees in trouble.
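The "install only what you need" argument can be sketched as a simple filter: keep only the patches whose package is actually on the machine. The patched names below are a subset of the listing above; the installed set is purely my assumption about a minimal Apache/PHP/MySQL server (the report doesn't say what was installed), and the names "apache2", "php4", and "mysql" in it are illustrative placeholders.

```python
# Package names taken from the alphabetical listing above (a subset, for brevity).
patched_packages = [
    "a2ps", "aaa_base", "acl", "apache2-mod_python", "arts", "aspell",
    "bzip2", "coreutils", "cups", "cvs", "cyrus-imapd", "emacs",
    "evolution", "exim",
]

# ASSUMPTION: a hypothetical minimal install for an Apache/PHP/MySQL server.
# The report does not say what was actually installed on the test machines.
installed_packages = {
    "aaa_base", "acl", "bzip2", "coreutils", "apache2", "php4", "mysql",
}


def relevant_patches(patched, installed):
    """Return the patched packages that are actually installed, in original order."""
    return [name for name in patched if name in installed]


needed = relevant_patches(patched_packages, installed_packages)
skipped = [name for name in patched_packages if name not in installed_packages]

print(f"{len(needed)} of {len(patched_packages)} patches apply: {needed}")
print(f"{len(skipped)} can be ignored: {skipped}")
```

Under that assumed install, only four of these fourteen patches would need attention; the printing, mail, desktop, and editor packages never enter the picture because they were never installed.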
The second problem is something the author doesn't mention at all: "management" has clearly told these administrators to apply the patches directly to the "production" systems. In real life many people do this with Windows, but you don't do this with Linux. With any Unix you duplicate your production environment on the sysadmin's workstation and debug any processes to be applied to production there before proceeding. They don't say why they didn't do this, but a reasonable speculation is that there were two reasons: the simulation would have imposed unrealistic calendar time constraints, and, probably more importantly, this isn't the Windows way, and they did everything the Windows way.
The third set of problems arises because of the way they handled the e-commerce application upgrades.
Again, there's a shortage of critical information in the report: they don't tell us which e-commerce application they started with, and they don't tell us which third party upgrades they installed. Instead we get this about the quarterly application upgrades:
These feature enhancements will be simulated by adding best-of-breed third party components to the system that meet new requirements. In the running ecommerce example, this could mean adding a new shopping cart component or an add-in data mining tool. In many cases there will be multiple 3rd party products that satisfy functional requirements. Our selection among these alternatives will be made strictly based on largest market share among enterprise customers.
During the experimental trials, 3rd party best-of-breed components were chose to satisfy the needs of the solution. Our criteria for selection of components were:
- Support on both Windows and Linux
- Strong and established base of enterprise customers
In other words, the game was to add components chosen on the basis of market share and availability for both Windows and Linux. That sounds fair, but they sabotaged it from the get-go by choosing quite dissimilar starting points:
S1 [the starting point] is a basic ecommerce application running on the Windows Server 2000 operating system, written in ASP, hosted by IIS using the SQL Server 2000 database that is operating on June1st, 2004. Similarly, we define S1 on the Linux side to be a basic ecommerce application running on Novell SuSE Linux Enterprise Server 8, written in PHP, hosted by Apache using the MySQL database engine.
The problem is that requiring component upgrades to run on both Windows and Linux looks like it's intended to level the playing field but has the opposite effect: it takes the best open source applications out of consideration, because these might run on Windows but not with ASP and SQL Server, and it limits the number of vendors on the Windows side to one.
As a result the Windows administrators were merely asked to load new modules "from the 3rd party component vendor" (P3, note singular) while the Linux administrators were expected to integrate dissimilar bits and pieces taken from multiple incompatible sources.
Let me be clear about this: the right thing would have been to do on Linux what the Windows market structure apparently forced them to do on Windows: take a single-vendor integrated solution known to contain all the components needed for the end product, partially install it, and then upgrade it "quarterly."
But that's not what they did: instead the Windows people were asked to load pre-integrated modules while the Linux administrators faced integration and interfacing problems on unrelated code bundles.
Amazingly enough, one of them succeeded in keeping his machine "in production" all the way through!
In stage magic the emphasis is always on distraction - get the audience focused on what the pretty girl isn't wearing and nobody will notice the lighting change behind the magician. This works in paid advocacy studies too - get everyone focused on a known and widely shared pain like upgrade dependencies in non-core toolsets and few will notice that you're crippling Linux by applying Windows methods (install everything) and Windows management ideas (interface the most popular best-of-breed components) where they don't fit.
Looking at this you might think it would be reasonable to describe the result as classic Microsoft anti-Linux FUD - a lie from one end to the other. However, there are a couple of reasons for thinking that maybe this isn't so.
In the first place there are lots of people who actually try to run Linux in just this way and presumably get just these results. They're getting, of course, just what they deserve - but this is the biggest problem in business computing: managers and administrators whose certainties about running systems, drawn from one environment, get applied to another to create what the authors rightly call "IT pain."
See this report in that context and what we have is a positive story in which one of three guys hired for their claimed Linux expertise and given wildly inappropriate operating instructions manages to pull it off.
As I've said many times, it's not Linux or its applications that are at fault when this happens: the problems documented in the study are largely the result of applying Windows expertise to Linux - something I see people do almost every day, and something "Mired in Zealand" will be seeing a version of at first hand if his organization transitions from z/OS to Linux without a lot of retraining, rethinking, and re-staffing first.
The second reason not to dismiss this study as mere FUD is subtler. The fact that this company calls itself "Security Innovation" but works with Windows suggests some internal conflict has to exist - and the structure of much of the report leads to a "wild surmise" as to what one of those might be about.
Read it carefully and you'll see that most of the verbiage is cast as you'd expect to see it in a proposal to Microsoft to do this study, not as you'd expect to see it in a report about the outcome of the study. Thus the construction: "we will [do something]" occurs at least 65 times in the report. For example:
For each failure we will do a root cause analysis to determine its source. These causal factors will be written up and documented in our analysis. Specifically, we will capture metrics around dependency failures, version demand conflicts and other potential sources of failure.
Hence the "wild surmise": these guys might well have set out to settle an internal argument by doing exactly what they report, exactly as they report it - only to call Microsoft for funding and publicity when their mistakes on the Linux side seemed to give Windows such a huge lead in performance and reliability.
In other words it's possible to see this report as wrong on all counts, and yet not only credit the authors with a legitimate attempt to come to grips with a real problem but feel sorry for them, because what they ended up doing, in all innocence, was writing a case study on how not to deploy Linux.
As I said above, FUD is at its most dangerous when it supports and reinforces delusion. In my opinion that's what happened here: these people got just about everything about running Linux wrong, found what they hoped to find not as the result of any actual Linux/Windows differences but as a result of their own delusions about systems management, and then used Microsoft's money and press access to spread those delusions to others.