What's the real story on the Windows Home Server data corruption bug?

Last week, an alarmingly terse Knowledge Base article got the undivided attention of Windows Home Server users with its warning that they risk data corruption if they edit files stored on a home server using a handful of popular programs. How widespread is this bug, really, and why wasn't it caught during the long beta test cycle? I've got some inside information.
Written by Ed Bott, Senior Contributing Editor

In the software industry, data-damaging bugs are every product manager's nightmare. When a reproducible bug in this category is identified, sirens go off, vacations get canceled, engineers lose sleep, and product managers pop Maalox until it's fixed.

That's the context behind the alarmingly terse Knowledge Base article 946676, published last week. The entire article encompasses only a few sentences, but it got the attention of anyone using Windows Home Server:

When you use certain programs to edit files on a home computer that uses Windows Home Server, the files may become corrupted when you save them to the home server. Several people have reported issues after they have used the following programs to save files to their home servers:

  • Windows Vista Photo Gallery
  • Windows Live Photo Gallery
  • Microsoft Office OneNote 2007
  • Microsoft Office OneNote 2003
  • Microsoft Office Outlook 2007
  • Microsoft Money 2007
  • SyncToy 2.0 Beta

Additionally, there have been customer reports of issues with Torrent applications, with Intuit Quicken, and with QuickBooks program files. Our support team is currently trying to reproduce these issues in our labs.

I asked a senior member of the Windows Home Server team for more details yesterday. Here's what I learned:

This is not an issue that affects every Windows Home Server installation, and the symptoms require several factors that are not mentioned in the KB article. The largest contributing factor is extreme load on the home server. If you're running a large, highly demanding file copy operation in the background while you use one of the listed applications to edit a file stored in a shared folder on the home server, and you then save the edited file to the server, you might see this bug.

In fact, it took a long time to get a reproducible series of steps for this issue. A number of reports of data corruption that appeared to be related to this issue turned out instead to be traceable to faulty network cards, hard drive failures, or old routers with outdated firmware. It took some very detailed bug reports, accompanied by sample files and server logs, to create a consistently reproducible environment in the lab; that's the missing piece that it takes to isolate the root cause and develop a patch.

Meanwhile, backups stored on a Windows Home Server are completely safe, as are files copied to the server for safekeeping or streaming. This issue affects only files that are saved directly from one of the listed applications to a shared folder on a Windows Home Server.

No one I talked to at Microsoft is minimizing the impact of this bug. That bare-bones KB article was specifically designed to "get people to take it seriously," I was told.

So why wasn't this issue identified months ago, during the long beta test cycle for Windows Home Server? That's the trouble with beta testing, as I know from firsthand experience. Last summer, after the Windows Home Server beta cycle had officially ended but before the software had been released to the public, I noticed that some program files stored on my custom-built Windows Home Server box were being mysteriously corrupted. Trying to open one of the affected files didn't launch a Windows installer, as expected; instead, a Command Prompt window opened for a split second and then closed without doing anything. The file icon had changed to a generic MS-DOS icon, and the file properties suggested that these Windows programs had been transformed into MS-DOS programs. It didn't affect every program, and the corruption seemed to be random.

In searching through bug reports, I found two or three other, similar reports, all of which had been closed as "not reproducible." I filed a report anyway and heard back from an engineer who peppered me with questions. Over the course of the next few days, we narrowed down the scope of the bug and created a repro test case:

  • The files had to be fairly large, at least 2 or 3 megabytes in size.
  • They had to have been downloaded from the Internet on a Windows machine, which in turn adds an alternate data stream (Zone.Identifier) that blocks execution of the file without user consent.
  • They had to have been uploaded to the Windows Home Server from a machine running Trend Micro antivirus software. Other AV and security programs didn't trigger this bug.

That's a fairly complex series of conditions, and it's not surprising that it took some time and sleuthing to pin down the exact combination. But when the issue was documented in Knowledge Base article 943393, none of those additional details were mentioned.
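
The second condition above hinges on the alternate data stream that Windows attaches to downloaded files, which it names Zone.Identifier. As a rough illustration only, here is a minimal Python sketch that checks whether a file carries that mark. The function names `parse_zone_id` and `read_zone_id` are my own, not part of any Microsoft tool, and the sketch assumes the stream holds the usual INI-style `[ZoneTransfer]` section with a `ZoneId` value (3 is the Internet zone). Reading the stream works only on an NTFS volume under Windows; anywhere else, `read_zone_id` simply returns `None`.

```python
import configparser
from typing import Optional


def parse_zone_id(stream_text: str) -> Optional[int]:
    """Return the ZoneId found in Zone.Identifier text, or None if absent."""
    # Interpolation is disabled so '%' characters in other keys are harmless.
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(stream_text)
    except configparser.Error:
        return None  # Not INI-style text, so treat it as "no mark".
    if not parser.has_option("ZoneTransfer", "ZoneId"):
        return None
    try:
        return int(parser.get("ZoneTransfer", "ZoneId"))
    except ValueError:
        return None


def read_zone_id(path: str) -> Optional[int]:
    """Read a file's Zone.Identifier stream (assumes NTFS on Windows)."""
    try:
        # On NTFS, "name:stream" opens the named alternate data stream.
        with open(path + ":Zone.Identifier", "r",
                  encoding="utf-8", errors="replace") as stream:
            return parse_zone_id(stream.read())
    except OSError:
        return None  # No such stream, or not a file system that supports one.
```

Calling `read_zone_id` on a freshly downloaded installer on a Windows machine would typically return 3; a file created locally normally has no such stream at all.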

That bug was patched within a few weeks after the KB article was published (the details are in KB article 941914), and the fix was pushed out in mid-November to any Windows Home Server box via Windows Update.

I fully expect the current bug to be patched fairly quickly now that a repro case is available. Meanwhile, it pays to be conservative and heed the advice of that KB article, even if the odds are relatively low that this particular bug will strike you.
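
For readers who want a sanity check in the meantime, one conservative habit is to edit a local copy, copy it to the share, and confirm the two files still match. The sketch below is my own illustration, not a workaround from the KB article or from Microsoft. The helper names `sha256_of` and `copy_and_verify` are hypothetical, and the share path is whatever folder you point it at. Treat a match as reassuring rather than conclusive, because the read-back may be served from a client-side cache.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(local_path, share_dir):
    """Copy local_path into share_dir; return True if both copies hash the same."""
    destination = Path(share_dir) / Path(local_path).name
    shutil.copy2(local_path, destination)
    return sha256_of(local_path) == sha256_of(destination)
```

If `copy_and_verify` ever returns `False`, keep the local original and copy it again once the server is idle.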
