Database glitch causes Windows 7 download server meltdown

This morning, MSDN and TechNet subscribers were dismayed to find that downloads of the Windows 7 Release Candidate began bogging down shortly after they were made available. For several hours after the official launch, most subscribers who tried to log on found themselves unable to reach the download pages. The problem, I'm told by a Microsoft insider, wasn’t server capacity. Instead, the glitch (now fixed) was caused by a database configuration problem. I've got details and a startling graph.

When Microsoft released the Windows 7 beta for public download in January, the resulting demand overwhelmed its servers, delaying the launch by a day and giving the software giant’s capacity planners a black eye.

This morning at 6AM PDT, when Windows 7 Release Candidate downloads were officially made available for MSDN and TechNet subscribers, it looked like a sequel to that botched release. After 20 minutes or so of smooth downloads, both sites began bogging down, and the situation deteriorated rapidly as the minutes passed. For several hours after the official launch, most subscribers who tried to log on found themselves unable to reach the download pages.

This time, though, the problem wasn’t capacity. Instead, a source tells me, the glitch was caused by a SQL Server database that reached excessive fragmentation levels because of the tremendous surge of queries. How massive was the demand surge? The number of requests to the MSDN and TechNet databases in less than an hour was equal to more than a week’s traffic under normal circumstances.

The following graphic is from an internal Microsoft document explaining what happened. The blue line indicates percentage of processor usage, which is directly tied to fragmentation of the SQL Server database:

After the SQL Server index was rebuilt (just after 9:30AM), processor use dropped back to high but acceptable levels. I’m told that Microsoft engineers are now monitoring the status of this database every 30 minutes and plan to rebuild the indexes every evening to avoid a recurrence of the problem.
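For readers wondering what that kind of monitoring might look like in practice, here is a minimal sketch. It is not Microsoft's actual tooling, which my source did not describe. It assumes fragmentation figures have already been read from SQL Server's `sys.dm_db_index_physical_stats` view (the query is included as a string for reference), and it applies the commonly cited rule of thumb of reorganizing an index at roughly 5 percent fragmentation and rebuilding it at roughly 30 percent. The `IndexStat` type, the `maintenance_statement` function, and the table and index names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# T-SQL that reports fragmentation per index in the current database.
# Shown for reference only; this sketch does not connect to a server.
FRAGMENTATION_QUERY = """
SELECT OBJECT_SCHEMA_NAME(ps.object_id) AS schema_name,
       OBJECT_NAME(ps.object_id)        AS table_name,
       i.name                           AS index_name,
       ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE i.name IS NOT NULL;
"""


@dataclass
class IndexStat:
    """One row of fragmentation data (hypothetical container type)."""
    schema: str
    table: str
    index: str
    fragmentation_pct: float


def maintenance_statement(
    stat: IndexStat,
    reorganize_at: float = 5.0,   # assumed threshold, not from the article
    rebuild_at: float = 30.0,     # assumed threshold, not from the article
) -> Optional[str]:
    """Return the T-SQL maintenance statement for an index, or None if it is healthy."""
    target = f"ALTER INDEX [{stat.index}] ON [{stat.schema}].[{stat.table}]"
    if stat.fragmentation_pct >= rebuild_at:
        return f"{target} REBUILD;"
    if stat.fragmentation_pct >= reorganize_at:
        return f"{target} REORGANIZE;"
    return None


if __name__ == "__main__":
    # Hypothetical sample readings, as a periodic check might collect them.
    readings = [
        IndexStat("dbo", "Subscribers", "IX_Subscribers_Id", 72.5),
        IndexStat("dbo", "Downloads", "IX_Downloads_Date", 12.0),
        IndexStat("dbo", "Products", "IX_Products_Sku", 1.3),
    ]
    for reading in readings:
        print(reading.index, "->", maintenance_statement(reading) or "no action")
```

A scheduled job could run a check like this every 30 minutes and flag anything that crosses the rebuild threshold, while a nightly job rebuilds indexes regardless, which matches the routine my source described.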

Reached for comment, a Microsoft spokesperson told me, "Due to high volume of traffic on the MSDN and TechNet sites this morning, many people may have experienced difficulties trying to download the Windows 7 Release Candidate. Microsoft has made changes to accommodate the increased traffic and subscribers shouldn't experience any further issues."

The good news for anyone awaiting the public download next Tuesday (May 5) is that those pages are not tied to the same sort of subscriber database as the ones on MSDN and TechNet. Still, you can bet that an army of engineers will be watching that surge of traffic and wondering whether the third time is the charm.