Red Dog: Can you teach old Windows hounds new tricks?

What led Microsoft -- a company that has spent a good part of the past decade-plus protecting the Windows franchise at the expense of the Web -- to finally create an infrastructure that would support not just Windows developers, but also Web programmers?
Written by Mary Jo Foley, Senior Contributing Editor

It has been four months since Microsoft took the official wraps off its cloud-computing initiative. Yet relatively little is still known about the Azure platform and plans.

The part of Azure which intrigued me the most was the cloud operating system, code-named "Red Dog," that is at its heart. Late last month, Microsoft allowed me access to many of the principals behind Red Dog -- everyone from the infamous father of VMS and NT, David Cutler, to the handful of top-dog engineers who helped design and develop the various Red Dog core components. Over the course of this week, I'm going to be publishing a post a day about Red Dog.

Part 1: It's not just about Windows any more

What led Microsoft -- which has spent a good part of the past decade-plus protecting the Windows franchise at the expense of the Web -- to finally create an infrastructure that would support not just Windows developers, but also Web programmers? And how did a company known for its slipping dates more than making its shipping dates manage to build a cloud-computing platform that developers could begin test-driving in less than two years?

Not so long ago, Microsoft probably would have simply rented out a bunch of Windows Server machines and expected that anyone inside or outside the company interested in making use of them would flock to pay for datacenter power by the hour. The Microsoft of old would have pitted multiple in-house development teams (unbeknown to each of them) against one another in designing the various cloud-computing components, with "the best" team ultimately winning. And the good ol' Soft would have, undoubtedly, counted on having four or five years to get its act together before even thinking about fielding a first cloud OS test build.

None of that happened with Red Dog. Even before the finishing touches were done on Windows Vista, CEO Steve Ballmer was talking with Amitabh Srivastava, Corporate Vice President, about his next assignment. At that point, Srivastava, a 12-year Microsoft veteran, was a leader in the Windows team and was in charge of redesigning the engineering processes around how Windows was built.

In 2006, "I was thinking about what to do next," Srivastava said. "Would I work on the next version of Windows?"

Chief Software Architect Ray Ozzie had joined Microsoft about a year before and "was completely high on services," Srivastava recalled. "But I didn't have a services background. Steve (Ballmer) suggested I go talk to Ray." Hours later, the pair were still talking.

Srivastava said he came away with a couple of key realizations:

1. Nobody really knows services. "Everyone" -- even "thought leaders" like Amazon and Google -- "are just five minutes into the first quarter," Srivastava quipped.

2. In spite of this fact, Microsoft actually had some substantial services experience, but not in Windows. Microsoft had been running services like Hotmail for more than ten years, Srivastava said, and there were more than 160 identifiable services running in various units throughout the company. So even if Srivastava were to do no more than create a unified infrastructure for Microsoft's own internal services, he'd still help the company realize huge cost savings, he said.

Srivastava coaxed Dave Cutler, the father of Microsoft's NT operating system, out of "semi-retirement," and began assembling a core team of engineers. Because of Srivastava's and Cutler's pedigrees, the team consisted primarily of "Windows guys."

So how do you turn a bunch of Windows guys into services guys?

The first order of business for the core team was educating themselves about the services world. The team rented a van and drove around Redmond and Silicon Valley for two-plus months to talk to the various teams at Microsoft that already had services know-how. They talked to the Hotmail team, the Virtual Earth team, the Xbox Live team. They asked each unit about its Windows pain points and solicited wish lists.

"We went to one of Microsoft's datacenters," Srivastava reminisced. "We all became ops guys for a day."

(It was during this tour that the "Red Dog" project got its code name. The inspiration? A seedy club in Silicon Valley called the "Pink Poodle.")

The team quickly realized that while Red Dog would be a technical project, most of the problems it would need to solve were business ones, according to Todd Proebsting, Director, Technical Strategy, Windows Azure, and a member of the original Red Dog core team.

"We had a euphoria stage, where we felt like we could do anything," Proebsting said. "Then we narrowed (the ideas) down, after talking to customers," many of whom worked at Microsoft on various teams. "We talked to Search and we talked to CarPoint," he said. "We asked them what we could do for them" and took note of the pieces that those teams already had developed which could be reused with Red Dog.

The findings: Cutting costs was essential. Providing greater levels of reliability was essential.

"If we could get the number of machines down or reduce the amount of required support staff, that was going to be huge," Proebsting said. The potential Red Dog customers wanted to know how long it would take to introduce a new feature or to respond to an uptick in server demand. Those were their hot buttons, he said.

Creating a 'services mindset'

"Our problem was how do you create a services mindset?" Srivastava said. Instead of spending years crafting an operating system and then deploying it, the team needed to think about writing pieces of software that would be deployed immediately.

The team published all of its members' cell phone numbers, so the Red Doggers had an immediate understanding of what 24X7 and "five nines" (99.999 percent) availability really meant.
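For a sense of scale, the downtime budget implied by an availability target is simple arithmetic. The sketch below is illustrative only: the function name and the 365-day year are assumptions made here, not anything Microsoft or the Red Dog team described.

```python
# Illustrative only: how much downtime per year an availability target allows.
# The function name and the 365-day year are assumptions for this sketch.

MINUTES_PER_DAY = 24 * 60


def downtime_minutes_per_year(availability: float, days_per_year: int = 365) -> float:
    """Return the minutes of downtime per year permitted by an availability
    target expressed as a fraction (for example 0.99999 for "five nines")."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    total_minutes = days_per_year * MINUTES_PER_DAY
    return total_minutes * (1.0 - availability)


# Compare a few common targets, from "three nines" to "five nines".
for target in (0.999, 0.9999, 0.99999):
    minutes = downtime_minutes_per_year(target)
    print(f"{target * 100:.3f} percent -> {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, "five nines" leaves a budget of about 5.26 minutes of downtime in an entire year, which goes some way toward explaining why everyone's cell phone number was on the list.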

Srivastava, with input from the core Red Dog group, authored a memo called "Owning the Cloud," which laid out their plan of attack. (Microsoft declined to make the memo available, claiming it includes proprietary information.) At this point, the team realized the key to the cloud was to be able to better manage the datacenter.

"The idea became managing a datacenter as an operating system," Srivastava said. "We wanted to abstract the whole thing and manage all the resources."

"We went in with a completely clean slate, and a lot of that credit goes to Ray (Ozzie)," Srivastava said. "We didn't want to just go and do another new Windows API (application programming interface). It wasn't about protecting Windows -- it was more about leveraging it."

In October 2006, Srivastava & Co. sent their vision statement to Ballmer. In November, Ballmer sent back mail that said "Go, Go, Go!" Srivastava said. A tight core team of about 20 senior engineers was assembled -- nothing like the size and scope of the thousands of folks working on Windows these days -- and off they went.

"Almost nobody inside Microsoft knew about Red Dog," Srivastava said. (Including Chairman Bill Gates, he noted.) "And none of us (on the team) knew if it was going to work."

The Red Dog team realized quickly that they couldn't ask any of the existing teams at Microsoft to take a chance on them by using their fledgling infrastructure, so they decided they needed to build their own datacenter test bed. The team commandeered some space in Building 18 and built a mini datacenter that was 1/20th the size of a full-fledged one. The team "took" the power it needed to run the center from three other buildings. Some of the other teams in 18 wondered what was going on in the offices next door. There were a few noise complaints, Srivastava recalled.

In April 2007, the team began writing code for the components it needed to run Windows Server 2008-powered datacenters: the fabric controller, a storage system, a virtualization substrate and a development environment. By November 2007, the first version of Red Dog was working.

(Watch for Tuesday's installment: Microsoft's Cloud OS Dream Team: A Who's Who)
