How the Red Dog dream team built a cloud OS from scratch




It has been four months since Microsoft took the official wraps off its cloud-computing initiative. Yet relatively little is known about the Azure platform and plans.

The part of Azure which intrigued me the most was the cloud operating system, code-named “Red Dog,” that is at its heart. Late last month, Microsoft allowed me access to many of the principals behind Red Dog — everyone from the infamous father of VMS and NT, David Cutler, to the handful of top-dog engineers who helped design and develop the various Red Dog core components. Over the course of this week, I’m going to be publishing a post a day about Red Dog.

Starting from scratch

Before the Red Dog operating system or the larger Azure stack was even a gleam in anyone's eye, Corporate Vice President Amitabh Srivastava had the opportunity to do almost anything he wanted. He could hand-pick a team of the best and brightest to develop a new Microsoft platform for the cloud.

Srivastava, who admitted he is "very anti-process," assembled a handful of engineers he knew from various Windows and Research assignments at Microsoft. He knew he wanted to keep the core group small and well-knit.

"If you only have 20 people, you don't need as much process. It's not like trying to make sure 5,000 people are all on the same page." (Only recently did the Red Dog team expand, with new services-specific hires from Ask, Yahoo and other non-Windows-centric companies. The current headcount for the Red Dog team is about 150, Srivastava said.)

His first intended recruit was Dave Cutler, the father of NT and VMS. Cutler "didn't need to write another OS," Srivastava acknowledged, but his "weakness is that he loves coding" and solving hard problems. Srivastava convinced him to join the team. Srivastava also consulted with Todd Proebsting, a former Microsoft Researcher and director of the company's Center for Software Excellence. He called a few other former colleagues: storage expert Brad Calder; former Sun utility computing expert turned Microsoft Distinguished Engineer Yousef Khalidi; programming tool and OS specialist Hoi Vo; engineering whiz G.S. Rana; datacenter provisioning expert Hunter Hudson; and developer evangelist Manuvir Das.

"The quality of the communication (between the team) affected the agility and the quality," said Rana, the General Manager of Engineering for Red Dog. "A lot of us had worked together for a long time."

(For a Red Dog core-team "Who's Who list," check out this slide show.)

After an initial two-plus-month fact-finding mission, during which the core team met with various Microsoft services teams in Redmond and Silicon Valley, the Red Dog team had some ideas of what they did and didn't want to do.

"We said, let's not try to copy Google or Amazon," Srivastava recalled. "We said we'd run things very differently."

The team decided to keep their approach and their mission a secret, even from the Microsoft management. CEO Steve Ballmer knew Srivastava and his core group were working on something for the cloud, but that was about all he knew.

"Steve (Ballmer) asked me, 'Why are you hiring all our best people?'" Srivastava joked. But he didn't share much, beyond his overall vision statement, with the sometimes loose-lipped CEO.


Keeping it simple

Srivastava and his team decided to make use of assets the company already had -- specifically Windows Server 2008 -- to power Microsoft's datacenters.

"Our biggest learning was if it's not simple, it's not going to work," Srivastava said. "There was a lot of infrastructure that already existed"-- the Windows operating system, tools, debuggers. The idea was to harness all of these things and then "force" a programming model on top of it from Day 1.

The team decided to build a layer -- with pieces akin to what is inside a modern-day operating system -- to manage the thousands of Windows Server machines. A "fabric controller" would manage the cloud; a storage subsystem would act like a traditional "file system" for all of the servers; a virtualization layer, derived from Microsoft's Hyper-V hypervisor, would be at the lowest level between the servers and the rest of the datacenter "operating system."

(Calling Red Dog an "operating system" is an oversimplification, as team members are quick to point out. But each of its components has a parallel in the modern-day operating system world. Red Dog handles switches, load balancers and servers the way a client OS handles device drivers.)
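For readers who think in code, the device-driver analogy can be sketched roughly as follows. This is only an illustration of the idea, not Microsoft's design: the class names (`ResourceDriver`, `FabricController`), the methods and the resource names are all hypothetical, and the real fabric controller's interfaces are not described in this article.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ResourceDriver(ABC):
    """Hypothetical uniform interface for a datacenter resource.

    It plays the role a device driver plays in a client OS: the
    controller talks to every resource the same way.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = False

    @abstractmethod
    def kind(self) -> str:
        """Return a short label for the type of resource."""

    def start(self) -> None:
        self.online = True

    def stop(self) -> None:
        self.online = False


class Server(ResourceDriver):
    def kind(self) -> str:
        return "server"


class Switch(ResourceDriver):
    def kind(self) -> str:
        return "switch"


class LoadBalancer(ResourceDriver):
    def kind(self) -> str:
        return "load_balancer"


class FabricController:
    """Hypothetical controller that manages all registered resources."""

    def __init__(self) -> None:
        self._resources: Dict[str, ResourceDriver] = {}

    def register(self, resource: ResourceDriver) -> None:
        if resource.name in self._resources:
            raise ValueError(f"duplicate resource name: {resource.name}")
        self._resources[resource.name] = resource

    def start_all(self) -> None:
        for resource in self._resources.values():
            resource.start()

    def status(self) -> Dict[str, List[str]]:
        """Group the names of online resources by kind."""
        grouped: Dict[str, List[str]] = {}
        for resource in self._resources.values():
            if resource.online:
                grouped.setdefault(resource.kind(), []).append(resource.name)
        return grouped
```

In this sketch the controller never needs to know whether it is starting a server, a switch or a load balancer, which is the point of the comparison to device drivers.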

The process of "how you architect software was transferrable from our previous knowledge," said Khalidi, a Distinguished Engineer focusing on Enterprise Strategy who spearheaded the fabric-controller piece of Red Dog. "From Day 1, you just have to think about how to deploy in very large scale."

There's also a transfer of knowledge between the existing Windows teams and the Red Dog team. New features that the Red Dog team builds for its kernel/hypervisor, when applicable, are slated to be folded back into the next version of Windows, for example.

"We touch every component of Windows and tools. We know we want to push the hardware as much as we can," said Vo, Director of the Red Dog (Azure) Operating System.

The Red Dog team, even now that the cat is out of the bag (so to speak), is still big on secrecy. They are part of that "under-promise and over-deliver" school that is growing inside Microsoft. But more and more teams at the company are being moved to the Red Dog platform, starting with Live Mesh, HealthVault and Live Meeting. External beta testers of Microsoft's .Net Services/Live Services platform also are starting to test Red Dog's limits.

(What's Dave Cutler been up to? Tune in to tomorrow's installment for a Q&A with the father of Windows NT on his role in the Red Dog team.)

Topics: Windows, CXO, Microsoft, Operating Systems, Software

About

Mary Jo has covered the tech industry for 30 years for a variety of publications and Web sites, and is a frequent guest on radio, TV and podcasts, speaking about all things Microsoft-related. She is the author of Microsoft 2.0: How Microsoft plans to stay relevant in the post-Gates era (John Wiley & Sons, 2008).


Talkback

18 comments
  • Keeping it simple is good.

    Thanks for your attention to my post requesting a definition of Red Dog yesterday.

    I was hoping it might be in the introduction. But I did encounter more later:

    Calling Red Dog an "operating system" is an oversimplification, as team members are quick to point out. But each of its components has a parallel in the modern-day operating system world. Red Dog handles switches, load balancers and servers the way a client OS handles device drivers.

    [End quote.]

    Well, okay, though keeping it simple is good, over-simplification is bad, I guess.

    That gave a picture I thought I understood, until:

    But more and more teams at the company are being moved to the Red Dog platform, starting with Live Mesh, HealthVault and Live Meeting. External beta testers of Microsoft's .Net Services/Live Services platform also are starting to test Red Dog's limits.

    [End quote.]

    So Red Dog is a "platform"? Is the rest of Azure involved in this use of Red Dog?

    Just trying to keep things sorted out.
    Anton Philidor
    • naming conventions

      Yes, the naming makes things kind of complicated.

      Azure is the full stack. But they also are using Azure as the final name for Red Dog.

      So Azure the platform is built on top of the Azure (Red Dog) OS.

      This picture might help. These are the layers:

      http://blogs.zdnet.com/microsoft/?p=1671

      More info on some of the top layers will be part of the series later this week. Thanks. MJ

      Mary Jo Foley
  • RE: How the Red Dog dream team built a cloud OS from scratch

    "Srivastava, who admitted he is 'very anti-process,'"
    What I hear is there was chaos in there.


    "The quality of the communication (between the team) affected the agility and the quality," said Rana, the General Manager of Engineering for Red Dog.

    Oh really, what I hear is there is so much chaos that no one has a clue what's going on. Engineers work on one feature only to realize a week later that the feature was deprecated. So to hear the 'quality of communication' is IRONIC.
    BrutalTruth
    • Chaos

      Hi. I'm sure it hasn't been all smooth sailing. Is any team effort?

      But if you have some specific examples of deprecated features and Red Dog chaos, I'd be interested in hearing about them. Feel free to email me privately if you'd prefer. mjf at microsofttracker dot com. Thanks MJ
      Mary Jo Foley
  • How long will Ballmer last if this is successful?

    One would think that if he's out of the loop he may as well be out of the picture.
    softwareFlunky
    • A very long time I suspect

      Ballmer is CEO, I don't think he could be considered out of the loop. Sure, he may not be punching code like Bill did (he might be), but I'm sure Ballmer has enough there to keep him busy (and rich).

      I hope Dave cutler puts some more VMS like features in MS's OS's ASAP BTW IMHO.
      Aussie_Troll
  • YACOS (yet another centralized OS)

    Seriously, who cares? What I don't need is another shared resource system. Life will be so much better when these mainframe guys retire.
    happyharry_z
    • Probably not centralised enough.

      What has been described is distributed computing. A document can be accessed, and potentially edited, from multiple desktops, laptops, and handhelds. Up to now, this is a recipe for corruption, or else a very complex change-control problem. So how would you handle it, and what did you have in mind when the mainframe guys go?
      peter_erskine@...
    • YACOS is YAUA

      Since this appears to be an upper level OS-like structure, I think it's a YAOS (yet another operating system). Using an additional acronym implies that YACOS is YAUA (yet another useless acronym).
      Have a great day,
      Steve :)
      yet_another
  • RE: How the Red Dog dream team built a cloud OS from scratch

    I'm still waiting for the Cloud privacy and security shoe to drop. None of the stories I've read to date give information protection in the cloud more than a sentence - if that.

    Given regular reports of major data breaches in conventional technology environments, corporate risk managers are asking: "How well are my intellectual property and personal information holdings protected from security or privacy breaches in the Cloud?" No answer equals no go for corporations that are serious about managing risk.

    Does anyone have an answer?

    GCB
    gbliss@...
  • hot air and 'secrecy a la Apple'

    this project is likely to end soon as another M$ flop.
    They won't even be able to GPL that code written in VB6, so much of an embarrassment it will be!
    Linux Geek
    • Well, if it is a flop

      then they can just rename it "Linux"! :)
      GuidingLight
      • LOL

        Ouch.
        khawaja.umar.farooq@...
  • can we test this?


    Since this is a cloud OS, is there any web site where we can test it out ?

    I remembered MS put Vista online for ppl to test it out initially

    Another cloud OS like environment is ThinServer

    http://www.aikotech.com/thinserver.htm
    ThinkFairer
    • You can already get the SDK's and apply to try the Azure Services

      You can already try it by downloading the SDK's and registering for Azure Services through the Microsoft Community Technology Preview (CTP) program.

      Just go to http://www.microsoft.com/azure/default.mspx to get everything or just learn more.

      I am surprised MJ has not provided a link?

      More on how it works here - http://www.microsoft.com/azure/howdoesitwork.mspx


      Martin_Australia
  • RE: How the Red Dog dream team built a cloud OS from scratch

    This seems to be employing the adage: "The software will grow to meet the hardware constraints." The more fundamental point is "What need does this resolve?" Cool? Yes. Useful? Yet to be seen.
    Steve Scheider
    yet_another
  • RE: How the Red Dog dream team built a cloud OS from scratch

    Cloud, cloud, cloud! I've seen it before and it failed. Sure, it had another name and it was some time ago. Maybe people will buy into it this time. The last time, people were NOT interested in saving their data beyond their personal machine. Okay, there are more machines added this time: computers, cell phones, etc., but the concept is still the same as before! Your readers may want it, fine, no problem. I will NEVER be interested in such a scheme as storing my data beyond my personal control. That's just crazy!!
    roberts_theodore@...