Putting the cloud in a box: How Microsoft built Windows Server 2012
Microsoft wanted Windows Server 2012 to be the 'definitive cloud operating system'. So how did its engineers go about this? For a start, by not writing any code for a year, according to project lead Jeffrey Snover.
Part of Microsoft's series of big bets on its future, Windows Server 2012 is a major upgrade to the company's server operating system — one designed to change the way businesses build and manage datacentres.
The idea was to build the "definitive cloud operating system", different to anything Microsoft had built before and anything the industry had seen, according to Jeffrey Snover, the lead architect for Windows Server 2012.
Microsoft launched the server OS update at the beginning of September, after three years of work. There was a lot the software maker needed to get right in the new OS, from handling virtualisation and helping IT departments work with BYOD to delivering tools for managing many servers at the same time.
That meant the development process was very different from projects Snover — best known as the inventor of Microsoft's PowerShell scripting language — had worked on before, he told ZDNet.
"The first thing we did was stop. We said to everyone, 'Put your pens down, let's be thoughtful about this'," he said. "For a whole year, all the engineers, not a single line of production code was written."
Testing and talking
Instead, that first year was spent on planning and testing, and retooling the development system for the server OS. The planning part meant talking to hardware vendors and buyers, to understand just where the server and datacentre market was going — getting what Snover called "the voice of the technology team".
"We got out of our cubicles and we talked to customers," he said, explaining that Microsoft wanted to know what businesses were looking for in an operating system. "One team spent a lot of their time talking to people running cloud datacentres with Windows, asking what's working, what's not working, what were their priorities."
The second part — retooling the development platform — meant Microsoft's team focused on creating new code management and development tools. This called for "great code check-in, great quality metrics, building the unit test frameworks that would be needed. Really beefing up our engineering experience", Snover said.
While no one on the team was working on production code, that didn't mean no one was writing code. Engineers used the year to try out new ideas and new technologies, familiarising themselves with the techniques and some of the tools they'd need to use when Windows Server 2012 development began — including spending time with new hardware.
Drawing on what customers had told them, the Windows Server development team identified the main things they had to take into account in their next release. Perhaps the most important was to try to improve the way the server OS worked with storage, to help IT departments manage it more effectively and at lower cost, according to Snover.
Other key areas were automation, speed and virtualisation. Automation features had to be simplified and standardised, clients said, while better virtualisation support was needed for datacentre flexibility and business agility. As for speed, the focus was on raw performance and price/performance.
Next, the team put together a set of features for the OS and came up with a list of high-level issues to tackle. The main insight they had was to treat Windows Server as a datacentre abstraction layer — they took the familiar concept of the hardware abstraction layer that had been part of Windows Server since the NT days and extended it to the entire datacentre.
This meant Windows Server 2012 needed to be able to manage and control not just compute and storage but also networks, with support for software-defined networking in a virtual switch and tools for dynamically managing large numbers of IP addresses.
Microsoft "needs a standards-based approach to manage the whole datacentre — everything in it — with no lock-in", Snover said.
Re-engineering the OS
As with Windows 8 on the desktop, the software maker saw Windows Server 2012 as an opportunity to re-engineer the OS for the latest hardware, he added.
Processors are now uniformly multicore, so applications need to take advantage of the CPU and memory architectures in modern servers, he argued. That meant the development team had to focus on improving support for NUMA (Non-Uniform Memory Access) — seen as essential for improving virtualisation performance, as it will allow Windows Server 2012 and Hyper-V to treat servers as a compute fabric, automating memory usage.
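As a rough illustration of why placement matters on NUMA hardware, here is a minimal Python sketch. It is not Hyper-V's algorithm: the `NumaNode` class, the `place_vm` function and the best-fit policy are all assumptions made for this example. It shows only the general idea that a virtual machine whose memory fits on one node avoids slower access to another node's memory, so a placer should span nodes only when it has to.

```python
from dataclasses import dataclass


@dataclass
class NumaNode:
    """One NUMA node: a group of cores with its own local memory (hypothetical model)."""
    node_id: int
    free_mb: int


def place_vm(nodes, vm_mb):
    """Return {node_id: megabytes} for a VM, preferring a single node.

    Illustrative policy only: pick the fitting node with the least free
    memory (best fit); if no single node fits, spill across nodes,
    largest free first.
    """
    fitting = [n for n in nodes if n.free_mb >= vm_mb]
    if fitting:
        best = min(fitting, key=lambda n: n.free_mb)
        best.free_mb -= vm_mb
        return {best.node_id: vm_mb}

    if sum(n.free_mb for n in nodes) < vm_mb:
        raise MemoryError("not enough free memory on this host")

    placement = {}
    remaining = vm_mb
    for node in sorted(nodes, key=lambda n: n.free_mb, reverse=True):
        if remaining == 0:
            break
        take = min(node.free_mb, remaining)
        if take:
            placement[node.node_id] = take
            node.free_mb -= take
            remaining -= take
    return placement


nodes = [NumaNode(0, 8192), NumaNode(1, 8192)]
print(place_vm(nodes, 6144))  # {0: 6144}: fits on one node
print(place_vm(nodes, 6144))  # {1: 6144}: fits on the other node
print(place_vm(nodes, 3072))  # {0: 2048, 1: 1024}: has to span both nodes
```

The third request is the case a NUMA-aware system tries to avoid, since part of that machine's memory is now remote to whichever cores run it.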
"Getting NUMA right is really hard. So we did a ton of analysis, test, measurements and tweaking, which gave us phenomenal NUMA scaling as a result," Snover said.
One thing the team held in mind was the idea of continuous availability — basically, bringing cloud design to the datacentre. Continuous availability uses compute, storage and network fabrics to keep business systems running, even when applications, storage, and infrastructure fail. This changes the way servers and datacentres are designed, according to the Microsoft distinguished engineer.
To do this, Microsoft took what Snover called "a very engineered approach to resilience — looking at how it can be delivered for single nodes, multi-node clusters and even across multiple sites".
Snover described the approach the team took as "walking up the stack". That meant making changes in the file system and kernel, including developing a whole new resilient file system, called ReFS.
At a kernel level, Microsoft changed the way data is flushed to disk, because the enterprise shift to commodity hardware means businesses are using cheaper consumer storage. The result is the ability to look for NTFS problems on the fly and repair them without requiring a reboot; the repair itself takes disks offline for only fractions of a second.
As well as factoring in private clouds, the engineers tackled BYOD (Bring Your Own Device) policies. Unmanaged devices are now part of most corporate networks, so there needed to be a shift from application and device management to user and information management in Windows Server, Snover said. That meant building new features into the OS to make sure it could scale and cope with the explosion in data.
The resulting Dynamic Access Control added rules that can be applied automatically, locking down access based on roles, groups and user IDs.
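To make the idea of automatically applied access rules concrete, here is a minimal Python sketch. It is not the Windows implementation or its API: the `User` and `Rule` classes, the `can_access` function and the use of simple text tags on resources are all assumptions for illustration. It shows one way a rule could admit a user by role, group or user ID, with access refused unless every tag on a resource is covered by a matching rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """A user with group and role memberships (hypothetical model)."""
    user_id: str
    groups: frozenset = frozenset()
    roles: frozenset = frozenset()


@dataclass(frozen=True)
class Rule:
    """Admits a user to resources carrying `tag` if any listed attribute matches."""
    tag: str
    roles: frozenset = frozenset()
    groups: frozenset = frozenset()
    user_ids: frozenset = frozenset()

    def allows(self, user):
        return bool(
            user.user_id in self.user_ids
            or user.groups & self.groups
            or user.roles & self.roles
        )


def can_access(user, resource_tags, rules):
    """Every tag on the resource needs a rule that admits the user.

    A tag with no matching rule denies access.
    """
    for tag in resource_tags:
        if not any(r.tag == tag and r.allows(user) for r in rules):
            return False
    return True


rules = [
    Rule(tag="finance", groups=frozenset({"finance-team"})),
    Rule(tag="confidential", roles=frozenset({"manager"})),
]
alice = User("alice", groups=frozenset({"finance-team"}), roles=frozenset({"manager"}))
bob = User("bob", groups=frozenset({"finance-team"}))

print(can_access(alice, {"finance", "confidential"}, rules))  # True
print(can_access(bob, {"finance", "confidential"}, rules))    # False
```

Because the decision depends on the tags and the user's attributes rather than on a per-file permission list, the same two rules cover any number of documents.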
Workers also now expect their business tools to be as usable as their consumer devices. Microsoft worked on Windows Server's VDI support to try to meet these expectations, Snover said.
Windows Server 2008 R2 introduced RemoteFX, which brought hardware-accelerated graphics and video effects to virtual desktops using Remote Desktop Protocol. However, it required additional hardware: servers needed to have desktop graphics cards.
"We're using a lot of technology from Microsoft Research," Snover said. "We're using different codecs for different parts of the screen: for text, for video."
Those new codecs are meant to make it easier to deliver virtual desktops and remote applications to employees working at home or on the road.
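The notion of choosing a codec per screen region can be sketched in a few lines of Python. This is a toy model, not how RemoteFX works internally: the `Tile` class, the thresholds in `classify` and the codec labels are all invented for this example. It shows only the general pattern of classifying each part of the screen by its content and then picking an encoder suited to that content.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    """A rectangular part of the screen (hypothetical model)."""
    distinct_colours: int  # number of colours present in the tile
    changed_frames: int    # of the last 10 frames, how many changed this tile


def classify(tile):
    """Guess a tile's content type using made-up thresholds."""
    if tile.changed_frames >= 8:
        return "video"
    if tile.distinct_colours <= 16:
        return "text"
    return "image"


# Hypothetical codec labels, not real RemoteFX codec names.
CODEC_FOR = {
    "text": "lossless",
    "image": "lossy-still",
    "video": "lossy-motion",
}


def choose_codec(tile):
    """Pick the codec label for a tile based on its guessed content type."""
    return CODEC_FOR[classify(tile)]


print(choose_codec(Tile(distinct_colours=4, changed_frames=0)))      # lossless
print(choose_codec(Tile(distinct_colours=5000, changed_frames=10)))  # lossy-motion
print(choose_codec(Tile(distinct_colours=5000, changed_frames=1)))   # lossy-still
```

The design point is that sharp text is kept crisp while fast-changing regions are compressed harder, which is where bandwidth savings over a WAN would come from.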
"Efficiency is a lot better with the new codecs. You can get a lot of efficiencies across a WAN as well as a LAN," Snover said.
With Windows Server 2012 now available for download, it's the end of the journey for Microsoft's development teams — but the start of the journey for IT departments around the world as they plan server and datacentre upgrades.
With support from server, storage, and networking vendors — and the ability to buy preconfigured reference architectures — Microsoft is describing this Windows Server release as "delivering the cloud, in a box". It's going to be interesting to watch how it gets deployed, and how IT teams use it to approach key issues like BYOD and private cloud.