Microsoft's datacenter software and services future: Trebuchet and Monsoon

Microsoft officials recently peeled back the covers a bit on its datacenter futures by sharing some high-level strategy goals for its "Generation 4" modular datacenter plans. But this modularization is only one piece of the company's longer-term goals and plans for its cloud-computing infrastructure.

There is a bigger, cross-lab datacenter project in the works at the company, codenamed "Trebuchet." Trebuchet is all about Microsoft's next-generation datacenter software and services, and addresses everything from the platform software down to the consumer-facing services that Microsoft envisions being hosted on its cloud platform.

One of the groups working on the Trebuchet vision is Microsoft's Networking Research Group. That group, part of Microsoft Research, is focused on designing and developing "reliable, scalable, self-managing networks." Microsoft's Windows Live Core (Live Mesh) and Global Foundation Services teams are also key to the Trebuchet efforts. From a description of Trebuchet that was on the Microsoft Research site (and that I saw thanks to Google cache):

"We are experimenting with radical new designs in network architecture, programming abstractions, and performance management tools that scale beyond the enterprise."

Another piece of Microsoft's datacenter futures puzzle is codenamed Monsoon. (Microsoft Datacenter Futures Architect James Hamilton, who recently defected to Amazon, was a key member of the Monsoon team, I hear.)

Monsoon is "a new network architecture, which scales and commoditizes data center networking."

A Microsoft white paper, entitled "Towards a Next Generation Data Center Architecture: Scalability and Commoditization," offers further Monsoon clues. The paper was presented at the ACM Presto '08 workshop in August in Seattle.

Monsoon is based on a mesh-networking architecture "using programmable commodity layer-2 switches and servers," according to the white paper. "In order to scale to 100,000 servers or more, Monsoon makes modifications to the control plane (e.g., source routing) and to the data plane (e.g., hot-spot free multipath routing via Valiant Load Balancing)."

This design, according to the Softies, "creates a huge, flexible switching domain, supporting any server/any service and unfragmented server capacity at low cost."
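For readers who haven't run into Valiant Load Balancing before, the general idea is that each flow is first bounced off a randomly chosen intermediate switch and only then forwarded to its real destination, so no single switch becomes a hot spot even when traffic is lopsided. Here is a minimal Python sketch of that general technique. It is not Monsoon's implementation; the function names, the flat list of intermediate switches, and the per-flow random choice are all assumptions made for illustration.

```python
import random
from collections import Counter


def vlb_path(src, dst, intermediates, rng):
    """Return a path src -> random intermediate -> dst.

    Hypothetical helper: picks the bounce point uniformly at random,
    which is the core idea of Valiant Load Balancing.
    """
    mid = rng.choice(intermediates)
    return [src, mid, dst]


def intermediate_load(flows, intermediates, seed=0):
    """Count how many flows are bounced off each intermediate switch.

    `flows` is a list of (src, dst) pairs. The seed is fixed only so
    that repeated runs of this sketch are reproducible.
    """
    rng = random.Random(seed)
    load = Counter()
    for src, dst in flows:
        path = vlb_path(src, dst, intermediates, rng)
        load[path[1]] += 1
    return load
```

Even if every flow in `flows` targets the same destination server, the counts returned by `intermediate_load` come out roughly even across the intermediate switches, which is the "hot-spot free" property the paper is referring to.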

The Monsoon paper mentions that Microsoft is designing and evaluating a new directory service "that is both scalable and quick to remove failed servers from the pools of which they were part."
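The paper excerpt doesn't say how that directory service works, so the following is only a toy Python sketch of one common way to get "quick to remove failed servers" behavior: track a last-heartbeat time per server and evict anything that has gone quiet. The class name `PoolDirectory`, the heartbeat mechanism, and the timeout parameter are my own assumptions, not anything described by Microsoft.

```python
class PoolDirectory:
    """Toy directory mapping service names to pools of live servers.

    Hypothetical design: liveness is judged by heartbeat age. Times are
    passed in explicitly so the example is deterministic.
    """

    def __init__(self, timeout):
        self.timeout = timeout
        self.pools = {}      # service name -> set of server ids
        self.last_seen = {}  # server id -> time of last heartbeat

    def register(self, service, server, now):
        """Add a server to a service pool and mark it alive at `now`."""
        self.pools.setdefault(service, set()).add(server)
        self.last_seen[server] = now

    def heartbeat(self, server, now):
        """Refresh a known server; unknown servers are ignored."""
        if server in self.last_seen:
            self.last_seen[server] = now

    def evict_failed(self, now):
        """Remove servers whose last heartbeat is older than the timeout."""
        dead = {s for s, t in self.last_seen.items() if now - t > self.timeout}
        for members in self.pools.values():
            members -= dead
        for s in dead:
            del self.last_seen[s]
        return dead

    def lookup(self, service):
        """Return the servers currently in a service pool, sorted."""
        return sorted(self.pools.get(service, ()))
```

A real service at the 100,000-server scale the paper talks about would need replication and partitioning on top of this; the sketch only shows the eviction idea.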

The document also references SEATTLE, "a scalable Ethernet architecture for large enterprises." The authors of the paper note that SEATTLE and Monsoon are similar in a number of respects: both implement large layer-2 networks, and both bounce some packets off an intermediate switch. The authors say that Monsoon's directory service is simpler than SEATTLE's and that Monsoon handles packet-bouncing somewhat differently.

"(I)n today's cloud services data centers, the network and server capacity is fragmented, and bisection bandwidth is one to two orders of magnitude below aggregate server bandwidth," the paper concludes. "In Monsoon, we propose a network design that leverages the power of emerging data center components."

When will Trebuchet and Monsoon start providing cost/scalability improvements for Microsoft partners and customers -- not to mention Microsoft's own product groups that will be moving their own services to Microsoft's Azure cloud platform/operating system in the coming years? I have no idea. But it's still interesting to see where Microsoft is focusing its cloud research time and investments....