Microsoft Research parallel programming project set to go commercial in 2011

By Mary Jo Foley | August 18, 2010, 12:04pm PDT

Summary: Microsoft is planning to move its Dryad parallel/distributed computing stack from Microsoft Research to Microsoft’s Technical Computing Group and deliver a final version of that technology to customers by 2011.

It’s been a while since the Redmondians have talked up “Dryad,” Microsoft’s answer to Google’s MapReduce and Apache’s Hadoop. (I think the last time Dryad got any coverage outside the research community was when Microsoft Chairman Bill Gates mentioned it to the New York Times in 2006.)

Dryad is an ongoing Microsoft Research project dedicated to developing ways to write parallel and distributed programs that can scale from small clusters to large datacenters. There’s a DryadLINQ compiler and runtime that is related to the project. Microsoft released builds of Dryad and DryadLINQ code to academics for noncommercial use in the summer of 2009.

It looks like Dryad is ready to take the next step. Microsoft is planning to move the Dryad stack from Microsoft Research to Microsoft’s Technical Computing Group. The plan is to deliver a first Community Technology Preview (CTP) test build of the stack in November 2010 and to release a final version of it running on Windows Server High Performance Computing servers by 2011, according to a slide from an August 2010 presentation by one of the principals working on Dryad.


But wait, there’s one more thing. (Actually, there are three more things.)

The Dryad stack is getting more detailed as the researchers continue to work on it. Here’s the existing Dryad stack diagram:

Here’s an updated version of the stack diagram from the aforementioned August 2010 presentation by one of the Dryad team members:

The Dryad layer of the stack handles scheduling and fault tolerance, while the DryadLINQ layer is more about parallelization of programs.
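
To make that split concrete, here is a minimal Python sketch of the general idea, not Dryad or DryadLINQ code. It assumes a toy word-count query; the names plan_word_count and run_with_retries are hypothetical and stand in for a "query" layer that breaks work into per-partition tasks and an "execution" layer that schedules those tasks and re-runs any that fail.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_word_count(partitions):
    """Query layer (hypothetical): turn one logical query into
    independent tasks, one per data partition."""
    return [lambda part=part: len(part.split()) for part in partitions]


def run_with_retries(tasks, max_attempts=3, workers=4):
    """Execution layer (hypothetical): schedule tasks on a worker pool
    and re-run a task that raises, up to max_attempts times."""
    def attempt(task):
        last_error = None
        for _ in range(max_attempts):
            try:
                return task()
            except Exception as exc:  # a real system would classify failures
                last_error = exc
        raise RuntimeError("task failed after retries") from last_error

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves task order in the returned results
        return list(pool.map(attempt, tasks))


partitions = ["a b c", "d e", "f"]
total_words = sum(run_with_retries(plan_word_count(partitions)))
```

The point of the separation is that the code describing the query never mentions workers or retries, and the code that handles workers and retries never needs to know what the query computes.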

The latest Dryad stack diagram includes mention of a new distributed filesystem, codenamed TidyFS, for parallel computation with Dryad. This file system “provides fault tolerance and data replication similar to GFS (the Google File System) or the Cosmos store.” (Cosmos, according to the previous stack diagram, was the codename for the Dryad file system that complemented the NT File System. I’d say TidyFS is either the new name for Cosmos or its successor.)

There’s also a set of related data-management tools, codenamed “Nectar.” I found a white paper from Microsoft Research on Nectar, which explains its purpose this way:

“In a Nectar-managed data center, all access to a derived dataset is mediated by Nectar. At the lowest level of the system, a derived dataset is referenced by the LINQ program fragment or expression that produced it. Programmers refer to derived datasets with simple pathnames that contain a simple indirection (much like a UNIX symbolic link) to the actual LINQ programs that produce them.”
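
The idea in that passage can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Nectar's actual design or API: the DerivedDataStore class, its methods, the pathnames, and the LINQ-like program string are all hypothetical. The sketch keys each derived dataset by a fingerprint of the program text and its input, and lets pathnames act as symbolic-link-style pointers to that fingerprint.

```python
import hashlib


class DerivedDataStore:
    """Toy store (hypothetical): derived datasets are identified by the
    program that produced them; pathnames are just indirections."""

    def __init__(self):
        self._results = {}      # fingerprint -> cached derived dataset
        self._links = {}        # pathname -> fingerprint
        self.computations = 0   # how many times a program actually ran

    @staticmethod
    def fingerprint(program_text, input_name):
        # Assumption: program text plus input name identifies the result.
        key = f"{program_text}|{input_name}".encode("utf-8")
        return hashlib.sha256(key).hexdigest()

    def derive(self, path, program_text, func, input_name, data):
        fp = self.fingerprint(program_text, input_name)
        if fp not in self._results:
            # Only run the program if no one has produced this result yet.
            self._results[fp] = func(data)
            self.computations += 1
        self._links[path] = fp
        return self._results[fp]

    def read(self, path):
        # Follow the pathname to whichever program result it points at.
        return self._results[self._links[path]]


def count_errors(rows):
    return sum(1 for row in rows if row == "error")


store = DerivedDataStore()
logs = ["ok", "error", "ok", "error", "error"]
program = "logs.Where(l => l == 'error').Count()"  # illustrative text only

first = store.derive("/derived/error_count", program,
                     count_errors, "logs-2010-08", logs)
second = store.derive("/derived/error_count_copy", program,
                      count_errors, "logs-2010-08", logs)
```

In this sketch the second request reuses the first result because both pathnames resolve to the same program-and-input fingerprint, so the computation runs only once.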

There’s one more new Dryad-related codename worth noting: “Quincy.” Quincy is a scheduling system for distributed clusters. (Quincy, Wash., also happens to be the location of one of Microsoft’s major datacenters.)

Microsoft is continuing to step up its work in the HPC space, hoping to ice out Linux in that arena. The Softies are seemingly counting on Dryad to keep up their momentum both on premises, with Windows Server, and in the cloud with Windows Azure in its datacenters.

Mary Jo has covered the tech industry for more than 25 years for a variety of publications and Web sites, and is a frequent guest on radio, TV and podcasts, speaking about all things Microsoft-related. She is the author of Microsoft 2.0: How Microsoft plans to stay relevant in the post-Gates era (John Wiley & Sons, 2008).

Disclosure

Mary Jo Foley

Freelance journalist/blogger Mary Jo Foley has nothing to disclose. WYSIWYG (what you see is what you get). I do not own Microsoft stock or stock in any of its partners or competitors. I have no business ventures that are sponsored by/funded by Microsoft or any of its partners or competitors.

Biography

Mary Jo Foley

Mary Jo Foley has covered the tech industry for 25 years for a variety of publications, including ZDNet, eWeek and Baseline. She has kept close tabs on Microsoft strategy, products and technologies for the past 10 years. In the late 1990s, she penned the award-winning "At The Evil Empire" column for ZDNet, and more recently the Microsoft Watch blog for Ziff Davis.

Talkback

  • DOA
    People adopted LAMP for HPC long ago.
    M$ cloud is doomed to fail.
    Linux Geek
    18th Aug 2010
  • Linux lame duck
    @Linux Geek
    LAMP for HPC? he he he. That high-performance PHP must have slipped by.

    You obviously don't know what HPC is. Linux pretty much owns this space right now, but it does so because of neglect from competitors. Linux is facing serious problems with scaling in a multi-core world. Unlike Windows, it has not gotten around to eliminating those big-kernel locks and spinlocks. And they will seriously impede scalability in a 16+ core world.

    High performance computing is still driven by clusters of servers. This does not require the same level of thread scalability, but it does severely challenge how you can write effective algorithms.

    Where Microsoft will excel (as evidenced by the above diagrams) is in providing a complete programming model on top of the OS. They are really leading the pack here. "Concert", parallel LINQ, etc. are miles ahead. This will be a hit with scientists who often need to craft algorithms themselves. Any help they can get in crafting efficient algorithms will improve the end result. And this is right up MS's alley. It is what they have always been doing.
    honeymonster
    18th Aug 2010
  • [a]
    @honeymonster
    Linux Geek says something dopey and honeymonster gives us a page from Microsoft's glossy brochure.

    MapReduce is a framework for searching massive amounts of data that cannot, for performance reasons, be indexed in a relational database manner. Google has an implementation, but they are not sharing details. Hadoop implements MapReduce in Java, which means the underlying servers may use different operating systems.

    Dryad looks like it has Windows servers at the lowest levels of the stack.

    Which is fine. Clearly they will be going to market with the sales pitch that the integration advantage exceeds the licensing costs.

    Truth be told, I am usually skeptical about Microsoft. Technologies don't exist until they implement them, and their implementations recognize no technologies but Microsoft's. Still, I'm willing to be surprised, though I don't really see the floats in this parade. One can have Hadoop for their Windows servers and write clients in Perl?
    DannyO_0x98
    18th Aug 2010
  • RE: Microsoft Research parallel programming project set to go commercial in 2011
    @honeymonster - in the research communities I've been a part of, the cost of MS OS and product licenses will tip the balance, unless MS does something radical in that area. (Maybe they will, no idea). "Free for noncommercial use" smells an awful lot like handcuffs, should e.g. a GOCO contractor at a national lab want to take an idea commercial, as part of its IP approach for something developed in those contexts.

    And ... while I agree with you, LAMP doesn't apply, there are already plenty of tried-and-true approaches to scaling Linux way beyond 16 cores; and you may not follow kernel.org/lkml.org, but the generic kernel just released, and the one coming in a few weeks, make big strides in eliminating the BKL and other hindrances to full-throated scalability.

    Guess it's too much to hope that MS will join in pushing for open standards in this area, and compete on implementation, rather than competing on a proprietary vertical stack ... sorry, alternate-universe moment there.
    daboochmeister
    20th Dec 2010
  • RE: Microsoft Research parallel programming project set to go commercial in 2011
    Microsoft software is still not out. Hadoop has been around for a few years already, is free and works for large datasets of companies like Facebook, Yahoo and Amazon. One advantage of Hadoop is cheaper hardware, which will not be cheap with Microsoft software.
    gsantoshg@...
    19th Aug 2010
