Microsoft takes a step toward commercializing its 'Dryad' distributed computing technologies
Summary: Microsoft has started external developer testing of a number of interrelated parallel/distributed technologies for Windows Server that are part of the codename "Dryad" family.
According to a December 17 blog post on the Windows HPC (High Performance Computing) Team Blog, Microsoft is making the first Community Technology Preview (CTP) test builds of its Dryad, DSC and DryadLINQ technologies available to testers via its Connect test site.
Dryad is Microsoft's competitor to Google MapReduce and Apache Hadoop. In the early phase of its existence, Dryad was a Microsoft Research project dedicated to developing ways to write parallel and distributed programs that can scale from small clusters to large datacenters. There's a related DryadLINQ compiler and runtime. Microsoft released builds of Dryad and DryadLINQ code to academics for noncommercial use in the summer of 2009. Microsoft moved Dryad from its research arm to its Technical Computing Group this year.
According to a presentation from August, the team's plan was to deliver a first CTP build of the stack in November 2010 and to release a final version of it running on Windows Server High Performance Computing servers by 2011.
This initial preview is intended for "developers who are exploring data-intensive computing," according to the Softies. The prerequisite for the CTP is an HPC Pack 2008 R2 Enterprise-based cluster with Service Pack 1 installed.
As I noted in a previous blog post, there are a number of interesting components that make up Dryad, including a new distributed filesystem (codenamed "TidyFS"), a set of related data-management tools (codenamed "Nectar") and a scheduler for distributed clusters (codenamed "Quincy").
Talkback
RE: Microsoft takes a step toward commercializing its 'Dryad' distributed computing technologies
Oh well, so much for that ...
RE: Microsoft takes a step toward commercializing its 'Dryad' distributed computing technologies
What does DSC stand for?
I've found something interesting
...
"Dryad uses this information to determine where to run jobs; this system preferentially moves the computation to the data, rather than moving the data to the computation. DSC also provides file replication and load balancing functionality. Dryad understands data locality for data registered with DSC, and it preferentially schedules computations to the servers holding data that the computation needs."
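To make the "move the computation to the data" idea in that passage concrete, here is a minimal Python sketch. It is not the Dryad or DSC API; the function `schedule`, its arguments and the node names are all hypothetical, and the tie-breaking rule (pick the node with the most free slots) is an assumption chosen only to keep the example short.

```python
def schedule(tasks, partition_locations, free_slots):
    """Assign tasks to nodes, preferring nodes that already hold the input.

    tasks: list of (task_id, partition_name) pairs
    partition_locations: dict of partition_name -> list of nodes with a replica
    free_slots: dict of node -> number of free task slots
    Returns a dict of task_id -> (node, is_local).
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is untouched
    assignment = {}
    for task_id, partition in tasks:
        # First choice: a node that stores the partition and has a free slot.
        local = [n for n in partition_locations.get(partition, [])
                 if slots.get(n, 0) > 0]
        if local:
            node = max(local, key=lambda n: slots[n])
        else:
            # Fallback: any node with a free slot; the data has to be moved.
            remote = [n for n, free in slots.items() if free > 0]
            if not remote:
                raise RuntimeError("no free slots left in the cluster")
            node = max(remote, key=lambda n: slots[n])
        slots[node] -= 1
        assignment[task_id] = (node, bool(local))
    return assignment
```

With one slot on each of three hypothetical nodes and two tasks that both read a partition stored only on `nodeA`, the first task runs locally on `nodeA` and the second falls back to another node.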
"A Dryad job typically starts with a collection of persistent input data, such as a set of log files, and returns a processed version of that data to the application. The input data is typically partitioned into reasonable-sized pieces, which have been distributed across the cluster before the job starts. The partitioned input data can be created and distributed in any of several ways, including:
By using a DryadLINQ application to read data from a share, and then using hash or range partitioning to put data on compute nodes and write the metadata to the DSC database.
By manually partitioning the input data into a set of files, copying each file to a compute node, and then creating a DSC stream to contain the names and locations of the partitions."
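The hash and range partitioning mentioned in that quote can be sketched in a few lines of Python. This is an illustration of the two general techniques only, not DryadLINQ code; the helper names `hash_partition` and `range_partition`, and the use of CRC32 as the hash function, are assumptions made for the example.

```python
import zlib
from bisect import bisect_right


def hash_partition(records, key, num_partitions):
    """Spread records over num_partitions buckets by hashing each record's key."""
    parts = [[] for _ in range(num_partitions)]
    for record in records:
        # CRC32 is stable across runs, unlike Python's built-in hash() for str.
        digest = zlib.crc32(str(key(record)).encode("utf-8"))
        parts[digest % num_partitions].append(record)
    return parts


def range_partition(records, key, boundaries):
    """Split records into len(boundaries) + 1 buckets by sorted key boundaries.

    A record whose key equals a boundary goes into the bucket above it.
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for record in records:
        parts[bisect_right(boundaries, key(record))].append(record)
    return parts
```

Hash partitioning keeps all records with the same key together and tends to balance bucket sizes; range partitioning keeps neighboring keys together, which helps when later steps need sorted or contiguous data. In either case, each bucket would then be written to a compute node and its name and location recorded as metadata, as the quoted passage describes.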