Just about a year ago, I first mentioned TidyFS, a new, small distributed file system under development by Microsoft Research. Later this week at the Usenix '11 conference, the Microsoft researchers behind TidyFS will be sharing more about their work publicly.
TidyFS is a distributed file system for parallel computations on clusters. On commodity, "shared-nothing" clusters, the primary workloads tend to be generated by distributed execution engines like MapReduce, Hadoop or Microsoft's Dryad, the Microsoft researchers note in the abstract of their presentation. Other vendors have created distributed file systems for these workloads -- like the Google File System (GFS) and the Hadoop Distributed File System (HDFS). Microsoft has one in development, too: TidyFS.
Here's an architectural diagram from Microsoft from a year ago showing how researchers were envisioning that TidyFS and other experimental components would fit together:
Microsoft researchers are emphasizing the simplicity and small size of TidyFS as differentiators from the other parallel file systems out there. In the white paper detailing their TidyFS work, they also share some of their experiences from using the file system in a limited way inside Microsoft Research.
From the TidyFS white paper:
"The TidyFS storage system is composed of three components: a metadata server; a node service that performs housekeeping tasks running on each cluster computer that stores data; and the TidyFS Explorer, a graphical user interface which allows users to view the state of the system."
Microsoft Research has been actively deploying and using TidyFS for the past year on a research cluster with 256 servers running large-scale, data-intensive computations, according to the white paper. The research cluster is used only for programs run using DryadLINQ, which is a parallelizing compiler for .NET programs using Dryad. (I've written before about Dryad -- a first commercial version of which Microsoft is planning to deliver later this year as part of a service pack for Windows Server 2008 R2 HPC.)
"On a typical day, several terabytes of data are read and written to TidyFS through the execution of DryadLINQ program," the white paper notes.
The experimental TidyFS cluster also makes use of a cluster-wide scheduler, codenamed "Quincy," and a computational cache manager, codenamed "Nectar." Even though TidyFS was designed in conjunction with these other distributed-clustering research projects, the Dryad and DryadLINQ pieces seem to be further along the path to commercialization. (When I asked Microsoft officials earlier this year if Quincy and Nectar would be commercialized along with Dryad later this year, I was told they were not on the same delivery trajectory.)
Nonetheless, the white paper says that "rather than making TidyFS more general, one direction we are considering is integrating it more tightly with our other cluster services."
As with all Microsoft research projects, there is no guarantee as to if or when TidyFS will evolve into a commercial product or part of a commercial product. However, given that Dryad is on its way to being released as "LINQ to HPC" later this year, I'm thinking TidyFS may not be that far behind, and may someday find its place in the Microsoft "cloud as supercomputer" strategy, alongside Dryad.