TidyFS: Microsoft's simpler distributed file system

By Mary Jo Foley | June 15, 2011, 12:30pm PDT

Just about a year ago, I first mentioned TidyFS, a new, small distributed file system under development by Microsoft Research. Later this week at the Usenix ’11 conference, the Microsoft researchers behind TidyFS will be sharing more about their work publicly.

TidyFS is a distributed file system for parallel computations on clusters. On commodity, “shared-nothing” clusters, the primary workloads tend to be generated by distributed execution engines like MapReduce, Hadoop or Microsoft’s Dryad, the Microsoft researchers note in the abstract of their presentation. Other vendors have created distributed file systems for these workloads, like the Google File System (GFS) and the Hadoop Distributed File System (HDFS). Microsoft has one in development, too: TidyFS.

Here’s an architectural diagram from Microsoft from a year ago showing how researchers were envisioning that TidyFS and other experimental components would fit together:

Microsoft researchers are emphasizing the simplicity and small size of TidyFS as differentiators from the other parallel file systems out there. In a white paper detailing their TidyFS work, they are also sharing some of their experiences using the file system in a limited way inside Microsoft Research.

From the TidyFS white paper:

“The TidyFS storage system is composed of three components: a metadata server; a node service that performs housekeeping tasks running on each cluster computer that stores data; and the TidyFS Explorer, a graphical user interface which allows users to view the state of the system.”

Microsoft Research has been actively deploying and using TidyFS for the past year on a research cluster of 256 servers running large-scale, data-intensive computations, according to the white paper. The research cluster is used only for programs run using DryadLINQ, which is a parallelizing compiler for .NET programs using Dryad. (I’ve written before about Dryad, a first commercial version of which Microsoft is planning to deliver later this year as part of a service pack for Windows Server 2008 R2 HPC.)

“On a typical day, several terabytes of data are read and written to TidyFS through the execution of DryadLINQ program,” the white paper notes.

The experimental TidyFS cluster also makes use of a cluster-wide scheduler, codenamed “Quincy,” and a computational cache manager, codenamed “Nectar.” Even though TidyFS was designed in conjunction with these other distributed-clustering research projects, the Dryad and DryadLINQ pieces seem to be further along the path to commercialization. (When I asked Microsoft officials earlier this year if Quincy and Nectar would be commercialized along with Dryad later this year, I was told they were not on the same delivery trajectory.)

Nonetheless, the white paper says that “rather than making TidyFS more general, one direction we are considering is integrating it more tightly with our other cluster services.”

As with all Microsoft research projects, there is no guarantee as to if or when TidyFS will evolve into a commercial product or part of a commercial product. However, given that Dryad is on its way to being released as “LINQ to HPC” later this year, I’m thinking TidyFS may not be that far behind, and may someday find its place in the Microsoft “cloud as supercomputer” strategy, alongside Dryad.

Mary Jo has covered the tech industry for more than 25 years for a variety of publications and Web sites, and is a frequent guest on radio, TV and podcasts, speaking about all things Microsoft-related. She is the author of Microsoft 2.0: How Microsoft plans to stay relevant in the post-Gates era (John Wiley & Sons, 2008).

Disclosure

Mary Jo Foley

Freelance journalist/blogger Mary Jo Foley has nothing to disclose. WYSIWYG (what you see is what you get). I do not own Microsoft stock or stock in any of its partners or competitors. I have no business ventures that are sponsored by/funded by Microsoft or any of its partners or competitors.

Biography

Mary Jo Foley

Mary Jo Foley has covered the tech industry for 25 years for a variety of publications, including ZDNet, eWeek and Baseline. She has kept close tabs on Microsoft strategy, products and technologies for the past 10 years. In the late 1990s, she penned the award-winning "At The Evil Empire" column for ZDNet, and more recently the Microsoft Watch blog for Ziff Davis.
