ZFS data integrity tested

Summary: File systems are supposed to protect your data, but most are 20-30 year old architectures that risk data with every I/O. The open source ZFS from Sun Oracle claims high data integrity - and now that claim has been independently tested.

TOPICS: Hardware

File systems guard all the data in your computer, but most are based on 20-30 year old architectures that put your data at risk with every I/O. The open source ZFS from Sun Oracle claims high data integrity - and now that claim has been tested.

I'm at the USENIX File and Storage Technologies (FAST) conference in Silicon Valley (see the last couple of StorageMojo posts for more). More leading-edge storage thinking is presented here than at any other industry event.

Case in point: End-to-end Data Integrity for File Systems: A ZFS Case Study by Yupu Zhang, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau of the Computer Sciences Department, University of Wisconsin-Madison. It offers the first rigorous test of ZFS data integrity.

Methodology

The UW-M team used fault injection to test ZFS. Fault injection is a great technique because you can inject as many errors as you want and correlate them with the file system's response.

Goal: errors injected = errors rejected.
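
To make that goal concrete, here is a minimal sketch of a fault-injection campaign against a toy checksummed block store. It is not the UW-M team's harness and not ZFS code: the `ChecksummedStore` class, the 512-byte block size, the single-bit-flip fault model and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import random

BLOCK_SIZE = 512  # assumed block size for this toy model


class ChecksummedStore:
    """Toy block store that records a SHA-256 digest for every block written."""

    def __init__(self):
        self.blocks = {}     # block number -> bytearray (stands in for the disk)
        self.checksums = {}  # block number -> digest recorded at write time

    def write(self, n, data):
        self.blocks[n] = bytearray(data)
        self.checksums[n] = hashlib.sha256(data).digest()

    def read(self, n):
        data = bytes(self.blocks[n])
        if hashlib.sha256(data).digest() != self.checksums[n]:
            raise IOError(f"checksum mismatch in block {n}")
        return data


def inject_bit_flip(store, n, rng):
    """Flip one random bit in block n behind the store's back."""
    bit = rng.randrange(len(store.blocks[n]) * 8)
    store.blocks[n][bit // 8] ^= 1 << (bit % 8)


def run_campaign(num_blocks=64, num_faults=20, seed=0):
    """Write random blocks, corrupt some, and count how many reads are rejected."""
    rng = random.Random(seed)
    store = ChecksummedStore()
    for n in range(num_blocks):
        store.write(n, bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE)))

    corrupted = rng.sample(range(num_blocks), num_faults)
    for n in corrupted:
        inject_bit_flip(store, n, rng)

    rejected = 0
    for n in range(num_blocks):
        try:
            store.read(n)
        except IOError:
            rejected += 1
    return len(corrupted), rejected


injected, rejected = run_campaign()
print(f"errors injected = {injected}, errors rejected = {rejected}")
```

If the two counts ever differ, the store either missed a corruption or rejected a good block, which is exactly what a fault-injection study is looking for.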

As the paper puts it:

. . . we analyze a state-of-the-art file system, Sun Microsystem’s ZFS, by performing fault injection tests representative of realistic disk and memory corruptions. We choose ZFS for our analysis because it is a modern and important commercial file system with numerous robustness features, including end-to-end checksums, data replication, and transactional updates; the result, according to the designers, is “provable data integrity”

Disk corruption is more common than you think.

In a recent study of 1.53 million disk drives over 41 months, Bairavasundaram et al. show that more than 400,000 blocks had checksum mismatches, 8% of which were discovered during RAID reconstruction, creating the possibility of real data loss.

But the study didn't stop there. The team also injected errors into the RAM the file system used. As regular readers know, memory error rates are hundreds to thousands of times higher than previously thought.

Keepin' it real

ZFS has several important data integrity features.

  • Data integrity checksums. The checksums are stored in parent blocks, which enables ZFS to detect silent data corruption.
  • Block replication for data recovery. ZFS keeps replicas of certain important blocks.
  • Copy-on-write for atomic updates. Object updates are grouped together and new copies are created for all the modified blocks. ZFS doesn't need journaling.
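
Why does it matter that the checksum lives in the parent block? A rough sketch, assuming a toy disk modeled as a Python dict and SHA-256 as the checksum (the function names here are hypothetical, and none of this is ZFS code): because the parent's pointer carries both the child's address and its checksum, the reader catches not only bit rot but also a perfectly valid block returned from the wrong place.

```python
import hashlib


def checksum(data):
    # SHA-256 is an assumption for this sketch, not a claim about ZFS defaults.
    return hashlib.sha256(data).hexdigest()


def write_block(disk, addr, data):
    """Store data at addr and return the pointer a parent block would keep."""
    disk[addr] = data
    return {"addr": addr, "checksum": checksum(data)}


def read_block(disk, pointer):
    """Follow a parent's pointer and verify the child against the parent's checksum."""
    data = disk[pointer["addr"]]
    if checksum(data) != pointer["checksum"]:
        raise IOError(f"block {pointer['addr']} does not match its parent's checksum")
    return data


def is_intact(disk, pointer):
    """Return True if the block behind the pointer still verifies."""
    try:
        read_block(disk, pointer)
        return True
    except IOError:
        return False


disk = {}  # toy disk: address -> bytes
ptr_a = write_block(disk, 10, b"payroll records")
ptr_b = write_block(disk, 11, b"holiday photos")
print(is_intact(disk, ptr_a))  # True: freshly written block verifies

# Silent bit rot: the contents change but nothing reports an error.
disk[10] = b"payroll recordz"
print(is_intact(disk, ptr_a))  # False: parent's checksum exposes the change

# Misdirected write: an intact copy of block 11 lands on address 10.
disk[10] = disk[11]
print(is_intact(disk, ptr_a))  # False: valid data, but not the block the parent expects
print(is_intact(disk, ptr_b))  # True: block 11 itself is untouched
```

A checksum stored alongside the data would pass the misdirected-write case, since the block is internally consistent. Only a checksum held by the parent can tell that it is the wrong block.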

ZFS has no special tools against RAM errors.

Conclusion

The study found that

. . . ZFS successfully detects all corruptions and recovers from them as long as one correct copy exists. The in-memory caching and periodic flushing of metadata on transaction commits help ZFS recover from serious disk corruptions affecting all copies of metadata.
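
The "as long as one correct copy exists" condition can be sketched in a few lines. This is an illustration under stated assumptions, not ZFS's actual recovery path: replicas are modeled as a plain Python list of byte strings, the expected checksum is assumed to come from a trusted parent block, and `read_with_repair` is a hypothetical name.

```python
import hashlib


def digest(data):
    # SHA-256 is an assumption for this sketch.
    return hashlib.sha256(data).digest()


def read_with_repair(replicas, expected):
    """Return (good_data, repaired_count) using the first replica that verifies.

    `replicas` is a mutable list of bytes standing in for on-disk copies;
    corrupt copies are overwritten with the good one.
    """
    good = next((r for r in replicas if digest(r) == expected), None)
    if good is None:
        # Corruption is still detected, but there is nothing to recover from.
        raise IOError("all replicas corrupt: detected but unrecoverable")
    repaired = 0
    for i, r in enumerate(replicas):
        if r != good:
            replicas[i] = good
            repaired += 1
    return good, repaired


expected = digest(b"metadata")
copies = [b"metXdata", b"metadata", b"metadatZ"]  # two of three copies damaged
data, repaired = read_with_repair(copies, expected)
print(data, repaired)  # b'metadata' 2
print(copies)          # all three copies now hold the good data
```

When every replica fails verification the function raises instead of returning bad data, which mirrors the distinction between detecting a corruption and being able to recover from it.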

The results for in-memory data corruption weren't as stellar, but ZFS - like every other file system - wasn't designed to handle DRAM errors. The authors offer suggestions for making ZFS less vulnerable to DRAM errors.

The Storage Bits take

The disk tests are strong evidence that ZFS delivers on its promise of superior on-disk data integrity. I hope that ZFS - or something better - arrives on other OSes soon.

But what about the memory fault tests? Here I suspect that ZFS is no worse than legacy file systems and, as a clean-sheet design, may be better.

While suggestions for improved RAM error resistance are well taken - after all, most of the world's computers have no ECC memory - this reinforces the need for ECC DRAM in important applications. ECC everywhere!

The authors should have made the effort to correlate DRAM error rates to the likelihood of in-memory data corruption by file systems. While DRAM is not nearly as reliable as the industry led us to believe, we know disks are prone to errors.

The companies who produce file systems are fortunate their failures do not end up - like Toyota - with a grandmother wrapped around a tree. But what they lack in drama they make up with volume.

We have no idea how many billions of man-hours have been wasted due to silent data corruption, but the number will keep growing until every file system is at least as good as ZFS is today.

That means you, NTFS, HFS+ and the other legacy Unix, Linux and proprietary file systems. We trust you with our data and you are letting us down. That's just wrong.

Kudos to the UW team for their testing and their paper. They've set the bar for the rest of the industry.

Comments welcome, of course. See ZFS: Threat or menace? for a detailed introduction to ZFS. For the impact of data corruption see How Microsoft puts your data at risk. The NTFS team knows this stuff, but the MSuits are more worried about the Zune.

  • Such a load of crap

    Why did Apple drop it? Because it's not anywhere near as good as you make it sound.
    Ron Bergundy
    • Legal reasons Apple dropped ZFS, not technical

      Sun/Oracle wouldn't be using ZFS if it were crap. It's very likely that it was legal/licensing issues that caused Apple to drop ZFS use in the Mac OS.
      • That's what they'd like you to think.

      • More likely it was just uncertainty what would happen...

        When Apple found out about the Sun/Oracle thing. I'd reckon that Apple knew that deal was in the pipeline way sooner than the man in the street ever did.
      • RE: ZFS data integrity tested

        @goony I was thinking the same thing.

        p.s. WHAT?!! This article is already 3 months old I replied to.
        Arm A. Geddon
    • According to informed sources . . .

      Apple publicly committed to use ZFS in Snow Leopard server in the
      middle of '08 after an eval by an Apple due diligence team. The
      engineering was moving forward nicely.

      But the Apple team procrastinated on getting a license from Sun - and
      when Sun accepted the Oracle offer it was too late to get one until the
      deal was completed and Oracle approved.

      That introduced too much schedule risk for Apple's taste, so they
      pulled the support. It may yet come back if Oracle approves.

      R Harris
  • Again with singling out MS


    "Admission from Robin Harris: "Also, I didn't say Microsoft was worse than the others - they are on a par. They all stink."

    Headline from Robin Harris: How Microsoft puts your data at risk.

    It's obvious your headline is grossly misleading."

    If you really felt the need to call out one vendor over another, why not pick the one that actually announced plans for ZFS then pulled it.

    Hint: that wasn't MS, it was your good friend Jobs.
    • Why I single out Microsoft

      Microsoft's desktop monopoly - as legally defined - means they impact
      more people than any other OS. They make a lot of money - the most
      profitable corporation on earth - and they employ thousands of brilliant
      computer scientists and engineers.

      Yes, I expect better of Microsoft.

      Why don't you?

      R Harris
      • I'm confused by this...

        Another NTFS user who has written hundreds of terabytes of data. Where are all these errors you speak of? My media, documents, OSes... trouble free. I don't have any errors or lost data on any of my SANs, NASes, or local storage.

        Please explain how ZFS will solve a problem that very few users, if any, are experiencing? Don't you think improving the physical reliability of drives or graceful power down of disks would be far more useful? I mean, I've had drives overheat and fail several times. This is a much bigger problem to me than filesystem corruption, which I've _never_ experienced or heard of a peer experiencing.
        • it's anything but rare, you are lucky

          I have seen it dozens of times, if it is a root system (c:) drive that is unclean chkdsk will force-run at boot (other drives d: and beyond need to be done manually from what I recall, sometimes from safe mode depending on where your swap and running apps are configured).

          If it detects clusters that are "orphaned" it will create a "Found.000" folder (or more incrementing counts, "Found.001", etc).

          It happened to me about 9 months ago, when a flaky motherboard caused a system crash while I had about a dozen HDDs installed (around 2 TB total). I ended up with a few "Found" folders, and enough was enough: I converted to Linux, where I can have a poor man's mirror system that mounts and unmounts the mirror drive on the fly to perform the sync.

          Anyhow just google chkdsk and found.000 if you are curious about ntfs cleanup. I have been curious about ZFS, having used NetApp / WAFL in action with snapshot / cloning it's impressive - will consider ZFS / opensolaris once again since waiting for a linux port is going to be eternal.
          • Not to mention that ZFS is open source...

            Not to mention that ZFS is open source, so if there is a problem, it can
            be fixed. It also allows for anyone to implement the file system,
            unlike NTFS.

            It's disconcerting that Flash drive makers are licensing exFAT instead
            of going with a free, and more robust file system like ZFS, ext4, or

        • Are you sure?

          How do you know that you have never had any errors? Do you checksum all your data and then verify it later? If not, and you have written "hundreds of terabytes of data", you have no idea how much is corrupted. Depending on the type of data you have, you could literally have millions of flipped bits and never know it.

          Unless the file system and OS offer support, nothing you do to the physical disk will help you here. Some failures can cause the system to stop in the middle of flushing the buffers. Without atomic transactions on the file system, there is no way to guarantee that the on disk image is not corrupt. Journaling helps by serializing the updates of the metadata, but then your data can still be corrupted.

          To answer your last sentence, it isn't just about filesystem corruption, it is about data corruption.
          • Yeah, pretty sure

            My pictures display, my movies and music play, my games run... so sure, it's entirely possible I have a few bits flipped. But my OS runs, my installers install, and documents open. So am I concerned? Not until I lose data. Did I mention I back everything up?
  • System's been stable for years.

    My system's been stable for years, and I've probably moved terabytes of data doing various tasks. I should be chock-full of errors.

    Frankly, I think that either your data sources are suspect, or the error correction mechanisms in drives are better than you believe.

    I've never really agreed with your previous assessments.
    • Good article!

      Facts are facts...ZFS provides data protection features that other standard file systems do not. I don't know how people can argue with that!
      • Extra protection, or extra overhead?

        Drives already have ECC built in. If it's not picking up errors, it's either defective or poorly designed. Makers of terabyte drives [b]should[/b] be expecting that their users will be reading/writing terabytes of data.

        If the designers of hard drives are doing their jobs, file systems like this would be redundant and just be extra overhead.

        If we really, really need to resort to file systems like this, then it's my opinion that the ECC on current drives is way too weak and needs some serious improvement.
        • What you don't know, won't hurt you - HAH!

          My understanding is that the drive ECC is there to ensure the data is written correctly, not that it is permanently stored or read correctly.
          Maybe I just have an oversimplified misunderstanding of it all?
        • Drive ECC isn't perfect, and

          First of all, drives have specified error rates. For SATA drives the number
          is typically 1 failed read in every 10^14 bits.

          But more importantly, drives can only correct errors they know about. If
          a hinky RAID controller or poorly grounded cable corrupts data, the
          drive has no idea. That's why ZFS has checksums: they can prove that
          the data is what was written and that the sector the drive retrieves is the
          sector the file system intended. The latter is something other file
          systems don't do.

          R Harris
    • Toyota lesson?

      [i]My system's been stable for years[/i]

      ... and so has my Mom's Toyota automobile that is subject to recall. Maybe that too is hysteria, right?