ZFS data integrity tested

By Robin Harris | February 25, 2010, 8:09am PST

Summary: File systems are supposed to protect your data, but most are 20- to 30-year-old architectures that risk data with every I/O. The open source ZFS from Sun Oracle claims high data integrity - and now that claim has been independently tested.

File systems guard all the data in your computer, but most are based on 20- to 30-year-old architectures that put your data at risk with every I/O. The open source ZFS from Sun Oracle claims high data integrity - and now that claim has been tested.

I’m at the USENIX Conference on File and Storage Technologies (FAST) in Silicon Valley (see the last couple of StorageMojo posts for more). There is more leading-edge storage thinking presented here than at any other industry event.

Case in point: End-to-end Data Integrity for File Systems: A ZFS Case Study by Yupu Zhang, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau of the Computer Sciences Department, University of Wisconsin-Madison. It offers the first rigorous test of ZFS data integrity.

Methodology
The UW-M team used fault injection to test ZFS. Fault injection is a great technique because you can inject as many errors as you want and correlate each one with the file system’s response.

Goal: errors injected = errors rejected.
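
To make the "errors injected = errors rejected" bookkeeping concrete, here is a minimal sketch of a fault injection loop. It is not the UW-M team's harness: the function names, the block size and the use of CRC-32 as a stand-in checksum are all assumptions made for illustration.

```python
import random
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum (an assumption for this sketch, not ZFS's own algorithm).
    return zlib.crc32(block)


def flip_random_bit(block: bytes, rng: random.Random) -> bytes:
    # Inject one fault: flip a single randomly chosen bit in the block.
    corrupted = bytearray(block)
    bit = rng.randrange(len(corrupted) * 8)
    corrupted[bit // 8] ^= 1 << (bit % 8)
    return bytes(corrupted)


def run_fault_injection(trials: int, block_size: int = 4096, seed: int = 0):
    # Returns (errors injected, errors rejected) over the given number of trials.
    rng = random.Random(seed)
    injected = 0
    rejected = 0
    for _ in range(trials):
        block = bytes(rng.getrandbits(8) for _ in range(block_size))
        stored = checksum(block)          # checksum recorded at "write" time
        damaged = flip_random_bit(block, rng)
        injected += 1
        if checksum(damaged) != stored:   # verification at "read" time
            rejected += 1
    return injected, rejected


if __name__ == "__main__":
    injected, rejected = run_fault_injection(200)
    print(f"injected={injected} rejected={rejected}")
```

CRC-32 is guaranteed to catch any single-bit flip, so the two counters match in this toy. A real test varies the kind of fault and where it lands, which is where the interesting results come from.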

As the paper puts it:

. . . we analyze a state-of-the-art file system, Sun Microsystem’s ZFS, by performing fault injection tests representative of realistic disk and memory corruptions. We choose ZFS for our analysis because it is a modern and important commercial file system with numerous robustness features, including end-to-end checksums, data replication, and transactional updates; the result, according to the designers, is “provable data integrity”

Disk corruption is more common than you think.

In a recent study of 1.53 million disk drives over 41 months, Bairavasundaram et al. show that more than 400,000 blocks had checksum mismatches, 8% of which were discovered during RAID reconstruction, creating the possibility of real data loss.

But the study didn’t stop there. They also injected errors into the RAM the file system used. As regular readers know, memory error rates are hundreds to thousands of times higher than previously thought.

Keepin’ it real
ZFS has several important data integrity features.

  • Data integrity checksums. The checksums are stored in parent blocks, which enables ZFS to detect silent data corruption.
  • Block replication for data recovery. ZFS keeps replicas of certain important blocks.
  • Copy-on-write for atomic updates. Object updates are grouped together and new copies are created for all the modified blocks. ZFS doesn’t need journaling.
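
The first two features in the list above work together, and a toy model shows how. This is a sketch under stated assumptions, not ZFS code: `ToyDisk`, `BlockPointer`, `write_replicated` and `read_verified` are hypothetical names, SHA-256 is just a convenient checksum, and the repair step is included for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import List


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class ToyDisk:
    """Hypothetical in-memory 'disk': a list of mutable blocks."""

    def __init__(self) -> None:
        self.blocks: List[bytearray] = []

    def write(self, data: bytes) -> int:
        self.blocks.append(bytearray(data))
        return len(self.blocks) - 1


@dataclass
class BlockPointer:
    """Lives in the parent: where the child's copies are and what they must hash to."""
    copies: List[int]
    checksum: bytes


def write_replicated(disk: ToyDisk, data: bytes, replicas: int = 2) -> BlockPointer:
    # Store several copies; the checksum goes in the pointer, not next to the data.
    return BlockPointer([disk.write(data) for _ in range(replicas)], digest(data))


def read_verified(disk: ToyDisk, ptr: BlockPointer) -> bytes:
    # Try each copy until one matches the parent's checksum, then repair the bad ones.
    bad: List[int] = []
    for idx in ptr.copies:
        data = bytes(disk.blocks[idx])
        if digest(data) == ptr.checksum:
            for b in bad:
                disk.blocks[b] = bytearray(data)
            return data
        bad.append(idx)
    raise IOError("all copies failed checksum verification")
```

Because the checksum sits in the parent rather than beside the data, a block that is silently damaged cannot vouch for itself. Recovery works only while at least one copy still matches.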

ZFS has no special tools against RAM errors.

Conclusion
The study found that

. . . ZFS successfully detects all corruptions and recovers from them as long as one correct copy exists. The in-memory caching and periodic flushing of metadata on transaction commits help ZFS recover from serious disk corruptions affecting all copies of metadata.

The results for in-memory data corruption weren’t as stellar, but ZFS - like every other file system - wasn’t designed to handle DRAM errors. The authors offer suggestions for making ZFS less vulnerable to DRAM errors.
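
One way to see why an on-disk checksum cannot help with bad RAM: if a bit flips in the buffer before the checksum is computed, the checksum faithfully covers the damaged bytes. The toy below is an illustration only, not the paper's experiment; the helper names and the sample record are made up.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def flip_bit(data: bytes, bit: int) -> bytes:
    out = bytearray(data)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def write_block(buffer: bytes):
    # Checksums whatever is in memory at write time.
    # Returns (data as stored on disk, checksum stored alongside the pointer).
    return buffer, digest(buffer)


def verify(on_disk: bytes, stored: bytes) -> bool:
    return digest(on_disk) == stored


original = b"payroll record: 1000"

# Case 1: the bit flips on disk, after the checksum was taken. Detected.
data, cksum = write_block(original)
disk_fault_detected = not verify(flip_bit(data, 3), cksum)

# Case 2: the bit flips in RAM, before the checksum was taken. Not detected.
data2, cksum2 = write_block(flip_bit(original, 3))
memory_fault_detected = not verify(data2, cksum2)

print(disk_fault_detected, memory_fault_detected)
```

In the second case the bad data verifies cleanly on every later read, which is the argument for catching the error in the memory hardware.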

The Storage Bits take
The disk tests are strong evidence that ZFS delivers on its promise of superior on-disk data integrity. I hope that ZFS - or something better - arrives on other OSes soon.

But what about the memory fault tests? Here I suspect that ZFS is no worse than legacy file systems and, as a clean-sheet design, may be better.

While suggestions for improved RAM error resistance are well taken - after all, most of the world’s computers have no ECC memory - this reinforces the need for ECC DRAM in important applications. ECC everywhere!

The authors should have made the effort to correlate DRAM error rates to the likelihood of in-memory data corruption by file systems. While DRAM is not nearly as reliable as the industry led us to believe, we know disks are prone to errors.

The companies that produce file systems are fortunate their failures do not end up - like Toyota's - with a grandmother wrapped around a tree. But what they lack in drama they make up for in volume.

We have no idea how many billions of man-hours have been wasted due to silent data corruption, but the number will keep growing until every file system is at least as good as ZFS is today.

That means you, NTFS, HFS+ and the other legacy Unix, Linux and proprietary file systems. We trust you with our data and you are letting us down. That’s just wrong.

Kudos to the UW team for their testing and their paper. They’ve set the bar for the rest of the industry.

Comments welcome, of course. See ZFS: Threat or menace? for a detailed introduction to ZFS. For the impact of data corruption see How Microsoft puts your data at risk. The NTFS team knows this stuff, but the MSuits are more worried about the Zune.

Disclosure

Robin Harris

Robin Harris is president of TechnoQWAN, a consulting and analyst firm in northern Arizona. He also writes StorageMojo.com, a blog which accepts advertising from companies in the storage industry, and has a 25 year history with IT vendors. He has many industry contacts, many of whom are friends and all of whom he has opinions about. Robin has relationships with many companies in the technology industry. Every company he writes about may have sought to influence his opinion through carefully-crafted marketing messages and self-serving white papers, gifts ranging from desk calendars, t-shirts and lunches to trips, as well as analyst or consulting assignments. He also invests in some technology companies. He may accept payment for services in stock as well. Robin discloses financial investments in or client relationships with companies named in Storage Bits.

To help readers sort out the gold from the dross in his writings, Robin tries to communicate his reasons as clearly as he can. If you agree, you are intelligent and discerning. If you disagree, well, you disagree. In all cases, Robin encourages readers to subject everything they read, see or hear on the internet or from politicians to some simple questions:

  • What assumptions are implicit in the world view and judgments of the author?
  • What, if any, is the factual basis for the opinions the author expresses?
  • Is it reasonable, logical and clear?

Your critical faculties: use ’em or lose ’em!

Biography

Robin Harris

Harris has been messing with computers for over 30 years and selling and marketing data storage for over 20 in companies large and small. He introduced a couple of multi-billion dollar storage products (DLT, the first Fibre Channel array) to market, as well as many smaller ones. Earlier he spent 10 years marketing servers and networks. After leaving corporate life he founded TechnoQWAN, a consulting and analyst firm. He also developed StorageMojo into one of the top storage industry blogs.

Robin writes, consults, coaches and lives among the mountains of northern Arizona.

Talkback Most Recent of 40 Talkback(s)

  • Such a load of crap
    Why did Apple drop it? Because it's not anywhere near as good as you make it sound.
    Ron Bergundy
    25th Feb 2010
  • Legal reasons Apple dropped ZFS, not technical
    Sun/Oracle wouldn't be using ZFS if it were crap. It's very likely that it was legal/licensing issues that caused Apple to drop ZFS use in the Mac OS.
    goony
    25th Feb 2010
  • More likely it was just uncertainty what would happen...
    When Apple found out about the Sun/Oracle thing. I'd reckon that Apple knew that deal was in the pipeline way sooner than the man in the street ever did.
    zkiwi
    25th Feb 2010
  • RE: ZFS data integrity tested
    @goony I was thinking the same thing.

    p.s. WHAT?!! This article is already 3 months old I replied to.
    Arm A. Geddon
    28th May 2010
  • According to informed sources . . . (ZDNet Blogger)
    Apple publicly committed to use ZFS in Snow Leopard server in the
    middle of '08 after an eval by an Apple due diligence team. The
    engineering was moving forward nicely.

    But the Apple team procrastinated on getting a license from Sun - and
    when Sun accepted the Oracle offer it was too late to get one until the
    deal was completed and Oracle approved.

    That introduced too much schedule risk for Apple's taste, so they
    pulled the support. It may yet come back if Oracle approves.

    Robin
    R Harris
    26th Feb 2010
  • Again with singling out MS
    http://talkback.zdnet.com/5208-12694-0.html?forumID=1&threadID=37173&messageID=683242

    "Admission from Robin Harris: "Also, I didn't say Microsoft was worse than the others - they are on a par. They all stink."

    Headline from Robin Harris: How Microsoft puts your data at risk.

    It's obvious your headline is grossly misleading."

    If you really felt the need to call out one vendor over another, why not pick the one that actually announced plans for ZFS then pulled it.

    Hint: that wasn't MS, it was your good friend Jobs.
    rtk
    25th Feb 2010
  • Why I single out Microsoft (ZDNet Blogger)
    Microsoft's desktop monopoly - as legally defined - means they impact
    more people than any other OS. They make a lot of money - the most
    profitable corporation on earth - and they employ thousands of brilliant
    computer scientists and engineers.

    Yes, I expect better of Microsoft.

    Why don't you?

    Robin
    R Harris
    25th Feb 2010
  • I'm confused by this...
    Another NTFS user who has written hundreds of terabytes of data. Where are all these errors you speak of? My media, documents, OSs... trouble free. I don't have any errors or lost data on any of my SANs, NASs, or local storage.

    Please explain how ZFS will solve a problem that very few users, if any, are experiencing? Don't you think improving the physical reliability of drives or graceful power down of disks would be far more useful? I mean, I've had drives overheat and fail several times. This is a much bigger problem to me than filesystem corruption, which I've _never_ experienced or heard of a peer experiencing.
    crazydanr@...
    25th Feb 2010
  • it's anything but rare, you are lucky
    I have seen it dozens of times. If it is a root system (c:) drive that is unclean, chkdsk will force-run at boot (other drives, d: and beyond, need to be done manually from what I recall, sometimes from safe mode depending on where your swap and running apps are configured).

    If it detects clusters that are "orphaned" it will create a "Found.000" folder (or more incrementing counts, "Found.001", etc).

    It happened to me about 9 months ago when a flaky motherboard caused a system crash and I had about a dozen various HDDs installed (around 2TB total). I had a few "Found" folders and enough was enough, so I converted it to Linux, where I can have a poor man's mirror system that mounts / umounts the mirror drive on the fly to perform the sync.

    Anyhow just google chkdsk and found.000 if you are curious about ntfs cleanup. I have been curious about ZFS, having used NetApp / WAFL in action with snapshot / cloning it's impressive - will consider ZFS / opensolaris once again since waiting for a linux port is going to be eternal.
    ~doolittle~
    26th Feb 2010
  • Not to mention that ZFS is open source...
    Not to mention that ZFS is open source, so if there is a problem, it can
    be fixed. It also allows for anyone to implement the file system,
    unlike NTFS.

    It's disconcerting that flash drive makers are licensing exFAT instead
    of going with a free and more robust file system like ZFS, ext4, or
    UFS.
    olePigeon
    26th Feb 2010
  • Are you sure?
    How do you know that you have never had any errors? Do you checksum all your data and then verify them later? If not and you have written "hundreds of terabytes of data", you have no idea how much is corrupted. Depending on the type of data you have, you could literally have millions of flipped bits and never know it.

    Unless the file system and OS offer support, nothing you do to the physical disk will help you here. Some failures can cause the system to stop in the middle of flushing the buffers. Without atomic transactions on the file system, there is no way to guarantee that the on disk image is not corrupt. Journaling helps by serializing the updates of the metadata, but then your data can still be corrupted.

    To answer your last sentence, it isn't just about filesystem corruption, it is about data corruption.
    blu_z
    26th Feb 2010
  • Yeah, pretty sure
    My pictures display, my movies and music play, my games run... so sure, it's entirely possible I have bits flipped. But my OS runs, my installers install, and documents open. So am I concerned? Not until I lose data. Did I mention I back everything up?
    crazydanr@...
    26th Feb 2010
  • System's been stable for years.
    My system's been stable for years, and I've probably moved terabytes of data doing various tasks. I should be chock-full of errors.


    Frankly, I think that either your data sources are suspect, or the error correction mechanisms in drives are better than you believe.

    I've never really agreed with your previous assessments.
    CobraA1
    25th Feb 2010
