56% of data loss due to system & hardware problems - Ontrack
Data loss is painful and all too common. Why? Because your file system stinks. Microsoft's NTFS (used in XP & Vista) with its de facto monopoly is the worst offender. But Apple and Linux aren't any better.
Everyone knows what the problems are AND high-end systems fixed many of them years ago. Yet only one desktop vendor is moving forward, and they aren't based in Redmond. Here's the scoop.
Y2K got fixed. File systems didn't.
That may sound harsh. But with all the lip-service paid to innovation - especially in Redmond - you'd think that sometimes we'd see some, especially in core technology. After all, more than half of all data loss is caused by system and hardware problems that the file system could recover from - but doesn't.
Instead we're using 20-year-old technology that, like the 2-digit year that led to the Y2K drama, was designed for a world of scarce storage, small disks and limited CPU power. Unlike Y2K, though, we are living with, and paying for, these compromises every day with lost data, corrupted files, lame RAID solutions and hinky backup products that seem to fail almost as often as they work.
File systems? I should care because . . .
You rely on your file system every time you save or retrieve a document. It is the file system that keeps track of all the information on your computer. If the file system barfs, your data is the victim. And you get to pick up the pieces.
As documented in my last two posts (see How data gets lost and 50 ways to lose your data), PC and commodity server storage stacks are prone to data corruption and loss, much of it silent. Only your file system is positioned to see and fix these problems. It doesn't, of course, but it could.
And you enterprise data center folks, smirking over the junk consumers get, don't be too smug. Some of your costly high-end storage servers have NTFS or Linux FS's under the hood as well. And no, RAID doesn't fix these problems. According to Kroll Ontrack, only a quarter of data loss instances are due to human error - and many of those errors happen in the panic after a loss is discovered.
Hey, I thought machines were supposed to be good at keeping track of stuff? Only if they are built to.
IRON = Internal RObustNess
I came across the fascinating PhD thesis of Vijayan Prabhakaran, IRON File Systems, which analyzes how five commodity journaling file systems - NTFS, ext3, ReiserFS, JFS and XFS - handle storage problems.
In a nutshell, he found that all the file systems have
. . . failure policies that are often inconsistent, sometimes buggy, and generally inadequate in their ability to recover from partial disk failures.
Dr. Prabhakaran will see you now
In a mere 155 pages of lucid prose he lays out his analysis of the interaction between hosts and local file systems. It is a clever analysis, especially of the proprietary and unpublished NTFS.
First, inject a lot of errors
Dr. Prabhakaran built an error-injection framework that enabled him to control what kind of errors the file system would see so he could document how the FS handled them. These errors include:
- Failure type: read or write? If read: latent sector fault or block corruption? Does the machine crash before or after certain block failures?
- Block type: directory block; super block? Specific inode or block numbers could be specified as well.
- Transient or permanent fault?
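To make that list concrete, here is a minimal sketch of the idea in Python. It is not Dr. Prabhakaran's framework, which sat underneath real file systems; it is a toy in-memory disk, and the names `FaultyDisk`, `Fault` and `read_with_retry` are made up for illustration. It shows the three knobs from the list: which operation fails, which block is targeted, and whether the fault is transient or permanent.

```python
from dataclasses import dataclass

BLOCK_SIZE = 16  # tiny blocks keep the example readable


class DiskError(IOError):
    """Raised when the simulated disk reports a failed request."""


@dataclass
class Fault:
    op: str          # "read" or "write"
    block: int       # block number to target
    kind: str        # "fail" (latent sector fault) or "corrupt" (reads only)
    transient: bool  # True: fires once; False: fires on every matching request
    fired: bool = False


class FaultyDisk:
    """In-memory block device that injects exactly the faults it is given."""

    def __init__(self, num_blocks, faults=()):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.faults = list(faults)

    def _match(self, op, block):
        for f in self.faults:
            spent = f.transient and f.fired
            if f.op == op and f.block == block and not spent:
                f.fired = True
                return f
        return None

    def read(self, block):
        fault = self._match("read", block)
        data = self.blocks[block]
        if fault is None:
            return data
        if fault.kind == "fail":
            raise DiskError(f"read error on block {block}")
        # "corrupt": hand back wrong data and report no error at all
        return bytes(b ^ 0xFF for b in data)

    def write(self, block, data):
        if self._match("write", block) is not None:
            raise DiskError(f"write error on block {block}")
        self.blocks[block] = data


def read_with_retry(disk, block, retries=1):
    """One possible failure policy: retry a failed read a few times."""
    for attempt in range(retries + 1):
        try:
            return disk.read(block)
        except DiskError:
            if attempt == retries:
                raise
```

With a transient fault the retry succeeds; with a permanent one it gives up and passes the error along; with a corrupting fault no error is raised at all, which is exactly the silent case a file system has to detect some other way.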
So how did NTFS fare?
Since NTFS is proprietary, Dr. Prabhakaran couldn't get as deeply into it as the open-source systems. While NTFS doesn't implement the strongest form of journaling, he found it pretty reliable at letting applications know when an I/O error has occurred. NTFS also retries failed I/O requests more often than the Linux file systems do, which, given the dearth of retries on Linux, is a good thing.
NTFS sanity checking is also stronger than some. Yet he notes that
NTFS surprisingly does not always perform sanity checking; for example, a corrupted block pointer can point to important system structures and hence corrupt them when the block pointed to is updated.
Translation: Bad Thing.
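For the curious, the kind of check the quote says is sometimes skipped can be sketched in a few lines of Python. The layout below is hypothetical (a reserved metadata region at the front of the volume) and has nothing to do with NTFS's real on-disk format; `Layout`, `pointer_is_sane` and `update_data_block` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    total_blocks: int
    metadata_start: int  # first block of the reserved metadata region
    metadata_end: int    # one past the last reserved block


def pointer_is_sane(ptr, layout):
    """Reject pointers that fall off the volume or into reserved metadata."""
    in_range = 0 <= ptr < layout.total_blocks
    in_metadata = layout.metadata_start <= ptr < layout.metadata_end
    return in_range and not in_metadata


def update_data_block(disk, ptr, data, layout):
    """Write file data through a block pointer, but only after checking it."""
    if not pointer_is_sane(ptr, layout):
        raise ValueError(f"refusing to write through suspicious pointer {ptr}")
    disk[ptr] = data
```

A check like this stops a corrupted pointer from trashing system structures. It does nothing about a pointer that is in range but simply wrong, which is why sanity checking alone only goes so far.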
Dr. Prabhakaran offered a set of general conclusions about the commodity file systems including NTFS:
- "Detection and Recovery: Bugs are common. We also found numerous bugs across the file systems we tested, some of which are serious, and many of which are not found by other sophisticated techniques."
- "Detection: Sanity checking is of limited utility. Many of the file systems use sanity checking . . . . However, modern disk failure modes such as misdirected and phantom writes lead to cases where . . . [a] bad block thus passes sanity checks, is used, and can corrupt the file system. Indeed, all file systems we tested exhibit this behavior."
- "Recovery: Automatic repair is rare. Automatic repair is used rarely by the file systems; . . . most of the file systems require manual intervention . . . (i.e., running fsck)."
- "Detection and Recovery: Redundancy is not used. . . . [P]erhaps most importantly, while virtually all file systems include some machinery to detect disk failures, none of them apply redundancy to enable recovery from such failures."
Dr. Prabhakaran found that ALL the file systems shared
. . . ad hoc failure handling and a great deal of illogical inconsistency in failure policy . . . such inconsistency leads to substantially different detection and recovery strategies under similar fault scenarios, resulting in unpredictable and often undesirable fault-handling strategies.
. . .
We observe little tolerance to transient failures; . . . . none of the file systems can recover from partial disk failures, due to a lack of in-disk redundancy.
How doomed are we?
Pretty doomed. But there is some hope.
There are well-known techniques used in high-end systems, such as disk scrubbing, checksumming and more robust ECC, that could be added to our systems. Not rocket science.
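How un-rocket-science is it? Here is a rough Python sketch of checksumming plus scrubbing over a two-way mirror. It is a toy, not anyone's shipping design: `MirroredStore` is a made-up class, the "disks" are Python lists, and CRC-32 stands in for the stronger checksums a real system would likely choose.

```python
import zlib


class MirroredStore:
    """Two copies of every block plus a checksum kept apart from the data."""

    def __init__(self, num_blocks, block_size=16):
        empty = bytes(block_size)
        self.copies = [[empty] * num_blocks, [empty] * num_blocks]
        self.checksums = [zlib.crc32(empty)] * num_blocks

    def write(self, n, data):
        for copy in self.copies:
            copy[n] = data
        self.checksums[n] = zlib.crc32(data)

    def _is_good(self, copy, n):
        return zlib.crc32(copy[n]) == self.checksums[n]

    def read(self, n):
        """Return a verified copy of block n, repairing any bad copy found."""
        good = next((c[n] for c in self.copies if self._is_good(c, n)), None)
        if good is None:
            raise IOError(f"block {n}: no copy matches its checksum")
        for copy in self.copies:
            if not self._is_good(copy, n):
                copy[n] = good  # self-heal from the surviving copy
        return good

    def scrub(self):
        """Walk every block, verify both copies, and list what was repaired."""
        repaired = []
        for n in range(len(self.checksums)):
            damaged = any(not self._is_good(c, n) for c in self.copies)
            self.read(n)
            if damaged:
                repaired.append(n)
        return repaired
```

The checksum is what turns a mirror from "two copies, no idea which is right" into "two copies, and we know which one to trust." Scrubbing just runs that check on a schedule so latent errors get found while a good copy still exists.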
Young Dr. Prabhakaran now works at Microsoft Research. Perhaps someone up in Redmond will reach out to him to see how NTFS's aging architecture might be enhanced.
Of course, Microsoft is fine with the status quo until it threatens market share. Internet Explorer's innovation hiatus after crushing Netscape is a fine example.
So it is good news that Apple has two storage initiatives that will put pressure on Redmond to clean up its act.
- Time Machine is a beautifully crafted automatic backup utility in Mac OS X 10.5 (Leopard). While it doesn't solve the data corruption problems that I assume HFS+ has as well, it does make it very easy for regular folks to back up and recover their data. I think small business types will love it.
- ZFS is the new open-source file system from Sun that Apple is incorporating into OS X. I expect the port won't be complete for another year, but ZFS is the first file system to offer end-to-end data integrity that can detect and correct such devious problems as phantom writes.
See Apple’s new kick-butt file system for more on ZFS.
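Why does it matter where the checksum lives? A phantom write is one the disk acknowledges but never actually stores. The toy Python model below, with the invented names `PhantomDisk`, `seal`, `self_check` and `demo`, compares a checksum stored inside the block with one kept by whatever points to the block, which is the general approach ZFS takes. It is a simplified illustration, not ZFS's on-disk format.

```python
import zlib


class PhantomDisk:
    """Block store that can silently drop a write while reporting success."""

    def __init__(self):
        self.blocks = {}
        self.drop_next_write = False

    def write(self, addr, payload):
        if self.drop_next_write:
            self.drop_next_write = False
            return  # phantom write: acknowledged, never persisted
        self.blocks[addr] = payload

    def read(self, addr):
        return self.blocks[addr]


def seal(data):
    """Scheme A: store the CRC inside the same block as the data."""
    return zlib.crc32(data).to_bytes(4, "big") + data


def self_check(payload):
    return int.from_bytes(payload[:4], "big") == zlib.crc32(payload[4:])


def demo():
    disk = PhantomDisk()
    old, new = b"version 1", b"version 2"

    # Scheme A: the stale block still carries its own matching checksum.
    disk.write("a", seal(old))
    disk.drop_next_write = True
    disk.write("a", seal(new))
    stale_passes_self_check = self_check(disk.read("a"))

    # Scheme B: the parent remembers what the child is supposed to contain.
    parent = {}
    disk.write("b", old)
    parent["b"] = zlib.crc32(old)
    disk.drop_next_write = True
    disk.write("b", new)
    parent["b"] = zlib.crc32(new)
    parent_detects = zlib.crc32(disk.read("b")) != parent["b"]

    return stale_passes_self_check, parent_detects
```

In scheme A the old block and its old checksum agree with each other, so the lost update sails through. In scheme B the parent expects the new contents and the mismatch is caught on the next read. Detection is only half the job; correcting the block still needs a redundant copy to restore from.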
The Storage Bits take
As noted in "How data gets lost," more than half of all data loss is caused by system and hardware problems. A high-quality file system that took better care of our data could eliminate many of those failures.
The industry knows how to fix the problems. The question is when. With a resurgent Mac pushing ZFS maybe Redmond will see the light sooner, rather than later, and dramatically increase the reliability of all our systems.
It will be interesting to see how Microsofties spin inferior data integrity once ZFS is the OS X default file system. Especially to the enterprise folks for whom data integrity is the ne plus ultra of the data center.
Comments welcome, of course. Itching to read a well-done CompSci PhD thesis? Here's a link to IRON File Systems. Enjoy.
Update: based on the first couple of commenters, who seem to believe that data loss is a figment of my imagination, I gave more prominence to the factual basis of data loss and added a couple of short quotes from the thesis. I single out Microsoft because their negligence impacts more people than any other company. Maybe, someday, Microsoft will start measuring success in terms of software quality instead of market share.