Disk drive life depends on . . . luck
What is the primary determinant of drive life? I've read the latest research and talked to insiders. There are so many variables that the best answer is just . . . luck. Why is that? Is there anything you can do?
Many variables - and some that aren't
The recent Carnegie Mellon University and Google research dispelled some popular myths about disk drives:
- Enterprise drives are no more reliable than consumer drives. The extra money seems to go for margin and warranty costs. These are mass-produced products. There is no secret to making a disk last 3x as long.
- SMART drive status reporting is pretty useless. If SMART is telling you there is a problem, there probably is a problem. But if it reports no problem, that means nothing.
- Workload has little effect on drive life, with one exception: Google did find that a heavy workload increases infant mortality. So when you install a new drive, work it hard so you can replace it while it is still economic to do so. After that, use it all you want.
- Ambient temperature has very little effect on drive life until it gets up over 104°F (40°C). Even then the effect is slight.
So what does affect drive life? The research shows that drives are much less reliable than vendors commonly claim. There are two major reasons for this discrepancy:
- Vendor numbers are based on accelerated testing, which means high-temperature operation. That just isn't a very good simulation of real life, especially real consumer life. But it may explain why drives aren't much affected by temperature.
- A high percentage of failed drives report "no trouble found" in vendor testing. This probably reflects the quality of the testing more than anything.
The top issues in drive life:
- Drive age. There is some infant mortality, but not much. The big issue is that once drives are three years old their annual failure rates skyrocket.
- Handling. Dropping a drive is a bad idea, even a couple of inches onto a table. I saw evidence in the 1990s that reducing drive handling to the minimum required for installation improved reliability by 20% or more. There have been many improvements in shock specs since, so this may be less valid, but it still makes sense. Drives are mechanical devices. Don't knock them around.
- Early production quality. Can't wait to get your hands on that new 4 TB, 15K drive? You could be buying a problem. Drives from the first three months of a new model's production typically have a higher failure rate. After that the factory line settles down and quality goes up.
- Statistical variation. Google and CMU looked at 100,000 drives each in their studies. Most of us have very small sample sizes and don't keep very good records. But the data shows that drives fail for no apparent reason at all ages and in all environments. A drive can fail at any time without warning.
That's where the luck comes in. Here at Chez Mojo I had four working disk drives at the beginning of last week. By the end of last week I had two. Different vendors, environments, enclosures, ages, everything.
It was just my bad luck. And normal statistical variation.
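For a rough feel of how much of this is plain arithmetic, here is a minimal sketch. It assumes each drive fails independently with the same annual failure rate, and the 5% figure is a made-up placeholder, not a number from either study. The function names are mine, invented for illustration.

```python
from math import comb


def weekly_failure_prob(afr: float) -> float:
    """Convert an annual failure rate into a per-week failure probability.

    Assumes the failure risk is spread evenly over 52 weeks, which is a
    simplification: real drives fail more often when very new or old.
    """
    return 1 - (1 - afr) ** (1 / 52)


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n drives fail, each with probability p.

    Assumes failures are independent (no shared power supply, enclosure,
    or bad batch), so the count of failures is binomial.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    ASSUMED_AFR = 0.05  # hypothetical annual failure rate, for illustration only
    N_DRIVES = 4        # the number of drives in the Chez Mojo example

    p_week = weekly_failure_prob(ASSUMED_AFR)
    print("At least one of four fails within a year:",
          prob_at_least(1, N_DRIVES, ASSUMED_AFR))
    print("At least two of four fail in the same week:",
          prob_at_least(2, N_DRIVES, p_week))
```

Under those assumptions, losing two of four drives in one week is a rare event for any one person, but across millions of drive owners somebody draws that hand every week. Change ASSUMED_AFR to match whatever failure rate you believe and see how the odds move.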
What about vendors? I think there are differences, but the conspiracy of silence among big drive consumers, including Google, means data is sparse. But I have some ideas on that for a future post. Stay tuned.
Comments welcome, of course. In a moment of brain cramp I left out the point about early drive production. I added it Friday morning.
Talkback
Interesting Article!!
Ah yes,
I also remember "stiction" - when drives got stuck and you needed to shake them just right to get them running again. Happened mostly with IBM and Quantum drives way back when . . .
Stiction
Stiction also caused by stuck read/write heads
motor bearing lubricant. Stiction in roughly the other half, at least with later drives, was caused by the read/write heads sticking to the platters. I was never really sure exactly why that happened--I think there were several causes: humidity getting into the drive, then drying; residual magnetism; etc. Sometimes when I removed a drive's cover to unstick the heads from the platters, all I'd have to do was to clean the read/write heads with a piece of paper dampened with anhydrous alcohol, slipped between the heads and the platters, and rubbed back and forth a few times. Not so easy in a multi-platter drive.
Stuck on stiction
Good ole days (?)
The good ol days..
Anybody else remember the WD "Click of Death"? I had to do a massive weekend swap of all those friggen drives.
That massive recall started when people would sit down at their desks, turn on their computers and wait. and wait. and wait.
RE: Disk drive life depends on . . . luck
A quick off then on again power
In my experience, I've seen more failed Maxtor drives than any other brand. And it seems like Seagate drives fail less than any other brand.
And I truly believe that quality SCSI drives will last longer than ATA drives. I have two Compaq drive arrays with several SCSI drives that are nine years old and one that is eleven. I don't have any ATA drives near that age.
Seconded.
In the computer industry, my rule is to try to avoid any company whose first letter is "M". Sadly, there's at least one inescapable exception to that rule. :)
Seagate
No WD for me
You Missed the Point
THANK YOU
I just cannot wait till they change drive technology (I seem to remember at one point a holographic 3D cube for storage being in the works with almost no moving parts).
But as an example, I have had LOTS of luck with the major brands (WD, Maxtor, Seagate, Deskstar (both Hitachi and IBM)) and have rarely... actually I don't think I have had one of MY drives fail, so I must be chock full of good luck LOL. I have drives I no longer use from an old P2 dual processor system (10 gig drive IIRC, still sitting on my desk) and have a string of drives in USB enclosures so I can access the data when I need it, all from my old computers. And the brands are so varied (OK, except Fujitsu LOL).
So yes, people should quit showing their brand prejudice, because that is all it is. It is not based on fact.
SCSI
I still remember working with MFM and RLL drives!!
Man, I'm getting old!! LOL
I had some failures with the IBM drives, and after I added extra cooling to the RAID enclosure the failures stopped.
emotional purchase
WD is my problem as well...
I have decided to stick with 160 GB because the WD2500 (250 GB) burned up before I could grab any data - and that was my "archive" disk! It was basically used inside a Sony E-machine or an HP which did not run very long or often, but still it failed. That was just a month after a WD1200 (120 GB) bit the dust, but that seems to have been a simple "repartition", ... perhaps brought about by XP??? Don't know, but two WDs in one month - when there was at least a year between them, and neither over three or four years old - was too much for me to trust WD any more.
In contrast, my oldest "native" machine (bought new by me) is the HP XE-783 with a Maxtor 30 GB C:\, and it has just kept running longer than the WDs, so that's why I'm going to try the Maxtor(s). I am electing to use USB externals so that they don't run just because the computer(s) are turned on. I have a rack of DJ switches which control up to 8 outlets, and it is driven by one of those old "Power Control Centers" (remember them? They had switches for "Computer, Monitor, Printer, Aux 1, Aux 2", but they aren't made any more because of what is termed "startup surge", so I had to get the DJ system from a music store in St. Joseph, MO).
Can't imagine purchasing anything with more capacity, and most certainly not the 500 GB or 1.x TB units, because of the amount of time/access needed to transfer data if one begins to fail.
But I wonder if anything will change
WEllll....
Quantum was absorbed by Maxtor (well, at least their HDD part; I have it on good authority they still do tape drives at the moment) and that really did not affect Maxtor one way or another.
Likely the same thing for Seagate. It is like Compaq when it got HPed. Well, not quite, because Compaq's quality actually WAS bad (*shudders at how many he had to work on in those 3 or so years before the HP buyout) but now Compaq is the "business computer" and HP is the "consumer computer".
Since I do not see a lot of Seagate/Maxtor advertisement, no idea if they will do the same thing