How data gets lost

Our perception of risk and the reality of risk are often two different things. For example, are computer viruses or system glitches more likely to hose your data? While viruses get bad press for poor system performance, they aren't very likely to damage your data. Your system, on the other hand . . . .

This is data loss week at Storage Bits. I'll be talking about causes of data loss in some detail, starting today with the general factors and then drilling down into the major causes of data loss at the device and system levels.

Hoping for a silver bullet? There isn't one. Backing up your data regularly, either to a local hard drive or to an online service such as Carbonite or Mozy, is your best bet. I will give you some concrete tips on limiting the damage in the one area you can control: yourself.

There is hope, however, that our systems will start treating our data a lot better than they do today. I'll cover that later this week.

What are the causes of data loss? I had a nice talk with the folks at Kroll Ontrack, the world's largest data recovery firm. They have some interesting statistics on what actually causes data loss.

Cause of data loss              Perception  Reality
Hardware or system problem      78%         56%
Human error                     11%         26%
Software corruption or problem  7%          9%
Computer viruses                2%          4%
Disaster                        1-2%        1-2%

The numbers don't add up to 100% due to rounding errors.
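
For anyone who wants to check the totals, here is a minimal sketch in Python. The percentages are the ones in the table above; the variable and function names are made up for illustration, and the "Disaster" range is carried through as a low and a high end.

```python
# (cause, perception %, reality %); each value is stored as (low, high)
# so that the "1-2%" disaster range can be carried through the sum.
table = [
    ("Hardware or system problem", (78, 78), (56, 56)),
    ("Human error", (11, 11), (26, 26)),
    ("Software corruption or problem", (7, 7), (9, 9)),
    ("Computer viruses", (2, 2), (4, 4)),
    ("Disaster", (1, 2), (1, 2)),
]


def column_total(rows, index):
    """Sum one column, returning the (low, high) ends of the total."""
    low = sum(row[index][0] for row in rows)
    high = sum(row[index][1] for row in rows)
    return low, high


perception_total = column_total(table, 1)
reality_total = column_total(table, 2)
print("Perception total (low, high):", perception_total)
print("Reality total (low, high):", reality_total)
```

The perception column comes to 99-100% and the reality column to 96-97%.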

Disk and system problems

What is interesting here is that Ontrack's data suggests that we are too quick to blame our systems and not quick enough to blame ourselves. Their experience is that human error is a bigger piece of data loss than we'd like to admit. More on that below.

In general, there is little you can do about hardware or software problems. I use a battery backup unit with a surge protector to keep power clean and steady and maybe that helps.

My major strategy is to back up every day to a local disk. And I back up important files to an online service as well. As long as I've got a credit card to buy another system I can be online in a couple of hours. And yes, I keep that password with me.

Why do I do both? Because recovery from a local disk is hundreds of times faster than downloading gigabytes over the net. But if a catastrophe happened, at least I still have access to critical data. I also keep a second machine - a notebook - for backup as well.
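
To put rough numbers on that speed gap, here is a back-of-the-envelope sketch in Python. The backup size and both transfer rates are assumed example values, not measurements, and the function name is made up for illustration. Real drives and real broadband lines will both come in under their nominal rates.

```python
def transfer_hours(size_gb, rate_mbit_per_s):
    """Hours needed to move size_gb gigabytes at rate_mbit_per_s megabits per second."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (rate_mbit_per_s * 1e6)
    return seconds / 3600


# Assumed example values (hypothetical, pick your own):
BACKUP_SIZE_GB = 50
LOCAL_DISK_MBIT = 480   # nominal USB 2.0 signalling rate; real throughput is lower
DOWNLOAD_MBIT = 1.5     # a modest broadband download speed

local = transfer_hours(BACKUP_SIZE_GB, LOCAL_DISK_MBIT)
remote = transfer_hours(BACKUP_SIZE_GB, DOWNLOAD_MBIT)
ratio = remote / local

print(f"Local restore:   {local:.2f} hours")
print(f"Online download: {remote:.1f} hours")
print(f"The local disk is roughly {ratio:.0f} times faster")
```

With these assumptions the local restore takes minutes and the download takes about three days, so "hundreds of times faster" is the right order of magnitude.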

Panic is a factor in data loss

Human error is a major problem in data loss. That shouldn't be a surprise: human error is a major factor in everything.

But there are some common things that people do that make data loss situations worse. When you suspect data loss, follow these simple steps:

  • Take a deep breath and stop! Panic is a common reaction, and people do really stupid things. Experienced admins will pull the wrong drive from a RAID array or reformat a drive, destroying all their information. Acting without thinking is dangerous to your data. Stop stressing about the loss and don't do anything to the disk. Better yet, stop using the computer until you have a plan of attack.
  • If your disk is making weird noises, normal file recovery software isn't going to work. I've had luck with performing a backup right away after hearing odd noises, but that is a matter of luck.
  • If the drive is still spinning and you can't find your data, download a data recovery utility onto another computer. Google for "free data recovery software" for some options, including one from Ontrack. The important thing is to download them onto another drive, either on another computer, or onto a USB thumb drive or hard disk. If you don't, you could be overwriting the data you are trying to save. It is good practice to save the recovered data to another disk.
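
The last point, never writing to the disk you are trying to rescue, can be checked before you copy anything. Below is a minimal sketch in Python, assuming a Unix-like system where each mounted filesystem reports its own device ID. The function names are hypothetical. Note that two partitions on the same physical disk count as different devices here, so this guards against overwriting the troubled volume, not against a failing drive.

```python
import os


def on_same_device(path_a, path_b):
    """Return True if both existing paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def check_recovery_target(source_dir, target_dir):
    """Refuse a recovery target that sits on the same device as the damaged source."""
    if on_same_device(source_dir, target_dir):
        raise ValueError(
            f"{target_dir} is on the same device as {source_dir}; "
            "writing there could overwrite the data you want back"
        )
    return target_dir
```

Call check_recovery_target with the damaged volume's mount point and the folder on your USB drive before pointing a recovery tool at it.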

If you have some tech moxie you might want to look at free data recovery programs such as NTFS Reader and PC Inspector. Neither is for newbies, so if you don't have a high degree of confidence in your PC ability, get a more knowledgeable friend to help.

If your drive is making grinding noises or has stopped completely, your data can still be recovered, but it will cost you in time and money. Google "disk data recovery" or check the back pages of your favorite PC mag for disk recovery companies.

The Storage Bits take

Disk drive life may be a matter of luck, but data loss isn't. You can dramatically improve your odds by backing up your data on a regular basis to a USB or FireWire hard drive. For those whose livelihoods rely on computers, off-site backup is also a good idea, as is a backup computer.
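
If you would rather script the regular backup than rely on memory, here is a minimal sketch in Python of a copy-what-changed backup to an external drive. It is an illustration, not a replacement for real backup software: the function name is made up, it assumes the backup folder is not inside the source folder, and it never removes files from the backup that you have deleted from the source.

```python
import os
import shutil


def backup_changed_files(source_dir, backup_dir):
    """Copy files that are new or modified since the last run.

    Returns the list of copied paths, relative to source_dir.
    """
    copied = []
    for folder, _subdirs, files in os.walk(source_dir):
        for name in files:
            src = os.path.join(folder, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(backup_dir, rel)
            # Copy when the backup copy is missing or more than a second older.
            if not os.path.exists(dst) or os.path.getmtime(src) - os.path.getmtime(dst) > 1:
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)  # copy2 also carries over the modification time
                copied.append(rel)
    return copied
```

Run it from a scheduler (cron, or Task Scheduler on Windows) so that it happens every day whether you remember or not.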

Next: There are 50 ways to lose your data: a catalogue of disk woes.

Comments welcome. I'll update the post if I see a good one.

Topics: Data Centers, Data Management, Hardware

  • One of the most frequent data loss problems

    One of the most frequent data loss problems that people come to me with: after working for hours on an Excel spreadsheet or PowerPoint slide deck, they get in a hurry when closing the file and hit the "No" button when asked to save changes. Not much I can do when the user throws their own data away ...
    terry flores
    • My related favorite:

      You get a spreadsheet in the mail, open it from mail, work on it, hit save, and
      because it is an attachment, it doesn't get saved.

      Been bit by that one myself more than once.

      R Harris
      • Using Outlook?

        I'm not certain about other e-mail clients, but Outlook actually puts the opened attachment in a folder beginning with "OLK" (at least up to the 2003 version). If you navigate there, you should be able to find the modified file there. I've saved a few people a lot of work with that bit of information. Even if you're not using Outlook, it would be worth investigating to see if the client you use does something similar.

    How about data loss due to rounding problems? That would account for your numbers that don't add up to 100.
    • Binary Coded Decimal

      That's why you want the computer to throw digits away.

      Good catch!

      R Harris
    • Hmmm...

      The per cent came out close enough to 100. So I wouldn't take that for a serious flaw. I do not consider it appropriate to laugh at these matters however trivial the description might seem.

      AND - a web dictionary of abbreviations gave a following entry for ROTFLMAO:

      A chatroom abbreviation used mainly by imbeciles, usually in response to something mildly, often very mildly, amusing. People who use this type of shorthand should be avoided like the Spanish flu.
  • My opinions

    Forget online backups. Even with broadband, they're way too slow for multi-gigabyte hard drives. You're much better buying a few extra hard drives and backing up directly. Remember to keep at least one of them offsite.

    When broadband reaches gigabit speeds, maybe. But as long as we're talking megabits, no way. And even then, I worry about them having access to my data. Why should I ever trust these people?

    As far as maintenance and repair software goes, I recommend SpinRite. It doesn't cost much and it's worth every penny. If your hard drive is failing, it'll usually keep it up long enough to transfer the data to a new hard drive.

    Keep in mind that even if such software "fixes" a hard drive, you should still transfer the information to a new hard drive and throw away the old one! Once a hard drive starts failing, it only gets worse over time - often very quickly. You do not want to keep using a failing hard drive!

    Remember when disposing of any hard drive for a business to follow your business's disposal procedures. If it has sensitive data on it, it should be destroyed before disposal.
    • Defense in depth

      For most folks backing up to local hard drives is sufficient. For me and other
      people who work on the web or are highly mobile, online backup is another layer
      of protection.

      True, it would take me a few days to download everything I have stored online. But
      that is far better than never seeing it again. Most people don't need it, but most of
      my work life is online.
      I don't worry about data security as my data is encrypted on my machine before it
      leaves home. Even the storage company doesn't have a key.
      I've heard good things about SpinRite. Thanks for the tip.

      See "How to REALLY erase a hard drive" to learn about the NIST approved built-in method.
      Check out the update as well.
      R Harris
    • On-line and others

      The network speed is a factor, indeed. But for those who do not want to have a hard drive of their own either in the same box or somewhere on the shelf, it offers at least some solution. In case of a serious crash one might also wait some time before he gets his data back - compared to them being lost completely.

      Regarding those people at services like Mozy sorting our "dirty underwear" - I do not care. My financial data are not to be found there, just a collection of writings in a language I doubt they can read. The rest of my data - different images, some music etc. - rest on a shelf in a 300 GB portable drive. Which is perfectly enough for me.

      My sensitive financial data are the most insecure in case of loss - they are in my memory alone. If I happen to damage THIS hardware, I might be heading right into trouble... ;-)
    • Online backups are a must to...

      Online backups are a must to protect critical data, not for your whole MP3 collection. For example, last week's emails, the current project files, maybe your clients' contact information, financial files, etc. And of course, nobody should store that without encryption, not even on your backup disks (you can encrypt it yourself or use the encrypting software provided by most [serious] online backup providers).

      Of course, if you cannot decide what data from your 500 GB of data is "critical", then online-backups are not for you.


  • Just a grammar quibble

    When will people--especially professional writers--realize that the word "data" is a plural noun? Therefore the correct title of this article SHOULD be "How Data Get Lost" and it's "Data are...." not "data is..."

    I know it's frowned upon to put grammar quibbles in a place like this, but this was just too egregious not to say something.

    Flames go to /dev/null.

    • Language changes...

      And data has been a mass noun for decades.

      A mass noun is one where we don't consider counting the members individually, but rather think of measuring it indirectly, if we consider quantity at all.

      We don't say "the sand are sparkling in the sun". But we'd say "the grains of sand are".

      So we don't say "the data are" -- we say "the data items are".

      And we say "how much data", not "how many data" -- more evidence that it has become a mass noun. "How many data" just doesn't make any sense! 2 trillion what? Characters? Records? Events? Observations? The move to a mass noun was really forced by the structured nature of data; it simply is not amenable to a unique count.

      Yes, originally, "data" was the plural of "datum". But that language has been dead for about one thousand years.

      If people were in fact talking about the individual data items (the "datums"), then you would be correct. But now when people talk about "data", we are speaking of a large collection, not the content of that collection.

      To speak specifically of multiple collections, we have phrases like "data sets". Maybe in the far future, that will evolve into "datas". Hopefully not in my lifetime...

      (Feel free to berate me about "hopefully", which makes no grammatical sense to me either, but I still use -- often in sentence fragments....)

      It's been my experience that most of the time, when people pop up with things like this, they've simply (subtly) misunderstood the usage in question. I've seen many a professional grammar writer fall into this trap, so you're not alone.

      Now for some fun, everyone can chime in, pro and con, with "Hear, Hear, Hear!" or "Here, Here, Here!" -- nearly opposite in meaning, but frequently confused...

      (I wonder how much data has been lost writing to /dev/null?)
  • Bad recovery an extra factor; malware may magnify

    Those numbers look about right, except that on the cusp of a destructive malware outbreak, malware can zoom up to 80%. For example, in "CIH payload week", I had a dozen PCs in for data recovery and none of those had bad hardware.

    So I think we're correct to concentrate on malware, though the recent commercialization of the malware scene may have lulled folks into a false sense of security. The lack of recent data-eating payloads has less to do with the protective abilities of NTFS (read up on Witty, a pure network worm that trashes NTFS from within XP via raw sector writes) than, well, few folks writing that sort of code at the moment.

    I'd add an extra category; botched maintenance, and I'd rate that as between 5% and 20%. This escalates a recoverable data loss to a larger unrecoverable one.

    For example, I had a client in tears after the Thus Word vir.. uh, "prank macro" payload trashed her data. The details of how Thus works made me fairly confident her data could be recovered, especially as it was off C: on logical drive D:, where there was no other non-data write traffic.

    But her "husband's work" had already "fixed the virus" by "just" wiping and re-installing Windows, and had also replaced the partitioning with one big doomed C: volume. They'd thoughtfully wiped the HD before doing this "to be sure the virus was gone", so there was no happy ending.

    It was the event that led me to write this...

    ...and you may sense the anger in that writing.
  • Backup: Hard drive + online

    I strongly recommend the external USB-drive and online combination.

    As you note, for recovery, the fast access to the USB-drive is critical. Using an external drive is good, because you can easily move it to another computer for recovery. (But of course, in a pinch you can extract an internal drive, and drop it in a USB case, so internal is an option if you don't like external for one reason or another.)

    Online is for disaster recovery. Fire. Theft. And for backup or recovery when you're on the road.

    Hard drive is for bare-iron recovery.

    And yes, you CAN back up many, many gigabytes of data over broadband. The initial backup may take days to accomplish, but unless you're producing many gigabytes per day of new/changed data, it's not really a problem.

    Unless you have both, you're not really protected.

    But forget anything that's not fully automatic. It's just not going to happen, and you'll be left in the lurch.
  • online data and security

    Anyone really have an experience with data loss or corruption on an online site?

    I'd love to hear any stories...
  • Use online backups

    People lose data all the time. So for me it is better to store all my data online, because when all my backups are online I feel secure. I personally use to backup all the files I need because they offer very good services. For example, they offer a 3GB trial for free. And their prices are affordable, and with SafeCopy, if I delete any data accidentally, I am able to recover it with just a few clicks.