Why backup isn't enough

You can back up your data daily - even hourly - and still lose data. Here's how it happened to me.

My backup regimen

I run 3 backups:

  • Time Machine. Part of Mac OS X, Time Machine backs up changed files in my user account every hour. Like too much of Apple's tech these days, the UI is great, the underlying tech not so much.
  • Carbon Copy Cloner. My 300 GB VelociRaptor system disk and a 2 drive, 2 TB RAID 0 are cloned nightly. 2 copies of everything.
  • Online backup. My critical documents are backed up to an online provider.

Given that most people don't back up at all, I should be golden, right? But I'm not.

The limits of backup

As explained in "How Microsoft puts your data at risk," our Windows, Mac and (most) Linux file systems are, in a word, junk. Architected decades ago in a world of costly storage and puny CPUs, they will slowly hose your data.

Flaky hardware, inconsistent error handling, bit rot, phantom writes and more give consumer file systems problems they can't handle. And backup can't handle them either.

In fact, backup just spreads the corruption.

Let's say you have a large PDF. You read it, make a couple of annotations, save and then close it. Unbeknownst to you and your file system, the save corrupts the file. A bad write perhaps.

The corrupted file now gets saved by Time Machine in the next hour. Then cloned to the backup drive that night. Now I've got 3 corrupted files.

But wait! Time Machine keeps old versions for months, as do some of the online backup providers. I still have a good copy.

Until several months pass. Then all I've got are corrupted copies. That's what happened to me.
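
To see why the old versions don't save you forever, here is a minimal sketch in Python of the retention arithmetic. It assumes one snapshot per day and a fixed retention window, which is simpler than how Time Machine actually thins its history. The function name and its parameters are hypothetical, chosen only for illustration.

```python
def surviving_good_copies(corrupted_on_day, today, retention_days):
    """Return the snapshot days that are still retained AND predate the corruption.

    Assumption (not from any real backup tool): one snapshot per day,
    and only the most recent `retention_days` snapshots are kept.
    """
    oldest_kept = max(0, today - retention_days + 1)
    kept = range(oldest_kept, today + 1)
    # Snapshots taken on or after the corruption day hold the damaged file.
    return [day for day in kept if day < corrupted_on_day]


if __name__ == "__main__":
    # File corrupted on day 10, 90-day retention window.
    for today in (20, 98, 99):
        good = surviving_good_copies(10, today, 90)
        print(f"day {today}: {len(good)} clean snapshot(s) left")
```

Under these assumptions the last clean snapshot (day 9) ages out on day 99. If you haven't opened the file by then, every retained version is a copy of the damage.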

The textbook answer

If you back up a corrupted file, you still have a corrupted file. Big company IT shops have dealt with this problem for decades and they have the answer: archiving.

They take copies of files and place them in a read-only archive. If the file is later corrupted on the active storage, they go to the archive and pull out the - hopefully - uncorrupted copy.
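
The archive idea can be sketched in a few lines of Python. This is an illustration under my own assumptions, not how any enterprise archiving product works: the helper names (`archive_file`, `is_intact`), the `.sha256` sidecar file and the directory layout are all hypothetical.

```python
import hashlib
import os
import shutil
import stat
from pathlib import Path


def sha256_of(path):
    """Hash a file in 1 MiB chunks so large files don't need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_file(source, archive_dir):
    """Copy `source` into `archive_dir`, record its hash, mark the copy read-only."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / Path(source).name
    if target.exists():
        # An archive never overwrites: a later (possibly corrupted) version
        # must not replace the copy already stored.
        raise FileExistsError(f"{target} is already archived")
    shutil.copy2(source, target)
    checksum = sha256_of(target)
    if checksum != sha256_of(source):
        raise OSError(f"copy of {source} does not match the original")
    # Hypothetical convention: store the hash next to the file as NAME.sha256.
    (archive_dir / (target.name + ".sha256")).write_text(checksum)
    os.chmod(target, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target


def is_intact(archived):
    """Return True if the archived copy still matches the hash recorded for it."""
    archived = Path(archived)
    expected = (archived.parent / (archived.name + ".sha256")).read_text().strip()
    return sha256_of(archived) == expected
```

Calling `archive_file("report.pdf", "/Volumes/Archive/2009")` (both paths are made up) stores a copy, and `is_intact(...)` later tells you whether that copy has changed since. Note the limit: the hash proves only that the archived copy hasn't changed since it was archived. If the file was already corrupted when you archived it, the archive faithfully preserves the corruption.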

But if few people back up, even fewer make archives: PC archiving software is geeky; the storage requirements are large; and the perceived benefit is low.

The Storage Bits take

Those negatives around archiving aren't changing. Home users will rarely archive even 2 decades from now, despite backing up. It is just too boring and expensive.

Which gets back to yesterday's post, "Apple's weak tech-fu." We need file systems that make data corruption rare.

As we store more files for longer times and look at them less often, data corruption will become more visible. Let's get in front of this problem today.
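
Until file systems check data integrity for us, a rough user-level stopgap is to look for files whose bytes changed even though nothing claims to have modified them. The Python sketch below is only that, a sketch: `fingerprint`, `suspect_files` and the in-memory manifest are hypothetical names of my own, and the modification-time heuristic is an assumption, not a guarantee.

```python
import hashlib
import os


def fingerprint(path):
    """Return (mtime in nanoseconds, SHA-256 hex digest) for one file."""
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    return os.stat(path).st_mtime_ns, digest


def suspect_files(manifest):
    """List paths whose contents changed although the modification time did not.

    `manifest` maps path -> fingerprint recorded on an earlier scan.
    A file that was really edited gets a new mtime and is not flagged.
    """
    suspects = []
    for path, (old_mtime, old_digest) in manifest.items():
        new_mtime, new_digest = fingerprint(path)
        if new_mtime == old_mtime and new_digest != old_digest:
            suspects.append(path)
    return suspects
```

Build the manifest once with `{path: fingerprint(path) for path in paths}`, then run `suspect_files(manifest)` before the next backup. This catches bit rot in files at rest. It does not catch a bad write during a save, because a save updates the modification time and looks like a normal edit.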

Comments welcome, of course.

Topics: Storage, Data Management

  • I thought Apple was going to steal ZFS?

    Epic fail!!! HAHAHAHAHAHA!!!!
  • RE: Why backup isn't enough

    What options are even available to home users to archive?
    • If you use a superior OS (Windows), you can use Windows Home Server

      If you are using something like OS X though, your options are limited to hacking a new file system into your OS. The instructions are very simple:

      Don't be put off by things like:
      "Now for the important part: right at the beginning, I said ZFS WILL crash your Mac."
      Your Mac will crash a lot but it won't lose any data!!!
      • NTFS is from the early 90s. wow how modern

        You're just upset because you invested in MS stock instead of Apple stock, HAHAHAHAHA!!!!!
      • No, I invested in MS software

        And I haven't had any of the troubles that Robin is facing on his Apple software. Go me!!! :)

        PS OS X's kernel is from the mid 80s. wow how modern.
        Cue the double standards...
      • Why? It wouldn't help

        Sadly NonZealot doesn't understand.

        Robin's point is clear:

        "our Windows, Mac and (most) Linux file systems are, in a word, junk".

        Whilst I wouldn't call them junk, there is certainly room for improvement.

        WHS doesn't help.

        Archiving is the solution, as Robin correctly points out, until we have an infallible filesystem. WHS "archiving" suffers from the same limitations as Time Machine.
        Richard Flude
    • It's called your DVD burner

      Just make an archive copy. Making sure, of course, that it's good before you make the copy. Something the blog author conveniently leaves out of his rant. If he'd archived the bad copy, he'd be just as screwed.
      • RE: Why backup isn't enough


        Some people have TOO MUCH DATA to do that. A DVD only holds 4 GB of stuff. We are now at drives that are 1-2 TB!

        There has to be a new filesystem that is focused on making sure that things are saved to the hard drives properly, the article writer is right about that.

        However, NTFS is VERY OLD and a lot of people wouldn't want to do the rejiggering that would be needed for a new file system.
        Remember how many people bitched about FAT and FAT32 being dropped?
    • RE: Why backup isn't enough

      @justinkearney Hey Justin look into Backup Exec System Recovery. It allows you to backup online and you can run a snap shot prior to modifications. I use it and it just works.
      • RE: Why backup isn't enough

        @kjgslg@... We are using it in our company, and I can say it is not so simple to use. It has poor documentation and a nightmare of a translation; it also has a strange bug when restoring SharePoint.
        Yes, it works, but it is not for mass market.

        Marco Mangiante
    • RE: Why backup isn't enough

      You can do what my wife and I did. We bought a spindle of DVD+RW disks and archived our important files to them. She gives her set to her mom and I take my set to work, so we have not only an archive but also an offsite backup. If our house burns down while we're on vacation, we won't lose all our data. Occasionally we bring back the disks and update them.

      Of course, this form of archiving isn't very thorough, but it's a lot better than what most people do.
  • There is always a potential point of failure.

    CD/DVD/BD cracks, HD crashes, bad data gets replicated out and overwrites the good. You can go so far as to have every possible sequence of backups, but does the average user need that kind of redundancy? Not likely.

    I guess my recommendation for a backup scheme is fairly simple. Have a local backup such as Time Machine or a USB HD. Have an online option for off-site storage; for $50 a year, it's well worth it. And do a DVD/BD backup monthly or quarterly, and stick that in a safety deposit box somewhere.

    With that, is it still possible to have some data loss? Probably. Likely? Well, I am no statistician, but my guess is you have better odds of getting struck by lightning.

    I have data as old as 10 years. I don't ever really do anything with it, but it is there. Likelihood I will ever need it? very remote. But it keeps getting backed up, and archived. Would it be the end of the world if I lost it? Not likely. Aside from tax records, I am a firm believer in the idea that if you don't use something within 6 months, the likelihood is that you never will.
    • RE: Why backup isn't enough

      @JM1981 I totally agree. Btw, I'm a big supporter of that last line you wrote.
      Arm A. Geddon
    • RE: Why backup isn't enough


      Online backups are NOT worth it when they give 2 GB of space and people have 1 TB of data to back up!

      No, a better thing is to have a double backup system: one on a SEPARATE internal hard drive that is ONLY for backups, and another one on an EXTERNAL hard drive.
  • Maybe I'm just dumb...

    ... but your basic argument seems to be that you can't get your data when the backup has been corrupted? Well...duh.

    Time Machine is just a slick "iTunes"-type interface on a cumulative backup schema. There aren't any new technologies there, it's just "pretty". The file dependencies grow over time. Perhaps even more so in Apple boxes (maybe not a 1-for-1 dependency?). I don't know if they use data deduplication (aka: putting all your eggs in one old basket).

    Every now and again one should start from scratch and have a data backup built from current data, and not build on a leaning tower of 3-year old data backups on top of 5-year old data backups. After all, every type of storage media that I've ever heard of (other than perhaps stone tablets) eventually physically corrodes and loses data. Well duh (again). Sigh.
    • No, I'm saying


      That just backing up will not fully protect your data. You (I) need an archive - which would also protect against a corrupted backup.

      R Harris
      • RE: Why backup isn't enough

        @R Harris

        Unless you can explain exactly how an archive magically cures or prevents corrupted data, archive and backup are semantic equivalents. Simply making a file read-only (which is what you say is the difference between an archive and a backup) cannot cure or prevent corruption that has already happened, which is what your article is about.
  • Simple Solution really....

    If you are working on something that you need to make sure is not corrupted after saving it, simply close it and re-open it. Voilà, presto. Like magic, you have verified that the next time it gets backed up you are safe.
    • RE: Why backup isn't enough


      Or just put the thing on an SSD, where 'phantom writes' are almost never going to happen.
  • Just use BAK Files

    If you're really that serious about your files

    Set up "bak" files, which always contain your data prior to the last save, no matter how long ago it was.

    Unless of course you edit/save the corrupted file
    ... but that would be your fault