How would you feel if you woke up one Monday morning to discover that 5+ years of cloud backups were missing? That they appeared to have been purposely deleted by the very same cloud backup vendor to whom you entrusted your backups?
That was my Monday last week. It took the whole week to resolve, fueled by back-and-forth support tickets, Twitter blasts, frustration, worry, and eventual resolution.
Fortunately, the cloud backups are now fine. The vendor acquitted itself honorably. And while I have a new, unwelcome homework assignment, my data will be properly re-homed.
While it was a week of stress I would rather not have endured, some important lessons came out of the experience, both for companies that provide IT services and for their customers.
Backing up with CrashPlan
I started using the CrashPlan cloud backup service in August of 2010. CrashPlan is the off-site component of my overall backup strategy, which involves four arrays of mirrored NAS drives, on-site backup servers I keep off the Internet and powered off when not in use, and regular use of Dropbox, Evernote, and other cloud SaaS providers.
While I've had a few complaints about the speed of support, I'm generally happy with the service. Actually, CrashPlan has, more than once, saved my bacon.
I haven't had to give CrashPlan too much thought, since it has generally worked. The big management activity I do is open up its weekly backup report and make sure the Backed Up % column is at 100 percent (or close enough for one or two problem machines).
As you might imagine, over the period of six years, I've added many new machines and taken quite a few out of service. When I take a machine out of service, I normally move its data over to its replacement and keep a local backup. Even so, one of the things I've appreciated about CrashPlan is that the backups from those retired machines are still stored in the cloud.
I have about 7TB stored in CrashPlan, which includes not only work-related files but our home media, some of our movies, and our music collection. At a few points during my relationship with CrashPlan, I was concerned about whether I could store that much data with them, so I asked the company.
One support person, Matt G, responded in January 2011, and I quote, "Our data plans are truly unlimited, no silly limits, no limits, no lockdowns." In August 2012, CrashPlan support rep Nick S. responded with "Unlimited means unlimited."
And that's about where I left it. Until Monday, when the archives for 15 of my retired machines vanished from the backup report.
The back-and-forth discussion
I won't bore you with the detail of our back-and-forth, except to say that my part was primarily "Wah! Wah! Wah! Where's my data?" and their answer was pretty much "Blah, blah, blah, corporate policy."
When one of their answers took four days to get back and didn't result in what I considered a satisfactory answer to my "Where's my data?" question, I decided to take it to Twitter. There, my dialog was pretty much "Rant, rant, rant, they lost my data," and their response was "Sigh, okay, let's look into it."
While we argued issues about their policies, terms of service, and so forth, the end result was that I eventually asked useful questions like, "Does the data from my old archives still exist?" and "Can it be made available for download?" They provided constructive replies, like "Yes it does," and "Yes it can."
Other than not making me deal with this situation at all, CrashPlan provided about as reasonable a solution as possible. They reinstated my old archives and gave me all the way until August to get them off their servers. They also wisely advised me to pull off the most important stuff first.
Where was the disconnect?
The disconnect was, quite literally, about disconnected computers. Because I've been getting the weekly reports for years showing the status of my files on disconnected computers, I assumed that they were being properly stored and were not at risk.
CrashPlan's policy, at least since 2013, has been that computers must connect within 180 days, or their backups will be discarded. I was not aware of this and I honestly can't tell if it changed from what it was when I first talked to them, or I just misinterpreted an overly enthusiastic tech support response.
A second disconnect was that I have a plan that allows for a maximum of ten computers. I thought that was ten connected computers, but that wasn't how CrashPlan defined it.
Because of the comments of the support reps about "Unlimited means unlimited," combined with years of weekly reports that itemized the data from the disconnected computers, I assumed that the data from the disconnected computers fell under the unlimited umbrella. Clearly (and, from my perspective, suddenly) that was not CrashPlan's view.
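For readers who keep retired machines in a cloud backup, the 180-day rule described above is easy to check for yourself. What follows is only a minimal sketch, not CrashPlan's actual logic: it assumes the clock starts at a device's last connection date and that the window is exactly 180 days. The device names, the dates, and the `at_risk` helper are all hypothetical; you would fill in the last-connected dates by hand from your own weekly report.

```python
from datetime import date, timedelta

# Assumption: the inactivity window is exactly 180 days from last connection,
# per the policy as quoted in this article. The vendor's real rule may differ.
RETENTION_DAYS = 180


def days_until_purge(last_connected: date, today: date) -> int:
    """Days left before the inactivity window closes (negative if already past)."""
    deadline = last_connected + timedelta(days=RETENTION_DAYS)
    return (deadline - today).days


def at_risk(devices: dict[str, date], today: date, warn_within: int = 30) -> list[tuple[str, int]]:
    """Return (device, days_left) pairs that are past or near the deadline, most urgent first."""
    flagged = [
        (name, days_until_purge(seen, today))
        for name, seen in devices.items()
    ]
    return sorted(
        [(name, left) for name, left in flagged if left <= warn_within],
        key=lambda item: item[1],
    )


if __name__ == "__main__":
    # Hypothetical devices and last-connected dates, for illustration only.
    devices = {
        "old-laptop": date(2015, 11, 1),
        "media-server": date(2016, 1, 15),
        "workstation": date(2016, 5, 30),
    }
    for name, left in at_risk(devices, today=date(2016, 6, 1), warn_within=60):
        status = "past the window" if left < 0 else f"{left} days left"
        print(f"{name}: {status}")
```

Run against a list like this once a month, a check of this kind would have flagged my retired machines long before their archives disappeared from the report.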
Lessons we should all learn from this
After it became clear that I would be able to get back my data from retired machines, I reached out to CrashPlan's management to discuss the issue from a broader perspective. Steve Buege, General Manager for Consumer & Small Business at Code42 (the company behind CrashPlan), gave me some great feedback.
First, I asked, "What are your current terms of service with regard to backup retention and what are those limits?" Here is his answer:
Per our retention policy, computers that back up to CrashPlan Central must periodically connect to the online destination. After six months of inactivity (180 days), backups stored on CrashPlan Central will be deleted. This article describes the backup retention policy for inactive archives stored on CrashPlan Central: Backup Retention Policy For Inactive Cloud Destination
The archive maintenance only happens when your computer regularly connects to CrashPlan Central. Connecting to CrashPlan Central ensures the safety and integrity of your backed-up data. We run regular maintenance and optimization of the backup archives, which can include:
- Checking for corrupted files and repairing any bad sectors.
- Pruning file versions and blocks, and removing deleted files.
- Purging files no longer selected for backup.
My last detailed discussion with CrashPlan over their retention policy was back in 2012. After that, things seemed to run fine, so the company mostly stayed off my radar.
Apparently, CrashPlan changed their policy in 2013 and -- as far as I can tell -- did not effectively communicate that to their customers. At any rate, I missed it. I'm not sure exactly how the policy changed, or exactly what it was before 2013, but here's some clarification.
Me: How did your policies change from those that were in place in August 2012?
Steve: The policy for inactive backup removal, where the device has not connected at least once in the past six months, has been in place since 2013. We encourage customers to view our support site for the most up to date policy.
Figuring that I probably missed the communication explaining their policy change, I wanted to know how they reached out to customers:
Me: Given the possibility that I missed or ignored communication about your service changing from "no limits" to limits, how, exactly, did you communicate that to customers and when?
Steve: Our "unlimited" plans refer to the amount of storage - and this has not changed. We have always had limits on the number of devices that are allowed for each subscription.
And that leads us to our first lesson.
Lesson #1: For IT services customers, don't just assume that the services you bought a few years ago have the same terms of service today. Check on them regularly.
It's abundantly clear that I didn't stay up on the current ToS, but it's not clear that CrashPlan communicated their changes well, which brings me to the next lesson.
Lesson #2: For service providers, communication with your customers before you take any drastic action needs to be visible and memorable. Otherwise (giving CrashPlan the benefit of the doubt), even if you communicate with your customers, they may not notice. Make sure you truly go out of your way to communicate any service changes loud and clear, and do your best to make sure that message has been received.
Me: Why, if you made this change, did you suddenly remove all of my older archives in the last week? Did you send me a "we're about to remove your archives" message within the last month or two that I missed?
Steve: This policy, which has been in place since 2013, has not changed. We simply sent notification to customers that had more than six months of inactivity informing them that we had removed their inactive backup and to contact us if they had questions or concerns. Based upon some feedback, we changed our notification to provide five days advance notice before removing these archives. We actively work directly with individual customers who wish to retain old data.
As far as I can tell (and I did go through my email history), I didn't get a notice anytime recently telling me those retired backups would be eliminated. I didn't see anything obvious in 2013 either. The only difference was last week, when a shorter listing of backed up devices showed in my report.
While the computers that were retired were listed in red on my weekly report, there was no verbiage stating they were not valid backups. This went on for roughly three years, so we're talking about well over 100 reports, well after the policy change.
Lesson #3: For service providers, if your terms of service change, even if you have communicated that status change, use your regular reports to reinforce it. Otherwise, while you may think you're providing a hugely generous grace period (three years is definitely generous), your customers may not know they've been cutting into their grace period.
They will be up in arms when the hammer finally falls. Then, you'll have to field phone calls, Twitter posts, and support emails from customers who could have been satisfied but are, instead, upset.
On the other hand, what Steve said earlier about working with customers is real. After some Twitter bell-ringing, the company did step up and provide a more than fair solution. Email support was slow and unsatisfactory, but once I went onto Twitter, response picked up measurably.
Me: What tools (for example, Hootsuite, Zendesk, etc) are you using to monitor social media for discussion and did the nearly immediate response I got (compared to the very long wait via traditional support channels) reflect the communications mechanism (i.e., Twitter) or the fact that I have quite a few followers?
Steve: We use a variety of tools and our goal is to respond quickly to concerns to every customer, either on social media or through a direct channel.
And that brings us to our next important lesson.
Lesson #4: For service providers, give reasonable support, use omnichannel outreach methods, and be actively willing to resolve problems. It can go a long way to overcoming any concern an IT services customer might have with a vendor. Making things right makes a big difference. You can follow CrashPlan's lead here.
My big takeaway -- and it's not one I'm happy with, but it's wise -- is to go over the terms of service for each of my vendors each year. I'm not happy with it because I don't want another 20 to-do items. But it just might be a lot less work-intensive than cleaning up a mess.
The last word
So, will I continue to use CrashPlan? The short answer is: most likely. It's great for backing up machines in current use. CrashPlan has saved me from disaster more than once, so it's nice to have them in my corner. However, as a matter of course, I review my backup and archiving strategy every few years. This episode has exposed a flaw. Obviously, long-term off-site backups for retired machines will need to find a new home.
As for Code42's policies, I'll give Steve the last word:
Customers must periodically connect their devices to CrashPlan Central (our consumer cloud) at least once every six months to maintain active archives, per our retention policy. We send customers detailed backup reports every week that allow them to view the last time they connected, across all devices.
To ensure the health and integrity of their backups, we let customers know when we detect these old, unused archives and inform them that these files will be deleted after a period of time. This is in accordance with our retention policy. We're directly working with any individual customer concerns related to these ongoing communications.