Yahoo's financial troubles as well as recent examples of Flickr account mismanagement highlight the need for personalized disaster recovery and Cloud data permanence.
Imagine that you are a blogger, say someone who likes to take a lot of pictures and videos of things. Like myself.
Many bloggers, whether they run sponsored or advertising-free sites, use services like Flickr, Google's Picasa Web Albums, Photobucket, YouTube and Vimeo to host this content, because it's much cheaper to pay a nominal $25-$50 per year fee for "Pro" access to each of these services than to try to host the files on your own webspace, such as on WordPress.com, Blogger.com, Movable Type or elsewhere.
In terms of raw infrastructure cost and bandwidth, it makes a lot of sense, if you're a one-man/one-woman blogging shop with limited resources.
The problem is that when you outsource your blogging infrastructure to others, you're dependent on the reliability and resiliency of the Cloud rather than on your own infrastructure, over which you have full control (as well as complete responsibility for its maintenance). And if you use a mix of services, that reliability and resiliency may or may not be defined in each provider's Terms of Service.
Here's the kicker -- if you read most of those Terms of Service, you'll find the providers are under no obligation to back up your data.
As I discussed in a previous article in November of 2008, "Preparing for a Flickr Apocalypse", losing your data hosted in the Cloud could be utterly catastrophic. Even having good backups of that data doesn't necessarily ensure you can put your blog back together again without a huge amount of aggravation and lost time.
This week, for a Swiss photo blogger, Mirco Wilhelm, that Flickr Apocalypse came.
Apparently, due to an error by one of Flickr's system administrators, Mirco's Flickr account was deleted, along with 4,000 of his photos. All because he reported to Flickr that another user had used his content without his consent. Instead of deleting the other user's photos, Flickr deleted his.
Now, this would be just fine, provided that Yahoo did snapshot-based SAN backups and second-tier tape archives of their storage for disaster recovery and for occasional mishaps like this. Sysadmins do make mistakes. It happens. All Yahoo would need to do is restore those backups and everything would be fine.
The problem is that Flickr doesn't appear to preserve historical data. So even though they were able to restore Mirco's account, they couldn't bring back his photos. And while Mirco has offline backups of his photos, there's no way to link them to the metadata -- the unique URLs -- that exist on his blogspace. So now he has thousands of broken links on his blog.
To me and many other bloggers who rely on Flickr and other content-hosting services, this is a massive wake-up call.
To Yahoo and Flickr, I say this: you had better get your storage and disaster recovery act together. Soon. And if you don't get Mirco's data fixed, you will be in a world of pain.
[UPDATED, February 3, 2011: Flickr has restored Mirco Wilhelm's photos.]
Yahoo, if you can't prove that your users can recover from this type of catastrophe, then expect them to start transitioning their data to other services such as Google's. Think of it as a potential "Bank Run" on your Cloud. I'm certainly considering hosting all of my new photos with Picasa already and thinking of ways I can eventually transition all of my data over to a more financially stable company.
Two years ago, I was worried that Yahoo and the Flickr service could go under due to overall issues with the economy and Yahoo's thrashing about with Microsoft.
Today, I'm even more concerned. Yahoo's financial woes continue, it faces competitive challenges across the board due to ineffective leadership, and there are clear signs of asset consolidation and severe workforce reductions underway. On top of that, recent events indicate that if they screw up, you can't get your data back.
If Yahoo were to shut down Flickr tomorrow, it would be utterly catastrophic. Literally hundreds of thousands of blogs and other websites (if not millions) would lose all of their photo content overnight. In my personal case, 90 percent of my food blog's content -- over 17,000 photos -- would simply vanish.
Many of my friends' and peers' sites in the food blogging community would be irreparably damaged, as many of them rely on Flickr as well due to its ease of use and excellent blog integration.
Of course, I have backups of all of the photos. I use services such as QOOP (which will mail you DVDs of your pictures for a nominal fee) as well as programs like Bulkr and Cloud-based replication services like Backupify. If you're a Flickr user, I encourage you to start using them, stat. But even using these just isn't good enough.
As of today, there is no known way to take those photo backups, restore them to private webspace or another hosting provider of your choosing, and "automagically" fix all of the broken links on your blog by doing a data transform/parse on an XML backup of your blog posts. The Flickr metadata describing how the directory structures are represented would need to be copied over to the new site, along with the change of canonical host name in each URL.
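To make the link-fixing half of that problem concrete, here is a minimal sketch of the transform in Python. It is not an existing tool. It assumes you have already built, by hand or from your backup's metadata, a dictionary mapping each Flickr photo ID to the photo's new address, and it assumes your posts embed images using static URLs of the form farmN.static.flickr.com/server/id_secret.jpg with an optional size suffix. The names rewrite_flickr_links and new_url_by_photo_id are hypothetical, as is the example.com host.

```python
import re

# Assumed shape of a Flickr static image URL: farm number, server number,
# then "<photo id>_<secret>" with an optional one-letter size suffix.
# Adjust the pattern to match the links that actually appear in your posts.
FLICKR_URL = re.compile(
    r"https?://farm\d+\.static\.flickr\.com/\d+/"
    r"(?P<photo_id>\d+)_[0-9a-f]+(?:_[a-z])?\.jpg"
)


def rewrite_flickr_links(export_text, new_url_by_photo_id):
    """Swap Flickr image URLs in a blog export for their new locations.

    export_text is the raw text of the blog backup (for example, an XML
    export). new_url_by_photo_id is a hypothetical mapping you must build
    yourself: Flickr photo ID (as a string) -> URL on the new host.

    Returns the rewritten text and a sorted list of photo IDs that had no
    entry in the mapping; links for those IDs are left untouched.
    """
    missing = set()

    def swap(match):
        photo_id = match.group("photo_id")
        new_url = new_url_by_photo_id.get(photo_id)
        if new_url is None:
            missing.add(photo_id)
            return match.group(0)  # leave the original link in place
        return new_url

    return FLICKR_URL.sub(swap, export_text), sorted(missing)


if __name__ == "__main__":
    post = '<img src="http://farm5.static.flickr.com/4001/1234567890_abcdef1234.jpg"/>'
    mapping = {"1234567890": "http://example.com/photos/1234567890.jpg"}
    fixed, unmapped = rewrite_flickr_links(post, mapping)
    print(fixed)
    print("unmapped photo IDs:", unmapped)
```

The hard part is the mapping, not the search-and-replace: unless the backup records which photo ID belongs to which file, and at which size, there is nothing to feed into a function like this.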
That's not including, of course, the descriptions, groupings, taggings, et cetera that go along with all of those photos on Flickr, if I wanted to painlessly move them to something like Picasa or Photobucket. And in Flickr's case, pictures can be stored in multiple resolutions. I only have the ORIGINAL uploaded, high-res files backed up, and that's 30GB of data, even though I only (tend to) link to the 500-pixel-wide versions on my blog.
What we need is some sort of universal standard for preserving content. I should, depending on how my customer loyalty wind blows on any particular day, be able to easily switch from Flickr to Google Picasa. Or to Photobucket, or whatever else is out there, including my own private webspace.
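As an illustration only, the kind of portable record such a standard might carry can be sketched in a few lines of Python. No such standard exists, so every field name below is an assumption: PhotoRecord, published_urls, and the manifest functions are hypothetical, chosen to cover the things mentioned above (descriptions, groupings, tags, and the different resolutions a blog might link to).

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PhotoRecord:
    """One photo plus the metadata needed to re-home it (hypothetical schema)."""
    photo_id: str                     # ID on the original service
    original_file: str                # path to the backed-up high-res file
    title: str = ""
    description: str = ""
    tags: list = field(default_factory=list)
    sets: list = field(default_factory=list)            # groupings/albums
    published_urls: dict = field(default_factory=dict)  # size label -> old URL


def dump_manifest(records):
    """Serialize records to JSON text that any service or script could read."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)


def load_manifest(text):
    """Rebuild PhotoRecord objects from JSON produced by dump_manifest."""
    return [PhotoRecord(**item) for item in json.loads(text)]


if __name__ == "__main__":
    record = PhotoRecord(
        photo_id="42",
        original_file="originals/42.jpg",
        title="Pastrami on rye",
        tags=["deli"],
        published_urls={"medium_500": "http://example.com/42_500.jpg"},
    )
    print(dump_manifest([record]))
```

Keeping the old published URLs next to each original file is the design choice that matters here: it is the piece a plain photo backup lacks, and the piece a link-repair step would need.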
There need to be toolsets for doing this. Heck, Flickr needs to provide a means for users to back up their data repositories and restore them, just in case the users make accidental catastrophic mistakes.
My biggest concern is not just for my own personal blog, but for the retention and permanence of data as the years go on. Does anyone even remember Sony's ImageStation? That content is long gone. How many countless memories from that site have been lost?
Flickr unfortunately is just too big to fail, regardless of Yahoo's financial situation. What we need is a data preservation and disaster recovery strategy that works, for everyone's photos and content.
Are you concerned about the permanence of your online digital photo collections and your blog content links? Do Yahoo! and Flickr need to do right by their users by safeguarding their data and providing them with tools to migrate and restore their photos? Talk Back and Let Me Know.