The Cloud in 2018: what we have learned so far
What does 11 9s durability really mean?
Marketing's hype frontier is the cloud. Cloud vendors need trust. So they claim 99.999999999 percent durability. Huh? Why not 15 or 17 9s? Here's the deal.
The good folks at Backblaze published a post today on the truth behind cloud durability claims. If you want the gory details, go read the whole thing.
If not, keep reading.
Forget the math
The math behind durability numbers is impressive. But the math depends on the assumptions behind it.
The assumptions are key. And they are:
- Average rebuild time, which isn't what you think it is.
- Annualized drive failure rate, or AFR. Also not what you think.
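To see how much those two assumptions drive the headline number, here is a rough sketch, not Backblaze's actual model. It assumes drive failures are independent, that data is lost only when more shards fail inside one rebuild window than the code can tolerate, and it uses made-up inputs (a 20-shard stripe that tolerates four losses, a 1 percent AFR, and two hypothetical rebuild times). The function names are invented for illustration.

```python
import math

HOURS_PER_YEAR = 8760.0

def window_loss_probability(n_shards, tolerated, afr, rebuild_hours):
    """Chance that more than `tolerated` of `n_shards` drives fail in one rebuild window."""
    # Per-drive failure probability during a single rebuild window.
    p = afr * rebuild_hours / HOURS_PER_YEAR
    return sum(
        math.comb(n_shards, i) * p**i * (1 - p) ** (n_shards - i)
        for i in range(tolerated + 1, n_shards + 1)
    )

def annual_loss_probability(n_shards, tolerated, afr, rebuild_hours):
    """Chance of losing the stripe in at least one rebuild window during a year."""
    windows = HOURS_PER_YEAR / rebuild_hours
    p_window = window_loss_probability(n_shards, tolerated, afr, rebuild_hours)
    # 1 - (1 - p_window) ** windows, computed without losing tiny values to rounding.
    return -math.expm1(windows * math.log1p(-p_window))

def nines(loss_probability):
    """Convert an annual loss probability into a count of 9s."""
    return -math.log10(loss_probability)

if __name__ == "__main__":
    # Hypothetical inputs: same drives, same code, only the rebuild time changes.
    for rebuild_hours in (156, 24):
        loss = annual_loss_probability(20, 4, 0.01, rebuild_hours)
        print(f"rebuild {rebuild_hours:>3} h -> about {nines(loss):.1f} nines")
```

Shorten the assumed rebuild time or lower the assumed AFR and the count of 9s climbs. The formula is only as good as the inputs fed into it.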
All very scientific, except they're beside the point.
Modern cloud storage is a tech wonder. Active cloud storage typically has three copies of your data. Partly for availability, but also for performance, since a 7200 RPM disk needs more than 8ms for a full rotation (about 4ms of rotational latency on average), and with three copies to read from you cut that wait significantly.
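The rotational latency arithmetic is easy to check. This is an idealized sketch: it assumes each copy's platter position is independent and uniformly random, and that the read is served by whichever copy reaches the sector first. Real schedulers are messier. Under those assumptions the expected wait with k copies is one rotation divided by k + 1.

```python
def rotation_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm

def expected_wait_ms(rpm, copies):
    """Expected rotational wait when the fastest of `copies` independent disks serves the read.

    The expected minimum of k independent Uniform(0, T) waits is T / (k + 1).
    """
    return rotation_ms(rpm) / (copies + 1)

if __name__ == "__main__":
    print(f"full rotation: {rotation_ms(7200):.2f} ms")
    for copies in (1, 3):
        print(f"{copies} copy/copies: {expected_wait_ms(7200, copies):.2f} ms expected wait")
```

For a 7200 RPM drive that works out to roughly 8.3ms per rotation, about 4.2ms of expected wait with one copy, and about 2.1ms with three.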
Backup storage -- what Backblaze offers -- can't afford three copies, so they use fancy erasure codes to protect data. These systems break data into shards with some mathematically intense parity that typically enables data to survive four drive failures with little impact on storage capacity.
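The capacity argument can be put in numbers. The sketch below compares raw bytes stored per byte of user data for plain replication against an erasure code. The 16 data plus 4 parity layout is a hypothetical chosen to match the "four drive failures" figure above, not any vendor's published scheme.

```python
def replication_overhead(copies):
    """Raw bytes stored per byte of user data with plain replication."""
    return float(copies)

def erasure_overhead(data_shards, parity_shards):
    """Raw bytes stored per byte of user data with a data + parity erasure code."""
    return (data_shards + parity_shards) / data_shards

def failures_tolerated_replication(copies):
    """Copies that can be lost while at least one survives."""
    return copies - 1

def failures_tolerated_erasure(parity_shards):
    """Shards that can be lost; any missing shard is rebuilt from the survivors."""
    return parity_shards

if __name__ == "__main__":
    # Hypothetical layouts for comparison only.
    print("3 copies:", replication_overhead(3), "x raw,",
          failures_tolerated_replication(3), "losses tolerated")
    print("16 + 4  :", erasure_overhead(16, 4), "x raw,",
          failures_tolerated_erasure(4), "losses tolerated")
```

Under these assumed layouts, three copies cost 3x the raw capacity and survive two losses, while 16 + 4 costs 1.25x and survives four.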
Vendors don't wait until four drives fail to take corrective action. The shards are spread over the infrastructure so no single power supply, switch, or rack can take out more than one shard. And rebuilds are usually highly parallel, so the failures are repaired a lot faster than a home user would expect.
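A toy version of that placement rule looks like this. It is a sketch under simple assumptions, with hypothetical rack names and function names: each shard of an object goes to a different rack, so losing any one rack costs at most one shard. Real systems also account for power feeds, switches, and capacity balancing.

```python
def place_shards(n_shards, racks, offset=0):
    """Assign each shard to a distinct rack, starting at `offset` to spread load."""
    if len(racks) < n_shards:
        raise ValueError("need at least as many racks as shards")
    # Consecutive positions modulo len(racks) are distinct while n_shards <= len(racks).
    return {shard: racks[(offset + shard) % len(racks)] for shard in range(n_shards)}

def shards_lost(placement, failed_rack):
    """Count how many shards of one object sit in the failed rack."""
    return sum(1 for rack in placement.values() if rack == failed_rack)

if __name__ == "__main__":
    racks = [f"rack-{i}" for i in range(24)]  # hypothetical rack names
    placement = place_shards(20, racks, offset=7)
    worst = max(shards_lost(placement, rack) for rack in racks)
    print("worst-case shards lost to a single rack failure:", worst)
```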
As the Backblaze post notes, 11 9s durability is irrelevant because:
Somewhere around the 8th nine we start moving from practical to purely academic. Why? Because at these probability levels, it's far more likely that:
- An armed conflict takes out data center(s).
- Earthquakes / Flood / Pests / or other events known as "Acts of God" destroys multiple data centers.
- There's a prolonged billing problem and your account data is deleted.
Asteroids v billing?
As it notes:
You change your credit card provider. The credit card on file is invalid when the vendor tries to bill it. Your email service provider thinks billing emails are SPAM. You don't see the emails coming from your vendor saying there is a problem. You do not answer phone calls from numbers you do not recognize. Customer Support is trying to call you from a blocked number, they are trying to leave voicemails but the mailbox is full.
Thereby hangs a tale -- with an unhappy ending.
Designing for failure
Behind the marketing, though, are serious engineering teams that sweat the details, from architecture to root cause analysis of unexpected events. We've come a long way from the oversold RAID arrays of the 1990s, whose data loss rates were far above what the optimistic assumptions predicted.
The Storage Bits take
Is a full voicemail box more likely to result in data loss than an asteroid? Since no asteroid has taken out a data center in the last 20 years and people have lost data due to billing issues, yeah, clean out your voicemail.
The important point is that for backup purposes -- which means you only need access occasionally -- any provider with credible numbers is good enough. So then the decision points come down to price and privacy.
The former is easy to figure. The latter? Well, the client software should encrypt your data before it leaves your system with a suitably strong algorithm like AES-256. Once the data is sharded at the backup site, it is pretty much unreadable unless the entire data center -- and your password -- are hacked. No single drive will have useful recoverable data.
Active data -- a website, or database -- that needs constant access is another matter. There are a lot more moving parts, and nobody claims 11 9s availability.
But that's a post for another time.
Courteous comments welcome, of course. I've been a satisfied customer of Backblaze for several years. Other than that, I have no commercial relationship with them.