Cloud storage appliances: Backup and recovery made simple

Why should you integrate cloud backup appliances into your IT environment? Because you've made this decision before.
Written by Jason Perlow, Senior Contributing Writer

Cloud. Cloud. Yay cloud!

If you're not an IT decision maker, I did not write this article for you. Go away. Your mother is calling you and wants you to clean up your room in the basement.

OK, now that we're left with just the adults in the joint, let me put this in very simple terms that I am sure any stressed out, overworked CIO or CTO can understand: Your storage is very expensive.

Like many organizations, you are probably always on the verge of having to buy another frame, another chassis, and trays of drives because you've got VM and filer sprawl. And the guy or gal who has the authority to sign the purchase orders to get you those new frames, chassis, network infrastructure, et cetera, likes to say no a lot.

They do this because they love to make you miserable. They enjoy it. They have a big giant rubber stamp embossed with "Denied" on it in a 1920s-style font with a pad of red ink next to them, and they relish every moment to use it when one of those POs comes across their desk.

Sound familiar? Do I get it? Are you still with me? Good.

If you can't get new storage frames, then, by definition, you have to free up the storage you already have. Chances are you've got a lot of infrequently used files, but for regulatory reasons or other business drivers, you have to retain that information. So where to put it?

Where to put. It.

So in the olden days, you had to solve this problem with things like physical boxes of printed paper documents and DLT tapes, and because you didn't have enough physical real estate to store the stuff and that real estate was expensive, you shipped it offsite. In armored trucks, in many cases.


Now, back in those days of yore, the 1990s, you used services like Iron Mountain to cart truckloads of that stuff out your door. And I am sure there were many conversations at the time about the pros and cons of doing that.

Certainly, one-off retrieval of documents and tapes wasn't cheap when it had to occur, and there were some trust issues about transporting those documents and tapes offsite. Overall, though, it was a net win for your company and a good idea, and after all was said and done, you were probably wondering why you hadn't done it sooner.

Cloud-based storage is the same deal. You use it to move all sorts of infrequently used stuff offsite, in a secure fashion, so you can free up space on that storage that's a pain in the ass and expensive to buy.

That's certainly the primary use case, but there are others, which I will get into momentarily.

However, unlike Iron Mountain or similar services, it's not expensive to retrieve that infrequently used stuff, and it also happens extremely quickly. It's also more secure than that armored truck.

No, really, it is. When you store data in the cloud, be it Amazon's, Microsoft's, Google's, or anyone else's, these "cloud storage gateways," as they are called, transport your data using military-spec network encryption protocols and then store it in an encrypted format that is unreadable should anyone actually invade the target datacenter, which, by the way, is geo-redundant if you want to pay for that premium.

Armored trucks can be broken into, and there were a number of instances during the early 2000s where major financial and government institutions simply lost DLT tapes on them and had major public fiascos.

Yes, I'm sure the NSA can tap your MPLS and OC lines, but, honestly, they have better things to do with their time.

So first of all, cloud storage is cheap. How cheap? Take a look at the Amazon S3 and Microsoft Azure price lists, for starters. It's way, way cheaper than your frames.

Now, you're probably thinking that you gotta use a whole lot of programmatic API junk to integrate this stuff with your line-of-business apps. Nope.

So all of these Cloud Storage services have APIs, but you can literally just drop one of these gateway appliances into a rack, or even run one as a virtual machine, and point your servers at it using an iSCSI connection over your IP network and let it do all that API stuff.

Your servers see the gateway as just another LUN: a block storage device like all the others you have on your SAN or your NAS filer.
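To make that less hand-wavy, here is a minimal sketch of the idea in Python. It is not how any particular vendor's gateway works: `FakeObjectStore`, `ToyGateway`, the 4 KB block size, and the `volume0/block-…` key naming are all made up for illustration, and a plain dictionary stands in for the cloud provider's API. The point is only that servers read and write numbered blocks, while the gateway keeps the recently used ones local and quietly ships the cold ones off to object storage.

```python
from collections import OrderedDict

BLOCK_SIZE = 4096  # bytes per block; an arbitrary choice for this sketch


class FakeObjectStore:
    """Stand-in for a cloud bucket: maps string keys to bytes (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        # Returns None when the key has never been stored.
        return self.objects.get(key)


class ToyGateway:
    """Presents numbered blocks to servers; hot blocks stay local, cold ones go to the store."""

    def __init__(self, store, cache_blocks):
        self.store = store
        self.cache_blocks = cache_blocks  # how many blocks fit in the local cache
        self.cache = OrderedDict()        # block number -> bytes, least recently used first

    def _key(self, block_no):
        return f"volume0/block-{block_no:08d}"

    def _touch(self, block_no, data):
        # Mark the block as most recently used, then push the coldest
        # blocks out to the object store until the cache fits again.
        self.cache[block_no] = data
        self.cache.move_to_end(block_no)
        while len(self.cache) > self.cache_blocks:
            cold_no, cold_data = self.cache.popitem(last=False)
            self.store.put(self._key(cold_no), cold_data)

    def write_block(self, block_no, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        self._touch(block_no, data)

    def read_block(self, block_no):
        if block_no in self.cache:
            data = self.cache[block_no]
        else:
            data = self.store.get(self._key(block_no))
            if data is None:
                data = bytes(BLOCK_SIZE)  # never-written blocks read back as zeros
        self._touch(block_no, data)
        return data
```

With a two-block cache, writing blocks 0 through 3 leaves blocks 2 and 3 local and blocks 0 and 1 in the store; reading block 0 afterwards pulls it back transparently and evicts block 2. The server on the other end of the connection never knows the difference.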

Many companies make these gateway devices or include the functionality in their storage systems, among them Amazon, Microsoft, CTERA, Riverbed, EMC, IBM, F5, Twinstrata, Barracuda, Nasuni, and Panzura. I've linked to all of these so you can examine their offerings closely.

Obviously, Amazon and Microsoft have products that are optimized for their own clouds. Amazon's is provided as a free VM that runs on your on-premises VMware ESX or Microsoft Hyper-V systems, and Microsoft's StorSimple comes as a physical appliance in three configurations, each containing a mix of SSD and SAS disk.

All of these solutions, including the cloud-agnostic ones listed above, can be used not only to cache and front-end your on-premises data and transparently offload and retrieve the infrequently accessed stuff to and from cloud storage, but also for disaster recovery scenarios.

Many of these appliances have snapshotting capability and essentially act as virtual tape libraries.

If your datacenter has a catastrophic failure, you can use another appliance/gateway at another location to remotely restore that data to a set of servers from that cloud storage.

This is also the part of the article where I tell you that I work for a company that owns a cloud and makes said gateway devices (Microsoft/StorSimple).

But you knew that already, so I'm not going to recommend anything in particular, but I will tell you what questions to ask your vendor so you get the functionality that you want. Here's a whole bunch:

  • What's the capacity/scale of the solution; i.e., how much can be cached or stored locally on a per-volume (LUN) basis, and what is the maximum number of volumes that you can store per VM?

  • Can you do local snapshots? Can you do cloud-based snapshots?

  • Can you do incremental snapshots with storage optimization?

  • Is the restore process WAN optimized?

  • Do you provide application consistency for your data protection (i.e., VSS integration for enterprise services and databases)?

  • Do you de-duplicate the primary storage and the snapshots?

  • How do you do data encryption to and from the cloud provider?

  • Do you supply a high-availability architecture for your gateway device?

  • Do you support multipath I/O (MPIO)?

  • Does your appliance support non-destructive upgrades?

  • Do you have an SLA for local storage performance on the appliance?

  • Is the gateway plug and play and self-contained?

  • Is the gateway certified to run as a VM on my hypervisor or platform of choice (VMware, Microsoft Hyper-V, KVM, Xen, Unix)?
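
If the questions about incremental snapshots and de-duplication sound abstract, here is a rough Python sketch of what you are asking for. Everything in it is an assumption made for illustration: `DedupSnapshotStore` is a hypothetical class, the four-byte fixed-size chunks are absurdly small so the example is easy to follow, and real products use their own chunking and indexing schemes. The idea is that each snapshot is just an ordered list of chunk fingerprints, so only chunks the store has never seen need to cross the WAN.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


def chunks(data, size=CHUNK_SIZE):
    """Split a volume image into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class DedupSnapshotStore:
    """Hypothetical snapshot target that stores each unique chunk only once."""

    def __init__(self):
        self.chunk_store = {}  # SHA-256 hex digest -> chunk bytes
        self.snapshots = {}    # snapshot name -> ordered list of digests

    def snapshot(self, name, volume_bytes):
        """Record a snapshot and return how many new chunks had to be uploaded."""
        uploaded = 0
        manifest = []
        for chunk in chunks(volume_bytes):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunk_store:
                self.chunk_store[digest] = chunk  # only unseen chunks are sent
                uploaded += 1
            manifest.append(digest)
        self.snapshots[name] = manifest
        return uploaded

    def restore(self, name):
        """Rebuild the volume image from a snapshot's list of digests."""
        return b"".join(self.chunk_store[d] for d in self.snapshots[name])
```

Snapshot a volume containing `b"AAAABBBBAAAACCCC"` and three chunks get uploaded, because the repeated `AAAA` is stored once. Change the last four bytes to `DDDD` and snapshot again, and only one new chunk goes up, yet both snapshots restore in full. That is the behavior to press your vendor on.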

Are you planning on bringing cloud-integrated storage using a gateway appliance into your IT environment? Talk back and let me know.
