Hybrid storage: Taking advantage of the cloud

The way we think about storage is changing. Starting from a world of direct-attached storage (DAS), we moved to using storage area networks (SANs), before virtualising our storage and not worrying about what physical disks we were using. Now we have the cloud, and with it cloud-hosted storage, and the way we store data is changing again.
The economics of the cloud make it possible to rethink how we store data, taking advantage of its scalability, and the economies of scale that are offered by its massive data centres. If you need an extra terabyte or two hundred, is it worth purchasing additional racks and disks now, or just storing that data in the cloud?
New design patterns for cloud infrastructures are changing the way we think about storage, giving us the option of using the cloud to extend our existing storage environments -- through new storage hardware, through cloud platforms, or through features built into our software.
Working with cloud storage as a new tier in a storage virtualisation model makes a lot of sense. Hyperscale clouds like Microsoft's, Google's and Amazon's have a buying power that's beyond that of even the largest IT department, and that's allowed them to invest significantly in storage -- both in SSD and in HDD. They've also built large global networks, storing data multiple times across several geographic regions, providing data protection for applications and services running on their platforms.
Amazon Web Services (AWS) offers a range of storage options that make it a useful extension of your existing storage platform. Its Glacier service is slow to retrieve data from, but it makes sense as part of a backup and archiving strategy, providing long-term cold storage for non-urgent data that needs to be kept for a long time. While it's not strictly a hybrid storage environment, it's available as a cloud-hosted backup tier in some backup packages. Cloud-based backup tools like Backblaze or Mozy are focused on single PCs, although Microsoft has made Azure a backup option for Windows Server.
StorSimple & Azure
Uploading data to the cloud can be slow. That's where hybrid storage devices like Microsoft's StorSimple appliances come into play. Looking like just another storage area network device to your applications, they're able to extend your existing storage into Azure's storage platform. The appliance is best thought of as a local cache, holding the most recent and most frequently accessed data. Once data is on a StorSimple, it's uploaded to an Azure-hosted store -- with changes reflected from the local cache to the cloud.
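To make the caching idea concrete, here's a minimal Python sketch of a small local cache sitting in front of a much larger cloud store. It's an illustration only, not how StorSimple itself is implemented: the HybridVolume class, the least-recently-used eviction policy, the immediate upload on every write, and the use of a plain dictionary as the "cloud" are all assumptions made for the example.

```python
from collections import OrderedDict


class HybridVolume:
    """Toy model of a caching appliance: a small local cache over a cloud store."""

    def __init__(self, cache_blocks):
        self.cache_blocks = cache_blocks   # how many blocks fit on the local tier
        self.local = OrderedDict()         # block id -> data, kept in LRU order
        self.cloud = {}                    # stand-in for the cloud-hosted store

    def write(self, block_id, data):
        self.cloud[block_id] = data        # the change is reflected to the cloud
        self._cache(block_id, data)

    def read(self, block_id):
        if block_id in self.local:         # fast path: served from the local cache
            self.local.move_to_end(block_id)
            return self.local[block_id]
        data = self.cloud[block_id]        # slow path: fetch from the cloud copy
        self._cache(block_id, data)
        return data

    def _cache(self, block_id, data):
        self.local[block_id] = data
        self.local.move_to_end(block_id)
        while len(self.local) > self.cache_blocks:
            self.local.popitem(last=False)  # evict the least recently used block


volume = HybridVolume(cache_blocks=2)
for i in range(3):
    volume.write(i, f"block-{i}".encode())

print(sorted(volume.local))  # only the two most recently used blocks stay local
print(volume.read(0))        # block 0 comes back from the cloud copy
```

Evicting a block from the local tier is safe here only because every write has already reached the cloud store; a real appliance has to handle uploads that are still in flight.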
You don't need to know much about storage to get started with StorSimple, as everything is controlled from the Azure Management Portal. You can use this to configure the appliance, and to manage storage snapshots for backup and disaster recovery. The appliance uses iSCSI, so it works with more than just Windows. You can also work with Linux, or use it as a host for VMware virtual disks. The appliance uses a mix of SSD and HDD for automated data tiering, with the cloud store as the final tier -- giving you up to 500TB of storage in just a few rack units (U) of space. If space is an issue, then this is an approach well worth considering.
One advantage of StorSimple is its cloud controller. If you need to access your stored data in an Azure application, you can use the cloud controller virtual appliance to connect your storage to your cloud infrastructure. It's a model that works well as part of a cloud-hosted disaster recovery system, as well as a way of bringing data from on-premises sources to cloud applications. You can also use a StorSimple appliance in another data centre to access your cloud data, helping share data between business units and sites. It's not the cheapest storage option, but it's certainly one of the most flexible.
Other hybrid cloud storage solutions
You can also get hybrid cloud storage from your private cloud management tools. VMware's vCloud, for example, includes tools to help you build hybrid storage solutions, working with both Cloud Foundry and OpenStack. While it's focused on private clouds, there's an option of working with service-provider-hosted OpenStack services. Building your own hybrid cloud can be more complex than buying off the shelf, but it's certainly a more flexible option.
Applications and cloud storage
Things get more interesting when we start to mix cloud storage with applications. Microsoft's next SQL Server release is adding support for Azure storage, using it for what Microsoft is calling a 'stretch database'. Taking the tiering model to heart, SQL Server 2016 can take an on-premises set of tables and move the older data to Azure -- while still treating it as part of the database. That way you get fast access to recent and relevant data, while older data that would traditionally have been archived remains accessible at all times. Support for encryption means that data can only be accessed by a local SQL Server instance that holds the encryption keys.
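The idea behind a stretched table can be sketched in a few lines of Python. This is not SQL Server's Stretch Database syntax or API: it uses two in-memory SQLite databases as stand-ins for the on-premises and cloud-hosted halves, and the orders table, its columns, the sample rows and the stretch and all_orders helpers are all hypothetical names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                     # stand-in for the on-premises database
conn.execute("ATTACH DATABASE ':memory:' AS remote")   # stand-in for the cloud-hosted part

conn.execute("CREATE TABLE main.orders (id INTEGER PRIMARY KEY, placed TEXT, total REAL)")
conn.execute("CREATE TABLE remote.orders (id INTEGER PRIMARY KEY, placed TEXT, total REAL)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?, ?)",
    [(1, "2013-02-11", 40.0), (2, "2014-07-30", 15.5), (3, "2015-11-02", 99.0)],
)


def stretch(conn, cutoff):
    """Move rows placed before `cutoff` (an ISO date string) to the remote table."""
    with conn:  # copy and delete are committed together, or not at all
        conn.execute(
            "INSERT INTO remote.orders SELECT * FROM main.orders WHERE placed < ?",
            (cutoff,),
        )
        conn.execute("DELETE FROM main.orders WHERE placed < ?", (cutoff,))


def all_orders(conn):
    """Query both locations as if they were a single table."""
    return conn.execute(
        "SELECT id, placed, total FROM main.orders "
        "UNION ALL SELECT id, placed, total FROM remote.orders ORDER BY id"
    ).fetchall()


stretch(conn, "2015-01-01")
print(conn.execute("SELECT COUNT(*) FROM main.orders").fetchone()[0])  # rows still held locally
print(all_orders(conn))                                                # the full history is still queryable
```

The point of the sketch is the shape of the pattern: the application keeps asking one question of one logical table, while the storage layer decides which rows live where.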
End users get the benefits of hybrid storage with cloud file synchronisation tools. Save a file in a Dropbox, Box or OneDrive folder, and you have a local copy with a copy in the cloud. New APIs in Office 365 mean Office applications are able to work with multiple cloud storage providers, allowing files to be shared with whatever service businesses or individuals prefer to use.
You can also construct your own hybrid cloud storage. APIs in cloud platforms mean it's possible to wrap RESTful storage calls in your own applications for anywhere access. You can also wrap a RESTful storage API as a storage driver, making it look like a file system endpoint, ready for use. Bear in mind that a hybrid storage architecture will always have a lot more latency than an on-premises storage network.
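As a rough sketch of what that wrapping looks like, the Python below maps file-style read, write and delete calls onto HTTP GET, PUT and DELETE. Everything here is an assumption made for illustration: RestStore is a hypothetical wrapper, the URL layout doesn't match any particular provider's API, there's no authentication, and FakeObjectService is a throwaway in-memory server that stands in for the cloud so the example runs on a single machine.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RestStore:
    """File-style read/write/delete calls mapped onto HTTP GET/PUT/DELETE."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        # Ignore any system proxy settings so the local demo talks to 127.0.0.1 directly.
        self._opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))

    def _call(self, method, path, data=None):
        request = urllib.request.Request(f"{self.base_url}/{path}", data=data, method=method)
        with self._opener.open(request) as response:
            return response.read()

    def write(self, path, data):
        self._call("PUT", path, data)

    def read(self, path):
        return self._call("GET", path)

    def delete(self, path):
        self._call("DELETE", path)


class FakeObjectService(BaseHTTPRequestHandler):
    """In-memory stand-in for a cloud object store, for local testing only."""

    objects = {}  # shared by all requests: URL path -> stored bytes

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        self.objects[self.path] = self.rfile.read(length)
        self._reply(201)

    def do_GET(self):
        if self.path in self.objects:
            self._reply(200, self.objects[self.path])
        else:
            self._reply(404)

    def do_DELETE(self):
        self.objects.pop(self.path, None)
        self._reply(200)

    def _reply(self, status, body=b""):
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


server = HTTPServer(("127.0.0.1", 0), FakeObjectService)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

store = RestStore(f"http://127.0.0.1:{server.server_port}")
store.write("reports/q1.txt", b"quarterly numbers")
print(store.read("reports/q1.txt"))
store.delete("reports/q1.txt")

server.shutdown()
server.server_close()
```

Every call in this wrapper is a full network round trip, which is why real storage drivers put a local cache in front of calls like these.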
As all you're presented with is a familiar storage protocol, whether it's CIFS or iSCSI, you don't need to know what the backend storage is. It might be fast storage in Azure or slow AWS Glacier storage, or something new, like the cold storage arrays Facebook is developing. What matters is the files you're storing, the local cache you have, and the speed of your connection to your cloud storage.
Outlook
What's becoming clear is that cloud storage is not an alternative to on-premises storage. Instead, it's becoming a way of extending what you already have, without having to invest in new hardware. Making storage a hybrid of capex and opex is part and parcel of the process, as it allows organisations to keep track of who is storing what, and where, with appropriate internal billing. Storage virtualisation means it's easy to deploy hybrid storage, as it looks like any other node in a storage network.
Unless you've invested in a dedicated high-speed connection to a cloud data centre, like Azure's ExpressRoute MPLS, you're unlikely to be using hybrid cloud storage as a live store for your applications. Instead you'll need to think of it as an additional tier in your storage architecture, or as a way of transferring data from on-premises applications to cloud applications -- or even as a disaster recovery solution. There's also the added bonus of storing data where it can be accessed from branch offices, or by staff on the road, anywhere in the world.
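A quick back-of-the-envelope calculation shows why the connection matters so much. The Python sketch below estimates how long it takes to move a terabyte over links of different speeds. The transfer_hours helper and the 80 per cent efficiency figure are assumptions for illustration, and the link speeds are generic examples rather than the speeds of any particular service.

```python
def transfer_hours(terabytes, megabits_per_second, efficiency=0.8):
    """Hours needed to move `terabytes` of data over a link of the given speed.

    `efficiency` is an assumed fraction of the nominal bandwidth that is
    actually available for payload data.
    """
    bits = terabytes * 1e12 * 8                          # decimal terabytes to bits
    usable_bits_per_second = megabits_per_second * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


for mbps in (100, 1_000, 10_000):
    print(f"{mbps:>6} Mbit/s: {transfer_hours(1, mbps):6.1f} hours per TB")
```

Under those assumptions a single terabyte ties up a 100Mbit/s link for more than a day, which is fine for a backup or archive tier but not for a live application store.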