Cloud Storage is one of the hottest topics in the enterprise space, particularly as organizations start to face mounting SAN, NAS and on-premises server data sprawl.
In a previous article I talked a bit about cloud integrated storage appliances and how they can help you offload some of that data, whether you are using it as backup or direct-access storage.
However, cloud storage appliances and gateways installed within the enterprise datacenter are only half of the hybrid cloud equation -- the other half is the public or private cloud provider.
In this gallery we'll round up the top public and private clouds, concentrating solely on their storage offerings. Obviously, all of these providers also have self-service IaaS and other web services to pair with their storage offerings, but I've authored this with a hybrid or on-premises data offload scenario in mind.
Due to the volatility of cloud storage pricing between vendors, I've decided not to focus too much on who is cheapest. It's clear that this industry is becoming highly commoditized, and we're in a race to the bottom when it comes to paying for raw terabytes -- so feature sets are where each of these players differentiates.
Amazon Web Services' S3 (Simple Storage Service) may not have been the first enterprise cloud storage offering on the market, but there's no question that the Seattle, Washington-based company is the 800-pound gorilla of the cloud in both market penetration and branding.
Although S3 is commodity-priced like many of the other services on this list, and is the pace-setter for price reductions at its competitors, the company has managed to present a diverse offering that addresses numerous market segments and use cases that most of its competitors have difficulty matching.
S3's "blob" storage has become an industry-standard offering and comes in locally-redundant and geo-redundant formats, with the geo-redundant option being more expensive. Amazon also has a Reduced Redundancy Storage option for applications that don't need such a high level of protection, which it is often able to offer at a lower price than its main competitors' equivalents.
A single AWS account can have up to 100 S3 "buckets" or containers, each of which has no limit on the number of objects stored, with a 5TB size limit per object. So there's no practical limit on how much data you can store in Amazon's cloud. However, if you've got multiple-petabyte requirements, maybe you should be talking to Jeff Bezos directly.
In addition to blob storage, which is random-access and limited only by the speed of your internet connection to Amazon's datacenters (9 regions total, with 3 in the US), the company offers Glacier. Glacier sacrifices RTO (Recovery Time Objective) -- restores can take hours, and restore (data egress) costs are relatively higher -- in exchange for being half the price of standard blob/bucket storage.
Because of the high RTO relative to blob, Glacier is positioned towards long-term "vaulting" of data and should be considered more like a cloud-based tape drive rather than a random access storage service.
Both S3 and Glacier are accessed through Amazon's S3 API or through products and front-ends that can interface with it, such as Cloud Integrated Storage gateways and appliances. Eucalyptus, for example, along with a number of smaller private cloud storage services, uses Amazon's APIs for compatibility.
S3 supports four separate access control mechanisms: Identity and Access Management (IAM) policies, bucket-level policies, Access Control Lists and query string authentication. It also supports encryption over SSL in transit, server-side encryption (including SSE-C with customer-provided keys), and client-side encryption through private libraries. S3 also supports audit logs, versioning, and Multi-Factor Authentication combined with versioning.
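Query string authentication, for example, lets you hand out time-limited read access to a single object without sharing your account credentials. Here's a minimal sketch of the legacy (Signature Version 2) presigned-URL scheme of the era; the bucket name, key and credentials below are placeholders:

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote

def presign_s3_url(bucket, key, access_key, secret_key, expires_in=3600):
    """Build a legacy (Signature Version 2) query-string-authenticated S3 URL.

    Anyone holding the resulting URL can GET the object until the expiry
    time, without ever seeing the account's secret key.
    """
    expires = int(time.time()) + expires_in
    # The string to sign covers the HTTP verb, the expiry, and the resource path.
    string_to_sign = f"GET\n\n\n{expires}\n/{bucket}/{key}"
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode(), safe="")
    return (f"https://{bucket}.s3.amazonaws.com/{key}"
            f"?AWSAccessKeyId={access_key}&Expires={expires}"
            f"&Signature={signature}")
```

A caller would then hand the returned URL to, say, a browser or a download script, which needs no AWS credentials of its own.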
Although Amazon S3 is the industry cloud storage giant and has gone broad-spectrum with its commoditized offerings, Microsoft is clearly committing heavily to its own public cloud and its storage technology with Azure.
Azure handles storage differently from Amazon, using its own set of web-based, platform-agnostic HTTP/HTTPS REST APIs as well as multiple types of storage optimized by use case and workload.
Blob storage, which is similar to Amazon's S3 bucket/object-based product, is limited to 50 storage accounts per subscription, with up to 500 terabytes per account.
However, unlike Amazon, a single Microsoft Live ID can have multiple Azure subscriptions attached to it, so there's no real limitation. Again, if you have needs in the multiple-petabyte range, you'll likely want to talk with Microsoft directly.
As a result of a $15B investment in infrastructure, Microsoft currently operates Azure datacenters in 15 regions (8 in the US) on four continents and has multiple availability options.
With Locally Redundant Storage (LRS), three copies of your data are stored locally within the storage account's primary region.
Zone-Redundant Storage (ZRS) also maintains three copies of your data, but replicates them across two to three facilities, either within a single region or across two regions, providing higher durability than LRS.
If you need the ultimate in data protection, there's Geo-Redundant Storage (GRS), which replicates your data to a secondary region 250+ miles from the primary but within the same geography -- six copies of the data in total.
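The three redundancy tiers above can be summarized in a small lookup table. This sketch is purely illustrative -- the table and helper are mine, not part of any Azure SDK -- with the copy counts and scopes taken from the descriptions above:

```python
# Replication characteristics of Azure's three redundancy tiers, as
# described in the text above. Illustrative only; not an Azure API.
AZURE_REDUNDANCY = {
    "LRS": {"copies": 3, "scope": "single facility in the primary region"},
    "ZRS": {"copies": 3, "scope": "two to three facilities, one or two regions"},
    "GRS": {"copies": 6, "scope": "primary region plus a secondary 250+ miles away"},
}

def copies_for(tier: str) -> int:
    """Return how many total copies of the data a given tier maintains."""
    return AZURE_REDUNDANCY[tier]["copies"]
```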
In addition to regular blob storage, Azure differentiates itself from S3 with Drives (block-based storage for IaaS VMs), Queues, and SMB File Shares. The last of these is currently in preview and is designed to support legacy apps in "lift and shift" scenarios that cannot easily be rearchitected to use more modern cloud storage programmatic interfaces.
Like Amazon, Azure also offers a NoSQL solution as Table storage.
While Azure is probably the best-in-class storage service for Windows-based systems, Microsoft software architectures and .NET, the likes of Java, Android, C++ and Node.js are also fully supported as first-class citizens via supplied client libraries.
Google, not to be outdone by Amazon or Microsoft, has also entered the cloud storage fray.
The value proposition for Google is that it has long been the largest cloud player, although not necessarily for the enterprise. There's no question that the company has massive datacenter infrastructure, which it uses to serve consumer web services such as Gmail, YouTube and, of course, its own search engine.
Now it is leveraging that same infrastructure to provide cloud storage. Similar to Amazon and Microsoft, it uses a web-based "RESTful" API, which it calls the XML API.
Google's blob storage offerings are less stratified than those of Amazon or Microsoft -- it offers only "Standard" storage, or Durable Reduced Availability (DRA) storage for projects that are more cost-sensitive. Standard or DRA is enabled at the bucket/container level.
Unlike Amazon and Azure, Google does not give the user explicit regional control. The company is currently experimenting with "regional buckets" that allow you to colocate your storage in the same region as Google's Compute instances; however, the feature is limited to DRA, and the company's Cloud Storage SLA doesn't yet apply to it.
The prime differentiator for Google Storage is that the service can do chunked encoding as well as resumable uploads, and naturally it is probably your best choice if you also use Google's App Engine and Compute Engine. Like Azure and Amazon S3, Google offers a NoSQL table-style datastore.
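To illustrate the resumable-upload mechanics: the client PUTs a file in successive chunks, each carrying a Content-Range header of the form "bytes start-end/total". A minimal sketch of how a client might carve up a file and compute those headers (the helper name is mine, not part of any Google SDK):

```python
def resumable_chunks(total_size: int, chunk_size: int = 256 * 1024):
    """Yield (start, end, content_range) for each chunk of a resumable upload.

    Each tuple describes one PUT request: the inclusive byte range it
    covers and the Content-Range header value to send with it. Google's
    protocol expects non-final chunks to be a multiple of 256 KB.
    """
    start = 0
    while start < total_size:
        end = min(start + chunk_size, total_size) - 1  # inclusive end byte
        yield start, end, f"bytes {start}-{end}/{total_size}"
        start = end + 1
```

If an upload is interrupted, the client can ask the server which bytes it has received and resume from the first missing offset rather than restarting from zero -- that is the whole point of the scheme.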
Google also classifies its MySQL database service as storage, whereas AWS and Azure refer to theirs as databases and have specific offerings as such.
Of the cloud storage providers on our list, Oracle, a giant in the enterprise server, Java and RDBMS businesses, is actually the newest -- its offering has only existed since April 2014.
If you are looking for a huge competitive swath of features from the Silicon Valley enterprise juggernaut, you're not going to find it. Oracle's Cloud Storage service is pretty barebones, although it is reasonably priced, starting at $30 per month per terabyte.
As of this writing, both Amazon S3 and Azure were a few dollars a month cheaper at the 1TB level, and Oracle's 450TB pricing is the same as Azure's 100TB LRS pricing.
Oracle Storage Cloud does have the distinction of being able to do end-to-end encryption using 2048-bit key pairs, but only if you use its Java-based client rather than its RESTful web services API.
Right now, Oracle offers only blob containers using locally redundant storage -- roughly analogous to Azure's lowest-tier LRS offering -- with three replicated copies of your data across three different nodes within a single datacenter.
Interestingly enough, Oracle's own marketing material makes a point that your data never leaves the datacenter in your region. In data sovereignty-conscious environments this could be seen as an advantage, but it also means you can't get geo-redundancy.
It should also be noted that the service is only available in North America at this time.
Although Oracle states there are no limits on how many objects you can store, Oracle's storage cloud does not work on a "pay as you go" basis like Amazon, Azure and Google Storage do. Instead, when purchasing the Oracle Storage Cloud Service, the buyer must specify how much storage capacity is required. Users of the service instance cannot store more data than originally purchased, although the buyer can increase a service instance's storage capacity at any time.
Additionally, Oracle Storage objects are limited to 5GB in size, which is significantly smaller than the object size limits of either S3 or Azure. Per the FAQ, Oracle's storage cloud requires data to be chunked when it is uploaded:
Files of any size can be uploaded to the Oracle Storage Cloud Service. A single object in Oracle Storage Cloud Service can be as large as 5GB. To store files larger than 5GB, simply segment the original file into sizes of 5GB or less and upload the segments following a defined naming convention. Then create a new manifest object to represent all of the pieces of the original file. The resulting file can then be downloaded as a single file and is identical to the original file.
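A rough sketch of that segment-and-manifest convention follows. The naming scheme and helper here are illustrative only; the FAQ's requirement is simply that segments follow a defined naming convention and that a manifest object ties them together:

```python
SEGMENT_LIMIT = 5 * 1024**3  # the 5GB per-object cap described in the FAQ

def segment_names(object_name: str, total_size: int,
                  limit: int = SEGMENT_LIMIT) -> list[str]:
    """Produce ordered segment names for a file larger than the per-object cap.

    A client would upload each byte range under its segment name, then
    create a manifest object listing the segments so the service can serve
    them back as a single logical file. The "name/seg-NNNN" pattern is a
    hypothetical convention chosen so the segments sort correctly.
    """
    count = -(-total_size // limit)  # ceiling division: number of segments
    return [f"{object_name}/seg-{i:04d}" for i in range(1, count + 1)]
```

Note that a 12GB upload would need three segments under the real 5GB cap; the test below shrinks the limit so the same arithmetic can be checked with tiny numbers.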
Why would you look into Oracle's storage cloud as opposed to anyone else's? Probably because you already have a standing customer relationship with the company, it is incentivizing you to do so, and its backup service for its RDBMS might have some value as the preferred integrated solution.
If you're looking to build your own private cloud, and you are inclined to use Open Source software, chances are, you've probably heard of OpenStack.
OpenStack is a project that was originally formed as a partnership between Rackspace and NASA, but is now managed by the OpenStack Foundation.
While OpenStack is geared more towards building private clouds, an increasing number of public and private cloud providers use the platform to provide cloud services, including cloud storage.
The two largest players in the public cloud OpenStack space are Rackspace and HP Helion. HP also offers the branded version of OpenStack that it uses in its public cloud as a "distribution" you can install on-premises; Mirantis offers a distribution as well.
OpenStack offers two types of cloud storage: object/container-based storage, referred to as "Swift", and block storage, called "Cinder". Cinder is mostly used for mountable virtual hard drives attached to VMs running under IaaS within an OpenStack-based environment.
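As a taste of Swift's API, here's a minimal sketch of its TempURL mechanism, which signs a time-limited URL for a single object using a pre-shared account key -- Swift's rough analogue to S3's query string authentication. The account and container names below are placeholders:

```python
import hashlib
import hmac
import time

def swift_temp_url(path: str, key: str, method: str = "GET",
                   ttl: int = 3600) -> str:
    """Sign a time-limited Swift object URL per the TempURL middleware.

    `path` is the object path as Swift exposes it, e.g.
    "/v1/AUTH_myaccount/container/object"; `key` is the account's
    pre-shared X-Account-Meta-Temp-URL-Key value.
    """
    expires = int(time.time()) + ttl
    # TempURL signs the verb, the expiry, and the object path with HMAC-SHA1.
    body = f"{method}\n{expires}\n{path}"
    sig = hmac.new(key.encode(), body.encode(), hashlib.sha1).hexdigest()
    return f"{path}?temp_url_sig={sig}&temp_url_expires={expires}"
```

The signed path is then appended to the cluster's public endpoint and handed to the recipient, who needs no OpenStack credentials to fetch the object.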
Perhaps you've architected a cloud-based application that uses Amazon S3, but you'd like to move it to a compatible private cloud of your own design, or one hosted by a service provider. For people like you, there's Eucalyptus.
Like OpenStack, Eucalyptus is an Open Source project, but with a key difference -- its APIs are compatible with Amazon Web Services. If you started off with a project at Amazon, but want to migrate off of it to your own premises (or to a managed hosting scenario) and maintain that level of API and binary VM image compatibility, this is your best bet.
What's most notable about Eucalyptus, other than its AWS compatibility, is that Hewlett-Packard -- a significant OpenStack contributor that also maintains an OpenStack-based public cloud -- just bought the company for a rumored $100M.
In the long term, that may mean HP's Helion OpenStack public cloud gets Amazon S3 compatibility.
But that also means that HP now has two Open Source cloud stacks, and whether it stays 100 percent compatible with OpenStack's "pure" implementation remains to be seen.