Rick Vanover of Veeam recently had me on as a guest for his community podcast to talk about Storage Virtualization and the use of commodity server technologies that are transforming the modern datacenter.
As usual, Rick and I like to go off on tangents. Due to my cloud and service provider technology background at Microsoft, we talked a bit about Windows Server 2012 R2 Storage Spaces, as well as converged networking technologies from vendors like Cisco.
That being said, I want to point out that the overall concept of Storage Virtualization/Software Defined Storage and using commodity disks to build out your datacenter is probably the most vendor-agnostic and anti lock-in approach for enterprises as well as service providers, going forward.
Early this year, ZDNet contributor Robin Harris published an article, "Why SSDs are obsolete". He makes a number of valid points: the predominant storage bus technologies (SAS and SATA) desperately need a makeover.
After all, current generation SAS and SATA bus interfaces are simply evolutionary improvements on the SCSI and ATA interface technologies that preceded them decades earlier.
Modern server technology has outpaced them, as has our ever-increasing demand for data-driven applications.
As Robin posits, should the industry be thinking about modernizing our fundamental storage bus/backplane technology? Heck yes.
But these are purely academic issues when the real problem facing enterprises isn't so much getting IOPS out of spindles and SSDs for the most demanding data-driven applications, but the total cost of ownership of that storage.
In the largest enterprises, data sprawl continues to be more and more of a concern, especially in regulated industries that require long-term data retention and need to keep that data online and ready to be accessed, rather than on archival storage technologies such as tape or even Cloud Integrated Storage.
Let's get right to it. Your enterprise SAN and NAS hardware is super expensive, and it probably accounts for at least one third of your datacenter hardware spend. And while the vendors of these products have strong reputations and you rely on these frames to store your most mission-critical data, let's face it -- there isn't a lot of secret sauce inside these refrigerator-sized boxes filled with hard drives.
Inside one of these chassis, you've got SAS and SATA backplanes, storage fabric interfaces, controllers, and some special software that performs the partitioning of the trays for LUN presentation. Most of these controllers run some proprietary version of UNIX or a BSD derivative, which you never get to see -- the storage vendor keeps you out of it, leaving you with their utility software.
For the most part, a SAN or NAS frame is a black box to you after you have chopped up the trays and presented a LUN to a server.
There is an end to this expensive madness, but it requires some creative and disruptive thinking on the part of an enterprise -- and that means approaching storage just like a hyperscale service provider does.
And when I say hyperscale, I'm talking about the Amazon Web Services, the Microsoft Azures and the Google Compute Engines of the world.
You don't think these companies manage to price out rock-bottom cloud storage using EMCs and NetApps, do you? Of course not. They'd go broke if they did.
Instead of SANs and NAS appliances, hyperscale providers have largely been using JBODs -- effectively, pools of "Just a Bunch of Disks" -- for years. Unlike SMBs or enterprises, these hyperscale cloud providers created their own storage architectures using commodity parts, JBODs and homegrown engineering experience in order to bring their storage costs down way below those of their enterprise counterparts.
For the most part, the "secret sauce" for building a commodity storage architecture has not been standardized, and it's also been hampered by a continued reliance on Fibre Channel HBAs to provide the clustered storage fabric, which was still expensive.
The concept is fairly simple -- connect multiple file server "heads" to your existing switched Ethernet fabric, and use multiple JBODs (such as those made by DataOn, Dell, or Supermicro) filled with a combination of 15K SAS drives and SSDs in a tiered configuration, connected to those heads in a clustered SAS formation.
The file server heads in turn provide your virtualization hosts or physical systems with connectivity to your data using SMB 3.0. The resiliency is built into the operating system providing the storage itself, rather than some proprietary secret sauce built into the controllers within the storage frame, as in the case of a SAN or NAS.
In my example above, I'm using Microsoft's Scale-out File Server (SoFS), which comes built in at no extra cost with Windows Server 2012 R2 and is based on Microsoft's Storage Spaces, the software-defined storage that is built into the OS. The storage hardware used in this scenario is DataOn's DNS-1660D in combination with commodity Dell R820 rackmount servers and Mellanox RDMA cards.
The configuration described in the whitepaper linked above is capable of achieving sustained speeds of over 1 million IOPS.
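To put that IOPS figure in network terms, here is a rough back-of-the-envelope sketch in Python. The I/O sizes (4 KiB and 8 KiB) and the function names are my own assumptions for illustration, not figures from the whitepaper, and the math ignores protocol overhead entirely.

```python
import math


def throughput_gbps(iops: int, block_bytes: int) -> float:
    """Convert an IOPS figure into gigabits per second for a given I/O size."""
    return iops * block_bytes * 8 / 1e9


def links_needed(iops: int, block_bytes: int, link_gbps: float = 10.0) -> int:
    """Minimum number of links of the given speed, ignoring protocol overhead."""
    return math.ceil(throughput_gbps(iops, block_bytes) / link_gbps)


if __name__ == "__main__":
    iops = 1_000_000  # the sustained figure quoted above
    # Hypothetical I/O sizes: the article does not state the block size used.
    for block_bytes in (4096, 8192):
        gbps = throughput_gbps(iops, block_bytes)
        links = links_needed(iops, block_bytes)
        print(f"{block_bytes} B I/O: {gbps:.1f} Gbps, at least {links} x 10GbE links")
```

Even at small block sizes, a million IOPS adds up to tens of gigabits per second across the file server heads, which is why the network fabric matters as much as the disks in this kind of design.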
Dell itself has also published its own whitepaper on how to build a SoFS using its MD1220 PowerVault JBOD arrays, but conceivably, any combination of JBOD and commodity x86 server hardware using SAS and 10Gbps Ethernet connectivity following these basic architectural guidelines should work.
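If you want to size a layout like this yourself, the arithmetic is simple enough to sketch in Python. Everything below is hypothetical: the drive counts, drive sizes and JBOD count are placeholders rather than the whitepaper configuration, and I'm assuming a plain n-way mirror with no allowance for hot spares, metadata or cache overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """One storage tier, as populated in each JBOD enclosure."""
    name: str
    drives_per_jbod: int
    drive_gb: int


def pool_capacity(tiers, jbod_count, mirror_copies=2):
    """Return {tier name: (raw GB, usable GB)} for a mirrored pool.

    Usable capacity is simply raw capacity divided by the number of
    mirror copies; real pools lose a little more to overhead.
    """
    capacity = {}
    for tier in tiers:
        raw_gb = tier.drives_per_jbod * tier.drive_gb * jbod_count
        capacity[tier.name] = (raw_gb, raw_gb / mirror_copies)
    return capacity


# Hypothetical layout: placeholder numbers, not a vendor reference design.
tiers = [
    Tier(name="ssd", drives_per_jbod=12, drive_gb=400),
    Tier(name="sas_15k", drives_per_jbod=48, drive_gb=600),
]
capacity = pool_capacity(tiers, jbod_count=3, mirror_copies=2)

for name, (raw_gb, usable_gb) in capacity.items():
    print(f"{name}: raw {raw_gb} GB, usable {usable_gb:.0f} GB")
```

Plug in your own drive counts and the street price of the parts, and you have the starting point for a cost-per-usable-terabyte comparison against your current SAN or NAS quote.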
Besides Microsoft's SoFS and Storage Spaces, there are other vendors which provide similar JBOD-based storage architectures, such as Nexenta (which is based on Solaris' ZFS). For Linux, there are HA-LVM and GFS/GFS2, which are provided with Red Hat's Resilient Storage add-on.
On Ubuntu Server, the equivalent is called ClusterStack. If you're looking for a Linux software-defined storage architecture that is a bit more pre-packaged, you might want to examine QuantaStor by OSNEXUS.
The bottom line is that while SANs and NAS are proven and reliable, their hegemony in providing the highest performance and most resilient storage for the dollar is coming to an end.
If you're a smart CxO trying to cut down on your ever-increasing storage costs, start looking into what the hyperscale cloud providers do -- use JBODs and the software-defined storage built into modern server operating systems, and use Cloud Integrated Storage as well for those apps that don't need to leverage near-line storage or can keep archival data in the Cloud.
Has your organization started using software-defined storage using commodity hardware yet? Talk Back and Let Me Know.