Backup and restore: Is it time for a new approach?

Q&A with Chris Wahl, chief technical evangelist at cloud data management company Rubrik.


Start talking data storage and most people's eyes start to glaze over. Backup and restore? Sure, we rely on them, we know they're vital to the security of our systems, but that's about as far as our interest goes.

Cloud data management company Rubrik says the storage world is ready for new ideas. ZDNet recently spoke to Rubrik's chief technical evangelist, Chris Wahl, about how the company's technology revitalises good ol' backups.

ZDNet: Tell me about Rubrik.


Wahl: Rubrik just celebrated its third birthday. We are a blend of talent from the enterprise space and the consumer space brought together to solve specific problems.

Backup and recovery is an enormous, under-served market. It's a $50 billion market that is served by incumbents using software which can be up to 27 years old. And trying to fit that, or dovetail that, into a modern, cloud-first, cloud-native, Agile [system] just doesn't work.

There is no way that people imagined globally orchestrated public clouds that long ago. It is just not technology that serves the needs of today. So the idea for Rubrik was: let's build the software and storage systems that can easily serve businesses at the enterprise level with backup, recovery, and archive, and a number of other data services that fall under the umbrella of cloud data services management.


A typical deployment for a customer is a hardware appliance -- there is nothing special about it -- but it clusters using a shared-nothing, scale-out architecture.

A couple of things make it different from, say, EMC. First, every node within a cluster uses a distributed task scheduler -- a distributed decision-making process -- to ingest data in parallel across pretty much innumerable data sources. That is physical data loads, whether Windows or Linux, VMware, or application-specific workloads like databases.

It can grab all those sources in the nodes and ingest to flash -- we have solved a lot of the stun problems [the brief freeze a virtual machine can experience while a backup snapshot is taken or removed] -- and those are unique differentiators.

ZDNet: Flash technology seems to be coming into its own, don't you think?

Wahl: We are using flash as a component in all our nodes. The software leverages that with a number of our customers. At the end of the day what they see is a box that comes into the datacentre and within an hour it's racked, stacked, and configured, providing the enterprise with all its backup.

All they have to do is create an SLA domain or a policy, which is a very simple set of questions. What's your recovery point objective? What's your recovery time objective? How long do you want to retain the data? Where do you want to shuffle it off to? Where do you want to store your data -- public cloud, object storage, file storage or whatever you choose?

Then you can associate the policy to any piece of data and it does all the rest.
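
To make the idea concrete, here is a minimal sketch of what such a policy might look like as data. It is not Rubrik's API or schema: the class name `SlaPolicy`, its fields, the helper functions, and the object names are all hypothetical, chosen only to mirror the questions Wahl lists (recovery point objective, recovery time objective, retention, and archive target).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical policy record; field names are illustrative only."""
    name: str
    rpo: timedelta        # recovery point objective: maximum tolerated data loss
    rto: timedelta        # recovery time objective: maximum tolerated restore time
    retention: timedelta  # how long each backup is kept
    archive_target: str   # e.g. "public-cloud", "object-storage", "file-storage"


def backup_due(policy, last_backup, now):
    """A new backup is due once the time since the last one reaches the RPO."""
    return now - last_backup >= policy.rpo


def expired(policy, taken_at, now):
    """A backup can be dropped once it is older than the retention period."""
    return now - taken_at > policy.retention


# One policy can be associated with any number of protected objects
# (the object names below are made up for the example).
gold = SlaPolicy("gold", rpo=timedelta(hours=4), rto=timedelta(hours=1),
                 retention=timedelta(days=30), archive_target="object-storage")
assignments = {"vm-web-01": gold, "sql-orders": gold}

now = datetime(2017, 1, 10, 12, 0)
# Five hours have passed since the last backup, which exceeds the 4-hour RPO.
print(backup_due(assignments["vm-web-01"], datetime(2017, 1, 10, 7, 0), now))
```

The point of the sketch is the shape of the design: the administrator declares outcomes once, and scheduling and expiry decisions are derived from the policy rather than configured job by job.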

ZDNet: Presumably this is all straightforward for the IT guys?

Wahl: Yes, but that's just the nuts and bolts. From an openness perspective, we have a full API that is published and documented in a project called Swagger.

If someone launches the interface, which is no harder than using Facebook, they don't need to be very technical to use it. They will be using the same APIs that we make available to customers.

Some companies would charge you for a full API. With some of the vendors, to get their implementation going, you have to wade through a couple of hundred pages and spend a few months on specialised programming training.

ZDNet: What would you name as your headline feature?

Wahl: Simplicity. I don't think that feature exists much in the enterprise. At the end of the day, if it is not simple to use, consume, and restore, who's going to use it? The banner item behind Rubrik is its simplicity.

ZDNet: Is the hardware off-the-shelf?

Wahl: Yes. From a performance perspective, because it is just off-the-shelf hardware, all of the intelligence and the [intellectual property] is in the software. For example, we wrote our own file system -- specifically for grabbing and restoring data -- and we put a lot of metadata into that to make operations predictive in different formats. And there are a lot of tiers mixing flash and disk inside the nodes, so that there is a lot of capacity and a kind of soaking area for 'hot blocks' and mixing data.

The way that we actually grab data is parallelised across all the nodes and it scales out with really no limitation.

And then the way that we store and allow you to get the data out of the file system is quite unique. There is a lot of IP in there. And those are the folks behind us. The lead technician who wrote our file system [Arvind Jain, co-founder and VP engineering] also wrote Colossus for Google, its second-generation file system. He knows how to write for a 100,000-node file system but wrote it so that you can start with three nodes.

ZDNet: How are you achieving such fast growth?

Wahl: You know, the people in the storage market have been abused for a long time. The last major innovation was backup and de-dupe, and that was back in 2006. Other than version releases that add new backup targets, the architecture that remains is really kludgy.

ZDNet: And this all fits in with the cloud too?

Wahl: One part of what's driving the adoption is, yes, it's simple and it solves that backup and recovery problem. But a lot of people subscribe to our vision of data management at cloud scale. That's why we call it 'cloud data management'.
