Virtual data storage: old concept, new uses

Business is about data. Lots of data. We have to find a way to harness the information explosion--541 petabytes of data in just the past year--and virtualization is the key.
Written by Mark Canepa, Contributor
COMMENTARY-- Virtualization isn't new--computer scientists have used the concept for decades--but it is suddenly sparking a much-needed revolution.

Business is about data. It's about finding your unique constituency, getting to know these customers better than your competitors do, and providing better service. That requires data and analysis--and speed. The data must be delivered in milliseconds because it's not just people who are asking for it, but high-speed data-mining applications continually sifting through mountains of the stuff.

Fortunately, the costs of computing power and storage capacity have been coming down steadily, but managing it all is still too complex and costly. Which brings us to virtualization.

We use the term virtualization because although applications need data, they don't necessarily need to know where the data resides physically.

In layman's terms, let's say you have a bunch of folders. There are lots of ways to label them: by date, by content. "Taxes" might be one. Now you've got to store your folders. So you put them in filing cabinets 1, 2, and 3. But you don't want to think in terms of cabinets; you just want fast access to Taxes. Thanks to information technology, all you need is the virtual address, Taxes, even though the physical address may be cabinet 3, folder 17.

And, by the way, the Taxes folder must never be lost, so it's also in cabinet 4, folder 28. And it's also in Cincinnati, cabinet 2, folder 12--your disaster-recovery mirror in case an earthquake, fire, or other calamity strikes your main site.

Virtualization hides all that complexity, so that it seems like you have just one filing cabinet for everything.
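
For readers who think in code, the filing-cabinet analogy can be pictured as a lookup table from a virtual name to one or more physical locations. What follows is only a toy sketch: the `VirtualCatalog` class, the `Location` tuple, and the "main" site label are all hypothetical names made up to mirror the example above, and real storage products work very differently under the hood.

```python
from collections import namedtuple

# Hypothetical physical address: site, cabinet number, folder number.
Location = namedtuple("Location", ["site", "cabinet", "folder"])


class VirtualCatalog:
    """Toy mapping from a virtual name to its physical copies."""

    def __init__(self):
        self._copies = {}  # virtual name -> list of Location

    def store(self, name, location):
        # Record one more physical copy under the same virtual name.
        self._copies.setdefault(name, []).append(location)

    def locate(self, name, failed_sites=()):
        """Return the first recorded copy that is not at a failed site."""
        for loc in self._copies.get(name, []):
            if loc.site not in failed_sites:
                return loc
        raise LookupError(f"no reachable copy of {name!r}")


catalog = VirtualCatalog()
catalog.store("Taxes", Location("main", 3, 17))        # working copy
catalog.store("Taxes", Location("main", 4, 28))        # local duplicate
catalog.store("Taxes", Location("Cincinnati", 2, 12))  # disaster-recovery mirror

primary = catalog.locate("Taxes")
after_disaster = catalog.locate("Taxes", failed_sites={"main"})
```

The caller only ever asks for "Taxes". Which cabinet answers, and whether the main site is even standing, is the catalog's problem.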

Companies have tried to do that physically, starting with one humongous storage box. The problem is nobody can build a box big enough. So pretty soon it's not one box; it's 50 boxes. Suddenly it's not so simple anymore.

What the world is moving to--for basic total-cost-of-ownership reasons--is storage area networks. A SAN is designed to allow you to manage and grow storage as a single pool of resources.

Why is that important? Here's a simple example: Let's say one filing cabinet fills and the other is two-thirds empty. The idea behind virtualization is, "Hey, why don't I take some of the extra room in the second cabinet and make it an extension of the first cabinet?"

Today, that's not easy to do. You almost have to physically move the disk drives. Virtualization brings a higher level of abstraction to the process so that all your storage looks and acts as one huge filing cabinet--even though it may be implemented physically in a lot of separate cabinets.
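
The "borrow room from the second cabinet" idea can be sketched in a few lines. This is an illustration under simple assumptions, not how any particular SAN works: the `StoragePool` class, the cabinet names, and the idea of counting capacity in abstract "slots" are all invented for the example.

```python
class StoragePool:
    """Toy pool that presents several cabinets as one block of capacity."""

    def __init__(self, capacities):
        # capacities: cabinet name -> total slots (hypothetical units)
        self.capacity = dict(capacities)
        self.used = {name: 0 for name in capacities}

    def free(self):
        # Free space is reported for the pool as a whole, not per cabinet.
        return sum(self.capacity[c] - self.used[c] for c in self.capacity)

    def allocate(self, slots):
        """Take slots from whichever cabinets have room; return the placement."""
        if slots > self.free():
            raise RuntimeError("pool is full")
        placement = {}
        for cabinet in self.capacity:
            room = self.capacity[cabinet] - self.used[cabinet]
            take = min(room, slots)
            if take:
                self.used[cabinet] += take
                placement[cabinet] = take
                slots -= take
            if slots == 0:
                break
        return placement


pool = StoragePool({"cabinet1": 90, "cabinet2": 90})
first = pool.allocate(80)   # fits entirely in cabinet 1
second = pool.allocate(40)  # spills over: cabinet 1 fills, the rest goes to cabinet 2
```

After the second request, cabinet 1 is full and cabinet 2 is two-thirds empty, yet the application asked only for space from the pool and never had to name a cabinet.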

All of which leads to all sorts of other benefits. With virtualization, if Taxes gets too big and doesn't fit in one folder anymore, your application doesn't have to worry about that. In the past the application would have to be smart enough to say, "Oh, this doesn't fit anymore. I'd better create separate folders for 1999, 2000, 2001. Oh, and there's no more room in this cabinet, so I'd better store Taxes for 2001 in cabinet 42. Now I've got to remember that I've got one set of Taxes in this cabinet, one in that."

Virtualization says, "Don't worry about that. Just ask for Taxes and I'll get it for you."

With each piece of data, you decide how critical it is to you--whether you want recovery time of one second, one minute, one hour, or whatever. And with virtualization, all that will be automated, which enables you to fundamentally lower the cost of running the storage side of your data center.
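
One way to picture that automation is a policy table: you state the recovery time you can live with, and the system picks a protection level. The sketch below is purely illustrative. The tier names and the `choose_tier` function are assumptions made up for this example, and the only figures taken from the text are the one-second, one-minute, and one-hour targets.

```python
from datetime import timedelta

# Hypothetical tiers, ordered from fastest (and presumably costliest) to slowest.
TIERS = [
    (timedelta(seconds=1), "synchronous mirror"),
    (timedelta(minutes=1), "local snapshot"),
    (timedelta(hours=1), "remote backup"),
]


def choose_tier(recovery_target):
    """Pick the slowest tier that still meets the requested recovery time."""
    chosen = None
    for limit, tier in TIERS:
        if limit <= recovery_target:
            chosen = tier  # keep going: a slower tier may also qualify
    if chosen is None:
        raise ValueError("no tier recovers that quickly")
    return chosen


taxes_tier = choose_tier(timedelta(minutes=5))
orders_tier = choose_tier(timedelta(seconds=1))
archive_tier = choose_tier(timedelta(hours=2))
```

Choosing the slowest tier that still meets the target is a deliberate design choice here: it reflects the point that you pay for fast recovery only on the data that needs it.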

That's the theory. In practice, the technology isn't quite there yet. Today's SANs are more like traditional direct-attached storage flying in close formation. But the industry is just now starting to grapple with this whole concept, and we're working on it like crazy. Not just in storage but throughout the whole IT infrastructure. That way we'll be able to reduce complexity, automate change management, and make better use of all our resources.

Three years from now, we'll look back and wonder how the world was able to function before this virtual revolution.

Mark Canepa is executive vice president of network storage for Sun Microsystems, Inc.