Getting a Grip on Storage Growth

Storage growth is out of control! By addressing storage growth, you can manage costs.

Practically every business has awakened to the reality that information is a key differentiator. Even companies that don’t yet know how to analyze their data are collecting it, because stored data is like money in the bank. However, just as it costs something to keep money in the bank (those huge vaults aren’t cheap), storing data has its costs too.

It’s easy for the business to underestimate the expense of storing data. After all, disks are getting cheaper by the month. But what users often don’t realize is that they also need to back up and maintain all that content, and these operational costs add up when you have enormous quantities of data.

In my last post, I talked about storage virtualization as an abstraction mechanism that permits greater flexibility in the underlying physical infrastructure. Another benefit of virtualization is that it enables higher utilization through unit pooling.

The fact is, there are two ways to address storage growth. One is to reduce the amount of data that needs to be stored. The other is to improve the utilization of the devices that store this data. Let’s look at both of these.

One of the best mechanisms to reduce the overall volume of information is de-duplication. A large part of the data explosion is a result of the fact that the same information is replicated and reused for different purposes. Backups, for example, often involve highly redundant versions from one run to the next because the incremental changes are small, yet there is still value in having a complete snapshot at regular intervals.

A de-duplication process, like the Data Deduplication feature in Windows Server 2012, identifies unique patterns (chunks of data) during its analysis. When it finds a subsequent repetition of such a pattern, it replaces the repetition with a reference to the first copy rather than storing the information again, so the actual data is stored only once. Since these references are often much smaller than the original pattern, the stored data can shrink significantly. In favorable cases, this technique can reduce the size of virtual hard disks to about one tenth of their original size.
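To make the idea concrete, here is a minimal sketch of chunk-level de-duplication in Python. It is a toy model, not how Windows Server implements the feature: the `DedupStore` class, the fixed 4 KB chunk size, and the use of SHA-256 digests as references are all assumptions chosen for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


class DedupStore:
    """Toy store (hypothetical): each unique chunk is kept exactly once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes, stored once
        self.files = {}   # file name -> list of digests (the references)

    def put(self, name, data):
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Keep the chunk only if this pattern has not been seen before.
            self.chunks.setdefault(digest, chunk)
            refs.append(digest)
        self.files[name] = refs

    def get(self, name):
        # Rebuild the file by following its references.
        return b"".join(self.chunks[d] for d in self.files[name])

    def logical_size(self):
        # Bytes the files would occupy without de-duplication.
        return sum(len(self.chunks[d])
                   for refs in self.files.values() for d in refs)

    def physical_size(self):
        # Bytes of unique chunk data actually kept (references not counted).
        return sum(len(c) for c in self.chunks.values())


# Two made-up "backups" of ten chunks each that differ only in the last chunk.
monday = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
tuesday = monday[:9 * CHUNK_SIZE] + bytes([99]) * CHUNK_SIZE

store = DedupStore()
store.put("monday", monday)
store.put("tuesday", tuesday)
print(store.logical_size(), store.physical_size())
```

In this example the two backups total 20 chunks, but only 11 unique chunks are kept, so the second backup costs one extra chunk plus its list of references. Real products typically use variable-size chunking and also account for the space the references themselves consume.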

The second approach to reining in storage growth is improving the utilization of storage. Applications tend to allocate storage according to their peak needs in order to ensure that it is available when needed. Most of the time, their actual data usage is much lower. This “peak needs” strategy wastes a lot of capacity.

Thin provisioning addresses this problem. Rather than requiring up-front allocation of storage, thin provisioning allocates blocks of storage on demand. In other words, the volume is automatically extended as needed until it reaches its maximum permitted size. Windows Server 2012 adds to this concept with support for Trim, which reclaims space that is no longer in use once the application has come back down from its peak storage requirements.
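The behavior can be sketched with a small Python model. This is only an illustration under stated assumptions: the `ThinVolume` class, its method names, and the 4 KB block size are hypothetical and do not correspond to any Windows Server API.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


class ThinVolume:
    """Toy thin-provisioned volume (hypothetical): blocks use space only once written."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks  # maximum permitted size, in blocks
        self.blocks = {}              # block index -> data, allocated on demand

    def _check(self, index):
        if not 0 <= index < self.max_blocks:
            raise IndexError("block index outside the volume's maximum size")

    def write(self, index, data):
        self._check(index)
        if len(data) > BLOCK_SIZE:
            raise ValueError("data does not fit in one block")
        # Allocation happens here, on first write, not when the volume is created.
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")

    def read(self, index):
        self._check(index)
        # Blocks that were never written (or were trimmed) read back as zeros.
        return self.blocks.get(index, b"\0" * BLOCK_SIZE)

    def trim(self, index):
        self._check(index)
        # Hand the block back to the pool; a no-op if it was never allocated.
        self.blocks.pop(index, None)

    def allocated_bytes(self):
        return len(self.blocks) * BLOCK_SIZE

    def provisioned_bytes(self):
        return self.max_blocks * BLOCK_SIZE


vol = ThinVolume(max_blocks=1000)
for i in range(3):
    vol.write(i, b"peak workload data")
print(vol.provisioned_bytes(), vol.allocated_bytes())  # promised vs. actually used
vol.trim(2)  # the application no longer needs this block
print(vol.allocated_bytes())
```

The volume promises 1,000 blocks to the application but consumes physical space only for the blocks that have been written, and trimming a block returns its space to the pool for other volumes to use.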

Data volume, along with its cost implications, is one of the biggest challenges that IT managers face today, and data growth shows no signs of abating. In fact, information is becoming increasingly critical to enterprises as they seek new forms of competitive differentiation. There is no point in trying to curtail this trend. Instead, the best way to avoid spiraling storage costs is to exploit techniques such as de-duplication and thin provisioning, which cut the stored volume and improve hardware utilization without reducing service capabilities.