
SMB storage gets smart. Or does it?

Written by Manek Dubash, Contributor

Time was when a network-attached storage (NAS) box was pretty much that: just storage. But those times have changed, and you can now buy a NAS that offers features that were enterprise-level not that long ago. But do they work properly?

As well as offering shares for Windows and Mac (CIFS and AFP), today's NAS systems offer thin provisioning, deduplication, replication and snapshotting. Systems such as FreeNAS, which uses the ZFS file system originally developed by Sun, even offer all that for free, although you will of course need a hardware platform if you don't already happen to have a VMware ESXi server hanging around with a couple of gigs of memory to spare.

But looking at finished products, Infortrend's EonNAS range offers deduplication, a rare feature in SMB-level products, while Synology and Drobo allow you to use the extra capacity of disks larger than the smallest in a RAID array.

One thing to watch out for, though, when considering whether deduping is worthwhile in a production environment (as opposed to backup, where it clearly makes sense) is the potential performance impact. In-line deduping works by grabbing a chunk of incoming data, hashing it to produce a value and comparing that value with the hashes of previously stored chunks. If there's a match, the chunk isn't stored again; a pointer to the existing copy is stored instead.
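As a rough illustration of that chunk, hash and compare loop, here is a minimal Python sketch. The `DedupStore` class, the fixed 4 KB chunk size and the choice of SHA-256 are all assumptions made for the example; they are not taken from any of the products mentioned, which may chunk and hash quite differently.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class DedupStore:
    """Hypothetical in-memory store that keeps each unique chunk once."""

    def __init__(self):
        # digest -> chunk bytes; doubles as the hash table of known chunks
        self.chunks = {}

    def write(self, data):
        """Store data and return the list of digests that act as pointers."""
        pointers = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # First time this content is seen: store the chunk itself
                self.chunks[digest] = chunk
            # Match or not, the caller only ever holds the pointer
            pointers.append(digest)
        return pointers

    def read(self, pointers):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.chunks[d] for d in pointers)
```

Writing three identical chunks followed by one different chunk would leave four pointers but only two stored chunks. Note that every write pays for a hash calculation and a table lookup, which is the overhead the following paragraphs are concerned with.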

This is fine when backing up as, in the normal run of things, you don't actually care how long a backup takes as long as it's done by the time people come into work the next day. So an extra few minutes are neither here nor there.

When working on production data, however, you do care, because the time taken to dedupe shows up as sluggishness and latency. I know of one example where just this happened, prompting the vendor in question to rewrite its deduping engine.

And since hash tables are usually held in RAM, the solution in many cases is to add more: a lot more. But a better one is to write a smarter hash table algorithm, so that the most likely hits are found in RAM rather than having to be searched for on disk. Or on SSD, if there's one fitted. There's more about this topic here if you're interested.
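One way to picture a "smarter" hash table is a two-tier index: a small, fast tier in RAM holding the hashes most likely to be hit, backed by a larger, slower tier on disk or SSD. The Python sketch below is only an illustration under stated assumptions. The `TieredHashIndex` name is hypothetical, least-recently-used eviction is just one possible guess at "most likely", and the slow tier is simulated with an ordinary dictionary plus a counter rather than real disk I/O.

```python
from collections import OrderedDict


class TieredHashIndex:
    """Hypothetical two-tier index: small LRU cache in RAM, full index on slow storage."""

    def __init__(self, ram_capacity):
        self.ram_capacity = ram_capacity
        self.ram = OrderedDict()  # hot digests, least recently used first
        self.slow = {}            # stands in for the on-disk or SSD index
        self.slow_lookups = 0     # counts how often the slow tier is consulted

    def add(self, digest, location):
        """Record a new chunk in the full index and treat it as hot."""
        self.slow[digest] = location
        self._promote(digest, location)

    def lookup(self, digest):
        """Return the chunk location, or None if the digest is unknown."""
        if digest in self.ram:
            self.ram.move_to_end(digest)  # mark as recently used
            return self.ram[digest]
        # RAM miss: fall back to the slow tier
        self.slow_lookups += 1
        location = self.slow.get(digest)
        if location is not None:
            self._promote(digest, location)
        return location

    def _promote(self, digest, location):
        """Put a digest in RAM, evicting the coldest entry if over capacity."""
        self.ram[digest] = location
        self.ram.move_to_end(digest)
        if len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)
```

With this layout, repeated hits on popular chunks are answered from RAM, and only cold or never-seen hashes pay the cost of the slow tier. How well that works in practice depends entirely on how predictable the workload's duplicates are.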
