Best storage strategies for the multimedia PC


This information is also available as a TechRepublic PDF download

The modern PC has become the storage and entertainment hub of the home for many consumers, and some people are gobbling up terabytes of data per year storing their DVD library, downloads, TV episodes, and other "stuff" that Seagate's CEO so candidly admits he's helping you (I mean, someone you know) store. While this has changed the way we consume media for the better, it has created a huge dilemma: how do we go about storing all that stuff efficiently AND reliably?

The computer storage industry is peddling a dumbed-down form of storage to consumers that makes them give up half their capacity to mirroring technology such as RAID-1 and RAID-10. Other cheap RAID devices that use striping technology such as RAID-0 put the user's data at severe risk because a single disk failure kills the entire volume. The most practical solution, RAID-5, is unfortunately often overlooked because RAID-5 controllers cost a little more to integrate. But if you buy the right gear and you know the pros and cons of each RAID technology, you can get the best solution without spending a lot of money.

On a side note, my colleague Robin Harris has been on a misplaced vendetta against RAID storage because he believes RAID doesn't solve *his* problem. While Robin is right that RAID doesn't solve *his* problem of reliable backup and disaster recovery, RAID was never meant to solve the backup problem. It's a solution to an entirely different problem that Robin Harris may not care about but others do. What I and many other consumers need is to be able to store all that video content in a cheap and reliable manner. This content doesn't really need to be backed up and mirrored off-site because it's impractical to mirror terabytes of data and because the data is replaceable. Things like your personal photos and documents, which can't ever be replaced, should absolutely be protected with a practical off-site backup strategy.

If you just shoved six large 750 GB hard drives into your computer, you'd end up with a management nightmare because you'd have at least six independent volumes to manage. While six independent drives are very flexible and in some cases perform better when you're doing multiple copy tasks at the same time, it's simply too much to manage. I've lost files because I forgot to back something up and then deleted what I thought was a replica. To solve the storage dilemma for the user that needs terabytes of storage, one of the most practical solutions on the market in terms of performance and price is the Intel ICH RAID controller built into modern Intel chipset motherboards like the 965 and the latest 3-Series chipset. I reviewed the 965's ICH8R here, where it showed performance rivaling dedicated RAID controllers that cost hundreds of dollars. The latest ICH9R, which is built into the Intel 3-Series chipset, improves upon its ICH8R predecessor by allowing you to use up to 6 drives in a single RAID-5 array.


Intel's ICH9R RAID controller

Doing RAID-5 with the Intel ICH9R on-board controller means you get to combine up to 6 hard drives into a single volume that performs like a champ, has the aggregate capacity of 5 drives, and only loses the capacity equivalent of 1 hard drive to gain fault tolerance. Should any one of the 6 drives fail, no data will be lost and all you need to do is replace the failed drive with a new one. But one word of caution here: should you fail to replace the failed drive promptly and a second drive fails, you will lose the entire multi-terabyte RAID-5 array. It's like having 6 survivors in the ocean tethered to each other: one of them passes out for a minute and the other 5 keep him from drowning until he wakes up. The strategy has its risks, because if a second survivor passes out before the first one wakes up then they all drown, but that's statistically unlikely to happen. If you turn off the system and immediately go out and buy a replacement drive, the risk of losing any data is minimal.
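To put a rough number on "statistically unlikely," here's a minimal Python sketch of the odds that one of the 5 surviving drives dies while the array is running degraded. The 3% annual failure rate is a hypothetical figure I picked for illustration, not a measured one, and the model assumes drives fail independently at a constant rate, which real drives from the same batch don't always do.

```python
import math

def second_failure_probability(surviving_drives, annual_failure_rate, window_hours):
    """Chance that at least one surviving drive fails within the exposure window.

    Assumes independent drives with a constant failure rate (exponential
    model). This is a simplification, not a vendor reliability figure.
    """
    # Convert the annual failure probability to a constant hourly hazard rate.
    hourly_rate = -math.log(1.0 - annual_failure_rate) / (365 * 24)
    # Probability that a single drive fails during the window.
    p_one = 1.0 - math.exp(-hourly_rate * window_hours)
    # Probability that at least one of the surviving drives fails.
    return 1.0 - (1.0 - p_one) ** surviving_drives

# Hypothetical inputs: 5 surviving drives, assumed 3% annual failure rate.
for hours in (24, 24 * 7, 24 * 30):
    risk = second_failure_probability(5, 0.03, hours)
    print(f"{hours:4d} h degraded: {risk:.4%} chance of a second failure")
```

The point of the sketch is the trend rather than the exact figures: the longer the array runs with a dead drive, the bigger the gamble.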

There's a better technology called RAID-6 which allows up to two simultaneous failures within an array, but RAID-6 controllers are expensive and you lose two drives to redundancy, so it's cost prohibitive for most home users. If you could build a 12-drive RAID-6 volume and only lose 2 drives' worth of capacity, that would be the ideal survival strategy, but most consumers will have to wait until RAID-6 becomes more commoditized and less expensive. Currently, a 12-port RAID-6 capable controller costs around $500, while the ICH9R on-board RAID on Intel-chipset motherboards is essentially free. ICH9R only supports six drives and RAID-5, though there is the possibility of using SATA expanders to support a multiple of six drives, and maybe Intel can provide a software/firmware update to support RAID-6.
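The capacity trade-offs between the RAID levels discussed so far are easy to tabulate. The Python sketch below uses the standard capacity rules for each level and the 750 GB drive size from this article; the function name and table layout are my own for illustration.

```python
DRIVE_GB = 750  # drive size used throughout this article

def raid_summary(level, drives):
    """Return (usable drive count, failures the array is guaranteed to survive)."""
    rules = {
        "RAID-0": (drives, 0),        # striping only, no redundancy
        "RAID-10": (drives // 2, 1),  # mirrored pairs, half the capacity
        "RAID-5": (drives - 1, 1),    # one drive's worth of parity
        "RAID-6": (drives - 2, 2),    # two drives' worth of parity
    }
    return rules[level]

configs = [("RAID-0", 6), ("RAID-10", 6), ("RAID-5", 6), ("RAID-6", 6), ("RAID-6", 12)]
for level, drives in configs:
    usable, tolerated = raid_summary(level, drives)
    print(f"{level:7s} x{drives:2d}: {usable * DRIVE_GB:5d} GB usable, "
          f"{usable / drives:.0%} of raw, survives {tolerated} failure(s)")
```

RAID-10 can survive more than one failure if the dead drives land in different mirrored pairs, but one failure is all it guarantees, so that's what the sketch reports.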

On the Intel ICH RAID controllers, you can mix the order of the drives in an array, which means it's easy to replace the motherboard or upgrade to a newer Intel ICH RAID motherboard. You can just reconnect the drives in any order and the RAID array will automatically mount with the correct settings. However, Intel's ICH RAID controllers are not without fault, and one of their most glaring weaknesses is the lack of a beaconing feature that tells you which drive failed. When a drive does fail, the software will tell you the drive number but it won't light up the LED on the drive, which means there is a possibility you'll pull the wrong drive and corrupt the array.

Furthermore, the port number labeling is wrong on a motherboard like the ICH8R-based Intel DG965WH, though I don't know if other boards have this kind of mislabeling. I had to painstakingly track down the correct port numbers and match them up to the slot numbers on my hot-swap tray so that the drives are neatly stacked from port 1 to 5. The safest way to replace a failed drive is to shut down and replace the drive you believe is the bad one. If it turns out you replaced the wrong drive, the array won't mount and you get another shot at figuring out where the failed drive is, so there's less chance of corrupting the array. There's no reason Intel can't upgrade their software to simply spit out a fake disk busy message over the SATA port, which would keep the disk activity LED lit so you know which disk to pull. I requested this feature from Intel months ago but I have not gotten a response.


Optimum RAID configurations

So if you were building a new computer with terabytes of storage, how should you configure the array using Intel's built-in ICH9R RAID controller? Do you make one massive RAID-5 array using all six drives and mix your OS (Operating System) and data on the same logical volume, or do you leave your OS on a separate drive outside of the RAID array used for data? There are a few possibilities, but here's why you should consider keeping them separate. The OS has a tendency to read and write temporary data on the drive it's stored on from time to time, which keeps that drive from spinning down into power save mode. If the OS is located on the massive RAID volume, the array won't ever spin down.

A typical hard drive consumes about 8 watts at idle, so five drives in a RAID array consume a minimum of 40 watts. That means a modern, efficient computer will idle at 100 watts instead of 60 watts. By using port 0 for the OS on an internal hard drive and leaving ports 1-5 for the 5-port hot-swap SATA module, you can allow the RAID volume to spin down and conserve 40 watts of power, which works out to about $35/year if your computer is always on and you're paying 10 cents per kilowatt-hour. You also need to factor in the extra heat generated by the drives, which means you may have to spend another $35/year on air-conditioning. To put this in context, even a new refrigerator averages around 57 watts 24 hours a day. Whether you care about being green or not, saving green bills in your wallet is always a good thing.
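The $35/year figure is easy to check. Here's a minimal Python sketch of the arithmetic using the numbers above (8 watts per drive, 5 drives, 10 cents per kilowatt-hour, always on); the function name is just for illustration, so plug in your own electricity rate.

```python
HOURS_PER_YEAR = 24 * 365  # always-on machine

def annual_cost(watts, dollars_per_kwh):
    """Yearly electricity cost of a constant load."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * dollars_per_kwh

raid_watts = 5 * 8  # five drives at about 8 watts each
print(f"RAID array left spinning: ${annual_cost(raid_watts, 0.10):.2f} per year")
```

That's 350.4 kilowatt-hours a year for the drives alone, before any air-conditioning costs.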

One solution is to use the first drive for OS, applications, and temporary storage, as shown in the diagram below.

One other issue to think about is that BitTorrent downloads may run for hours or days, and the files come in highly fragmented since bits and pieces arrive in random order. Since you only need about 100 GB for OS and application storage, you have 650 GB of open space on your OS drive if you're using 750 GB hard drives, which have recently dropped to 25 cents per gigabyte. You can use that 650 GB partition as a temporary dumping ground so that the RAID volume can rest. By dumping downloads into the spare partition on the OS drive and then moving them to the RAID volume once you've verified the download is good, each file is written in a sequential, non-fragmented manner, which keeps your data volume clean.

Another use for the extra 650 GB partition is to store extra copies of critical data, like photos and personal documents, that live on the main RAID array. Again, that does not mean you don't need off-line and off-site backup, and I always tell people to burn their photos onto multiple DVDs and share them with their family. Not only does that allow family members to enjoy each other's photographs, it also ensures off-site disaster recovery if one of the homes were to burn down.

If you're OK with only 4 drives in your massive RAID array, you can use 2 SATA RAID ports in a hybrid RAID-1/RAID-0 array for the OS/Temporary drive. A 150 GB RAID-1 mirror will give your OS volume an extra boost in I/O performance as well as fault tolerance. The remaining 600 GB on each of the two 750 GB drives can be merged into a 1200 GB RAID-0 temporary volume with massive capacity and throughput. Loading your games from the RAID-0 volume or the RAID-5 data volume will mean that they load large maps much faster. The following diagram shows what this configuration looks like.
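The capacity arithmetic for this hybrid layout can be double-checked with a short Python sketch. It assumes two 750 GB drives and a 150 GB mirrored slice, as described above; the function and key names are hypothetical and have nothing to do with Intel's RAID software.

```python
def hybrid_layout(drive_gb, mirror_gb, drives=2):
    """Split each drive into a mirrored (RAID-1) slice and a striped (RAID-0) slice."""
    if mirror_gb > drive_gb:
        raise ValueError("mirror slice cannot be larger than the drive")
    stripe_slice_gb = drive_gb - mirror_gb  # space left over on each drive
    return {
        "raid1_gb": mirror_gb,                 # mirror exposes one slice's worth
        "raid0_gb": stripe_slice_gb * drives,  # stripe adds the leftovers together
    }

print(hybrid_layout(750, 150))
```

With a 150 GB mirror, each drive has 600 GB left over, and striping the two leftovers together gives the 1200 GB temporary volume.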

One other possibility: many of the Intel 3-series chipset motherboards like the Gigabyte GA-P35-DS3R ($137 with shipping) have two extra SATA ports in addition to the six RAID-capable SATA ports you get from the Intel ICH9R controller. You can use one extra port for a SATA optical drive and the other for the OS drive, which frees up all six RAID ports for a massive RAID array. Here's what this final option looks like.