RAID 1+0 is the Cadillac of RAID

Is RAID 1+0 superiority a myth? My fellow ZDnet blogger, George Ou, makes some strong statements, bolstered by damning performance numbers, that it is. But this wouldn't be the blogosphere if everyone agreed. So I won't.

What is RAID 1+0? Sometimes shortened to RAID 10, RAID 1+0 is a particular combination of two different RAID levels: mirroring (RAID 1) and striping (RAID 0). The appeal of RAID 1+0 is simple: mirroring gives you the highest level of availability RAID offers, with the fastest rebuild times when a disk fails; while striping - using the proper chunk size - is the basis for high-performance I/O.

Put the two together and you have the best of both worlds. Until George showed up with his facts.

What about the poor performance George found? We're coming to that, but first a word from our sponsor.

There is an important difference between RAID 1+0 and RAID 0+1 that not everyone appreciates. In RAID 1+0, you first mirror the disks, and then you lay the stripes on top of the mirrored disks. In RAID 0+1 you first stripe and then mirror. Seems like it should be commutative, but it isn't. Here's why.

Let's say you have 6 disks, named 1, 2, 3, a, b and c. So you mirror 1+a, 2+b, 3+c to create three virtual disks 1', 2' and 3'. Then you stripe across 1', 2' and 3' to create your RAID 1+0 array. Each of your mirrored pairs is a virtual disk, so you can lose one member of each pair and the stripe still works.

Now that I'm thoroughly confused, remind me why I should care? The cool thing about this is that you could lose as many as three drives and still get your data. You could lose drives 1, b and 3 and still the RAID would soldier on. [Update: of course, if you lose both drives in a pair, your data is toast - just as if you lost a drive in a RAID 0 configuration.]

In contrast, if you striped 1, 2 and 3 first, then striped a, b and c, and *then* mirrored the two stripes (RAID 0+1), you are more vulnerable to failures. It is the difference between three mirrors (in this example) and one mirror of two stripes. Lose drive 1 and drive c and both stripes are broken, so you are SOL. That's when you'll wish you'd paid attention to whatsisname in Storage Bits. But it will be too late.
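
To make the difference concrete, here is a minimal Python sketch that checks which drive losses each layout survives, using the six disks named above. The function names are made up for illustration, and the 0+1 model assumes the simplest behavior: a stripe that loses any disk is treated as entirely failed.

```python
from itertools import combinations

DISKS = ["1", "2", "3", "a", "b", "c"]

# RAID 1+0: three mirrored pairs, with a stripe laid across the pairs.
MIRROR_PAIRS = [("1", "a"), ("2", "b"), ("3", "c")]

# RAID 0+1: two three-disk stripes, mirrored against each other.
STRIPES = [("1", "2", "3"), ("a", "b", "c")]


def raid10_survives(failed):
    """A 1+0 array survives if every mirrored pair keeps at least one disk."""
    return all(any(d not in failed for d in pair) for pair in MIRROR_PAIRS)


def raid01_survives(failed):
    """A 0+1 array survives if at least one stripe has lost no disks at all."""
    return any(all(d not in failed for d in stripe) for stripe in STRIPES)


def surviving_fraction(survives, n_failed):
    """Fraction of all n_failed-disk loss combinations the array survives."""
    combos = list(combinations(DISKS, n_failed))
    return sum(survives(set(c)) for c in combos) / len(combos)


if __name__ == "__main__":
    print(raid10_survives({"1", "b", "3"}))  # the three-drive loss from the text
    print(raid01_survives({"1", "c"}))       # the two-drive loss from the text
    for n in (1, 2, 3):
        print(n,
              surviving_fraction(raid10_survives, n),
              surviving_fraction(raid01_survives, n))
```

Under these assumptions, 1+0 survives 12 of the 15 possible two-drive losses, while 0+1 survives only 6 of them.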

That tear in the corner of my eye? That tear is for you, my friend. And your lost data, gone to that great bit bucket in the sky.

Yeah, so why does the performance suck? As I read it, George made two points about performance:

  • SQL server and some other applications benefit by being spread across multiple volumes, instead of one RAID 1+0
  • A 4-disk RAID 5 was about 50% faster than a 4-disk RAID 1+0

Both points are correct. They are also irrelevant to the goodness of RAID 1+0.

The reason striping helps performance - with the right chunk size - is that it is, ideally, spreading the workload across multiple spindles. I know zip about SQL Server, but it sounds like it wasn't doing that on the RAID system George describes. When he put it on multiple volumes, it did so automatically. So the workload was distributed and performance dramatically improved. But that is the data layout, not the RAID.
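
Here is a rough Python sketch of why chunk size decides how many spindles a request touches. It assumes a plain round-robin stripe layout; the function names and the 64 KiB chunk size are illustrative choices, not anything taken from George's test setup, and real controllers may lay data out differently.

```python
def locate(offset, chunk_size, n_disks):
    """Map a byte offset in the striped volume to (disk index, offset on that disk)."""
    chunk = offset // chunk_size        # which chunk of the volume this byte is in
    disk = chunk % n_disks              # chunks rotate round-robin across the disks
    stripe_row = chunk // n_disks       # number of full stripe rows before this chunk
    return disk, stripe_row * chunk_size + offset % chunk_size


def disks_touched(offset, length, chunk_size, n_disks):
    """Set of disks hit by one request of `length` bytes starting at `offset`."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return {c % n_disks for c in range(first, last + 1)}


if __name__ == "__main__":
    KIB = 1024
    chunk = 64 * KIB  # hypothetical chunk size
    print(locate(200 * KIB, chunk, 3))
    print(disks_touched(0, 8 * KIB, chunk, 3))    # a small request stays on one disk
    print(disks_touched(0, 256 * KIB, chunk, 3))  # a large request spans all three
```

A workload of small requests that all land in the same few chunks keeps hammering one spindle, no matter how many disks are in the stripe. That is a layout problem, not a RAID level problem.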

Nor is it surprising that a 4-disk RAID 5 would be higher performance than a 4-disk RAID 1+0. Why? The RAID 5 is reading and writing data from 3 drives. The RAID 1+0 is only using 2 drives. You'd expect the RAID 5 to be about 50% faster. And so it is.

Aren't both 4-disk configurations - why would the number of drives be different? We're talking virtual drives here. With mirrored drives, you can only read from two of the drives, because there are only two drives' worth of data. Four drives, two virtual disks.

With the RAID 5 configuration the data is spread over all four drives. One quarter of the capacity holds parity information, so you effectively have three drives to read from.
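
The arithmetic behind that 50% figure fits in a few lines of Python. This is only the simplified model used in this post, counting drives' worth of unique data; the helper names are invented for illustration, and it ignores implementations that can serve reads from either half of a mirror.

```python
def data_drives_raid5(n_disks):
    """RAID 5 gives up one disk's worth of capacity to parity."""
    return n_disks - 1


def data_drives_raid10(n_disks):
    """RAID 1+0 stores every block twice, so half the disks hold unique data."""
    return n_disks // 2


if __name__ == "__main__":
    n = 4
    r5 = data_drives_raid5(n)
    r10 = data_drives_raid10(n)
    print(r5, r10, r5 / r10)  # 3 data drives vs 2, a ratio of 1.5
```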

Note: I'm not delving into the infamous RAID 5 small-write penalty, where small random writes get very slow because each one has to read and rewrite parity, or into its cousin the "write hole." That is a "whole" 'nother topic.

The Storage Bits take: George has performed a signal service by taking a close look at RAID performance and the trade-offs. His points about understanding how your application uses disk are important and too often ignored. And he is correct to say that big fancy RAID arrays may not be the right answer for many applications.

Yet if you do choose to use RAID, I submit that for important data, RAID 1+0 should be your first choice. It offers good performance - not as good as RAID 5 on reads, but much better on small writes - and it is much more resilient than RAID 5 when you do have a disk failure.

A RAID 5 rebuild costs you about half your IOPS capacity as well as controller or CPU cycles, because every lost block has to be recomputed from all the surviving drives. With RAID 1+0 a rebuild is a simple disk-to-disk copy, which is as efficient and fast as you can get.
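
A toy Python sketch of the two rebuild paths shows where the extra work comes from. It assumes single-parity XOR, which is how RAID 5 parity works; the block contents and function names are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_raid5_block(surviving_blocks):
    """Recover a lost block by XORing every surviving block in the stripe,
    data and parity alike."""
    return xor_blocks(surviving_blocks)


def rebuild_mirror_block(partner_block):
    """Recover a lost mirror block by copying it from the surviving partner."""
    return bytes(partner_block)


if __name__ == "__main__":
    # One stripe of a 4-disk RAID 5: three data blocks plus their parity.
    d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
    parity = xor_blocks([d0, d1, d2])

    # Lose the disk holding d1: three reads plus an XOR to get it back.
    print(rebuild_raid5_block([d0, d2, parity]) == d1)

    # Lose one half of a mirror: one read, no computation.
    print(rebuild_mirror_block(d1) == d1)
```

For every block rebuilt, the RAID 5 path reads from all the surviving drives, which are the same drives still trying to serve your application. The mirror path touches only the one surviving partner.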

Because it mirrors, RAID 1+0 capacity is more expensive than RAID 5. But for business-critical data, RAID 1+0 gives the best combination of performance, availability and redundancy.

Update: after I got some comments from smart folks who misunderstood parts of this post I went back and clarified some things. No animals were harmed in the testing of this post.

Comments welcome, of course.