Why do people think RAID means performance?
George Ou, Technical Director at ZDNet and a fellow ZDNet blogger, has a great post about real-life RAID performance - hardware vs software - plus some helpful comments about data layout, especially for MS SQL Server. As George notes, data layout can have a major impact on storage performance. But what about RAID itself? What is the theory behind RAID performance?
RAID = Redundant Array of Inexpensive Disks
RAID wasn't originally about performance; it was about cost. Some CompSci jocks at Berkeley wanted to use 5.25" drives for a bunch of cheap storage because they couldn't afford the then-common 9" drives. The small drives had poor reliability as well as lower cost, so they cobbled them together to create the first RAID array.
But as they worked out the details, they saw that RAID could have performance advantages for certain workloads.
The three pillars of RAID performance
- Cache
- Striping
- Chunk size
Let's look at all three.
Cache is simply RAM, or memory, placed in the data path in front of a disk or disk array. You can read or write RAM about 100,000 times faster than a fast disk and about 300,000 times faster than a slow disk. Writing or reading cache is a lot faster than disk.
Most external array controllers are redundant as well, with some kind of load-balancing driver that keeps both controllers productive. In this case, the cache has to be dual-ported. If a controller fails, the other controller can see all the pending writes of the failed controller and complete them.
Dual-ported cache, controller failover, dual server interfaces and failover drivers are all tricky to engineer and test, which is one reason why mid-range and high-end controllers are so expensive. But they sure speed writes up.
Striping for speed
Striping is taking a virtual disk that the operating system sees and spreading it across several real, physical disks.
A RAID controller presents something that looks like a disk, like a C: drive, to the operating system, be it Windows, OS X or Linux. The RAID controller isn't presenting a real disk. It is presenting a group of disks and making them look like a single disk to your computer.
The advantage is that instead of a single disk's performance, you now have the performance of several disks. Instead of 50 I/Os per second (IOPS) on a 5400 RPM drive, you might have 150, 200 IOPS or more, depending on the number of drives. If you use fast 15k drives you might reach 900, 1,000 or more IOPS.
And instead of 1.5 gigabits per second bandwidth, you might have 4.5 or 6 Gb/sec. If you have a cache that you are in a hurry to empty, a nice fast stripe set is very helpful.
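The best-case arithmetic behind those figures is simple multiplication. Here is a minimal sketch, assuming the load spreads perfectly evenly and nothing else (controller, bus, cache) gets in the way. The function name `ideal_stripe_performance` is made up for illustration, and real arrays will come in under these numbers.

```python
def ideal_stripe_performance(num_disks: int,
                             iops_per_disk: float,
                             gbps_per_disk: float) -> tuple[float, float]:
    """Best-case aggregate (IOPS, Gb/sec) for a stripe set.

    Assumes perfectly even load and no controller or bus bottleneck,
    so this is an upper bound rather than a prediction.
    """
    return num_disks * iops_per_disk, num_disks * gbps_per_disk


# 5400 RPM drives at roughly 50 IOPS and 1.5 Gb/sec each, as in the text.
for disks in (1, 3, 4):
    iops, gbps = ideal_stripe_performance(disks, 50, 1.5)
    print(f"{disks} disk(s): up to {iops} IOPS, {gbps} Gb/sec")
```

Three drives give the 150 IOPS and 4.5 Gb/sec mentioned above, and four give 200 IOPS and 6 Gb/sec.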
The hard part: spreading I/Os smoothly across all the disks. If they all jam up on one disk, your costly storage system will be no faster than a single disk. That's where chunk size comes in.
Chunk size: the hidden key to RAID performance
Stripes go across disk drives. But how big are the pieces of the stripe on each disk? The pieces a stripe is broken into are called chunks. Is the stripe broken into 1 KB pieces? Or 1 MB pieces? Or even larger? To get good performance you must have a reasonable chunk size.
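To make chunks concrete, here is a rough sketch of how a byte offset on the virtual disk could map to a physical disk. It assumes plain striping with no parity (RAID 0 style) and chunks handed out round-robin. The function name `locate` and the layout are assumptions for illustration, not how any particular controller does it.

```python
def locate(logical_offset: int, chunk_size: int, num_disks: int) -> tuple[int, int]:
    """Map a byte offset on the virtual disk to (disk index, byte offset on that disk).

    Assumes simple round-robin striping with no parity.
    """
    chunk_index = logical_offset // chunk_size   # which chunk of the virtual disk
    disk = chunk_index % num_disks               # chunks rotate across the disks
    stripe_row = chunk_index // num_disks        # full stripes that come before this chunk
    offset_on_disk = stripe_row * chunk_size + logical_offset % chunk_size
    return disk, offset_on_disk


CHUNK = 64 * 1024  # 64 KB chunks on a 4-disk stripe set
print(locate(0, CHUNK, 4))               # (0, 0)
print(locate(CHUNK, CHUNK, 4))           # (1, 0)
print(locate(4 * CHUNK + 10, CHUNK, 4))  # (0, 65546)
```

The first chunk lands on disk 0, the next on disk 1, and after one full stripe the layout wraps back to disk 0.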
So what is a reasonable chunk size? It depends on your average I/O request size. Here's the rule of thumb: big I/Os = small chunks; small I/Os = big chunks.
Do you do video editing or a lot of Photoshop work? Then your average request size will be large and your performance will be dominated by how long it takes to get the data to or from the disks. So you want a lot of bandwidth to move data quickly. To get a lot of bandwidth you want each disk to shoulder part of the load, so you want a small chunk size. What is small? Anywhere from 512 bytes (one block) to 8 KB.
If you are running a database and doing lots of small I/Os - 512 bytes to 4 KB, say - then you want to maximize your IOPS, which ideally means sending each I/O to only one disk and spreading the I/Os evenly across the disks. What you don't want is a single I/O getting sent to two disks, since waiting for both sets of heads will slow things down. So you want a large chunk size - 64 KB or more. That large chunk will mean that most I/Os get serviced by a single disk, leaving the remaining disks free to service other I/Os.
However, many databases use their own strategies to gather I/Os to minimize I/O overhead. In that case you need to know what the database is actually doing to choose the right chunk size.
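The rule of thumb is easy to check with a little arithmetic. The sketch below counts how many disks a single I/O lands on for a given chunk size, again assuming plain round-robin striping with no parity. The function name `disks_touched` is hypothetical.

```python
def disks_touched(offset: int, length: int, chunk_size: int, num_disks: int) -> int:
    """Number of distinct disks one I/O of `length` bytes at `offset` touches.

    Assumes simple round-robin striping with no parity.
    """
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    chunks = last_chunk - first_chunk + 1
    # Consecutive chunks sit on consecutive disks, so once the I/O spans
    # num_disks chunks it is using every disk in the set.
    return min(chunks, num_disks)


KB = 1024
MB = 1024 * KB

# Big I/O, small chunk: a 1 MB read with 8 KB chunks uses all 4 disks.
print(disks_touched(0, 1 * MB, 8 * KB, 4))         # 4
# Small I/O, big chunk: a 4 KB read with 64 KB chunks uses one disk.
print(disks_touched(0, 4 * KB, 64 * KB, 4))        # 1
# Small I/O, small chunk: the same 4 KB read with 512-byte chunks ties up all 4.
print(disks_touched(0, 4 * KB, 512, 4))            # 4
# Even with big chunks, an I/O that straddles a chunk boundary hits 2 disks.
print(disks_touched(62 * KB, 4 * KB, 64 * KB, 4))  # 2
```

The last case is why a large chunk means most small I/Os, not all of them, get serviced by a single disk.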
The Storage Bits take
RAID systems are complex and their operation is sometimes counter-intuitive. You should take the time to understand what your workload is in order to configure a RAID system for good performance. Otherwise you could end up with an expensive and slow problem instead of a fast I/O solution.
Comments welcome, of course.