Chunks: the hidden key to RAID performance

By Robin Harris | May 7, 2007, 5:54pm PDT

Why do people think RAID means performance?
George Ou, Technical Director at ZDNet and a fellow ZDNet blogger, has a great post about real-life RAID performance - hardware vs. software - plus some helpful comments about data layout, especially for MS SQL Server. As George notes, data layout can have a major impact on storage performance. But what about RAID itself? What is the theory behind RAID performance?

RAID = Redundant Array of Inexpensive Disks
RAID wasn’t originally about performance; it was about cost. Some CompSci jocks at Berkeley wanted to use 5.25″ drives for a bunch of cheap storage because they couldn’t afford the then-common 9″ drives. The small drives had poor reliability as well as lower cost, so they cobbled them together to create the first RAID array.

But as they worked out the details, they saw that RAID could have performance advantages for certain workloads.

The three pillars of RAID performance

  • Cache
  • Striping
  • Chunk size

Let’s look at all three.

Cache
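Cache is simply RAM, or memory, placed in the data path in front of a disk or disk array. You can read or write RAM about 100,000 times faster than a fast disk and about 300,000 times faster than a slow disk, so an I/O that is served from cache finishes far sooner than one that has to go to disk. A write, for example, can be acknowledged as soon as it lands in cache and be flushed to disk later.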

Most external array controllers are redundant as well, with some kind of load-balancing driver that keeps both controllers productive. In this case, the cache has to be dual-ported. If a controller fails, the other controller can see all the pending writes of the failed controller and complete them.

Dual-ported cache, controller failover, dual server interfaces and failover drivers are all tricky to engineer and test, which is one reason why mid-range and high-end controllers are so expensive. But they sure speed writes up.

Striping for speed
Striping is taking a virtual disk that the operating system sees, and spreading that virtual disk across several real, physical disks.

A RAID controller presents something that looks like a disk, like a C: drive, to the operating system, be it Windows, OS X or Linux. The RAID controller isn’t presenting a real disk. It is presenting a group of disks and making them look like a single disk to your computer.

The advantage is that instead of a single disk’s performance, you now have the performance of several disks. Instead of 50 I/Os per second (IOPS) on a 5400 RPM drive, you might have 150, 200 IOPS or more, depending on the number of drives. If you use fast 15k drives you might reach 900, 1,000 or more IOPS.

And instead of 1.5 gigabits per second bandwidth, you might have 4.5 or 6 Gb/sec. If you have a cache that you are in a hurry to empty, a nice fast stripe set is very helpful.
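The arithmetic behind those figures can be sketched in a few lines of Python. This is only a best-case estimate: it assumes I/Os spread perfectly evenly across the drives and that the controller adds no overhead. The function name is hypothetical, and the per-drive inputs are the 50 IOPS and 1.5 Gb/sec figures used above.

```python
def ideal_stripe_performance(per_disk_iops, per_disk_gbps, num_disks):
    """Best-case aggregate IOPS and bandwidth for a stripe set.

    Assumes I/Os spread perfectly evenly and the controller adds no
    overhead, so this is an upper bound, not a prediction.
    """
    return per_disk_iops * num_disks, per_disk_gbps * num_disks


for disks in (3, 4):
    iops, gbps = ideal_stripe_performance(50, 1.5, disks)
    print(f"{disks} disks: up to {iops} IOPS, up to {gbps} Gb/sec")
```

Real arrays usually land below these numbers, for the reason covered next.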

The hard part: spreading I/Os smoothly across all the disks. If they all jam up on one disk your costly storage system will be no faster than a single disk. That’s where chunk size comes in.

Chunk size: the hidden key to RAID performance
Stripes go across disk drives. But how big are the pieces of the stripe on each disk? The pieces a stripe is broken into are called chunks. Is the stripe broken into 1 KB pieces? Or 1 MB pieces? Or even larger? To get good performance you must have a reasonable chunk size.
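To make the idea of chunks concrete, here is a minimal Python sketch of how a byte offset on the virtual disk might map to a physical disk. It assumes plain round-robin striping with no parity (a RAID 0 style layout); real controllers may lay data out differently, and the function name and the 64 KB, four-disk example values are illustrative assumptions.

```python
def locate(offset, chunk_size, num_disks):
    """Map a byte offset on the virtual disk to (disk index, byte offset on that disk).

    Assumes plain round-robin striping with no parity.
    """
    chunk_index = offset // chunk_size      # which chunk of the virtual disk
    disk = chunk_index % num_disks          # chunks rotate across the disks
    stripe = chunk_index // num_disks       # full stripes that precede this chunk
    disk_offset = stripe * chunk_size + offset % chunk_size
    return disk, disk_offset


KB = 1024
for offset in (0, 64 * KB, 200 * KB, 256 * KB):
    print(offset, locate(offset, chunk_size=64 * KB, num_disks=4))
```

With these values the first four chunks land on disks 0 through 3, and the fifth chunk wraps back around to disk 0.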

So what is a reasonable chunk size? It depends on your average I/O request size. Here’s the rule of thumb: big I/Os = small chunks; small I/Os = big chunks.

Do you do video editing or a lot of Photoshop work? Then your average request size will be large and your performance will be dominated by how long it takes to get the data to or from the disks. So you want a lot of bandwidth to move data quickly. To get a lot of bandwidth you want each disk to shoulder part of the load, so you want a small chunk size. What is small? Anywhere from 512 bytes (one block) to 8 KB.

If you are running a database and doing lots of small I/Os - say 512 bytes to 4 KB - then you want to maximize your IOPS, which ideally means sending each I/O to only one disk and spreading the I/Os evenly across the disks. What you don’t want is a single I/O getting sent to two disks, since waiting for the heads on both will slow things down. So you want a large chunk size - 64 KB or more. That large chunk will mean that most I/Os get serviced by a single disk, leaving the remaining disks free to service other I/Os.
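The rule of thumb can be checked with a small Python sketch that counts how many disks a single I/O lands on. It assumes plain round-robin striping with no parity. The function name and the example offsets, sizes and four-disk array are illustrative assumptions, not measurements.

```python
def disks_touched(offset, length, chunk_size, num_disks):
    """Count the distinct disks a single I/O lands on.

    Assumes plain round-robin striping with no parity.
    """
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    chunks_spanned = last_chunk - first_chunk + 1
    return min(chunks_spanned, num_disks)


KB = 1024
MB = 1024 * KB

# Large sequential I/O (video, Photoshop): small chunks put every disk to work.
print(disks_touched(0, 1 * MB, chunk_size=8 * KB, num_disks=4))

# Small database I/O: a large chunk usually keeps it on one disk...
print(disks_touched(100 * KB, 4 * KB, chunk_size=64 * KB, num_disks=4))

# ...while a tiny chunk splits the same I/O across all four disks.
print(disks_touched(100 * KB, 4 * KB, chunk_size=1 * KB, num_disks=4))
```

Even with a large chunk, a small I/O that happens to straddle a chunk boundary still hits two disks, which is why the text says "most" I/Os rather than all of them.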

However, many databases use their own strategies to gather I/Os to minimize I/O overhead. In that case you need to know what the database is actually doing to choose the right chunk size.

The Storage Bits take
RAID systems are complex and their operation is sometimes counter-intuitive. You should take the time to understand what your workload is in order to configure a RAID system for good performance. Otherwise you could end up with an expensive and slow problem instead of a fast I/O solution.

Comments welcome, of course.

Disclosure

Robin Harris

Robin Harris is president of TechnoQWAN, a consulting and analyst firm in northern Arizona. He also writes StorageMojo.com, a blog which accepts advertising from companies in the storage industry, and has a 25-year history with IT vendors. He has many industry contacts, many of whom are friends and all of whom he has opinions about. Robin has relationships with many companies in the technology industry. Every company he writes about may have sought to influence his opinion through carefully crafted marketing messages and self-serving white papers, gifts ranging from desk calendars, t-shirts and lunches to trips, as well as analyst or consulting assignments. He also invests in some technology companies. He may accept payment for services in stock as well. Robin discloses financial investments in or client relationships with companies named in Storage Bits. To help readers sort out the gold from the dross in his writings, Robin tries to communicate his reasons as clearly as he can. If you agree, you are intelligent and discerning. If you disagree, well, you disagree. In all cases, Robin encourages readers to subject everything they read, see or hear on the internet or from politicians to some simple questions:

  • What assumptions are implicit in the world view and judgments of the author?
  • What, if any, is the factual basis for the opinions the author expresses?
  • Is it reasonable, logical and clear?

Your critical faculties: use ’em or lose ’em!

Biography

Robin Harris

Harris has been messing with computers for over 30 years and selling and marketing data storage for over 20 in companies large and small. He introduced a couple of multi-billion dollar storage products (DLT, the first Fibre Channel array) to market, as well as many smaller ones. Earlier he spent 10 years marketing servers and networks. After leaving corporate life he founded TechnoQWAN, a consulting and analyst firm. He also developed StorageMojo into one of the top storage industry blogs.

Robin writes, consults, coaches and lives among the mountains of northern Arizona.

Comments

RE: Chunks: the hidden key to RAID performance
oz@... 5th Jan 2008
Hello Robin:

Nice article. Congratulations!

What chunk size would you suggest for a RAID 5 array
(4 disks) writing files of 600 KB to 4 MB each?

I usually have 100 files being transferred to this array
in a 2-3 second interval.

thank you!
Oz
The "i" in RAID got redefined by the industry to mean "independent" because they got sick and tired of being questioned about their 500 to 1000 percent premium over identical hard drives on the market that don't carry name brand logos on the drives themselves.

Nice post, thanks!

The concept of using the application to split a database table can be applied to MySQL as well I believe. You can always do manual table splitting too though.
RAID industry is in sad shape
R Harris 8th May 2007
The whole array business is under significant pressure as customers are slowly
waking up to the fact that paying 5% for capacity and 95% for protection is whacked.
Yes, I get more storage, but with, say, a RAID 5, if more than one drive dies, I am dead; all of my data is gone.

RAID 1 is good, but you take a performance hit. You do get a realtime backup; of course, the downside is that viruses or system changes can negate the backup, rendering it useless.

RAID 0: good for speed, but if you lose a disk, you're dead in the water. Great for log files or gaming directories.

I like something I recently found called XRaid, which keeps each disk as its own entity and, if one fails, can rebuild it like a RAID 5. If more than one dies, you're out of luck, but at least the data on the remaining drives lives on.

The problem right now is that we have so much storage space, and if you want backup, you either buy two or buy ridiculously expensive backup devices. It's a lose-lose. Hence why I like the XRaid.

Overall though, I have never had a RAID 5 die. I have lost drives in RAID 5 setups, but never the whole array. Had a scare once, but it turned out to be the controller.

Essentially it all comes down to the risk you're willing to take vs. the money you're willing to spend.
And I don't think RAID will be that popular in ten years - copies of data will be the norm.

Why I think that is another post.

Thanks for writing.

Robin