RAID storage explained

 This information is also available as a PDF download.

Since I've been doing a lot of coverage of storage technology lately, both for the enterprise and for the home, I thought I should explain what RAID storage is. I won't go into every RAID type under the sun; I just want to cover the basic types of RAID and their benefits and tradeoffs.

RAID was originally defined as Redundant Array of Inexpensive Disks, but RAID setups were traditionally very expensive, so the "I" came to stand for Independent. Costs have recently come down significantly because of commoditization, and RAID features are now embedded in most higher-end motherboards. RAID was primarily designed to improve fault tolerance, offer better performance, and simplify storage management by presenting multiple hard drives as a single storage volume. Before we start talking about the different RAID types, I'm going to define some basic concepts.

Fault tolerance defined: Basic fault tolerance in the world of storage means your data is intact even if one or more hard drives fail. Some of the more expensive RAID types permit multiple hard drive failures without loss of data. There is also a more advanced form of fault tolerance in the enterprise storage world called path redundancy (AKA multi-path), which allows storage controllers and the connectors that attach the hard drives to fail without loss of service. Path redundancy isn't considered a RAID technology, but it is a form of storage fault tolerance.

Storage performance defined: There are two basic metrics of performance in the world of storage. They are I/O performance and throughput. In general, read performance is more valued than write performance because storage devices spend the majority of their time reading data. I/O (Input/Output) performance is the measure of how many small random read/write requests can be processed in a single second and it is very important in the server world, especially database type applications. IOPS (I/O per second) is the common unit of measurement for I/O performance.

Throughput is the measurement of how much data can be read or written in a single second, and it is important in certain server applications and very desirable for home use. Throughput is typically measured in MB/sec (megabytes per second), though Mbps (megabits per second) is sometimes also used to describe storage communication speeds. There is sometimes confusion between megabits and megabytes since they sound alike. For example, 100-megabit Fast Ethernet might sound faster than a typical hard drive that gets 70 MB/sec, but this would be like thinking that 100 ounces weighs more than 70 pounds. In reality, the hard drive is much faster: there are 8 bits in a byte, so 70 MB/sec is equivalent to 560 Mbps.
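The megabit/megabyte comparison is just a multiplication or division by 8. Here is a minimal Python sketch of that arithmetic; the function names are my own, chosen only for illustration.

```python
BITS_PER_BYTE = 8

def megabytes_to_megabits(mb_per_sec):
    """Convert a throughput in MB/sec to Mbps."""
    return mb_per_sec * BITS_PER_BYTE

def megabits_to_megabytes(mbps):
    """Convert a throughput in Mbps to MB/sec."""
    return mbps / BITS_PER_BYTE

# The two figures from the example above.
hard_drive_mbps = megabytes_to_megabits(70)
fast_ethernet_mb_per_sec = megabits_to_megabytes(100)

print(hard_drive_mbps)           # 560
print(fast_ethernet_mb_per_sec)  # 12.5
```

Put on the same scale, the 100-megabit link moves at most 12.5 MB/sec, well short of the 70 MB/sec hard drive.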

RAID techniques defined: There are three fundamental RAID techniques and the various RAID types can use one or more of these techniques. The three fundamental techniques are:

  • Mirroring
  • Striping
  • Striping with parity

Mirroring: Data mirroring stores the same data on two hard drives, which provides redundancy and read speed. It's redundant because if a single drive fails, the other drive still has the data. It's great on read I/O performance and read throughput because it can independently process two read requests at the same time. In a well-implemented RAID controller that uses mirroring, the read IOPS and read throughput (for two tasks) can be twice that of a single drive. Write IOPS and write throughput aren't any faster than a single hard drive because writes can't be processed independently: the data must be written to both hard drives at the same time. The downside to mirroring is that your usable capacity is only half of the total capacity of all your hard drives, so it's expensive.

Striping: Data striping distributes data across multiple hard drives. Striping scales very well on read and write throughput for single tasks, but it has less read throughput than data mirroring when processing multiple tasks. A good RAID controller can produce single-task read/write throughput equal to the combined throughput of all the individual drives. Striping also produces better read and write IOPS, though it's not as effective on read IOPS as data mirroring. You also get a large consolidated drive volume equal to the total capacity of all the drives in the RAID array. Striping is rarely used by itself because it provides zero fault tolerance: a single drive failure takes down not only the data on that drive but the entire RAID array. Striping is often used in conjunction with data mirroring or with parity.

Striping with parity: Because striping alone offers no fault tolerance, striping with parity solves the reliability problem at the expense of some capacity and a big hit on write IOPS and write throughput compared to plain data striping. Data is striped across multiple hard drives just like normal data striping, but parity data is also generated and stored on one or more hard drives. Parity data allows a RAID volume to be reconstructed if one (sometimes two) hard drives fail within the array. Parity can be generated in the RAID controller hardware or in software (driver level, OS level, or add-on volume manager) using the general-purpose processor. The hardware method results in either an expensive RAID controller or poor throughput. The software method is computationally expensive, though that's no longer a problem with fast multi-core processors. Despite the performance and capacity penalty, parity uses up far less capacity than data mirroring while still providing drive fault tolerance, making this a very cost-effective form of reliable large-capacity storage.
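To make the reconstruction idea concrete, here is a minimal Python sketch of single parity. It assumes the parity is a plain XOR of the data blocks, which is the usual approach for single-parity arrays but is not spelled out above; the block contents and the function name are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, one per data drive (illustrative contents).
data_blocks = [b"RAID", b"disk", b"data"]

# The parity block is the XOR of all the data blocks.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive: XOR the survivors with the parity
# block to rebuild the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)  # b'disk'
```

The same property explains the write penalty: every write to a data block also forces the matching parity block to be recalculated and rewritten.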

Basic RAID Levels defined

The various RAID types used in the storage world are defined by Level numbers. At the basic level, we have RAID Level 0 through 6. We also have various composite RAID types comprised of multiple RAID levels. Note that people often drop the word "Level" when referring to RAID types and this has become an accepted practice. Also note that even though same-sized hard drives are not technically required, RAID normally uses hard drives of similar size. Any implementation that uses different sized hard drives will result in wasted capacity.

RAID Level 0: RAID Level 0 is the cluster-level implementation of data striping and it is the only RAID type that doesn't care about fault tolerance. Clusters can vary in size and are user-definable but they are typically blocks of 64 thousand bytes. The clusters are evenly distributed across multiple hard drives. It's used by people who don't care about data integrity if a single drive fails. This RAID type is sometimes used by video editing professionals who are only using the drive as a temporary work space. It's also used by some PC enthusiasts who want maximum throughput and capacity.
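As a rough sketch of what "evenly distributed" means, the Python below maps a logical cluster number to a drive and a position on that drive, assuming simple round-robin placement. Real controllers may lay data out differently, and the function name is hypothetical.

```python
def locate_cluster(logical_cluster, num_drives):
    """Return (drive_index, cluster_on_drive) for round-robin striping."""
    drive_index = logical_cluster % num_drives
    cluster_on_drive = logical_cluster // num_drives
    return drive_index, cluster_on_drive

# Where the first six clusters land in a three-drive array.
layout = [locate_cluster(cluster, 3) for cluster in range(6)]
print(layout)
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Consecutive clusters sit on different drives, which is why a large sequential transfer can keep every drive busy at once, and also why losing any one drive leaves holes throughout every large file.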

RAID Level 1: RAID Level 1 is the pure implementation of data mirroring. In a nutshell, RAID Level 1 gives you fault tolerance and excellent read throughput and read I/O performance, but it cuts your usable capacity in half. This RAID level is often used in servers for the system partition for enhanced reliability, but PC enthusiasts can also get a nice performance boost from RAID Level 1. Using multiple independent RAID Level 1 volumes can offer the best performance for database storage.

RAID Level 2: RAID Level 2 is a bit-level implementation of data striping that protects the data with a Hamming error-correcting code rather than simple parity. The bits are evenly distributed across multiple hard drives and several drives in the RAID are dedicated to storing the error-correcting code, so the capacity overhead is larger than the single drive's worth used by the parity levels below. For example, with a Hamming(7,4) code, an array with 7 equal-sized hard drives will have the combined capacity of 4 hard drives. It's interesting to note that this RAID level is almost forgotten and is very rarely used.

RAID Level 3: RAID Level 3 is a byte-level implementation of data striping with parity. The bytes are evenly distributed across multiple hard drives and one of the drives in the RAID is designated to store parity. Out of an array with "N" number of drives, the total capacity is equal to the sum of "N-1" hard drives. For example, an array with 4 equal sized hard drives will have the combined capacity of 3 hard drives. This RAID level is not so commonly used and is rarely supported.

RAID Level 4: RAID Level 4 is a cluster-level implementation of data striping with parity. Clusters can vary in size and are user-definable but they are typically blocks of 64 thousand bytes. The clusters are evenly distributed across multiple hard drives and one of the drives in the RAID is designated to store parity. Out of an array with "N" number of drives, the total capacity is equal to the sum of "N-1" hard drives. For example, an array with 8 equal sized hard drives will have the combined capacity of 7 hard drives. This RAID level is not so commonly used and is rarely supported.

RAID Level 5: RAID Level 5 is a cluster-level implementation of data striping with DISTRIBUTED parity for enhanced performance. Clusters can vary in size and are user-definable but they are typically blocks of 64 thousand bytes. The clusters and parity are evenly distributed across multiple hard drives and this provides better performance than using a single drive for parity. Out of an array with "N" number of drives, the total capacity is equal to the sum of "N-1" hard drives. For example, an array with 7 equal sized hard drives will have the combined capacity of 6 hard drives. This is the most common implementation of data striping with parity.
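The Python sketch below shows one way the parity position can rotate from stripe to stripe so that no single drive holds all the parity. The particular rotation used here is an assumption for illustration only; actual controllers and software RAID implementations use several different layouts.

```python
def raid5_stripe(stripe, num_drives):
    """Return one stripe's layout: 'P' for parity, 'Dn' for data cluster n.

    Assumes the parity position starts on the last drive and moves one
    drive to the left on each successive stripe (an illustrative choice).
    """
    parity_drive = (num_drives - 1 - stripe) % num_drives
    next_data = stripe * (num_drives - 1)
    row = []
    for drive in range(num_drives):
        if drive == parity_drive:
            row.append("P")
        else:
            row.append(f"D{next_data}")
            next_data += 1
    return row

for stripe in range(4):
    print(raid5_stripe(stripe, 4))
# ['D0', 'D1', 'D2', 'P']
# ['D3', 'D4', 'P', 'D5']
# ['D6', 'P', 'D7', 'D8']
# ['P', 'D9', 'D10', 'D11']
```

Each stripe still gives up one cluster to parity, which matches the "N-1" capacity rule, but parity writes are spread over all four drives instead of piling up on one.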

RAID Level 6: RAID Level 6 is a cluster-level implementation of data striping with DUAL distributed parity for enhanced fault tolerance. It's very similar to RAID Level 5, but it uses the equivalent capacity of two hard drives to store parity. RAID Level 6 is used in high-end RAID systems, but it's slowly becoming more common as the technology becomes more commoditized. Dual parity allows ANY two hard drives in the array to fail without data loss, which is unique among the basic RAID types. If a drive fails in a RAID Level 5 array, you had better hope there is a hot spare that will restore the array to a healthy state within a few hours, and that you don't get a second failure during that recovery time. RAID Level 6 tolerates that second drive failure during recovery and is considered the ultimate RAID Level for fault tolerance. Out of an array with "N" number of drives, the total capacity is equal to the sum of "N-2" hard drives. For example, an array with 8 equal sized hard drives will have the combined capacity of 6 hard drives.
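The capacity rules for the basic levels can be collected into one small Python sketch. The function names are my own, the sketch assumes equal-sized drives, and RAID Level 2 is left out.

```python
def usable_drive_count(level, num_drives):
    """Return how many drives' worth of capacity a basic RAID level provides."""
    if level == 0:
        return num_drives          # striping only, no redundancy
    if level == 1:
        return num_drives / 2      # mirroring halves the capacity
    if level in (3, 4, 5):
        return num_drives - 1      # one drive's worth of parity
    if level == 6:
        return num_drives - 2      # two drives' worth of parity
    raise ValueError(f"RAID level {level} is not covered by this sketch")

def usable_capacity_gb(level, num_drives, drive_size_gb):
    """Usable capacity in GB, assuming every drive is the same size."""
    return usable_drive_count(level, num_drives) * drive_size_gb

print(usable_drive_count(5, 7))        # 6
print(usable_drive_count(6, 8))        # 6
print(usable_capacity_gb(5, 4, 400))   # 1200
```

The first two results match the RAID Level 5 and RAID Level 6 examples above; the third is a four-drive RAID Level 5 array built from 400 GB drives.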

RAID Level 10 (composite of 1 and 0): RAID Level 10 (sometimes called 1+0) is probably the most common composite RAID type on the market, in both the server and home/enthusiast segments. For example, there are plenty of cheap consumer-grade RAID controllers that support RAID Level 0, 1, and 10 but don't support Level 5. The most common and recommended way to combine mirroring and striping is to do the mirroring before the striping. This provides better fault tolerance because the array statistically survives multiple drive failures more often, and performance isn't degraded as much when a single drive has failed in the array. RAID Level 0+1, which does striping before mirroring, is considered an inferior form of RAID and is not recommended. RAID Level 10 is very commonly used in database applications because it provides good I/O performance when the application can't distribute its own data across multiple storage volumes. But when the application knows how to evenly distribute data across multiple volumes, independent pairs of RAID Level 1 provide superior performance.
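The "statistically survives more often" claim can be checked by brute force. The Python sketch below enumerates every possible two-drive failure and counts how often each layout survives. It assumes a simple 0+1 implementation that treats a whole stripe set as dead once any one of its drives fails; the drive numbering and function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def raid10_survives(failed, num_drives):
    """Drives (0,1), (2,3), ... are mirrored pairs, and the pairs are striped.
    The array is lost only if both drives of some pair have failed."""
    pairs = [{2 * p, 2 * p + 1} for p in range(num_drives // 2)]
    return not any(pair <= failed for pair in pairs)

def raid01_survives(failed, num_drives):
    """The first half of the drives is one stripe set, the second half is its
    mirror. The array survives while at least one stripe set is untouched."""
    half = num_drives // 2
    set_a = set(range(half))
    set_b = set(range(half, num_drives))
    return not (failed & set_a) or not (failed & set_b)

def survival_rate(survives, num_drives, failures=2):
    """Fraction of all possible multi-drive failures the array survives."""
    cases = [set(c) for c in combinations(range(num_drives), failures)]
    return Fraction(sum(survives(f, num_drives) for f in cases), len(cases))

print(survival_rate(raid10_survives, 4))  # 2/3
print(survival_rate(raid01_survives, 4))  # 1/3
```

With four drives, mirroring before striping survives twice as many of the possible two-drive failures as striping before mirroring.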

Topics: Data Centers, Hardware, Storage

Talkback

19 comments
  • RAID water cooler

    This image explains it more simply:

    [img=http://www.edvt.net/Pictures/raid.jpg]
    justgold79
    • Thanks, I've seen that before - nt

      nt
      georgeou
  • Go Stuff

    Per chance, is there a pdf edition for this?
    Fine work George
    D T Schmitz
    • Thanks, we'll get a PDF for TechRepublic tomorrow sometime

      The staff is off today.
      georgeou
      • RE: RAID storage explained

        Thank you so much, that made my research much easier. Thank you so much.
        Boineelo moyo
  • Thanks George

    Even though I've read this in a bunch of different places, I still keep reading it to better understand it. One of these days I'll get my hands completely around this.

    Dan
    DanLM
    • Any thing else I can help explain better?

      Any thing else I can help explain better?
      georgeou
      • Thanks George

        I have also been trying to get my mind around all these issues before building new PCs and maybe a server for home use. Nothing annoys me more than a slow PC. Sometimes I even long for the good old days of DOS.

        Your articles on junk services, RAID and hardware builds are helping a great deal.

        My questions still center around an ideal partitioning scheme for a home PC. Should the OS and data be separated? Are the IO characteristics of OS and data sufficiently different to justify different RAID levels? Should it all be wrapped up in one RAID partition?

        My personal point of view is that I will spend a fair amount of money for a significant increase in performance. Data redundancy is very important but OS redundancy isn't.
        kmatzen9
        • OS and Data should always be separated

          OS and Data should always be separated for manageability. RAID covers you for hardware failure but not for OS/Data corruption. That corruption could be due to stupid human tricks and malware. The "previous version" feature in Vista and Windows Home Server can help alleviate this problem though, but you still need a good backup. I like being able to recover my OS/Applications partition without having to restore all of my data.

          I prefer to use 2 drive volumes (physical preferred but logical partition is ok) at a minimum. I keep an OS/Applications volume and a Data volume. I don't believe in overdoing partitions. I will keep a third 80 GB partition for game installs because they're ridiculously large at 8 GB per game. All other applications are installed in the OS/Applications volume. I always take a Ghost or PQDI image of my OS/Apps volume. Vista Ultimate has its own backup tool that lets you back up your drives into VHD files and those can be recovered with the Windows Vista install DVD.

          The Intel ICH7R, ICH8R, and the new ICH9R let you create RAID sets from just the first part of the drive. So you can create a RAID1 set from 2 HDDs with 64 GB for the OS/Application partition and leave the remaining 400 GB on each of the two drives independent. RAID1 actually boots your PC faster because it can process two read requests at the same time. I keep a "brief case" that manually synchronizes my files for my data since that's an actual backup and not just hardware redundancy. Of course, it's feasible to just use a RAID-5 with 3 or more HDDs in a home system and make the whole thing fault tolerant.
          georgeou
          • Separation

            I can not explain this any clearer to my clients. I always have a separate drive(s) to keep any and all data away from the OS(s). After installation most people can't even tell the difference. It's simply best practices to at least have a separate partition in the case of malware or virus.
            majoritywhip
          • "brief case"?

            By "brief case" do you mean a cross sync of the two independent partitions or synchronizing with an external drive?
            kmatzen9
          • It's an old synchronization feature from Win9x days

            It's an old synchronization feature from Win9x days. You create a "brief case" and you drag folders in to it. It replicates the changes. It's not an ideal solution but it works. It's not the most usable solution and it's not offline backup. Using Windows Home Server to do cluster-level incremental backup/synchronization is definitely a better way to go.
            georgeou
      • RAID History

        As I remember it, the "I" started as Independent and then changed to Inexpensive as the prices fell. I also remember Storage Tek's development of Iceberg, a RAID 5 implementation with nine drives, eight of which each held one bit of the byte, with the ninth for parity. A drive could be unplugged while running and the data rate would be virtually unaffected.
        earlkaplan
        • Vendors prefer independent so they can sell 500-1000 percent premiums

          Vendors prefer the term "independent" so they can sell 500-1000 percent premiums on hard drives. Enterprise RAID storage was anything BUT inexpensive. Enterprise-grade RAID solutions could be built very cheaply, but you're always going to pay a 500 to 1000 percent premium when you go with a name brand storage vendor.

          So a normal enterprise class 15K RPM hard drive might cost $400, the big named vendors will slap their own label and mounting kit on the same identical hard drive and charge you $2K or $4K for the same hard drive.
          georgeou
          • Inexpensive vs Independent

            The term was originally inexpensive.

            At the time RAID was created, and even now, the cost to use any old "inexpensive" drive lying around to make a larger array of storage is usually cheaper than buying the larger size of the day.

            example, 400 gig seagate drives are available online for 99 dollars.

            for 160 dollars I can now get a 500 gig drive, for $240 I can get a 750 gig drive.

            but, for $200, I can buy 2 of the 400 gig drives and make an "800" gig raid 0 array. All drives being equal, I'd only need a decent hardware controller (built in now) to make that work, and usually the hardware add on controllers allow more than two drives..

            So lets expand ... that 750 gig drive has no redundancy. To *mirror* in that situation I would have to have TWO of them, and that's 480 dollars, and still only 750 gigs of storage.

            For 400 dollars, I could buy 4 of the 400 gig drives, and get 1200 gigs of storage (yes 1.2 tera). So now, for less money, I've got 4 drives in raid 5 config, with single failure redundancy. And if the hardware controller is worth anything, I won't see a big performance hit on the store and will greatly benefit on read operations.

            That's more storage, with failure tolerance, for less money.... That .. is inexpensive.

            We could apply the same thing to the Seagate Cheetah drives... 10k and 15k drives are vastly different in price ... a 10k drive is $250 for 147 gigs, and a 15k drive is $1095. For 1000 dollars, I could buy *four* of the 10k drives, in raid 0+1 and theoretically out perform the single 15k drive, and have redundancy.

            ie, striping across two 10000 rpm drives should be faster (marginally) than a single 15k rpm drive. add in the fact I've doubled storage, and given redundancy for the price of *one* of the 15k drives... THAT is inexpensive!

            :)
            TG2
  • Nice explanation

    George. One question though - why did Level 2 disappear? Was it slower/harder to implement/invalidated by larger size operands/other ??

    Also - using RAID partially with multi-boot setups, is it practical to run a single boot drive with multi-systems, then put all data on a RAID array? Or is there an advantage to the OS being RAIDed as well? (and will RAID hinder multi-booting at all!)

    Thanks for the update - we need to see these things every once in a while! (mostly remember using NetWare to handle the mirroring for me.. :) )
    Freebird54
    • RAID Level 1 mirroring will enhance boot times

      RAID Level 1 mirroring will enhance boot times because a mirrored volume has the ability to do two independent seeks at once. This improves read I/O performance greatly and speeds up the OS.

      RAID Level 5 for everything also works fairly well if you're not trying to access the data partition at the same time you access the OS/SWAP partition. RAID Level 5 also has better I/O and throughput than a single hard drive and should speed up OS boot times.

      RAID Level 2, 3, and 4 were all deprecated because they don't perform as well as RAID Level 5. Cluster-level striping with parity distribution just works much better.
      georgeou
  • Love it

    George,
    Nicely done on explaining all this stuff! I've run across more than one vendor who doesn't know RAID from Raid! Thanks.

    -MC
    Mercutio_Viz
  • more about RAID Level zero

    I wrote a blog posting about Raid Zero and how a couple people
    were using it without being aware of the dangers.

    http://www.cnet.com/defensive-computing/8301-13554_1-9740476-33.html?tag=head
    Michael Horowitz