Btrfs hands on: Exploring RAID and redundancy

A look at the RAID capabilities of the btrfs Linux filesystem
Written by J.A. Watson, Contributor

In the first post on this subject I discussed btrfs basics, showing how to create simple btrfs filesystems. 

In the second post, more on btrfs, I showed how btrfs filesystems can be dynamically resized, and can span multiple partitions and multiple devices. Please remember that I am only trying to give an overview of btrfs functionality and use in these articles; the authoritative source for information remains the Btrfs Wiki.

Btrfs is a new file system for Linux, one that is still very much in development. Although I wouldn't exactly describe it as "experimental" any more, it is, as stated in the Wiki at kernel.org, "a fast-moving target."

This time I want to take a look at the RAID capabilities of btrfs. To do this, I need to depart from my usual test systems, which are a variety of laptops and netbooks, because RAID is most interesting when you have more than one physical disk across which to distribute and replicate data.

For this purpose I dragged out an old Dell Dimension E521 desktop system, simply because it happens to have two disk drives already installed (one 250GB and one 750GB). 

It has a dual-core AMD Sempron CPU and 4GB of memory, and originally came with Windows XP loaded — I believe it was also possible to upgrade to Vista, but that was the end of the road for Windows on it. 

I have reloaded it with openSUSE 13.1, since that was what I was using in the first two posts. As usual, Linux installed and ran without a hitch, and in the interest of consistency I configured it as a pure-btrfs system. No worries.

I set it up on the 750GB disk drive with a 32GB root and 64GB /home, which leaves a bit less than 600GB on that drive, and of course there is about 232GB of usable space on the second drive. I then created two unformatted 128GB partitions, one on each disk drive — CLI users can do this with fdisk, and those who prefer a GUI can use gparted or the like.  Shown graphically, what I ended up with was this:

[Image: The system disk]
[Image: The second disk]

Please excuse me for this being such a totally lame configuration. My intention here is to show the simplicity of the btrfs commands, not to get into a long dissertation about optimal disk configurations. Also, please keep in mind that btrfs is perfectly happy to use an entire unpartitioned disk drive (and in many situations this will be the preferred approach).

For those who are not familiar with RAID terminology or levels, here is a quick-and-dirty summary:

  •  RAID0: No redundancy, data is "striped" (distributed evenly across multiple devices)
  •  RAID1: Data is "mirrored" (copied identically to two devices)
  •  RAID10: Also known as RAID1+0, combines the previous two, requires at least 4 disks
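
To make the trade-off between these levels a bit more concrete, here is a minimal shell sketch that estimates how much usable space each one leaves. The function name usable_gib is my own, made up for illustration. It assumes that all devices are the same size and it ignores metadata overhead, so treat the results as rough estimates rather than what btrfs itself will report.

```shell
# Rough usable-capacity estimate for COUNT equally sized devices.
# Assumptions: every device has the same size, metadata overhead is ignored.
# Usage: usable_gib PROFILE SIZE_GIB COUNT
usable_gib() {
    profile=$1
    size=$2
    count=$3
    case "$profile" in
        single|raid0)
            # No redundancy: all of the raw space is usable.
            echo $(( size * count ))
            ;;
        raid1|raid10)
            # Every block is stored twice, so half the raw space is usable.
            echo $(( size * count / 2 ))
            ;;
        *)
            echo "unknown profile: $profile" >&2
            return 1
            ;;
    esac
}

usable_gib raid0 128 2     # two 128GiB partitions, striped
usable_gib raid1 128 2     # the same two partitions, mirrored
usable_gib raid10 128 4    # four 128GiB devices
```

The three calls print 256, 128 and 256: mirroring two 128GiB partitions costs half of the raw space in exchange for a second copy of everything.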

So, with my two partitions ready for use, all I have to do is create a filesystem that spans both of them:

    mkfs.btrfs /dev/sda4 /dev/sdb1

    WARNING! - Btrfs v0.20-rc1+20131031 IS EXPERIMENTAL
    WARNING! - see http://btrfs.wiki.kernel.org before using

    adding device /dev/sdb1 id 2
    Turning on extended refs (higher hardlink limit)
    fs created label (null) on /dev/sda4
        nodesize 4096 leafsize 4096 sectorsize 4096 size 256.00GiB
    Btrfs v0.20-rc1+20131031

After the filesystem has been created, I can mount it:

    mount /dev/sda4 /mnt

Note that for the btrfs utilities, the names of the partitions which make up the filesystem are equivalent, so I could just as easily have said:

    mount /dev/sdb1 /mnt

Once the filesystem is mounted, I can check the structure of it.  The command in this case must be given the pathname of a mounted btrfs filesystem:

    btrfs filesystem df /mnt

    Data, RAID0: total=2.00GiB, used=0.00
    Data: total=8.00MiB, used=0.00
    System, RAID1: total=8.00MiB, used=4.00KiB
    System: total=4.00MiB, used=0.00
    Metadata, RAID1: total=1.00GiB, used=24.00KiB
    Metadata: total=8.00MiB, used=0.00

What has happened here is that mkfs.btrfs, by default, created a filesystem which spans two partitions on two physical devices. It has made the data RAID0, so the data will simply be distributed as evenly as possible between the two devices with no redundancy, while the system and metadata are RAID1, meaning they will be duplicated on both devices.
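
If you want to pull just the RAID level for each type out of that listing, a small awk filter will do it. This is only a sketch: the function name df_profiles is my own, and reporting lines that carry no level (such as "Data: total=8.00MiB") as "single" is my assumption about what this version of the tools means by them.

```shell
# Print "TYPE PROFILE" for each line of `btrfs filesystem df` output.
# Assumption: a line with no profile after the type name is a "single" chunk.
df_profiles() {
    awk -F: '
        /^[ \t]*$/ { next }                  # skip blank lines
        {
            sub(/^[ \t]+/, "", $1)           # drop leading indentation
            n = split($1, part, ", ")        # "Data, RAID0" -> "Data", "RAID0"
            profile = (n > 1) ? tolower(part[2]) : "single"
            print part[1], profile
        }'
}

# Sample input copied from the listing above. On a live system this would be:
#   btrfs filesystem df /mnt | df_profiles
df_profiles <<'EOF'
Data, RAID0: total=2.00GiB, used=0.00
System, RAID1: total=8.00MiB, used=4.00KiB
Metadata, RAID1: total=1.00GiB, used=24.00KiB
EOF
```

For the sample input this prints "Data raid0", "System raid1" and "Metadata raid1", one per line.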

I can force higher or lower levels of redundancy (RAID levels) with command line options to mkfs.btrfs: "-d xxxx" sets the RAID level for data, and "-m xxxx" sets it for the metadata. The options range from single, which disables all RAID function and gives a filesystem which simply spans partitions and/or devices, to RAID10, which gives striping (balancing) and duplication, but which requires a minimum of four devices.
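
Since a wrong device count is the easiest mistake to make with these options, here is a small sketch of a dry-run helper that checks the count and prints the mkfs.btrfs command line without running it. The names min_devices and mkfs_cmd are hypothetical, and the minimum device counts are simply the ones described in this article, hard-coded rather than queried from btrfs.

```shell
# Minimum device counts as described in this article (hard-coded assumption).
min_devices() {
    case "$1" in
        single)      echo 1 ;;
        raid0|raid1) echo 2 ;;
        raid10)      echo 4 ;;
        *)           return 1 ;;
    esac
}

# Usage: mkfs_cmd DATA_PROFILE META_PROFILE DEVICE...
# Prints the mkfs.btrfs command line instead of executing it.
mkfs_cmd() {
    data=$1
    meta=$2
    shift 2
    for p in "$data" "$meta"; do
        need=$(min_devices "$p") || { echo "unknown profile: $p" >&2; return 1; }
        if [ "$#" -lt "$need" ]; then
            echo "$p needs at least $need devices, got $#" >&2
            return 1
        fi
    done
    echo "mkfs.btrfs -d $data -m $meta $*"
}

mkfs_cmd raid1 raid1 /dev/sda4 /dev/sdb1
```

With my two partitions this prints a command line that mirrors both data and metadata, while asking for raid10 on the same two partitions is rejected with an error before anything gets near the disks.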

If I wanted to force duplication of my data, I could add "-d raid1" to the original command to create the filesystem:

    mkfs.btrfs -d raid1 /dev/sda4 /dev/sdb1

Then mount it, and check the structure:

    btrfs filesystem df /mnt

    Data, RAID1: total=1.00GiB, used=0.00
    Data: total=8.00MiB, used=0.00
    System, RAID1: total=8.00MiB, used=4.00KiB
    System: total=4.00MiB, used=0.00
    Metadata, RAID1: total=1.00GiB, used=24.00KiB
    Metadata: total=8.00MiB, used=0.00

For comparison, and to fill in a few details about what was actually happening in the first two posts on this subject, I went back and created a simple btrfs filesystem using only one partition. When I check the structure of that, it is:

    btrfs filesystem df /mnt

    Data: total=8.00MiB, used=256.00KiB
    System, DUP: total=8.00MiB, used=4.00KiB
    System: total=4.00MiB, used=0.00
    Metadata, DUP: total=1.00GiB, used=24.00KiB
    Metadata: total=8.00MiB, used=0.00

The important thing to note here is that the system and metadata show DUP, which means that they will be duplicated within the filesystem even though they cannot be distributed across different devices. If I subsequently add a second partition or device to this filesystem, these characteristics will remain the same.

OK, let's summarise all of this with regard to multi-partition/multi-device btrfs filesystems. At one extreme, I can create a btrfs filesystem which does nothing more than span two or more partitions or devices, with no redundancy or balancing of data; I can use the "-d single" and "-m single" command line options to force this configuration. Next, if I have at least two separate disk drives I can have RAID0 "striping" (balanced distribution across multiple devices), or I can have RAID1 "mirroring" (duplication across multiple devices). If I want both of these, that is RAID10 (or RAID1+0), and for that I have to have at least four separate disk drives.

Btrfs also supports RAID5 and RAID6, which use parity data to increase data robustness, reliability and recoverability.  I have chosen not to include those levels in this brief overview, because they are much less commonly used in desktop systems and by casual users.

So, after three posts that's about it for the creation and management of btrfs filesystems.  Still to come is (hopefully) just one more post, concerning subvolumes and snapshots.
