Reiser FS: The open source file system fallout

Yesterday, the Open Source community took an emotional hit when veteran Linux programmer Hans Reiser was convicted of first degree murder in the suspicious disappearance of his wife, Nina. I won't go into the details of the case, which has been covered extensively in the press. Instead, I would like to talk a little bit about how this verdict will affect the contest for file system dominance in our favorite Open Source operating system, Linux.

While Namesys' ReiserFS, of which Hans Reiser was the primary programmer and lead designer, was not the predominant journaled file system used on Linux systems, it was praised for its stability and performance. It was, and still is, the default file system on the second most popular enterprise Linux distribution, SuSE Linux Enterprise Server (SLES). ReiserFS was also included in the "upstream" Linux kernel maintained by Linus Torvalds because it shares the same license, GPL version 2. ReiserFS is popular on Debian-based systems as well.

SuSE and Debian use ReiserFS version 3, a stable and proven version of the code that has been sitting mostly fallow for some time and is maintained with bug and security fixes on a best-effort basis. Prior to the whirlwind and highly publicized trial, Hans Reiser and his small team were working on Reiser4, but much of this development ground to a halt due to his legal woes. The project is more than likely to die an unfortunate death now that its lead programmer faces a prison sentence of 25 years to life.

From the SuSE and Debian perspective, this is an obviously unacceptable state of affairs. The OpenSuSE project has already moved its distribution to the ext3 file system, the most common one used on Linux systems. However, ext3 is far from the fastest or most reliable of the Linux file systems, and development of its replacement, ext4, is already underway. Ted Ts'o, one of the original programmers of ext3 and its predecessor, ext2, is leading the ext4 project and is employed by IBM. ext4 shows a great deal of promise, but it is unlikely that the main commercial enterprise distributions -- such as RHEL and SLES -- will adopt its initial release for production use.

So the question is, what do we replace ReiserFS with? It's not as if there is a lack of good, high-performance journaled file systems on the Linux platform. IBM's JFS2, which is extensively used on its AIX UNIX derivative on the pSeries hardware platform, has been battle tested on midrange and high-end enterprise computing for many years and has already been ported to Linux. It also has the benefit of being licensed under GPL2 and is already incorporated into the mainline Linux kernel. However, the project has not seen a new release since June of 2004 or a new user space utilities release since August of 2007, so it would take a crash program to get development on Linux rolling again. IBM has not made any statements regarding the future development of JFS2 on Linux, nor has it made any commitments as of yet to using or incorporating ext4 in AIX, so the question of what will end up being the eventual pSeries standard for both Linux and AIX remains unanswered.

Perhaps some of the most compelling filesystem technology comes from Sun, in the form of its Zettabyte File System (ZFS). ZFS was introduced with Solaris 10 and was ported to Linux during Google's "Summer of Code" project in 2006. As of March 2008, the Linux implementation is still in Beta. Traditional file systems such as ext3, JFS, or ReiserFS need a "Volume Manager" such as LVM to span more than one device on a single system. ZFS filesystems, by contrast, are built upon an abstraction layer called "zpools", which are virtual devices that talk directly to the physical block devices. Zpools can span not just multiple devices, but also multiple computers on a network. ZFS's 128-bit implementation also gives it a much larger theoretical upper limit in storage capacity than 64-bit file systems -- it can store 18 billion billion (18.4 × 10^18) times more data.
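For readers who want to check the "18 billion billion" figure, here is a minimal sketch of the arithmetic in Python. It assumes only that capacity scales with the size of the address space, so the ratio between a 128-bit and a 64-bit design is 2^128 divided by 2^64. The function name capacity_ratio is made up for this illustration and has nothing to do with the ZFS code itself.

```python
# Rough arithmetic check: how much larger is a 128-bit address space
# than a 64-bit one? (Illustrative only; capacity_ratio is a made-up helper.)

def capacity_ratio(wide_bits: int, narrow_bits: int) -> int:
    """Return how many times larger a wide_bits space is than a narrow_bits space."""
    return 2 ** wide_bits // 2 ** narrow_bits

ratio = capacity_ratio(128, 64)

print(ratio)           # exact integer value, equal to 2**64
print(f"{ratio:.3e}")  # same value in scientific notation
```

The result is 2^64, or roughly 1.84 × 10^19, which is the same number as 18.4 × 10^18.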

Unfortunately, ZFS is currently licensed under Sun's CDDL Open Source license, which is incompatible with GPLv2, the license that the Linux kernel uses -- so at this time, it runs as "userspace" code and is not integrated into the upstream kernel source. Sun has made various allusions to the possibility of GPLing ZFS and other parts of the Solaris source code, but it is likely they will use the GPLv3 license -- and Linus has no intention, as of yet, of migrating the kernel to the latest version of the license. Until this impasse is broken -- for example, by Sun licensing ZFS under GPL version 2 (as it does with Java and Glassfish) -- it is unlikely we will see this code integrated at the kernel level and widely used among Linux distributions.

There is another option which many may not have anticipated, or may have dismissed outright -- native implementation of Microsoft's NTFS journaling file system. Currently, NTFS is implemented in Linux using NTFS-3G, which is licensed under GPL. (EDIT: NTFS-3G uses the FUSE kernel module, which mounts NTFS in "userspace", not as a native implementation loaded directly by the kernel as a module, as mentioned previously.) However, no Linux distribution uses it as a primary file system, with the exception of the recently released Ubuntu 8.04, and then only under the WUBI implementation, where it virtualizes an ext3 file system within an NTFS "container". NTFS-3G was developed using clean room reverse engineering techniques, and while the driver appears to be quite stable, it is not considered to be enterprise-worthy.

Even so, initial performance tests conducted in 2007 indicate that with continued development, NTFS may actually have some potential as a mainstream Linux filesystem. To get any sort of community buy-in, though, the full Microsoft specifications for the filesystem primitives and metadata would have to be released -- and perhaps even a full GPL2 release of Microsoft's actual NTFSv5 drivers on Windows Server 2008. While this sounds like something of a pipe dream, Microsoft has already stated its intention to cooperate with the Open Source community by announcing its Interoperability Principles in February of 2008, and has already published 45,000 pages of documentation detailing the functions and workings of its network protocols, so such a gesture is not completely out of the question.

Now that the fate of ReiserFS appears to be sealed, which Open Source filesystem shall reign supreme? Talk Back and Let Me Know.