Observing reliability of journalling filesystems

Recently I helped install a series of large digital displays, each driven by an attached Windows 7 box that sends video to the display. When the project started, we decided to use Windows 7 on the player boxes because it was the latest version of Microsoft's operating system. The thought was to avoid having to go back and upgrade them in 1-2 years. Plus, they were being installed in an organization already heavily invested in Windows. Looking back at this decision now, I am not so sure it was the best choice.

For the 2 systems running XP, there have been zero problems. But with 3 of the 7 systems running Windows 7, we've had intermittent issues, especially after a power outage, that leave Windows halted at the repair screen. The error doesn't give any useful information about the problem either. It simply states "Status: 0xc000000e. Info: The boot selection failed because a required device is inaccessible". So, does anybody care to guess what the "required device" would be? On the upside, we are able to boot the systems back up, but it requires somebody to get to the player box (some of which are only accessible by ladder) and plug in a keyboard to choose the option to start Windows normally, skipping the repair process since it cannot start anyway. We also observed that immediately after booting normally, Windows forced a checkdisk on the NTFS filesystem and rebooted once more when that finished. No events related to this issue appeared in the event log, other than the messages about services being started when the machine booted.

Since this issue has happened to us several times on these player PCs, mostly when there seems to have been a power outage, it makes me wonder if the culprit isn't the NTFS filesystem again, especially since checkdisk was forced to run. All hardware has been powered off to drain the electrical charge, so my best guess is that these issues are strictly software related. Then I had to ask myself: if NTFS is a journalling filesystem, why do we need to run checkdisk at all? By definition, a journalling filesystem is supposed to record changes in a log before they are committed to disk. The goal is to eliminate corruption: after a crash, the system can consult the log to finish or discard any change that was interrupted, so the disk is never left in a half-updated state. So there must be a reason Microsoft thinks checkdisk is necessary for a filesystem that supposedly should not encounter corruption.
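To make the "log first, then commit" idea concrete, here is a minimal sketch in Python. It is a toy of my own and not how NTFS or any real filesystem implements its journal; the class name JournaledStore and the file names data.json and journal.log are made up for illustration. Each change is written to a journal and flushed to disk before the data file is touched. On startup, a complete journal record is replayed and a torn (half-written) record is discarded.

```python
import json
import os
import tempfile


class JournaledStore:
    """Toy key-value store that logs each change before applying it."""

    def __init__(self, directory):
        self.data_path = os.path.join(directory, "data.json")        # hypothetical name
        self.journal_path = os.path.join(directory, "journal.log")   # hypothetical name
        self.recover()

    def _read_data(self):
        if not os.path.exists(self.data_path):
            return {}
        with open(self.data_path) as f:
            return json.load(f)

    def _write_data(self, data):
        # Write a temp file and rename it so the data file is never half-written.
        tmp = self.data_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.data_path)

    def log_intent(self, key, value):
        # Step 1: record the pending change in the journal and force it to disk.
        with open(self.journal_path, "w") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def apply_journal(self):
        # Step 2: apply the logged change. Step 3: clear the journal (commit).
        with open(self.journal_path) as f:
            record = json.loads(f.readline())
        data = self._read_data()
        data[record["key"]] = record["value"]
        self._write_data(data)
        os.remove(self.journal_path)

    def put(self, key, value):
        self.log_intent(key, value)
        self.apply_journal()

    def get(self, key):
        return self._read_data().get(key)

    def recover(self):
        # Run at startup: finish an interrupted change, or drop a torn record.
        if not os.path.exists(self.journal_path):
            return
        try:
            self.apply_journal()
        except (ValueError, KeyError, TypeError):
            os.remove(self.journal_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        store = JournaledStore(d)
        store.put("playlist", "lobby.mp4")
        # Simulate a power cut: the change was logged but never applied.
        store.log_intent("playlist", "cafeteria.mp4")
        # "Reboot": a fresh instance replays the journal on startup.
        print(JournaledStore(d).get("playlist"))
```

In this sketch the simulated power cut loses nothing, because the second instance finds the pending record and applies it. A real filesystem has far more to deal with (ordering of writes, disk caches, which kinds of data are journalled at all), so treat this only as an illustration of the principle.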

On top of these issues, I've seen problems here and there, mostly on Windows Server 2003, where the NTFS filesystem became corrupted on its own and took the permissions with it, and other times where files existed but couldn't be deleted. All of this happened on servers that never had a power outage. These mysteries will need to stay mysteries for now, as solutions were never found and the company didn't have the money and resources available right away to pay Microsoft.

And why offer a repair mechanism that just doesn't work? Windows sees the drive, boots from it (it gets as far as the boot selection screen), then stops. We've seen similar issues with other desktop PCs running Windows 7 as well, but those are outside the scope of this article.

In my opinion, other filesystems such as XFS, EXT3, EXT4, and ReiserFS are better examples of true journalling filesystems. I've personally used XFS, EXT3, and EXT4 in production environments under rigorous loads, and we've never seen file corruption there like we do with Microsoft's NTFS. I highly recommend EXT4 because it is the default filesystem for many new Linux installations, and because it's open source, so there should be more compatibility in the future and none of the software patent troubles we've seen with Microsoft's FAT32 filesystem. Currently FAT32 seems to be the most widely supported filesystem because it's been around for a long time. NTFS is not as well supported outside of Windows because Microsoft has kept it closed source, making it difficult to develop applications and other operating systems that work with it. I've also been highly impressed with EXT4's speed compared with other filesystems. And defragging is generally not needed with EXT4 (nor with EXT3 or EXT2).

It's important to choose the correct filesystem (IF you have a choice), since it sits at the very core of the operating system and the entire system depends on it. Given the high overhead and unreliability of NTFS, I avoid it as much as possible for storage. Unfortunately, since Windows is proprietary, NTFS must be used there; there is no choice like there is with Linux.