Apple's new kick-butt file system
As a long time fan of Apple - I bought an Apple // in 1978 - I watch Apple's storage efforts with special interest. The least talked about addition to the next version of Mac OS X, Leopard, is notable. Especially since Microsoft's WinFS bit the dust.
Apple is doing something really cool with storage - not to discount their laudable RAID product - and that something is called ZFS. The bright side of the Leopard slip: more time to integrate ZFS, which is a Good Thing.
ZFS = non-acronym
ZFS is a very cool - and open source - file system that some smart guys at Sun built. Its tree-structured checksums eliminate most of the bit rot that afflicts Macs and PCs. When ZFS retrieves your data, you can be sure it is your data, and not the misbegotten spawn of a driver burp.
Add a disk drive to ZFS and it simply joins the pool of blocks available for storage. You don't have to manage another disk.
It is cheaper for ZFS to do a snapshot copy than it is to overwrite your data. While Time Machine doesn't require ZFS - journaling HFS+ can do it - ZFS would make it easier and perform better.
Here's some more cool stuff
Here are the highlights of some of the changes you'd see with ZFS on Leopard, the next version of the Mac OS.
No more Disk Warrior
Data corruption on PCs and Macs is a sad and stupid fact of life. Power failures, flaky RAM, poor grounding, (slowly) failing hard drives, driver glitches, phantom writes and more conspire to rot your data.
ZFS eliminates that. All blocks are checksummed and each checksum is stored in the parent block. ZFS always knows whether a block is correct or corrupt. Every block has a parent block (with one obvious exception that gets special treatment), so the entire data store is self-validating. You'll never have to wonder again if all your data is correct. It is.
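To make the parent-checksum idea concrete, here is a minimal sketch in Python. It is a toy, not ZFS's on-disk format: the `Node` class, the `verify` function and the choice of SHA-256 are all assumptions made for illustration. Each block keeps the checksums of its children, and the top block's checksum is held separately (the "one obvious exception").

```python
import hashlib


def checksum(data: bytes) -> bytes:
    # SHA-256 is a stand-in here; the real checksum algorithm is not assumed.
    return hashlib.sha256(data).digest()


class Node:
    """Toy block that stores the checksums of its children, not its own."""

    def __init__(self, payload: bytes, children=None):
        self.payload = payload
        self.children = children or []
        # Checksums are recorded once, at write time.
        self.child_sums = [checksum(c.serialize()) for c in self.children]

    def serialize(self) -> bytes:
        # A block's bytes include its children's checksums, so damage
        # anywhere below shows up when the tree is walked from the top.
        return self.payload + b"".join(self.child_sums)


def verify(node: Node, expected: bytes) -> bool:
    """Check a block against the checksum its parent holds, then recurse."""
    if checksum(node.serialize()) != expected:
        return False
    return all(verify(c, s) for c, s in zip(node.children, node.child_sums))


leaf_a = Node(b"holiday photos")
leaf_b = Node(b"tax return")
root = Node(b"root", [leaf_a, leaf_b])
root_sum = checksum(root.serialize())  # kept outside the tree in this sketch

ok_before = verify(root, root_sum)

# Simulate bit rot: the payload changes but the parent's checksum does not.
leaf_b.payload = b"tax retur\x00"
ok_after = verify(root, root_sum)
```

Because the checksum lives in the parent rather than next to the data, a block that was silently damaged cannot vouch for itself. In this sketch the walk from the root catches the altered leaf.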
No RAID cards or controllers
ZFS implements very fast RAID that fixes the performance knock against software RAID. In ZFS all writes are the fastest kind: full stripe writes. And the RAID is running on the fastest processor in your system (your Mac), rather than some 3-5 year old microcontroller.
Just add drives to your system and you have a fast RAID system. With Serial Attached SCSI and SATA drives, all you'll pay for is the drives (cheap and getting cheaper), cables and enclosures.
No more volumes
Every time you add a disk to your Mac you see another disk icon on the desktop. If you want to RAID some disks you use Disk Utility (or something) to create the volume. Slow, error-prone, confusing.
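Why are full stripe writes the fastest kind? A rough sketch, assuming plain single-parity XOR (a simplification, not ZFS's actual RAID code; the function names are made up for illustration): when the whole stripe is written at once, parity is computed from the new data alone, so nothing has to be read back from disk first.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def full_stripe_write(data_blocks):
    # Parity depends only on the data being written: no read-modify-write.
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    # XOR of every surviving block (data and parity) yields the missing one.
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


stripe = full_stripe_write([b"AAAA", b"BBBB", b"CCCC"])
rebuilt = rebuild(stripe, 1)  # pretend the second drive died
```

A partial write, by contrast, has to read the old data and old parity before it can update the parity, which is where traditional software RAID loses time.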
ZFS eliminates the whole volume concept. Add a disk or five to your system and it joins your storage pool. More capacity. Not more management.
Backup made easy
ZFS does something called snapshot copy, which creates a copy of all your data at whatever point in time you want. Copy the snapshot up to a disk, tape or NAS box and you are backed up.
Create a snapshot on every write if you want, so if your database barfs you can go back to just before it choked.
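Here is a toy illustration of why a snapshot can cost less than an overwrite. It assumes a copy-on-write design where data blocks are never changed in place. The `CowStore` class and its methods are hypothetical names, and real ZFS works on a block tree rather than a flat dictionary.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes
        self.live = {}       # file name -> block id (current view)
        self.snapshots = {}  # label -> frozen name -> block id map
        self._next_id = 0

    def write(self, name, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        # Build a new map instead of mutating, so older maps stay valid.
        self.live = {**self.live, name: block_id}

    def snapshot(self, label):
        # No data is copied: the snapshot just keeps the current map.
        self.snapshots[label] = self.live

    def read(self, name, snapshot=None):
        view = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[view[name]]


store = CowStore()
store.write("db", b"good data")
store.snapshot("before")
store.write("db", b"corrupted")

current = store.read("db")
restored = store.read("db", snapshot="before")
```

Taking the snapshot writes no data at all, while every write allocates a fresh block. The old block is already sitting there untouched, so going back to "just before it choked" is only a matter of looking at the older map.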
But that's not all!
For in-depth treatment of ZFS see here and here. Includes links to more technical info and benchmarks.
Why does Apple care?
After all, journaled HFS+ isn't perfect, but it is competitive with NTFS and the other common filesystems out there. My original thought was "here is this great free product so why wouldn't you use it."
Well, as others have noted, while plugging in a new file system isn't that hard, it does take investment, such as migration, and creating the front ends for all the cool things you can do with ZFS. Steve may not care much about plumbing but he is all over user experience. Migration in particular is difficult for home users who don't have empty external hard drives.
Now we know
The motive is clear: HDTV content to feed Apple TV. How does this impact storage?
Video downloads: big and getting bigger!
Here's how. Imagine you've built the world's largest and most successful online music store and sold billions of dollars of hardware to play that music. Each of those tracks costs $0.99 and is 3-5 MB. People can easily back them up, and even if they have a few hundred, it is maybe a GB or two. Easy to back up on a few CDs or DVDs. And they are on your iPod anyway. So HFS+ burps on your music and other than yelling at an underpaid Apple tech support guy, what are you going to do? If it wasn't backed up, whose fault is that?
Enter the terabyte media collection
Now you want to build the world's largest and most successful online video store, with DVD and HDTV quality content. You are a little ahead of the market, but that usually works out. You want people to buy movies as freely as they now do tracks. Yet there is the scale problem: movie files are 1000x the size of audio or photo files. Not only that, the studios don't want you to back them up to DVD or anything else.
"Halfway through T3 the hard drive started clicking?" iTunes music is automatically backed up if you have an iPod. Movies aren't. Movies are large - 1 to 2 GB today - and much larger with HDTV and DTS sound. If you want people to store and play movies digitally, both purchased and home video, they need safety and capacity. No disk tools. No RAID set-up. No volume management. Suddenly storage quality and ease of use becomes a critical success factor for a new billion dollar business.
ZFS is the answer
Steve Jobs has two questions. First, how can I sell more online content and equipment to play it? Second, how can I kick Microsoft's butt? By solving the high-capacity storage problem for HDTV content way better than Microsoft can, he's got a great answer to both questions. He'll never utter "ZFS" to a starstruck MacWorld audience. But he will wheel out a half dozen features, like Time Machine, based on ZFS, that will instantly become must-haves for the home digital media center.
Apple Computer had the means, ZFS; motive, a big market; and opportunity to murder the Media PC.
I expect they'll introduce it the way they did HFS+: on OS X Server. After they're confident, it will be the default file system. And the folks in Redmond will be scratching their heads once more.
Comments Welcome, As Always
I'm off to NAB and SNW this week, so look for updates on cool new stuff.
Talkback
To play on what?
Do you have an Apple TV? The AP report is flawed...
comment "mediocre quality" isn't based on that. It turns out the
AP writer was using different quality video FILES to do his
comparison.
See http://www.macdailynews.com/index.php/weblog/comments/ap_writer_criticizes_apple_tv_video_quality/
and many other reviewers who point out why the AP review was
very poorly done.
"Yawn"
Pagan jim
Apple, once again, shows off its 'smarts'
versions of everything so that they can hold us over a barrel, Apple identifies smart
open source projects and just integrates them.
Apple is just pulling further and further ahead and those NeXT engineers and the
culture that they've inculcated at Apple is brilliant. They're the envy of every tech
company today.
A few notes
2. There is always a chance of corruption, if not, a bug in zfs.
3. RAID cards, why would the cpu/microcontroller on the raid card be 3-5 years older than the CPU in a system? Would you use the same argument for the offload GPU you use for your graphics?
4. Snapshot technology is old; most filesystems have had this feature for several years. It's about time Sun got it sorted.
5. I agree with you: with higher throughput on broadband, more people will buy and download movies, and they will want to store them somewhere. However, I do think "black box" technology connected to their TVs will excel here, as it is just computer freaks who will use their PC/Mac box to manage the downloaded material.
You raise some good points
of your points, Robert.
-Bugs? True enough, and yet the huge majority of data burps are caused by flaky
hardware rather than flaky file system software. ZFS is getting wrung out on both
Solaris and Mac OS X, which cover the waterfront from high-volume desktops to
big servers.
-Old microcontrollers: OEM margins are low, so once a product is fully qualified
by major customers and the worst bugs are fixed, the vendor hates to make
changes. Add in their design cycle, even using early parts, and it is a rare RAID
controller whose CPU is less than 18 months old.
-Snapshots have been around for a long time. What's cool is that for ZFS, creating
a snapshot requires less in CPU and I/O resources than overwriting the data. This
has a number of salutary implications.
-Your mention of "black box" consumer products makes me wonder how you
would classify Apple TV?
Robin
I'm still skeptical, but interested
1. I'm not convinced of your assertion that usually it is the hardware not the software (driver). I expect it is the reverse. So while it may be reduced, I'm not sure that ZFS can eliminate corruption.
2. Nothing is free. So CRC's, checksums, XOR or other forms of ECC always come at a cost. Usually space and performance.
3. It has always been my experience both on a system and embedded level that hardware outperforms software emulation all other things being equal. It has most definitely been true of RAID and I don't see how ZFS can fix that. Or maybe I missed something. Yes, raid controller IC's and microcontrollers are going to be way slower and inferior to an advanced microprocessor, in a general sense, but they end up being significantly faster in the end. Why? Those circuits are dedicated, they only have to do a few things really well and fast. They are closer to the actual storage and know more about the hard drive, for example, the position of the head. You don't want the CPU to know these details. But a dedicated RAID controller can and does and so can optimize in a way that the CPU doesn't have time to.
That's not to say that the user experience with ZFS might not be really great, but be careful that you don't buy an argument that doesn't hold real water.
This occurred to me...
WHY does it have to be so new? Yeah, new and shiny is great, but all it is doing is managing the drives and data. What it is NOT doing is running the entire Computer, like the Main CPU is. It is why we have Graphics Cards that have their own CPU. Graphics is far more intense than a RAID controller. Basic RAID methodology has not changed, so it would seem to me a year to year-and-a-half old RAID processor design is...perfectly fine. And it is a DEDICATED CPU for that process, instead of further splitting up the Main CPU.
Another thing that came to mind:
You talked about how it just lumps drives together now to create one big "storage block" (sic). However, I don't necessarily see that as a great thing. What happens when it comes time to Upgrade one of the existing drives? How do you KNOW which files are on that drive to move them off? The nice thing about multiple drives showing up, as annoying as you have perceived it to be, is YOU KNOW WHICH DRIVE YOUR DATA IS ON. That, it seems, is very important when upgrading.
Chalk up yet another feature for Leopard
And the Apple cultists go wild lauding Jobs for another brilliant innovation.
Well you don't see Microsoft doing the same...
Please don't think that WinFS
I don't know why...
"WinFS" been promised? I do believe the original name was "Cairo" some 10 to 12
years ago? So we'll just have to wait, however long it takes, and see.
Could it be cause
Ummm No
Redmond.
Rick...
I just didn't feel like defending the fact that you are not the one to be calling someone else a zealot. I've seen your Mac wi-fi hack posts and woooeeee you get on a rampage. Take a rant and multiply it by 10. Maybe you have reached the next level above zealot.
re: WinFS is dead.
http://fishbowl.pastiche.org/2006/06/25/we_come_to_bury_winfs
gnu/linux...giving choice to the nex(11)t generation.
You REALLY don't want people to start picking apart Windows, do you?
You started the stupidity.
I get it.
You started the stupidity.
But it's not funny.
I knew you couldn't resist...
"People who live in glass houses..." "Pot, kettle, black", etc :-)
I knew you couldn't resist
In fact the industry is going wild over architecture that has been around at Microsoft for ages. Ever hear of SOA or SaaS? What architecture comes to mind...hmmm...service based software....services available to handle your code.....oh yeah, the Windows Operating System. It's built on the concept of service oriented architecture.
You know, if Apple had more than 2 products you might have some traction here, but as it stands, and as I've said before, you are holding an empty hand and bluffing every time you post.
;)
ZEALOT