Future File Systems: Btrfs and ZFS

September 18th, 2009 7:51 pm
Posted by Douglas Eadline

The prediction is in. What file system will move us into the future?

The thirst for better file system technology is not new to the Unix/Linux world. There is a rich history of trying to squeeze better performance and reliability out of storage systems in demanding user environments, and the efforts to build a better file system are numerous and build on the work of many people. For example, Kirk McKusick's original Berkeley Fast File System improved on the original V7 file system. Stephen Tweedie's ext3 took ideas from database logging and Margo Seltzer's LFS and added them to ext2, Linux's UFS-like file system developed by Rémy Card, Theodore Ts'o, and others. In the meantime, DEC released Megasafe, SGI released XFS, and Sun released ZFS, all into the wild. And now Oracle has developed Btrfs for Linux.

So why should we as users care? There had better be a good reason to change file systems, because a new file system usually means converting your data to, and trusting, a new on-disk format. Thus, any new format must provide a compelling benefit or solve a big problem. Otherwise, what is "good enough and works" is often better than what is "new and fancy."

If you follow the details of file system development, this quick update may not be of interest to you. For the rest of us, who just accept whatever file system the installer offers, read on, because changes are afoot.

If you are like me, you are probably running Linux with the ext3 file system. There is nothing wrong with ext3: it is stable, robust, and a standard Linux file system. And, one other thing, it is old. Even if you are running the newer ext4, you are still running a file system whose on-disk design goes back some 30 years and is more than a little short on features.

There are those who believe ext4 will be the end of the ext line and that a switch to Btrfs is very likely. Btrfs (pronounced "butter-F-S") is being developed by Chris Mason at Oracle. It is an open source project that was recently added to the Linux kernel (as of 2.6.29) as experimental code.

Btrfs is based on several newer ideas, including B-trees (which is where the "btr" in Btrfs comes from) and "copy-on-write," or COW. I won't go into the details here, but B-trees and COW enable features that would be difficult to retrofit onto the ext* line of file systems. (If you want to learn more about the technical details of Btrfs, see A short history of btrfs on lwn.net.) Some of the new features include file-system snapshots, checksumming, online defragmentation, compression, extents, online resizing, and more. In particular, Btrfs allows one thing that has been difficult to achieve in the past -- optimizing for both access time and disk space.
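To make the copy-on-write idea a little more concrete, here is a toy Python sketch. It is not Btrfs code, and it uses a simple binary search tree instead of the on-disk B-trees Btrfs actually uses, but it shows the key trick: an update copies only the nodes on the path back to the root, so an old root pointer keeps describing the unmodified tree and acts as a nearly free snapshot.

```python
# Toy illustration of copy-on-write (COW) updates -- the idea behind
# cheap Btrfs/ZFS snapshots. This is NOT Btrfs code: it uses a plain
# binary search tree for simplicity, where Btrfs uses on-disk B-trees.

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)          # nodes are immutable: never updated in place
class Node:
    key: str
    value: bytes
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: str, value: bytes) -> Node:
    """Return a new root; only nodes on the path to 'key' are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    return Node(key, value, root.left, root.right)    # replace this key's data


def lookup(root: Optional[Node], key: str) -> Optional[bytes]:
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


if __name__ == "__main__":
    live = None
    for name in ("etc/fstab", "home/doug/notes.txt", "var/log/messages"):
        live = insert(live, name, b"version 1")

    snapshot = live                                   # a snapshot is just the old root
    live = insert(live, "home/doug/notes.txt", b"version 2")

    print(lookup(snapshot, "home/doug/notes.txt"))    # b'version 1' -- snapshot untouched
    print(lookup(live, "home/doug/notes.txt"))        # b'version 2' -- live tree updated
```

Because unchanged subtrees are shared rather than copied, taking a snapshot amounts to little more than saving a root pointer, which is roughly why COW file systems can offer snapshots so cheaply.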

The fact that Oracle sponsors Btrfs has led to some concern. Oracle recently purchased Sun Microsystems, which has been developing the ZFS file system for many years. ZFS is similar to Btrfs (it also uses COW) and provides many of the same features, but its internal implementation is very different. ZFS will also "run" under Linux using FUSE. Mason and others have assured the community that Btrfs is important to Oracle and that development will continue. In addition, the open source nature of Btrfs ensures that it cannot be "taken away," now or in the future.

There is plenty more to consider, and I suggest reading Linux Don't Need No Stinkin' ZFS: BTRFS Intro & Benchmarks by my friend Jeff Layton. The consensus seems to be that Btrfs is destined to become the default Linux file system within two years. ZFS, on the other hand, must overcome some licensing issues before it can even make it into the Linux kernel for testing. Your next Linux install may offer a new and better (or "btr") file system than in the past.

Comments

Comment from robheus
Time March 17, 2010 at 6:04 am

I think the concept of a traditional file system and an (object-)relational database management system will at some point need to merge into something that might be called an object-relational file system, which stores all data, both at the operating system level and at the application level, in object-relational form.
No more walking through directories to find or search your files/data: just enter a query with appropriate search criteria to retrieve your data, since all kinds of associative information can be stored together with your data/files for easy retrieval. An automatic versioning system could easily be included as well.
Why doesn't an operating system/file system already implement that?
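For what it's worth, the core of the commenter's idea can be sketched in a few lines of Python with SQLite: keep associative metadata in a relational store and query it instead of walking directories. The schema, paths, and tag names below are invented purely for illustration; this does not describe any existing file system.

```python
# Rough sketch of a query-based file store: metadata lives in a
# relational database and files are found by query, not by walking a
# directory tree. All paths and tags here are invented for illustration.

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (path TEXT PRIMARY KEY)")
db.execute("CREATE TABLE tags (path TEXT, key TEXT, value TEXT)")


def add_file(path, **tags):
    """Register a file along with arbitrary key/value metadata."""
    db.execute("INSERT INTO files VALUES (?)", (path,))
    db.executemany("INSERT INTO tags VALUES (?, ?, ?)",
                   [(path, k, v) for k, v in tags.items()])


add_file("/data/run42.dat", project="beowulf", kind="benchmark", year="2009")
add_file("/data/notes.txt", project="beowulf", kind="notes")

# "Find the benchmark data for the beowulf project" -- no directory walking.
rows = db.execute("""
    SELECT DISTINCT f.path
    FROM files f
    JOIN tags a ON a.path = f.path AND a.key = 'project' AND a.value = 'beowulf'
    JOIN tags b ON b.path = f.path AND b.key = 'kind'    AND b.value = 'benchmark'
""").fetchall()
print(rows)   # [('/data/run42.dat',)]
```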

Author Info


Dr. Douglas Eadline has worked with parallel computers since 1988 (anyone remember the Inmos Transputer?). After co-authoring the original Beowulf How-To, he continued to write extensively about Linux HPC clustering and parallel software issues. Much of Doug's early experience has been in software tools and application performance. He has been building and using Linux clusters since 1995. Doug holds a Ph.D. in Chemistry from Lehigh University.