History of ZFS – Part 1: The Birth of ZFS
This is part of our article series published as “History of OpenZFS”.
The Zettabyte File System (commonly known as ZFS) is approaching its 15th birthday, and it has been over a decade since its integration into FreeBSD. Originally created by Sun Microsystems in the early 2000s, ZFS grew in popularity because its advanced features fulfilled a previously unmet need. Today we will take a look at the birth of this revolutionary file system.
A Problem in Search of a Solution
Before the birth of ZFS, the state of the art in file systems was not pretty. Data was in constant danger of being lost or corrupted due to “bit rot, disk and controller firmware bugs, flaky cables, etc”. In fact, one of the creators of ZFS said that the goal of ZFS was to “end the suffering of system administrators.”
The seeds for ZFS were planted by such suffering. In the late 1990s, Sun Microsystems was using a server named Jurassic to store the engineers’ data. This particular server was running the Unix File System or UFS as the file system, and the Solaris Volume Manager to handle management of the disks. One day it all came crashing down.
An engineer named Mike Sullivan was performing routine maintenance on Jurassic when he made a typo. With that keystroke, the server went down and a thousand engineers were left with nothing to do until the data was restored from backups.
Jeff Bonwick, one of the creators of ZFS, explains what happened. “Everything came crashing down because the device mapping was now wrong. So, UFS was seeing wrong blocks from the file system.” “SVM like a lot of these software based volume management tools, they had their own layering, they had their own idea of labels. If it got confused about which devices were part of the RAID group and which weren’t, then everything was all over. You might have your data there. But if the label wasn’t there and it couldn’t figure it out, your data was essentially all gone.”
While the restore was underway, Bonwick wandered into another Sun employee’s office. He and Tim Marsland started to talk about a subject that was foremost on their minds at the moment: Sun’s inability to create a new file system. Over the years, Sun had made several attempts to create a new file system, but always ended up pulling the plug on the projects before they were finished.
The two went on to discuss what a file system should look like. Basically, it should be easier. Adding storage should be just like adding memory to a system: you add it and reboot. The system simply starts using the added memory and everything works faster.
As Bonwick said, “There’s no DIMM config that you have to run. You don’t create virtual DIMMs. There’s no DIMM management software…Why can’t you just treat your disk as a pool of storage and you allocate from them with a memory allocator…That became the SPA, the Storage Pool Allocator…Just like in VM system, where you have an MMU (Memory Management Unit) to manage translations of things from virtual to physical. We could have a Data Management System.”
Unfortunately, nothing came out of this conversation at the time. Both engineers went back to their respective projects after the data was restored.
A Failed Start
Around the year 2000, yet another file system project bit the dust at Sun. Bonwick decided that he wanted to take a crack at the idea himself. He said that he got so mad at the current file system options that he wanted to try and fix the problem.
Bonwick went to Mark Himmelstein, director of the Operating System Group. He asked Himmelstein for a small team of “five or six people in the same building” to work on a new file system. Himmelstein thought about it and told Bonwick that he needed someone to lead a team of 80 engineers named the Data Organization. He told Bonwick that “I need a data architect, so if you take that role, you can go do the file system.”
Bonwick called what followed “the worst year of my career”. He could not get the members of the team to “buy into” his project. The team was divided between Colorado, Los Angeles, and Santa Clara. Bonwick spent a large portion of his time flying between the locations and was unable to get any momentum going. The project ended up crashing. He later commented that “You can’t start with a large team and say ‘Here is the vision’ and everyone’s suddenly going to get on board with that. This is not the way engineers work.”
Getting Down to Business
In the fall of 2000, Bonwick’s third son was born. As a result, he took six weeks off to spend time with his family. He also spent some of this time thinking about what a file system should look like and how to overcome the failure of his previous project. When he returned to Sun, he decided that he would take one last stab at a file system. But this time he did not want to deal with a large team.
Instead, he and a new hire named Matthew Ahrens locked themselves in a room with a whiteboard and started throwing ideas around. They started working on July 20th, 2001 and by Halloween they had a working prototype. Ahrens worked on creating the data management unit and Bonwick wrote the storage pool allocator. Since they had a working prototype, Himmelstein gave them permission to continue working on their new file system.
Bonwick and Ahrens had several things that they wanted to focus on. According to Bill Moore, “one of the design principles we set for ZFS was: never, ever trust the underlying hardware”. They also wanted to simplify data management. Finally, they wanted to make sure that the file system had high performance. “So, unless you make it fast, people will fundamentally be uninterested—except for those who have experienced data corruption firsthand.”
The Team Grows
With the blessing of the higher-ups at Sun, Bonwick and Ahrens brought on more engineers, including Mark Maybee and Mark Shellenbaum, and the team eventually grew to more than twelve people. Maybee and Shellenbaum were brought into the project because they were “file systems people, who knew how to write a file system from scratch”. In fact, Maybee and Shellenbaum joined Sun in the late 1990s “with a carrot dangled in front of us that we were going to possibly get to write a new file system.” Until the ZFS project, that had not happened.
So, the team went to work with the two Marks working in Colorado, and Bonwick and Ahrens working in California. Maybee and Shellenbaum worked on creating the ZFS POSIX Layer. This allowed the operating system to talk to the file system. On October 31, 2002, they had their first kernel mount. This was a big step towards their first stable release.
As their project came together, the ZFS team started using the new file system to store their own files. This took place in 2004. The following year, on October 31st, 2005, they integrated ZFS into Solaris, Sun’s Unix-based operating system. This version of Solaris was only used internally at Sun. After this milestone, ZFS went from being used by about a dozen people to over a thousand.
In 2006, Sun Microsystems officially released ZFS to the public as part of their Solaris 10 operating system. With ZFS now available to the public, its popularity grew exponentially. But that story is for another day. Be sure to check back for that.