Klara

This year, the 13th annual OpenZFS User and Developer Summit returned to Portland, Oregon. This is the second year of the expanded format, with user-focused content over the weekend followed by a developer focus on Monday and Tuesday.

Combining the two audiences yielded unexpected insights for both and sparked many rewarding conversations. It is always extremely valuable to bring together those who develop the software, those who deploy and support the infrastructure, and the end users whose workloads are powered by ZFS. This end-to-end view unlocks new understanding and new possibilities. 

User Summit Highlights: Real-World Challenges That Spark Innovation 

MIT CSAIL – Hardware Chaos, ZFS Control  

The first standout session was an overview of the operational difficulties of managing a wide array of different JBOD hardware at the MIT Computer Science & Artificial Intelligence Laboratory. The problems went beyond the varied feature sets of hardware from different vendors: even identical arrays exposed disks in different orders or with differing physical paths.

Others in the audience shared their own stories, including an appliance vendor who had tried to find the slot with the lowest latency to house their SLOG device, only to discover that the optimal slot was different in each of dozens of identical JBODs.

Klara & TrueNAS – Cross-Site Clustering for Collaborative Workloads 

Another invigorating discussion centered on how to extend ZFS with a clustering mechanism suitable for common cross-site workloads, such as video production and scientific research. In these workloads, different stages of work are carried out at different locations, and manual coordination of ZFS replication is not enough to solve the operational challenges.

Klara led an initial design review and is working with TrueNAS to put together user stories and a funding coalition to take the next steps. 

OpenZFS – Object Storage and Beyond 

The final two broadly championed ideas were a native object storage interface and a user-space version of ZFS that might even be embeddable in applications such as database engines.

The goal with object storage is to provide S3-compatible APIs and storage without requiring a trip through the POSIX interfaces, or even to offer a POSIX-plus interface to allow ingestion of data via object storage APIs while also providing traditional NFS file-based access, and maintaining semantics like per-file versioning. 

Other wide-ranging topics included: testing and quality control, bringing support for illumos into OpenZFS, continuous replication, extending channel programs, providing a public and stable interface similar to libzfs, ZFS native auditing support, and data security and sanitization. 

Development Summit Highlights 

Klara – AnyRAID: Redefining Storage Flexibility 

As we moved over to the developer-focused days, attention shifted to completed and in-progress development, starting with the AnyRAID feature being developed for HexOS by Klara.

After walking the audience through how AnyRAID allows pools to be created from inconsistently sized disks and gives much more flexibility to add, remove, and change disks in the pool, the discussion turned to how this technology might apply beyond the home user and enthusiast audience it was originally aimed at.

Paul Dagnelie from Klara presenting the AnyRAID feature for OpenZFS during the Developer Summit.

Seagate – Regenerative Drives and Hardware Innovation 

This could not have meshed better with the following talk, where Seagate presented a number of new technologies that are available in modern hard drives from all vendors. The first, Command Duration Limits, will allow ZFS to communicate priorities and deadlines to the individual disks to help avoid tail latency in demand reads.  

The next and most interesting section was about drive regeneration: when an element of an HDD fails, such as an individual head, the disk can regenerate itself at a reduced capacity. This can be either a destructive regeneration, where the disk reappears at a smaller size without the broken element, or a non-destructive mode, where the disk communicates the ranges of inaccessible data and the filesystem can adapt.

Consider a theoretical 20 TB HDD with 10 platters and 20 heads, where each head addresses 1 TB of data. If a single head degrades or fails, non-destructive regeneration could keep the drive in service, just with a smaller addressable capacity.
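The arithmetic in this example can be written out as a short Python sketch. It assumes, as the theoretical drive above does, that capacity is split evenly across heads; the function name and parameters are made up for illustration and do not correspond to any drive or ZFS interface.

```python
def remaining_capacity_tb(total_tb: float, heads: int, failed_heads: int) -> float:
    """Capacity left after some heads fail, assuming every head
    addresses an equal share of the drive (an illustrative assumption)."""
    if heads <= 0 or not 0 <= failed_heads <= heads:
        raise ValueError("failed_heads must be between 0 and heads")
    per_head_tb = total_tb / heads
    return total_tb - failed_heads * per_head_tb


# The theoretical drive from the text: 20 TB, 20 heads, one failed head.
print(remaining_capacity_tb(20, 20, 1))  # 19.0
```

Under that even-split assumption, each failed head costs 5% of the drive, so the drive stays useful through several element failures.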

This poses a problem for ZFS RAID-Z, as the data addressed by the failed element cannot simply be relocated. However, given the design of AnyRAID, it would be possible to relocate only the tiles within the unreachable regions, while maintaining RAID-Z parity across disks of mismatched sizes. It could also allow a single spare to serve as the destination for failed elements across multiple different disks, further reducing the need to send someone to a data center to swap these partially degraded disks.
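To make the idea of relocating only the affected tiles concrete, here is a minimal Python sketch of the bookkeeping involved. It is not AnyRAID's implementation: the fixed tile size, the half-open `(start, end)` ranges, and the function name are all assumptions made for illustration. It only shows that, given the unreachable ranges a drive reports, the set of tiles needing relocation is a simple overlap computation.

```python
def tiles_to_relocate(unreachable_ranges, tile_size):
    """Return the sorted indices of fixed-size tiles that overlap any
    unreachable range. Ranges are half-open (start, end) offsets in the
    same unit as tile_size. Hypothetical helper, not an OpenZFS API."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    affected = set()
    for start, end in unreachable_ranges:
        if end <= start:
            continue  # skip empty or inverted ranges
        first = start // tile_size
        last = (end - 1) // tile_size
        affected.update(range(first, last + 1))
    return sorted(affected)


# Hypothetical numbers: 64-unit tiles, one unreachable region at [100, 200).
print(tiles_to_relocate([(100, 200)], 64))  # [1, 2, 3]
```

Every tile outside the returned list stays where it is, which is the property that would let a partially degraded disk remain in the pool.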

Seagate also revealed their roadmap out to 100 TB disks, and their long-term projections about the price competitiveness of their and other vendors’ flash-based storage compared to magnetic media. 

Klara – Large Labels and Next-Gen Flash 

After the lunch break, Klara presented its work on a pull request to add the “large label” feature to ZFS, both to support new large-sector-size flash devices and to improve the recoverability of ZFS in the face of hardware misbehaviour and human error. Through hard-won experience supporting ZFS in the field and providing emergency data recovery services, Klara has identified improvements that will further bolster ZFS’s famed resilience. We then presented an early proposal for changes to the on-disk format of ZFS’s block pointer data structure, to support larger records and future massive storage devices.

Klara also discussed QLC flash, the unique challenges it might pose, and the features it could offer to ZFS. An unexpected synergy emerged between the way QLC flash might report degraded flash cells and Seagate's earlier presentation on how HDDs report similar failures. This extended the lively discussion on how ZFS can support these future devices, so that support is already incorporated into OpenZFS by the time they arrive on the market.

AWS – Cloud-Scale Performance Improvements 

The team from AWS presented how they use ZFS to power the popular AWS FSx cloud storage service, and a set of performance improvements they are contributing back to OpenZFS. 

All recorded Summit sessions are available to watch here. 

Hackathon and Closing Notes 

The summit wrapped up with a day of prototyping and ideation, with prizes for the best hackathon projects. These projects tend to be the beginnings of the next set of features that will be presented at future summits and land in the next release of OpenZFS. 

We thoroughly enjoyed this opportunity to meet with our peers, vendors, customers, and users and to share ideas and insights in the true spirit of open source infrastructure. We look forward to seeing even more of you next year!
