Klara

There are two components to ZFS: the utilities, such as zpool, zfs and zdb, and the kernel modules, which are required to mount ZFS filesystems. Kernel modules typically need to be compiled against the exact kernel that is in use to ensure the components are compatible. The deep integration of ZFS with FreeBSD (or indeed Solaris or NetBSD) can be a significant advantage when deploying ZFS. As the code lives in the same tree, all of the components are updated together. However, rather infamously, license incompatibilities preclude ZFS from being included directly in the upstream Linux kernel. There are many good reasons to use Linux, and we’ll cover options for deploying ZFS on Linux as an external kernel module.

Your choice of Linux distribution will affect many of the details. Ubuntu, in particular, makes a compatible version of ZFS available in its package repositories, so it could be a good choice if you don’t have other constraints. However, that version of ZFS tends to be older, and you may want access to newer features. In this article we will primarily consider Red Hat Enterprise Linux and variants of it such as AlmaLinux and Rocky Linux, but the fundamentals apply to Linux distributions more widely.

The Two Approaches to Managing Linux Kernel Modules

There are two basic approaches: 

    • DKMS (Dynamic Kernel Module Support): A method that automatically rebuilds modules locally whenever the kernel is upgraded.
    • kmod packages: Precompiled kernel modules for a specific kernel that are then packaged for installation using normal package management tools such as dnf or apt. 

With Ubuntu, the method is closest to the kmod approach, though the zfs module is included along with other modules in the linux-modules package. For Red Hat variants, there are kmod packages available on the zfsonlinux.org site, where the released kernels are tracked. If pre-built packages aren’t available for your kernel, then you will need to compile the modules on your own system. For just a single system, DKMS is the easiest approach. But if you are managing multiple systems, it can be better to build the modules once and make them available in a local repository.

A good starting point is the OpenZFS documentation. It includes details specific to a variety of distributions, including links to third-party package repositories. Also worth checking out is the documentation from your own distribution, as many (Void Linux, for example) include good support for ZFS without the need for a third-party package repository. In the case of a Red Hat variant, the documented steps start with installing a zfs-release package that contains only the repository file in /etc/yum.repos.d/zfs.repo along with keys for verifying packages.

Kernel Upgrades with DKMS

With DKMS (Dynamic Kernel Module Support), modules are automatically rebuilt locally on the host every time there is a kernel upgrade. While this is intended to be seamless, in practice it can occasionally fail. It is useful to have some understanding of how to operate it manually if problems do occur. 

On Red Hat variants, in addition to the zfs-release package we already mentioned, you first need to install epel-release. EPEL is a repository of extra packages that are maintained for Fedora but are not otherwise part of RHEL. Packages taken from EPEL don’t come with any support guarantees. Although we’re already venturing into unsupported territory with OpenZFS, you may want to take care to track the sources of any key software you rely on. It is necessary to install epel-release first, as a separate step, because it adds a package repository that is needed for later dependencies:

 

# dnf install epel-release

A system using DKMS will need the kernel header files and development tools such as the compiler installed. By default, these headers are in a package named kernel-devel. Yet, if you’re using a different kernel, it will have a separate package. With an ELRepo kernel, it’ll either be kernel-ml-devel or kernel-lt-devel for the mainline and long-term support kernels respectively while for Oracle’s UEK it will be kernel-uek-devel.

# dnf install kernel-devel zfs
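Since DKMS can only build against headers that match a kernel, it may be worth checking that an installed kernel-devel package corresponds to the running kernel. The following is a minimal sketch rather than anything shipped with ZFS or DKMS: the function name headers_match is made up for illustration, and the usage comment assumes the stock kernel-devel package name.

```shell
# headers_match RELEASE
# Reads one installed kernel-devel version per line on stdin and succeeds
# if any of them is exactly RELEASE (the format printed by `uname -r`).
# The function name is illustrative and not part of any package.
headers_match() {
    grep -Fxq -- "$1"
}

# Intended use on a live system (assumes the stock kernel-devel package):
#   rpm -q kernel-devel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' |
#       headers_match "$(uname -r)" || echo "no headers for the running kernel"
```

For an ELRepo or UEK kernel, the package name in the rpm query would need to change accordingly.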

One of the packages pulled in as a dependency will be dkms. While this is usually invoked via hooks from the package manager, we can use it directly. For example, to check the status of the modules:

# dkms status
zfs/2.1.15, 5.14.0-427.37.1.el9_4.x86_64, x86_64: installed
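For scripted checks, that status line can be parsed. The sketch below assumes the "name/version, kernel, arch: state" layout shown above (older DKMS releases print a different layout), and the function name dkms_module_installed is hypothetical.

```shell
# dkms_module_installed MODULE KERNEL
# Reads `dkms status` output on stdin and succeeds if MODULE is reported
# as installed for KERNEL. Only the line layout shown above is handled.
dkms_module_installed() {
    awk -F'[,:] *' -v mod="$1" -v kernel="$2" '
        # $1 = "name/version", $2 = kernel release, $3 = arch, $4 = state
        index($1, mod "/") == 1 && $2 == kernel && $4 ~ /^installed/ { found = 1 }
        END { exit !found }
    '
}

# Intended use on a live system:
#   dkms status | dkms_module_installed zfs "$(uname -r)" ||
#       echo "zfs is not built for the running kernel"
```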

While the modules have already been compiled and installed, they may not yet be loaded. Running either zpool version or zfs version will tell you the installed versions of both the ZFS user tools and the kernel modules. This might not immediately work. The following is an example where the kernel-devel package was slightly newer than the running kernel:

# zpool version 
The ZFS modules are not loaded. 
Try running '/sbin/modprobe zfs' as root to load them. 
# modprobe zfs 
modprobe: FATAL: Module zfs not found in directory /lib/modules/5.14.0-427.35.1.el9_4.x86_64

This is not a problem – just upgrade the kernel first and then reboot. In case of problems, it can help to cross-reference kernel versions and dig around in the /lib/modules directory that error message references. Once you have a pool created, there are systemd services that handle loading the modules and importing the pools and you should get results such as the following:

# zpool version 
zfs-2.1.15-3 
zfs-kmod-2.1.15-3 

Ideally these versions should match but a mismatched combination may work if they are close enough.
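If you want to flag a mismatch automatically, for instance from a monitoring script, the two lines can be compared. This is only a sketch under the assumption that the output has the two-line form shown above; the function name zfs_versions_match is hypothetical.

```shell
# zfs_versions_match
# Reads `zpool version` output on stdin and succeeds only if the userland
# (zfs-X) and kernel module (zfs-kmod-X) version strings are identical.
zfs_versions_match() {
    awk '
        /^zfs-kmod-/ { kmod = substr($1, 10) }   # strip the "zfs-kmod-" prefix
        /^zfs-[0-9]/ { user = substr($1, 5) }    # strip the "zfs-" prefix
        END { exit !(user != "" && user == kmod) }
    '
}

# Intended use on a live system:
#   zpool version | zfs_versions_match || echo "userland and module versions differ"
```

The check is deliberately strict. Whether a small difference matters in practice is a judgment call that a script like this cannot make.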

In the case of problems building the modules, it can help to attempt the build manually. The zfs-dkms package installs the source code for the kernel modules. This is likely to be below /usr/src, while DKMS keeps its own build trees under /var/lib/dkms. We can manually build and then install the modules by specifying the name and version, for example:

dkms build zfs/2.1.15 
dkms install zfs/2.1.15 

This will build the modules for the currently running kernel. You can build for a different kernel by specifying the version with the -k option. Log output from the builds can also be found under /var/lib/dkms. Be aware that kernel modules must be built with the same compiler as the kernel. If you run cat /proc/version, in addition to the kernel version, it reports the compiler that was used. For example, on Oracle Linux 8, the UEK 7 kernel is built with gcc 11 but the default compiler is gcc 8. Gcc 11 is available as gcc-toolset-11 in the AppStream repository and can be enabled using the scl command.
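The compiler comparison can be scripted roughly as follows. This is a heuristic sketch: the function name kernel_gcc_major is hypothetical, and the pattern simply takes the first number that follows "gcc" in the version string, which matches the layouts we are aware of but is not guaranteed for every kernel build.

```shell
# kernel_gcc_major VERSION_STRING
# Prints the major gcc version found in a /proc/version style string.
# Heuristic: the first run of digits following the first "gcc".
kernel_gcc_major() {
    printf '%s\n' "$1" |
        grep -oE 'gcc[^0-9]*[0-9]+' | head -n 1 | grep -oE '[0-9]+$'
}

# Intended use on a live system:
#   want=$(kernel_gcc_major "$(cat /proc/version)")
#   have=$(gcc -dumpversion | cut -d. -f1)
#   [ "$want" = "$have" ] || echo "kernel was built with gcc $want, default is gcc $have"
```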

Precompiled kmod Packages

The need to rebuild the ZFS kernel modules on every system for every kernel update can be somewhat inconvenient, especially if you have many systems. Building the modules once and distributing them to multiple machines via the normal package manager is a way to reduce this burden.

In contrast to using DKMS, this avoids the need for the compiler and development tools to be installed on all systems, which may not be appropriate for a production system. You may also see the term kABI-tracking kmod. This refers to a module that is built against a specific kernel Application Binary Interface (kABI). Where the kABI remains consistent, the module will work without being rebuilt.

Precompiled kmod packages are supplied for the RHEL kernels by the zfsonlinux project. These will also work across the various RHEL variants for their compatible kernel builds and can make it very easy to deploy ZFS.

After installing zfs-release as for DKMS, the steps documented on the zfsonlinux.org site are just:

dnf config-manager --disable zfs 
dnf config-manager --enable zfs-kmod 
dnf install zfs

You can instead enable zfs-testing-kmod to get a newer release of ZFS, currently 2.2 instead of 2.1.

How to build kmod Packages

Sometimes, for reasons such as newer hardware support, you may need to support systems running kernels that aren’t directly supported with kmod packages from the zfsonlinux project. The ELRepo and Oracle kernels are some common examples. Or you may be using an unsupported processor architecture. Even without these reasons, it can be reassuring to have knowledge of how to build the kmod packages manually.

The first step is to get hold of the zfs sources. You can start with a checkout of openzfs from git or a release tarball downloaded from GitHub. These include an rpm spec file and, for that matter, Debian package rules files. However, the source rpm is easier to use. Source rpms can be found manually in http://download.zfsonlinux.org/epel/9.4/SRPMS/ or you can use dnf to fetch them:

cd /tmp 
dnf download --enablerepo zfs-source --source zfs-kmod 

The file that downloads can be “installed” as a normal unprivileged user:

rpm -i /tmp/zfs-kmod-2.1.15-3.el9.src.rpm

By default, this will put source tarballs and the .spec file under rpmbuild in your home directory. If you do start with a git checkout, the steps to create a .spec file and the source tarball and put them in place are as follows:

dnf install automake libtool
./autogen.sh
./configure --with-config=srpm
make dist-gzip
mv rpm/redhat/zfs-kmod.spec rpm/redhat/zfs.spec ~/rpmbuild/SPECS
mv zfs-2.3.99.tar.gz ~/rpmbuild/SOURCES

There are a number of build dependencies which we’ll need to install:

dnf install kernel-rpm-macros kernel-abi-stablelists \
        rpm-build kernel-devel

In simple cases we can now compile the module as follows:

rpmbuild -bb ~/rpmbuild/SPECS/zfs-kmod.spec

If you’re also building the ZFS utilities, there are further dependencies which we won’t list other than libtirpc-devel. It's notable because it is only available in the “CodeReady Linux Builder” repository so you may need to enable that with dnf config-manager --enable crb.

As mentioned for DKMS, you need to use the same compiler as was used to build the kernel itself. So, you may initially need to switch compiler such as with the following before running rpmbuild:

scl enable gcc-toolset-11 $SHELL

The earlier invocation of rpmbuild compiles ZFS against the currently running kernel. It is likely more convenient to build against a newer kernel before rebooting to it, especially if you use ZFS on your build machine. We can define a kernel_version RPM macro to build against a different kernel version. For example:

rpmbuild "-Dkernel_version 5.14.0-427.37.1.el9_4.x86_64" -bb zfs-kmod.spec

With the many version components, it can be a little tricky to identify the latest kernel release and the following command may help:

rpm -qa kernel-devel | /usr/lib/rpm/redhat/rpmsort -r
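Where rpmsort is not at hand, GNU sort's version ordering is a reasonable stand-in for kernel releases like these, though it is not identical to RPM's own comparison. The sketch below uses a hypothetical helper name, newest_release, and the usage comment assumes the stock kernel-devel package.

```shell
# newest_release
# Reads kernel release strings on stdin, one per line, and prints the
# highest according to GNU sort's version ordering (sort -V). This
# approximates, but does not replicate, RPM's version comparison.
newest_release() {
    sort -V | tail -n 1
}

# Intended use on a live system (assumes the stock kernel-devel package):
#   kver=$(rpm -q kernel-devel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' | newest_release)
#   rpmbuild "-Dkernel_version $kver" -bb ~/rpmbuild/SPECS/zfs-kmod.spec
```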

As part of the final build steps, rpmbuild will list the generated files. For example:

Wrote: /home/opk/rpmbuild/SRPMS/zfs-kmod-2.1.15-3.el9.src.rpm
Wrote: /home/opk/rpmbuild/RPMS/x86_64/zfs-kmod-debugsource-2.1.15-3.el9.x86_64.rpm
Wrote: /home/opk/rpmbuild/RPMS/x86_64/kmod-zfs-2.1.15-3.el9.x86_64.rpm
Wrote: /home/opk/rpmbuild/RPMS/x86_64/kmod-zfs-devel-2.1.15-3.el9.x86_64.rpm
Wrote: /home/opk/rpmbuild/RPMS/x86_64/kmod-zfs-debuginfo-2.1.15-3.el9.x86_64.rpm

Only the kmod-zfs package is needed unless you’re doing something unusual like developing changes. We’ll cover other options in the next section. For now, you can install the module manually as root with, for example:

rpm -i /home/opk/rpmbuild/RPMS/x86_64/kmod-zfs-2.1.15-3.el9.x86_64.rpm
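When the build runs from a script, the path of the one package you need can be picked out of the "Wrote:" lines. This is a sketch with a hypothetical function name, built_kmod_rpm, and it assumes the package naming seen in the listing above.

```shell
# built_kmod_rpm
# Reads rpmbuild output on stdin and prints the path of each binary
# kmod-zfs package, skipping the -devel, debug and source packages.
built_kmod_rpm() {
    # Requiring a digit right after "kmod-zfs-" excludes kmod-zfs-devel and
    # kmod-zfs-debuginfo; requiring "/kmod-zfs-" excludes the zfs-kmod-* files.
    sed -n 's|^Wrote: \(.*/kmod-zfs-[0-9][^/]*\.rpm\)$|\1|p'
}

# Intended use (assumes the "Wrote:" lines arrive on standard output):
#   rpmbuild -bb ~/rpmbuild/SPECS/zfs-kmod.spec | tee build.log | built_kmod_rpm
```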

Local Package Repository

In order to make a locally built package easily available to multiple systems, a local package repository is needed. After copying the compiled zfs-kmod .rpm files into a suitable directory, you can use the createrepo tool to create a repository. To make that tool available, first do dnf install createrepo_c. Then from the directory containing the .rpm files, run:

createrepo -d .

This just creates an index of the packages in a directory named repodata. To actually make the packages available you’ll need to share the files out by running a web server. DNF can use any URL scheme supported by libcurl so if you don’t want to run a web server, other options such as ssh or smb also work. Each client system needs to be configured to use the local repository. This involves a file in /etc/yum.repos.d. The zfs.repo file from the zfs-release package should serve as a good template for this. Also consider signing your packages with rpmsign which adds a degree of integrity and security checking.
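As an illustration of the client-side configuration, the following sketch prints a minimal repository definition. The function name, the zfs-local repository id and the repo.example.com URL are all hypothetical. gpgcheck is disabled here only because the sketch assumes unsigned packages.

```shell
# local_repo_definition ID BASEURL
# Prints a minimal dnf repository definition on stdout. gpgcheck is off
# only because this sketch assumes unsigned packages; once packages are
# signed, set gpgcheck=1 and add a gpgkey= line.
local_repo_definition() {
    cat <<EOF
[$1]
name=Local packages ($1)
baseurl=$2
enabled=1
gpgcheck=0
EOF
}

# Example with a hypothetical repository id and server name:
#   local_repo_definition zfs-local http://repo.example.com/zfs/el9/ \
#       > /etc/yum.repos.d/zfs-local.repo
```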

Operating a local package repository can be good practice even where you don’t have locally built packages. Given a local mirror, your systems don’t all need to download the same files from the Internet but can use your faster local mirror. If you have air-gapped systems a local repository can be essential as a means to deploy updates. It also makes it easier when you want to first test updates on a limited number of systems before deploying updates more widely because you can point most systems to your tested baseline – which may just involve a simple symlink to a particular snapshot. For mirroring repositories, there is a reposync tool in the yum-utils package. For example, to mirror the zfs repository, you might use:

reposync -c /etc/yum.repos.d/zfs.repo \
           --repoid zfs \
           --arch x86_64 --arch noarch \
           --downloadcomps --download-metadata \
           --newest-only --delete \
           --download-path . --metadata-path .
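The symlink idea mentioned above could look something like the following. This is a sketch under assumptions of our own: each reposync run is kept in a dated directory below a common root, and clients point their baseurl at a link named current. Neither the layout nor the function name promote_snapshot comes from any tool.

```shell
# promote_snapshot ROOT SNAPSHOT
# Points ROOT/current at the directory ROOT/SNAPSHOT, so that clients whose
# baseurl refers to .../current pick up that snapshot on their next
# metadata refresh. The "current" name and the layout are assumptions.
promote_snapshot() {
    [ -d "$1/$2" ] || { echo "no such snapshot: $1/$2" >&2; return 1; }
    # -n replaces an existing symlink rather than descending into its target
    ln -sfn "$2" "$1/current"
}

# Example with hypothetical paths:
#   promote_snapshot /srv/mirror/zfs 2024-11-01
```

Rolling back is then a matter of promoting the previous snapshot directory again.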

Further Linux-Specific ZFS Deployment Advice

To wrap things up, we’ll highlight a few additional points that, while not directly related to earlier sections, are still important for successfully deploying ZFS on Linux. These points cover miscellaneous considerations that could impact system performance, compatibility, or ease of use. They’ll ensure that you are well-prepared for any potential challenges when working with ZFS in a Linux environment.

Stable Disk Paths

Be aware that disk device names are not always consistent across reboots. Rather than specifying, for example, /dev/sda, find the appropriate WWN (World Wide Name) for the device. These can be found under /dev/disk/by-id in the form of symbolic links, named with a wwn- prefix, that point to the current real device. Specify these when creating ZFS pools or adding devices to an existing pool.
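To go from a familiar device name to its stable link, the symbolic links can be resolved in reverse. The following is a small sketch; the function name wwn_links_for is hypothetical, and the optional second argument exists only so that the function can be tried against a directory other than /dev/disk/by-id.

```shell
# wwn_links_for DEVICE [DIR]
# Prints every wwn-* symlink in DIR (default /dev/disk/by-id) that resolves
# to DEVICE. Partition links resolve to other nodes, so they are not listed.
wwn_links_for() {
    target=$(readlink -f -- "$1") || return 1
    for link in "${2:-/dev/disk/by-id}"/wwn-*; do
        [ -e "$link" ] || continue   # unmatched glob or dangling link
        [ "$(readlink -f -- "$link")" = "$target" ] && printf '%s\n' "$link"
    done
    return 0
}

# Intended use on a live system:
#   wwn_links_for /dev/sda
```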

SELinux Integration

SELinux stores file contexts in filesystem extended attributes. This entails essentially every single file carrying an extended attribute. In practical usage, setting the ZFS property xattr=sa on ZFS datasets is very much recommended. OpenZFS reserves space for file metadata in the form of system attributes and this makes use of that reserved space for the SELinux context instead of creating a separate directory entry for each file. This makes a very noticeable difference to performance, especially for anything that needs to traverse many files such as rsync.
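To find datasets that still use a different setting, the scripted output of zfs get can be filtered. This is a minimal sketch: the function name datasets_without_sa and the dataset name tank/data are hypothetical.

```shell
# datasets_without_sa
# Reads tab-separated "name<TAB>value" lines, as printed by
# `zfs get -H -o name,value xattr`, and prints each dataset whose xattr
# property is anything other than "sa".
datasets_without_sa() {
    awk -F'\t' '$2 != "sa" { print $1 }'
}

# Intended use on a live system ("tank/data" is a hypothetical dataset):
#   zfs get -H -o name,value -t filesystem xattr | datasets_without_sa
#   zfs set xattr=sa tank/data
```

Note that changing the property only affects extended attributes written afterwards. Existing files keep their attributes in the old form until they are rewritten.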

While the SELinux Reference Policy includes a module for ZFS, distributions such as RHEL don’t include it. The module is not necessary to use ZFS with SELinux enabled. As with SELinux in general, it adds proactive security restrictions, limiting the ZFS utilities to doing only those things they need to do and no more. It is possible to extract and use just the zfs module from the reference policy.

ZFS on Root

We barely give a second thought to choosing ZFS for our production data on Linux systems. However, when choosing to use ZFS for root, it can be worth contemplating the trade-offs since a failure to mount the root filesystem at boot-time can be somewhat frustrating. The bulk of data on a root partition can often be restored with a fresh OS install. If you use an automation tool like Ansible to push configuration, there may be little to no valuable data stored on the root partition. The boot environments of FreeBSD and Solaris that build on ZFS snapshots can be invaluable for coping with problematic upgrades so definitely take a look at our recent Introduction to ZFSBootMenu.

Conclusion

As OpenZFS combines reliability for safeguarding critical data with a robust set of modern features, it has become a compelling choice for production environments. This is in spite of the need to run out-of-tree kernel modules – something you might otherwise shy away from on critical systems. While the length of this article might imply otherwise, installing ZFS on Linux is (for the most part) surprisingly easy and trouble free. It is better to have a good understanding of the alternative approaches in order to make an informed choice for what works best on your systems. DKMS is a simple approach on standalone systems. Ensuring you have kmod packages might entail additional preparation effort when verifying new kernel releases but can facilitate seamless deployment across a fleet of systems. 

 
