Building Your Own FreeBSD-based NAS with ZFS
In the first article in this series, we concentrated on selecting suitable hardware for your FreeBSD and OpenZFS-based NAS. We're taking a bottom-up approach: having walked through the hardware choices, we now move up to the next layer and set up the FreeBSD operating system. In this article we take a closer look at the operating system and the configuration choices, both during and after installation, that fine-tune the system for OpenZFS storage.
Why Vanilla FreeBSD?
Given that several FreeBSD-based NAS distributions are readily available, ranging from completely open source to pre-built with customer support plans, the question naturally arises: "Why not just use a FreeBSD-based NAS distro rather than building your own?" After all, someone else has done the work to tune FreeBSD, and most solutions provide a nice GUI for configuring the various NAS options.
Pre-built NAS solutions offer only a small selection of hardware choices, which may or may not meet your needs. While some vendors offer customized hardware builds, if you already know what hardware you want, you can build the system yourself without paying a middleman to do it for you.
Configuring a NAS on top of your own FreeBSD installation also means you don't have the overhead of running a GUI, or of learning where to find the various configuration options within that GUI. In practical terms, NAS configuration is mostly a one-time task that is easy enough to perform from the command line.
Ongoing maintenance and monitoring can also be handled from the command line, or with the tools you already have in place for maintaining your other FreeBSD systems. Most software bugs in NAS solutions tend to be in the GUI; taking it out of the equation means you aren't waiting for the solution provider to ship usability and security patches.
Before You Install Your FreeBSD System
Before installation, you must determine which device to install the operating system onto. Most open source NAS solutions recommend installing the operating system on its own pool and device in order to maintain a separation between the operating system and the storage drives and pools. The device holding the operating system does not need to be large, though you should consider that:
- it should be large enough to hold logs (unless you’ll be configuring the system to save logs to a separate device or another system) and a few boot environments.
- a mirrored pool may save some unexpected downtime should the operating system device fail.
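To get a feel for how much space those items actually consume, you can inspect an existing FreeBSD system; a quick sketch, assuming the installer's default zroot pool name:

zpool list zroot     # overall capacity and free space of the OS pool
bectl list           # boot environments currently kept and their sizes
du -sh /var/log      # how much space the logs occupy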
After installation, there are a few actions you can perform to optimize performance. Let’s take a look at periodic scripts, network configuration, and ZFS tuning.
Disable Unneeded Periodic Scripts
If you’re unfamiliar with FreeBSD’s periodic system, our article on FreeBSD Periodic Scripts provides a good introduction.
Disabling the general-purpose maintenance scripts that aren't suited to a NAS will reduce the performance hit of running them unnecessarily. For example, several scripts run find across all of your data. What find is looking for might be appropriate for Internet-facing systems or systems with multiple user accounts; however, you probably don't want it trawling through terabytes of NAS storage on a daily, weekly, or monthly schedule!
It is easy enough to determine which scripts are using find:
grep -R find /etc/periodic/*
You can get a short description of what each script is doing in periodic.conf(5). It is also useful to review which scripts are enabled by default:
grep YES /etc/defaults/periodic.conf | more
The security scripts can run daily, weekly, and/or monthly. It is worth reviewing these scripts to see whether they need to run at their current schedule (e.g. daily or weekly), or whether they need to run at all on a NAS. In some cases, you might wish to let the scripts run, but limit their scope to the operating system itself rather than the data you're storing on the NAS.
Remember, if you decide to disable any of the periodic scripts (by changing their YES to NO), make your edits to /etc/periodic.conf so they are not overwritten by an operating system upgrade.
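As a sketch of what such overrides might look like, the entries below disable a few of the filesystem-walking scripts; the variable names reflect a typical /etc/defaults/periodic.conf, so verify them against your own release before copying:

# /etc/periodic.conf overrides (confirm names in /etc/defaults/periodic.conf)
weekly_locate_enable="NO"              # skip rebuilding the locate database across all storage
security_status_chksetuid_enable="NO"  # the setuid audit runs find over every local filesystem
security_status_neggrpperm_enable="NO" # the group-permission audit also walks the filesystems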
Review ZFS Periodic Scripts
One of the non-default scripts which you will want to enable is the ZFS scrub scheduler. Set daily_scrub_zfs_enable to YES. By default, daily_scrub_zfs_default_threshold sets a scrub to run every 35 days. You can also set different scrub schedules for different pools by setting values for daily_scrub_zfs_pools and daily_scrub_zfs_<poolname>_threshold.
You may also want to consider enabling daily_status_zfs_enable, which sends the root user a daily email showing the output of zpool status.
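Pulling those together, a minimal ZFS section of /etc/periodic.conf might look like this (the pool name tank and the 14-day override are just placeholders):

daily_scrub_zfs_enable="YES"            # enable the periodic scrub scheduler
daily_scrub_zfs_default_threshold="35"  # days between scrubs (the default)
daily_scrub_zfs_pools="tank"            # optionally limit scrubbing to specific pools
daily_scrub_zfs_tank_threshold="14"     # per-pool override: scrub tank every 14 days
daily_status_zfs_enable="YES"           # mail root the output of zpool status each day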
FreeBSD Network Configuration
FreeBSD supports both jumbo frames and link aggregation for optimizing network speed. Which optimization to configure depends upon the network hardware and switching infrastructure the NAS will connect to.
Jumbo Frames, or setting the MTU higher than the Ethernet default of 1500, is a common configuration in storage networks. This configuration assumes a dedicated network or VLAN where every network device and switch supports, and has been configured to use, the same higher MTU value (typically 9000). Using a larger MTU results in larger packet sizes, which should in turn result in less fragmentation and lower overhead.
Many FreeBSD NIC drivers also support LSO (Large Send Offload), LRO (Large Receive Offload), and TOE (TCP Offload Engines). Our article on FreeBSD TCP Performance System Controls describes this in more detail.
On FreeBSD, ifconfig(8) is used to both display and configure NIC values such as the MTU and offload options. To make configuration changes persistent, remember to add the ifconfig values to /etc/rc.conf.
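As an illustration, assuming an Intel 10GbE interface named ix0 (substitute your own interface name and addressing), you might check and raise the MTU on the running system and then persist it:

ifconfig ix0            # show current MTU and offload capabilities
ifconfig ix0 mtu 9000   # raise the MTU on the live interface

# /etc/rc.conf entry to make the setting persistent across reboots
ifconfig_ix0="inet 192.168.10.5/24 mtu 9000"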
Link Aggregation combines multiple physical interfaces into a virtual lagg interface in order to provide fault-tolerance or high-speed multi-link throughput. In FreeBSD, link aggregations are created using ifconfig and the virtual interface is called laggN, where N is the virtual interface number starting at 0. FreeBSD supports these link aggregation protocols:
- Failover: Sends traffic only through the active port and, by default, only receives traffic through the active port. If the master port becomes unavailable, the next active port is used.
- LACP: Requires switches which support the IEEE 802.1AX Link Aggregation Control Protocol (LACP). LACP negotiates a set of same-speed, full-duplex links and balances traffic across them. If the physical network changes, LACP quickly converges to a new configuration.
- Load Balance: balances outgoing traffic across the active ports based on hashed protocol header information and accepts incoming traffic from any active port. Unlike LACP, this is a static configuration which does not negotiate aggregation with the peer or exchange frames to monitor the link.
- Round Robin: distributes outgoing traffic using a round-robin scheduler through all active ports and accepts incoming traffic from any active port. This mode can cause unordered packet arrival at the client which can be CPU-intensive on the client.
- Broadcast: Sends frames to all ports of the lagg and receives frames on any port of the lagg.
You can learn more about configuring lagg in lagg(4). Note that the lagg driver needs to be loaded in /boot/loader.conf.
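For instance, a persistent LACP aggregation of two hypothetical ports ix0 and ix1 might be configured as follows (adjust the interface names and addressing to match your hardware and switch setup):

# /boot/loader.conf: load the lagg driver at boot
if_lagg_load="YES"

# /etc/rc.conf: aggregate ix0 and ix1 into lagg0 using LACP
ifconfig_ix0="up"
ifconfig_ix1="up"
cloned_interfaces="lagg0"
ifconfig_lagg0="laggproto lacp laggport ix0 laggport ix1 192.168.10.5/24"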
OpenZFS Performance Tuning
Several ZFS properties can affect the performance of a NAS system. You will want to investigate these before creating your storage pools as some of these properties can only be set at pool creation time or do not affect already written data.
ashift: When creating a pool, you want to use disks that all have the same sector size; a pool comprised of disks with different sector sizes can result in poorer performance and inefficient space utilization.
The ashift property can only be set at vdev creation time and its setting affects the lifetime of the pool containing those vdevs. The ashift is a binary exponent of the disk sector size in bytes—in other words, ashift=9 tells ZFS that the underlying disk sectors contain 2^9 bytes each.
An ashift of 0 tells ZFS to auto-detect the sector size each disk reports. Unfortunately, many disks (including but not limited to Samsung consumer SSDs) misreport their sector size, so we recommend setting ashift manually. A value of 9 means the disks use a sector size of 2^9 = 512 bytes, a value of 12 means 2^12 = 4K, and a value of 13 means 8K sectors.
Although setting too low an ashift value can cripple performance, setting it too high shouldn’t have a performance impact. This means that you should probably set the ashift to 12 or 13 in order to future-proof the pool.
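For example, on recent FreeBSD releases with OpenZFS 2.x you can force 4K sectors when creating a pool; the pool and disk names below are placeholders:

# create a mirrored pool, forcing 4K sectors regardless of what the disks report
zpool create -o ashift=12 tank mirror /dev/da0 /dev/da1

# confirm the value the pool is actually using
zpool get ashift tank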
recordsize: In ZFS, all data and metadata is stored in blocks, and recordsize (or volblocksize, if you’re using zvols) is the maximum size of a block. This property is set per-dataset; while it can be changed, the new value will only affect new blocks, not the blocks which have already been written.
The default recordsize is 128K but can be set to any value from 4K through 1M. If you expect bulk reads/writes on the NAS, increasing the recordsize will improve performance—but if you’d like to learn more about this important tunable, our recordsize tuning guide has you covered.
Use zfs get recordsize dataset to view the current recordsize and zfs set recordsize=value dataset to change it.
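For instance, on a hypothetical dataset used for large media files:

zfs get recordsize tank/media     # check the current value
zfs set recordsize=1M tank/media  # larger records suit big, sequential files (new writes only)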
atime: When enabled, this property updates a file's access time (atime) every time an application reads the file. This can potentially double the IOPS load on the NAS, especially if lots of small files are frequently accessed. The FreeBSD kernel already tracks a file's access time, so this extra work is essentially redundant.
Since the default value for new datasets is to enable atime, it is a good idea to get in the habit of setting this property to off when creating new datasets. Use zfs get atime to view the atime value for any existing filesystems.
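A brief sketch, again using placeholder dataset names:

zfs create -o atime=off tank/backups   # disable atime when creating a new dataset
zfs set atime=off tank/media           # or turn it off on an existing dataset
zfs get -r atime tank                  # audit the setting across the whole pool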
The Workload Tuning section of the OpenZFS documentation contains other recommendations for tuning OpenZFS and is well worth a read.
Next Time...
In the next article in this series, we’ll concentrate on configuring NAS shares: NFS, Samba, and iSCSI.
One more thing...
The experts at Klara have been designing NAS solutions for over two decades and have in-the-trenches experience with using new technologies and planning for technology upgrades. Reach out to us if you would like to discuss the practicalities of creating your own NAS solution.