Evaluating FreeBSD CURRENT for Production Use
The FreeBSD Operating System introduces new features in CURRENT, its main development branch. Snapshots of CURRENT are published weekly as installer images, which makes it easy to try. You can also follow the CURRENT branch directly by building a newer FreeBSD system from source as changes land.
Why FreeBSD CURRENT?
Following CURRENT can take a lot of work. FreeBSD releases are stabilized over a long development process, whereas CURRENT is where new features first arrive and get tested. CURRENT doesn’t get official security or errata notices, so users must follow developments via the commit logs and mailing lists, which is an obvious downside to running it in production. This doesn’t stop large organizations such as Netflix from using systems based on CURRENT in production, but doing so responsibly means having an experienced team capable of catching and mitigating issues in CURRENT before they reach the production stack.
Why might you want to run CURRENT? If you have a large modified code base, or are building a product based on FreeBSD, CURRENT gives you a look into the future of FreeBSD. Running CURRENT will help you understand changes that are happening in the FreeBSD Operating System and it gives you an opportunity to see how your stack performs with new features.
Running CURRENT can help you address incompatibilities in the software you run long before they appear in a release. As new FreeBSD releases get closer, you might want to run CURRENT to validate that your software stack runs and performs well, and report problems upstream in time for them to be fixed.
CURRENT enables a number of debugging features that help FreeBSD developers make the system more reliable. These features detect errors in the system and alert users when the errors occur. In addition to outright bugs, they can even catch code that simply results in “undesired behaviors.” These features help give FreeBSD releases the high level of reliability we love, but they come at the cost of system performance.
This is why the UPDATING file in the root of the FreeBSD source tree has offered this warning since 2002:
NOTE TO PEOPLE WHO THINK THAT FreeBSD 14.x IS SLOW: FreeBSD 14.x has many debugging features turned on, in both the kernel and userland. These features attempt to detect incorrect use of system primitives, and encourage loud failure through extra sanity checking and fail stop semantics. They also substantially impact system performance. If you want to do performance measurement, benchmarking, and optimization, you'll want to turn them off. This includes various WITNESS-related kernel options, INVARIANTS, malloc debugging flags in userland, and various verbose features in the kernel. Many developers choose to disable these features on build machines to maximize performance. (To completely disable malloc debugging, define WITH_MALLOC_PRODUCTION in /etc/src.conf and rebuild world, or to merely disable the most expensive debugging functionality at runtime, run "ln -s 'abort:false,junk:false' /etc/malloc.conf".)
This warning was added during the development of FreeBSD 5. FreeBSD 5 saw the introduction of many SMP features, including debugging options with high overhead. Benchmarks of FreeBSD 5 looking to see SMP-related performance improvements saw an apparent reduction of performance instead. If early testers don’t understand the impact of debugging features, they can make FreeBSD pre-releases look bad.
If you want to evaluate new features and performance impacts in CURRENT, you will need to disable its debugging features in order to get accurate measurements.
Understanding FreeBSD Development Features
The warning in UPDATING mentions three main features that help with debugging: INVARIANTS, WITNESS and malloc debugging.
INVARIANTS enables a lot of debugging code paths in the kernel. These are code paths that developers add to include extra checks beyond the KASSERTs (assertions that result in a panic if the expected condition is not true).
INVARIANTS features include extra walks of lists in loops, more assertions, and checks for pointers that should never become null. These extra tests can be costly, but they also expose errors in the system.
WITNESS tracks when locks are acquired and released in the kernel, as well as keeping track of the order in which locks are taken. If you have ever seen the ‘lock order reversal’ message in dmesg, that is WITNESS in action.
WITNESS is a powerful feature for debugging concurrency and locking issues in a development system. The extra tracking it adds is too much for a production system, but isn’t so onerous as to make a development system unusable.
Finally, malloc (memory allocation) debugging features enable tracking of use-after-free conditions, the collection of memory usage statistics, and assertions that help protect against common errors.
A/B Testing FreeBSD CURRENT
In this article we will show how to build a CURRENT system with the debugging features disabled, and perform some benchmarks to test the impact debugging features have on performance.
FreeBSD offers a built-in kernel configuration called GENERIC-NODEBUG that disables a lot of the debugging features so that CURRENT can be more accurately benchmarked and evaluated for performance.
sys/amd64/conf/GENERIC-NODEBUG:

    #
    # GENERIC-NODEBUG -- WITNESS and INVARIANTS free kernel configuration file
    # for FreeBSD/amd64
    #
    # This configuration file removes several debugging options, including
    # WITNESS and INVARIANTS checking, which are known to have significant
    # performance impact on running systems. When benchmarking new features
    # this kernel should be used instead of the standard GENERIC.
    # This kernel configuration should never appear outside of the HEAD
    # of the FreeBSD tree.
    #
    # For more information on this file, please read the config(5) manual page,
    # and/or the handbook section on Kernel Configuration Files:
    #
    # https://docs.freebsd.org/en/books/handbook/kernelconfig/#kernelconfig-config
    #
    # The handbook is also available locally in /usr/share/doc/handbook
    # if you've installed the doc distribution, otherwise always see the
    # FreeBSD World Wide Web server (https://www.FreeBSD.org/) for the
    # latest information.
    #
    # An exhaustive list of options and more detailed explanations of the
    # device lines are also present in the ../../conf/NOTES and NOTES files.
    # If you are in doubt as to the purpose or necessity of a line, check first
    # in NOTES.
    #
    # $FreeBSD$
    include GENERIC
    include "../../conf/std.nodebug"
    ident GENERIC-NODEBUG
From the configuration we can see that this config includes the normal GENERIC config file and a second file, sys/conf/std.nodebug, where debugging features are removed:
sys/conf/std.nodebug:

    #
    # std.nodebug -- Disable the debug options found in the GENERIC kernel config.
    #
    nooptions INVARIANTS
    nooptions INVARIANT_SUPPORT
    nooptions WITNESS
    nooptions WITNESS_SKIPSPIN
    nooptions DEBUG_VFS_LOCKS
    nooptions BUF_TRACKING
    nooptions FULL_BUF_TRACKING
    nooptions DEADLKRES
    nooptions COVERAGE
    nooptions KCOV
    nooptions MALLOC_DEBUG_MAXZONES
    nooptions QUEUE_MACRO_DEBUG_TRASH
    # Net80211 debugging
    nooptions IEEE80211_DEBUG
    # USB debugging
    nooptions USB_DEBUG
    nooptions HID_DEBUG
    # CAM debugging
    nooptions CAMDEBUG
    nooptions CAM_DEBUG_FLAGS
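If you maintain a custom kernel configuration, it can be useful to check whether it still enables any of these options. The following is a minimal sketch rather than a FreeBSD tool: the function name list_debug_options is hypothetical, the option list is copied from the std.nodebug excerpt above, and it only inspects the single file it is given (it does not follow include lines or apply nooptions).

```shell
# Hypothetical helper: print the debug options that a kernel config enables.
# The option names are taken from the std.nodebug excerpt in this article.
DEBUG_OPTS="INVARIANTS INVARIANT_SUPPORT WITNESS WITNESS_SKIPSPIN \
DEBUG_VFS_LOCKS BUF_TRACKING FULL_BUF_TRACKING DEADLKRES COVERAGE KCOV \
MALLOC_DEBUG_MAXZONES QUEUE_MACRO_DEBUG_TRASH IEEE80211_DEBUG USB_DEBUG \
HID_DEBUG CAMDEBUG CAM_DEBUG_FLAGS"

list_debug_options() {
    # $1: path to a kernel configuration file
    awk -v list="$DEBUG_OPTS" '
        BEGIN {
            n = split(list, names, " ")
            for (i = 1; i <= n; i++) want[names[i]] = 1
        }
        # Only "options NAME" or "options NAME=value" lines count;
        # commented-out lines and "nooptions" lines do not match.
        $1 == "options" {
            name = $2
            sub(/=.*/, "", name)
            if (name in want) print name
        }
    ' "$1"
}
```

For example, `list_debug_options /usr/src/sys/amd64/conf/MYKERNEL` (where MYKERNEL is a placeholder for your own config) prints one option name per line, and prints nothing if none of the listed options are set directly in that file.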
We used a development machine with a Ryzen 3800X, 32GB of RAM, and NVMe storage to test the performance impact of CURRENT’s debugging features. The system was installed from a recent 14-CURRENT image with ZFS on root and then updated to the latest tip-of-tree from git. We also used boot environments to make it easier to perform differential (A/B) tests in very similar environments.
We’ll call the CURRENT system with no changes to /etc/src.conf or /etc/make.conf “debug”, and we’ll call the CURRENT system with debugging disabled “nodebug”.
As well as the kernel changes, we need to disable some userland features. We can put the full build configuration for nodebug in /etc/make.conf and /etc/src.conf, which look like this on nodebug:
/etc/make.conf:

    KERNCONF="GENERIC-NODEBUG"

/etc/src.conf:

    WITH_MALLOC_PRODUCTION="YES"
    WITHOUT_LLVM_ASSERTIONS="YES"
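If you set up several test hosts or boot environments, you may prefer to script these settings. The sketch below is one possible approach, not something from the FreeBSD tree: add_line and write_nodebug_conf are hypothetical helper names, and the directory argument exists only so the functions can be tried against a scratch directory before pointing them at /etc.

```shell
# Append a line to a file unless that exact line is already present.
add_line() {
    # $1: file, $2: line to add
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Hypothetical wrapper: write the nodebug settings shown above.
# $1 is the directory holding make.conf and src.conf (default: /etc).
write_nodebug_conf() {
    dir=${1:-/etc}
    add_line "$dir/make.conf" 'KERNCONF="GENERIC-NODEBUG"'
    add_line "$dir/src.conf"  'WITH_MALLOC_PRODUCTION="YES"'
    add_line "$dir/src.conf"  'WITHOUT_LLVM_ASSERTIONS="YES"'
}
```

Because add_line checks for an existing identical line first, running write_nodebug_conf more than once leaves the files unchanged. It does not detect a conflicting setting written differently (for example a different KERNCONF value), so review the files by hand on a system that already has a build configuration.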
The FreeBSD development features we disabled in the “nodebug” system have different impacts in different subsystems, and their impacts are exercised by different workloads.
When we benchmark, it is important to test workloads that accurately model your system’s real-world workload. In the best case, we would be able to steer real traffic to a test machine and evaluate changes in its ability to serve traffic.
For this article we are going to build FreeBSD itself as a test workload, and use two tests from the large number of benchmarks available in the FreeBSD ports system.
There are 124 tools in the benchmarks category of the ports tree—most likely, some of those development tools can exercise your own real-world workloads well.
In this article we will look at the following benchmarks:
- building FreeBSD kernel and world
- iperf3 TCP loopback test
- lzbench in-memory compression benchmark
As we saw earlier, the development features we need to disable in CURRENT affect many, if not most, workloads. The malloc debugging features can impact memory-based workloads even in userspace, but the INVARIANTS and WITNESS debug paths mostly impact kernel-space workloads.
We can see this by comparing benchmark results on our debug and nodebug systems.
Building FreeBSD is a great benchmark for development style workloads. Building FreeBSD involves reading thousands of small files, compiling them, writing the results to temporary files, and then linking the results together. Compilation is a heavy userspace processing task, but getting the files stresses the kernel too.
World and kernel builds on this machine take about half an hour when we don’t need to rebuild llvm. The long build time makes testing over several runs impractical, but this isn’t a typical task for most users.
Building the FreeBSD kernel exercises a similar workload to doing a full system build, but with just a subset of the total files. To look at the penalty to building FreeBSD on the debug host, we built the kernel once to prime our filesystem cache, then ran 10 more build iterations to test performance repeatably. The compiler toolchain is in sync with the system here, so we see just the kernel compilation time in both boot environments.
    debug $ for x in $(jot 10); do /usr/bin/time -o ~/buildkernel-debug -a make -j 16 buildkernel KERNCONF=GENERIC; done
    nodebug $ for x in $(jot 10); do /usr/bin/time -o ~/buildkernel-nodebug -a make -j 16 buildkernel KERNCONF=GENERIC; done
In each case, the buildkernel stage is timed using time and appended to a log file. The average time to run buildkernel on both systems is plotted below:
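To turn a timing log into a single number, you can average the wall-clock column. This is a minimal sketch that assumes the log holds FreeBSD's default time(1) output, one line per run with the value preceding the word "real"; the function name avg_real is hypothetical.

```shell
# Average the "real" (wall-clock) seconds in a log produced by repeated
# runs of: /usr/bin/time -o LOG -a <command>
# Assumed line format (FreeBSD time(1) default):
#   "      612.34 real      8123.45 user       701.23 sys"
avg_real() {
    # $1: path to the log file
    awk '
        {
            # Find the word "real" and take the number just before it.
            for (i = 2; i <= NF; i++)
                if ($i == "real") { sum += $(i - 1); n++ }
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "%d runs, mean %.2f s\n", n, sum / n
        }
    ' "$1"
}
```

Running `avg_real ~/buildkernel-debug` and `avg_real ~/buildkernel-nodebug` gives the two means to compare. A mean hides run-to-run variation, so it is worth looking at the individual lines in the log as well.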
The impact on build performance for a FreeBSD CURRENT system with debugging features is clear. Kernel builds complete about 30% faster with debugging features disabled.
Similarly, a single ‘buildworld buildkernel’ on the debug machine took 36 minutes, but only 24 minutes on the nodebug host. That cuts the build time by a third.
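The arithmetic behind such comparisons is simple, but it is easy to mix up "percent faster" with "percent less time". The sketch below computes the reduction in elapsed time relative to the slower run; pct_reduction is a hypothetical helper name.

```shell
# Percentage reduction in elapsed time going from $1 (before) to $2 (after).
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# The buildworld + buildkernel minutes quoted above: debug 36, nodebug 24.
pct_reduction 36 24   # prints 33.3
```

Put the other way around, the debug build takes 36 / 24 = 1.5 times as long as the nodebug build.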
Benchmarking Network Throughput
For a quick estimate of network stack performance, we can use iperf3. We run the test with the client and server on the same machine. This removes the network from the equation, and exercises the locking and invariants debug paths in the kernel in both directions.
For this quick test, we set up iperf3 to transfer 100GiB from the client side to the server side. There is some potential noise that filters through from the TCP congestion controller, so we run ten iterations to minimize its impact.
We need to start the iperf3 server side in one terminal:
$ iperf3 -s
In another terminal, we first disable TCP host caching, then do 10 iperf3 client iterations in a loop, appending the time each transfer took to a log file.
# sysctl net.inet.tcp.hostcache.enable=0
The test transfer is started like so:
for x in `jot 10`; do /usr/bin/time -o iperf3-nodebug -a iperf3 -c localhost -n 100G; done
The average transfer time is plotted below for both debug and nodebug:
CURRENT’s debug features have a large impact in this test, with runs on the debug host taking nearly twice as long as runs on the nodebug host. Converted into bitrates, debug gets ~18Gbit/s compared to ~31Gbit/s on the nodebug environment.
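Since the log files record elapsed time rather than throughput, a small conversion is needed to get bitrates. The sketch below assumes the transfer size is 100GiB as stated above (that is, 100 × 2^30 bytes) and reports decimal Gbit/s; the function name gbps is hypothetical, and the 40-second figure in the usage note is a made-up input, not a measurement.

```shell
# Convert "N GiB transferred in S seconds" into Gbit/s (10^9 bits per second).
gbps() {
    # $1: GiB transferred, $2: elapsed seconds
    awk -v gib="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", gib * 1073741824 * 8 / secs / 1e9 }'
}
```

For instance, `gbps 100 40` reports the rate for a hypothetical 40-second transfer. Working backwards from the figures above, roughly 18 Gbit/s and 31 Gbit/s correspond to about 48 and 28 seconds per 100GiB transfer respectively.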
Benchmarking with lzbench
Finally, let’s look at lzbench—a benchmark that should be unaffected by most of the debug features.
lzbench is an in-memory benchmark of open-source compression tools and algorithms. It exercises a compression algorithm in memory, attempting to avoid interaction with other parts of the system. This means that when it is doing a good job, the results we get should be far more representative of the algorithm than the host operating system.
For these tests, we created a file of random ASCII data like so:
dd if=/dev/random bs=1m count=1024 | b64encode - > testfile.out
This creates a nicely compressible test file: it is just ASCII, but it won’t compress down to nothing in the way a file of all zeros would.
We then ran lzbench with the following command:
    debug $ lzbench -ezstd testfile.out
    lzbench 1.8 (64-bit FreeBSD)   Assembled by P.Skibinski
    Compressor name         Compress. Decompress. Compr. size  Ratio Filename
    memcpy                  5718 MB/s  5755 MB/s  1450493368 100.00 testfile.out
    zstd 1.4.5 -1           1020 MB/s  1319 MB/s  1093210136  75.37 testfile.out
    zstd 1.4.5 -2           1007 MB/s  1318 MB/s  1093210154  75.37 testfile.out
    zstd 1.4.5 -3            960 MB/s  1317 MB/s  1093211523  75.37 testfile.out
    zstd 1.4.5 -4            987 MB/s  1318 MB/s  1093211809  75.37 testfile.out
    zstd 1.4.5 -5            345 MB/s  1320 MB/s  1093254405  75.37 testfile.out
    zstd 1.4.5 -6            317 MB/s  1322 MB/s  1093272851  75.37 testfile.out
    zstd 1.4.5 -7            309 MB/s  1322 MB/s  1093271832  75.37 testfile.out
    zstd 1.4.5 -8            313 MB/s  1321 MB/s  1093271832  75.37 testfile.out
    zstd 1.4.5 -9            311 MB/s  1324 MB/s  1093299343  75.37 testfile.out
    ^C
lzbench displays the compression speed, the decompression speed, and the size of the compressed file as a percentage of the original. It starts with a memcpy baseline, which shows how fast a null compressor (one that does nothing at all) can go.
The first 10 levels of decompression for debug and nodebug from lzbench are plotted below:
We can see that lzbench doesn’t run any faster on our nodebug test system, but it also doesn’t suffer a penalty on debug: lzbench is doing a good job of testing the algorithm rather than the operating system.
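To compare the two systems level by level, you can pull the decompression column out of saved lzbench output. This is a minimal sketch that assumes the column layout shown in the run above and that the results were redirected to a file; zstd_decomp is a hypothetical helper name.

```shell
# Print "level decompress_MB/s" for each zstd row of saved lzbench output.
# Assumed row layout, as in the run shown earlier:
#   zstd <version> <level> <comp> MB/s <decomp> MB/s <size> <ratio> <file>
zstd_decomp() {
    # $1: file holding lzbench output
    awk '$1 == "zstd" && $5 == "MB/s" && $7 == "MB/s" { print $3, $6 }' "$1"
}
```

Running it against one saved output file per system (the file names are up to you) yields two short tables that can be compared side by side or fed to a plotting tool.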
In CPU-heavy userspace benchmarks, the debugging features in CURRENT don’t have a high overhead, but more realistic server and networking workloads can experience huge impacts.
When it comes to performance benchmarking, you should always test with workloads that accurately model your real ones. Most workloads will benefit significantly from disabling CURRENT’s debugging features, but some CPU-only tasks won’t.
It is important to know how upgrades to FreeBSD are going to impact your servers, and testing CURRENT is the best way to know what is going to happen in future FreeBSD releases.
If you integrate CURRENT hosts into your test environment or parts of your production environment, it is important to disable its development-only debugging features. Failing to do so is likely to give you a false impression of how FreeBSD is progressing, since a promising commit offering a huge performance increase can easily be overshadowed by the performance impact of WITNESS or INVARIANTS debugging.