Virtualization showdown – FreeBSD’s bhyve vs. Linux’s KVM
Let’s compare the two open source virtualization engines and see how they perform
Not too long ago, we walked you through setting up bhyve on FreeBSD 13.1. Today, we’re going to take a look specifically at how bhyve stacks up against the Linux Kernel Virtual Machine—but before we can do that, we need to talk about the best performing configurations under bhyve itself.
When we talk about configuration options that have a massive performance impact, we’re mostly talking about storage configuration—CPU configuration options tend to be fairly straightforward, but storage can be configured with different back-end formats and virtual controllers, which can have a massive impact on both throughput and latency.
OpenZFS is the only back-end storage stack we’ll be testing today—its performance is generally excellent, and its feature set for virtual machine hosting is unparalleled.
FreeBSD bhyve storage configuration
There are three major categories of storage configuration tunable for virtual machines running atop OpenZFS:
- OpenZFS blocksize
- Hypervisor storage type: raw device (on ZVOLs) or raw file (on datasets)
- Hypervisor storage controller: e.g. emulated NVMe, emulated SATA/SAS, and VirtIO
The first of these configuration choices—OpenZFS blocksize—is the most flexible, and the one we devoted the least testing to. Generally speaking, you should match your blocksize (volblocksize if using ZVOLs, recordsize if using datasets) directly to your actual workload for the best results.
When there’s no single storage workload to specifically tune for, we generally recommend a blocksize of 64K—this allows for decent (though not ideal) performance on both throughput-challenged (1MiB random I/O) and IOPS-challenged (4K random I/O) workloads.
With recordsize=64K or volblocksize=64K, 4K random I/O suffers from an amplification penalty of 16x. This refers to the fact that the block is the smallest individual amount of data which can be read or written—so a 4KiB operation requires 64KiB of actual data to be read or written.
Similarly, a 64KiB blocksize means a 16x IOPS amplification penalty for 1MiB random I/O. In this case, we’re looking at the fact that we need to issue sixteen separate operations to read or write 1MiB of data in 64KiB blocks, as opposed to a single operation if we needed to move the same data with blocksize=1M.
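The arithmetic behind both penalties can be sketched in a few lines of Python. This is a simplified model that assumes aligned I/O and ignores compression, caching, and read-modify-write effects; the function name is ours, not anything from OpenZFS.

```python
KIB = 1024
MIB = 1024 * KIB


def amplification(io_size: int, block_size: int) -> tuple[float, int]:
    """Return (data amplification, block operations) for one aligned I/O.

    data amplification: bytes actually moved divided by bytes requested.
    block operations:   number of blocks that must be read or written.
    """
    blocks = -(-io_size // block_size)  # ceiling division
    moved = blocks * block_size
    return moved / io_size, blocks


if __name__ == "__main__":
    for label, io_size in (("4KiB", 4 * KIB), ("1MiB", MIB)):
        for bs_label, block_size in (("64K", 64 * KIB), ("1M", MIB)):
            data_amp, ops = amplification(io_size, block_size)
            print(f"{label} I/O on blocksize={bs_label}: "
                  f"{data_amp:g}x data, {ops} block operation(s)")
```

With blocksize=64K, a 4KiB operation moves 16x the requested data in one block operation, while a 1MiB operation moves exactly the requested data but needs sixteen block operations.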
We standardized on a recordsize/volblocksize of 64KiB for these tests, since “generic” storage workloads typically revolve around two potential bottlenecks—latency on 4K I/O, as seen in small files and metadata, and throughput on 1MiB I/O, as seen in the majority of documents, data, and even operating system patching systems like Windows Update.
Hypervisor storage type
Unlike OpenZFS blocksize, there’s usually a single, clear answer as to what storage type performs best under a given hypervisor. Under Linux’s KVM, there are three primary options—QCOW2 on datasets, RAW files on datasets, and direct access to ZVOLs as block devices.
QCOW2 is a QEMU-specific storage format, and it therefore doesn’t make much sense to try to use it under FreeBSD. Under Linux KVM, QCOW2 can be worth using despite sometimes lower performance than RAW files, because it enables QEMU-specific features, including VM hibernation.
This leaves us with RAW files on OpenZFS datasets, vs OpenZFS ZVOLs passed directly down to the VM as block devices (on Linux) or character devices (on FreeBSD). On paper, ZVOLs seem like the ideal answer to VM storage needs, but we’ve found them to underperform badly under Linux for many years, so we didn’t want to blindly assume they would be performance winners under FreeBSD either.
Hypervisor storage controller
When it comes to the storage controller, we again have a few major options: emulated NVMe, emulated SATA/SAS/SCSI, or paravirtual VirtIO “hardware” which is directly compatible with Linux KVM’s.
On the Linux platform, VirtIO “hardware”—designed specifically to function in a virtualized “guest” environment, therefore leaner and simpler than emulated hardware—drastically outperforms any other options.
However, we’re not in the business of blindly assuming an outcome. We came here to test these things, and in doing so, we discovered that VirtIO isn’t FreeBSD’s highest-performance storage controller option.
As it turns out, bhyve/NVMe isn’t just faster than bhyve/VirtIO—it’s faster than KVM/VirtIO as well!
The which, how, and why of what we tested
Since we specifically wanted to test performance inside the virtual machine, not just on the host, we needed to choose tests which would run on Windows, Linux, and FreeBSD alike. This limited our options pretty severely.
For CPU performance, we ran openssl speed tests using the sha256 (hashing) and aes-256-gcm (encryption) algorithms, with 16 byte and 16KiB blocksizes. The smaller blocksize largely tests the system’s ability to manage incessant interrupts, while the larger does a better job exposing simpler raw throughput on more reasonable compute workloads.
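The exact openssl invocations aren’t listed here, so the following is only a sketch of how the four algorithm and buffer-size combinations could be generated. It assumes the `-evp` and `-bytes` options of `openssl speed` (the latter needs OpenSSL 1.1.1 or newer); the helper name is hypothetical.

```python
from itertools import product

ALGORITHMS = ("sha256", "aes-256-gcm")
BUFFER_SIZES = (16, 16 * 1024)  # 16 bytes and 16KiB


def openssl_speed_commands() -> list[list[str]]:
    """Build one `openssl speed` argv per algorithm/buffer-size pair.

    Assumption: the tests used -evp and -bytes; the article does not
    show the real command lines.
    """
    return [
        ["openssl", "speed", "-evp", algorithm, "-bytes", str(size)]
        for algorithm, size in product(ALGORITHMS, BUFFER_SIZES)
    ]


if __name__ == "__main__":
    for argv in openssl_speed_commands():
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run` inside a guest or on the host; nothing is executed by the sketch itself.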
For storage, we tested two workloads: 4KiB random I/O at iodepth=1, and 1MiB random I/O at iodepth=1. In both cases, we ran a simultaneous 50/50 mix of reads and writes, and used --end_fsync=1 to make sure that all writes actually hit the bare metal before the test “completed” and stats were calculated.
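As a rough illustration, the two workloads map onto fio options along these lines. Only the block sizes, iodepth=1, the 50/50 read/write mix, and end_fsync=1 come from the description above; the job name, target path, and test size are placeholders, and the I/O engine is left at fio’s platform default.

```python
def fio_command(block_size: str, target: str, size: str = "1G") -> list[str]:
    """Build a fio argv for a 50/50 random read/write job at iodepth=1.

    Assumptions: `target`, `size`, and the job name are illustrative
    placeholders, not the values used in the actual tests.
    """
    return [
        "fio",
        f"--name=randrw-{block_size}",
        f"--filename={target}",
        "--rw=randrw",       # mixed random reads and writes
        "--rwmixread=50",    # 50% reads, 50% writes
        f"--bs={block_size}",
        "--iodepth=1",
        "--end_fsync=1",     # fsync at the end so writes reach the disk
        f"--size={size}",
    ]


if __name__ == "__main__":
    for bs in ("4k", "1m"):
        print(" ".join(fio_command(bs, "/tmp/fio-test.bin")))
```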
Our test hardware is a 2U Supermicro server equipped with dual Xeon E5 14-core / 28-thread CPUs, 256GiB ECC RAM, and a 2TB Kingston DC500M datacenter-grade SSD. Each guest operating system was allocated 100GiB of storage, four vCPU cores, and 8GiB of RAM.
Before we move on, we’d like to point out that the Kingston SSD is a 6Gbps SATA SSD—so the performance benefits of bhyve’s emulated NVMe controller as tested here don’t just apply to NVMe disks, they apply to SATA/SAS as well!
Finding the most performant storage configuration on bhyve
Under most hypervisors, VirtIO paravirtual storage controllers are vastly more performant than emulated “real hardware” controllers. But to my surprise, this turned out not to be the case under FreeBSD 13.1.
To find the most performant controller, we tested two freshly installed guests: one running Windows Server 2019, and one running Ubuntu Server 22.04 LTS. Each guest was installed using a raw file (.img) backing store, with that raw file itself stored in an OpenZFS dataset with recordsize=64K.
Our first guest, Windows Server 2019, showed shockingly higher performance under emulated NVMe than under VirtIO: nearly four times the 4KiB random I/O performance, and more than eight times the 1MiB random I/O performance!
When we repeated the same tests for Ubuntu 22.04 guests, we saw much smaller improvements—but for both 4KiB and 1MiB random I/O, we still saw significantly higher performance from bhyve’s emulated NVMe than the VirtIO paravirtualized controller.
Since even a Linux guest benefited significantly from using bhyve’s emulated NVMe rather than its own, natively supported VirtIO paravirtual controller, we elected not to directly test VirtIO performance on a FreeBSD guest—it seems clear enough that bhyve’s NVMe is a win across the board.
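For readers who launch bhyve by hand, the controller choice comes down to the emulation name in the `-s` slot argument. The sketch below builds only that fragment; the slot number and image path are hypothetical, and the article does not show the actual bhyve invocation that was used.

```python
# Map of friendly names to bhyve device emulation names.
CONTROLLERS = {
    "nvme": "nvme",
    "virtio": "virtio-blk",
    "sata": "ahci-hd",
}


def disk_slot_args(controller: str, image_path: str, slot: int = 4) -> list[str]:
    """Return the bhyve `-s slot,emulation,path` fragment for one disk.

    Assumptions: slot 4 and the image path are placeholders; a full
    bhyve command line needs many more arguments than this.
    """
    emulation = CONTROLLERS[controller]
    return ["-s", f"{slot},{emulation},{image_path}"]


if __name__ == "__main__":
    image = "/vm/guest0/disk0.img"  # hypothetical path
    print(disk_slot_args("nvme", image))
    print(disk_slot_args("virtio", image))
```

Switching a guest between controllers changes only the emulation name; the backing raw file stays the same.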
Moving on from the guest’s storage controller to its image format, we tested the performance of a FreeBSD 13.1 guest using a bhyve-emulated NVMe controller on both a dataset and a zvol.
The dataset (which contained a raw image file) and zvol each used a blocksize of 64KiB, set with recordsize=64K for the dataset and volblocksize=64K for the zvol. For both 4KiB and 1MiB random I/O, fio tests inside the guest produced far higher performance for the raw file than for the zvol. This is something the Klara ZFS development team is investigating.
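A minimal sketch of how the two backing stores could be created with matching 64K blocksizes follows. The pool and dataset names are hypothetical, and the 100G zvol size simply mirrors the 100GiB allocated to each guest.

```python
def dataset_create(name: str, blocksize: str = "64K") -> list[str]:
    """argv to create a dataset that will hold a raw image file."""
    return ["zfs", "create", "-o", f"recordsize={blocksize}", name]


def zvol_create(name: str, size: str = "100G", blocksize: str = "64K") -> list[str]:
    """argv to create a zvol; volblocksize must be set at creation time."""
    return ["zfs", "create", "-V", size, "-o", f"volblocksize={blocksize}", name]


if __name__ == "__main__":
    # "tank/vm/..." names are placeholders, not the pool used in the tests.
    print(" ".join(dataset_create("tank/vm/guest-file")))
    print(" ".join(zvol_create("tank/vm/guest-zvol")))
```

Unlike recordsize, volblocksize cannot be changed after the zvol exists, which is why it is passed at creation time.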
We know most people expect zvols to be the highest-performing storage option for virtual machines using ZFS-backed storage—after all, providing the guest with a simple character device seems much more efficient than forcing it to use a raw file as a sort of “fake” device. But the numbers don’t lie—the raw file outperforms the zvol handily here, with more than twice the 1MiB throughput and six times the 4KiB throughput.
Although I suspect this will surprise many readers, it didn’t surprise me personally—I’ve been testing guest storage performance for OpenZFS and Linux KVM for more than a decade, and zvols have performed poorly by comparison each time I’ve tested them.
Bhyve vs KVM
Now that we’ve figured out the most performant storage configuration for our guests—raw image files, stored on OpenZFS datasets—we can compare guest performance on a bhyve host directly to guest performance on a Linux KVM host.
In all cases, we’re testing on the same dual Xeon E5 server, with 256GiB of ECC RAM and a single Kingston DC500M 6Gbps SATA SSD. Bhyve tests are run under FreeBSD 13.1, and KVM tests under Ubuntu Server 22.04 LTS.
In this chart, we are once again looking at performance differences expressed by percentage, rather than raw numbers. In each column and each group, we’re pitting FreeBSD 13.1 vs Ubuntu Server 22.04 as the host operating system.
When you see a positive value on this chart, FreeBSD 13.1 (and bhyve) outperformed Ubuntu 22.04 LTS (and KVM), while negative values demonstrate a performance win for Ubuntu 22.04. Each value is the delta in performance between FreeBSD and Ubuntu; so if you see +100%, FreeBSD performed twice as well as Ubuntu, and if you see –100%, Ubuntu performed twice as well as FreeBSD.
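One way to read that description is as a symmetric, signed ratio. The exact formula isn’t given, so the function below is our interpretation: it is chosen so that +100% means FreeBSD scored twice Ubuntu’s result and -100% means the reverse. Under an ordinary percentage-change formula, -100% would instead mean a score of zero.

```python
def signed_delta_percent(freebsd: float, ubuntu: float) -> float:
    """Signed performance delta, assuming higher scores are better.

    Positive: FreeBSD ahead by that percentage of Ubuntu's score.
    Negative: Ubuntu ahead by that percentage of FreeBSD's score.
    This symmetric reading is an assumption, not a published formula.
    """
    if freebsd <= 0 or ubuntu <= 0:
        raise ValueError("scores must be positive")
    if freebsd >= ubuntu:
        return (freebsd / ubuntu - 1.0) * 100.0
    return -(ubuntu / freebsd - 1.0) * 100.0


if __name__ == "__main__":
    print(signed_delta_percent(200.0, 100.0))  # FreeBSD twice as fast
    print(signed_delta_percent(100.0, 200.0))  # Ubuntu twice as fast
```

Under this reading, a value of +795.03% corresponds to a FreeBSD score about 8.95 times Ubuntu’s.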
Now, let’s cut this chart into two sections—storage, and compute—and examine each section separately.
Storage performance results
In our first group of results, for the fio 4KiB random read/write storage test, FreeBSD wins across the board, directly on the host itself and in all three guest operating systems. It won so handily when running directly on the host that we had to crop the chart. The missing data value there is 795.03%, indicating 4KiB random I/O throughput nearly nine times Ubuntu 22.04’s.
Inside the guests, FreeBSD 13.1 still dominates across the board in 4KiB reads and writes with smaller, but still commanding margins of more than double Ubuntu 22.04’s performance regardless of guest operating system used.
Moving on to 1MiB I/O, FreeBSD still generally outperforms Ubuntu, but with smaller margins, and we see our first performance loss to Ubuntu 22.04. Curiously, that loss occurs in a FreeBSD 13.1 guest, which means we get better 1MiB random I/O out of FreeBSD 13.1 running under Ubuntu than we do from FreeBSD under FreeBSD!
Despite FreeBSD 13.1’s single curious loss—when virtualizing FreeBSD itself—the overall situation here is beyond clear; bhyve’s emulated NVMe controller performs vastly better than KVM’s VirtIO, even though it’s using native drivers for real-hardware NVMe under Windows and Linux which were not developed with FreeBSD in mind.
Compute performance results
We should point out that although good enough for a rough guide, this CPU performance test is far from perfect—OpenSSL speed tests don’t exercise all aspects of CPU performance, and just to make matters worse, the version of OpenSSL in each platform’s repository isn’t the same either.
With that said, we’re not really focusing on the performance of an individual OpenSSL version, and most of those differences should iron out since we’re looking at differences between KVM and bhyve by percentage, rather than focusing on raw numbers.
Now that we understand the limitations of our CPU testing, let’s take a look at the results.
Although FreeBSD and bhyve dominated Linux KVM in our storage tests, our compute performance results mostly favored KVM. The only test group which bhyve almost carried across the board is openssl SHA256 with 16B buffers.
In all our other openssl compute performance tests, Linux KVM beat bhyve directly on the host as well as inside both Linux and FreeBSD virtual machines.
However, there’s one more interesting outlier—and in this case, it’s based on guest operating system rather than on individual test. Windows Server 2019 performed better under FreeBSD 13.1 and bhyve than it did under Ubuntu 22.04 and KVM on each compute test performed!
Although the bhyve management ecosystem is currently quite limited in comparison to Linux KVM’s, its performance is already quite impressive.
For storage-heavy workloads, the benefit of bhyve’s emulated NVMe controller is difficult to overstate—it produced massive throughput improvements that even a long-time KVM fan simply cannot ignore.
Compute-heavy workloads are more of a mixed bag. For the most part, guests under bhyve tend to run openssl speed tests roughly 25% slower than the same guests do under KVM. Bizarrely, this trend inverts almost exactly for Windows guests, which show openssl performance gains of roughly 25% across the board.
Although neither hypervisor beats the other at every single test, we believe bhyve’s overwhelming storage performance wins outweigh its relatively minor compute performance losses. Unless you’ve got a hyper-specialized workload, four to eight times better disk performance will almost always make a bigger impression than slightly worse CPU performance.
This would be much more useful with numbers rather than percents. For example, on storage benchmarks, if there is any caching all bets are off. It’s difficult to sanity check anything without raw results.
Can you share the fio config? Especially interesting in the ZVOL vs raw file test.